Software Clone Evolution Prediction

Babloo Kumar

Varanasi, Uttar Pradesh


Predicting the number of exact-match and near-miss clone sets in the upcoming versions of an open-source software application.

Project status: Published/In Market

Artificial Intelligence

Overview / Usage

During software evolution, code fragments tend to be copied, verbatim or with slight modifications, within the same version and across subsequent versions, giving rise to exact-match and near-miss clones. Cloned code fragments degrade software quality and make maintenance harder. Detecting and modeling these recurring fragments can therefore help considerably with software maintenance. In this project, we mainly explored machine learning strategies for temporal analysis of software clone evolution using software metrics. Detecting clones in a large software system is challenging because it depends on the internal design of the software's modules and methods. Object-oriented metrics such as DIT (depth of inheritance tree), NOC (number of children), WMC (weighted methods per class), and LCOM (lack of cohesion in methods), together with cyclomatic complexity, can serve as good indicators of clone content.
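To make two of these metrics concrete, here is a minimal sketch that computes DIT and NOC from a toy class hierarchy. The hierarchy, the function names, and the convention that a root class has DIT 0 are assumptions made for illustration (some tools count the root as 1); in the project itself the metrics came from the Eclipse Metrics plugin, not from code like this.

```python
from collections import Counter

# Hypothetical class hierarchy: child class -> parent class (None for a root).
PARENTS = {
    "Shape": None,
    "Polygon": "Shape",
    "Circle": "Shape",
    "Triangle": "Polygon",
    "Square": "Polygon",
}


def depth_of_inheritance(cls, parents):
    """DIT: number of ancestors between cls and its root (a root has DIT 0)."""
    depth = 0
    while parents[cls] is not None:
        cls = parents[cls]
        depth += 1
    return depth


def number_of_children(parents):
    """NOC: number of immediate subclasses of every class."""
    counts = Counter(p for p in parents.values() if p is not None)
    return {cls: counts.get(cls, 0) for cls in parents}


dit = {cls: depth_of_inheritance(cls, PARENTS) for cls in PARENTS}
noc = number_of_children(PARENTS)
```

Per-class values like these would typically be aggregated (for example averaged) per software version before being used as inputs to a forecasting model.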

Methodology / Approach

In the first phase of the project, we extracted the exact-match and near-miss clones from each version of an open-source software application, along with the object-oriented metrics for that version. We then modeled the number of clone sets across versions as a time series and used machine learning methods for multivariate time-series modeling to forecast the number of exact-match clone sets (EMCS) and near-miss clone sets (NMCS) in upcoming versions of the software.
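One common way to set up such a multivariate forecast is to turn the per-version series into a table of lagged features and one-step-ahead targets. The sketch below shows that step only. The function name, the choice of lag features, and the numbers are all made up for illustration and are not the project's actual data or pipeline.

```python
def make_lagged_rows(clone_counts, metrics, n_lags):
    """Build (features, target) rows for one-step-ahead forecasting.

    clone_counts: clone-set counts, one per version, oldest first.
    metrics: per-version lists of metric values, same length as clone_counts.
    Features for version t are the counts of the previous n_lags versions
    followed by the metrics of version t - 1; the target is the count at t.
    """
    if len(clone_counts) != len(metrics):
        raise ValueError("clone_counts and metrics must have the same length")
    rows = []
    for t in range(n_lags, len(clone_counts)):
        features = list(clone_counts[t - n_lags:t]) + list(metrics[t - 1])
        rows.append((features, clone_counts[t]))
    return rows


# Illustrative values only: five versions, two metrics per version.
counts = [10, 12, 15, 14, 18]
metrics = [[2.0, 5.1], [2.1, 5.3], [2.1, 5.6], [2.2, 5.5], [2.3, 5.9]]
rows = make_lagged_rows(counts, metrics, n_lags=2)
```

Each row can then be fed to any regression model; the latest versions' counts and metrics form the feature vector for the not-yet-released version.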

In the second phase, we applied more advanced machine learning algorithms to the time-series modeling of the clone datasets. We used a multi-objective genetic algorithm to train a feedforward neural network that outputs prediction intervals, minimizing the mean prediction interval width (MPIW) while maximizing the prediction interval coverage probability (PICP). This model yielded better accuracy than the conventional ARIMA and backpropagation methods.
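The two objectives have simple standard definitions: coverage probability is the fraction of observed values that fall inside their predicted interval, and mean interval width is the average distance between the upper and lower bounds. The sketch below computes both and adds a Pareto-dominance check of the kind a multi-objective genetic algorithm relies on. The function names and sample numbers are illustrative assumptions, and the network training itself is not shown.

```python
def picp(y_true, lower, upper):
    """Coverage probability: fraction of targets inside [lower, upper]."""
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return covered / len(y_true)


def mpiw(lower, upper):
    """Mean prediction interval width: average of (upper - lower)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


def dominates(a, b):
    """Pareto dominance for (coverage, width) pairs.

    a dominates b if its coverage is no lower and its width no larger,
    and it is strictly better in at least one of the two.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


# Illustrative values only: four held-out versions.
y_true = [15, 14, 18, 20]
lower = [12, 13, 19, 17]
upper = [16, 17, 22, 21]

coverage = picp(y_true, lower, upper)
width = mpiw(lower, upper)
```

The objectives pull in opposite directions: very wide intervals trivially reach full coverage, so the search keeps the set of non-dominated (coverage, width) trade-offs rather than a single best model.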

Technologies Used

CloneDR, Eclipse (Metrics plugin), R (time-series forecasting), MATLAB, Weka
