approaches are focused on two or three perspectives. Hammad et al. propose an approach to analyzing the change propagation between source code and UML classes [5]. Khan et al. analyze the change propagation between requirements and architectural components [6]. Xiao et al. propose an approach to analyzing the change impact between business process specifications and source code, to estimate the cost of a business process change in a service oriented business application [7]. Lehnert et al. present an approach to change impact analysis among requirement models, architectural models and source code, by a set of predefined impact propagation rules [8].

[Figure: linked data graph of Person, Bug, and Commit instances]
In order to assess the direct change impact between two software elements, we identify proper dependency features from the constructed linked data model. Taking requirement, class and code as an example, their dependency features are listed in Table 1. Based on these features, we use a logistic regression model to calculate the change impact degree between two software elements. Different features contribute differently, and therefore their weights vary. Thus the change impact matrix FS is defined, for each pair of software elements, as

$$FS = \sum_{i} \omega_i X_i , \qquad (1)$$

where $X_i \in \{0, 1\}$ indicates whether the i-th dependency feature is present, and $\omega_i \in (0, 1]$ is the weight of that feature. Through experiments with real data, we can select the optimal weight for each feature.
Note that FS also represents a weighted change impact graph, with software elements as nodes and change impacts as edges.

4.2 Change Impact Propagation in Multiple Steps
A change will propagate further among heterogeneous software artifacts. Based on the theory of random walks on graphs [11], we design a multi-step change impact propagation algorithm.
A random walk on a graph is a special case of a Markov chain. In a Markov chain, the future state at time n+1 is based solely on the current state at time n and its respective outgoing edges. Similarly, the change impact degree at step k+1 is based solely on the impact degree at step k and the structural context of each node in the FS graph.
Let MS(A, B) denote the change impact of element A on element B. We define $MS_k(A, B)$, the impact after k propagation steps, recursively:

$$MS_k(A, B) = \begin{cases} FS(A, B), & k = 0 \\ \dfrac{d}{|N(A)|} \sum_{i=1}^{|N(A)|} MS_{k-1}(N_i(A), B), & k > 0 \end{cases} \qquad (2)$$

where $|N(A)|$ is the number of A's neighbors, $N_i(A)$ is the i-th neighbor of A, and $d$ is a decay factor applied during propagation. $d$ lies between 0 and 1, and here we set $d = 0.8$ [12].
The above equation is applied iteratively until $|MS_{k+1}(A, B) - MS_k(A, B)| < \varepsilon$, where $\varepsilon$ is a tolerance factor used to judge convergence. We take $MS(A, B) = MS_{k+1}(A, B)$ once this condition holds. Here we set $\varepsilon = 0.001$ [12].
The multi-step change impact propagation algorithm is given as follows.

Algorithm 1. Multi-Step Change Impact Propagation
Input: A one-step change impact graph FS(V, E).
Output: A change impact matrix MS.
Method:
    obtain the transition matrix T;
    initialize MS_0 to be the FS matrix;
    k ← 0;
    do
        MS_{k+1} ← T · MS_k;
        diagonal elements of MS_{k+1} ← 1;
        k ← k + 1;
    while (max(|MS_k − MS_{k−1}|) ≥ ε);
    return MS_k

5. EXPERIMENTS
5.1 Setting
We designed four experiments on open source project data: a class-class change impact experiment, a requirement-requirement change impact experiment, a requirement-class change impact experiment, and a requirement-class-code change impact experiment.

1) Dataset
In order to evaluate our approach, we used software development data from two open source projects, HtmlUnit and OpenRocket, as shown in Table 2.

Table 2. Dataset.
Data                              HtmlUnit             OpenRocket
Number of requirement elements    223                  19
Number of code comments           2911                 3164
Number of bug reports             1685                 18
Size of source code               111M, 852 classes    17.5M, 828 classes
Number of developers              80                   13

From these data, a small software engineering ontology was built, including 17 concepts, e.g., Person, Bug, Commit, and Requirement. Then 41762 data instances and about 1 million attribute links and instance links were extracted, allowing a large set of software engineering linked data to be constructed for the four experiments.

2) Comparison Method
There are only a few studies on multi-perspective change impact analysis. We selected Lehnert, a state-of-the-art change impact analysis method, for comparison. Lehnert outperforms many other existing methods in [8].

3) Evaluation Metrics
To evaluate the experimental results, we randomly selected 20% of the change-impacted element sets. Then ten skilled developers with more than 5 years of development experience were asked to judge the correctness of the results. We used precision, recall, and F-measure to measure the performance of our approach.

5.2 Results and Analysis
The experimental results are listed in Tables 3 to 5.

Table 3. Results of class-class change impact analysis on HtmlUnit.
Size of Set    Number of Sets
1~5            1237
6~10           643
11~15          192
16~20          135
>20            114

In the class-class change impact experiment on the HtmlUnit project, there are 2321 change impact sets, as shown in Table 3. The number of those with more than 2 classes is 2197, a large proportion (94.66%). In those sets with 1 or 2 classes, a closer analysis of the class code shows that most classes
have a single function and very small call graphs, indicating that they rarely depend on other classes. Most of the other classes are utility classes or entry point classes. The F-measure of this experiment is thus 94.2%. The experiment on the OpenRocket project achieves similar results.

Table 4. Results of requirement-requirement change impact analysis on HtmlUnit.
Size of Set    Number of Sets
1~2            153
3~4            119
5~6            27
>6             8

In the requirement-requirement change impact analysis on the HtmlUnit project, there are 307 change impact sets, as shown in Table 4. Since requirement specifications are preprocessed with NLP techniques before analysis, and some specifications are incomplete or too simple, the F-measure of the change impact analysis is somewhat low (72.1%), and the change impact sets are small. Similar results were obtained when the analysis was conducted on the OpenRocket project.
In the requirement-class change impact analysis on the HtmlUnit project, there are 297 change impact sets and the F-measure is 88.7%, a value between those obtained from the class-class and requirement-requirement analyses. Similar results are obtained in the requirement-class-code change impact experiment, because HtmlUnit is developed in Java and thus there is an almost one-to-one mapping between classes and program files. Similar results were also obtained when the analysis was conducted on the OpenRocket project.

Table 5. Method comparison.
Method          HtmlUnit                          OpenRocket
                Precision  Recall  F-measure      Precision  Recall  F-measure
Lehnert         87.9%      87.4%   87.6%          81.2%      80.1%   80.6%
Our approach    89.2%      88.2%   88.7%          88.5%      87.9%   88.2%

Lehnert is the best multi-perspective change impact analysis method that is publicly available. We compare it with our approach in the requirement-class-code change impact experiment. The comparison is shown in Table 5.
Lehnert adopts a rule-based approach to change impact and assigns the same impact degree to different dependencies, while our approach extracts features from software engineering linked data and assigns each feature a different weight, learned through an experiment on historical data. Our approach also considers both the direct dependencies between two software elements and the indirect relations among persons, commits, bugs and others. Consequently, our approach improves the precision of prediction. Furthermore, the random walk algorithm for propagating the impacts can reduce false negatives, allowing our approach to outperform the Lehnert method in accuracy and stability.

6. CONCLUSION
In this paper we propose a general approach to multi-perspective change impact analysis using linked data in software engineering. Guided by software engineering ontologies, semantic links between elements in multiple artifacts are established, including requirements, classes, program files, bug reports, code commit history, developer information and others. Then, a weighted change impact matrix/graph is calculated from the dependency features extracted from the software engineering linked data, and a random walk algorithm is designed to propagate the change impact. Experimental results show that our approach is better than the existing multi-perspective change impact analysis approaches, and that it can propagate the change impacts across heterogeneous artifacts stably.

7. ACKNOWLEDGMENTS
This research is supported by 973 Program in China (Grant No. 2015CB352203) and National Natural Science Foundation of China (Grant No. 61472242 and 61572312).

8. REFERENCES
[1] Arnold, Robert S. 2010. Software change impact analysis. Los Alamitos, CA: IEEE Computer Society Press.
[2] Sun X, Li B, Li B, et al. 2012. A comparative study of static CIA techniques. In Proceedings of the Fourth Asia-Pacific Symposium on Internetware. 23-30.
[3] Fradet P, Le Métayer D, Périn M. 1999. Consistency checking for multiple view software architectures. In Proceedings of ESEC/FSE. 410-428.
[4] Lehnert S. 2011. A review of software change impact analysis. Tech. Report, Ilmenau University of Technology.
[5] Maen Hammad, Michael L. Collard, and Jonathan I. Maletic. 2009. Automatically identifying changes that impact code-to-design traceability. In Proceedings of the IEEE 17th International Conference on Program Comprehension (May 2009). 20-29.
[6] Safoora Shakil Khan and Simon Lock. 2009. Concern tracing and change impact analysis: An exploratory study. In Proceedings of the ICSE Workshop on Aspect-Oriented Requirements Engineering and Architecture Design (May 2009). 44-48.
[7] Xiao H, Guo J, Zou Y. 2007. Supporting change impact analysis for service oriented business applications. In Proceedings of International Workshop on Systems Development in SOA Environments, in conjunction with ICSE, 6-6.
[8] Lehnert S. 2015. Multiperspective change impact analysis to support software maintenance and reengineering. Doctoral Thesis. University of Hamburg.
[9] Jain P, Hitzler P, Sheth A, et al. 2010. Ontology alignment for linked open data. In Proceedings of the Semantic Web - ISWC, Springer Berlin Heidelberg, 402-417.
[10] Yuchen Zhang, Chengcheng Wan, Bo Jin. 2016. An empirical study on recovering requirement-to-code links. In Proceedings of 17th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. 121-126.
[11] Lovász L. 1993. Random walks on graphs: a survey. Lecture Notes in Mathematics, 8(4):285-303.
[12] Li P, Li Z, He J, et al. 2009. Assessing the influence probability between objects: A random walker approach. In Proceedings of IEEE Symposium on Computational Intelligence and Data Mining, 25-3
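
As an illustration of the multi-step propagation in Section 4.2 (Algorithm 1), the following is a minimal Python sketch, not the authors' implementation. It assumes that the transition matrix T spreads the decay factor d uniformly over the neighbors of each node, which is one reading of the recursive definition of MS. A T weighted by the FS entries would be another plausible choice. The function names `transition_matrix` and `propagate`, the `max_iter` guard, and the three-element example graph are all hypothetical.

```python
def transition_matrix(fs, d=0.8):
    """Assumed form of T: row a gives d / |N(a)| to each neighbor of a.

    A node j counts as a neighbor of a when FS[a][j] > 0 and j != a.
    """
    n = len(fs)
    t = [[0.0] * n for _ in range(n)]
    for a in range(n):
        neighbors = [j for j in range(n) if j != a and fs[a][j] > 0]
        for j in neighbors:
            t[a][j] = d / len(neighbors)
    return t


def propagate(fs, d=0.8, eps=0.001, max_iter=1000):
    """Iterate MS_{k+1} = T * MS_k with the diagonal reset to 1.

    Stops when the largest entry-wise change falls below eps.
    max_iter is a safety guard that is not part of Algorithm 1.
    """
    n = len(fs)
    t = transition_matrix(fs, d)
    ms = [row[:] for row in fs]  # MS_0 is the one-step matrix FS
    for _ in range(max_iter):
        nxt = [
            [sum(t[a][j] * ms[j][b] for j in range(n)) for b in range(n)]
            for a in range(n)
        ]
        for a in range(n):
            nxt[a][a] = 1.0  # an element always fully impacts itself
        delta = max(abs(nxt[a][b] - ms[a][b])
                    for a in range(n) for b in range(n))
        ms = nxt
        if delta < eps:
            break
    return ms


# Hypothetical example: requirement (0) - class (1) - program file (2).
# The requirement and the file have no direct dependency in FS.
fs = [
    [0.0, 0.9, 0.0],
    [0.9, 0.0, 0.7],
    [0.0, 0.7, 0.0],
]
ms = propagate(fs)
print(ms[0][2])  # indirect impact of the requirement on the file
```

In this example the requirement has no direct link to the program file, yet the propagated impact settles near 0.47, because the impact reaches the file through the class. Note that under the uniform-T assumption the learned FS weights only determine which elements are neighbors.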
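
The F-measure values in Table 5 can be cross-checked against the reported precision and recall. The sketch below assumes the standard balanced F-measure (the harmonic mean of precision and recall), a definition the paper does not state explicitly. The helper name `f_measure` and the dictionary layout are illustrative only.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) in percent, copied from Table 5.
table5 = {
    ("Lehnert", "HtmlUnit"): (87.9, 87.4, 87.6),
    ("Lehnert", "OpenRocket"): (81.2, 80.1, 80.6),
    ("Our approach", "HtmlUnit"): (89.2, 88.2, 88.7),
    ("Our approach", "OpenRocket"): (88.5, 87.9, 88.2),
}

recomputed = {}
for key, (p, r, reported) in table5.items():
    recomputed[key] = f_measure(p, r)
    print(key, round(recomputed[key], 2), "reported:", reported)
```

Under this assumption, each recomputed value agrees with the reported F-measure to within rounding at one decimal place.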