
Multiple-parameter coupling metrics for layered component-based software
Abstract
Coupling represents the degree of interdependence between two software components.
Understanding software dependency is directly related to improving software
understandability, maintainability, and reusability. In this paper, we analyze the difference
between component coupling and component dependency, and we introduce a two-parameter
component coupling metric and a three-parameter component dependency metric. An
important parameter in both these metrics is coupling distance, which represents the
relevance of two coupled components. These metrics are applicable to layered
component-based software. These metrics can be used to represent the dependencies
induced by all types of software coupling. We show how to determine coupling and
dependency of all scales of software components using these metrics. These metrics are
then applied to Apache HTTP, an open-source web server. The study shows that coupling
distance is related to the number of modifications of a component, which is an important
indicator of component fault rate, stability and subsequently, component complexity.
1 Introduction

Component-based software development is a popular approach to improving the practice
of software engineering. Potential benefits of the approach include increased productivity
and quality, and decreased cost and time-to-market. Quite frequently, existing
components are not likely to be ready-to-use building blocks, especially in the case of
large-scale design-level reuse. Instead, these components need to be adapted and/or
modified to meet the specific requirements of the new product being developed.
Furthermore, just as the software product as a whole needs to be maintained, reused
software components also need to be periodically updated to meet new requirements or
changes in the operating environment. Therefore, software components also need to be
designed with maintainability as an important consideration. Hence, reusability and
maintainability are two important properties of a software component (Lim 1994; Frakes
and Succi 2001).

Components are building blocks of a software system. A product contains components of
different composition scales (Jonge 2004). A considerable amount of research has been
done to try to characterize reusable (Price and Demurjian 1997; Biggerstaff and Perlis
1989; Briand et al. 1994; Card and Glass 1990; Dandashi 2002) and maintainable (Berns
1984; Gibson and Senn 1989; Banker et al. 1993) software components. Both reusability
and maintainability are related to software dependency. From the dependency point of
view, if a software component is relatively independent, that is, if there are only a few
dependencies of this component on other components, it would be easy to understand,
maintain, and reuse.

Coupling represents the degree of dependencies between two software components.
Strong coupling between components strengthens the dependency of one component on
the others and increases the probability that changes in one component may affect the
other components and introduce regression faults (Kafura and Henry 1981; Selby and
Basili 1991; Troy and Zweben 1981), and accordingly have detrimental effects on
software maintenance. Strong coupling can also hamper software reuse. For example, if a
software component has many dependencies on other components, it may be impossible
to reuse this component in a new product without either (1) incorporating it together with
the dependent components, or (2) redesigning and reimplementing this component to
remove these dependencies. While option 1 may result in redundant reuse, option 2 may
result in changes to the component functionality. Hence, both approaches
defeat the intended purpose of component reuse (Harrison et al. 2000).

Dependencies between software components depend not only on the type of
coupling between the components, but also on the relevance of the two components.
Although the idea of interaction locality (increasing the coupling of relevant components
and decreasing the coupling of irrelevant components) is widespread and longstanding, it
has not been formalized and thoroughly studied. In this paper, we consider the relevance
(signified by the coupling distance measure) between two components as a factor that
affects the dependencies between them and propose two multiple-parameter coupling
metrics for layered component-based software systems.

The remainder of the paper is organized as follows: Sect. 2 reviews software coupling,
interaction locality, and coupling metrics. Section 3 describes the representation of
component dependency. We describe layered component structure in Sect. 4. Section 5
presents our coupling and dependency metrics. In Sect. 6, we show how to determine
component dependency for various kinds of coupling. Section 7 presents our application
studies on Apache HTTP. The conclusions, threats to validity, and future work appear in
Sect. 8.

2 Software coupling, interaction locality, and coupling metrics


Coupling represents the degree of interaction between two software components (classes,
modules, packages, or the like). There are many different types of coupling (Stevens et al.
1974; Page-Jones 1980; Offutt et al. 1993). All of them can be shown to fall into one of
the following four types: parameter coupling, external/file coupling, inheritance coupling,
and common coupling (Abdurazik 2007). This categorization was presented in the
context of object-oriented software systems. However, from the viewpoint of component
dependency, the constructs of structured software are a proper subset of those of
object-oriented software. Coupling in structured software systems can therefore be
represented using parameter coupling, external/file coupling, and common coupling1; inheritance coupling
is specific to object-oriented software systems. The definitions of these types of coupling
are listed in Table 1. In object-oriented software, a class is considered to be the basic
manageable unit, while in structured software, a module is considered to be the basic
manageable unit. In Table 1, we refer to both of these basic units generically as modules.
Table 1 Definitions of various kinds of coupling (Abdurazik 2007)
Name                     Definition
Parameter coupling       Two modules have parameter coupling if one module invokes a method of another module via parameter passing.
External/File coupling   Two modules have external/file coupling if they access the same external medium, including external files.
Inheritance coupling     Two modules have inheritance coupling if one module is a descendant of another module.
Common coupling          Two modules have common coupling if they access the same global variable.

Table 1 lists the definitions of coupling between two modules, the smallest scale
components. Usually, coupling between modules of two large-scale software components
is also used to represent the large scale component coupling (Bruegge and Dutoit 2004).
For example, if component C1 contains module A and module B, component C2 contains
module E and module F, and module A is parameter coupled to module E, we can say
component C1 is parameter coupled to component C2.

Strong coupling means a high degree of dependency between software components.
Common coupling has been considered to be a strong form of coupling because it induces
strong dependencies between software components, making software components
difficult to understand, maintain, and reuse (Offutt et al. 1993; Yu et al. 2004).
Inheritance coupling is also considered as a strong form of coupling (Bruegge and Dutoit
2004; Hassoun et al. 2004) in the context of software maintenance and white-box reuse,
because any changes to a base class will affect all its derived classes. Parameter coupling
is usually considered as a weak form of coupling. Therefore, the degree of dependency
increases from top to bottom in Table 1 (Abdurazik 2007).

It has been observed that most of the complex systems in the world, from physical
systems such as atoms and stellar galaxies to social systems such as organizations and
governments, are modular and hierarchically structured. A large system may consist of
subsystems, which in turn consist of subsystems, and so on, through several multiresolutional
layers. The interactions between subsystems tend to decrease as we go upward in the
hierarchy. This is called interaction locality (Simon 1969). Generally speaking,
interaction locality can minimize the energy for the system to operate and accordingly
stabilize the system. In software systems, interaction locality is expressed via a widely
accepted design principle: increasing the coupling of relevant components and
decreasing the coupling of irrelevant components.

Interaction locality should not be used in isolation. Instead, it should be used
together with two other design principles, modularity and hierarchy (Yu and Ramaswamy
2007). Design modularity and hierarchy mean the decomposition of the software system
into different layers of components in order to separate concerns and reduce system
complexity. Interaction locality is then applied to assign interactions between these
components. Consider an ideal system that consists of components C1 and C2, which in
turn contain modules m1 through m4. Figure 1a depicts the modular and hierarchical
structure of the system. Figure 1b depicts the interaction locality: high interactions exist
between relevant (lower level) modules and low interactions exist between irrelevant
(higher level) components.
Fig. 1 An ideal system with (a) hierarchical structure; and (b) interaction locality (Yu and
Ramaswamy 2007)

On the one hand, because different types of coupling have different effects on software
complexity, we can use the definitions of coupling in Table 1 to compare the degrees of
dependency between software components. Considerable research has been done in this
area to derive software dependency metrics, including (Briand et al. 1999; Chidamber
and Kemerer 1994; Basili et al. 1996; Card and Glass 1990). In these studies, software
dependency and complexity metrics are proposed and validated for both structured
software and object-oriented software. These metrics consider different types of
interactions between classes/modules, methods/functions, and attributes/variables.

On the other hand, the interaction locality design principle has also been widely accepted.
For example, Basili et al. (1996) validated the speculation made by Chidamber and
Kemerer (1994) that deep inheritance is more of a complication than shallow inheritance.
Lüer et al. (2001) proposed to increase component distance (reduce component
interactions) to increase component evolvability. Yu and Ramaswamy (2007) presented a
method to verify modularity, hierarchy, and interaction locality of a software design.
However, to the best of our knowledge, interaction locality has not been formalized and
generally used in the derivation of software metrics.

3 Component dependency representation

While coupling represents the degree of interactions between two components,
“coupling” by itself does not explicitly express the directionality of the dependency
between the two components. For example, the statement “Component C1 is parameter
coupled with component C2” does not explicitly specify whether component C1 depends
on component C2 or if component C2 depends on component C1. Furthermore, the
definition of coupling is often associated with the relationship between only two
components. However, it frequently happens that one component may be coupled with
several components. Therefore, there is a need to explicitly define and formalize the
dependency relationship between components.

In our previous work (Yu 2007), we extended the concept of coupling and defined
component dependency as follows: Component Ci depends on component Cj if
changes made to Cj have a direct effect on the behavior of Ci (the word “direct”
means that the dependency is not via some third component). Component Ci is called
the dependency-inducing component and component Cj is called the dependent
component. In this paper, we continue to use this terminology.

With this notation, the dependency of a component can be represented with all its
dependent components. Here we utilize two notations to represent the dependency of one
component. The first is a graphical representation. This notation was first introduced in
(Yu 2007): A dashed arrow from component Ci to component Cj denotes that
component Ci is dependent on component Cj. The second notation is a matrix
representation. A two-column matrix is used; the first column lists the dependent
component names and the second column lists the coupling types. Suppose component
C1 is parameter coupled to (dependent on) components C2 and C3. Figure 2 is a graphical
representation of the dependency of C1. Table 2 is a matrix representation of the
dependency of C1.

Fig. 2 Dependency of component C1 (graphical representation)


Table 2 Dependency of component C1 (matrix representation)
Dependent component Coupling type
C2 Parameter coupling
C3 Parameter coupling
4 Layered component structure

While there have been several definitions of software components (Brown 1997; Leavens
and Sitaraman 2000), in this paper, we consider a component from a logical perspective
and define it as an integral logical constituent (Mei et al. 2001). According to this
definition, all artifacts (classes, programs, packages, and so on) can be considered as
components. In a software system, there are two types of components: primitive
components and compound components. A primitive component is defined as the smallest
manageable unit (class in object-oriented software and module in structured software). A
compound component is composed of primitive components and/or other compound
components. Therefore, a software system can be represented by a component tree: the
leaf nodes are primitive components and the internal nodes are compound components,
with the primitive components at height 1. The height of a compound component can be
recursively defined as one plus the maximum height of its descendent components.

Consider the following example: a software product consists of seven modules: C1, C2,
…, and C7. Modules C1, C2, and C3 are composed to form one component called
Main. Modules C4 and C5 are composed to form a component called Input. Modules
C6 and C7 are composed to form a component called Output. Input and Output are
composed to form a component called I/O. Main and I/O are composed to form the
product. This product can be represented as a component tree, as shown in Fig. 3. There
are seven components at height 1 (C1, C2, …, and C7), three components at height 2
(Input, Output, and Main), one component at height 3 (I/O), and the root, Product, is at
height 4.
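
The recursive height definition can be checked against this example with a short script. The following is a minimal sketch for illustration only, not part of the original study; the dictionary COMPOSITION and the function names are assumptions introduced here.

```python
# Hypothetical encoding of the example product: each compound component
# maps to the list of components it is composed of. Components that do
# not appear as keys are primitive components (leaves).
COMPOSITION = {
    "Product": ["Main", "I/O"],
    "Main": ["C1", "C2", "C3"],
    "I/O": ["Input", "Output"],
    "Input": ["C4", "C5"],
    "Output": ["C6", "C7"],
}


def height(component):
    """Return 1 for a primitive component, else 1 + the tallest child."""
    children = COMPOSITION.get(component, [])
    if not children:
        return 1
    return 1 + max(height(child) for child in children)


def components_by_height(root):
    """Group every component reachable from root by its height."""
    groups = {}
    stack = [root]
    while stack:
        node = stack.pop()
        groups.setdefault(height(node), []).append(node)
        stack.extend(COMPOSITION.get(node, []))
    return {h: sorted(names) for h, names in groups.items()}


if __name__ == "__main__":
    for h, names in sorted(components_by_height("Product").items()):
        print(h, names)
```

Under these assumptions the grouping reproduces the counts stated above: seven components at height 1, three at height 2, one at height 3, and the root at height 4.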

Fig. 3 A component tree example


In general, a software product can be represented as a component tree of height h with m
leaf nodes or primitive components PC1, PC2, …, PCm. In the case of an object-oriented
software product, the primitive components are classes; in a structured software product,
they are modules. Each internal node or compound component CCi, 1 ≤ i ≤ n, is a
composition of primitive and/or compound components situated lower in the tree. A
generic component tree is shown in Fig. 4, in which PC represents a primitive component
and CC represents a compound component. The height of a component represents its
composition scale.

Fig. 4 Generic layered component tree structure


5 Coupling and dependency in layered component-based software
Figure 5 is a coupling and dependency example in a layered component-based software
product. It shows that the primitive component PC2 is dependent on primitive
components PC1, PC3, and PC4. It also shows that the compound component CC1 is
dependent on compound components CC2 and CC3.

Fig. 5 Graphical representation of component dependency

As mentioned in Sect. 2, the couplings defined in Table 1 reflect the interactions of two
software components. However, they do not reflect the structure of the product and cannot
accurately represent the dependency of compound components. As in Fig. 5, PC2 is
dependent on PC1, PC3, and PC4. Now suppose that these couplings are of the same
type (say, parameter coupling). If we consider the dependency of PC2 itself, there is no
difference among these couplings. However, if we consider the dependency of CC1, the
coupling between PC2 and PC1 and the coupling between PC2 and PC3 must be handled
differently. In a properly designed software system, related modules are composed into the same
component. The coupling between PC2 and PC1 does not affect the dependency of CC1,
but the couplings between PC2 and PC3 and between PC2 and PC4 may affect the
dependency of CC1. Similarly, the coupling between PC2 and PC3 and the coupling
between PC2 and PC4 have different effects on the dependency of CC6.

Therefore, traditional coupling definitions that consider only the type of dependency
between primitive components are insufficient to describe dependencies between
compound components. To consider the dependency between compound components, we
present a metric C (t, d) to measure the coupling between two components, where C
stands for coupling, with t representing the coupling type and d the coupling distance.
Thus, this metric has two parameters: coupling type and coupling distance. While
coupling type is determined by the nature of interactions between two software
components as defined in Table 1, coupling distance is determined by the relative
location of the two components in the component tree. Hence, every coupling between
two components has an associated coupling type and coupling distance.
In the following subsections, we discuss how to represent component coupling
and component dependency using the coupling distance parameter.

5.1 Coupling representation of primitive components


In this subsection, we consider only primitive components, that is, the leaf components in
the component tree. First, we explain how to represent the coupling metric C (t, d). As
mentioned before, the coupling type is determined by the interactions between the two
modules. The coupling distance is defined below.

Definition 1 The coupling distance between two primitive components PCi and PCj is
the height of the lowest common ancestor of PCi and PCj in the component tree.

For example, in Fig. 5, supposing that all the coupling types are parameter coupling, the
lowest common ancestors for primitive components PC2 and PC1, PC2 and PC3, and
PC2 and PC4 are CC1, CC6, and CC8, respectively. The couplings between these
primitive components can be represented as C (parameter coupling, 2), C (parameter
coupling, 3), and C (parameter coupling, 4), respectively. Note that there may exist more
than one coupling between two modules. In this case, each of them is represented by the
corresponding C (t, d).
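
Definition 1 translates directly into a lowest-common-ancestor computation. The sketch below is illustrative only. Fig. 5 is not reproduced in the text, so the PARENT map is an assumed tree chosen solely to agree with the ancestors quoted above (CC1, CC6, and CC8); the intermediate node CC7 and all function names are hypothetical.

```python
# Assumed parent links, chosen only to be consistent with the lowest
# common ancestors quoted in the text; the real Fig. 5 may have more nodes.
PARENT = {
    "PC1": "CC1", "PC2": "CC1",
    "PC3": "CC2", "PC4": "CC3",
    "CC1": "CC6", "CC2": "CC6",
    "CC3": "CC7",
    "CC6": "CC8", "CC7": "CC8",
}


def ancestors(node):
    """Return [node, parent, grandparent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def height(node):
    """Height of a node: 1 for a leaf, else 1 + the tallest child."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return 1
    return 1 + max(height(child) for child in children)


def lowest_common_ancestor(a, b):
    """First node on b's path to the root that also lies on a's path."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("components are not in the same tree")


def coupling_distance(a, b):
    """Coupling distance per Definition 1: height of the LCA."""
    return height(lowest_common_ancestor(a, b))


def coupling(coupling_type, a, b):
    """Two-parameter metric C (t, d) as a (type, distance) tuple."""
    return (coupling_type, coupling_distance(a, b))


if __name__ == "__main__":
    for other in ("PC1", "PC3", "PC4"):
        print("PC2", other, coupling("parameter coupling", "PC2", other))
```

With this assumed tree, the three couplings of PC2 come out with distances 2, 3, and 4, in agreement with the example.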

5.2 Dependency representation of primitive components


Next, we explain how to represent the dependency of a primitive component. In Sect. 3
we introduced the dependency of a component and represented it by a two-column matrix
without considering coupling distance. Here, we introduce a three-parameter matrix to
represent the dependency of a primitive component. The three parameters are dependent
component, coupling type, and coupling distance. Accordingly, the dependency of PC2 in
Fig. 5 can be represented by the matrix in Table 3.
Table 3 Dependency of primitive component PC2
Dependent component Coupling type Coupling distance
PC1 Parameter coupling 2
PC3 Parameter coupling 3
PC4 Parameter coupling 4
5.3 Dependency representation of compound components
As described in Sect. 2, dependency of a compound component is determined by the
dependencies of its underlying primitive components. Therefore, we can represent the
dependency of a compound component by using the dependencies of its primitive
components. To calculate the coupling distance of a compound component dependency,
we only need to consider the dependencies of its underlying primitive components whose
coupling distances are larger than the height of this compound component. If the coupling
distance of a dependency is smaller than or equal to the height of a compound
component, this dependency is within the same compound component. It does not affect
the dependency of the compound component. Using this approach, the dependency of
compound component CC1 in Fig. 5 can be represented as a three-parameter matrix
shown in Table 4. Coupling between PC1 and PC2 is not of concern here.
Table 4 Dependency of compound component CC1 (primitive component representation)
Dependent component Coupling type Coupling distance
PC3 Parameter coupling 3
PC4 Parameter coupling 4
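
The step from Table 3 to Table 4 is a filter on coupling distance. The following minimal sketch illustrates it; the function name and the tuple layout are assumptions made for this example, and the height of CC1 is taken as 2, as implied by the distance-2 coupling between PC1 and PC2.

```python
def compound_dependency(compound_height, primitive_dependencies):
    """Keep only the dependencies that cross the compound component boundary.

    primitive_dependencies holds rows of the form
    (dependent component, coupling type, coupling distance), collected from
    the primitive components that the compound component contains.
    A row is kept only if its distance exceeds the compound component height.
    """
    return [
        row for row in primitive_dependencies
        if row[2] > compound_height
    ]


# Rows of Table 3 (dependency of PC2, a primitive component of CC1).
PC2_ROWS = [
    ("PC1", "Parameter coupling", 2),
    ("PC3", "Parameter coupling", 3),
    ("PC4", "Parameter coupling", 4),
]

# CC1 is assumed to have height 2, so the PC1 row is internal to CC1.
CC1_ROWS = compound_dependency(2, PC2_ROWS)

if __name__ == "__main__":
    for row in CC1_ROWS:
        print(row)
```

The same call with a height of 3 would keep only the PC4 row, which corresponds to viewing the dependency from the height-3 ancestor of PC2.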
Sometimes, we are interested in the coupling between two compound components. For
example, the primitive components may be opaque, so that we cannot access (or are not
interested in) the lower level components. In this case, we treat the two coupled
compound components as the basic units, the height-1 components. Then, we need to
redetermine the tree structure of the software product. For example, in Fig. 5, if we
consider CC1, CC2, CC3, CC4, and CC5 as the basic units (height-1 components), we can
redraw the component tree as shown in Fig. 6. We can then recalculate the coupling distances.
The coupling between CC1 and CC2 can then be represented as C (parameter
coupling, 2), and the coupling between CC1 and CC3 can be represented as C (parameter
coupling, 3). The dependency of CC1 can be represented by the three-parameter matrix
shown in Table 5.

Fig. 6 Dependency tree with redefined height 1 components


Table 5 Dependency of compound component CC1 (specified dependent component
representation)
Dependent component Coupling type Coupling distance
CC2 Parameter coupling 2
CC3 Parameter coupling 3

The first column in Table 5 (dependent component) contains specified components,
which could be either primitive or compound. Both Table 4 and Table 5 contain the same
information about the dependency of component CC1. We could use either
representation as the situation requires. In the remainder of this paper, we use the
primitive component representation (Table 4) to represent the compound component
dependency.

5.4 Discussion

C (t, d), the two-parameter representation of component coupling, has one advantage over
the traditional one-parameter representation. The second parameter, coupling distance,
represents the relevance of two coupled components. Usually, in software design,
relevant functions (methods) are grouped into one module (class) and relevant modules
(classes) are grouped into one package, and so on. Therefore, larger values of coupling
distance are less favorable than smaller values, because a larger value
normally represents the presence of coupling between two relatively unrelated
components. With respect to program comprehension and understandability, coupling
between related components is easier to understand. For component maintenance,
changes to a software component may have effects on other components due to
component coupling; a smaller coupling distance is preferable to a larger one,
because a smaller coupling distance indicates that any adverse effects are localized
within a small-scale component, which is easier to manage.

A distance-1 coupling implies the coupling is within a component and it does not affect
the independence of the component, which makes this component highly independent of
other components. A distance-2 coupling indicates that the coupling is between
components that have the same parent (one-height-up) component and hence can be more
relevant than other larger distance couplings.

Therefore, coupling distance, together with the coupling type specified in Table 1,
composes a valuable two-parameter coupling metric, C (t, d). This metric can be used
not only to compare the degree of dependencies brought about by different types of coupling,
but also to compare the degree of relevance of the same type of coupling. For
example, considering component CC1 in Fig. 5, we can infer that the coupling between CC1
and CC2 is more favorable than the coupling between CC1 and CC3, even though
they have the same coupling type (parameter coupling), because they have different
coupling distances.

In summary, our two-parameter coupling metric and three-parameter dependency metric
reveal the deep dependency relationships among components and are applicable to all
types of coupling between components of any scale.

6 Determination of dependencies

It is clear from the above discussions that software dependency is largely induced by the
presence of software coupling. It is easy to automatically determine parameter coupling
and inheritance coupling. Parameter coupling is induced via function calls or message
passing. For example, if module m1 invokes a function (method) implemented in module
m2, we say m1 is parameter coupled to (dependent on) m2. Dependencies induced by
inheritance coupling can be identified by language-specific keywords or semantics. For
example, Java uses the keyword extends to represent class inheritance. If module m1
inherits from module m2, we say m1 is inheritance coupled to (dependent on) m2.

In contrast, dependencies induced by common coupling and external coupling are more
complicated. Common coupling between two modules is identified with the definition
and use of a global variable: a definition of a variable x is a statement that assigns a value
to x, such as x = 5; a use of a variable x is a statement that utilizes the value of x, such
as if (x > 6) return. Because definitions can affect uses but uses cannot affect definitions,
dependencies between components induced by global variables are induced by the
definition–use relationship (Yu et al. 2004). For example, if module m1 uses a global
variable that is defined in module m2, we say m1 is common coupled to (dependent on)
m2.
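
The definition–use rule can be sketched as a small derivation over per-module definition and use sets. This is an illustration only, not the tooling used in the study. The input format (dictionaries of sets) and the function name are assumptions, and pairs within a single module are skipped here because they do not induce a dependency between components.

```python
from collections import defaultdict


def common_coupling_dependencies(definitions, uses):
    """Derive (user, definer, variable) dependencies from def-use data.

    definitions and uses map a module name to the set of global variable
    names that the module defines (assigns) or uses (reads).
    A module that uses a variable depends on every other module that
    defines it; the reverse direction induces no dependency.
    """
    definers = defaultdict(set)
    for module, variables in definitions.items():
        for variable in variables:
            definers[variable].add(module)

    dependencies = set()
    for module, variables in uses.items():
        for variable in variables:
            for definer in definers.get(variable, ()):
                if definer != module:  # same-module pairs stay internal
                    dependencies.add((module, definer, variable))
    return sorted(dependencies)


if __name__ == "__main__":
    # m2 defines x (x = 5); m1 and m2 both use x (if (x > 6) return).
    defs = {"m2": {"x"}}
    reads = {"m1": {"x"}, "m2": {"x"}}
    print(common_coupling_dependencies(defs, reads))
```

For the example in the text, the only derived dependency is that of m1 on m2 through x.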

External coupling between two modules is identified with the write and read operations
on the same external medium, such as a file or a database. A write operation
changes the content of the external medium and a read operation utilizes the content of
the external medium. Because write operations might affect modules that read the same
external medium but read operations cannot affect modules that read/write to the same
external medium, dependencies between components induced by an external medium are
induced by the write–read relationship. For example, if module m1 reads a file and
utilizes the content that is written by module m2, we say m1 is external coupled to
(dependent on) m2.

7 Case studies of Apache HTTP


7.1 Overview
In this paper, we analyze Apache HTTP,2 an open-source software product. The Apache
HTTP is a project to develop and maintain an open-source web server for modern
operating systems including UNIX, Linux and Windows. Because Apache HTTP is
designed to run on different platforms, some code must be easily extensible and
customizable. However, to make the project manageable and the product maintainable,
the amount of code to be rewritten for different platforms must be minimized. To
solve this maintenance and reuse issue, Apache HTTP is built as a series of code
modules. Figure 7 shows the tree structure of Apache version 2.2, which can be
considered approximately as a height-4 layered system in which the primitive components
(modules) are the “.c” files.

Fig. 7 Component tree structure of Apache HTTP 2.2


In Apache HTTP, there are six height-3 compound components, among which modules is
the most important one; it contains 17 height-2 compound components and 107 primitive
components. These components are expected to be reused in different platforms. Data
regarding the number of primitive components of Apache HTTP version 2.2 is provided
in Table 6.
Table 6 The data about Apache version 2.2
Height-3 component   os   test   modules   support   srclib   server
Number of modules    14   8      107       10        299      44

As mentioned before, the most important height-3 compound component in Apache is
modules, which contains platform independent functions. Therefore, to facilitate
software maintenance and reuse, we expect that modules would be designed with weak
dependencies. In this research, we use our method to study coupling within modules and
between modules and other components in order to understand the component
dependency of the entire Apache system.

Because Apache HTTP is structured software written in the C language, there is no
inheritance coupling (structured software does not have inheritance coupling). External
coupling is related to the write and read operations on the same file/database. Modern
programs no longer use such operations to control program flow but only to store and
retrieve permanent data. Therefore, we decided to study parameter coupling and common
coupling in Apache HTTP. These two types of coupling are most commonly apparent in
structured software. The coupling data is obtained via the source code cross reference
tool, lxr 3.
7.2 Component dependencies induced via parameter coupling
Apache version 2.2 is a height-4 tree structure and modules is a height-3 compound
component. Table 7 shows the dependency of modules induced by parameter coupling.
Multiple occurrences of the same dependent component are counted as one unique
dependent component. Coupling distances 1, 2, and 3 exist within the primitive
components of modules; coupling distance 4 exists between primitive components of
modules and primitive components of other height-3 compound components.
Table 7 Dependency of compound component modules induced by parameter coupling
Coupling distance   Number of dependent components   Unique number of dependent components
1                   23                               19
2                   86                               21
3                   13                               3
4                   881                              67

Consider the dependency of a single primitive component in modules. A total of 107
primitive components in modules are dependent on 1,003 components, i.e., on average,
each primitive component invokes functions implemented in 9.37 components, most of
which belong to other height-3 compound components. According to the interaction
locality design principle, these distance-4 parameter couplings should be considered for
restructuring to reduce the coupling distance and thereby the system complexity.

Consider the dependency of modules as a whole. The 107 primitive components in modules
are dependent on 67 components (couplings of distance 1, 2, and 3 do not affect the
dependency of modules as a whole). The fact that modules has fewer dependent
components (67) than its included number of components (107) indicates that modules is
well designed with respect to component dependency as a whole.
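
The figures quoted in the two paragraphs above follow from Table 7 by simple arithmetic. The short script below reproduces them as a check; the variable names are introduced here for illustration only.

```python
# Rows of Table 7: coupling distance -> (dependent components, unique ones).
TABLE_7 = {1: (23, 19), 2: (86, 21), 3: (13, 3), 4: (881, 67)}
PRIMITIVE_COMPONENTS_IN_MODULES = 107
MODULES_HEIGHT = 3  # modules is a height-3 compound component

# Total dependencies of the primitive components in modules.
total = sum(count for count, _ in TABLE_7.values())

# Average number of components each primitive component depends on.
average = total / PRIMITIVE_COMPONENTS_IN_MODULES

# Only distances larger than the height of modules affect modules as a whole.
external_unique = sum(
    unique for distance, (_, unique) in TABLE_7.items()
    if distance > MODULES_HEIGHT
)

if __name__ == "__main__":
    print(total)               # 1003
    print(round(average, 2))   # 9.37
    print(external_unique)     # 67
```

The total of 1,003 is the sum of the second column, 9.37 is that total divided by 107, and 67 is the unique count of the only row whose distance exceeds the height of modules.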

7.3 Component dependencies induced via common coupling

Because the determination of common coupling is associated with the definition-use
analysis of global variables, to simplify the representation, we define a layer-L global
variable as one that induces distance-L common coupling in a height-H component tree, in
which 1 ≤ L ≤ H.

Definition 2 A layer-L (1 ≤ L ≤ H) global variable appears in primitive components with
the same height-L ancestor, thereby inducing a distance-L common coupling.
Because Apache version 2.2 is a height-4 tree structure, the highest layer of a global
variable could be layer-4. Table 8 summarizes the global variables that appear in modules. A
layer-1 global variable appears within a single primitive component and induces a
distance-1 dependency. The functions that use the layer-1 global variable depend on the
functions that define the global variable within the same primitive component. However,
a distance-1 dependency does not induce dependencies between components. Similarly, a
distance-2 coupling does not affect the independence of the height-2 compound
component as a whole. Therefore, lower layer global variables are more favorable than
higher layer global variables.
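
Definition 2 can be read as a lowest-common-ancestor computation on the component tree. The sketch below is one possible reading, not the tooling used in this study: it assumes a strictly layered tree in which every primitive component sits at the same depth, represents each component by a hypothetical path of ancestor names below the root, and uses illustrative directory placements.

```python
from itertools import combinations

def coupling_distance(path_a, path_b):
    """Height of the lowest common ancestor of two primitive components.

    Each path lists the ancestors below the root, ending with the primitive
    component itself. All paths must have the same length (layered tree).
    """
    if len(path_a) != len(path_b):
        raise ValueError("layered tree expected: paths must have equal length")
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    # A primitive component has height 1; each level up adds one.
    return len(path_a) - shared + 1

def variable_layer(paths):
    """Layer of a global variable: the largest distance among the components using it."""
    unique = sorted(set(paths))
    if not unique:
        raise ValueError("a global variable must appear in at least one component")
    if len(unique) == 1:
        return 1
    return max(coupling_distance(a, b) for a, b in combinations(unique, 2))

# Illustrative placements in a height-4 tree (root omitted from the paths).
mod_cache = ("modules", "cache", "mod_cache.c")
cache_storage = ("modules", "cache", "cache_storage.c")
mod_status = ("modules", "generators", "mod_status.c")
prefork = ("server", "mpm", "prefork.c")

print(variable_layer([mod_cache]))                 # 1
print(variable_layer([mod_cache, cache_storage]))  # 2
print(variable_layer([mod_status, prefork]))       # 4
```
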
Table 8 Global variables in Apache compound component modules
Layer number                 1     2   3   4
Number of global variables   120   7   0   20
First, we evaluate the dependency of the compound component modules. Since modules
is a height-3 component, only layer-4 global variables in Table 8 can affect the
dependency of modules. Among the 20 layer-4 global variables, only three variables
induce dependency of modules on other components (because only the definition of a
global variable can affect the use of the global variable; see footnote 4). Table 9 presents
the dependency of component modules induced by global variables using matrix
representation.
Table 9 Dependency of component modules induced by common coupling
Dependent component   Coupling type     Coupling distance
leader.c              Common coupling   4
mpm_winnt.c           Common coupling   4
worker.c              Common coupling   4
perchild.c            Common coupling   4
threadpool.c          Common coupling   4
prefork.c             Common coupling   4
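
The reasoning that only layer-4 variables matter for a height-3 component can be generalized as "a layer-L variable can affect a height-H component only if L > H". That rule is an extrapolation from the statements above about distance-1 and distance-2 coupling, not a formula given in the text. A short sketch using the counts of Table 8:

```python
# Table 8: layer number -> number of global variables in modules.
globals_per_layer = {1: 120, 2: 7, 3: 0, 4: 20}

def candidate_globals(component_height, per_layer):
    """Count global variables whose layer exceeds the component's height.

    Only these variables can make the component depend on something
    outside itself; the rest are confined to the component.
    """
    return sum(n for layer, n in per_layer.items() if layer > component_height)

print(candidate_globals(3, globals_per_layer))  # 20 (modules as a whole)
print(candidate_globals(1, globals_per_layer))  # 27 (a primitive component)
```

Being a candidate does not by itself imply a dependency: a variable induces one only where a definition in another component reaches a use in this one.
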
Table 10 shows the dependencies of all primitive (height 1) components in modules
induced by common coupling. Among 107 primitive components, only eight of them
have dependencies on other components via global variables. The first column lists the
eight dependency-inducing components in modules; the other three columns are the
matrix representations of the dependencies of each dependency-inducing component.
Table 10 Dependency of primitive components in modules via common coupling
Dependency-inducing component   Dependent component    Coupling type     Coupling distance
mod_win32.c                     mpm_winnt.c            Common coupling   4
cache_storage.c                 mod_cache.c            Common coupling   2
cache_util.c                    mod_cache.c            Common coupling   2
repos.c                         locks.c                Common coupling   2
                                dbm.c                  Common coupling   2
mod_dav_lock.c                  locks.c                Common coupling   2
mod_case_filter.c               mod_case_filter_in.c   Common coupling   2
mod_case_filter_in.c            mod_case_filter.c      Common coupling   2
mod_status.c                    leader.c               Common coupling   4
                                worker.c               Common coupling   4
                                mpm_winnt.c            Common coupling   4
                                perchild.c             Common coupling   4
                                threadpool.c           Common coupling   4
                                prefork.c              Common coupling   4

To summarize, the height-3 compound component modules of Apache version 2.2 contains
107 primitive components. Among them, 99 components are independent with respect
to common coupling, six components have distance-2 common coupling, and two
components have distance-4 common coupling. Because common coupling, especially
larger-distance common coupling, is a barrier to software maintenance, these eight
components are potential targets for restructuring to improve the maintainability of
Apache.

Consider the design quality of modules: 93% (99/107) of its primitive components are
well designed from the viewpoint of common coupling. Changes to other components will
not affect any of them via global variables, and reuse of any of these 99 components does
not need to consider dependencies on other components via global variables.
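
The summary figures can be re-derived from Table 10. The sketch below transcribes the table rows by hand and tallies each dependency-inducing component by its largest common coupling distance; the data structure is illustrative only.

```python
# Rows of Table 10: (component in modules, component it depends on, distance).
table10 = [
    ("mod_win32.c", "mpm_winnt.c", 4),
    ("cache_storage.c", "mod_cache.c", 2),
    ("cache_util.c", "mod_cache.c", 2),
    ("repos.c", "locks.c", 2),
    ("repos.c", "dbm.c", 2),
    ("mod_dav_lock.c", "locks.c", 2),
    ("mod_case_filter.c", "mod_case_filter_in.c", 2),
    ("mod_case_filter_in.c", "mod_case_filter.c", 2),
] + [
    ("mod_status.c", target, 4)
    for target in ("leader.c", "worker.c", "mpm_winnt.c",
                   "perchild.c", "threadpool.c", "prefork.c")
]
TOTAL_PRIMITIVE = 107

# Largest common coupling distance seen for each dependency-inducing component.
max_distance = {}
for component, _target, distance in table10:
    max_distance[component] = max(distance, max_distance.get(component, 0))

# Number of components per largest distance.
by_distance = {}
for distance in max_distance.values():
    by_distance[distance] = by_distance.get(distance, 0) + 1

independent = TOTAL_PRIMITIVE - len(max_distance)
print(by_distance)                                   # {4: 2, 2: 6}
print(independent)                                   # 99
print(round(100 * independent / TOTAL_PRIMITIVE, 1)) # 92.5
```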

7.4 Validation study

We have presented a 2-parameter coupling metric and a 3-parameter component
dependency metric. Both metrics include coupling distance as an important parameter.
It was assumed that larger distance coupling values have more detrimental effects than
smaller distance coupling values with respect to software understandability,
maintainability and reusability.

To further validate this assumption, in this section we perform an empirical study
on Apache HTTP to investigate the relationship between component dependency
(represented by coupling distance) and the external properties of software products. The
empirical study was performed on the primitive components of modules in Apache
HTTP version 2.2.

The coupling metrics presented in this paper are two-dimensional and contain both
coupling type and coupling distance. To avoid the cross-effects of coupling type on the
results, we studied parameter coupling and common coupling separately.

First, we define two evaluation metrics, D_parameter (parameter coupling distance) and
D_common (common coupling distance). D_parameter of a component equals the sum of the
parameter coupling distances of all its dependent components and is expressed by the
formula $D_{parameter} = \sum_{i=1}^{n} d_i$, where $d_i$ is the parameter coupling
distance of the i-th of its n dependent components. D_common of a component equals the
sum of the common coupling distances of all its dependent components and is expressed
by the formula $D_{common} = \sum_{j=1}^{m} c_j$, where $c_j$ is the common coupling
distance of the j-th of its m dependent components. The D_parameter and D_common values
of all 107 primitive components of modules in Apache HTTP were calculated by inspecting
the source code of version 2.2 using lxr.
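
As a worked illustration of the definition, the sketch below sums coupling distances per component for two components taken from Table 10. It treats the "dependent component" column as the components on which the row's component depends, as the surrounding text suggests; the function and variable names are hypothetical.

```python
from collections import defaultdict

def dependency_values(rows):
    """Sum the coupling distances of each component's dependent components."""
    totals = defaultdict(int)
    for component, _dependent, distance in rows:
        totals[component] += distance
    return dict(totals)

# Common coupling rows for two components, transcribed from Table 10.
rows = [
    ("repos.c", "locks.c", 2),
    ("repos.c", "dbm.c", 2),
    ("mod_status.c", "leader.c", 4),
    ("mod_status.c", "worker.c", 4),
    ("mod_status.c", "mpm_winnt.c", 4),
    ("mod_status.c", "perchild.c", 4),
    ("mod_status.c", "threadpool.c", 4),
    ("mod_status.c", "prefork.c", 4),
]

d_common = dependency_values(rows)
print(d_common["repos.c"])       # 4
print(d_common["mod_status.c"])  # 24
```

Under this reading, a component with no row in the table has a D_common value of 0.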

Second, we count the M value, which is the number of times these dependency-inducing
components have been modified, based on the change history of Apache HTTP. For this
measurement, we used the CVS log of Apache HTTP (see footnote 5), which records a
complete revision history of all the components, is available online, and supports easy
data extraction. We used a self-written Perl program to obtain the change record
information for each of the 107 primitive components and to count the number of times
it was modified from its first version to the current version 2.2.
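
The Perl program itself is not reproduced here. As a rough idea of what such a count involves, the following Python sketch tallies revision entries per file in text shaped like the output of `cvs log`. The sample log is made up, and whether the initial revision should count as a modification is an assumption left to the reader.

```python
import re

WORKING_FILE = re.compile(r"^Working file: (.+)$")
REVISION = re.compile(r"^revision \d+(?:\.\d+)+$")

def count_revisions(log_text):
    """Return {working file: number of revision entries} for a CVS log dump."""
    counts = {}
    current = None
    previous = ""
    for raw in log_text.splitlines():
        line = raw.rstrip()
        header = WORKING_FILE.match(line)
        if header:
            current = header.group(1)
            counts.setdefault(current, 0)
        elif (current is not None and previous and set(previous) == {"-"}
              and REVISION.match(line)):
            # Only count "revision x.y" lines that follow a dashed separator,
            # so a commit message starting with "revision" is not miscounted.
            counts[current] += 1
        previous = line
    return counts

# Made-up log fragment for illustration.
sample_log = """\
RCS file: /repo/modules/cache/mod_cache.c,v
Working file: modules/cache/mod_cache.c
head: 1.2
----------------------------
revision 1.2
date: 2004/02/01 10:00:00;  author: a;  state: Exp;  lines: +2 -1
revision of header handling
----------------------------
revision 1.1
date: 2004/01/01 09:00:00;  author: a;  state: Exp;
initial version
=============================================================================
"""

print(count_revisions(sample_log))  # {'modules/cache/mod_cache.c': 2}
```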

Finally, we test the correlation between D_parameter and M and between D_common and M.
We expect to find that a component with a larger dependency value also has a larger
number of modifications; therefore, we test the following null hypotheses.

H01: There is no linear relationship between the parameter coupling distance of a
component and the number of modifications made on this component.

H02: There is no linear relationship between the common coupling distance of a
component and the number of modifications made on this component.

To test these hypotheses, we need to calculate the correlation coefficient that
indicates the strength of the relationship between the two variables: the independent
variable, component dependency value (D_parameter or D_common), and the dependent
variable, the number of modifications (M) made to the component in the software revision
history. Several different correlation coefficients have been put forward, including
Pearson’s correlation coefficient and Spearman’s rank correlation coefficient (Nolan
1994). For Pearson’s correlation coefficient to be valid, the two variables should be
normally distributed. However, in this case, it is unlikely that either of these two
variables has a normal distribution. Therefore, we use Spearman’s rank correlation
coefficient. If the rank correlation coefficient proves to be statistically significant at the
0.05 level, we will reject the null hypothesis.
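
For readers unfamiliar with the statistic, Spearman's coefficient is Pearson's coefficient computed on ranks, with tied values sharing the average of their ranks. A minimal pure-Python sketch follows; the data in the usage lines are made up and are not measurements from this study.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman(xs, ys):
    """Spearman rank correlation between two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))

# Made-up dependency values and modification counts for five components.
dependency = [0, 2, 4, 8, 24]
modifications = [5, 6, 7, 8, 7]
print(round(spearman(dependency, modifications), 3))
```

Because only ranks enter the computation, the coefficient is insensitive to the skewed distributions that make Pearson's coefficient unsuitable here.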

The results of the hypothesis tests are in Table 11. The scatter plots showing the
relationship between parameter/common coupling distance and the number of
modifications are in Figs. 8 and 9. Figure 8 shows the measurements and Fig. 9 shows the
ranks of the measurements. Dashed linear trendlines are displayed in Fig. 9. In both tests,
the correlations are significant at the 0.01 level (two-tailed). Therefore, we reject the null
hypotheses and conclude that there is a significant linear correlation between the
dependency value (D_parameter or D_common) of a component and the number of
modifications (M) made to this component.
Table 11 The results of hypothesis tests
Hypothesis   Number of pairs of data   Correlation coefficient   Significance
H01          107                       0.402                     0.01
H02          107                       0.299                     0.01
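
The significance levels in Table 11 can be sanity-checked with the common large-sample approximation t = r * sqrt((n - 2) / (1 - r^2)), compared against Student's t distribution with n - 2 degrees of freedom. The text does not state which test procedure was used, so this is only a plausibility check, and the critical value below is an approximate figure from standard t tables.

```python
import math

def spearman_t(r, n):
    """Approximate t statistic for a rank correlation r over n pairs."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Approximate two-tailed 0.01 critical value for about 105 degrees of freedom
# (assumed from standard t tables, not taken from the paper).
CRITICAL_T = 2.62

results = {}
for name, r in (("H01", 0.402), ("H02", 0.299)):
    t = spearman_t(r, 107)
    results[name] = (t, t > CRITICAL_T)
    print(name, round(t, 2), t > CRITICAL_T)
```

Both statistics come out well above the critical value (roughly 4.5 and 3.2), which is consistent with the reported 0.01 significance.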

Fig. 8 The scatter plot of the number of modifications of a component versus (a)
parameter coupling distance; and (b) common coupling distance

Fig. 9 The scatter plot of rank of number of modifications of a component versus (a) rank
of parameter coupling distance; and (b) rank of common coupling distance

It is worth noting that a significant correlation between the independent variable
(coupling distance) and the dependent variable (number of modifications) does not
demonstrate a cause–effect relationship. It only gives empirical evidence that these two
variables are related (either directly or through a third variable). Only an experiment in
which all other factors are fixed can help us derive the cause–effect relationship. Such an
experiment would require that the only difference between components is coupling
distance, and that all other factors, such as code size and code structure, remain the same.
This can be done only through a controlled experiment and not on a real-world software
product. Therefore, we were not able to verify the causation in this study.

The number of modifications made to a component is related to the quality and the
complexity of the component. A component may be modified for various reasons. For
instance, an error found in a component or a requirement to improve its functionality
could result in a direct modification to the component. Moreover, because components
are interdependent, changes made to other components could indirectly require
modifications to this component. Therefore, we can assume that the number of direct
modifications represents the quality of the component, such as stability and fault density,
while the number of indirect modifications represents the complexity of the component.
Note that a component is said to be complex when it is interrelated with many other
components, so that changes made to other components require corresponding changes
to this component.

The fault density and complexity of a component are directly related to its dependency.
Similar conclusions have been reached in earlier work (Kafura and Henry 1981; Selby
and Basili 1991; Troy and Zweben 1981). In these studies, the relationship between
coupling type and software quality was established. Our empirical study further reveals
the relationship between coupling distance and software quality measures: larger distance
coupling values have more detrimental effects than smaller distance coupling values on
software quality, including understandability, maintainability and reusability.

8 Conclusions, threats to validity, and future research

In this paper, we proposed a coupling metric and a dependency metric for component-
based software. Both metrics use a new and potentially important parameter, coupling
distance, which measures the relevance between two coupled components. If a
software system can be represented as a layered component tree structure, the coupling
distance can be determined easily from the heights of the two components in the tree. As
a case study, we evaluated the dependency of Apache version 2.2 based on parameter
coupling and common coupling. A validation study found that linear relations exist
between coupling distance and component quality.

There are several threats to the validity of our study. One threat to internal validity is the
accuracy of data. To reduce this threat, we use both open-source and self-written tools to
extract coupling data and modification data in order to avoid manual counting mistakes.
Another internal threat is that we only investigated the parameter coupling distance and
common coupling distance. Due to the limitation of Apache HTTP, we did not investigate
the inheritance coupling distance and external coupling distance. Therefore, to reduce this
threat, we plan to study other software products, including object-oriented software to
validate the relationship between external/inheritance coupling distance and software
qualities. The third internal threat comes from the measurement: coupling data is obtained
from one specific version of Apache HTTP (version 2.2) while the modification data of a
component is obtained from all versions of Apache HTTP. To reduce this internal threat,
more coupling data on different versions of Apache HTTP should be obtained and
examined against the modification data.

One construct threat to validity is the construction of the tree structure of Apache HTTP.
Currently, we use the package structure to represent component structure, which might
not be representative of the system architecture. Another construct threat to validity is
that our coupling analysis is based only on static analysis and we did not consider
dynamic run-time coupling/dependencies. In static analysis, we only considered acyclic
dependencies, i.e., sets of dependencies with no recursive references. During run-time,
recursive or cyclic dependencies could exist between software components. The external
threat to validity is that the study performed on Apache HTTP may not be representative
of other component-based software products. To reduce these threats, more studies with
dynamic analysis should be performed on other software systems.

Due to the observed importance of coupling distance, our studies have the following
impacts on software design metrics, which also capture our future research directions:
1. Object-Oriented Design: the measurement of class coupling could be refined by
integrating coupling distance. For example, the CBO metric presented by Chidamber
and Kemerer (Chidamber and Kemerer 1994) only considers the number of objects
coupled to a specified object. In fact, different objects may have different relevancies
to a specified object and by applying the coupling distance parameter, the CBO metric
could be refined and revalidated.
2. Structured Design: the measurement of architecture design could also be refined. For
example, Card and Glass (Card and Glass 1990) defined the structural complexity of a
specified module as the square of the fan-out of a module. Fan-out is the number of
modules that are directly invoked by this specified module. In this paper, we show that
different modules may have different relevancies to a specified module. Therefore,
new metrics for structural complexity, data complexity, and system complexity could
be derived if the coupling distance parameter is introduced within these measurements.
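
To make the second direction concrete, the sketch below contrasts the structural complexity described above (the square of the fan-out) with one hypothetical distance-weighted variant in which each invoked module contributes its coupling distance instead of a count of one. The weighted form is purely illustrative; it is not a metric defined or validated in this paper.

```python
def structural_complexity(fan_out):
    """Square of the number of directly invoked modules, as described above."""
    return fan_out ** 2

def distance_weighted_complexity(distances):
    """Hypothetical variant: each invoked module counts as its coupling distance."""
    return sum(distances) ** 2

# Made-up coupling distances of the four modules invoked by some module.
calls = [1, 2, 2, 4]
print(structural_complexity(len(calls)))    # 16
print(distance_weighted_complexity(calls))  # 81
```

Two modules with the same fan-out would then be distinguished by how far away their callees lie in the component tree.
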

Acknowledgements This work was based, in part, upon research supported by the
National Science Foundation (CNS-0619069, EPS-0701890 and OISE 0650939), Acxiom
Corporation (# 281539) and NASA EPSCoR Arkansas Space Grant Consortium (#
UALR 16804). Any opinions, findings, and conclusions or recommendations expressed
in this material are those of the author(s) and do not necessarily reflect the views of the
funding agencies. The authors would like to thank Professor Stephen R. Schach of
Vanderbilt University for his many suggestions. The authors would also like to thank the
anonymous reviewers for their valuable comments and suggestions, which greatly
improved the earlier version of this paper.

References
Abdurazik, A. (2007). Coupling-based analysis of object-oriented software, Ph.D.
Dissertation, George Mason University. Available at:
http://www.ise.gmu.edu/~ofut/rsrch/aynur-dissertation.pdf.

Banker, R. D., Datar, S. M., Kemerer, C. F., & Zweig, D. (1993). Software complexity and
maintenance costs. Communications of the ACM, 36(11), 81–94.

Basili, V. R., Briand, L. C., & Melo, W. L. (1996). A validation of object-oriented design
metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10), 751–
761.

Berns, G. M. (1984). Assessing software maintainability. Communications of the ACM,
27(1), 14–23.

Biggerstaff, T. J., & Perlis, A. J. (1989). Software reusability: Concepts and models (Vol.
1). New York, NY: ACM Press.

Briand, L. C., Daly, J. W., & Wüst, J. K. (1999). A unified framework for coupling
measurement in object-oriented systems. IEEE Transactions on Software Engineering,
25(1), 91–121.

Briand, L. C., Morasca, S., & Basili, V. R. (1994). Defining and validating high-level
design metrics. Computer Science Technical Report Series, Vol. CS-TR-3301, University
of Maryland at College Park, College Park, MD.

Brown, A. W. (1997). Background Information on CBD, SIGPC, Vol. 1. No. 18.

Bruegge, B., & Dutoit, A. H. (2004). Object-oriented software engineering using UML,
patterns, and Java. Upper Saddle River, NJ: Pearson Prentice Hall.

Card, D. N., & Glass, R. L. (1990). Measuring software design quality. Upper Saddle
River, NJ: Prentice-Hall.

Chidamber, S., & Kemerer, C. (1994). A metrics suite for object oriented design. IEEE
Transactions on Software Engineering, 20(6), 476–493.

Dandashi, F. (2002). Software engineering: theory, application and practice: A method for
assessing the reusability of object-oriented code using a validated set of automated
measurements. In Proceedings of the 2002 ACM Symposium on Applied Computing, pp.
997–1003.

Frakes, W. B., & Succi, G. (2001). An industrial study of reuse, quality, and productivity.
Journal of Systems and Software, 57(2), 99–106.

Gibson, V. R., & Senn, J. A. (1989). System structure and software maintenance
performance. Communications of the ACM, 32(3), 347–358.

Harrison, R., Counsell, S., & Nithi, R. (2000). Experimental assessment of the effect of
inheritance on the maintainability of object-oriented systems. Journal of Systems and
Software, 52(2–3), 173–179.

Hassoun, Y., Johnson, R., & Counsell, S. (2004). A dynamic runtime coupling metric for
meta-level architectures. In Proceedings of the Eighth Euromicro Working Conference on
Software Maintenance and Reengineering (CSMR’04), pp. 339–346.

Jonge, M. D. (2004). Multi-level component composition. 2nd Groningen Workshop on
Software Variability Modeling (SVM’04).

Kafura, D., & Henry, S. (1981). Software quality metrics based on interconnectivity.
Journal of Systems and Software, 2(2), 121–131.

Leavens, G., & Sitaraman, M. (2000). Foundations of component-based systems.
Cambridge, UK: Cambridge University Press.

Lim, W. (1994). Effects of reuse on quality, productivity, and economics. IEEE Software,
11(5), 23–30.

Lüer, C., Rosenblum, D. S., & van der Hoek, A. (2001). The evolution of software
evolvability. In Proceedings of the 4th International Workshop on Principles of Software
Evolution, Vienna, Austria, September 2001, pp. 134–137.

Mei, H., Zhang, L., & Yang, F. (2001). A software configuration management model for
supporting component-based software development. ACM SIGSOFT, 26(2), 53–58.

Nolan, B. (1994). Data analysis, an introduction. Cambridge, MA: Polity Press.

Offutt, J., Harrold, M. J., & Kolte, P. (1993). A software metric system for module
coupling. Journal of Systems and Software, 20(3), 295–308.

Page-Jones, M. (1980). The practical guide to structured systems design. New York:
Yourdon Press.

Price, M. W., & Demurjian, S. A. (1997). Analyzing and measuring reusability in object-
oriented design. In Proceedings of the 12th ACM SIGPLAN Conference on Object-
Oriented Programming, Systems, Languages, and Applications, pp. 22–33.

Selby, R. W., & Basili, V. R. (1991). Analyzing error-prone system structure. IEEE
Transactions on Software Engineering, 17(2), 141–152.

Simon, H. A. (1969). The architecture of complexity, the sciences of the artificial.
Cambridge, MA: MIT Press.

Stevens, W. P., Myers, G. J., & Constantine, L. L. (1974). Structured design. IBM Systems
Journal, 13(2), 115–139.

Troy, D. A., & Zweben, S. H. (1981). Measuring the quality of structured design. Journal
of Systems and Software, 2(2), 113–120.

Yu, L. (2007). Understanding component co-evolution with a study on Linux. Empirical
Software Engineering, 12(2), 123–141.

Yu, L., & Ramaswamy, S. (2007). Verifying design modularity, hierarchy, and interaction
locality using data clustering techniques. In Proceedings of the 45th ACM Southeast
Conference, Winston-Salem, NC, March 2007, pp. 419–424.

Yu, L., Schach, S. R., Chen, K., & Offutt, J. (2004). Categorization of common coupling
and its application to the maintainability of the Linux Kernel. IEEE Transactions on
Software Engineering, 30(10), 694–706.
