
Planning Distributed Development through Software Architecture

Min Chen, Bharat Gorantla, Okeno Palmer, Lutz Wrage


Master of Software Engineering Program
Carnegie Mellon University
Pittsburgh, PA 15213, USA
{minc1, bgorantl, orp, lw}@andrew.cmu.edu

Abstract

Coordination among distributed teams is the major challenge in global software development projects because effective communication is difficult to achieve. Software architecture, when designed appropriately, helps to increase coordination while minimizing the need for communication among teams. Software architecture for globally developed systems must take into consideration the communication problems that are exacerbated by distance, different languages and cultures, time difference, and work environment. Furthermore, techniques to assign architecture elements to distributed teams must consider module dependencies, technologies, development effort, and end-to-end functionality for each development iteration. This facilitates project administration by providing the necessary insight and flexibility when managing global development projects.

1. Introduction

Global software development projects face coordination problems because of ineffective communication and inter-dependencies among teams. Effective communication is difficult to achieve in a distributed environment because of distance, different languages and cultures, time difference, and work environment. Previous research suggests that the use of well-designed architectures, processes, and tools helps to coordinate the work of distributed teams while minimizing the need for communication across these teams [1].

Software architecture is defined as "the structure or structures of the system, which consist of elements, their externally visible properties, and the relationships among them." It allows groups of people separated by organizational, geographical, and even temporal boundaries to work cooperatively and productively [2]. Moreover, it has a strong influence on the communication patterns of the development teams because it determines the module dependencies of the system that will be developed by these teams. Consequently, a well-designed architecture can be used to study distribution, in the sense of suggesting which software modules should be developed by which teams.

Software architecture is driven by functional requirements, quality attributes, and constraints. Quality attributes are often addressed as non-functional requirements such as performance, security, and interoperability of the system. Distributability, on the other hand, is an attribute of the architecture that makes the system suitable for development by distributed teams.

Architecture documentation consists of a collection of views. The static, runtime, and physical perspectives are the three main perspectives of a system that can be represented with views. Module views contain classes, functions, and interfaces to represent properties of the system from the static perspective. Component-and-connector (C&C) views encompass processes and sequence diagrams to represent the properties from the runtime perspective. Allocation views of computers, devices, and networks are used to represent properties of the system from the physical perspective. Every view focuses on quality attributes of the architecture. For example, module views can be used to reason about interoperability and portability; C&C views to address performance and security; and allocation views to address disk capacity and memory properties.

However, distributability is not addressed by any of the views alone. Module views are useful when analyzing inter-dependencies among modules to decide which architecture elements can be assigned to which teams. C&C views are important when studying whether the structures of the system are independently testable. Allocation views provide the assignment of modules to teams.
This paper describes our experiences and lessons learned when working as a development team in a global development project at Carnegie Mellon University (CMU), and as the architecture team responsible for redesigning a software architecture suitable for distributed development and proposing the assignment of architecture elements to distributed teams.

2. Project Overview

The project on which we base this paper was a Studio project in the Master of Software Engineering program at CMU. Our Studio team consisted of 4 graduate students. This team was one of seven teams distributed across the world who were chosen by the client to work on a global development research project named "JOY".

The goals of the JOY project were to:

• Learn from experimentation of processes in a globally distributed context. That is, to experiment with processes and to observe the influence of requirements, architecture, and process documentation on development in a globally distributed environment.
• Determine how having a stable set of requirements, a sound architecture and a well documented lightweight process can influence development in a distributed context. The focus of the project was on observing the teams and distilling lessons learned for distributed development. It was not to create a product for use in a production environment.

The JOY project was planned for a duration of three years; the first iteration started in September 2004 and was planned to end in August 2005. The client had prepared an initial, high-level software architecture of the system, and each team would develop one or more components of this architecture. The CMU team was mainly responsible for developing a basic component called system object model. However, due to coordination problems that arose, the CMU team was later given the responsibility of recreating the architecture for the same system to make it suitable for distributed development and proposing an initial work unit assignment for the new distributed development teams based on the revised architecture.

2.1. First Iteration Problems

The main problems we encountered while trying to implement the original JOY architecture were in the areas of communication, requirements elicitation, work assignments, and distributed implementation.

2.1.1. Communication Issues. The project was organized following a hub-and-spoke model. The client had a central team, the hub, and the student teams were arranged around the center. Student teams could directly communicate with the central team, but all communication between student teams had to be facilitated by the central team. The goal of this organizational structure was to limit the number of communication channels and thus control the communication overhead in the distributed environment. However, the software architecture was not well suited for this model because it required close collaboration of student teams, for example, to define component interfaces that needed to be shared among teams.

2.1.2. Requirements Elicitation Issues. The original JOY software architecture was based on a set of architectural drivers that included high-level functional requirements and a number of constraints. For example, the system had to be Web-based, and it had to update the user interface asynchronously whenever certain measured values changed. However, we found that many of these requirements were not understood in enough detail to create a viable architecture. In addition, quality attribute requirements had not been documented at all. As a result, in the original architecture, the responsibilities of components and their interfaces were not defined in as much detail as the teams needed to develop their assigned components.

In particular, the lack of documented quality attribute requirements made it difficult to refine the component design consistently across development teams. We had planned to use the attribute-driven design method, but this method requires this kind of information to be documented in the architecture [3].

2.1.3. Work Assignments. Each student team was assigned a number of components to develop. One criterion for assignment was a preference given by the teams; other criteria were not explained to the teams. However, in the course of the project it became obvious that the central team had not considered component inter-dependencies to the degree necessary for success. Incremental deliveries were organized by use cases, which led to situations where parts of a use case were to be implemented in components developed by different teams but with the same delivery date.
2.1.4. Distributed Implementation Issues. The lack of detail in the original software architecture made it necessary for the student teams to add missing interface definitions to their assigned components. Adding or modifying a component interface is a change in the architecture that must be communicated to all development teams that use this interface, but there was no process in place to manage this kind of change.

The original strategy called for the central team to integrate components developed by the student teams. It did not work well because there were dependencies among components which were not immediately obvious in the architecture, and the schedule did not take them into account.

2.1.5. First Iteration Results. In May 2005 it became clear that the original approach would not result in the creation of a system that could demonstrate even simple end-to-end functionality. The code produced by the student teams constituted an incomplete set of components that did not fit well together. The central team realized that they had significantly underestimated the effort to manage such a distributed development project. This led to many of the problems because the teams were assigned responsibilities that should have been executed by the central team. The most prominent mistake in this regard was that it was left to the student teams to define component interfaces without exercising control over the content of the interfaces and over interface changes.

The next section describes our approach for designing an architecture suitable for distributed development. We then focus on criteria and artifacts that help in assigning work to distributed teams.

3. Creating the New Architecture

When analyzing the problems that caused the first iteration of the JOY project to fail, we identified several tactics we used to increase the distributability of our software architecture. Some of these tactics increase the architecture's level of detail, while others are related to the kind of artifacts included in the architecture documentation.

1) Address communication with interfaces. Our revised architecture contains detailed interface definitions to limit the amount of communication necessary to coordinate distributed development efforts. In addition we provide sequence diagrams that show how components and interfaces are used to achieve the system's overall functions. We realize that communication among teams may still be necessary to evolve interface specifications, but given our architecture this will require significantly less communication than to negotiate completely new interfaces.

2) Make components independently testable. We accomplished this by designing the run-time architecture as several communicating processes. Each process should be assigned to one team for development such that they all work on largely self-contained pieces. This tactic may increase the complexity of the overall architecture. Integration testing will be more complex compared to a monolithic system, but only to a limited degree because the JOY system is already highly multithreaded.

3) Trade distributability for code sharing. Each time a team uses modules that are developed by other teams a strong dependency is introduced. The required coordination among the teams may offset any savings the shared modules can achieve. Our architecture contains two very similar small modules that will be developed by independent teams.

4) Create more modules than there are distributed teams. More modules available for assignment to teams make planning more flexible. Work assignments can be adapted to the teams' technical expertise and to schedule constraints. The trade-off is that the architecture becomes more complex with each added module because module responsibilities and interfaces need to be specified.

4. Creating and Assigning Work Units

The main effort of the architecture redesign was focused on dividing work among teams. At the time we created the architecture, the central team only knew that there would likely be six student teams to implement the system. However, we had no information about the team members' profiles, background, and availability. Hence, we compiled information for use by the project management to assign work units to the distributed teams. This information will also inform the milestones in the project schedule. To accomplish this we executed the following activities:

1) Identify Criteria. Create a prioritized list of criteria to guide in assigning work units to teams.
2) Identify and Create Artifacts. For each criterion from the previous activity, create artifacts that specifically address this criterion. An artifact is a representation
tool that helps in communicating and analyzing a concept.
3) Propose Work Unit Assignment. Create a work unit assignment that takes all criteria into account. As we did not have enough information about the teams' level of expertise, our work assignment will serve as a starting point that the central team will adapt.

4.1. Criteria

Table 1 shows the criteria in decreasing priority order. These criteria are based on the customer's preferences. In addition, to balance the development teams' workload and resources, the architecture team estimated the development effort required for each module. In an ideal situation the architect would have information about the teams, such as size, expertise and availability, to accurately estimate the effort. In the JOY project the profiles of the development teams were unknown, so the architecture team made the assumption that no team would have significant experience in any of the required technologies. The result of these considerations is shown in the development effort view.

4.2. Artifacts

The artifacts presented in this section are those we created to address the criteria of the previous section. All the modules listed below are part of the detailed architecture. High-level modules are composed from one or more of the modules listed below.

4.2.1. Dependency View. A Design Structure Matrix (DSM) was used to create the dependency view based on the modules of the JOY system. The DSM is a tool that displays the relationships between elements of a system in a compact, visual, and analytically advantageous format [4]. The DSM method is an information exchange model that allows the representation of complex module relationships in order to determine a sensible sequence for the modules being modeled.

The DSM in Table 2 is a module-based binary matrix whose rows and columns are headed with the complete list of modules of the system. This is the final DSM as the result of using a manual DSM partition algorithm to determine the optimum module development sequence. Marks in the matrix explain if there are dependencies
Table 1. Work Units Assignment Criteria

ID: C1
Criterion: Allow teams to focus on specific functionalities.
Description: Each team will be responsible for a functionality to reduce the need for communication with other teams.
Artifacts: The modules, as designed by the CMU team, are created with high cohesion and low coupling in mind. Hence, the resulting modules are packages of functionalities. The module view of the architecture helps to reason about this criterion.

ID: C2
Criterion: Create JOY using incremental development.
Description: There should be a functional system by the end of each release.
Artifacts: To have a functional system in each release, the development teams have to implement elements of the architecture that will fulfill functional requirements. Prioritized use cases are one artifact used to decide the order in which the modules have to be implemented.

ID: C3
Criterion: Create JOY using bottom up development.
Description: Before developing each module, other modules it depends on should already be developed or under development.
Artifacts: An artifact that shows the module dependencies along with the possible timelines will be the best for this criterion. The CMU team used a Design Structure Matrix (DSM) to create a module dependency view of the JOY architecture. A timeline view was derived from the optimized matrix.

ID: C4
Criterion: Allow teams to focus on few technologies.
Description: Save time by assigning work units to teams according to their expertise. Assign modules to teams to minimize the number of teams that need to learn a technology.
Artifacts: It is necessary to know technologies that are required to implement each of the modules of the JOY system. Hence, the architecture team created the technology view that maps each module to the technologies that are required in its implementation.

Table 2. Design Structure Matrix
Number of modules
this depends on
Module ID 1 3 14 4 5 6 15 9 11 16 12 7 10 13 8 2
PS 1 0
CP 3 0
CFG 14 0
VC 4 1 1
DA 5 1 1
CD 6 1 1
COV 15 1 1
AM 9 1 1 1 1 1 5
PD 11 1 1 1 3
AP 16 1 1 1 3
ARE 12 1 1 1 1 4
HED 7 1 1 2
AD 10 1 1 2
LRE 13 1 1 1 1 4
RED 8 1 1 1 3
LO 2 1 1 1 1 4

Number of dependants 6 2 1 1 10 2 1 3 2 2 1 1 1 1 0 0

Legend

1 in bold = strong dependencies


1 regular = weak dependencies

among the modules and, if so, which kind of dependency. A “1” in bold represents a strong dependency, and a “1” in regular font represents a weak dependency. If module A has a strong dependency on module B, A needs a correct implementation of module B to run correctly.

Each row represents all the modules required for the implementation of the module corresponding to that row. Similarly, reading down a specific column shows which modules depend on, or use, the module corresponding to that column. For example, the PS module depends on no other modules, but 6 modules depend on it.

The order of the modules in the DSM shows a possible sequence of development. The first three modules (PS, CP, CFG) can be developed in parallel as they depend neither on each other nor on modules outside this group. Following this group, there are 11 modules (VC through LRE) which depend on the modules in the first group as well as on other modules within this group. Some of these modules can be developed in parallel; others have to be developed in sequence. Finally, the last two modules (RED, LO) do not have any dependants, and they can be developed in parallel as they do not depend on each other.

For more information about DSM and partition algorithms, refer to Yassine’s paper [5].

4.2.2. Timeline View. The timeline view (Table 3) is derived from the dependency view. It is based solely on the strong dependencies to determine in which period a module can be developed. Weak dependencies were not taken into consideration because development teams can create stubs to simulate functions of the required module if it is not ready. Stubs cannot be used to replace modules in strong dependencies.

This view partitions the project into four time periods. These do not necessarily have the same duration. The earliest possible section indicates the period in which a module can be developed right after all its required modules are ready. The latest possible section indicates the last period in which a module can be developed without delaying the development of its dependants, if any.

Some modules have to be developed in the same period for both options. The names of these modules are shown in bold. For example, the VC module has to be developed in Period 1 regardless of whether the option is earliest or latest possible.
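The step from dependency rows to earliest and latest development periods can be sketched in a few lines. This is a minimal illustration, not the manual partition algorithm used for the JOY matrix: the module names A to F and their dependencies are hypothetical, only strong dependencies are modeled, and each module is assumed to fit into a single period that comes strictly after the periods of the modules it requires.

```python
# Hypothetical strong dependencies: module -> modules it requires (one DSM row each).
STRONG_DEPS = {
    "A": set(),
    "B": set(),
    "C": {"A"},
    "D": {"A", "B"},
    "E": {"C"},
    "F": {"B"},
}


def dependants(deps):
    """Invert the rows: module -> modules that depend on it (one DSM column each)."""
    users = {module: set() for module in deps}
    for module, required in deps.items():
        for name in required:
            users[name].add(module)
    return users


def earliest_periods(deps):
    """Earliest period: one period after the last required module is ready."""
    periods = {}

    def visit(module, path=()):
        if module in path:
            raise ValueError(f"cyclic dependency involving {module}")
        if module not in periods:
            ready = max((visit(r, path + (module,)) for r in deps[module]), default=0)
            periods[module] = ready + 1
        return periods[module]

    for module in deps:
        visit(module)
    return periods


def latest_periods(deps, total_periods):
    """Latest period that still leaves time for every dependant."""
    users = dependants(deps)
    periods = {}

    def visit(module):
        if module not in periods:
            limit = min((visit(u) for u in users[module]), default=total_periods + 1)
            periods[module] = limit - 1
        return periods[module]

    for module in deps:
        visit(module)
    return periods


earliest = earliest_periods(STRONG_DEPS)   # also rejects cyclic input
latest = latest_periods(STRONG_DEPS, max(earliest.values()))
fixed = sorted(m for m in STRONG_DEPS if earliest[m] == latest[m])

for module in STRONG_DEPS:
    print(module, len(STRONG_DEPS[module]), len(dependants(STRONG_DEPS)[module]),
          earliest[module], latest[module])
print("same period in both options:", fixed)
```

The printed columns correspond to the row total, the column total, and the two timeline options. Modules whose earliest and latest periods coincide are the ones a timeline view would highlight as having no scheduling slack.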
Table 3. Timeline View
Option Period 1 Period 2 Period 3 Period 4
Earliest PS VC DA CE HED AD
Possible CP COV AM PD RED ARE
CFG AP LO LRE
Latest VC DA AM AP LO
Possible CFG CE PD LRE
CP ARE COV
AD PS
HED RED
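The stubs mentioned for weak dependencies can be illustrated with a short sketch. All names here (`ConfigSource`, `ConfigStub`, `Reporter`) are hypothetical and do not come from the JOY architecture; the point is only that a team can code against an agreed interface while another team's module is not yet delivered.

```python
from typing import Protocol


class ConfigSource(Protocol):
    """Hypothetical interface taken from the architecture documentation."""

    def get(self, key: str) -> str: ...


class ConfigStub:
    """Stand-in for a module that another team has not delivered yet."""

    def __init__(self, canned: dict[str, str]) -> None:
        self._canned = canned

    def get(self, key: str) -> str:
        # Return a canned answer instead of doing any real work.
        return self._canned[key]


class Reporter:
    """Module under development with a weak dependency on ConfigSource."""

    def __init__(self, config: ConfigSource) -> None:
        self._config = config

    def title(self) -> str:
        return f"Report for {self._config.get('site')}"


reporter = Reporter(ConfigStub({"site": "test-site"}))
print(reporter.title())
```

A stub like this only works when canned answers are enough to exercise the dependent module. When the module needs correct behavior from what it depends on, which is the strong-dependency case, the real implementation has to exist first.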

The timeline view does not take into consideration system functionality and size of the modules. That is, developing the modules in the suggested order does not guarantee an incremental development. In addition, the view does not show if development of a module can span several periods.

4.2.3. Technology View. The Technology View shown in Table 4 provides information about the technologies that each module will require for its implementation. The purpose of this view is to identify how many technologies a development team will have to learn for its work units.

For the JOY system we identified five technologies the development teams will use in its implementation. As we assume that the teams do not have experience in any of these, the number of technologies a team uses should be limited to no more than three.

4.2.4. Development Effort View. The Development Effort View (Table 5) shows the relative effort required to implement each module. The effort estimation must take into account:
• Technologies involved in the implementation of the module
• Complexity of the design of the module
• Amount of functionality contained in the module
• Expertise and experience of developers

The relative effort is measured as high, medium, and low. That is, a module assigned high effort requires more effort than a module assigned medium, but two modules with high effort may or may not take the same amount of time. We estimate that modules within a category will require similar amounts of time for development. We could not provide a more precise effort estimate because we did


Table 4. Technology View


Technology
Module ASP.NET AJAX .NET Remoting C# Programming NHibernate
PD × × × ×
AD × × × ×
HED × × ×
RED × × ×
LO × × ×
PS × ×
CP × ×
VC ×
CP × ×
COV × ×
ARE × ×
LRE × ×
AP × ×
CFG × ×
DA × × ×
AM ×
not have the necessary information about the teams’ level of experience and expertise.

The purpose of this view is to help balance the workload of the work units.

Table 5. Development Effort View
Module  Effort
PS      High
LO      Low
CP      Medium
VC      Low
DA      High
CE      High
HED     Medium
RED     Medium
AM      Medium
AD      Medium
PD      High
ARE     Low
LRE     Low
CFG     Medium
COV     Low
AP      Medium

4.3. Results of Work Unit Assignments

Based on information from the dependency view, timeline view, technology view, and the development effort view, work in a distributed environment can be assigned in a more informed way. The work assignment view shown in Table 6 merges information from all four views and provides a work assignment distribution that factors in module dependencies, sequence of implementation of the modules, technological expertise required, and estimated development effort.

The work unit assignment view takes the form of a grid. Information from the grid is read as follows:
1) Modules assigned to specific teams. This information is contained in the white area of the grid. Teams are specified across the top of the grid in columns. Modules assigned to each team are then specified in a cell in a grid column below the team name.
2) Period of time in which a module must be implemented. Periods are expressed in multi-colored rows of the main grid. Modules are placed in cells that intersect the team name and the period in which the module must be implemented. Periods do not necessarily have the same length. The duration depends on the estimation of the development teams based on their low-level design.
3) Expected functionality at the end of a period based on modules implemented in that period. This information is expressed to the right of the main grid, with the expectations for each period in the same color and row as the targeted period. Reading this information vertically from top to bottom shows how JOY's functionality will grow across periods.

Table 6. Work Unit Assignment View


FSS Adapter,

Presentation

Presentation

Presentation
Subscribe

Manager
Adapter

High-level
Publish

Rules,

Rules,

Module
Data

Team 1 Team 2 Team 3 Team 4 Team 5 Team 6 Expected Functionality (End of Period)
1. PS (H) 3. CP (M) 4. VC (L)
Basic communication from FSS to JOY and vice versa
Period 1 14. CFG (M)
without persistence
15. COV (L)
9. AM (M) 5. DA (H)
Full communication from FSS to JOY including
Period 2
persistence
16. AP (M) 6. CE (H) 11. PD (H)
Period 3 10. AD (M) 8. RED (M) 7. HED (M) User Interface
2. LO (L)
12. ARE (L) 13. LRE (L)
Period 4 Rule Engines

Technology required by each team


C# X X X X X X
.Net Remoting X X X X X X
ASP.Net X X X
AJAX X X
NHibernate X

Back-end Front-end
4) Technologies that each team has to know to implement their modules. At the bottom of the main grid, the technologies that each team needs to know to implement their assigned modules are specified. Teams need to know several technologies for their set of assigned modules and each technology is specified in a single row below the main grid. A marked cell that intersects the team name with a technology specifies which technology that team needs to know to implement their set of modules.
5) System's high-level modules that each team will be involved in implementing. This information is expressed horizontally above the main grid. The intersection of team name and high-level module name shows which teams are involved in the implementation of the details of a high-level module.
6) Length of time in terms of periods that teams will need to be involved in the project. This information is expressed in the form of empty cells in the main grid. An empty cell means that a team is not needed during that period of development. Reading this information vertically indicates how long a team is needed for the duration of the project across the periods. Reading this information horizontally gives no information as the size of modules is not specified.
7) Time available for teams to learn technologies they need to know. This information is shown in the main grid. The existence of empty cells read vertically in a team name column indicates that a team has no modules assigned to them in a period. Given that the technologies a team needs are known, a team can use this time to learn a technology before they start work on their modules.

Several important observations can be made from the work assignment view and interpreting the information embedded in this view. These observations are as follows:
• During the first 2 periods, the system's implemented functionality will include no development of the user interface. Modules that constitute the back-end functionality will all be implemented.
• The front-end will be developed in the last 2 periods of development.
• All teams need to learn C# and .NET Remoting.

Based on these observations, we could make the following suggestions:
• As soon as teams are identified for the project, provide them with information about the technologies they need to know. This will help in the productivity and speed of development of the project as teams will become familiar with their development tools before the start of the project.
• Documentation of the back-end functionality can be focused on intensively during periods 3-4 by teams 1-3 as they will have “free time” based on the current work distribution.
• Teams 4-6 will have to coordinate more because they work on the Presentation module and Rules module. Ideally, these two modules should be implemented by two teams instead of three; hence, more coordination is required and expected. The central team should facilitate the communication channel among these teams and provide the guidelines for user interface design in order to achieve the same look and feel throughout the system.

5. Conclusion

Software architecture by itself cannot assure the success of a distributed development project. An integrated project that incorporates a project plan, a development process, a product integration process, a configuration management environment, and a software architecture that all support distributed development must be in place for the success of a distributed development effort.

A project plan for a distributed development project needs to be structured based on the module dependencies. This is a direct consequence of dependencies present in the software architecture. A plan that is based on the software architecture will be more able to accommodate planning across teams and consider the integration and communication that needs to occur between teams. The scheduling and organization of teams and resources in the project plan can be deduced from the timeline and development effort views defined earlier in this article.

A distributed development environment directly affects the elements and relationships defined in a software architecture. During the design of the JOY architecture, the main elements affected by the distributability requirement were the relationships between elements of the architecture because they are closely related to communication between the teams that implement the elements.

In creating a software architecture, the architect must consider work assignment as one of the important allocation views of the software architecture. Work
assignment will be based on (1) skill set of development teams, (2) development team availability, (3) feature list requirements, (4) incremental delivery schedule, and (5) module dependencies.

6. Acknowledgements

We would like to thank Prof. Jim Herbsleb, Tony Lattanze, Cliff Huff, and Felix Bachmann for supporting us in this project with their knowledge, experience, and guidance.

7. References

[1] J. Herbsleb and R. Grinter, “Splitting the Organization and Integrating the Code: Conway's Law Revisited”, International Conference on Software Engineering, 1999.

[2] P. Clements et al., Documenting Software Architecture, Addison Wesley, United States, February 2004.

[3] L. Bass et al., Software Architecture in Practice, Addison Wesley, United States, October 2004.

[4] T.R. Browning, “Applying the Design Structure Matrix to System Decomposition and Integration Problems: A Review and New Directions”, IEEE Transactions on Engineering Management, Vol. 48, No. 3, August 2001, pp. 292-306.

[5] A. Yassine, “An Introduction to Modeling and Analyzing Complex Product Development Processes Using the Design Structure Matrix (DSM) Method”, Quaderni di Management (Italian Management Review), www.quaderni-di-management.it, No.9, 2004.
