
Module 4 Recovering the Architecture
Architecture Recovery?
 We frequently need to reason architecturally
about existing systems.
 We need to be able to:
 analyze architectures
 (re)document architectures
 identify architectural dependencies
Architecture Conformance?
 Question: If my architecture was designed
with a particular property in mind, does the
property hold for my target system?
 (Probable) Answer: Who knows?
Why Conformance?
 Architectures are frequently undocumented
or documentation is not current.
 The architecture of the implemented system
must conform to the architectural design.
 Otherwise, architectural drift and erosion are
inevitable.
 But conformance (by hand) is a dreary task.
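Part of a conformance check can be mechanized. The sketch below is a minimal illustration under stated assumptions, not a tool from this module: the layer names, the element-to-layer mapping and the allowed dependencies are all hypothetical. It reports every extracted dependency that the design does not permit.

```python
# Hypothetical design rule: which layer may depend on which.
ALLOWED = {
    ("UI", "Logic"),
    ("Logic", "Storage"),
}

# Hypothetical mapping from implemented elements to their layer.
LAYER_OF = {
    "MainWindow": "UI",
    "OrderService": "Logic",
    "OrderTable": "Storage",
}

def violations(dependencies, layer_of, allowed):
    """Return dependencies that cross layers in a way the design does not allow."""
    bad = []
    for source, target in dependencies:
        a, b = layer_of[source], layer_of[target]
        if a != b and (a, b) not in allowed:
            bad.append((source, target))
    return bad

# Dependencies as they might be extracted from the implemented system.
extracted = [
    ("MainWindow", "OrderService"),  # UI -> Logic: allowed
    ("OrderService", "OrderTable"),  # Logic -> Storage: allowed
    ("MainWindow", "OrderTable"),    # UI -> Storage: not in the design
]

print(violations(extracted, LAYER_OF, ALLOWED))
```

The hard part in practice is obtaining the mapping and the extracted dependencies, which is what the rest of this module covers.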
Momentary Philosophical Break
 What is software architecture?

 Question: Is it a mass delusion that we all
(willingly) participate in?
 Answer: Maybe . . .
What Is This?

And this?
Architecture as Mass Delusion
 Question: What is a layer?
 Answers:
 At worst a delusion.
 At best a convention.
 Nothing that exists in what we actually build (e.g.
no layer construct in a programming language).
 Nothing that is enforceable.
And Yet . . .
 Architectures use these abstractions
regularly.
 Architects think and plan and analyze in
terms of these abstractions.
 So we should support these abstractions and
connect them to what we do build.
So What Is Architecture Recovery?
 A process in which hypotheses are generated
and tested
 These hypotheses are ideally the inverse of
the mappings developed during design
 The reconstructor must add information
during the reconstruction process
 This depends on the reconstructor's available
information and bias
Reconstruction Process

[Figure: reconstruction process diagram. Recoverable labels: Source Code, Documentation; Source Extraction; Source Model; Fusion; Selected Architecture Views; Architecture View Reconstruction; Composition; Architectural Views, Styles, Patterns, Drivers]
View Extraction
 Apply whatever tools are available or
appropriate for a given target system:
 parsers (Imagix, SNiFF+, CIA, rigiparse,
Understand)
 AST-based analyzers (Gen++, Refine)
 lexical analyzers (LSME)
 profilers (gprof)
 code instrumentation
 ad hoc (grep, perl)
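As an illustration of the ad hoc, lexical end of this range, the sketch below extracts "function calls function" pairs with regular expressions and brace counting. It is written in Python rather than perl, the C-like input is hypothetical, and the patterns are deliberately naive (no comments, strings, macros or function pointers are handled).

```python
import re

# Hypothetical C-like input, used only to exercise the extractor.
SOURCE = """\
int helper(int x) { return x + 1; }
int compute(int y) {
    int z = helper(y);
    return helper(z);
}
"""

# Naive pattern for a function definition that starts on its own line.
DEF_RE = re.compile(r"\s*[\w\*\s]+?\b(\w+)\s*\([^()]*\)\s*\{")
# Any identifier followed by "(" is treated as a call.
CALL_RE = re.compile(r"\b(\w+)\s*\(")
KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}

def extract_calls(source):
    """Return a set of (caller, callee) pairs found by purely lexical matching."""
    calls = set()
    current = None  # function whose body we are currently inside
    depth = 0       # brace nesting depth
    for line in source.splitlines():
        body = line
        if depth == 0:
            match = DEF_RE.match(line)
            if match:
                current = match.group(1)
                body = line[match.end():]  # text after the opening brace
            else:
                body = ""
        if current is not None:
            for callee in CALL_RE.findall(body):
                if callee not in KEYWORDS:
                    calls.add((current, callee))
        depth += line.count("{") - line.count("}")
    return calls

print(extract_calls(SOURCE))
```

A real parser is more accurate, but a lexical pass like this is often enough to get a first source model quickly.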
Static Views Alone are Insufficient
 Much architecture-related information may be
extracted statically from source code,
compile-time artifacts and design artifacts.

 Some architecturally relevant information
may not, due to late binding:
 polymorphism
 function pointers
 run-time parameterization
Extraction: Source Model
Source model consists of:
 Collection of source elements
 e.g. files, functions, variables, classes, methods
 Set of relations among elements
 function calls function, function accesses variable
 Attributes
 function calls function N times
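One possible in-memory shape for such a source model is sketched below. The class name, method names and sample elements are assumptions made for illustration; the slides do not prescribe a representation. Elements carry a kind, relations are typed pairs, and the count attribute records "N times".

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class SourceModel:
    # element name -> kind (file, function, variable, class, method, ...)
    elements: dict = field(default_factory=dict)
    # (relation, source, target) -> how many times the relation was observed
    relations: Counter = field(default_factory=Counter)

    def add_element(self, name, kind):
        self.elements[name] = kind

    def add_relation(self, relation, source, target, times=1):
        self.relations[(relation, source, target)] += times

    def select(self, relation):
        """Return sorted (source, target, count) triples for one relation type."""
        return sorted(
            (s, t, n) for (r, s, t), n in self.relations.items() if r == relation
        )

# Hypothetical sample data.
model = SourceModel()
model.add_element("parse", "function")
model.add_element("emit", "function")
model.add_element("buffer", "variable")
model.add_relation("calls", "parse", "emit", times=3)
model.add_relation("accesses", "emit", "buffer")

print(model.select("calls"))
```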
Typical Elements and Relations
 Some elements and relations relevant to
software architecture:
 function calls function
 file contains function
 class has_subclass class
 class has_friend class
 process communicates_with process
 process writes file
 ...
Model Manipulation - Motivation
 Architectural elements are not explicitly
represented in source code.
 user interface, repository, cache, etc. are
usually tangled collections of objects
 architectural constructs are realized by many
mechanisms in an implementation
 Therefore, we need to reconstruct
architectural elements.
Model Manipulation
 Interpretive and interactive.
 Not automatic.
 How?
 Query-based pattern matching for:
 typing
 clustering
 Direct manipulation
Model Manipulation
 Primary mechanism for manipulation is the
application of patterns, constructed using a
query language.

 Examples:
 identify types
 aggregate local variables with functions
 aggregate members with classes
 compose architecture-level elements
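The aggregation patterns can be sketched as lifting relations from contained elements to their containers. The function below is an illustrative assumption, not the query language used in this module. The member names are borrowed from the fusion example later in the deck, and the call edges between them are hypothetical.

```python
def aggregate(relations, contains):
    """Lift element-level relations to the containers of those elements.

    relations: iterable of (source, target) pairs
    contains:  iterable of (container, member) pairs
    """
    parent = {member: container for container, member in contains}
    lifted = set()
    for source, target in relations:
        a = parent.get(source, source)
        b = parent.get(target, target)
        if a != b:  # relations inside one container disappear at this level
            lifted.add((a, b))
    return lifted

contains = [
    ("InputValue", "InputValue::GetValue"),
    ("InputValue", "InputValue::SetValue"),
    ("List", "List::length"),
    ("List", "List::getnth"),
]
# Hypothetical call edges between members.
calls = [
    ("InputValue::GetValue", "List::length"),
    ("InputValue::SetValue", "List::getnth"),
    ("List::length", "List::getnth"),
]

print(aggregate(calls, contains))
```

Applying the same step again with a class-to-file or class-to-layer mapping composes progressively higher-level elements.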
Example: VANISH

White noise:
a typical source
model.
Apply Generic Patterns

Aggregate
functions and
local variables;
classes and
members; classes
and files.
Apply Application-Specific Patterns

Identify the layers
from the Arch
model.
Apply User Interaction

Hide utilities.
The Final Product

[Figure: the architecture as designed and as implemented, side by side. Both diagrams show the same five elements: Dialogue, Functional Core Adapter, Logical Interaction, Functional Core, Presentation]
Evaluation - The Rules
 Do they capture the architecture concisely?

BAD:

SELECT tName
FROM components
WHERE tName=vanish-xforms.cc
OR tName=PrimitiveOp
OR tName=Mapping
OR tName=MappingEditor
. . .
OR tName=InputValue
OR tName=Point
OR tName=VEC
OR tName=MAT
OR ((tName ~ Dbg$ OR tName ~ Event$) AND tType=Class);
print Dialogue $fields[0];

GOOD:

SELECT tSubclass
FROM has_subclass
WHERE tSuperclass=Presentation;
print Logical_Interaction $fields[0];

SELECT tName
FROM components
WHERE tName=Presentation
OR tName=BSpline
OR tName=Colour;
print Logical_Interaction $fields[0];
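The subclass-based rule is concise because it selects on a relation rather than enumerating names. A rough equivalent can be tried with SQLite from Python, as sketched below. This is an assumption for illustration only: the original rules use the reconstruction tool's own query dialect (with `print` and `~`), and the table rows here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE has_subclass (tSuperclass TEXT, tSubclass TEXT)")
conn.executemany(
    "INSERT INTO has_subclass VALUES (?, ?)",
    [
        # Hypothetical rows; the real table would come from an extractor.
        ("Presentation", "Button"),
        ("Presentation", "Slider"),
        ("Other", "Helper"),
    ],
)

rows = conn.execute(
    "SELECT tSubclass FROM has_subclass "
    "WHERE tSuperclass = ? ORDER BY tSubclass",
    ("Presentation",),
).fetchall()

# Equivalent of: print Logical_Interaction $fields[0];
assignments = [("Logical_Interaction", name) for (name,) in rows]
print(assignments)
conn.close()
```

New subclasses of Presentation are picked up without editing the rule, which is not true of the enumerated version.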
Evaluation - The View

 Is it suitable for communication?

 Can it answer questions about the system?

 Does it exhibit good conceptual integrity?


Evaluation - The Architecture

 Perform analyses.

 Look for opportunities for:
 reengineering
 reuse
 documentation
View Extraction and Fusion
 Motivation:
 Source code extraction is error prone
 Extractors will frequently produce partially
overlapping data
 Only a limited number of views can be derived
from source artifacts alone
View Extraction and Fusion - 2
 Source code analysis is insufficient:
 late binding
 system topology information

 This information is available in artifacts other
than source models.
 Therefore, multiple extracted views are
required.
View Extraction and Fusion - 3
 A complete view of a system requires views to
be fused:
 different views provide complementary information
 users need to navigate among views
 one view can improve another

 Fusion involves reconciliation of views.


Fusion for View Improvement
 Static extraction tools produce false
negatives and false positives.
 Dynamic extraction tools rarely (if ever)
produce false positives, but produce many
false negatives.
 Idea: Leverage the high quality dynamic
views to improve the low quality static views.
Fusion Example
Static Extraction        Dynamic Extraction
InputValue::GetValue     ArithmeticOp::Compute
InputValue::SetValue     AttachOp::Compute
List::[]                 ...
List::attachr            StringOp::Compute
List::detachr            InputValue::GetValue
List::length             InputValue::InputValue
PrimitiveOp::Compute     InputValue::SetValue
                         InputValue::~InputValue
                         List::List
                         List::getnth
                         List::length
                         List::[]
                         List::~List

Need to map between dynamic and static views of PrimitiveOp
View Improvement Results

Added to Fused View        Removed from Fused View
InputValue::InputValue     ArithmeticOp::Compute
InputValue::~InputValue    AttachOp::Compute
List::List                 ...
List::~List                StringOp::Compute
List::getnth

This technique can dramatically reduce false negatives,
e.g. from 45 to 18 in one 1,000-line source file
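The improvement step can be sketched with set operations over the function names from the fusion example. The renaming table is an assumption: it treats the concrete Compute methods seen at run time as overrides of PrimitiveOp::Compute, which the example suggests but does not spell out. The elided "..." entries are left out.

```python
STATIC = {
    "InputValue::GetValue", "InputValue::SetValue",
    "List::[]", "List::attachr", "List::detachr", "List::length",
    "PrimitiveOp::Compute",
}

DYNAMIC = {
    "ArithmeticOp::Compute", "AttachOp::Compute", "StringOp::Compute",
    "InputValue::GetValue", "InputValue::InputValue",
    "InputValue::SetValue", "InputValue::~InputValue",
    "List::List", "List::getnth", "List::length", "List::[]", "List::~List",
}

# Assumption: run-time names that correspond to one statically visible name.
TO_STATIC_NAME = {
    "ArithmeticOp::Compute": "PrimitiveOp::Compute",
    "AttachOp::Compute": "PrimitiveOp::Compute",
    "StringOp::Compute": "PrimitiveOp::Compute",
}

def fuse(static, dynamic, rename):
    """Return (fused view, names added to it, dynamic names folded away)."""
    mapped = {rename.get(name, name) for name in dynamic}
    added = mapped - static          # seen at run time, missed statically
    removed = dynamic & set(rename)  # replaced by their static counterpart
    return static | mapped, added, removed

fused, added, removed = fuse(STATIC, DYNAMIC, TO_STATIC_NAME)
print(sorted(added))
print(sorted(removed))
```

Under these assumptions the added and removed sets match the two columns of the results slide.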
Disambiguation of Function Calls
 In a multi-process application, name clashes
are inevitable.
 Fusion of a build dependency view with a
static calls view disambiguates these.
 Query:
 for each function, what executable is it built into?
 prepend the executable name to the function
name
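The query can be sketched as a join between the two views. In the sketch below the file names, executable names and the ":" separator are hypothetical, and it assumes that a direct call always resolves to a function built into the caller's own executable.

```python
def qualify_calls(calls, built_into):
    """Prefix caller and callee with the executable the caller is built into.

    calls:      iterable of (source_file, caller, callee) from the static calls view
    built_into: mapping source_file -> executable from the build dependency view
    """
    qualified = set()
    for source_file, caller, callee in calls:
        executable = built_into[source_file]
        qualified.add((f"{executable}:{caller}", f"{executable}:{callee}"))
    return qualified

# Hypothetical build dependency view.
built_into = {"server_main.c": "server", "client_main.c": "client"}

# Hypothetical static calls view: both processes define main and init.
calls = [
    ("server_main.c", "main", "init"),
    ("client_main.c", "main", "init"),
]

print(sorted(qualify_calls(calls, built_into)))
```

Without the prefix, both edges would collapse into a single main-calls-init edge.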
IPC / Function Pointers / File Access
 The topology of many applications is not
determined until run time.
 Important information is frequently passed via
files.
 None of this is derivable from calls
information.
Example 1

Dialogue Application

Presentation
Example 2

IPC
Example 3

File access
Instrumentation
 If information (such as file accesses, IPC) is
not available through an extractor, we
instrument the code.
 e.g. for this example we instrumented:
 all calls to fopen
 all IPC calls
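The same idea can be shown in Python, where the analogue of instrumenting fopen is wrapping the built-in open. This is an illustrative sketch, not the instrumentation used for the example system. The helper name `trace_file_access` and the file name are hypothetical.

```python
import builtins
import contextlib
import os
import tempfile

@contextlib.contextmanager
def trace_file_access(log):
    """Record (path, mode) for every open() call made inside the with-block."""
    original_open = builtins.open

    def traced_open(file, mode="r", *args, **kwargs):
        log.append((str(file), mode))
        return original_open(file, mode, *args, **kwargs)

    builtins.open = traced_open
    try:
        yield log
    finally:
        builtins.open = original_open  # always restore the real open

accesses = []
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.dat")
    with trace_file_access(accesses):
        with open(path, "w") as f:  # one component writes the file
            f.write("42")
        with open(path) as f:       # another component reads it
            f.read()

print([(os.path.basename(name), mode) for name, mode in accesses])
```

The recorded accesses become "process writes file" and "process reads file" relations in the source model.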
Guidelines for Reconstruction
 Obtain a high-level view of the architecture
before starting reconstruction:
 helps to identify what to look for
 helps to identify what source information to extract
(what information is architecturally relevant?)

 Use least-effort extraction
 can lightweight extraction (lexical, perl) do the job, or is
heavyweight extraction (full-blown parsing) needed?
Guidelines for Reconstruction
 The reconstructor and architect need to work
closely together when generating the
abstractions
 try out different hypotheses and generate different
views

 It is important that the maintainers and
developers are involved in the reconstruction
process at an early stage
 they can help to identify what to look for
Applications of Architecture Recovery
 Redocumenting architectures for physics simulation systems.
 Understanding architectural dependencies in embedded machine
control software to enable reengineering of the software.
 Evaluating conformance of a satellite ground station system's
implementation to its reference architecture.
 Reconstructing and evaluating the product line potential of several
automotive systems
 Reconstructing a Network Management System
 Reconstructing a Command and Control System to identify component
dependencies for migration to services in an SOA
 Reconstructing an embedded automotive system to examine
dependencies between components
Why Architecture Reconstruction
 Reasons for doing Architecture Recovery:
 Helps in understanding a system
 Assists with documenting a system
 Checks conformance of documentation
 Identifies undocumented information
 Identifies dependencies for reengineering
 Identifies dependencies for mining and reusing
components
 Provides assistance for architecture analysis
Conclusions
 Having an architecture is great for
communication and analysis, but:
 documented architectural decisions and
implemented architectural decisions drift apart

 Architecture reconstruction is a process of
rediscovering how architectural decisions
were implemented
 it's a lot of work, but worth it!
