Sas 02

Compactly Representing First-Order Structures for Static Analysis
Tel-Aviv University
Roman Manevich
Mooly Sagiv
IBM T.J. Watson
Ganesan Ramalingam
John Field
Deepak Goyal
Motivation
TVLA is a powerful and general abstract interpretation system
Abstract interpretation in TVLA
Operational semantics is expressed with first-order logic formulae
Program states are represented as sets of Evolving First-Order Structures
Space is a major bottleneck
Desired Properties
Sparse data structures
Share common sub-structures
Inherited sharing
Incidental sharing due to program invariants
But feasible time performance
Phase-sensitive data structures
Outline
Background
First-order structure representations
Base representation (TVLA 0.91)

BDD representation
Empirical evaluation
Conclusion
First-Order Logical Structures
Generalize shape graphs
Arbitrary set of individuals
Arbitrary set of predicates on individuals
Dynamically evolving
Usually small changes
Properties are extracted by evaluating first-order formulae: ∃v1, v: x(v1) ∧ n(v1, v)
Join operator requires isomorphism testing
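A minimal sketch of how such a formula can be evaluated over a three-valued structure, assuming the usual Kleene semantics (conjunction as minimum, existential quantification as maximum). The structure contents and the helper names `k_and`, `k_exists` and `x_next` are illustrative assumptions, not part of TVLA.

```python
from fractions import Fraction

# Kleene's three truth values: 0 (false), 1/2 (unknown), 1 (true).
FALSE, HALF, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)

def k_and(a, b):
    """Kleene conjunction is the minimum of the two values."""
    return min(a, b)

def k_exists(values):
    """Existential quantification is the maximum over all instantiations."""
    return max(values, default=FALSE)

# Hypothetical structure: x points to u1, u is a summary node.
nodes = ["u1", "u"]
x = {"u1": TRUE}                                  # absent entries mean 0
n = {("u1", "u"): HALF, ("u", "u"): HALF}

def x_next(v):
    """Evaluate  ∃v1: x(v1) ∧ n(v1, v)  for a fixed node v."""
    return k_exists(
        k_and(x.get(v1, FALSE), n.get((v1, v), FALSE)) for v1 in nodes
    )

per_node = {v: x_next(v) for v in nodes}
# The closed formula ∃v1, v: x(v1) ∧ n(v1, v) also quantifies over v.
closed = k_exists(per_node.values())
```

Here the formula is definitely false for u1 and unknown for u, so the closed formula evaluates to 1/2.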
First-Order Structure ADT
Structure : new() /* empty structure */
SetOfNodes : nodeSet(Structure)
Node : newNode(Structure)
removeNode(Structure, node)
Kleene : eval(Structure, p(r), <u1, ..., ur>)
update(Structure, p(r), <u1, ..., ur>, Kleene)
Structure : copy(Structure)
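The ADT above can be sketched as a small Python class. This is only an illustration under stated assumptions: a single sparse dictionary stores the non-zero predicate values, node identifiers come from a counter, and the method names are Python-style renderings of the operations listed above. It is not the TVLA implementation.

```python
from fractions import Fraction
from itertools import count

FALSE, HALF, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)

class Structure:
    """Sparse first-order structure: only non-zero values are stored."""

    _fresh = count()  # source of fresh node identifiers (illustrative)

    def __init__(self):
        self.nodes = set()
        self.values = {}  # (predicate, node tuple) -> Kleene value

    def node_set(self):
        return frozenset(self.nodes)

    def new_node(self):
        node = next(Structure._fresh)
        self.nodes.add(node)
        return node

    def remove_node(self, node):
        self.nodes.discard(node)
        self.values = {k: v for k, v in self.values.items() if node not in k[1]}

    def eval(self, pred, nodes):
        return self.values.get((pred, tuple(nodes)), FALSE)

    def update(self, pred, nodes, value):
        key = (pred, tuple(nodes))
        if value == FALSE:
            self.values.pop(key, None)   # keep the map sparse
        else:
            self.values[key] = value

    def copy(self):
        clone = Structure()
        clone.nodes = set(self.nodes)
        clone.values = dict(self.values)
        return clone

# Usage: a list abstraction with head u1 and summary node u,
# followed by the effect of the statement x = y, i.e. x(v) := y(v).
s0 = Structure()
u1, u = s0.new_node(), s0.new_node()
s0.update("y", (u1,), TRUE)
s0.update("sm", (u,), HALF)
s0.update("n", (u1, u), HALF)
s0.update("n", (u, u), HALF)

s1 = s0.copy()
for v in s0.node_set():
    s1.update("x", (v,), s0.eval("y", (v,)))
```

The full copy in `copy()` is the naive choice; it is exactly the cost that sharing-based representations try to avoid.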
print_all Example
/* list.h */
typedef struct node {
    struct node *n;
    int data;
} *L;

/* print.c */
#include <stdio.h>
#include "list.h"

/* Print the data field of every element in the list y. */
void print_all(L y) {
    L x;
    x = y;
    while (x != NULL) {
        /* assert(x != NULL) */
        printf("elem=%d", x->data);
        x = x->n;
    }
}
print_all Example
x = y
x(v) := y(v)
copy(S0) : S1
nodeSet(S0) : {u1, u}
eval(S0, y, u1) : 1
update(S1, x, u1, 1)
eval(S0, y, u) : 0
update(S1, x, u, 0)
[Figure: structures S0 and S1. In both, u1 has y=1 and u is a summary node (sm=1/2), with n-edges of value 1/2; S1 additionally has x=1 on u1.]
print_all Example
while (x != NULL)
precondition : ∃v: x(v)
x = x->n
focus : ∃v1: x(v1) ∧ n(v1, v)
x(v) := ∃v1: x(v1) ∧ n(v1, v)
[Figure: structure S1 and the three structures S2.0, S2.1 and S2.2 derived from it; in S2.2 the summary node u is split into u.1 (x=1) and u.0 (sm=1/2).]
Overview and Main Results

1. Two novel representations of first-order structures
   New BDD representation
   New representation using functional maps
2. Implementation techniques
3. Comparison of different representations
   Space is reduced by a factor of 4-10
   New representations scale better
Base Representation
(Tal Lev-Ami, SAS 2000)
Two-Level Map: Predicate → (Node Tuple → Kleene)
Sparse Representation
Limited inherited sharing by Copy-On-Write
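A minimal sketch of a two-level map with copy-on-write, assuming sharing at the granularity of one predicate's table. The class name `CowStructure` and the `owned` bookkeeping are assumptions made for illustration; the slides do not describe how TVLA 0.91 implements this.

```python
from fractions import Fraction

FALSE, HALF, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)

class CowStructure:
    """Two-level map: predicate -> (node tuple -> Kleene value)."""

    def __init__(self, tables=None):
        # Inner tables may be shared with the structure we were copied from.
        self.tables = dict(tables or {})
        self.owned = set()  # predicates whose inner table is private to us

    def copy(self):
        # Only the outer map is duplicated. The original gives up ownership
        # too, so whichever side writes first makes a private copy.
        self.owned.clear()
        return CowStructure(self.tables)

    def eval(self, pred, nodes):
        return self.tables.get(pred, {}).get(tuple(nodes), FALSE)

    def update(self, pred, nodes, value):
        if pred not in self.owned:
            self.tables[pred] = dict(self.tables.get(pred, {}))
            self.owned.add(pred)
        if value == FALSE:
            self.tables[pred].pop(tuple(nodes), None)  # stay sparse
        else:
            self.tables[pred][tuple(nodes)] = value

# Usage: after x = y only the table of x is new; n is still shared.
s0 = CowStructure()
s0.update("y", ("u1",), TRUE)
s0.update("n", ("u1", "u"), HALF)
s1 = s0.copy()
s1.update("x", ("u1",), TRUE)
shared_before_write = s1.tables["n"] is s0.tables["n"]
s1.update("n", ("u1", "u"), TRUE)   # first write to n triggers the copy
```

Sharing here is only inherited: two structures that happen to be equal but were built independently share nothing, which is the limitation the slide points at.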
BDDs in a Nutshell (Bryant 86)
Ordered Binary Decision Diagrams
Data structure for Boolean functions
Functions are represented as (unique) DAGs
[Figure: decision diagram over the variables x1, x2, x3 with terminals 0 and 1]
Also achieve sharing across functions
[Figure: diagrams over x1, x2, x3 illustrating the three reductions: Duplicate Terminals, Duplicate Nonterminals, Redundant Tests]
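The three reductions can be sketched with a unique table (hash-consing). This is a textbook-style illustration, not the BDD package used in the work; the class `BDD` and its methods `mk` and `build` are names chosen here, and `build` enumerates all assignments, so it is only suitable for tiny examples.

```python
class BDD:
    """Reduced ordered BDD store; node ids 0 and 1 are the terminals."""

    def __init__(self):
        self.nodes = [None, None]   # placeholders for terminals 0 and 1
        self.unique = {}            # (var, low, high) -> node id

    def mk(self, var, low, high):
        if low == high:             # redundant test: both branches agree
            return low
        key = (var, low, high)
        if key not in self.unique:  # duplicate nonterminal: reuse if present
            self.unique[key] = len(self.nodes)
            self.nodes.append(key)
        return self.unique[key]

    def build(self, f, nvars, var=1, env=()):
        """Shannon expansion of f(x1..xn) in the order x1 < x2 < ... < xn."""
        if var > nvars:
            # duplicate terminals: every leaf is one of the two ids 0 and 1
            return int(bool(f(*env)))
        low = self.build(f, nvars, var + 1, env + (0,))
        high = self.build(f, nvars, var + 1, env + (1,))
        return self.mk(var, low, high)

bdd = BDD()
f = bdd.build(lambda x1, x2, x3: (x1 and x2) or x3, 3)
g = bdd.build(lambda x1, x2, x3: x3, 3)
```

The full decision tree for `f` has seven internal nodes, while the store holds three. Building `g` adds nothing, because its diagram is already a sub-graph of `f`; this is the sharing across functions mentioned above, and equality of two functions becomes equality of their node ids.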
Encoding Structures Using Integers
Static encoding of
  Predicates
  Kleene values
Dynamic encoding of nodes
  0, 1, ..., n-1
Encode predicate p's values as
  ep(p) . en(u1) . en(u2) . ... . en(un) . ek(Kleene)
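One way to read the concatenation above is as bit-field packing into a single integer. The sketch below assumes fixed-width fields, a two-bit code for Kleene values (0 for 0, 1 for 1, 2 for 1/2) and a predicate index taken from a static table; these widths and codes are assumptions, not the encoding used in the paper.

```python
def bits_needed(count):
    """Bits required to encode `count` distinct values (at least one bit)."""
    return max(1, (count - 1).bit_length())

def encode(pred_index, node_indices, kleene_code, n_nodes):
    """Pack ep(p) . en(u1) ... en(ur) . ek(Kleene) into one integer."""
    width = bits_needed(n_nodes)
    code = pred_index
    for u in node_indices:
        code = (code << width) | u
    return (code << 2) | kleene_code    # two bits cover three Kleene values

# Hypothetical static tables.
PREDICATES = {"x": 0, "y": 1, "n": 2}
KLEENE = {"0": 0, "1": 1, "1/2": 2}

# Dynamic node numbering for a structure with two nodes: u1 -> 0, u -> 1.
code = encode(PREDICATES["n"], (0, 1), KLEENE["1/2"], n_nodes=2)
```

With these choices `n(u1, u) = 1/2` becomes the bit string 10 0 1 10, and a whole structure becomes a set of such integers.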
BDD Representation of Integer Sets
Characteristic function
S = {1, 5}
1 = <001>
5 = <101>
χ_S = (¬x1 ∧ ¬x2 ∧ x3) ∨ (x1 ∧ ¬x2 ∧ x3)
[Figure: BDD of the characteristic function over x1, x2, x3]
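A small check of the characteristic function for S = {1, 5}, taking x1 as the most significant bit as in the encodings <001> and <101>. The helper names `chi` and `to_int` are illustrative.

```python
from itertools import product

S = {1, 5}

def chi(x1, x2, x3):
    """Characteristic function of S, one conjunct per member."""
    return (not x1 and not x2 and x3) or (x1 and not x2 and x3)

def to_int(x1, x2, x3):
    return (x1 << 2) | (x2 << 1) | x3

assignments = list(product((0, 1), repeat=3))

# The satisfying assignments are exactly the members of S.
members = {to_int(*bits) for bits in assignments if chi(*bits)}

# The function does not depend on x1: it equals (not x2) and x3.
independent_of_x1 = all(
    bool(chi(x1, x2, x3)) == bool(not x2 and x3) for x1, x2, x3 in assignments
)
```

Because the function is independent of x1, a reduced BDD for it needs no test on x1 at all.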
BDD Representation Example

[Figure: structures S0, S1 (after x = y) and S2.2 (after x = x->n), each shown together with its BDD representation]
Improved BDD Representation
Using this representation directly doesn't save space
Observation
  Node names can be arbitrarily remapped without affecting the ADT semantics
Our heuristics
  Use canonic node names to encode nodes
  Increases incidental sharing
  Reduces isomorphism test to pointer comparison
  4-10x space reduction
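A sketch of the idea behind canonic node names, under a strong assumption that is not stated on the slide: every node is distinguished by its values on a fixed list of unary predicates. The names `UNARY`, `canonic_name`, `normalize` and `intern_structure` are hypothetical, and the actual heuristics may differ.

```python
from fractions import Fraction

FALSE, HALF, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)
UNARY = ("x", "y", "sm")   # hypothetical fixed order of unary predicates

def canonic_name(node, values):
    """Base-3 number built from the node's unary predicate values."""
    name = 0
    for p in UNARY:
        name = name * 3 + int(values.get((p, (node,)), FALSE) * 2)
    return name

def normalize(values):
    """Rename every node to its canonic name; the result is hashable."""
    nodes = {u for (_, tup) in values for u in tup}
    rename = {u: canonic_name(u, values) for u in nodes}
    # Assumption: unary predicate values identify each node uniquely.
    assert len(set(rename.values())) == len(nodes)
    return frozenset(
        ((p, tuple(rename[u] for u in tup)), v) for (p, tup), v in values.items()
    )

_interned = {}

def intern_structure(values):
    """Return one shared object per normalized structure."""
    key = normalize(values)
    return _interned.setdefault(key, key)

# Two isomorphic structures that differ only in their node names.
a = {("y", ("u1",)): TRUE, ("sm", ("u",)): HALF,
     ("n", ("u1", "u")): HALF, ("n", ("u", "u")): HALF}
b = {("y", ("p7",)): TRUE, ("sm", ("q3",)): HALF,
     ("n", ("p7", "q3")): HALF, ("n", ("q3", "q3")): HALF}

same_object = intern_structure(a) is intern_structure(b)
```

After renaming, the two structures are equal value for value, so interning returns the same object and the isomorphism test reduces to an identity comparison.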
Reducing Time Overhead
Current implementation not optimized
Expensive formula evaluation
Hybrid representation
Distinguish between phases: mutable phase → Join → immutable phase
Dynamically switch representations
Functional Representation
Alternative representation for first-order structures
Structures represented by maps from integers to Kleene values
Tailored for representing first-order structures
Achieves better results than BDDs
Techniques similar to the BDD representation
More details in the paper
Empirical Evaluation
Benchmarks:
Cleanness Analysis (SAS 2000)
Garbage Collector
CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks
Mobile Ambients (ESOP 2000)
Stress testing the representations
We use relational analysis
Save structures in every CFG location
Space Results
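[Figure: bar chart of space consumption (axis 0 to 450) for the Base, OBDD total and Functional representations on the JFE, KERNEL, CA, MA and GC benchmarks; data labels recovered from the chart: 402.8, 187.7, 168.2, 51.6, 12.8, 5.5, 22.7, 16.7, 12.9, 9.6]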
Abstract Counters
Ignore language/implementation details

A more reliable measurement technique
Count only crucial space information

Independent of C/Java
Abstract Counters Results

[Figure: bar chart of abstract counter totals (axis 0 to 45,000,000) for the Base, OBDD and Functional representations on the JFE, KERNEL, CA, MA and GC benchmarks]
Trends in the
Cleanness Analysis Benchmark
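[Figure: chart for the Cleanness Analysis benchmark comparing the Base, OBDD and Functional representations (axis 0 to 600); labels recovered from the chart: 564, 505, 74, 54, 42, 50, 1, 10]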
What's Missing from this Work?
Investigate other node mapping heuristics

Compactly represent sets of structures
Time optimizations
Conclusions
Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
Implementation techniques
  Normalization techniques are crucial
Comparison of different representations
  Space is reduced by a factor of 4-10
  New representations scale better
Conclusions
The use of BDDs for static analysis is not a panacea for space saving
Domain-specific encoding crucial for saving space
Failed attempts
  Original implementation of Veith's encoding
  PAG
The End
