There is Plenty of Room at the Bottom
Shankar Balachandaran
Dept. of Computer Science and Engineering
Indian Institute of Technology Madras
shankar@cse.iitm.ac.in
Who Said This?
1946 in UPenn
Measured in cubic ft.
ENIAC on a Chip
[Figure: power density of processors (4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, P6), 1970 to 2010, on a scale of 1 to 1000, with hot plate, nuclear reactor and rocket nozzle as reference levels]
Productivity Gap
“How many gates
can I get for $N?”
Complexity
How to Handle Complexity?
VLSI Design Flow
Specifications
X = AB;
Y = CD;
Z = X + Y;
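The small specification above can be checked against a gate-level implementation by exhaustive comparison. The sketch below assumes that `AB` means AND and `+` means OR (the usual Boolean algebra reading); the NAND-NAND implementation and the function names are illustrative choices, not part of the original flow.

```python
from itertools import product

def spec(a, b, c, d):
    """Specification: X = AB; Y = CD; Z = X + Y (assumed AND / OR)."""
    x = a and b
    y = c and d
    return x or y

def nand(p, q):
    return not (p and q)

def impl(a, b, c, d):
    """Hypothetical gate-level implementation using only NAND gates."""
    return nand(nand(a, b), nand(c, d))

def equivalent(f, g, n_inputs):
    """Compare two functions on every input combination (2**n_inputs cases)."""
    return all(
        bool(f(*bits)) == bool(g(*bits))
        for bits in product([False, True], repeat=n_inputs)
    )

print(equivalent(spec, impl, 4))
```

Exhaustive comparison only works for a handful of inputs, which is one reason equivalence checking needs dedicated tools at real design sizes.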
Design Automation
Tools are used at every step
Manual intervention is still required
Tools do not scale up very well
Many problems are NP-Complete
Theory vs Practice
Before Tools
Faggin laid out 4004 by hand
Drawn on paper and photographed
Demagnified 500 times
Seymour Cray relied on humans to build
supercomputers
Women used to cut wires by hand
Delicate handwork was necessary
Almost no verification or validation
Chips may not function properly
Market may return products
Design Goals Over Time
[Figure: timeline of dominant design goals. Labels recovered from the figure: area; speed; speed/area; speed + power; speed/power/reliability; power; reliable; low power; ultra-low power; ?]
Yield
Manufacturability
Power: portable applications (PCS, wireless, etc.)
Area: cost of the die
DAC Panel – Where Should the
R&D Money be Spent?
Spending categories: Variability/Litho/Mask/Fab; Low Power/Leakage; Power Delivery/Integrity; Tool/Flow Enhancements/OA; IP Reuse/Abstraction/SysLevel Design; DSM Analysis; P&R and Opt; Others (Lotto)
[Figure: percentage split (0% to 100%) across these categories for each panelist: Intel, IBM, Synopsys, TUE-Magma, Cadence, STMicro]
Required Advance in Design System Architecture
[Figure: design system architecture at successive technology nodes (labels: Yesterday 1000nm, Today 180nm, Today 130nm, Tomorrow 50nm). File-based flows of logic design, synthesis, timing analysis, equivalence checking and masks give way to a shared data model with a "cockpit" and "auto-pilot" for hardware/software optimization and analysis (performance, timing, power, noise, testability)]
Multiple design files are converged into one efficient Data Model
Disk accesses are eliminated in critical methodology loops
Verification of Function, Performance, Testability and other design criteria all move to earlier, higher levels of abstraction, followed by equivalence checking and assertion driven design optimizations
Industry Standard interfaces for data access and control
Incremental modular tools for optimization and analysis
Advances in Software
Compiler can control sequence of instructions
to mitigate power
Reorder instructions so that the number of bit
changes between successive instructions is
minimized
Aggressive optimization to prevent Cache
miss
Accurate branch prediction to avoid pipeline
flush and refill
Garbage Collection – Efficient turning off
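The instruction reordering idea can be sketched by counting bit changes (Hamming distance) between successive instruction words and picking the order with the fewest. This is a toy illustration: the 4-bit encodings are hypothetical, and it assumes the instructions are mutually independent, which a real compiler would have to verify.

```python
from itertools import permutations

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two instruction words differ."""
    return bin(a ^ b).count("1")

def total_toggles(words) -> int:
    """Sum of bit changes between successive words in the given order."""
    return sum(hamming(x, y) for x, y in zip(words, words[1:]))

def best_order(words):
    """Exhaustively pick the order with the fewest bit changes.

    Assumption: the instructions are independent, so any order is legal.
    A real scheduler must also respect data and control dependences.
    """
    return min(permutations(words), key=total_toggles)

block = [0b1111, 0b0000, 0b1110, 0b0001]  # hypothetical 4-bit encodings
print(total_toggles(block), total_toggles(best_order(block)))
```

Exhaustive search is only feasible for a few instructions; for longer blocks a heuristic ordering would be needed.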
Loop Level Power Control
Instruct hardware to turn itself off when not
needed
[Figure: a loop header followed by loop body I and loop body II, with "turn off" points marking where unneeded hardware is switched off for each body]
Design Automation Problems
Interconnect delay dominates system
performance
Consumes 70% of clock cycle
Multiple clock cycles required to cross chip
Whether it takes 3 or 15 cycles matters less than the fact that it takes more than 1
Correct by construction methodology
Avoid iterative Logic Synthesis, Placement and
Routing loop
Prevention is better than cure
DA – Complexity Issues
Silicon Complexity + Design Complexity
Design convergence – Abstract what’s beneath
Prevent instead of analyze + verify
Many issues become first class citizens
Unify
Keep the database and models unified
Tight integration of synthesis with layout issues
Cost issue
Reuse of IP blocks
Design Convergence
What must converge?
Timing, Logic, ...
Provide predictable back-end
Correct by construction – “assume” then “enforce”
Constraints and assumptions are sent downstream
Not much goes upstream
Construct by Correction
Concurrently Optimize Logic + Layout
Elimination of concerns
Reduce degrees of freedom
Partition the design into globally asynchronous and locally
synchronous modules
Interconnect Complexities
Blocks cannot grow in size
Designing them becomes difficult
Small blocks mean more wires
[Figure: normalized occurrence rate vs. wirelength, with one group of local wires and one group of global wires; the wirelength axis is marked at about 0.5 of the die size]
Placement Needs
More hierarchical than flat
Support placement of partial designs
Incremental model
Construct by correct model
Characterizable for synthesis
Placement + Synthesis ????
buffering
resizing
cloning
Router Needs
Hierarchical
Scalable
Should not break down for large number of
nets/large area of routing
Incremental, controllable, well-characterized
Detunable (e.g., coarse/quick routing), ...
Also...
Degrees of freedom
Wire widths/spacings, shielding/interleaving,
driver/repeater sizing
Router empowered to perform small logic resyntheses
Change in search mechanisms
Iterative ripup/reroute replaced by “atomic topology
synthesis utilities”
Construct entire topologies to satisfy constraints in arbitrary
contexts
Combinatorics
Millions of cells
Millions of moves
Orientation
Pin position
Multiple orthogonal objectives
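To get a feel for the size of the search space, one can count the ways to assign n cells to n slots, which is n!. The sketch below computes its base-10 logarithm; it ignores orientation and pin position, which only make the count larger, and the function name is illustrative.

```python
import math

def log10_orderings(n_cells: int) -> float:
    """log10 of n!, the number of ways to assign n cells to n slots."""
    # lgamma(n + 1) is the natural log of n!, computed without overflow
    return math.lgamma(n_cells + 1) / math.log(10)

for n in (10, 1_000, 1_000_000):
    print(n, round(log10_orderings(n)))
```

Even for a thousand cells the number of arrangements has thousands of digits, so exhaustive search is out of the question.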
Divide and Conquer
Divide the Problem into Smaller Sub-Problems
Solve Each of these Separately
Stitch together the Solutions of the Sub-Problems
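The three steps above can be sketched as a recursive bisection that orders cells in a row. This is a minimal sketch under stated assumptions: the exhaustive min-cut search stands in for a real partitioning heuristic, and the cell and net names are hypothetical.

```python
from itertools import combinations

def cut(left, right, nets):
    """Number of nets with at least one cell on each side."""
    return sum(1 for net in nets if left & set(net) and right & set(net))

def bisect(cells, nets):
    """Divide: split the cells into two halves with the fewest cut nets.

    Exhaustive search over all halves, so only usable for tiny examples.
    """
    half = len(cells) // 2
    best = min(
        combinations(cells, half),
        key=lambda left: cut(set(left), set(cells) - set(left), nets),
    )
    left = list(best)
    right = [c for c in cells if c not in best]
    return left, right

def place(cells, nets):
    """Solve each half separately, then stitch the two orderings together."""
    if len(cells) <= 1:
        return list(cells)
    left, right = bisect(cells, nets)
    return place(left, nets) + place(right, nets)

cells = ["a", "b", "c", "d"]
nets = [("a", "c"), ("b", "d")]
print(place(cells, nets))
```

Connected cells end up next to each other because each split tries to keep nets inside one sub-problem.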
[Figure: chart over 2000 to 2008 with a 0% to 30% axis and annotations 12W, 88W and 400W; caption not recoverable]
Power vs Energy
Power is the height of the curve; energy is the area under it
Lower power design could simply be slower
[Figure: power (watts) vs. time for Approach 1 and Approach 2]
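The distinction can be made concrete with two sampled power curves. In this sketch the numbers are invented for illustration: the second approach draws half the power but runs twice as long, so it consumes the same energy.

```python
def energy(power_samples, dt):
    """Energy in joules: area under a sampled power curve (rectangle rule)."""
    return sum(power_samples) * dt

# Hypothetical power traces in watts, one sample per second
approach_1 = [2.0] * 5    # higher power, finishes in 5 s
approach_2 = [1.0] * 10   # lower power, takes 10 s

for name, trace in (("Approach 1", approach_1), ("Approach 2", approach_2)):
    print(name, "peak power:", max(trace), "W, energy:", energy(trace, 1.0), "J")
```

A design that only lowers the height of the curve does not save battery life unless the area under the curve also shrinks.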
Design Space
[Table: design space for leakage. Columns: constant throughput/latency; variable throughput/latency. Entries recovered: + Multi-VT; sleep transistors, Multi-Vdd, variable VT; + variable VT]
Leakage Control
[Figure: leakage control techniques. Labels recovered: input vector 0 1 1 0 1 0; high-Vt devices between Vdd and low-Vt logic; variable Vt through substrate or SOI back gate Vt control]
Dynamic Thermal Management
Trigger Mechanism: When do we enable DTM techniques?
Initiation Mechanism: How do we enable the technique?
Response Mechanism: What technique do we enable?
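One possible answer to these three questions is sketched below: a temperature threshold as the trigger, a polling loop as the initiation, and clock throttling as the response. The thresholds, the hysteresis band and the function names are all assumptions made for illustration, not values from any particular processor.

```python
def dtm_step(temp_c, throttled, trigger_c=85.0, release_c=80.0):
    """Decide whether the response (e.g. clock throttling) should be active.

    Trigger: temperature reaches trigger_c (hypothetical threshold).
    Release: temperature falls to release_c; in between, keep the old state
    so the response does not flicker on and off.
    """
    if temp_c >= trigger_c:
        return True
    if temp_c <= release_c:
        return False
    return throttled

def run(trace):
    """Initiation by polling: apply dtm_step to each temperature sample."""
    state, history = False, []
    for temp_c in trace:
        state = dtm_step(temp_c, state)
        history.append(state)
    return history

print(run([70, 86, 83, 79, 82]))
```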
Crosstalk Induced Errors
Transition on an adjoining signal causes
unintended logic transition
Symptom: chip fails (repeatably) on certain
logic operations
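[Figure: a victim net coupled to an interfering net through coupling capacitance (coupling C); each wire has series resistance (wire R) and capacitance to ground (grounded C)]
[Figure: delay (0 to 3) vs. risetime of interferer / risetime of victim (0 to 6)]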
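A first-order feel for the noise on the victim comes from treating the coupling and grounded capacitances as a capacitive divider. This sketch assumes the victim is floating (undriven) and ignores wire resistance, so it gives an upper bound rather than a realistic waveform; the values are hypothetical.

```python
def coupled_noise(v_swing, c_coupling, c_ground):
    """Peak noise on a floating victim when the interferer swings by v_swing.

    Capacitive divider: v_swing * Cc / (Cc + Cg). Capacitances may be in any
    unit as long as both use the same one.
    """
    return v_swing * c_coupling / (c_coupling + c_ground)

# Hypothetical values: 1.8 V swing, 10 fF coupling, 30 fF to ground
print(coupled_noise(1.8, 10.0, 30.0))
```

The ratio shows why wide spacing (smaller coupling C) or shielding reduces the induced glitch.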
Electrical Optimization
Consider All Issues in SP&R
Physical … proximity of the signal
Temporal … noise event occurs within the timing
window
Critical … is the path important
Electrical … driver strength vs pin cap
Congestion Estimation
More congested regions will have more crosstalk
IR Drop
• Voltage drop in supply lines from currents drawn
by cells
• Symptom: chip malfunctions on certain vectors
• Biggest problem: what's the worst-case vector?
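The drop itself is Ohm's law applied along the supply rail. The sketch below models a rail fed from one end, with equal-resistance segments and one cell tapping current after each segment; the topology and the numbers are assumptions for illustration.

```python
def rail_voltages(vdd, seg_res, cell_currents):
    """Voltage seen by each cell along a supply rail fed from one end.

    Each segment carries the current of every cell downstream of it, so the
    drop across a segment is seg_res times that downstream current.
    """
    voltages = []
    v = vdd
    downstream = sum(cell_currents)
    for i_cell in cell_currents:
        v -= seg_res * downstream
        voltages.append(v)
        downstream -= i_cell
    return voltages

# Hypothetical rail: 1.0 V supply, 1 ohm per segment, three cells drawing 10 mA
print(rail_voltages(1.0, 1.0, [0.01, 0.01, 0.01]))
```

The cell currents depend on which gates switch, which is why the worst case depends on the input vector.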
Electromigration
Power supply lines fail due to excessive
current
Symptom: chip eventually fails in the field
when wire breaks
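A common design-time guard is to keep the current density in each supply wire below a process-specific limit. The sketch below computes current density from the wire cross-section and the minimum width for a given limit; the limit and dimensions are placeholder values, not data for any real process.

```python
def current_density(i_amps, width_um, thickness_um):
    """Current density in A/um^2 for a rectangular wire cross-section."""
    return i_amps / (width_um * thickness_um)

def min_width_um(i_amps, thickness_um, j_max):
    """Smallest width that keeps current density at or below j_max (A/um^2)."""
    return i_amps / (j_max * thickness_um)

# Hypothetical numbers: 2 mA through a 0.5 um thick wire, limit 0.002 A/um^2
print(current_density(0.002, 1.0, 0.5))
print(min_width_um(0.002, 0.5, 0.002))
```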
Cause for EM
[Figure: hillocks and voids in a metal line]
Layout Uncertainties
[Figure: layout at 0.25µ and 0.18µ]
[Figure: MEBES data volume vs. technology node (180nm, 130nm, 90nm, 70nm), vertical axis 0 to 200]
Photolithographic Process
[Figure: photolithographic process, with labels "optical mask" and "oxidation" (UMC Taiwan)]
Two New Paradigms
Design for Manufacturability
Make manufacturer’s life easier
Correct by construction
Close interaction with the fabs
Design for Yield
Consider various yield issues during design
Cells with high yield statistics are better for library
Strict design practices
How Can You Help?
Broad research area
Varied backgrounds required
Computer Science
Algorithms, Databases, Graphics, Visualization
Electrical Engineering
Circuit designers – Digital, Analog, Mixed, RF
Material Science
Physics
Mathematics
Think Big
By thinking about small things
Skill Sets Required
Front end design
C, C++, Perl
VHDL or Verilog
Good Understanding of Digital Circuit
fundamentals
CAD Tool Designer
Good programming skills
Optimization Theory
Algorithms, Data Structures
Skill Sets Required (contd.)
ASIC Designers
VLSI tools
Digital Circuits
Design Styles – ASIC vs FPGA vs …
Circuit Designers
Spice, Layout
CMOS, Low power design
Everyone
Applications – DSP, Wireless Communication, Data
Communication
Computer Architecture, Digital Circuits
Publications in reputed conferences and journals
Potential Employers
Front end designers and ASIC designers
Endless List
Architects
Intel, AMD, ARM, Xilinx, Altera, ...
CAD Tool Designers
Synopsys, Cadence, Magma, Mentor Graphics, Synplicity
Circuit Designers
TI, Broadcom, Motorola, ...
Material Science
AMAT, STM, TI, ...
Academia
For Teachers
Many skills required
Not enough time
Develop courses around these skill sets
Take summer breaks to work for companies
Update your knowledge base
Homework is the best cure
Famous Quotes
IBM founder T. J. Watson in 1945
“I don’t think there will ever be a market for more
than 5 computers in this world.”
Ken Olsen, president of Digital Equipment
Corp., 1977
“There is no reason anyone would want a
computer in their home.”
There is Plenty of
Room at the Bottom
Questions?
References
Course Notes of EE6325 CMOS VLSI Design
at UTD
ITRS 2003
ICCAD 2000 Tutorial on Current Issues in
CAD
MUSIC 2005
Jan Rabaey’s Course Notes
Other web resources for the historical
aspects of VLSI
Thank You