
ASIC vs FPGA

S. Mancini

Outline
  Introduction
    ASICs
    FPGA
    Cost models
  Design methodology
  Radiation hardening
  SoC
  Summary

Problem statement

FPGA or ASIC?

On which criteria should the choice be based? What do the design methods have in common, and where do they differ?

Introduction

The families
ASICs (Application-Specific Integrated Circuits) fall into several families:

Full custom
  The transistor masks are drawn individually.

Standard cells
  The circuit is an assembly of placed and routed cells.

Gate array
  A sea of gates is routed.

Embedded gate array
  A gate array with complex macro-blocks (RAM).

Introduction: ASICs

Technology evolution

[Figure: Technology Trends. Process generation (250 nm, 180 nm, 130 nm, 100 nm/90 nm, 65 nm) versus year (1997 to 2009) for the 1999, 2000 and 2001 ITRS (International Technology Roadmap for Semiconductors) roadmaps and the leading foundry. Source: UK Design Forum, 1st April, 2003.]

90 nm technology
  430 Kgates/mm2
  SRAM: 1.6 to 1.2 mm2 per Mbit
  DRAM: 0.5 mm2 per Mbit
  6 to 9 metal layers
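As a rough illustration of the 90 nm density figures above, the following Python sketch estimates the core area of a design from its gate count and embedded memory. The function name and the example design (250K gates plus 1 Mbit of SRAM) are assumptions chosen for illustration; the densities are the ones quoted above, taking the pessimistic 1.6 mm2 per Mbit for SRAM. Routing and pad overhead are ignored.

```python
# Densities quoted above for the 90 nm node.
GATES_PER_MM2 = 430_000        # 430 Kgates/mm2
SRAM_MM2_PER_MBIT = 1.6        # pessimistic end of the quoted SRAM range (assumption)
DRAM_MM2_PER_MBIT = 0.5


def estimate_area_mm2(gates, sram_mbit=0.0, dram_mbit=0.0):
    """Return a first-order core area estimate in mm2 (no routing or pad overhead)."""
    logic = gates / GATES_PER_MM2
    memory = sram_mbit * SRAM_MM2_PER_MBIT + dram_mbit * DRAM_MM2_PER_MBIT
    return logic + memory


if __name__ == "__main__":
    # Hypothetical design: 250K gates and 1 Mbit of SRAM.
    area = estimate_area_mm2(250_000, sram_mbit=1.0)
    print(f"Estimated core area: {area:.2f} mm2")
```

Under these assumptions the logic itself occupies well under 1 mm2, and the embedded memory dominates the estimate.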

Introduction: FPGA

Principle
Offer generic circuits that can be reconfigured at will. They consist of arrays of reconfigurable cells and an interconnection network.
Main vendors: Actel, Altera, Atmel, Cypress, Lattice, Minc, QuickLogic, Xilinx

The technologies differ in:
  The technology used to store the configuration
  The type of elementary cells

Programming technologies
The three main programming technologies are:

SRAM
  Dynamically reconfigurable
  Standard technology
  Configuration is lost at power-down

Flash (floating gate)
  Retains the configuration
  Stand-alone circuit
  Non-standard technology

Antifuse
  Minimal footprint
  Not reprogrammable
  Specific technology
Actel (ProASIC)

ProASICPLUS Flash Family FPGAs

ProASICPLUS Architecture
The proprietary ProASICPLUS architecture provides granularity comparable to gate arrays.

The ProASICPLUS device core consists of a Sea-of-Tiles (Figure 1). Each tile can be configured as a 3-input logic function (e.g., NAND gate, D-Flip-Flop, etc.) by programming the appropriate Flash switch interconnections (Figure 2 on page 6 and Figure 3 on page 6). Tiles and larger functions are connected with any of the four levels of routing hierarchy. Flash switches are distributed throughout the device to provide nonvolatile, reconfigurable interconnect programming. Flash switches are programmed to connect signal lines to the appropriate logic cell inputs and outputs. Dedicated high-performance lines are connected as needed for fast, low-skew global signal distribution throughout the core. Maximum core utilization is possible for virtually any design.

ProASICPLUS devices also contain embedded two-port SRAM blocks with built-in FIFO/RAM control logic. Programming options include synchronous or asynchronous operation, two-port RAM configurations, user defined depth and width, and parity generation or checking. Please see the Embedded Memory Configurations section on page 21 for more information.

Flash Switch
Unlike SRAM FPGAs, ProASICPLUS uses a live on power-up ISP Flash switch as its programming element. In the ProASICPLUS Flash switch, two transistors share the floating gate, which stores the programming information. One is the sensing transistor, which is only used for writing and verification of the floating gate voltage. The other is the switching transistor. It can be used in the architecture to connect/separate routing nets or to configure logic. It is also used to erase the floating gate (Figure 2 on page 6).

[Figure 2: Flash Switch. Sensing and switching transistors sharing a floating gate, with Word, Switch In and Switch Out connections.]

Logic Tile
The logic tile cell (Figure 3 on page 6) has three inputs (any or all of which can be inverted) and one output (which can connect to both ultra-fast local and efficient long-line routing resources). Any three-input, one-output logic function (except a three-input XOR) can be configured as one tile. The tile can be configured as a latch with clear or set or as a flip-flop with clear or set. Thus, the tiles can flexibly map logic and sequential gates of a design.
[Figure 3: Core Logic Tile. Inputs In 1, In 2 (CLK) and In 3 (Reset); output to local routing and efficient long-line routing.]

Routing Resources
The routing structure of ProASICPLUS devices is designed to provide high performance through a flexible four-level hierarchy of routing resources: ultra-fast local resources, efficient long-line resources, high speed very long-line resources, and high performance global networks.

The ultra-fast local resources are dedicated lines that allow the output of each tile to connect directly to every input of the eight surrounding tiles (Figure 4 on page 7).

The efficient long-line resources provide routing for longer distances and higher fanout connections. These resources vary in length (spanning 1, 2, or 4 tiles), run both vertically and horizontally, and cover the entire ProASICPLUS device (Figure 5 on page 7). Each tile can drive signals onto the efficient long-line resources, which can in turn, access every input of every tile. Active buffers are inserted automatically by routing software to limit the loading effects due to distance and fanout.

The high-speed very long-line resources, which span the entire device with minimal delay, are used to route very long or very high fanout nets (Figure 6 on page 8).

The high-performance global networks are low skew, high fanout nets that are accessible from external pins or from internal logic (Figure 7 on page 9). These nets are typically used to distribute clocks, resets, and other high fanout nets requiring a minimum skew. The global networks are implemented as clock trees, and signals can be introduced at any junction. These can be employed hierarchically with signals accessing every input on all tiles.

[Figure 1: The ProASICPLUS Device Architecture. Logic tiles, RAM blocks (256x9 two-port SRAM or FIFO block) and I/Os.]

Circuit              APA100
System Gates         1 000 000
Tiles (Registers)    56 320
RAM                  198 kBit
PLL                  2
Clocks               88

Actel (Axcelerator)

Axcelerator Family FPGAs

Logic Modules
Actel's Axcelerator family provides two types of logic modules, the register cell (R-cell) and the combinatorial cell (C-cell). The AX C-cell can implement more than 4,000 combinatorial functions of up to 5 inputs (Figure 3 on page 5). The C-cell contains carry logic for even more efficient implementation of arithmetic functions. With its small size, the C-cell structure is extremely synthesis-friendly, simplifying the overall design as well as reducing design time.

The R-cell contains a flip-flop featuring asynchronous clear, asynchronous preset, and active-low enable control signals (Figure 3 on page 5). The R-cell registers feature programmable clock polarity selectable on a register-by-register basis. This provides additional flexibility (e.g., easy mapping of dual-data-rate functions into the FPGA) while conserving valuable clock resources. The clock source for the R-cell can be chosen from the hard-wired clocks, the routed clocks, or the internal logic.

Two C-cells, a single R-cell, and two Transmit (TX) and two Receive (RX) routing buffers form a Cluster, and two Clusters comprise a SuperCluster (Figure 4 on page 5). Each SuperCluster contains an independent Buffer module, which supports automatic buffer insertion on high-fanout nets by the place-and-route tool, minimizing system delays while improving logic utilization.

The logic modules within the SuperCluster are arranged so that two combinatorial modules are side by side, giving a CCR CCR pattern to the SuperCluster. This CC pattern enables efficient implementation (minimum delay) of 2-bit carry logic for improved arithmetic performance (Figure 5 on page 5). The AX architecture is fully fracturable, meaning that if one or more of the logic modules in a SuperCluster are used by a particular signal path, the other logic modules are available for use by other paths.

[Figure: SuperCluster. C-cells (C), R-cells (R), TX and RX routing buffers and a Buffer module (B).]

[Figure 2: Axcelerator Family Interconnect Elements.]

Embedded Memory
As mentioned earlier, each core tile has either three (in a smaller tile) or four (in the regular tile) embedded SRAM blocks along the west side, and each variable-aspect-ratio SRAM block is 4,608 bits in size. Available memory configurations are: 128x36, 256x18, 512x9, 1kx4, 2kx2 or 4kx1 bits. The individual blocks have separate read and write ports that can be configured with different bit widths on each port. For example, data can be written in by 8 and read out by 1. The embedded SRAM blocks can be initialized at power up via the device JTAG port (ROM emulation mode).

In addition, every SRAM block has an embedded FIFO control unit. The control unit allows the SRAM block to be configured as a synchronous FIFO without using core logic modules. The FIFO width and depth are programmable. The FIFO also features programmable ALMOST-EMPTY (AEMPTY) and ALMOST-FULL (AFULL) flags in addition to the normal EMPTY and FULL flags. The embedded FIFO control unit also contains the counters necessary for the generation of the read and write address pointers as well as control circuitry to prevent metastability and erroneous operation. The embedded SRAM/FIFO blocks can be cascaded to create larger configurations.

[Figure 6: AX Device Architecture (AX1000 shown). Core tiles made of SuperClusters (SC) with 4k RAM/FIFO blocks and their controllers (RAMC), HD and RD columns, surrounded by the I/O structure.]

Table 1: Number of Core Tiles per Device
Device    Number of Core Tiles
AX125     1 regular tile
AX250     4 smaller tiles
AX500     4 regular tiles
AX1000    9 regular tiles
AX2000    16 regular tiles

Circuit         AX2000
System Gates    2 000 000
R-Cells         10 752
C-Cells         21 504
RAM             338 kBit
PLL             8
Clocks          4

Chip Layout
At the chip level, SuperClusters are organized into core tiles, which are arrayed to build up the full chip. Each core tile consists of an array of 336 SuperClusters and four SRAM blocks (176 SuperClusters and 3 SRAM blocks for the AX250). The SRAM blocks are arranged in a column on the west side of the tile (Figure 6 on page 6). For example, the AX1000 is composed of a 3x3 array of 9 core tiles. Surrounding the array of core tiles are blocks of I/O Clusters and the I/O bank ring (Table 1 on page 6).
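To make the Axcelerator memory organization concrete, here is a small Python sketch that checks each listed configuration against the 4,608-bit block size and totals the embedded SRAM of a device built from regular core tiles. The function and constant names are illustrative assumptions; the block size, the configurations, the four blocks per regular tile and the 3x3 tile array of the AX1000 come from the description above.

```python
# Figures taken from the Axcelerator description above.
BLOCK_BITS = 4608                  # size of one variable-aspect-ratio SRAM block
CONFIGS = [(128, 36), (256, 18), (512, 9), (1024, 4), (2048, 2), (4096, 1)]
BLOCKS_PER_REGULAR_TILE = 4        # regular core tile (a smaller tile has 3)


def used_bits(depth, width):
    """Bits addressable in a depth x width configuration."""
    return depth * width


def device_ram_bits(regular_tiles):
    """Total embedded SRAM of a device made only of regular core tiles."""
    return regular_tiles * BLOCKS_PER_REGULAR_TILE * BLOCK_BITS


if __name__ == "__main__":
    for depth, width in CONFIGS:
        bits = used_bits(depth, width)
        print(f"{depth}x{width}: {bits} bits used, {BLOCK_BITS - bits} unused")
    # AX1000: 3x3 array of 9 regular core tiles.
    print("AX1000 embedded SRAM:", device_ram_bits(9), "bits")
```

The x36, x18 and x9 configurations use the full 4,608 bits, whereas the x4, x2 and x1 configurations address 4,096 bits of the block.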

Xilinx Virtex-II Pro: block SelectRAM+ memory per device

Device      Columns   Blocks   Total in Kb   Total in Bits
XC2VP2      4         12       216           221,184
XC2VP4      4         28       504           516,096
XC2VP7      6         44       792           811,008
XC2VP20     8         88       1,584         1,622,016
XC2VP30     8         136      2,448         2,506,752
XC2VP40     10        192      3,456         3,538,944
XC2VP50     12        232      4,176         4,276,224
XC2VP70     14        328      5,904         6,045,696
XC2VP100    16        444      7,992         8,183,808
XC2VP125    18        556      10,008        10,248,192

Figure 43 shows the layout of the block RAM columns in the XC2VP4 device.
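The totals in the block RAM table follow directly from the block count, since each block SelectRAM+ resource holds 18 Kb. The Python sketch below re-derives both total columns from the number of blocks; the table values are those listed above, while the function and variable names are illustrative assumptions.

```python
BLOCK_KB = 18  # capacity of one block SelectRAM+ resource, in Kb

# (device, blocks, total Kb, total bits) as listed in the table.
BRAM_TABLE = [
    ("XC2VP2", 12, 216, 221_184),
    ("XC2VP4", 28, 504, 516_096),
    ("XC2VP7", 44, 792, 811_008),
    ("XC2VP20", 88, 1_584, 1_622_016),
    ("XC2VP30", 136, 2_448, 2_506_752),
    ("XC2VP40", 192, 3_456, 3_538_944),
    ("XC2VP50", 232, 4_176, 4_276_224),
    ("XC2VP70", 328, 5_904, 6_045_696),
    ("XC2VP100", 444, 7_992, 8_183_808),
    ("XC2VP125", 556, 10_008, 10_248_192),
]


def totals(blocks):
    """Return (total Kb, total bits) for a given number of 18 Kb blocks."""
    kb = blocks * BLOCK_KB
    return kb, kb * 1024


def mismatches(table):
    """List the devices whose tabulated totals differ from the computed ones."""
    return [dev for dev, blocks, kb, bits in table if totals(blocks) != (kb, bits)]


if __name__ == "__main__":
    print("Inconsistent rows:", mismatches(BRAM_TABLE))
```

An empty list means every row of the table is consistent with its block count.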
Xilinx (Spartan 3/Virtex II)

Source: Xilinx DS083-2 (v2.7), June 2, 2003, Advance Product Specification (www.xilinx.com).

Configurable Logic Blocks (CLBs)
The Virtex-II Pro configurable logic blocks (CLB) are organized in an array and are used to build combinatorial and synchronous logic designs. Each CLB element is tied to a switch matrix to access the general routing matrix, as shown in Figure 23.

A CLB element comprises 4 similar slices, with fast local feedback within the CLB. The four slices are split in two columns of two slices with two independent carry logic chains and one common shift chain.

[Figure 23: Virtex-II Pro CLB Element. Four slices (X0Y0, X0Y1, X1Y0, X1Y1), a switch matrix, TBUFs, carry chains (CIN, COUT), a shift chain and fast connects to neighbors.]

Slice Description
Each slice includes two 4-input function generators, carry logic, arithmetic logic gates, wide function multiplexers and two storage elements. As shown in Figure 24, each 4-input function generator is programmable as a 4-input LUT, 16 bits of distributed SelectRAM+ memory, or a 16-bit variable-tap shift register element.

[Figure 24: Virtex-II Pro Slice Configuration. Function generators LUT F and LUT G (each usable as RAM16 or SRL16), MUXF5 and MUXFx, carry logic (CY), ORCY, arithmetic logic and two register/latch elements.]

18-Bit x 18-Bit Multipliers
A Virtex-II Pro multiplier block is an 18-bit by 18-bit 2's complement signed multiplier. Virtex-II Pro devices incorporate many embedded multiplier blocks. These multipliers can be associated with an 18 Kb block SelectRAM+ resource or can be used independently. They are optimized for high-speed operations and have a lower power consumption compared to an 18-bit x 18-bit multiplier in slices.

[Figure 43: XC2VP4 Block RAM Column Layout. CLBs, BRAM and multiplier blocks, DCMs, RocketIO serial transceivers and the PPC405 CPU.]

Circuit          Spartan 3    VirtexII
Logic Cells      74 880       125 136
Slices           33080        55 616
RAM              2,5 MBit     11 MBit
Mult. (18x18)    104          556
Clock man.       4            12
µP               0            4 PPC

Altera (Apex/Stratix)

Circuit               LEs       RAM       Mult. (9x9)   PLL   µP
Apex II (EP2A70)      67 200    1 Mbit    -             4     -
Stratix (EP1S80)      79 040    7 Mbit    176           12    -
Excalibur (EPXA10)    38 400    3 Mbit    -             ?     ARM922T
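The slice description states that a 4-input function generator can serve either as a LUT or as 16 bits of memory. The Python sketch below shows why the two views coincide: a 4-input LUT is a 16-entry truth table addressed by its inputs. The function names and the bit ordering (input a as least significant address bit) are assumptions made for illustration, not the vendor's configuration format.

```python
def make_lut4(func):
    """Build the 16-bit configuration word of a 4-input LUT from a Boolean function."""
    config = 0
    for index in range(16):
        a = (index >> 0) & 1
        b = (index >> 1) & 1
        c = (index >> 2) & 1
        d = (index >> 3) & 1
        if func(a, b, c, d):
            config |= 1 << index
    return config


def lut4_eval(config, a, b, c, d):
    """Read the configuration bit addressed by the four inputs (a is the LSB)."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (config >> index) & 1


if __name__ == "__main__":
    and4 = make_lut4(lambda a, b, c, d: a & b & c & d)
    xor4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
    print(f"AND4 configuration: 0x{and4:04X}")
    print(f"XOR4 configuration: 0x{xor4:04X}")
    print("XOR4(1, 0, 0, 0) =", lut4_eval(xor4, 1, 0, 0, 0))
```

Any 4-input function costs the same 16 configuration bits, which is why the same cell can be reused as a small RAM or shift register.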

Introduction: Cost models

Cost of FPGAs
Example unit prices for large quantities:

Vendor    Part                           Price
Altera    EP20K200 (Apex 20k)            340 $
Altera    EP1S80                         800 $
Altera    EPXA1 (Excalibur ARM)          27 $
Xilinx    XC3S1000 (Spartan 3)           200 $
Xilinx    XC2V8000 (Virtex II)           8000 $
Xilinx    XC2VP100 (Virtex II Pro)       11000 $
Actel     APA1000 (ProAsic+)             400 $
Actel     AX2000 (Axcelerator)           630 $

Added to this:
  CAD tools
  External EEPROMs

Cost of ASICs
Three components:

Design cost
  Engineers, CAD tools: 500 000 $ per year.

NRE (Non-Recurring Engineering Charges)
  Incompressible manufacturing costs (masks, ...): from 50 000 $ up to 1.5 M$ for a 300 mm wafer in 90 nm technology

Unit cost
  Unit manufacturing cost: 0.2 $ per mm2
  A 300 mm wafer (90000 mm2) = 300 $

Gate arrays reduce the NRE.

Multi-project circuits
Several projects/circuits are made on the same wafer to share the NRE.
  Europractice: AMI Semiconductor 0.35 µm CMOS, 680 Euro/mm2
  CMP: STMicroelectronics 0.18 µm CMOS HCMOS8D, 990 Euro/mm2

Comparison
Data: 250K-gate system

        NRE ($)    Unit cost ($)
FPGA    -          3 200
ASIC    350 000    30

Cost of the circuit
Device only cost (ASIC includes NRE):

Units    FPGA Cost    ASIC Cost
5        $ 16,000     $ 350,150
10       $ 32,000     $ 350,300
50       $ 160,000    $ 351,500
100      $ 320,000    $ 353,000
150      $ 480,000    $ 354,500

[Figure: FPGA/ASIC Cost vs Units (250KGates). Total unit cost (US$) versus number of units (5 to 150), FPGA cost and ASIC cost.]

... and the CAD tools
Device + EDA tools estimate (ASIC includes NRE):
  FPGA EDA: $ 82,000 (Simulation + Synthesis + FPGA Place&Route)
  ASIC EDA: $ 343,000 (Simulation + Synthesis + Timing + ATPG)

Units    FPGA Cost    ASIC Cost
10       $ 114,000    $ 693,300
50       $ 242,000    $ 694,500
100      $ 402,000    $ 696,000
150      $ 562,000    $ 697,500
200      $ 722,000    $ 699,000
250      $ 882,000    $ 700,500

[Figure: FPGA/ASIC Cost vs Units (250KGates). Total unit cost (US$) versus number of units (10 to 250), FPGA cost and ASIC cost.]

Source: http://www.altera.com/products/devices/cost/cst-cost_step1.jsp
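The two cost tables above follow a simple linear model: total cost = EDA tools + NRE + units x unit price. The Python sketch below reproduces that model and computes the volume at which the ASIC becomes the cheaper option. The prices are those of the 250K-gate comparison above; the function names and the break-even computation itself are illustrative additions, not part of the original calculator.

```python
import math

# Figures from the 250K-gate comparison above (US$).
FPGA_UNIT, FPGA_NRE, FPGA_EDA = 3_200, 0, 82_000
ASIC_UNIT, ASIC_NRE, ASIC_EDA = 30, 350_000, 343_000


def total_cost(units, unit_price, nre, eda=0):
    """Linear cost model: fixed costs plus a per-device price."""
    return eda + nre + units * unit_price


def break_even_units(with_eda=False):
    """Smallest volume at which the ASIC total cost is no higher than the FPGA one."""
    fixed_gap = ASIC_NRE - FPGA_NRE
    if with_eda:
        fixed_gap += ASIC_EDA - FPGA_EDA
    return math.ceil(fixed_gap / (FPGA_UNIT - ASIC_UNIT))


if __name__ == "__main__":
    for units in (10, 50, 100, 150, 200, 250):
        fpga = total_cost(units, FPGA_UNIT, FPGA_NRE, FPGA_EDA)
        asic = total_cost(units, ASIC_UNIT, ASIC_NRE, ASIC_EDA)
        print(f"{units:4d} units: FPGA ${fpga:,}  ASIC ${asic:,}")
    print("Break-even, device only:", break_even_units())
    print("Break-even, with EDA tools:", break_even_units(with_eda=True))
```

With these figures the sketch gives a break-even of 111 units for device cost alone and 193 units once EDA tools are included, which matches the crossover visible between 100 and 150 units, and between 150 and 200 units, in the two tables.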

Design methodology

Outline
  Introduction
  Design methodology
    Common methods
    ASIC specifics
    FPGA specifics
    Prototyping: from FPGA to ASIC
    Example of a multi-platform project: LEON
  Radiation hardening
  SoC
  Summary

Design flow

[Figure: design flow. Specification; VHDL (RTL) with test vectors; simulation and validation; synthesis; simulation and validation; place and route; simulation and validation; then FPGA programming or ASIC fabrication. Each validation step has a "yes" branch and a "no" branch.]

Design methodology: Common methods

Direct synthesis
Descriptions of the functional blocks at a "high" level of abstraction are transformed into standard cells.

[Figure: a VHDL entity (inputs e1, e2, e3, output s) goes through synthesis to a NETLIST, then through place and route to a LAYOUT.]

No specific circuits such as RAM/CAM or PLL.


Design methodology: ASIC specifics

"Pre-characterized" components (IP)
Complex circuits are offered as macro-blocks.

[Figure: a VHDL entity (inputs e1, e2, e3, output s) is synthesized to a NETLIST that instantiates RAM and IP macro-blocks, then placed and routed to a LAYOUT.]

Foundries provide simulation models and masks (abstract view). Synthesis is done by black-box instantiation.

The back-end
Place and route breaks down into several steps:
  Placement
  Test insertion
  Clock tree insertion
  Clock routing
  Full routing
  Timing analysis
  Verification (DRC, LVS, post place-and-route simulation, ...)

Functional blocks can be decomposed and placed/routed separately.

The physics of the interconnect must be taken into account.

Example: PNX8500 (Philips)
Excerpt from IEEE Design & Test of Computers (Application-Specific SOC Multiprocessors), on the Viper (PNX8500) design:

[...] chiplet timing, clock matching, and I/O timing analysis. To achieve timing closure, we made engineering change orders to the netlist after routing. Following each manipulation step, formal verification ensured that the modified netlist was functionally equivalent to the one after test insertion. We aligned all clock domains having synchronous chiplet crossings. For example, if the memory interface clock in one chiplet was synchronously connected to the same clock in another chiplet, we phase-aligned these clocks and analyzed the signal paths to meet timing constraints. We achieved clock alignment by tweaking the clock insertion delays, using aligners in the clock module. Similarly, we made the clock trees as structurally identical as possible. As part of the physical design process, we met design completion and manufacturability goals by implementing techniques such as design rule checks, antenna fixes, track filling, and doubling of vias wherever possible. Figure 4 shows the layout plot for the Viper design's initial version. Table 3 summarizes the major design parameters.

[Figure 4. Layout of Viper (PNX8500). Blocks include CAB, MPEG, MBS + VIP1 + VIP2, ICP1 + ICP2 + MMI, 1394, T-PI, M-PI, MSP3, conditional access (MSP1 + MSP2), TM32 and PR3940.]

Table 3. Design statistics.
Parameter             Value
Process technology    TSMC 0.18 µm, six metal layers
Transistors           About 35 million
Instances             1.2 million instances, or 8 million gates
CPUs                  2 (TriMedia TM32 and MIPS PR3940)
Memories              243 instances, 750-Kbit memory
Peripherals           50
Clock domains         82
Clock speed           TM32: 200 MHz; PR3940: 150 MHz; SDRAM: 143 MHz
Power                 4.5 W
Supply voltage        1.8-V core and 3.3-V I/O
Package               BGA456

We have learned much from the Viper design experience and trust it will guide us in the future, particularly since the next-generation SOC designs are significantly more complex, calling for still higher levels of integration. Some of our current activities, in addition to regular chip-development tasks, are investigating more efficient on-chip bus architectures and better design-reuse methodologies.

Design methodology: FPGA specifics

Design entry
FPGA vendors provide proprietary tools for using their FPGAs:
  Schematic capture
  Specific description languages
    AHDL - Altera
    ABEL - Xilinx
Synthesis can be performed by third-party tools (Leonardo, Synplicity, Synopsys, etc.).

Place and route
Place and route is performed by proprietary tools. These tools make it possible to:
  allocate the functional blocks
  extract a timing analysis

The growing complexity of FPGAs makes hierarchical methodologies necessary.

Using the resources
How can the resources of FPGAs be used?

Direct instantiation
  Primitives (macro-cells, RAM, etc.)
  Libraries of macro-functions
  Depending on the synthesis tool, these instances cannot be synthesized in the usual way.

[Figure: a Main design instantiates a wrapper around a Macro; the wrapper goes through synthesis, then place and route.]

High-level description / inference
  Synthesizers detect complex blocks. Examples: RAM, multipliers, etc.

Design methodology: Prototyping, from FPGA to ASIC

Principle
FPGAs are used to validate the design of an ASIC. Generic emulation platforms of high complexity exist (Aptix, Quickturn, ...).
  Faster simulation
  No timing verification
  The architecture of the emulator may be ill-suited to the project

Example: Aptix
Solutions for Wireless Communications and Image Processing

System Explorer MP3CF hardware
  FPCB user freehole area with 1,920 routable pins accommodates a wide variety of prototyping components
  Modular low-skew clock circuits (8)
  FPIC Programmable Interconnect Components (3) provides software-controlled interconnect and diagnostic probing
  Microcontroller configures all programmable hardware, performs system self-test and stores data for stand-alone configuration
  User-controlled power supply voltage selection and monitoring to support advanced prototyping components today and tomorrow
  I/O cable connectors (20) with interleaved grounds provide flexible connection to target systems
  Modular hard-wired buses for high-fanout bi-directional nets
  Board-edge I/O

The System Explorer MP3CF is optimized for prototyping DSP-based pipelined designs with moderate requirements for interconnect between prototyping components. The MP3CF architecture provides maximum performance for prototypes incorporating fixed-pin prototyping components such as CPUs, DSPs, memory cards, etc. Use the MP3CF for building high-speed prototypes of wireless communication and digital-imaging applications.

[Figure: System Explorer MP3CF interconnect architecture. Three regions (#1 to #3) of user component holes and FPGAs, each served by one FPIC device (FPIC #1 to #3) through one-to-one connections between FPIC device and component pins; the FPICs are linked by global interconnect lines (labelled 140). All component pins in a given region connect through one FPIC device; component pins in different regions connect through two FPIC devices.]

"Nokia made a commitment to create real-time prototypes of all its new mobile phone designs. Prototypes are the only way to validate our algorithms by testing actual voice transmission quality. We adopted the Aptix solution because it provides a productive debug environment while maintaining our objective of real-time verification."
Stelios Podimatis, Member of Technical Staff, ASIC Engineering, Nokia (San Diego, CA)
Local ram FPU Debug Support Unit
6

1.4 Functional overview

A block diagram of LEON-2 can be seen in gure 1.

Architecture de LEON

3 5

Cibles technologiques RAM infre instancie instancie instancie instancie instancie instancie instancie PADS infrs infrs instancis instancis instancis instancis infrs infrs

Integer unit
CP I-Cache D-Cache Local ram PCI

4 6

Ethernet

MMU AMBA AHB

AHB Controller

Debug Serial Link

Timers Memory Controller AMBA APB

IrqCtrl AHB/APB Bridge

UARTS I/O port

8/16/32-bits memory bus

Technologie Modle comportemental Xilinx VIRTEX/2 FPGA Atmel ATC18/25/35 UMC FS90A/B UMC 0.18 um CMOS TSMC 0.25 um w. Artisan rams Actel Proasic FPGA Actel AX anti-fuse FPGA

PROM

I/O

SRAM

SDRAM

Rfrences 1: LEON-2 block diagram : http ://www.gaisler.com 1 Figure



1.4.1 Integer unit
The LEON integer unit implements the full SPARC V8 standard, including all multiply and divide instructions. The number of register windows is configurable within the limit of the SPARC standard (2 - 32), with a default setting of 8. To aid software debugging, up to four watchpoint registers can be configured. Each register can cause a trap on an arbitrary instruction or data address range. If the debug support unit is enabled, the watchpoints can be used to enter debug mode.

1.4.2 Floating-point unit and co-processor
The LEON model does not include an FPU, but provides a direct interface to the Meiko FPU


Project organization

  cache
    syncram
      virtex2_syncram   -> RAMB16_S36
      proasic_syncram   -> RAM256x9SST
      generic_syncram   -> VHDL code
      atc18_syncram     -> hdss1_128x32cm4sw0

Code example

cachemem.vhd
    entity cachemem is
    ...
    dtags0 : syncram port map (...
    ...

tech_map.vhd
    entity syncram is
    ...
    inf : if INFER_RAM generate
      u0 : generic_syncram generic map ( ...
    hb : if (not INFER_RAM) generate
      atc1 : if TARGET_TECH = atc18 generate
        u0 : atc18_dpram generic map (...
    ...

tech_atc18.vhd
    -- pragma translate_off
    entity hdss2_512x32cm4sw0 is ...
    architecture behavioral of hdss2_512x32cm4sw0 is ...
    -- pragma translate_on
    entity atc18_syncram is
    ...
    id0 : hdss1_128x32cm4sw port map (...
    ...

The instantiated memories are at the same time:
- Black boxes for synthesis: the entities are treated as cells of the library.
- Behavioural descriptions for simulation: these can be supplied by the RAM vendor.
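The selection performed by the generate statements of tech_map.vhd can be sketched behaviourally. This is only an illustrative Python model, not LEON code: the function name select_syncram and the dictionary TECH_WRAPPERS are hypothetical, and the wrapper and cell names are simply copied from the project-organization diagram.

```python
from typing import Tuple

# Wrapper entity and leaf cell per target technology, as shown on the
# project-organization slide. The dictionary itself is an illustration.
TECH_WRAPPERS = {
    "virtex2": ("virtex2_syncram", "RAMB16_S36"),
    "proasic": ("proasic_syncram", "RAM256x9SST"),
    "atc18": ("atc18_syncram", "hdss1_128x32cm4sw0"),
}


def select_syncram(target_tech: str, infer_ram: bool) -> Tuple[str, str]:
    """Mimic the 'if ... generate' choice: inferred RAM or instantiated cell."""
    if infer_ram:
        # The synthesis tool infers the memory from plain VHDL code.
        return ("generic_syncram", "VHDL code")
    if target_tech not in TECH_WRAPPERS:
        raise ValueError(f"no instantiated RAM known for {target_tech!r}")
    return TECH_WRAPPERS[target_tech]


if __name__ == "__main__":
    print(select_syncram("atc18", infer_ram=False))
    print(select_syncram("atc18", infer_ram=True))
```

Keeping the choice in one place is what lets the cache code instantiate syncram without knowing the target technology.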

Single Event Upset (SEU)
A particle can flip the state of storage elements (latches, registers, SRAM, ...).
[Figure: memory cell with its Select lines]

Single Event Transient (SET)
Combinational circuitry can be disturbed:
- An error present at the sampling instant can be stored.
- The clock tree generates spurious edges.
[Figure: D flip-flop waveforms (Clk, D, Q) showing a SET on the data and a SET on the clock]

Latchup
[Figure: CMOS cross-section showing vdd, gnd, the P+ and N+ diffusions, the N well and the P substrate]


Main methods
Use of technologies:
- Custom
    Charge dissipation (sizing, capacitances)
    Temporal filtering (delay + vote)
    Transistor isolation
    Intra-redundant cells
- Standard
    TMR
    Error-correcting codes
    Self-test


Registers
TMR: Triple Modular Redundancy
[Figure: three registers clocked by CLK feeding a voter]
The registers must be placed far apart so that they do not suffer the same fault. They must be updated with the corrected value.

Memories
SRAM
- Standard: error-correcting codes protect the stored data. Additional bits are required.
- Specific: the bits of a word are spatially separated. The area increases.
(S)DRAM
- SEUs accelerate the discharge of the memory cells. The refresh rate can be increased.
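To make the cost of "additional bits" concrete, here is a minimal sketch of a single-error-correcting Hamming(7,4) code in Python. The slides do not say which code is used in practice, so the choice of Hamming(7,4) and the function names are assumptions made for illustration: 4 data bits are stored with 3 parity bits, and any single flipped bit in the 7-bit word is corrected on read.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit word (positions 1..7, parity at 1, 2, 4)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0 is unused so that indices match bit positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return sum(code[pos] << (pos - 1) for pos in range(1, 8))


def hamming74_decode(word: int) -> int:
    """Correct at most one flipped bit, then return the 4 data bits."""
    code = [0] + [(word >> (pos - 1)) & 1 for pos in range(1, 8)]
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions holding a 1
    if syndrome:
        code[syndrome] ^= 1  # the syndrome is the position of the upset bit
    data = (code[3], code[5], code[6], code[7])
    return sum(bit << i for i, bit in enumerate(data))


if __name__ == "__main__":
    stored = hamming74_encode(0b1011)
    upset = stored ^ (1 << 4)  # model an SEU on one stored bit
    print(bin(hamming74_decode(upset)))
```

The same idea scales to wider words, where the relative overhead of the parity bits shrinks.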

Hardening methodologies
Automatic methods
- Specific technologies: hardened cells are used instead of standard cells. Atmel offers the hardened 0.18 µm technology ATC18RHA.
- TMR: conventional synthesis is followed by a netlist modification. This can be done by scripts of the synthesis tools or by modifying the result files.
- Use of hardened gate arrays.
By design
- Introduce self-test techniques into the circuits.
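The behaviour that a TMR netlist transformation gives to each register can be sketched as follows. This is a behavioural Python model written for illustration, not the output of any synthesis tool: the class TmrRegister and its method names are hypothetical. It shows the two properties stated for TMR registers: the output is a majority vote, and the three copies are rewritten with the voted (corrected) value.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three vote."""
    return (a & b) | (a & c) | (b & c)


class TmrRegister:
    """Three copies of one register with a voter on the output."""

    def __init__(self, value: int = 0) -> None:
        self.copies = [value, value, value]

    def read(self) -> int:
        return majority(*self.copies)

    def clock(self, new_value=None) -> None:
        """Load new data, or, when holding, refresh all copies with the voted value."""
        value = self.read() if new_value is None else new_value
        self.copies = [value, value, value]

    def upset(self, copy_index: int, bit: int) -> None:
        """Model an SEU: flip one bit of one copy."""
        self.copies[copy_index] ^= 1 << bit


if __name__ == "__main__":
    reg = TmrRegister(0b1010)
    reg.upset(copy_index=1, bit=0)
    print(bin(reg.read()))  # the vote masks the upset
    reg.clock()             # the next clock edge repairs the corrupted copy
    print([bin(c) for c in reg.copies])
```

Without the refresh on each clock edge, a second upset in another copy would eventually defeat the vote.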



Origin of malfunctions
The FPGA elements likely to cause malfunctions are:
- Cell registers
- Embedded RAM
- The configuration, which is sensitive to SEUs:
    The SRAM can be corrupted (XC2VP125: 43 Mbits of configuration)
    Anti-fuses can blow
    EEPROMs can change state
- The generic logic, which generates SETs:
    Interconnect logic
    Clock tree


The external configuration elements (for SRAM-based FPGAs) must also be protected.

Remedies
FPGAs are more delicate to harden:
- Registers and RAM
    The same methods as for ASICs apply.
- Configuration
    Adopt technologies that are less sensitive to SEUs: anti-fuses are less sensitive than SRAM/EEPROM.
    Verify the configuration: use the partial configuration of FPGAs to check the cells automatically.
- Insert self-checking of the computations
    Insert known sequences into the computations to verify the results (ROM of sequences and references, LFSR).
    A detected fault triggers the reconfiguration of the FPGA.
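The self-checking scheme above (known stimuli from an LFSR, expected results kept as references, reconfiguration on mismatch) can be sketched in Python. This is a behavioural illustration under stated assumptions: the 16-bit LFSR with taps 16, 14, 13, 11 is a commonly used maximal-length configuration chosen here for the example, and make_reference, self_check and the unit under test are hypothetical names.

```python
from itertools import islice


def lfsr16(seed: int):
    """16-bit Fibonacci LFSR (taps 16, 14, 13, 11); yields successive states."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state


def make_reference(unit, seed: int = 0xACE1, length: int = 64):
    """Record the expected responses of a known-good unit (the 'ROM of references')."""
    return [unit(x) for x in islice(lfsr16(seed), length)]


def self_check(unit, reference, seed: int = 0xACE1) -> bool:
    """Replay the stimuli; False means a fault was detected and the FPGA should be reconfigured."""
    stimuli = islice(lfsr16(seed), len(reference))
    return all(unit(x) == expected for x, expected in zip(stimuli, reference))


if __name__ == "__main__":
    healthy = lambda x: (3 * x + 1) & 0xFFFF   # stand-in for the real datapath
    faulty = lambda x: healthy(x) ^ 0x0004     # same datapath with one output bit inverted
    reference = make_reference(healthy)
    print(self_check(healthy, reference), self_check(faulty, reference))
```

In hardware the LFSR replaces a stimulus ROM, so only the references need storage.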

Hardening methodology
TMR can be implemented transparently. For Actel FPGAs, Synplify can directly implement:
- combinational flip-flops
- TMR
- combinational flip-flops with TMR
In VHDL, this is done with attributes:

    architecture top of top is
      attribute syn_radhardlevel of top : architecture is "tmr_cc";
      ...
      attribute syn_radhardlevel of counter_q : signal is "tmr";
      ...

Specific components
Actel offers radiation-resistant circuits:
- Programming by resistant anti-fuses
- Without registers: the registers are built from combinational elements.
- With hardened registers: the latches are separated so that they are not hit by the same radiation.

Excerpt from "RT54SX-S RadTolerant FPGAs for Space Applications":

To achieve the SEU requirements, the D flip-flop in the RT54SX-S R-cell is enhanced (Figure 3). Both the master and slave latches are actually implemented with three latches. The feedback path of each of the three latches is voted with the outputs of the other two latches. If one of the three latches is struck by an ion and starts to change state, the voting with the other two latches prevents the change from feeding back and permanently latching. Care was taken in the layout to ensure that a single ion strike could not affect more than one latch.

Figure 4 is a simplified schematic of the test circuitry that has been added to test the functionality of all the components of the flip-flop. The inputs to each of the three latches are independently controllable so the voting circuitry in the feedback paths can be exhaustively tested. This testing is performed on an unprogrammed array during wafer sort, final test and post burn-in test. This test circuitry cannot be used to test the flip-flops once the device has been programmed.

[Figure 3: RT54SX-S R-Cell Implementation of D Flip-Flop Using Voter Gate Logic (inputs D and CLK, voter gates, output Q)]
[Figure 4: R-Cell Implementation Test Circuitry (Tst1, Tst2, Tst3, voter gate, CLK)]

Effectiveness of the hardening
Some Actel circuits:

LRH1280 0.8 µm (A1280)

  Cell               GEO SEU
  Flip-flop          10^-6
  Flip-flop (CC)     10^-7
  TMR                10^-10

RTAX 0.15 µm (AX 0.15 µm, S-cell = TMR)

  Family   SRAM LETth   SRAM GEO SEU     Register LETth      Register GEO SEU
  AX       1.4          3.10^-7          3.36 > .. > 2.89    10^-6
  RTAX     1.4          10^-10 (EDAC)    > 37                < 10^-10

  No SEL for LET = 120 MeV-cm2/mg

LETth is given in MeV-cm2/mg. GEO SEU = errors/bit/day in geostationary orbit.
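Since GEO SEU is a rate per bit per day, the expected upset rate of a whole design follows by multiplying by the number of storage bits. The Python sketch below works that arithmetic out. The rates 1e-6 and 1e-10 errors/bit/day are taken as illustrative values for an unprotected flip-flop and a TMR register, and the design size of 10,000 register bits is a purely hypothetical figure.

```python
def upsets_per_day(rate_per_bit_per_day: float, bits: int) -> float:
    """Expected number of upsets per day for `bits` independent storage bits."""
    return rate_per_bit_per_day * bits


def mean_days_between_upsets(rate_per_bit_per_day: float, bits: int) -> float:
    """Average time between two upsets somewhere in the design."""
    return 1.0 / upsets_per_day(rate_per_bit_per_day, bits)


if __name__ == "__main__":
    REGISTER_BITS = 10_000  # hypothetical design size
    for label, rate in (("flip-flop", 1e-6), ("TMR", 1e-10)):
        days = mean_days_between_upsets(rate, REGISTER_BITS)
        print(f"{label}: one upset every {days:.0f} days on average")
```

Under these assumptions the unprotected design sees an upset roughly every hundred days, while the TMR version sees one roughly every million days.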



Constituents of SoCs
SoC = System on Chip. Current technologies make it possible to put on a single circuit:
- ASIC
- Processors
- Memory (SRAM and DRAM)
- System buses
- Analog
Programmable circuits allow the same kind of design: SoPCs (System on Programmable Chip).

An SoPC: Excalibur (Altera)


Microprocessors (ASIC)
They are available according to the needs.
- Pre-characterized: optimized by the foundries, under licence.
- Synthesizable: high-level models are available for synthesis. Some parts must be adapted to the technology.
- Parameterizable: the processors adapt to the needs of the application:
    Size and type of the caches
    System mechanisms (TLB, virtual addressing, ...)
    Co-processors
Performance: 32-bit MIPS = 300 MHz


Microprocessors (FPGA)
Two types of processors are found:
- Synthesizable: generic models (e.g. LEON) or processors supplied by FPGA vendors (e.g. NIOS (Altera), MicroBlaze (Xilinx)). Resources used: dual-port RAM, CAM. Performance: about 50 MHz. The limited resources impose simple processors.
- Embedded in the FPGA: e.g. Excalibur ARM (Altera), Virtex II Pro (Xilinx). Performance: about 300 MHz. Their characteristics are fixed.

Buses (ASIC)
The technologies are adapted to the needs.
[Figure: a tri-state bus (masters and slaves sharing the bus lines) and a multiplexer bus (masters and slaves connected through muxes)]
Tri-state buses and multiplexer buses can coexist in the same circuit.


Buses (FPGA)
The technology is imposed by the resources.
- Tri-state buses are discouraged (and often even impossible).
- To save logic, arbitration can be done at each slave: the interconnect wires are numerous.
- Embedded CPUs impose system buses.

Avalon bus (excerpt from the Avalon Bus Specification, Altera Corporation):

The Avalon bus module is generated automatically by the SOPC Builder, so that the system designer is spared the task of connecting the bus and peripherals together. The Avalon bus module is very rarely used as a discrete unit, because the SOPC Builder will almost always be used to automate the integration of processors and other Avalon bus peripherals into a system module. The designer's view of the Avalon bus module usually is limited to the specific ports that relate to the connection of custom Avalon peripherals.

Note that the Avalon bus module (an Avalon bus) is a unit of active logic that takes the place of passive, metal bus lines on a physical PCB. (See Example 2). In this context, the ports of the Avalon bus module could be thought of as the pin connections for all peripheral devices connected to a passive bus. The Avalon Bus Specification Reference Manual defines only the ports, logical behavior and signal sequencing that comprise the interface to the Avalon bus module. It does not specify any electrical or physical characteristics of a physical bus.

[Figure 2. Avalon Bus Module Block Diagram - an example system]

Memory (ASIC)
Memories are available as pre-characterized blocks. ROM and RAM are generated according to the needs. Current technologies allow several types of memory to coexist (SRAM, SDRAM, associative, ...). ROMs are custom-made.
UMC offers SRAM libraries and generators: http://www.umc.com/english/design/b_1.asp
Performance, 0.13 µm: SRAM 1K x 16, access time = 1.1 ns



Memory (FPGA)
FPGAs provide elementary memory blocks (about 4 KBytes). They can be assembled to form large memories. ROMs are synthesized as combinational circuits. No SDRAM.
Xilinx XC2VP125 (Virtex II Pro, 0.13 µm): 556 SRAM blocks of 18 Kbits = 10,008 Kbits
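Assembling elementary blocks into a larger memory is a tiling problem: each 18 Kbit block can be used in one of several depth x width shapes, and the tool picks the shape that needs the fewest blocks. The Python sketch below illustrates this with the six Virtex-II Pro shapes (16K x 1 up to 512 x 36). The function blocks_needed is a hypothetical helper, and real tools also weigh timing and routing, which this estimate ignores.

```python
import math

# (depth, width) shapes of one 18 Kbit block SelectRAM
BLOCK_SHAPES = [(16384, 1), (8192, 2), (4096, 4), (2048, 9), (1024, 18), (512, 36)]

BLOCKS_IN_XC2VP125 = 556
TOTAL_KBITS = BLOCKS_IN_XC2VP125 * 18  # the capacity quoted on the slide


def blocks_needed(depth: int, width: int) -> int:
    """Fewest blocks that tile a depth x width memory, over all block shapes."""
    return min(
        math.ceil(depth / d) * math.ceil(width / w) for d, w in BLOCK_SHAPES
    )


if __name__ == "__main__":
    print(TOTAL_KBITS)              # Kbits available in the device
    print(blocks_needed(4096, 32))  # blocks for a 4K x 32 memory
```

A 4K x 32 memory (16 KBytes), for instance, maps onto eight blocks used as 4K x 4.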
Virtex II Pro block SelectRAM configurations:

  16K x 1 bit      8K x 2 bits
  4K x 4 bits      2K x 9 bits
  1K x 18 bits     512 x 36 bits

Timings:

              SelectRAM   CLB
  Setup       0.4         0.5
  Prop        1.5         1.8
  Clk min     1.3         1.4

Multiple clocks (ASIC)
ASICs allow complex clock-domain architectures. Suitable asynchronous FIFOs handle the domain crossings: metastability is resolved. Each clock domain has its own clock tree.
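One common reason asynchronous FIFOs can cross clock domains safely is that their read and write pointers are exchanged in Gray code, so only one bit changes per increment and a pointer sampled mid-transition is off by at most one position. The slides do not detail the FIFO design, so this Python sketch of the Gray conversion is an illustration of that usual technique rather than of any specific implementation.

```python
def bin_to_gray(n: int) -> int:
    """Binary to reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Inverse conversion: XOR of all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


if __name__ == "__main__":
    DEPTH = 16  # hypothetical FIFO depth, a power of two
    for pointer in range(DEPTH):
        print(f"{pointer:2d} -> {bin_to_gray(pointer):04b}")
```

With a power-of-two depth, the wrap-around from the last pointer value back to zero also changes a single bit.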


82 clocks in the PNX8500 (Philips).

[Figure 4. Layout of Viper (PNX8500).]

Table 3. Design statistics.

  Parameter            Value
  Process technology   TSMC 0.18 µm, six metal layers
  Transistors          About 35 million
  Instances            1.2 million instances, or 8 million gates
  Memories             243 instances, 750-Kbit memory
  CPUs                 2 (TriMedia TM32 and MIPS PR3940)
  Peripherals          50
  Clock domains        82
  Clock speed          TM32: 200 MHz; PR3940: 150 MHz; SDRAM: 143 MHz
  Power                4.5 W
  Supply voltage       1.8-V core and 3.3-V I/O
  Package              BGA456

Multiple clocks (FPGA)
- The clock trees are already built. The number of clocks is limited.
- Domain crossings are delicate: the asynchronous FIFOs are built from FPGA cells, so their performance is limited.
- Xilinx provides Digital Clock Managers.
[Figure: Apex 20k macro block]

Virtex II Pro clocks (excerpt from Xilinx DS083-2 (v2.7), June 2, 2003, Advance Product Specification, www.xilinx.com):

Each global clock multiplexer buffer can be driven either by the clock pad to distribute a clock directly to the device, or by the Digital Clock Manager (DCM), discussed in Digital Clock Manager (DCM), page 40. Each global clock multiplexer buffer can also be driven by local interconnects. The DCM has clock output(s) that can be connected to global clock multiplexer buffer inputs, as shown in Figure 47.

[Figure 47: Virtex-II Pro Clock Multiplexer Buffer Configuration (clock pad, DCM with CLKIN and CLKOUT, local interconnect, clock multiplexer, clock buffer, clock distribution)]

Global clock buffers are used to distribute the clock to some or all synchronous logic elements (such as registers in CLBs and IOBs, and SelectRAM+ blocks). Eight global clocks can be used in each quadrant of the Virtex-II Pro device. Designers should consider the clock distribution detail of the device prior to pin-locking and floorplanning. (See the Virtex-II Pro Platform FPGA User Guide.)

Figure 48 shows clock distribution in Virtex-II Pro devices. In each quadrant, up to eight clocks are organized in clock rows. A clock row supports up to 16 CLB rows (eight up and eight down). To reduce power consumption, any unused clock branches remain static.

[Figure 48: Virtex-II Pro Clock Distribution (16 clocks; 8 BUFGMUX for each of the NW, NE, SW and SE quadrants, 8 clocks max per quadrant)]

Analog (ASIC)
Most digital technologies are compatible with analog. The analog blocks are designed separately and integrated at assembly. The digital and analog areas are separated to reduce clock noise.


Analog (FPGA)
No integrated analog. Programmable analog circuits exist, but their performance is poor.


Performance comparisons
Performance and complexity of the LEON microprocessor implemented on different target technologies (source: http://www.gaisler.com/):

  Technology                          Complexity                       Frequency
  ASIC
  Atmel 0.18 CMOS std-cell            35K gates + RAM                  165 MHz (pre-layout)
  Atmel 0.25 CMOS std-cell            33K gates + RAM                  140 MHz (pre-layout)
  UMC 0.25 CMOS std-cell              35K gates + RAM                  130 MHz (pre-layout)
  Atmel 0.35 CMOS std-cell            2 mm2 + RAM                      65 MHz (pre-layout)
  FPGA
  Xilinx XC2V500-6 (0.15 µm)          4,800 LUT + 14/32 block RAM      65 MHz (post-layout)
  Altera 20K200C-7 (0.15 µm)          5,700 LCELLs + EAB RAM (52%)     49 MHz (post-layout)
  Actel AX1000-3 (0.15 µm)            7,600 cells + 14/36 RAM          48 MHz (post-layout)

Summary: ASIC
- Complete control of the project
- Control of the radiation resistance
- Reduced costs at large scale
- High integration density
- Maximum performance
- Errors are expensive
- In-depth knowledge of the technology
- NRE

Summary: FPGA
- Reduced development time
- Radiation-resistant families
- Reduced investment
- Architecture constraints
- Poor knowledge of the internal details and characteristics
- Relaxed attention
- Increased risk of failures
- High unit costs
- Limited complexity
- Limited performance

Conclusion
Choosing between an FPGA and an ASIC?

Criteria: area/cost, functionality, efficiency, technology, flexibility, computing power, reusability, development time, throughput, memory architecture, power consumption.

... it depends ...

Detailed outline

Introduction
    Problem statement
    ASICs: The families; Technology evolution
    FPGA: Principle; Programming technologies; Actel (ProAsic); Actel (Axcelerator); Xilinx (Spartan 3/Virtex II); Altera (Apex/Stratix)
    Cost models: FPGA costs; ASIC costs; Comparison; Multi-project circuits
Design methodology
    Common methods: Design flow
    Specifics of ASICs: Direct synthesis; "Pre-characterized" components (IP); The back-end
    Specifics of FPGAs: Input models; Place and route; Resource usage
    Prototyping, FPGA to ASIC: Principle; Example: Aptix
    Example of a multi-platform project, LEON: LEON architecture; Target technologies; Project organization; Code example
Radiation hardening
    Single Event Upset (SEU); Single Event Transient (SET); Latchup
    Hardening of ASICs: Main methods; Registers; Memories; Hardening methodologies
    Hardening of FPGAs: Origin of malfunctions; Remedies; Hardening methodology; Specific components; Effectiveness of the hardening
SoC
    Reminders on SoCs: Constituents of SoCs; An SoPC: Excalibur (Altera)
    Comparative study: Microprocessors; Buses; Memory; Multiple clocks; Analog
Summary
    Performance comparisons; Summary; Conclusion
References
