
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 18, NO. 10, OCTOBER 1999, p. 1487

Fault Emulation: A New Methodology for Fault Grading
Kwang-Ting Cheng, Senior Member, IEEE, Shi-Yu Huang, and Wei-Jin Dai

Abstract: In this paper, we introduce a method that uses a field programmable gate array
(FPGA)-based emulation system for fault grading. The real-time simulation capability of a
hardware emulator could significantly improve the performance of fault grading, which is one
of the most time-consuming tasks in the circuit design and test process. We employ a serial
fault emulation algorithm enhanced by two speed-up techniques. First, a set of independent
faults can be injected and emulated at the same time. Second, multiple dependent faults can
be simultaneously injected within a single FPGA-configuration by adding extra circuitry.
Because the reconfiguration time of mapping the numerous faulty circuits into the FPGAs is
pure overhead and could be the bottleneck of the entire process, using extra circuitry for
injecting a large number of faults can reduce the number of FPGA-reconfigurations and, thus,
improve the performance significantly. In addition, we address the issue of handling
potentially detected faults in this hardware emulation environment by using dual-railed
logic. The performance estimation shows that this approach could be several orders of
magnitude faster than the existing software approaches for large sequential designs.
Index Terms: Emulation, fault grading, fault simulation, testing.

I. INTRODUCTION

In today's quality-conscious very large scale integration (VLSI) world, measuring a design's
quality by the fault coverage is considered essential. Usually, the fault coverage figure is
derived by fault simulation, which is a very time-consuming process. Very few existing
software fault simulators can handle a design with more than 200 K gates without resorting
to some design for testability (DFT) technique. Furthermore, even for those tools that can
handle such a design, the process is still very time consuming [13] and could
lengthen the time to market.
Manuscript received July 6, 1998; revised January 13, 1999. This paper
was recommended by Associate Editor R. Aitken.
K.-T. Cheng is with the Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106 USA
(e-mail: timcheng@ece.ucsb.edu).
S.-Y. Huang is with the Department of Electrical Engineering, National
Tsing Hua University, HsinChu, Taiwan, R.O.C.
W.-J. Dai is with Quick-Turn Design System Inc., Mountain View, CA
94043 USA.
Publisher Item Identifier S 0278-0070(99)07733-7.

In recent years, a lot of effort has been put into the
development of all kinds of parallel algorithms that can run
fault simulation on machines ranging from coarse-grained
distributed systems to massively parallel connection machines
[4], [7], [8], [15], [17]. Their policies include distributing
gates or distributing faults on multiple processing elements
(PEs), partitioning the fault simulation kernel, or doing the
fault simulation in a pipelined manner. Also, some algorithms [6], [10]-[12] using a zero-delay model along with
some efficient techniques have successfully handled very large
sequential circuits. On the other hand, a special purpose
hardware accelerator for fault simulation [4] has also been
devised and implemented in a board that plugs into a SUN
workstation. In [1], a concurrent fault simulation algorithm is
partitioned into pipeline stages. Each part is performed by a
processing element controlled by a dedicated microprogram.
This approach, requiring a large number of memory chips,
achieves an order of magnitude run-time speed up over a
conventional software fault simulator. However, the above
approaches, either software or hardware, are still inefficient
for handling large sequential designs.
Logic emulation systems [4], [19] are now commercially available for fast prototyping,
real-time operation, and logic verification. A logic emulator consists of both hardware and
software. It can automatically implement the function of a gate-level design on a board
composed of dozens of field programmable gate array (FPGA) chips. Even larger emulators
can be built by integrating several FPGA boards. The software-aided implementation process
can be divided into two phases, circuit compilation and bitstream downloading, as shown in
Fig. 1. The circuit compilation process maps the given gate-level netlist into the
FPGA-based format. It involves circuit partitioning, placement, and routing for FPGAs. The
output of the compilation process is a bitstream representing the configuration for the
target implementation. The bitstream is then downloaded into the FPGA boards by programming
each lookup table (LUT) that defines the function of each configurable logic block (CLB),
and the routing switches that define the interconnection between CLBs. After the bitstream
downloading is completed, the system is ready for emulation. Usually a hardware emulation
engine is used to assist the emulation process. Given a set of test vectors and their
correct responses, the emulation engine handles the process of applying the test vectors,
collecting the output responses, and checking if any mismatch occurs between the output
responses and the prestored correct responses. One test vector is emulated for each clock
cycle. A state-of-the-art logic emulator can implement a logic design with up to three
million gates and operate at a speed of 100 kHz to several MHz [23]. Accordingly, it can
emulate 100 K to several million test vectors per second, which is about 10 000 to
1 million times faster than the traditional software simulators.
0278-0070/99$10.00 © 1999 IEEE

Fig. 1. The synthesis process in the FPGA-based emulator.

A number of methods have been proposed to use a logic
emulation system for fault grading. Wieler et al. [21] proposed
a serial fault emulation algorithm that emulates one faulty
circuit at a time sequentially. A similar idea was also adopted
by Burgun et al. [2]. In these methods, the implementation
of each faulty circuit is constructed from the fault-free circuit
before the emulation process through a technique called static
fault injection which requires partial reconfiguration of the
emulator. The details of this technique will be reviewed in
Section II-A. The major drawback of these algorithms lies in
the large amount of time spent in reconfiguration.
In this paper, we propose a new approach to perform fault grading using an FPGA-based logic
emulation system. Beyond the serial fault emulation, we introduce two techniques to further
enhance the performance. First, we exploit the independence between faults to allow
concurrent emulation. Second, we propose an efficient technique called dynamic fault
injection to reduce the reconfiguration time. Through the insertion of extra hardware into
the circuit under fault grading, this technique allows for the emulation of multiple
structurally dependent faults within a single configuration. A similar idea was also
independently developed by Wieler et al. [20]. In comparison, the proposed technique has a
much lower area overhead (22%) than the one proposed in [20] (18 times). Like a real chip, a
hardware emulator cannot simulate the unknown logic value X properly. We will also discuss
the impact of the lack of the X value in the fault emulation environment and deal with this
problem with a dual-railed logic. Relatively speaking, the proposed method could be faster
than the one based on a logic simulation machine as proposed in [7] and [8], especially for
large circuits. Although the latter is capable of performing extremely fast logic evaluation
(e.g., 800 million primitive gates per second [8]), its performance is still highly
sensitive to the size of the circuit. On the other hand, an FPGA-based emulation system is
less sensitive to the circuit size and can simulate one input vector within one clock cycle
for circuits it can implement.
The rest of this paper is organized as follows. In Section II, we describe our enhanced
serial fault emulation algorithm. In Section III, we propose a dual-railed logic to handle
the unknown logic value in the hardware emulation system. In Section IV, we present the
experimental results that analyze the area overhead due to our speed-up techniques. In
Section V, we give a performance estimation.

Fig. 2. Serial fault emulation process.

II. FAULT EMULATION
A. Primitive Scenario
A serial fault emulation process is shown in Fig. 2. In the preprocessing stage, the
collapsed stuck-at fault list is generated and the fault-free netlist is compiled and
programmed into the FPGAs. Since the circuit partitioning algorithm used in the technology
mapper for FPGAs will probably duplicate part of the original netlist to minimize the amount
of interconnection across FPGA-chips, a single stuck-at fault in the original netlist may
correspond to a multiple stuck-at fault in the FPGA-implementation. Therefore, the fault
list generated for emulation is, in general, a set of multiple stuck-at faults.

Fig. 3. Static fault injection by changing the netlist of a CLB. The netlist
(a) before fault injection and (b) after fault injection.

After the preprocessing, the algorithm enters a loop. At each
iteration, one fault is selected for emulation. The target fault
is first injected to the fault-free implementation to convert it
into a faulty-circuit. This requires recompilation (preparing
the new bitstream) and reconfiguration (reprogramming the
FPGAs through downloading the bitstream generated in the
recompilation phase). Once the fault injection is complete, the
test sequence is applied. The output response is compared with
the prestored correct response. If any mismatch is observed,
the target fault is declared as detected by the given test
sequence. This iterative process continues until all faults are
injected and emulated once.
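
The loop described above can be sketched in software. The following is only an illustrative model, not the emulator's implementation: the three-gate netlist, the signal names, and the helper names `evaluate` and `serial_fault_grading` are all hypothetical, and a Python function call stands in for what would be a reconfiguration followed by a hardware emulation run.

```python
from typing import Dict, List, Optional, Tuple

Fault = Tuple[str, int]  # (signal name, stuck-at value)

# Hypothetical toy netlist in topological order: (output, operation, inputs).
NETLIST = [
    ("n1", "AND", ("a", "b")),
    ("n2", "OR", ("n1", "c")),
    ("out", "NOT", ("n2",)),
]
OPS = {
    "AND": lambda v: int(all(v)),
    "OR": lambda v: int(any(v)),
    "NOT": lambda v: 1 - v[0],
}

def evaluate(vector: Dict[str, int], fault: Optional[Fault] = None) -> int:
    """Evaluate the netlist for one input vector, optionally with one stuck-at fault."""
    values = dict(vector)
    if fault is not None and fault[0] in values:
        values[fault[0]] = fault[1]           # fault on a primary input
    for out, op, ins in NETLIST:
        values[out] = OPS[op]([values[i] for i in ins])
        if fault is not None and fault[0] == out:
            values[out] = fault[1]            # fault on an internal signal
    return values["out"]

def serial_fault_grading(faults: List[Fault], vectors: List[Dict[str, int]]) -> List[Fault]:
    """Inject one fault per iteration and compare against the prestored responses."""
    golden = [evaluate(v) for v in vectors]   # fault-free responses
    detected = []
    for fault in faults:                      # stands in for one reconfiguration
        responses = [evaluate(v, fault) for v in vectors]
        if responses != golden:               # any mismatch means detected
            detected.append(fault)
    return detected

VECTORS = [{"a": 1, "b": 1, "c": 0}, {"a": 0, "b": 0, "c": 0}]
FAULTS = [("n1", 0), ("c", 0), ("c", 1)]
DETECTED = serial_fault_grading(FAULTS, VECTORS)
```

In this sketch the fault `c` s.a.0 stays undetected because neither vector drives `c` to 1, which mirrors how the fault coverage depends on the given test sequence.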
One issue that arises in the hardware fault emulation system but not in the software
approaches is how to inject a fault into the implementation. In the software approaches, it
involves only netlist manipulation and takes constant time. But in the emulation system, it
involves the modification of the target FPGA implementation, which is referred to as
reconfiguration or reprogramming. Any stuck-at fault can be injected by simply changing the
contents of the affected CLBs [21]. For example, consider the CLB shown in Fig. 3(a). To
inject the s.a.0 fault at the target signal, we can simply change its contents to the one
shown in Fig. 3(b). No global recompilation is required. A stuck-at fault of multiplicity
$m$ requires reprogramming of at most $m$ CLBs. With the information extracted at
compilation time, we can directly manipulate the corresponding bitstream of those affected
CLBs and then download the modified portion into the FPGA boards to inject a fault
efficiently.
Current FPGA architecture requires the reprogramming of an entire chip even if only one CLB
needs to be changed. But board-level partial reprogramming is possible. The emulation system
of [23] contains chip-addressing circuitry on the board to direct the bitstream to any
individual FPGA-chip. As a result, it allows partial reprogramming in terms of one chip at a
time. According to the Xilinx FPGA Databook [22], it takes several milliseconds to reprogram
a chip. This is about the cost to inject a fault in our system. In the future, if partial
reprogramming can be done at the CLB level, injecting a static fault can be even cheaper.
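
As a rough illustration of why a static fault only touches the contents of the affected CLB, the sketch below models a CLB as a truth table and rewrites it for a stuck-at fault on one of its input pins. The three-input function is a made-up example (it is not the CLB of Fig. 3), and `build_lut` and `inject_static_fault` are hypothetical helper names; faults on signals internal to a CLB would need the CLB's internal netlist to be re-evaluated instead.

```python
from itertools import product

def build_lut(func, n_inputs):
    """Truth table of a CLB function: tuple of input bits -> output bit."""
    return {bits: func(*bits) for bits in product((0, 1), repeat=n_inputs)}

def inject_static_fault(lut, pin, stuck_value):
    """Return new LUT contents that behave as if input `pin` were stuck at `stuck_value`."""
    faulty = {}
    for bits in lut:
        forced = bits[:pin] + (stuck_value,) + bits[pin + 1:]
        faulty[bits] = lut[forced]   # output no longer depends on the faulty pin
    return faulty

# Hypothetical CLB function out = (a AND b) OR c.
good = build_lut(lambda a, b, c: (a & b) | c, 3)
bad = inject_static_fault(good, pin=0, stuck_value=0)   # input a s.a.0
```

With input `a` stuck at 0 the table collapses to `out = c`; the routing between CLBs is untouched, which is the property the text relies on.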
The total run time for such a process consists of two major parts. 1) The recompilation and
reconfiguration time. For each fault injection, a certain amount of computation is required
to derive the information regarding what needs to be changed in the mapped FPGAs. The time
spent in this computation is referred to as recompilation time. The following
reconfiguration of the FPGAs also takes some time, which is referred to as reconfiguration
time. If the sum of the recompilation time and the reconfiguration time for one fault
injection is $T_{\mathrm{inj}}$, the total time spent in recompilation and reconfiguration
will be $N_{\mathrm{inj}} \times T_{\mathrm{inj}}$, where $N_{\mathrm{inj}}$ is the number
of fault injections. 2) Emulation time. Suppose the emulator can typically operate at a
speed of more than 1 MHz. Emulation of 100 K vectors then takes only 0.1 s or less.
Therefore, the actual emulation time is only a small fraction of the recompilation and
reconfiguration time, which dominates the total time of the entire process.

Fig. 4. Illustration of the independent fault set. (a) Independent fault set and
(b) dependent faults.

For large designs with a large number of faults to be emulated, pure serial fault emulation
may be too expensive in run time due to the large number of recompilations and
reconfigurations. Two techniques are proposed to enhance the performance. 1) Inject multiple
independent faults simultaneously [9]. A set of faults $F$ is called independent if the
fanout cones of the faults in $F$ are mutually disjoint. Independent faults can be injected
simultaneously. By observing the output response to the input stimuli, the detection of each
single stuck-at fault of an independent fault set can be determined without ambiguity.
2) We add extra logic into the prototype design such that a number of dependent faults can
also be injected and emulated within one FPGA-configuration. Using these techniques, the
total number of recompilations and reconfigurations can be reduced dramatically.
B. Emulating Independent Faults Simultaneously
Definition 1: The output image of a fault $f$, denoted as Image($f$), is the set of the
primary outputs that are reachable structurally from the faulty signal. In the case of a
sequential circuit, it includes those primary outputs that are reachable by a path through
FFs.
Definition 2 [3]: A set of faults $F$ is called (structurally) independent as long as the
output images of the faults in $F$ are mutually disjoint.
Consider the example in Fig. 4(a). Suppose $F$ is an independent fault set containing two
faults $f_1$ and $f_2$. Then we can emulate $f_1$ and $f_2$ in a single run by injecting
both faults into the fault-free circuit simultaneously. From the output responses of this
circuit with double faults, the exact output responses of the circuit with only $f_1$ and
the circuit with only $f_2$ can be determined without ambiguity because Image($f_1$) and
Image($f_2$) are disjoint. If a fault effect appears at a primary output in Image($f_1$),
then $f_1$ is detected. Similarly, if a fault effect appears at a primary output in
Image($f_2$), then $f_2$ is detected. In contrast, if the output images of $f_1$ and $f_2$
are not disjoint, like the case in Fig. 4(b), then these two faults cannot be emulated
simultaneously due to the possible effect of fault masking, i.e., it is possible that even
though a fault (e.g., $f_1$) is detected by the applied test sequence, the fault effect is
masked in the presence of another fault (e.g., $f_2$).

Fig. 5. Dynamic fault injection by adding extra circuitry. (a) A portion of an original
circuit mapped into FPGAs and (b) dynamic faults, signal a s.a.1 and signal g s.a.0, are
injected by G1 and G2.

In the preprocessing stage, the independent fault sets are
identified. All faults in each independent fault set are then
emulated at the same time during the subsequent emulation
stage. In [9], an algorithm was proposed to identify the
independent fault set for combinational circuits. We extend
this algorithm for sequential circuits. The average size of an
independent fault set for ISCAS-89 benchmark circuits will be
presented in Section IV.
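
Definitions 1 and 2 can be illustrated with a small reachability computation. The sketch below is an assumption-laden toy: the fanout graph, the signal names, and the greedy first-fit grouping are invented for illustration and are not the algorithm of [9] or its sequential extension. FFs are treated as ordinary graph nodes so that reachability passes through them, as Definition 1 requires.

```python
from typing import Dict, FrozenSet, List

# Hypothetical fanout graph: signal -> signals it drives (ff1 is a flip-flop).
FANOUT: Dict[str, List[str]] = {
    "a": ["g1"], "b": ["g1"], "c": ["g2"], "d": ["g2", "g3"],
    "g1": ["po1"], "g2": ["ff1"], "ff1": ["po2"], "g3": ["po3"],
}
PRIMARY_OUTPUTS = {"po1", "po2", "po3"}

def output_image(signal: str) -> FrozenSet[str]:
    """Primary outputs structurally reachable from the faulty signal."""
    seen, stack = set(), [signal]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(FANOUT.get(node, []))
    return frozenset(seen & PRIMARY_OUTPUTS)

def group_independent(fault_sites: List[str]) -> List[List[str]]:
    """Greedy first-fit grouping into sets with mutually disjoint output images."""
    groups = []  # each entry: (member list, union of member images)
    for site in fault_sites:
        image = output_image(site)
        for members, covered in groups:
            if not (image & covered):      # disjoint from every member's image
                members.append(site)
                covered |= image           # in-place update of the shared set
                break
        else:
            groups.append(([site], set(image)))
    return [members for members, _ in groups]

GROUPS = group_independent(["a", "b", "c", "d", "g3"])
```

Each resulting group could be injected in one configuration; a fault effect seen at a given primary output is then attributed to the single group member whose image contains that output.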
C. Dynamic Fault Injection
Exploiting the independence between faults can reduce the number of times that the FPGAs
need to be reconfigured. But in general, the average size of an independent fault set for
sequential circuits is quite small. Hence, we propose another technique to further reduce
the reconfiguration time. This technique, called dynamic fault injection (it was also
independently proposed in [2] and [20]), allows the injection of a large number of dependent
faults in a single FPGA-configuration by adding extra supporting circuitry.
Fig. 5 illustrates the idea. Fig. 5(a) shows a portion of a circuit which has been mapped
into the FPGAs. The small boxes represent the logic blocks (CLBs). Suppose the faults
signal a stuck-at-one (s.a.1) and signal g stuck-at-zero (s.a.0) are not independent.
Therefore, they cannot be injected and emulated in a single configuration. However, by
adding a fault activation controller and two logic gates G1 and G2 as shown in Fig. 5(b),
these two faults can be injected in a single FPGA-configuration. Note that the functionality
of this FPGA-configuration is now dependent on the output values of the fault activation
controller. In Fig. 5(b), the fault activation controller has two outputs, one for each
injected fault. One extra gate is also added at the location of each fault to be injected.
For example, an OR gate G1 is added for injecting the s.a.1 fault dynamically, i.e., when
the first controller output is set to 1, the fault is activated (because the output of an OR
gate is then a constant 1). On the other hand, if it is set to 0, then no fault effect is
created and CLB1 becomes fault-free. Similarly an AND gate, G2, is added for injecting the
s.a.0 fault dynamically. When the second controller output is set to 1, the fault is
present. When it is set to 0, CLB2 is fault-free. The fault activation controller should be
designed in such a way that only one dynamic fault is activated at any time. For instance,
initially the controller produces 10 to emulate the s.a.1 fault. After all input vectors are
emulated, we force the controller to produce 01, which activates the second fault, s.a.0.
Note that two passes of emulation are still required in this case. But since the emulation
time is not the bottleneck, the saving in reconfiguration time will be reflected in the
overall fault grading time. The performance estimation in Section V will show the
significant potential of this technique.
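
The behavior of the two extra gates can be mimicked in a few lines. This is a sketch under stated assumptions: the small circuit around the two fault sites is invented, the enable names `en_a` and `en_g` are hypothetical stand-ins for the two controller outputs, and the s.a.0 gate is modeled as an AND with an inverted enable so that an enable of 1 activates the fault, as the text describes.

```python
def inject_sa1(signal: int, enable: int) -> int:
    """OR gate: enable=1 forces the signal to 1 (s.a.1); enable=0 is transparent."""
    return signal | enable

def inject_sa0(signal: int, enable: int) -> int:
    """AND gate with inverted enable: enable=1 forces the signal to 0 (s.a.0)."""
    return signal & (1 - enable)

def circuit(a: int, b: int, c: int, en_a: int = 0, en_g: int = 0) -> int:
    """Hypothetical circuit with dynamic fault sites on signals a and g."""
    a_f = inject_sa1(a, en_a)      # fault site: a s.a.1
    g = a_f & b
    g_f = inject_sa0(g, en_g)      # fault site: g s.a.0
    return g_f | c

VECTORS = [(0, 1, 0), (1, 1, 0)]
golden = [circuit(*v) for v in VECTORS]

# Two emulation passes in one "configuration": controller outputs 10, then 01.
pass_sa1 = [circuit(*v, en_a=1, en_g=0) for v in VECTORS]
pass_sa0 = [circuit(*v, en_a=0, en_g=1) for v in VECTORS]
detected = {"a s.a.1": pass_sa1 != golden, "g s.a.0": pass_sa0 != golden}
```

Both faults share the path through `g`, so they are not independent, yet each can be graded without reprogramming because only the enable values change between passes.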
Since the injected faults are to be activated one by one during the emulation stage, a
circular shift register (CSR) can be used to implement the fault activation controller.
Each flip-flop (FF) of this CSR is responsible for activating one injected dynamic fault.
At the beginning of the emulation session, the content of this shift register is
initialized to 10...0 for emulating the first dynamic fault. After the first fault is
emulated, an external clock signal is applied to the CSR to change its content to 010...0,
which will activate the second dynamic fault. Similar operations are performed for the
following activation of the rest of the injected dynamic faults. Note that each FF can be
used to activate a set of independent faults in a more general case. With this dynamic fault
injection technique, we are able to inject a large number of faults per configuration at the
cost of extra logic (the controller and one gate for each dynamic fault). In the extreme
case, suppose the emulator has unlimited capacity for fault emulation; then no
reconfiguration is needed because all faults can be injected within one configuration.

Fig. 6. Mapping only four-input function to each CLB for dynamic fault injection. (a) A
fault-free CLB regardless of the value of x and (b) a CLB with a dynamic fault (activated
as x = 1).

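
A one-hot circular shift register of this kind is easy to model. The class below is a hypothetical sketch (the name `CircularShiftRegister` and its interface are not from the paper); it only shows that exactly one activation bit is set at any time and that one external clock pulse moves the active position to the next dynamic fault.

```python
from typing import List

class CircularShiftRegister:
    """One-hot fault activation controller: bit i enables dynamic fault i."""

    def __init__(self, n_faults: int) -> None:
        # Initialized to 10...0 so the first dynamic fault is active.
        self.bits: List[int] = [1] + [0] * (n_faults - 1)

    def clock(self) -> List[int]:
        """Apply one external clock pulse: rotate the single 1 by one position."""
        self.bits = [self.bits[-1]] + self.bits[:-1]
        return self.bits

csr = CircularShiftRegister(3)
history = [list(csr.bits)]
for _ in range(3):
    history.append(list(csr.clock()))
```

After as many pulses as there are FFs, the register returns to its initial content, which is why a circular register suffices as the controller.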
Since recompiling the entire or partial circuit is time-consuming (involving
repartitioning, re-placement, and rerouting), we propose a technique to inject dynamic
faults without changing the layout of the FPGAs. Our goal is to inject a set of selected
dynamic faults by only changing the affected CLBs' contents, in the same way we inject a
static fault. Suppose a CLB is capable of implementing any arbitrary five-input function. We
adopt a conservative policy that maps only a four-input function to each CLB. The intention
is to reserve one input terminal in each CLB for the fault activation control signal. Fig. 6
illustrates this idea. The function of a CLB is expressed in the Shannon-expansion form. It
contains two four-input functional blocks sharing the same input signals. These two
functional blocks are connected to a multiplexer controlled by a fault activation signal
$x$. In Fig. 6(a), the CLB's output exhibits a fault-free function regardless of the value
of $x$ because both of its four-input functional blocks realize the identical fault-free
function. On the other hand, Fig. 6(b) shows a CLB with an injected dynamic fault. When $x$
equals 0, the output is fault-free. But when $x$ equals 1, the injected dynamic fault is
activated and the faulty function is selected as the CLB's output. Recall that the faulty
function is derived by evaluating the function of the CLB when the target fault is present,
as shown earlier in Fig. 3.
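
The Shannon-expansion arrangement can be written out as a 32-entry table built from two 16-entry tables. In the sketch below the four-input function `f` and the fault (input `a` s.a.0) are made-up examples, and `shannon_lut` is a hypothetical helper; only the structure, with the reserved fifth input `x` selecting between the fault-free and faulty blocks, follows the text.

```python
from itertools import product

def shannon_lut(fault_free, faulty, n_inputs=4):
    """Five-input table: x = 0 selects fault_free(inputs), x = 1 selects faulty(inputs)."""
    table = {}
    for bits in product((0, 1), repeat=n_inputs):
        table[(0,) + bits] = fault_free(*bits)   # cofactor for x = 0
        table[(1,) + bits] = faulty(*bits)       # cofactor for x = 1
    return table

def f(a, b, c, d):
    """Hypothetical fault-free four-input CLB function."""
    return (a & b) | (c & d)

def f_faulty(a, b, c, d):
    """Same function with input a stuck at 0."""
    return f(0, b, c, d)

clb_fault_free = shannon_lut(f, f)         # like Fig. 6(a): x has no effect
clb_dynamic = shannon_lut(f, f_faulty)     # like Fig. 6(b): fault active when x = 1
```

Switching a CLB from the first table to the second changes only its contents; the wire carrying `x` is already routed, so no re-placement or rerouting is implied.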
The above scheme for dynamic fault injection can be done
without changing the layout of the FPGAs. In order to
incorporate this scheme, the circuit compilation before the fault
emulation process needs to be modified as follows.
1) Add a circular shift-register (CSR) to the design under
fault simulation.
2) Map the design into CLBs with the restriction that a
CLB can only realize a function with at most four inputs.
3) Connect the output of each FF of the CSR to the
reserved inputs of a set of selected CLBs as illustrated
in Fig. 7.
4) Map the added interconnections into the routing channels of the FPGAs.
5) Download the configuration bitstream into the FPGA
hardware.

Fig. 7. The fixed layout of the fault activation signals generated by a CSR.
(Only the interconnections needed for dynamic fault injections are shown.)

After these modifications, the fault injection can be done
by simply changing the contents of the affected CLBs as in
Fig. 6(b). Recompilation is completely avoided because the
interconnections between CLBs remain the same throughout the entire fault emulation process. This implementation
technique trades emulation capacity for efficient dynamic fault
injection. Experimental results on ISCAS benchmark circuits
show that the number of CLBs increases by only 22% due to
this conservative policy. The entire procedure of fault grading
with this idea is summarized in Fig. 8.
For a large design that requires more than one FPGA
chip to implement, the shift registers in the individual chips should be
connected globally so that only one fault activation signal
is activated in the entire system during any emulation session.
Fig. 8. Fault emulation procedure using proposed dynamic fault injection.

III. HANDLING THE UNKNOWN LOGIC VALUE
A software fault-simulator typically uses the three-valued logic system, zero, one, and X,
where X is an artificial logic value to represent the unknown. The emulation system, similar
to a real chip, does not have the notion of X. The lack of the X value has both
disadvantages and advantages. The presence of hyperactive faults that cause the X value to
populate a large portion of the design creates a large number of unnecessary events in
simulations using the X value. Simulation for these faults is very computationally expensive
and significantly degrades the performance of the software fault-simulators. On the other
hand, a fault emulation system may not be able to differentiate a hard detected fault from a
potentially detected fault because of the lack of the X value. A fault is called hard
detected if a fault effect (0/1 or 1/0) appears at a primary output, while a fault is called
potentially detected when either 1/X (i.e., the fault-free value is one and the faulty value
is unknown) or 0/X appears at a primary output after simulation. A fault emulation system
will classify a potentially detected fault as either detected or not detected depending on
the initial state of the emulation system. For synchronous circuits, the percentage of
potentially detected faults is usually very low. (Some results on ISCAS benchmark circuits
will be presented in Section IV.)

Fig. 9. A primitive noninverting gate for handling three-valued logic using
dual-railed signals. (a) Original cell and (b) dual-railed cell.

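
The distinction between hard detected and potentially detected faults can be stated as a small classification rule over (fault-free, faulty) value pairs observed at a primary output. The function names and the string encoding of values below are assumptions made for illustration.

```python
from typing import Iterable, Tuple

KNOWN = {"0", "1"}

def classify(good: str, faulty: str) -> str:
    """Classify one (fault-free, faulty) primary-output observation."""
    if good in KNOWN and faulty in KNOWN and good != faulty:
        return "hard"         # 0/1 or 1/0
    if good in KNOWN and faulty == "X":
        return "potential"    # 0/X or 1/X
    return "none"

def grade(observations: Iterable[Tuple[str, str]]) -> str:
    """Overall status of one fault over all observed output values."""
    kinds = {classify(g, f) for g, f in observations}
    if "hard" in kinds:
        return "hard detected"
    if "potential" in kinds:
        return "potentially detected"
    return "undetected"
```

An emulator without an X value would see some definite 0 or 1 in place of each X, so a "potentially detected" fault ends up reported as detected or undetected depending on the initial state.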
We can use a dual-railed logic to accommodate the unknown logic value X in our emulation
system. The idea is to use two wires to represent a three-valued logic signal. For instance,
00 represents logic 0, 11 represents logic 1, and 01 represents unknown X, while the unused
code 10 is a don't-care term that may be used to minimize the circuit. This encoding can be
implemented by simple duplication of the original netlist. For example, consider the AND
gate shown in Fig. 9(a). The new cell to implement the encoding scheme is given in
Fig. 9(b). It doubles the fanins, the gate count, and the fanouts.
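
The claim that plain duplication implements three-valued AND under this encoding can be checked exhaustively. The sketch below uses the encoding from the text (00 for 0, 11 for 1, 01 for X); the helper names `and3` and `dual_rail_and` are hypothetical, and `and3` is the usual three-valued AND used as a reference.

```python
ENCODE = {"0": (0, 0), "1": (1, 1), "X": (0, 1)}
DECODE = {rails: value for value, rails in ENCODE.items()}

def and3(a: str, b: str) -> str:
    """Reference three-valued AND: a controlling 0 wins over X."""
    if a == "0" or b == "0":
        return "0"
    if a == "1" and b == "1":
        return "1"
    return "X"

def dual_rail_and(a, b):
    """Duplicated AND gate: each output rail is the AND of the matching input rails."""
    return (a[0] & b[0], a[1] & b[1])

# Exhaustive comparison over all nine input combinations.
TABLE = {
    (a, b): DECODE[dual_rail_and(ENCODE[a], ENCODE[b])]
    for a in "01X" for b in "01X"
}
```

None of the nine combinations produces the unused code 10, so decoding never fails for valid inputs.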
The correctness of the functionality can be verified by the tables in Fig. 10. Fig. 11 shows
the transformation for an inverting NOR gate. The two wires of the output signal need to be
swapped. These simple transformations are applicable to complex gates. We also need to
duplicate all the FFs and implement them with set/reset features. At the beginning of an
emulation run, the FFs are reset to 01 to represent the unknown initial value X. One
property that can be used to reduce the overhead is based on the observation that not every
signal can take an unknown value. Those signals that are not reachable from the FFs (only
reachable from primary inputs) will never be contaminated by the X value and, thus, do not
have to use double rails. Instead, they remain single railed. When these signals feed into
the contaminated region, they are duplicated and changed into dual-railed signals (i.e., 0
becomes 00 and 1 becomes 11). The concept is illustrated in Fig. 12. The overhead of using
dual-railed logic in terms of the number of extra gates will be presented in Section IV.

Fig. 10. Verification of the AND cell of the dual-railed logic. (a) Original
cell operation and (b) dual-railed cell operation.

Fig. 11. A primitive inverting gate for handling 3-valued logic using
dual-railed signals. (a) Original cell and (b) the cell implemented in hardware
emulator.
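
The inverting case and the single-rail to dual-rail conversion can be checked the same way. This sketch again assumes the 00/11/01 encoding from the text; `dual_rail_nor`, `nor3`, `widen`, and `RESET_STATE` are hypothetical names chosen for illustration.

```python
ENCODE = {"0": (0, 0), "1": (1, 1), "X": (0, 1)}
DECODE = {rails: value for value, rails in ENCODE.items()}

def nor3(a: str, b: str) -> str:
    """Reference three-valued NOR: a controlling 1 wins over X."""
    if a == "1" or b == "1":
        return "0"
    if a == "0" and b == "0":
        return "1"
    return "X"

def dual_rail_nor(a, b):
    """Two duplicated NOR gates with the output rails swapped."""
    first = 1 - (a[0] | b[0])    # NOR of the first rails
    second = 1 - (a[1] | b[1])   # NOR of the second rails
    return (second, first)       # swap because the gate inverts

def widen(bit: int):
    """Single-railed signal entering the dual-railed region: 0 -> 00, 1 -> 11."""
    return (bit, bit)

RESET_STATE = (0, 1)   # FFs start as X

NOR_TABLE = {
    (a, b): DECODE[dual_rail_nor(ENCODE[a], ENCODE[b])]
    for a in "01X" for b in "01X"
}
```

Without the swap, inverting the code 01 rail by rail would give the unused code 10; swapping the rails maps X back to 01.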
IV. EXPERIMENTAL RESULTS
Several experimental results on ISCAS-89 benchmark circuits are presented in this section.
Table I shows the average size of an independent fault set for some benchmark circuits. It
indicates the average number of faults that can be injected and emulated at the same time.
This property allows about a 1.36 times speedup without any overhead when it is added to the
scenario of the serial fault emulation. Table II shows the approximate area penalty of using
the proposed dynamic fault injection technique in terms of the number of extra CLBs. The
result is obtained by using the technology mapper in SIS [10]. The overhead of the fault
activation controller, primarily consisting of FFs, is relatively small and is ignored. The
average overhead for the proposed dynamic fault injection technique is about 22%.

Fig. 12. Illustrating the dual-railed region.

TABLE I
THE AVERAGE SIZE OF INDEPENDENT FAULT SET

TABLE II
THE EXTRA NUMBER OF CLBS FOR DYNAMIC FAULT INJECTION

Table III shows the percentage of the potentially detected faults, obtained by running a
software simulator, PROOFS [16], using two test sequences, one of which is generated by an
automatic test pattern generation (ATPG) program and the other is random. The first number
is the fault coverage counting only hard-detected faults. The second number considers both
hard-detected and potentially detected faults. If we do not use the dual-railed logic, the
fault emulator would report a coverage between these two numbers. Since the difference
between these two numbers is small, the fault coverage reported by the fault emulator
(without dual-railed logic) would be very close to either one of the listed fault coverages
reported by the software simulator. Table IV shows the estimated overhead of applying the
dual-railed logic to handle the X logic value in terms of the number of extra gates. The
column under the title X-region represents the number of gates that are reachable from
present state lines and, thus, need to be duplicated.

TABLE III
THE PERCENTAGE OF THE POTENTIALLY DETECTED FAULTS

TABLE IV
THE AREA OVERHEAD OF USING DUAL-RAILED LOGIC
V. PERFORMANCE ESTIMATION
Suppose the circuit for fault emulation has
gates and
collapsed faults. There are input patterns. Some technical as-

1494

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 18, NO. 10, OCTOBER 1999

sumptions for estimating the fault emulation time are described


below. The reprogramming time of one FPGA-chip, denoted
as , is assumed to be 0.5 s pessimistically. The bitstream
manipulation can be done off-line and thus its time is assumed
to be negligible. The emulator is assumed to operate at the
speed of 1 MHz. The total fault grading time using the serial
fault emulation algorithm can be expressed as

time (i.e., 61 min). In general, the circuit emulation time is


proportional to
/Independent) as mentioned earlier.
The reduction factor of the reconfiguration time, denoted as
Gain, can be re-expressed as follows:

Time (serial fault emulation)


(reconfiguration time

emulation time)

where
is the implementation time of the original faultfree design. Consider a sample circuit with 100 K gates, 100 K
collapsed faults, and a test sequence with 50 K input patterns.
0.05)
55 K s,
The second term becomes 100 K (0.5
about 14.5 h without fault dropping. Now if we incorporate
both techniques mentioned above, the fault emulation time
can be expressed as
Time (fault emulation)

where dynamic is the average number of dynamic faults


injected in each configuration, Independent is the average size
is the time to reconfigure
of an independent fault set, and
every FPGA-chip. The second term of the above expression
can be viewed as the total time for reconfiguration and the third
is related to the number of
term the total emulation time.
chips required to implement the circuit under grade grading,
expressed as
(reconfiguration time for a chip)*(number of chips)
(number of chips)(s).
Suppose there are 20 20 CLBs in each FPGA-chip. The
results of Table II show that it takes about 4000 CLBs to
implement a circuit with 20 K gates. By extrapolation, we
assume that it takes 20,000 CLBs (or 50 FPGA-chips) to
implement a 100 K gate circuit. Therefore, , being the time
s. Let us further
to reconfigure 50 chips, will be
assume that we inject 20 dynamic faults into each chip, (i.e.,
one dynamic fault per column) and the size of the independent
fault set, Independent, is 1.36 by the result of Table I. Then,
the estimated fault emulation time of this 100 K gate circuit
becomes

min

min.

Based on this estimation, the reconfiguration time (i.e.,


100 s) becomes negligible. The entire fault emulation time,
reduced from 14.5-h to only 62.6 min as compared to the
serial algorithm, is now dominated by the circuit emulation

This derivation indicates that the reduction factor only


depends on the number of dynamic faults injected per chip.
Increasing this factor will lead to a greater reduction. However,
it may also cause a routing problem. Let us consider the
extreme case for example. If we inject one dynamic fault into
each CLB (i.e., 400 dynamic faults into each chip), then a fault
activation control signal described in Section II needs to be
connected to every CLB of the chip it controls. Therefore, this
net will become a global one and very difficult to route. On the
other hand, if we inject only one dynamic fault into a chip, then
the proposed dynamic fault injection technique will behave
very much like a serial one. Since the reprogramming time of
the FPGA-chips varies from vendor to vendor, in a specific
emulation environment this factor needs to be carefully chosen
in order to strike a balance between the routability and the
overall reconfiguration time reduction.
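
The trade-off can be made concrete with a small sweep over the number of dynamic faults injected per chip. This is a rough sketch under stated assumptions: the total fault count is a hypothetical value chosen for illustration, each configuration is assumed to carry (faults per chip) × (number of chips) faults, independent fault sets are ignored, and the fan-out of the activation control signal is taken to be one CLB per injected fault.

```python
import math

N_CHIPS = 50             # from the 100 K gate estimate in the text
TOTAL_FAULTS = 10_000    # hypothetical fault count, for illustration only


def configurations_needed(total_faults: int, faults_per_chip: int,
                          n_chips: int) -> int:
    """Assumed model: one configuration carries faults_per_chip * n_chips
    dynamic faults, so fewer configurations are needed as that grows."""
    return math.ceil(total_faults / (faults_per_chip * n_chips))


# 1 fault per chip (serial-like), 20 (one per column), 400 (one per CLB).
for faults_per_chip in (1, 20, 400):
    configs = configurations_needed(TOTAL_FAULTS, faults_per_chip, N_CHIPS)
    # Assumption: the control net must reach one CLB per injected fault.
    control_fanout = faults_per_chip
    print(f"{faults_per_chip:4d} faults/chip -> {configs:4d} configurations, "
          f"control signal reaches {control_fanout} CLBs per chip")
```

The sweep only illustrates the direction of the trade-off: more faults per chip means fewer reconfigurations but a wider control net to route.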

VI. CONCLUSION
Fault simulation for large sequential circuits remains a
very time-consuming task. With the increasing performance
of field-programmable gate-array and logic emulation technology, a hardware fault emulation system has become not only
feasible but also very efficient as compared with the existing software-based or hardware-accelerator-based approaches.
This paper addresses the issues of realizing a fault emulator
based on an existing logic emulation system. In addition to
a primitive scenario of serial fault emulation, two techniques
are proposed to further speed up the process. The concept
of the independent fault set is exploited to allow parallel
fault emulation, while the dynamic fault injection using extra
circuitry attempts to break the performance bottleneck, i.e.,
the reconfiguration time. The experimental results show that
the overhead of our conservative policy of four-input CLB
realization for dynamic fault injection is modest. Meanwhile,
the issue of incorporating the unknown logic value in the
emulator is addressed. Dual-railed logic is used to augment
our emulator with the ability to simulate the unknown logic
value and, thus, to differentiate hard detected faults from
potentially detected faults. It is worth mentioning that
the use of the dual-railed logic is not always necessary: 1) For
circuits with a reset state, there will be no potentially detected
faults assuming that the reset hardware is fault-free. 2) For
synchronous circuits, the percentage of the potentially detected
faults is usually very low. The performance estimation of the
proposed emulation scheme shows that this novel approach
could improve the performance of fault grading significantly.
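
As an illustration of how dual-railed logic separates hard detected from potentially detected faults, the sketch below models each signal with two rails. The encoding shown (one rail for "may be 1", one for "may be 0") is a common choice and is assumed here; it is not necessarily the encoding used in the emulator, and all names are ours.

```python
from typing import NamedTuple


class DualRail(NamedTuple):
    """Two-rail signal: which binary values the signal may take."""
    can_be_1: bool
    can_be_0: bool


ZERO = DualRail(False, True)
ONE = DualRail(True, False)
X = DualRail(True, True)    # unknown: may be either 0 or 1


def and_(a: DualRail, b: DualRail) -> DualRail:
    # Output may be 1 only if both inputs may be 1; may be 0 if either may.
    return DualRail(a.can_be_1 and b.can_be_1, a.can_be_0 or b.can_be_0)


def or_(a: DualRail, b: DualRail) -> DualRail:
    # Output may be 1 if either input may be 1; may be 0 only if both may.
    return DualRail(a.can_be_1 or b.can_be_1, a.can_be_0 and b.can_be_0)


def not_(a: DualRail) -> DualRail:
    # Inversion swaps the two rails.
    return DualRail(a.can_be_0, a.can_be_1)


def classify(good: DualRail, faulty: DualRail) -> str:
    """Compare one output of the fault-free and faulty circuits."""
    if good == X:
        return "undetected"            # no known reference value
    if faulty == X:
        return "potentially detected"  # faulty value may or may not differ
    return "hard detected" if good != faulty else "undetected"


print(and_(X, ZERO))        # a controlling 0 masks the unknown input
print(classify(ONE, ZERO))
print(classify(ONE, X))
```

Each logic gate costs roughly two gates in this representation, which is why skipping the dual-railed logic is attractive when potentially detected faults are absent or rare.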

REFERENCES
[1] P. Agrawal, V. D. Agrawal, and K.-T. Cheng, Fault simulation in a
pipelined multiprocessor system, in Proc. Int. Test Conf., Aug. 1989,
pp. 727734.
[2] L. Burgun, F. Reblewski, G. Fenelon, J. Barbier, and O. Lepape, Serial
fault simulation, in Proc. Design Automation Conf., June 1996, pp.
801806.
[3] M. Abramovici, M. A. Breuer and A. D. Friedman, Digital Systems
Testing and Testable Design. Piscataway, NJ: IEEE Press, 1990.
[4] M. Butts, J. Batcheller, and J. Varghese, An efficient logic emulation
system, in Proc. Int. Conf. Computer Design (ICCD-92), Oct. 1992,
pp. 138141.
[5] P. A. Duba, R. K. Roy, J. A. Abraham, and W. A. Rogers, Fault simulation in a distributed environment, in Proc. 25th Design Automation
Conf., June 1988, pp. 686691.
[6] N. Gouders and R. Kaibel, PARIS: A parallel pattern fault simulator
for synchronous sequential circuits, in Proc. Int. Conf. Computer-AidedDesign, Nov. 1991, pp. 542545.
[7] F. Hirose, M. Ishii, J. Niitsuma, T. Shindo, N. Kawato, H. Hamamura,
K. Uchida, and H. Yamada, Simulating processor SP, in Proc. Int.
Conf. Computer-Aided Design, Nov. 1987, pp. 484487.
[8] F. Hirose, K. Takayama, and N. Kawato, A method to generate tests for
combinational logic circuits using an ultra high-speed logic simulator,
in Proc. Int. Test Conf., 1988, pp. 102107.
[9] V. S. Iyengar and D. T. Tang, On simulation faults in parallel, in
Dig. Papers 18th Int. Symp. Fault-Tolerant Computing, June 1988, pp.
110115.
[10] D. H. Lee and S. M. Reddy, On efficient simulation for synchronous
sequential circuits, in Proc. 29th Design Automation Conf., June 1992,
pp. 327331.
[11] H. K. Lee and D. S. Ha, HOPE: An efficient parallel fault simulator
for synchronous sequential circuits, in Proc. 29th Design Automation
Conf., June 1992, pp. 336340.
, New methods of improving parallel fault simulation in
[12]
synchronous sequential circuits, in Proc. Int. Conf. Conputer-AidedDesign, Nov. 1993, pp. 1017.
[13] C. Y. Lo, H. N. Nham, and A. K. Bose, Algorithms for an advanced
fault simulation system in motis, IEEE Trans. Computer-Aided Design,
vol. CAD-6, pp. 232240, Mar. 1987.
[14] R. Murgai, N. Shenoy R. K. Brayton, and A. Sagiovanni-Vincentelli,
Improved logic synthesis algorithms for table look up architecture, in
Proc. Int. Conf. Computer-Aided Design, Nov. 1991, pp. 564567.

1495

[15] V. Narayanan and V. Pitchumani, A massively parallel algorithm for


fault simulation on the connection machine, in Proc. 26th Design
Automation Conf., June 1989, pp. 734737.
[16] T. M. Niermann, W. T. Cheng, and J. H. Patel, PROOFS: A fast, memory efficient sequential circuit fault simulator, IEEE Trans. ComputerAided Design, vol. 11, pp. 198207, Feb. 1992.
[17] D. L. Ostapko, Z. Barzilai, and G. M. Silberman, Fast fault simulation
in a parallel processing environment, in Proc. Int. Test Conf., Sept.
1987, pp. 686691.
[18] E. G. Ulrich and T. Baker, Concurrent simulation of nearly identical
digital networks, Computer, pp. 3444, Apr. 1974.
[19] S. Walters, Computer-aided prototyping for ASIC-based system, IEEE
Design Test Comput., pp. 410, June 1991.
[20] R. W. Wieler, Z. Zhang, and R. D. McLeod, Simulating static and
dynamic faults in BIST structures with a FPGA based emulator, in
Proc. Int. Workshop Field-Programmable Logic and Applications, Sept.
1994, pp. 240250.
, Emulating static faults using a Xilinx based emulator, in Proc.
[21]
IEEE Symp. FPGAs for Custom Computing Machines, Feb. 1995, pp.
110115.
[22] The Programmable Gate Array Data Book, Xilinx Inc., San Jose, CA,
1992.
[23] MARSIII Emulation System Users Guide, Quick-Turn Design System Inc., Mountain View, CA, Jan. 1994.

Kwang-Ting (Tim) Cheng (S'88–M'88–SM'98), for a photograph and biography, see p. 1352 of the September 1999 issue of this TRANSACTIONS.

Shi-Yu Huang, for a photograph and biography, see p. 1352 of the September
1999 issue of this TRANSACTIONS.

Wei-Jin Dai, photograph and biography not available at time of publication.
