
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

Low power On-Chip Amplifier for CCD Array


Er. Rahul Malhotra*, Er. Amit Kumar**
Bhai Maha Singh College of Engineering, Sri Muktsar Sahib, India
*blessurahul@gmail.com, **amitkumar_sgnr@yahoo.co.in

Abstract— Analog VLSI design is an essential part of any electronic system because the real world is analog. This paper presents a low power on-chip amplifier for a CCD array [1]. CCDs are used to capture images: modern digital cameras and high resolution cameras are built around a CCD array, but the performance of the array depends on the performance of the on-chip amplifier placed at the end of the array. Single stage and two stage amplifiers are simulated, and results for power dissipation and bandwidth are presented as the sizes of the individual transistors are varied. All results are verified using the Tanner tool (version 7.1) [11]. Many analyses in the literature aim to reduce power dissipation, but most of the resulting structures compromise either area or bandwidth. Here a lower power dissipation is achieved while a good bandwidth is maintained; detailed results supporting this claim are given in the Results section.
Keywords: Gain, power dissipation, bandwidth, capacitance

INTRODUCTION

Charge Coupled Devices (CCDs) were invented in 1969 and originally found application as memory devices. They now have many applications, the most important of which is imaging [3]. The basic operation of the sensor is to convert light into electrons. When light is incident on the active area of the image sensor, it interacts with the atoms that make up the silicon crystal. The energy carried by the light (photons) enables an electron to escape from the tight control of one atom and roam more freely about the device as a "conduction" electron, leaving behind an atom short of one electron. Modern CCDs have two types of architecture:

1. Full-Frame (FF)
2. Frame-Transfer (FT)

FF CCDs have the simplest architecture and are the easiest to fabricate and operate. They consist of a parallel CCD shift register, a serial CCD shift register and a signal sensing output amplifier. Images are optically projected onto the parallel array, which acts as the image plane. The architecture is shown in Fig. 1.

Fig. 1 Full Frame architecture

FT CCDs are very much like FF architectures. The difference is that a separate, identical parallel register that is not light sensitive, called the storage array, is added. The idea is to shift a captured scene very quickly from the photosensitive (image) array to the storage array [5]. Readout off chip from the storage register is then performed as described for the FF device, while the image array integrates the next frame. The architecture is shown in Fig. 2.

Fig. 2 Frame transfer architecture
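The photon-to-electron conversion described in the introduction can be illustrated numerically. The sketch below assumes a silicon band gap of about 1.12 eV at room temperature, a value that is not given in the paper, and checks whether a photon of a given wavelength carries enough energy to free an electron. The function names are illustrative only.

```python
# Minimal sketch: can a photon of a given wavelength free an electron in silicon?
# Assumption (not from the paper): silicon band gap of about 1.12 eV at room temperature.

PLANCK_EV_S = 4.135667696e-15    # Planck constant in eV*s
LIGHT_SPEED_M_S = 2.99792458e8   # speed of light in m/s
SILICON_BANDGAP_EV = 1.12        # assumed band gap of silicon in eV


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of one photon in eV for a wavelength given in nanometres."""
    return PLANCK_EV_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)


def can_generate_electron(wavelength_nm: float) -> bool:
    """True when the photon carries at least the assumed band-gap energy."""
    return photon_energy_ev(wavelength_nm) >= SILICON_BANDGAP_EV


def cutoff_wavelength_nm() -> float:
    """Wavelength at which the photon energy equals the assumed band gap."""
    return PLANCK_EV_S * LIGHT_SPEED_M_S / SILICON_BANDGAP_EV * 1e9
```

Under this assumption visible light (for example 550 nm) is detected, while wavelengths beyond roughly 1.1 μm are not absorbed by this mechanism.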


Both architectures are widely used, but the performance of each depends on the type and quality of the on-chip (output) amplifier, which is fabricated at the last stage of the structure as shown in the figures above.

ARCHITECTURE OF ON-CHIP AMPLIFIER
The output amplifier also has two types of architecture:
1. Single stage amplifier
2. Two stage amplifier

[Schematic labels: VRD, VRG (reset gate pulse), VDD, VCS, FD (detection node), out; transistors Mr, M1, Mc, each drawn with W = 22u, L = 2u]
Fig. 3 Single Stage CCD On-Chip amplifier

The single stage amplifier consists of a source follower M1 and a load transistor Mc for biasing. The reset FET is connected to the detection node, which consists of the floating diffusion [6, 7] and the gate of M1. In the ON state it resets the detection node to a reference voltage (VRD), and in the OFF state the floating diffusion can receive the next charge packet. The voltage source between the gate and source of the current sink transistor Mc determines the bias current of the first stage. It can also be used as a signal injection point to measure the ratio between the total capacitance and the effective sense capacitance, and the bandwidth in the OFF state.

The two stage amplifier further improves the characteristics of the amplifier and gives better results, as shown in the Results section of the paper; its architecture is shown in Fig. 4. The two stage amplifier also improves the sensitivity of the amplifier and reduces the noise level of the overall CCD.

[Schematic labels: VRD, Vdd, reset gate pulse, FD (detection node), output; transistors Mr, M1, M2, Mc, M3, each drawn with W = 22u, L = 2u]
Fig. 4 Two Stage CCD On-Chip amplifier

OPTIMIZATION
To optimize the on-chip amplifier, the length and width of the individual transistors are varied and the resulting performance is recorded. The effect of increasing and decreasing the length and width of each transistor is given below.

To achieve maximum gain:

Transistor 'M1': The gain can be maximized by increasing the width of this transistor, as this increases the difference in the output voltage amplitude.

Transistor 'MC': The gain can be maximized by decreasing the width of this transistor, as this increases the difference in the output voltage amplitude.


Transistor 'M2': The gain can be maximized by increasing the width of this transistor, as this increases the difference in the output voltage amplitude.

Transistor 'M3': The gain can be maximized by decreasing the width of this transistor, as this increases the difference in the output voltage amplitude.

To achieve maximum bandwidth:

Transistor 'M1': The bandwidth of the circuit can be increased by increasing the width of this transistor. The increase in width increases the transconductance, which lowers the impedance and so increases the bandwidth.

Transistor 'MC': The bandwidth of the circuit can be increased by increasing the width of this transistor. The increase in width increases the transconductance, which lowers the impedance and so increases the bandwidth.

Transistor 'M2': The bandwidth of the circuit can be increased by increasing the width of this transistor.

Transistor 'M3': The bandwidth of the circuit can be increased by increasing the width of this transistor. The increase in width increases the transconductance, which lowers the impedance and so increases the bandwidth, although the resulting change is not large.

To achieve minimum power dissipation:

Transistor 'M1': The power dissipation of the circuit can be reduced by reducing the width of this transistor, as the current flowing into the transistor falls with the width. It can also be reduced by increasing the length, because a longer channel reduces the transconductance, which in turn reduces the current flowing into the transistor.

Transistor 'MC': The power dissipation of the circuit can be reduced by reducing the width of this transistor, as the current flowing into the transistor falls with the width. It can also be reduced by increasing the length, because a longer channel reduces the transconductance, which in turn reduces the current flowing into the transistor.

Transistor 'M2': The power dissipation of the circuit can be reduced by reducing the width of this transistor, as the current flowing into the transistor falls with the width.

Transistor 'M3': The power dissipation of the circuit can be reduced by reducing the width of this transistor, as the current flowing into the transistor falls with the width.

RESULTS

Table 1: Width of transistor M3 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×10         10×25         5.9                      302
15×25         12×10         20×10         12×25         5.95                     320
15×25         12×10         20×10         15×25         6.0                      242
15×25         12×10         20×10         18×25         6.1                      207

Table 2: Width of transistor M2 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×10         10×25         5.15                     69
15×25         12×10         18×10         10×25         5.25                     62
15×25         12×10         16×10         10×25         5.2                      78
15×25         12×10         14×10         10×25         5.3                      70
15×25         12×10         12×10         10×25         5.4                      87
15×25         12×10         10×10         10×25         5.7                      122
15×25         12×10         8×10          10×25         5.8                      148

Table 3: Length of transistor M3 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×10         10×5          7.0                      580
15×25         12×10         20×10         10×10         6.4                      594
15×25         12×10         20×10         10×15         6.1                      596
15×25         12×10         20×10         10×18         6.0                      365
15×25         12×10         20×10         10×20         5.9                      270
15×25         12×10         20×10         10×25         5.7                      122
15×25         12×10         20×10         10×30         5.8                      109

Table 4: Length of transistor M2 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×5          10×15         6.4                      150
15×25         12×10         20×10         10×15         6.1                      490
15×25         12×10         20×15         10×15         5.9                      550
15×25         12×10         20×18         10×15         5.8                      570
15×25         12×10         20×20         10×15         5.8                      326
15×25         12×10         20×25         10×15         5.75                     380

The results in the above tables are taken from the Tanner T-spice tool using the 2.0 MOSIS model file for the enhancement MOSFET. The power dissipation and the bandwidth are measured directly from the waveform editor in the Tanner EDA tool.

CONCLUSION AND FUTURE SCOPE

The results show that, for the single stage on-chip amplifier, minimum power dissipation and maximum bandwidth are achieved when the width of M1 is 18 μm, the length of M1 is 25 μm, the width of Mc is 10 μm and the length of Mc is 16 μm. In this case the power dissipation is 4.3 mW, the gain of the amplifier is 0.82 and the bandwidth is 617 MHz. For the two stage amplifier, maximum bandwidth is achieved with transistor dimensions M1 (15 μm × 25 μm), M2 (20 μm × 10 μm), M3 (10 μm × 15 μm) and Mc (12 μm × 10 μm), and minimum power dissipation with M1 (15 μm × 25 μm), M2 (20 μm × 10 μm), M3 (10 μm × 25 μm) and Mc (12 μm × 10 μm). The whole design is simulated in the MOSIS/Orbit 2.0 μm process using the Tanner tool.

In this work the analog simulation is done with the Tanner tool using enhancement type MOSFETs. The work can be extended to depletion type MOSFETs, for which the noise level is expected to be lower. Semiconductor and environmental noise effects, which are not considered in the present work, can also be addressed in future.

REFERENCES
[1] Sol M. Gruner, Mark W. Tate and Eric F. Eikenberry, "Charge-coupled device area x-ray detectors", Review of Scientific Instruments, Vol. 73, Issue 8, pp. 2815-2842.
[2] M. J. Howes and D. V. Morgan, "Charge-Coupled Devices and Systems", John Wiley & Sons.
[3] James R. Janesick, "Scientific Charge-Coupled Devices", SPIE Press Monograph Vol. 85.
[4] M. S. Tyagi, "Introduction to Semiconductor Materials and Devices", John Wiley & Sons, Inc., 1991.
[5] Dalsa web site, CCD Technology Primer, http://www.dalsa.com/corp/markets/ccd_vs_cmos.aspx
[6] Kodak CCD Primer, #KCP-001, "Charge coupled device (CCD) Image Sensors", Eastman Kodak Company, Microelectronics Technology Division.
[7] D. Barbe, "Imaging Devices Using the Charge-Coupled Concept", Proceedings of the IEEE, pp. 38-67, Jan. 1975.
[8] Stuart A. Taylor, "CCD and CMOS Imaging Array Technologies: Technology Review", Technical Report EPC106, Xerox Research Centre Europe, 1998.
[9] J. D. E. Beynon, "The Basic Principles of Charge Coupled Devices", Microelectronics, Vol. 7, No. 2, 1975, Mackintosh Publications Ltd., Luton.
[10] P. Centen and E. Roks, "Characterization of Surface- and Buried-Channel Detection Transistors for On-Chip Amplifiers", Technical Digest IEDM97, pp. 193-196, San Francisco, Dec. 7-10, 1997.
[11] http://www.mosis.com/products/fab/vendors/tsmc/tsmc-kits.html
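As a cross-check of the two stage conclusion, the Table 3 data of the on-chip amplifier paper can be scanned for the rows with the highest bandwidth and the lowest power dissipation. The sketch below only re-reads the published numbers; the variable and function names are illustrative.

```python
# Table 3 data (length of M3 varied; M1 = 15x25, Mc = 12x10, M2 = 20x10, M3 width = 10 um).
# Each entry: (M3 length in um, power dissipation in mW, bandwidth in MHz).
TABLE_3 = [
    (5, 7.0, 580),
    (10, 6.4, 594),
    (15, 6.1, 596),
    (18, 6.0, 365),
    (20, 5.9, 270),
    (25, 5.7, 122),
    (30, 5.8, 109),
]


def max_bandwidth_row(rows):
    """Row with the largest bandwidth (third field)."""
    return max(rows, key=lambda row: row[2])


def min_power_row(rows):
    """Row with the smallest power dissipation (second field)."""
    return min(rows, key=lambda row: row[1])


best_bw = max_bandwidth_row(TABLE_3)
best_power = min_power_row(TABLE_3)
print(f"max bandwidth: M3 = 10x{best_bw[0]} um, {best_bw[2]} MHz at {best_bw[1]} mW")
print(f"min power:     M3 = 10x{best_power[0]} um, {best_power[1]} mW at {best_power[2]} MHz")
```

The two rows found this way, M3 = 10×15 μm for bandwidth and M3 = 10×25 μm for power, are the dimensions quoted in the conclusion, and they make the trade-off between the two goals explicit.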


Programmable Input Output Resistances of FET Amplifier

Mrs. Meena Singh, Lecturer, Deptt. of ECE, University Polytechnic, B.I.T. Mesra, Ranchi (meena71_singh@rediffmail.com) +91-9279265054
Arun Kumar Singh, Deptt. of ECE, Madan Mohan Malaviya Engg. College, Gorakhpur (singh16.arun@gmail.com) +91-9312801316
Dr. B. P. Singh, Professor, Deptt. of ECE & EEE, Mody Institute of Technology & Science, Lakshmangarh (bpsingh@ieee.org) +91-9468688102
Abstract— A mathematical model gives insight into the complete behavior of a physical system by reducing the problem to its essential characteristics. The floating admittance matrix (FAM) approach is a neat method for the mathematical modeling of electronic devices and their use in circuits. The zero sum property of the floating admittance matrix provides a check before proceeding further: if it fails, the first equation itself must be re-examined. All transfer functions are expressed in terms of cofactors of the floating admittance matrix of the circuit.

Keywords: Amplifier, Common Source FET, Floating Admittance Matrix, Zero Sum property, Cofactors, Plots

INTRODUCTION

The most commonly used MOSFET amplifier configuration is the common-source (CS) amplifier. It may be viewed as a transconductance amplifier or as a voltage amplifier. As a transconductance amplifier, the input voltage modulates the current going to the load. As a voltage amplifier, the input voltage modulates the current flowing through the FET, changing the voltage across the output resistance accordingly.
This paper develops the mathematical model of the common source amplifier. The floating admittance matrix of the FET is used to derive its voltage gain, input resistance and output resistance in the common source configuration.

MATHEMATICAL MODEL OF FET

The two stage common source FET amplifier is shown in Fig. 1.

Fig. 1 Two-stage Common Source Amplifier

The a.c. equivalent circuit of Fig. 1 is shown in Fig. 2.

[Schematic labels: RF, rs, RG1, RD1, RG2, RD2; nodes 1, 2, 3, 4]
Fig. 2 ac circuit of two-stage Common Source Amplifier

The matrix representation of the FET as a two-port network (four terminals) is written as


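\[
\begin{bmatrix} i_1 \\ i_2 \\ i_3 \end{bmatrix}
=
\begin{bmatrix} i_g \\ i_d \\ i_s \end{bmatrix}
=
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix},
\qquad
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
=
\begin{bmatrix} v_g \\ v_d \\ v_s \end{bmatrix}
\tag{1}
\]

The admittance matrix of the FET as a device is expressed in (1), with rows and columns ordered by terminals 1 (gate), 2 (drain) and 3 (source). Its coefficient matrix is

\[
Y =
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 \\
g_m & g_d & -(g_m + g_d) \\
-g_m & -g_d & g_m + g_d
\end{bmatrix}
\tag{2}
\]

The gate to source resistance of the FET is assumed to be very large (ideally infinite) because the gate is always reverse biased, hence \(g_g = 0\) S. The coefficient matrix of the FET in (1) then reduces to the right-hand side of (2). The admittance matrices of the two FETs (device 1 and device 2) connected as in Fig. 2 can therefore be written as

\[
Y_{device1} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) \\
-g_{m1} & -g_{d1} & g_{m1} + g_{d1}
\end{bmatrix}
\quad \text{(rows and columns: nodes 1, 2, 3)}
\tag{3}
\]

\[
Y_{device2} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m2} & g_{d2} & -(g_{m2} + g_{d2}) \\
-g_{m2} & -g_{d2} & g_{m2} + g_{d2}
\end{bmatrix}
\quad \text{(rows and columns: nodes 2, 4, 3)}
\tag{4}
\]

The composite matrix of the two devices (device 1 and device 2), with rows and columns ordered by nodes 1, 2, 3, 4, is written as

\[
Y_{devices} =
\begin{bmatrix}
0 & 0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) & 0 \\
-g_{m1} & -(g_{d1} + g_{m2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} & -g_{d2} \\
0 & g_{m2} & -(g_{m2} + g_{d2}) & g_{d2}
\end{bmatrix}
\tag{5}
\]

The overall admittance matrix for Fig. 2 is written as

\[
Y =
\begin{bmatrix}
g_s + G_{G1} + G_F & 0 & -(g_s + G_{G1}) & -G_F \\
g_{m1} & g_{d1} + G_{D1} + G_{G2} & -(g_{m1} + g_{d1} + G_{D1} + G_{G2}) & 0 \\
-(g_{m1} + g_s + G_{G1}) & -(g_{d1} + g_{m2} + G_{D1} + G_{G2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} + g_s + G_{G1} + G_{D1} + G_{G2} + G_L & -(g_{d2} + G_L) \\
-G_F & g_{m2} & -(g_{m2} + g_{d2} + G_L) & g_{d2} + G_L + G_F
\end{bmatrix}
\tag{6}
\]

Equation (6) represents the floating admittance matrix [3], [4], [5] of the two stage common source amplifier.
From (6), the input impedance of the circuit in Fig. 2 can be expressed as [1], [2]

\[
R_i = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
{(g_{g1} + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_L + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\tag{7}
\]

Similarly, its output impedance and voltage gain can be expressed as [1], [2]

\[
R_o = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{g1} + G_G + G_F)}
{(g_{g1} + g_s + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\tag{8}
\]

\[
A_V\big|^{43}_{13} = \operatorname{Sgn}(4-3)\,\operatorname{Sgn}(1-3)\,(-1)^{11}\,\frac{\left|Y^{13}_{43}\right|}{\left|Y^{13}_{13}\right|}
\]

\[
A_V = \frac{g_{m1} g_{m2} + G_F\,(g_{d1} + G_D + G_G)}{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
\tag{9}
\]

VERIFICATION ON MATLAB

The values of \(R_i\), \(R_o\) and \(A_V|^{43}_{13}\) for different values of source conductance and load conductance (0 mS, 1 mS and 2 mS) have been programmed in MATLAB. The outputs of the MATLAB programs have been plotted for \(R_i\), \(R_o\) and \(A_V|^{43}_{13}\) with respect to the feedback conductance \(G_F\).
If we assume that the two MOSFETs of Fig. 2 are properly biased to yield the same values of their internal parameters (\(g_{d1} = g_{d2}\) and \(g_{m1} = g_{m2}\)), then, for plotting the on-demand values of the simulated input and output resistances, typical values of the external and internal parameters can be given as:
\(g_{d1} = g_{d2} = 0.1\) mS, \(g_{m1} = g_{m2} = 5\) mS, \(G_L = G_D = 1\) mS, \(G_{G1} = G_{G2} = G_G = 0.001\) mS, \(g_{g1} = g_{g2} = 0.0001\) mS, \(G_F\) variable (0 mS to 0.15 mS).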
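The zero sum property mentioned in the abstract can be checked numerically on the floating admittance matrix (6). The sketch below is only an illustration: it builds the matrix from the terms of (6), with rows and columns ordered by nodes 1 to 4, and verifies that every row and every column sums to zero. The function names, the tolerance, and the sample values of gs and GF used in the example are assumptions, not part of the paper.

```python
def floating_admittance_matrix(gm1, gd1, gm2, gd2, gs, GG1, GD1, GG2, GL, GF):
    """Matrix (6) of the two stage amplifier; all conductances in mS."""
    return [
        [gs + GG1 + GF, 0.0, -(gs + GG1), -GF],
        [gm1, gd1 + GD1 + GG2, -(gm1 + gd1 + GD1 + GG2), 0.0],
        [-(gm1 + gs + GG1), -(gd1 + gm2 + GD1 + GG2),
         gm1 + gd1 + gm2 + gd2 + gs + GG1 + GD1 + GG2 + GL, -(gd2 + GL)],
        [-GF, gm2, -(gm2 + gd2 + GL), gd2 + GL + GF],
    ]


def has_zero_sum_property(matrix, tol=1e-12):
    """True when every row and every column of the matrix sums to zero."""
    rows_ok = all(abs(sum(row)) <= tol for row in matrix)
    cols_ok = all(abs(sum(col)) <= tol for col in zip(*matrix))
    return rows_ok and cols_ok


# Typical internal values from the text; gs = 1 mS and GF = 0.05 mS are sample choices.
Y = floating_admittance_matrix(gm1=5.0, gd1=0.1, gm2=5.0, gd2=0.1, gs=1.0,
                               GG1=0.001, GD1=1.0, GG2=0.001, GL=1.0, GF=0.05)
print(has_zero_sum_property(Y))
```

If a term is mistyped while entering the matrix, the check fails, which is the practical use of the zero sum property described in the abstract.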
The plots show that the simulated input and output resistances can take on-demand values, both negative and positive, controlled by the feedback conductance between the two stages of the amplifier.
The input resistance as a function of feedback conductance, computed from (7), is shown in Figs. 3, 4 and 5 for load conductances of 0 S, 1 mS and 2 mS respectively.
The following observations are recorded from the plots in Figs. 3, 4 and 5:

Fig. 3 Input resistance as a function of feedback conductance for GL = 0 S
a) For GL = 0 S, the input resistance is almost constant (1.148e+06 Ω) from the initial values of Gf until Gf reaches 2.7520e-05 mS. It then rises exponentially (from 1.148e+06 Ω to 4.837e+06 Ω) as Gf varies from 2.7520e-05 mS to 2.7523e-05 mS. It is interesting to note that Ri then jumps down suddenly (from 4.837e+06 Ω to -6.828e+07 Ω) as Gf varies from 2.7523e-05 mS to 2.7524e-05 mS, and increases suddenly again to -4.237e+06 Ω as Gf approaches 2.7525e-05 mS. The curve then increases linearly (from -4.237e+06 Ω to -1.473e+06 Ω) from Gf = 2.7525e-05 mS to Gf = 2.7527e-05 mS, and Ri remains constant thereafter at -1.473e+06 Ω for higher values of Gf.

Fig. 4 Input resistance as a function of feedback conductance for GL = 1 mS
b) For GL = 1 mS, the input resistance is almost constant at 3.289e+05 Ω from the initial values of Gf until Gf reaches 0.0004036 mS. Ri then increases linearly (from 3.289e+05 Ω to 4.393e+07 Ω) from Gf = 0.0004036 mS to Gf = 0.0004038 mS and suddenly jumps down (to -7.805e+06 Ω) as Gf reaches 0.00040381 mS. Ri then rises again (from -7.805e+06 Ω to -6.729e+05 Ω) from Gf = 0.00040381 mS to Gf = 0.0004039 mS, and remains constant thereafter at -6.729e+05 Ω for higher values of Gf.

Fig. 5 Input resistance as a function of feedback conductance for GL = 2 mS
c) For GL = 2 mS, the input resistance rises exponentially (from 216.5 Ω to 3331 Ω) from Gf = 0.0001 mS to Gf = 0.0011 mS, then suddenly jumps down to Ri = -4418 Ω at Gf = 0.0012 mS. It rises exponentially again (to -225.4 Ω) until Gf = 0.002 mS and remains constant thereafter at -225.4 Ω for higher values of Gf.
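The sign change of the input resistance can be located directly from (7). The sketch below is an illustration in Python rather than the authors' MATLAB program. It evaluates (7) with the typical parameter values given in the text and finds the feedback conductance at which the denominator of (7) vanishes. The closed form used for that point is an assumption of this sketch: it follows from expanding the denominator of (7), where the GF² terms cancel and a linear expression in GF remains. Function names are illustrative.

```python
def input_resistance_ohm(GF, GL, gm1=5.0, gm2=5.0, gd1=0.1, gd2=0.1,
                         GD=1.0, GG=0.001, gg1=0.0001, gg2=0.0001):
    """Evaluate (7). Conductances are in mS, the result is in ohms."""
    b_num = gd1 + gg2 + GD + GG            # first factor of the numerator
    b_den = gd1 + gg2 + gm2 + GD + GG      # corresponding factor in the denominator
    a = gg1 + GG + GF
    c = gd2 + GL + GF
    denominator = a * b_den * c - GF * (gm1 * gm2 + b_den * GF)
    return 1000.0 * b_num * c / denominator   # 1/mS equals 1000 ohm


def pole_feedback_conductance_mS(GL, gm1=5.0, gm2=5.0, gd1=0.1, gd2=0.1,
                                 GD=1.0, GG=0.001, gg1=0.0001, gg2=0.0001):
    """GF (in mS) at which the denominator of (7) is zero.

    Expanding the denominator gives b*a0*c0 + (b*(a0 + c0) - gm1*gm2)*GF,
    with a0 = gg1 + GG, c0 = gd2 + GL and b the denominator factor above.
    """
    b_den = gd1 + gg2 + gm2 + GD + GG
    a0 = gg1 + GG
    c0 = gd2 + GL
    return b_den * a0 * c0 / (gm1 * gm2 - b_den * (a0 + c0))


for GL in (0.0, 1.0, 2.0):
    pole = pole_feedback_conductance_mS(GL)
    print(f"GL = {GL} mS: denominator of (7) vanishes near Gf = {pole:.6g} mS")
```

With these values the zero of the denominator falls near 2.752e-05 mS for GL = 0 S, near 0.0004038 mS for GL = 1 mS, and between 0.0011 mS and 0.0012 mS for GL = 2 mS, which are the feedback conductances at which the observations above report the jump from positive to negative resistance.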


The output resistance as a function of feedback conductance (Gf), computed from (8), is shown in Figs. 6, 7 and 8 for source conductances of 0 S, 1 mS and 2 mS respectively.
The following observations are recorded from the plots in Figs. 6, 7 and 8:

Fig. 6 Output resistance as a function of feedback conductance for gs = 0 S
a) For gs = 0 S, the output resistance is almost constant (1.735e+04 Ω) from the initial values of Gf until Gf reaches 2.752e-05 mS. It then rises exponentially (from 1.735e+04 Ω to 5.452e+04 Ω) as Gf varies from 2.7520e-05 mS to 2.7522e-05 mS. It is interesting to note that Ro then jumps down suddenly (from 5.452e+04 Ω to -7.697e+05 Ω) as Gf varies from 2.7522e-05 mS to 2.75242e-05 mS, and increases suddenly again to -4.776e+05 Ω as Gf reaches 2.75262e-05 mS. It then rises exponentially (from -4.776e+05 Ω to -1.252e+04 Ω) from Gf = 2.75262e-05 mS to Gf = 2.753e-05 mS, and Ro remains constant thereafter at -1.252e+04 Ω for higher values of Gf.

Fig. 7 Output resistance as a function of feedback conductance for gs = 1 mS
b) For gs = 1 mS, the output resistance is almost constant at 237.9 Ω from the initial values of Gf until Gf reaches 0.03340 mS. Ro then increases exponentially (from 237.9 Ω to 2829 Ω) from Gf = 0.03340 mS to Gf = 0.03341 mS and suddenly jumps down (to -7836 Ω) as Gf reaches 0.033411 mS. Ro then rises again (from -7836 Ω to -22.83 Ω) from Gf = 0.033411 mS to Gf = 0.0335 mS, and remains constant thereafter at -22.83 Ω for higher values of Gf.

Fig. 8 Output resistance as a function of feedback conductance for gs = 2 mS
c) For gs = 2 mS, the output resistance rises exponentially (from 0.805 Ω to 39.85 Ω) from Gf = 0.09 mS to Gf = 0.1 mS, then suddenly jumps down to Ro = -1.028 Ω at Gf = 0.11 mS and remains constant thereafter at -1.028 Ω for higher values of Gf.

The voltage gain as a function of feedback conductance, computed from (9), is shown in Figs. 9 and 10 for load conductances of 0 S, 1 mS and 2 mS.

Fig. 9 Voltage gain as a function of feedback conductance for GL = 0 S


Fig. 10 Voltage gain as a function of feedback conductance for GL = 1 mS and 2 mS
The plots in Figs. 9 and 10 reveal that the voltage gain (AV) is an inverse function of the feedback conductance (Gf). The voltage gain also decreases as the load conductance (GL) increases, owing to the inverse relationship given by (9).
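The output resistance and voltage gain trends can be reproduced from (8) and (9) in the same way. The sketch below is an illustration in Python, not the authors' MATLAB code. It uses the typical parameter values from the text, and the closed form for the zero of the denominator of (8) is an assumption of this sketch, obtained by expanding that denominator (the GF² terms cancel). Function names are illustrative.

```python
def output_resistance_ohm(GF, gs, gm1=5.0, gm2=5.0, gd1=0.1, gd2=0.1,
                          GD=1.0, GG=0.001, gg1=0.0001, gg2=0.0001):
    """Evaluate (8). Conductances are in mS, the result is in ohms."""
    b_num = gd1 + gg2 + GD + GG
    b_den = gd1 + gg2 + gm2 + GD + GG
    numerator = b_num * (gg1 + GG + GF)
    denominator = ((gg1 + gs + GG + GF) * b_den * (gd2 + GF)
                   - GF * (gm1 * gm2 + b_den * GF))
    return 1000.0 * numerator / denominator   # 1/mS equals 1000 ohm


def output_pole_mS(gs, gm1=5.0, gm2=5.0, gd1=0.1, gd2=0.1,
                   GD=1.0, GG=0.001, gg1=0.0001, gg2=0.0001):
    """GF (in mS) at which the denominator of (8) is zero (linear in GF after expansion)."""
    b_den = gd1 + gg2 + gm2 + GD + GG
    a0 = gg1 + gs + GG
    c0 = gd2
    return b_den * a0 * c0 / (gm1 * gm2 - b_den * (a0 + c0))


def voltage_gain(GF, GL, gm1=5.0, gm2=5.0, gd1=0.1, gd2=0.1,
                 GD=1.0, GG=0.001, gg2=0.0001):
    """Evaluate (9); conductances in mS, the gain is dimensionless."""
    numerator = gm1 * gm2 + GF * (gd1 + GD + GG)
    denominator = (gd1 + gg2 + GD + GG) * (gd2 + GL + GF)
    return numerator / denominator


for gs in (0.0, 1.0, 2.0):
    print(f"gs = {gs} mS: denominator of (8) vanishes near Gf = {output_pole_mS(gs):.6g} mS")
```

The zeros fall near 2.752e-05 mS, between 0.03340 mS and 0.03341 mS, and between 0.1 mS and 0.11 mS for gs = 0 S, 1 mS and 2 mS, in line with the jumps reported for Figs. 6 to 8. The gain function decreases with both Gf and GL, as stated for Figs. 9 and 10.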

CONCLUSION
The plots in Figs. 3 to 8 reveal a region of very sudden change in the input and output resistances, from very high positive values to large negative values, for a very small change, of the order of 10^-5, in the feedback conductance Gf. This zone of very rapid variation in the input and output resistances can be used to compensate resistances and so obtain a very high Q-factor in lossy networks.

REFERENCES
[1] Wai-Kai Chen, On second order cofactors and null return difference in feedback amplifier theory, International Journal of Circuit Theory and Applications, Vol. 6, Issue 3, pp. 305-312, Dec. 2006.
[2] Otso Juntunen, A two port S-parameter data transformation, Circuit Theory Laboratory Report Series, CT-35, Helsinki University of Technology, Espoo, Finland, 1998.
[3] B.P. Singh, Unified approach to electronics circuit analysis, IJEEE, pp. 276-285, July 1978.
[4] B.P. Singh, Active bridge for measurement of admittance parameters of the transistors, Indian Journal of Pure and Applied Physics, Vol. 15, pp. 783-786, Nov. 1976.
[5] B.P. Singh, A new active bridge for measuring FET parameters, J. Phys. E: Scientific Instruments, Vol. II, pp. 667-670, 1978.
[6] Jacob Millman and Christos C. Halkias, Integrated Electronics: Analog and Digital Circuits and Systems, Tata McGraw-Hill, pp. 471-475, 2004.
[7] B.P. Singh, Meena Singh, Sanjay Kumar Roy and S.N. Shukla, Mathematical modeling of electronic devices and its integration, Proceedings of National Seminar on Recent Advances on Information Technology, Allied Publishers Pvt. Ltd., Indian School of Mines University, Dhanbad, pp. 494-502, Feb. 6-7, 2009.
[8] B.P. Singh and Arun Kumar Singh, Verification of transfer functions of BJT obtained by using MATLAB, Proceedings of IEEE National Symposium on Innovative Development in Electronics Arena, Arya College of Engineering, pp. 92-96, Dec. 12, 2009.


RELIABILITY PREDICTION FOR IGBT BASED INVERTERS UNDER DIFFERENT SWITCHING PATTERNS

Fuzail Ahmad (1), S. K. Singh (2), Amit Kumar Verma (3)
(1) DOEACC Centre Gorakhpur, India, er.fuzailahmad@hotmail.com
(2) DOEACC Centre Gorakhpur, India, sksingh@doeaccgkp.edu.in
(3) IBM Gurgaon, India, amitkverma@in.ibm.com

Abstract—Due to the increasing importance of power electronics in the control of devices, particularly in electric vehicles, reliability analysis becomes important. The reliability of a component is the probability that the component will perform its intended function after a time 't' in a given operating condition. Determining component reliability by considering only the power losses is no longer sufficient: to predict the reliability of power electronic components, the temperature and the temperature cycles must be determined.
The Military Handbook [3], released by the US Department of Defense, is generally accepted and often used to determine reliability [1]. The handbook is no longer revised, new components like IGBTs are not considered in it, and its values are too conservative for available devices. Some manufacturers provide information that only extends to finding the switching losses and total power losses; very few of them give a thermal model of the devices. A method for calculating the power losses and for thermal modelling, based on a PWM reconstruction technique, is presented in [5]. This method is useful for large simulation time steps and particularly for long mission profiles. D. Hirschman presented an approach with simple formulas for reliability prediction of inverters in HEVs. Work presented in the literature so far has developed reliability models for power electronic components but has not addressed the effect of the PWM method on reliability. This work compares a six-step PWM based inverter and an SVPWM based IGBT inverter in terms of reliability. Reliability is found both by the conventional method and by considering thermal cycles. MATLAB/Simulink based models for finding the switching losses and temperature cycles are developed.

I. INTRODUCTION
The use of power electronic components in automobile applications is increasing day by day, so it becomes important to determine the reliability of the power electronic components used in automotive applications.
Inverters are used in hybrid electric vehicles to convert the DC supply coming from the battery into AC for the motor that runs the vehicle. Inverters are made up of semiconductors and capacitors, so it is important to assure the reliability of these components, because the malfunctioning of any power electronic component may prevent the vehicle from operating.
Three phase voltage source inverters are mainly used in these applications, with IGBTs as the switching devices. When designing an inverter, it is important to make a good thermal design such that, on the one hand, the temperature of the components never exceeds their specified maximum temperature and, on the other hand, the cooling system is not oversized.

II. BASICS OF RELIABILITY CALCULATION
2.1 INTRODUCTION
"The reliability of a component is the probability that this component will perform its intended function after a time t in a given working condition."
The global reliability of the system is the product of all component reliabilities:

\[
R_{sys}(t) = \prod_{i=1}^{n} R_i(t)
\]

Here n is the number of components and \(R_i(t) \le 1\). This means that adding a component reduces the reliability.
The starting point in reliability analysis is the evaluation of the reliability of a device or component. This is generally done from the available failure data: a large number of identical components are subjected to identical operating conditions and the frequency of their failures is tabulated.
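The product rule for the global reliability can be sketched in a few lines. The component reliabilities used below are hypothetical values chosen only to show that adding a component lowers the system reliability; the function name is illustrative.

```python
import math


def system_reliability(component_reliabilities):
    """Reliability of a series system: the product of the component reliabilities."""
    for r in component_reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must lie between 0 and 1")
    return math.prod(component_reliabilities)


# Hypothetical component reliabilities at some fixed time t.
two_parts = system_reliability([0.99, 0.98])
three_parts = system_reliability([0.99, 0.98, 0.95])
print(two_parts, three_parts)
```

Because every factor is at most 1, the product can only stay the same or decrease when another component is added.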


Let be the total no. of identical components for The component failure rate is computed by multiplying
reliability. a component base failure rate with application specific -factors.
is the no. of components surviving at time‗t‘. Failures/
is the no. of components failed at time‗t‘. (2.6)
Then at any time, + = and Reliability is Here is Base Failure Rate
given as is Temperature Factor
= is Application Factor
is Quality Factor
(2.1)
is Environmental Factor
Reliability is characterized by the failure rate .
However no-factor exist which takes temperature cycles into
The failure rate is the probability that a component, which is
consideration.
still operational at time , fails in the time interval ,
where . Thus, it gives the fraction of failures in a
certain time interval for defined boundary conditions. The unit III. ELECTRICAL MODELING AND
of the failure rate is FIT (failures in time) CALCULATION OF POWER LOSSES
A. During the design phase of an inverter, it is important to
make a good thermal design such that on the one hand the
(2.2) temperatures of the components never exceed their
The total failure rate for a system, consisting of k specified maximum temperature and on the other hand the
components, is the sum of all single failure rates , given as cooling system is not oversized. In hybrid electric vehicles,
the inverter load cannot directly be derived from the
(2.3) current load status. Instead, the inverter load is computed
by a complex algorithm that considers the motor speed, the required torque, the state of charge of the traction battery, etc.
The electrical simulation includes the inverter model and computes the currents and voltages at the terminals of the inverter. These values are stored in a file which is used as input for the thermal simulation. The advantage of this procedure is that the results of the electrical simulation can be reused for different thermal simulations, provided nothing in the model is changed.

SIMULATION AND RESULTS
5.1 INTRODUCTION
A block diagram representation of the whole work is shown in Fig. 5.1. The figure shows a three-phase inverter with IGBT/diode switching devices, a constant DC supply as the input to the inverter model, and a three-phase load.
The losses in the IGBT, i.e. conduction loss and switching loss, are calculated and fed to the thermal model. Note that the switching losses in an IGBT can be found from datasheets. The thermal model gives the junction temperature as an output, which is later used in calculating the reliability of the devices.

The mean time to failure (MTTF) is also used to characterize the reliability. MTTF is the mean time elapsed before the first failure occurs and is equal to the area under the reliability curve:
$MTTF = \int_0^{\infty} R(t)\,dt$ (2.4)
It can be calculated easily as
(2.5)
Different approaches can be used to calculate reliability. A well-known method is to use failure rate catalogs. Various failure rate catalogs are available, e.g. the Military Handbook (MIL-HDBK-217F) and the Recueil de Données de Fiabilité (RDF 2000).

2.2 Military Handbook (MIL-HDBK-217F) Method
Military Handbook 217F was released in 1995 by the US Department of Defense, Washington DC. This revised version is also the last version, as the Department of Defense has discontinued updating this standard. Hence, new electronic devices like IGBTs are not considered in this standard, and many reference values are too conservative for the currently available devices. Regardless, MIL-HDBK-217F is generally accepted and often used to determine reliability. The models were developed based on historical part failure rates.
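The statement that the MTTF equals the area under the reliability curve can be checked numerically. The sketch below assumes a constant failure rate, so that R(t) = exp(-λt), and uses a hypothetical value for λ; neither the rate nor the function names come from the paper. Under this assumption the integral should agree with the closed form 1/λ.

```python
import math


def reliability(t, failure_rate):
    """Reliability R(t) under an assumed constant failure rate (exponential law)."""
    return math.exp(-failure_rate * t)


def mttf_by_integration(failure_rate, horizon, steps=100_000):
    """Area under R(t) from 0 to `horizon`, using the trapezoidal rule."""
    dt = horizon / steps
    total = 0.5 * (reliability(0.0, failure_rate) + reliability(horizon, failure_rate))
    for k in range(1, steps):
        total += reliability(k * dt, failure_rate)
    return total * dt


failure_rate = 2.0e-6            # failures per hour (hypothetical value)
horizon = 20.0 / failure_rate    # long enough that R(t) has decayed to ~0
mttf = mttf_by_integration(failure_rate, horizon)

print("MTTF by integration (h):", round(mttf))
print("MTTF as 1/lambda (h):   ", round(1.0 / failure_rate))
```

The integration horizon is truncated at 20/λ, where the remaining area is negligible; a reliability curve of a different shape would only require replacing `reliability`.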

2.2.1 Component failure rate for IGBT


Fig.5.1 Block Diagram Representation of Model

5.2 THREE PHASE INVERTER
The Universal Bridge block used in the simulation model implements a universal three-phase power converter that consists of up to six power switches connected in a bridge configuration. The types of power switch and converter configuration are selectable from the dialog box. The Universal Bridge block allows simulation of converters using both naturally commutated (line-commutated) power electronic devices (diodes or thyristors) and forced-commutated devices (GTO, IGBT, MOSFET).

5.2.1 DESCRIPTION OF IGBT
The important specifications of IGBTs are as follows:
INPUT (g) - PWM switching signal to control the opening and closing of the IGBT.
OUTPUT (m) - The Simulink output of the block is a vector containing two signals. These signals are demultiplexed by using the Bus Selector block provided in the Simulink library. These signals are -
1. IGBT Current (A)
2. IGBT Voltage (V)
The parameters of the IGBT used in the simulation model are as follows:
1. Internal resistance (Ron) - The internal resistance Ron of the IGBT device, in ohms (Ω). In this model it is 1 mΩ.
2. Snubber resistance (Rs) - The snubber resistance, in ohms (Ω). The snubber resistance Rs is set to infinity to eliminate the snubber from the model.
3. Snubber capacitance (Cs) - The snubber capacitance, in farads (F). The snubber capacitance Cs is set to zero to eliminate the snubber.

5.3 CONTROL CIRCUIT
The PWM Generator block generates pulses for a carrier-based pulse width modulator (PWM) for the IGBTs. For each arm the pulses are generated by comparing a triangular carrier waveform to a reference modulating signal. The modulating signals can be generated by the PWM generator itself, or they can be a vector of external signals connected at the input of the block. Three reference signals are needed to generate the pulses for a three-phase bridge.
The amplitude (modulation index), phase, and frequency of the reference signals can be changed to control the output voltage of the bridge connected to the PWM Generator block on the AC terminals. The two pulses firing the two devices of an arm are complementary to each other; for example, when pulse 1, 3, or 5 is low (0), the corresponding pulse 2, 4, or 6 is high (1).
INPUT - Internal generation of modulating signals.
OUTPUT - Six pulses are generated for a three-arm bridge. Pulses 1, 3, and 5 fire the upper devices of the first, second, and third arms. Pulses 2, 4, and 6 fire the lower devices.
The parameters of the control circuit used in simulation are as follows:
1. Carrier Frequency (Hz) – 1080 Hz.
2. Sample Time (sec) – 5.14 µsec.
3. Modulation Index – 0.8. The amplitude of the internal sinusoidal modulating signal. The modulation index must be greater than 0 and lower than or equal to 1. This parameter is used to control the amplitude of the fundamental component of the output voltage of the controlled bridge.
4. Frequency of Output Voltage – 60 Hz. The frequency, in hertz, of the internal modulating signals. This parameter is used to control the fundamental frequency of the output voltage of the controlled bridge.
5. Phase of Output Voltage (degrees) – 0.

5.4 CALCULATION OF LOSSES IN IGBT
5.4.1 CONDUCTION LOSSES
As described in detail in chapter 4, the conduction loss in an IGBT is given by the product of the collector-to-emitter voltage of the IGBT while it is conducting and the collector current:
$P_{cond} = V_{CE} \, I_C$ (5.1)

5.4.2 SWITCHING LOSSES
The best way to find the switching losses in an IGBT is to use the datasheets provided by the manufacturer. For this model the IXER 35N120D1 by IXYS is used. In this datasheet

5.5 THERMAL MODEL
In the thermal model, the transient thermal impedance curve (Fig.5.3) provided in the datasheets of the IGBT/diode is used to find the parameters of the thermal network (given in Fig.4.2).


Fig.5.2 Block Diagram Representation of THERMAL MODEL.

Some manufacturers provide the values of thermal resistance and capacitance in their datasheets. In most datasheets, however, the information required to obtain the thermal network parameters is given in the form of a transient thermal impedance curve.

Fig. 5.3 Transient Thermal Impedance Curve

The values of calculated thermal resistance and thermal capacitance are given in Table 5.3.

TABLE 5.1
Calculated thermal resistance and thermal capacitance
                        IGBT          DIODE
Thermal resistance      0.238 °C/W    0.650 °C/W
Thermal resistance      0.362 °C/W    0.650 °C/W
Thermal capacitance     0.095 J/°C    0.064 J/°C
Thermal capacitance     0.240 J/°C    0.133 J/°C

Values of coefficients
                        IGBT          DIODE
                        0.60          1.30
                        0.0202        0.0562
                        0.1431        0.1698

Parameters of fitted curve
                        IGBT          DIODE
                        -0.499        -1.099
                        7.772         6.901
                        69.518        40.212

Table 5.3

5.5.1 CURVE FITTING
Here the curve fitting technique is used to approximate the curve by eq. 4.10. The points taken for data fitting are:
For IGBT
t = [0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0]
Zth = [0 0.38 0.5 0.58 0.59 0.595 0.6 0.6 0.6 0.6 0.6]
For Diode
t = [0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0]
Zth = [0 0.79 1.01 1.19 1.25 1.28 1.3 1.3 1.3 1.3 1.3]
The moving average method is used to smooth the curve. An exponential fit is then applied to the smoothed values; the governing equation is
a+b*exp(-c*x)+d*exp(-e*x)
Here the values of the variables a, b, c, d, e give the coefficients for the thermal network equations.
Table 5.1 gives the values of the coefficients of the equation approximated and fitted by the exponential curve fitting technique.
Table 5.2 gives the values of the coefficients of the transfer function found from the transient thermal impedance curves.

5.6 RESULTS AND DISCUSSION
The simulation was carried out for a three-phase IGBT inverter used in two different applications: a six-step VSI induction motor drive and a space vector PWM VSI induction motor drive. The junction temperatures of six IGBTs and six diodes are simulated. In this case the temperatures of the IGBT and diode junctions do not differ significantly. Hence, the temperatures of one IGBT junction and one diode junction are presented here. The simulations are carried out for both of these cases for different simulation times; speed and torque values are also changed between simulations to better reflect driving cycles. It can be seen that the results improve for long simulation run times.

5.6.1 SIX STEP GENERATION TECHNIQUE
The results shown here are for the values of thermal coefficients given in datasheets. The results for the values of thermal coefficients calculated from the transient thermal impedance curve by the curve fitting technique are given in Appendix III.
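The exponential fit of Section 5.5.1 can be illustrated on the IGBT data points listed there. The sketch below is not the authors' fitting tool: it assumes a simple strategy in which the two decay rates c and e are searched on a hypothetical grid while a, b and d are obtained by linear least squares for each pair. The moving-average smoothing step is omitted, and the grid range and all function names are assumptions made for illustration.

```python
import math

# Transient thermal impedance samples for the IGBT, as listed in Section 5.5.1.
t = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
z = [0.0, 0.38, 0.5, 0.58, 0.59, 0.595, 0.6, 0.6, 0.6, 0.6, 0.6]


def model(x, a, b, c, d, e):
    """Governing equation a + b*exp(-c*x) + d*exp(-e*x)."""
    return a + b * math.exp(-c * x) + d * math.exp(-e * x)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, 4):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        partial = sum(aug[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = (aug[r][3] - partial) / aug[r][r]
    return x


def fit(ts, zs, rates):
    """Grid-search the decay rates c < e; solve a, b, d by linear least squares."""
    best = None
    for idx, c in enumerate(rates):
        for e in rates[idx + 1:]:
            basis = [[1.0, math.exp(-c * x), math.exp(-e * x)] for x in ts]
            # Normal equations (B^T B) p = B^T z for p = (a, b, d).
            btb = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
                   for i in range(3)]
            btz = [sum(row[i] * y for row, y in zip(basis, zs)) for i in range(3)]
            a, b, d = solve3(btb, btz)
            sse = sum((model(x, a, b, c, d, e) - y) ** 2 for x, y in zip(ts, zs))
            if best is None or sse < best[0]:
                best = (sse, (a, b, c, d, e))
    return best


rates = [float(k) for k in range(2, 81, 2)]  # candidate decay rates in 1/s (assumed grid)
sse, params = fit(t, z, rates)
rms = math.sqrt(sse / len(t))

print("a, b, c, d, e =", [round(p, 4) for p in params])
print("RMS error (degC/W) =", round(rms, 4))
```

The same call with the diode samples gives the second set of coefficients. A finer grid or a nonlinear optimizer would refine c and e, so the coefficients printed here need not match the tabulated ones.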


[Figure panels: stator current (A), rotor speed (rpm) and torque (Nm) versus time (sec), 0 to 5 s]
Fig.5.4 Stator Current, Speed and Torque Curve

Fig. 5.4 shows the changes in the torque and speed values, which occur at 1 s, 1.5 s, 2.5 s and 4 s. The stator current changes in accordance with the changes in speed and torque.
A speed reference step from 0 to 1800 rpm is applied at t = 0. The speed set point does not go instantaneously to 1800 rpm but follows the acceleration ramp. The motor reaches steady state at t = 1 s.
At t = 1.5 s, a decelerating torque is applied on the motor's shaft. We can observe a speed decrease. Since the rotor speed is higher than the synchronous speed, the motor is working in the generator mode. The braking energy is transferred to the DC link and the bus voltage tends to increase. However, the overvoltage activates the braking chopper, which causes the voltage to reduce. In this example, the braking resistance is not big enough to avoid a voltage increase, but the bus is maintained within tolerable limits.
At t = 2.5 s, the torque applied to the motor's shaft steps from 30 Nm to 0 Nm. A drop in the DC bus voltage and in the speed can be observed. At this point, the DC bus controller switches from braking to motoring mode.
At t = 4 s, the load torque is switched from 0 to 15 and the speed of the motor again starts following the acceleration ramp. The motor again reaches a steady state at t = 4.4 s.

[Figure panels, two sets: temperature curve (temperature in degrees Celsius versus time in seconds) and number of detected temperature cycles (number of times versus difference in temperature)]


[Figure panels: power loss (W) and junction temperatures (deg C) versus time (sec), 0 to 5 s]
Fig. 5.5 Power Loss, Junction Temperature Curve

Fig. 5.5 shows the power losses that occur in the IGBTs. The total power loss is the sum of the conduction losses and the switching losses. Since the switching losses are constant at 40 W, as shown in Fig. 5.5, the fluctuation in the curve is due only to the variation of the conduction losses. The losses in a diode are the same as those in the IGBTs.
The junction temperatures of the IGBT and the diode are also shown. They indicate that the temperature in a diode is higher than that in the IGBTs. The curves show the variation in power and temperature cycles due to the variation in the speed of the motor.

Fig. 5.6 Detected Temperature Cycles for IGBT
Fig. 5.7 Detected Temperature Cycles for Diode

The temperature cycles of the junction temperature are detected by the algorithm (given in chapter 4). A total of 210 temperature cycles are detected. The curve clearly indicates that the number of temperature cycles is high at low values of temperature difference and decreases as the difference grows. Temperature cycles below 15 ºC are not very harmful for semiconductors, but they should be considered because of their large number.

∆T    n(reldata)    N              N(Total) = N*n(reldata)    1/N(Total)
3     22            8.6071e+006    1.89E+08                   5.28E-09
4     20            8.1873e+006    1.64E+08                   6.11E-09
5     27            7.7880e+006    2.10E+08                   4.76E-09
6     7             7.4082e+006    5.19E+07                   1.93E-08
7     4             7.0469e+006    2.82E+07                   3.55E-08
8     4             6.7032e+006    2.68E+07                   3.73E-08
9     9             6.3763e+006    5.74E+07                   1.74E-08
10    5             6.0653e+006    3.03E+07                   3.30E-08
11    5             5.7695e+006    2.88E+07                   3.47E-08
12    5             5.4881e+006    2.74E+07                   3.64E-08
13    5             5.2205e+006    2.61E+07                   3.83E-08
14    3             4.9659e+006    1.49E+07                   6.71E-08
15    3             4.7237e+006    1.42E+07                   7.06E-08
16    3             4.4933e+006    1.35E+07                   7.42E-08
17    5             4.2741e+006    2.14E+07                   4.68E-08
18    3             4.0657e+006    1.22E+07                   8.20E-08
19    3             3.8674e+006    1.16E+07                   8.62E-08
20    5             3.6788e+006    1.84E+07                   5.44E-08
21    4             3.4994e+006    1.40E+07                   7.14E-08
22    4             3.3287e+006    1.33E+07                   7.51E-08
23    4             3.1664e+006    1.27E+07                   7.90E-08
24    4             3.0119e+006    1.20E+07                   8.30E-08
25    4             2.8650e+006    1.15E+07                   8.73E-08
26    4             2.7253e+006    1.09E+07                   9.17E-08
27    4             2.5924e+006    1.04E+07                   9.64E-08
28    4             2.4660e+006    9.86E+06                   1.01E-07
29    4             2.3457e+006    9.38E+06                   1.07E-07
30    4             2.2313e+006    8.93E+06                   1.12E-07
31    4             2.1225e+006    8.49E+06                   1.18E-07
32    4             2.0190e+006    8.08E+06                   1.24E-07
33    4             1.9205e+006    7.68E+06                   1.30E-07
34    4             1.8268e+006    7.31E+06                   1.37E-07
35    4             1.7377e+006    6.95E+06                   1.44E-07
36    4             1.6530e+006    6.61E+06                   1.51E-07
37    4             1.5724e+006    6.29E+06                   1.59E-07
38    4             1.4957e+006    5.98E+06                   1.67E-07
F(t)                                                          2.78e-6
R(t) = 1-F(t)                                                 0.99999722


Table 5.4 Number of Temperature Cycles and Reliability

The values of reliability found by inserting the datasheet values directly and by using the curve fitting technique show that the curve fitting technique gives better reliability.

5.6.2 SPACE VECTOR PWM TECHNIQUE

[Figure panels: stator current (A), rotor speed (rpm) and electromagnetic torque (Nm) versus time (sec), 0 to 5 s]
Fig. 5.8 Stator Current, Speed and Torque

At time t = 0 s, the speed set point is 1800 rpm. The speed follows precisely the acceleration ramp. The speed comes to a steady state at t = 1 s.
At t = 1.5 s, the full load torque is applied to the motor shaft while the motor speed is still ramping to its final value. This forces the electromagnetic torque to increase to a high value and then to stabilize at 20 Nm once the speed ramping is completed and the motor has reached 1200 rpm.
At t = 2.5 s, the speed set point is changed to 1500 rpm and the electromagnetic torque again reaches a high value so that the speed ramps precisely at 1800 rpm/s up to 1500 rpm under full load.
At t = 4 s, the mechanical load passes from 0 Nm to 15 Nm, which causes the electromagnetic torque to stabilize at approximately 20 Nm shortly after. Note that the DC bus voltage increases since the motor is in the braking mode. This increase is limited by the action of the braking chopper.

[Figure panels: power loss (W) and temperatures (deg C) versus time (sec), 0 to 5 s]
Fig. 5.9 Power Loss and Temperature Curves

[Figure panels: temperature curve (temperature in degrees Celsius versus time in seconds) and number of detected temperature cycles (number of times versus difference in temperature)]
Fig. 5.10 Detected temperature Cycles for IGBT


[Figure panels: temperature curve (temperature in degrees Celsius versus time in seconds) and number of detected temperature cycles (number of times versus difference in temperature)]
Fig. 5.11 Detected temperature Cycles for Diode

∆T    n(reldata2)    N              N(Total) = N*n(reldata2)    1/N(Total)
3     3              8.6071e+006    2.58E+07                    3.87E-08
4     3              8.1873e+006    2.46E+07                    4.07E-08
5     3              7.7880e+006    2.34E+07                    4.28E-08
6     4              7.4082e+006    2.96E+07                    3.37E-08
7     4              7.0469e+006    2.82E+07                    3.55E-08
8     4              6.7032e+006    2.68E+07                    3.73E-08
9     3              6.3763e+006    1.91E+07                    5.23E-08
10    3              6.0653e+006    1.82E+07                    5.50E-08
11    3              5.7695e+006    1.73E+07                    5.78E-08
12    3              5.4881e+006    1.65E+07                    6.07E-08
13    3              5.2205e+006    1.57E+07                    6.39E-08
14    3              4.9659e+006    1.49E+07                    6.71E-08
15    3              4.7237e+006    1.42E+07                    7.06E-08
16    3              4.4933e+006    1.35E+07                    7.42E-08
17    3              4.2741e+006    1.28E+07                    7.80E-08
18    3              4.0657e+006    1.22E+07                    8.20E-08
19    3              3.8674e+006    1.16E+07                    8.62E-08
20    3              3.6788e+006    1.10E+07                    9.06E-08
21    3              3.4994e+006    1.05E+07                    9.53E-08
22    3              3.3287e+006    9.99E+06                    1.00E-07
23    3              3.1664e+006    9.50E+06                    1.05E-07
24    3              3.0119e+006    9.04E+06                    1.11E-07
25    3              2.8650e+006    8.60E+06                    1.16E-07
26    3              2.7253e+006    8.18E+06                    1.22E-07
27    3              2.5924e+006    7.78E+06                    1.29E-07
28    3              2.4660e+006    7.40E+06                    1.35E-07
29    3              2.3457e+006    7.04E+06                    1.42E-07
30    3              2.2313e+006    6.69E+06                    1.49E-07
31    3              2.1225e+006    6.37E+06                    1.57E-07
32    3              2.0190e+006    6.06E+06                    1.65E-07
33    3              1.9205e+006    5.76E+06                    1.74E-07
34    3              1.8268e+006    5.48E+06                    1.82E-07
F(t)                                                            2.95E-06
R(t) = 1-F(t)                                                   0.99999705
Table 5.5 Number of Temperature Cycles and Reliability

5.6.3 COMPARISON OF RELIABILITIES

MTTF (hrs)       Six-step             SVPWM                SVPWM (Ts=50sec)
                 IGBTs     Diodes     IGBTs     Diodes     IGBTs     Diodes
MIL-HDBK-217
Coffin-Manson
TABLE 5.6 CALCULATED MTTFs

In Table 5.6 the calculated MTTFs for the two approaches are compared. Even though the same simulation data were used, the two approaches yield component MTTFs that differ by orders of magnitude. It can be easily seen that the MTTFs for IGBTs and diodes in both applications come out to be nearly the same. This is because both experiments were done under nearly the same operating conditions. The reliability calculated by the Military Handbook does not consider the effect of temperature cycling; hence the MTTFs from this method are the same for all three cases. In all cases the IGBTs come out to be the least reliable component.
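The F(t) and R(t) entries of Tables 5.4 and 5.5 can be reproduced from the detected cycle counts. The sketch below makes two assumptions that are not spelled out in the text: the tabulated N column is taken to follow N = 10^7 * exp(-0.05 * ∆T), a closed form inferred from the listed numbers, and F(t) is taken as the sum of the last table column, 1/(N * n). Function and variable names are illustrative only.

```python
import math


def cycles_to_failure(delta_t):
    """N for a temperature swing delta_t (closed form inferred from the tables)."""
    return 1.0e7 * math.exp(-0.05 * delta_t)


def failure_probability(cycle_counts, first_delta_t=3):
    """Sum of 1/(N * n) over consecutive temperature swings, as in the tables."""
    total = 0.0
    for offset, n in enumerate(cycle_counts):
        n_total = n * cycles_to_failure(first_delta_t + offset)
        total += 1.0 / n_total
    return total


# Detected cycle counts n for delta_t = 3, 4, 5, ... degC.
six_step_counts = [22, 20, 27, 7, 4, 4, 9, 5, 5, 5, 5, 3, 3, 3, 5, 3, 3, 5] + [4] * 18
svpwm_counts = [3, 3, 3, 4, 4, 4] + [3] * 26

f_six_step = failure_probability(six_step_counts)
f_svpwm = failure_probability(svpwm_counts)

print("Six-step: cycles =", sum(six_step_counts),
      " F(t) = %.3e  R(t) = %.8f" % (f_six_step, 1.0 - f_six_step))
print("SVPWM:    cycles =", sum(svpwm_counts),
      " F(t) = %.3e  R(t) = %.8f" % (f_svpwm, 1.0 - f_svpwm))
```

The six-step counts add up to the 210 detected cycles mentioned in Section 5.6.1, which gives a quick consistency check on the transcription of the table.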


The results for the SVPWM technique for a simulation time of 50 sec are given in Appendix II.


Carry Ripple Adder based on Charge Recycling for Lower Energy MTCMOS
Arvind Kumar, Member, IEEE , Sanjeev Rai, Sarad Shrestha, ECED, MNNIT,Allahabad


Abstract— Multi-threshold CMOS (MTCMOS) technology features MOSFETs having low threshold voltage (for speed enhancement) and high threshold voltage (for suppressing standby leakage current during the sleep period). In this design, frequent mode transitions, i.e. active to sleep and sleep to active, may occur, which consume a significant amount of energy. This paper presents a charge recycling concept between virtual supply and virtual ground to reduce dynamic energy consumption during mode transition. It presents the simulation of a two bit carry ripple adder used in a 2 bit accumulator, showing a 75% reduction in dynamic energy consumption during mode transition compared to a ripple adder with conventional MTCMOS.

Index Terms— Charge recycling, Gated ground, Gated-power, Multi-threshold voltage, Virtual power node

I. INTRODUCTION
LOW power design is one of the most significant challenges in designing today's advanced VLSI circuits. Currently, portable devices consume a lot of energy during idle periods due to leakage current, which shortens the battery lifetime. A popular low leakage circuit technique is multi-threshold voltage technology, which is based on disconnecting the low threshold voltage (low Vt) logic gates from the power supply and/or the ground line by means of a sleep transistor (high Vt) (Fig. 1) that is turned off during the standby mode [1]. However, during the mode transitions from active to sleep and sleep to active, a significant amount of energy is consumed. If mode transitions are frequent, the energy overhead of turning the power gating structure off and on becomes more significant. As shown in Fig. 1, the virtual power node and virtual ground node have high parasitic capacitance due to the diffusion capacitances of the transistors connected to the virtual line and the wire capacitances.
This paper applies a new charge recycling technique to minimize energy consumption during mode transitions from active to sleep and sleep to active. The charge stored on the parasitic capacitances of the virtual power node (VP) and virtual ground node (VG) is recycled during mode transition.
The remainder of the paper is organized as follows. The conventional MTCMOS and virtual node voltages are described in section II, the charge recycling technique during sleep to active and active to sleep transitions and the parasitic capacitance calculation in section III, simulation results in section IV and the conclusion in section V.

[Figure: two CMOS full adder blocks linked by the carry signal, power gated between Vdd, virtual Vdd (VP) and virtual Gnd (VG)]
Fig. 1. Power gating structure using NMOS and PMOS sleep transistors. High Vt transistor is represented with thick line in channel region

II. CONVENTIONAL MTCMOS
The conventional MTCMOS, as shown in Fig. 1, consists of two blocks. The first block is power gated by an NMOS sleep transistor, creating a virtual ground node (VG) between the block and the sleep transistor, and the second block is power gated by a PMOS sleep transistor, creating a virtual power node (VP) between the sleep transistor and the block.

A. Virtual Ground and virtual power voltages
In active mode, the NMOS and PMOS sleep transistors are turned on (linear region). During active mode, the voltage at the virtual ground node (VG) is zero and at the virtual power node (VP) is Vdd [2]. In sleep mode, both NMOS and PMOS are in cut-off. Then the virtual ground node (VG) and virtual power node (VP) will be charged up to steady state values of high voltage (≈ 1.4 V) and low voltage (≈ 0 V), respectively, for a supply of 1.8 V, as shown in fig. 2. A large portion of the total energy drawn from the power supply is stored in the parasitic capacitance (shown in fig. 1 as lumped capacitance) associated with the virtual nodes. The remaining portion of the energy is dissipated in the parasitic impedances of the low Vt circuitry, i.e. a full adder in fig. 1. In order to calculate total dynamic energy, i.e. energy consumed

during sleep to active and active to sleep mode transition, we assumed that the sleep period is long enough to charge the virtual ground node (VG) to VDD and the virtual power node (VP) to zero. Let CG-virtual and CP-virtual represent the total parasitic capacitances at the virtual ground node (VG) and virtual power node (VP) respectively. Then the energy consumed during the sleep to active mode transition is as follows:

$E_{sleep-active} = C_{P-virtual} V_{DD}^2$ (1)

Similarly, during active to sleep mode, we assumed the virtual power node (VP) is at VDD and the virtual ground node (VG) at zero. For the active to sleep mode transition, the energy consumed is as follows:

$E_{active-sleep} = C_{G-virtual} V_{DD}^2$ (2)

The total energy consumed during one cycle of active to sleep and sleep to active is as follows:

$E_{TOTAL} = C_{G-virtual} V_{DD}^2 + C_{P-virtual} V_{DD}^2 = V_{DD}^2 (C_{G-virtual} + C_{P-virtual})$ (3)

Fig. 2. Virtual Ground Voltage VG =1.3V and Virtual supply voltage VP = 0V during sleep mode

III. CHARGE RECYCLING MTCMOS TECHNIQUE
In the charge recycling technique, the charges stored at the virtual ground node (VG) and virtual power node (VP) are recycled through a transmission gate [3, 4], shown in Fig. 3.

[Figure: two CMOS full adder blocks with sleep transistors and a transmission gate, driven by VCR, between virtual Gnd (VG) and virtual Vdd (VP)]
Fig. 3. Charge recycling MTCMOS circuit with transmission gate between virtual ground node (VG) and virtual power node (VP)

A. Charge recycling during sleep to active mode transition
As mentioned in section II, during sleep mode, the virtual ground node (VG) will be charged to almost VDD and the virtual power node to almost zero. Before turning on the sleep transistors to enter the active state, the transmission gate is turned on for a short period [3]. This allows for charge sharing between the virtual ground node (VG) and virtual power node (VP) until the parasitic capacitances on the nodes share the common voltage (Vf) [5], as shown in fig. 5. Here we assume that the parasitic capacitances on the virtual nodes are almost equal. After complete charge sharing, i.e. having equal voltages on the virtual nodes, the transmission gate is switched off and the sleep transistors are turned on for the sleep to active transition. The total energy drawn from the power supply to charge the parasitic capacitance at the virtual power node (VP) during the mode transition from sleep to active is as follows:

$E_{sleep-active} = V_{DD} (V_{DD} - V_f) C_{P-virtual}$ (4)

[Figure: VCR pulses at the active to sleep and sleep to active transitions]
Fig. 4. Charge recycling Signal (VCR)

B. Charge recycling during active to sleep mode transition
During the active state, the virtual power node (VP) is at VDD and the virtual ground node (VG) is at almost zero. Before turning off the sleep transistors while going from the active to the sleep state, the transmission gate is switched on briefly for charge recycling between the virtual ground node (VG) and virtual power node (VP). Charge sharing occurs between the two nodes until both reach the common voltage (Vf), and the transmission gate is then switched off. Now the sleep transistors are turned off. The charge recycling process is shown in Fig. 7. The parasitic capacitance at the virtual ground node (VG) draws the energy Eactive-sleep from the supply during the active to sleep transition, which is as follows:

$E_{active-sleep} = V_{DD} (V_{DD} - V_f) C_{G-virtual}$ (5)

Hence total energy drawn from the power supply during

active to sleep and sleep to active mode transition is given as follows:

$E_{Total(CR)} = E_{sleep-active} + E_{active-sleep} = V_{DD} (V_{DD} - V_f) C_{P-virtual} + V_{DD} (V_{DD} - V_f) C_{G-virtual} = V_{DD} (V_{DD} - V_f) (C_{P-virtual} + C_{G-virtual})$ (6)

Fig. 5. Charge recycling waveform of the two bit carry ripple adder during mode transition from sleep mode to active mode

C. Capacitance calculation
For the capacitance calculation on the virtual node, all the parasitic capacitances of the transistors connected to the virtual node are summed up. The MOSFET intrinsic capacitance (fig. 6) mainly includes structural capacitance, channel capacitance and diffusion capacitances [6] [7].

[Figure: MOS transistor with capacitances CGD, CGS, CGB, CDB and CSB]
Fig. 6. Capacitances of MOS transistor

Structural capacitance includes the overlap capacitances (gate to source overlap capacitance (CGSO) and gate to drain overlap capacitance (CGDO)). Channel capacitance depends on the operating region. For digital circuits, we can take the average value over the three operating regions. Likewise, diffusion capacitance includes the source to body (CSB) and drain to body (CDB) capacitances, which are calculated by the following equation:

$C_{DIFF} = C_{BP} + C_{SW} = C_J \cdot Area + C_{JSW} \cdot Perimeter$ (7)

where CJ is the zero bias bulk capacitance per square meter and CJSW is the zero bias perimeter capacitance per meter.

TABLE I.
Process parameter of TSMC 180 nm process for VDD =1.8 V
Parameters        NMOS    PMOS
CGDO (fF/µm)      0.37    0.33
CJ (fF/µm2)       0.77    0.85
CJSW (fF/µm)      0.18    0.33

IV. SIMULATION RESULT
We used the Cadence Spectre simulator and 180 nm technology (Vtnlow = |Vtplow| = 0.156 V and Vtnhigh = |Vtphigh| = 0.386 V) for the simulation of the circuit. Two bit static carry-ripple adders (using 28 transistors) are designed. The carry-ripple adder with conventional MTCMOS shown in fig. 1 and the adder with the charge recycling technique shown in fig. 3 are compared in terms of dynamic energy during mode transitions. All possible input vectors are applied to the circuit and almost the same energy overheads are found. By using charge recycling, the energy overhead during mode transition of the charge recycling ripple adder is 75% lower compared to the adder with conventional MTCMOS. Fig. 8 shows the total energy overheads for a full cycle of mode transition, i.e. from active to sleep and sleep to active.

Fig.7. Charge recycling waveform of the two bit carry ripple adder during mode transition from active mode to sleep mode

[Bar chart: energy (fJ), 0 to 50, for Conventional Gated-MTCMOS and Charge Recycling MTCMOS]
Fig.8. The energy overheads of the MTCMOS 2-bit ripple adders
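The energy overheads compared in Fig. 8 can be estimated to first order from Eqs. (3) and (6). The sketch below is an idealized hand calculation, not a substitute for the circuit simulation: it assumes equal virtual-node capacitances, full charging of VG to VDD and VP to zero during sleep, and lossless charge sharing. The capacitance value and the function names are hypothetical.

```python
def shared_voltage(v_g, v_p, c_g, c_p):
    """Common voltage Vf after ideal charge sharing (charge conservation)."""
    return (c_g * v_g + c_p * v_p) / (c_g + c_p)


def conventional_energy(v_dd, c_g, c_p):
    """Eq. (3): supply energy for one full sleep/active cycle, no recycling."""
    return v_dd ** 2 * (c_g + c_p)


def recycled_energy(v_dd, v_f, c_g, c_p):
    """Eq. (6): supply energy for one full cycle with charge recycling."""
    return v_dd * (v_dd - v_f) * (c_p + c_g)


V_DD = 1.8            # supply voltage in volts, as used in the paper
C_G = C_P = 100e-15   # virtual-node capacitances in farads (hypothetical values)

# In sleep mode VG is assumed to sit at VDD and VP at zero before sharing.
v_f = shared_voltage(V_DD, 0.0, C_G, C_P)
e_conv = conventional_energy(V_DD, C_G, C_P)
e_cr = recycled_energy(V_DD, v_f, C_G, C_P)
saving = 1.0 - e_cr / e_conv

print("Vf = %.2f V" % v_f)
print("E conventional = %.0f fJ, E recycled = %.0f fJ" % (e_conv * 1e15, e_cr * 1e15))
print("Ideal saving = %.0f %%" % (saving * 100.0))
```

Because the saving reduces to Vf/VDD, it depends only on the common voltage reached during charge sharing. The simulated figure reported in Section IV comes from the full circuit and need not match this idealized estimate.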

V. CONCLUSION
In this paper, a charge recycling MTCMOS technique for a two bit ripple adder is proposed to reduce the dynamic energy overhead during mode transitions from sleep to active and active to sleep. A transmission gate is used for charge recycling between the virtual rails. We have shown a 75% reduction of the energy overhead during mode transition, i.e. active to sleep and sleep to active, with the charge recycling technique compared to the conventional one. In the standby mode, however, the circuit loses its data. In future work, a data-retentive version of this circuit can be proposed.

REFERENCES
[1] S. Mutoh, T. Douseki, Y. Matsuya, T. Aoki, S. Shigematsu, and J. Yamada, “1 V power supply high-speed digital circuit technology with multi-threshold-voltage CMOS,” IEEE J. Solid-State Circuits, vol. 30, no. 8, pp. 847-854, Aug. 1995.
[2] A. Abdollahi, F. Fallah, and M. Pedram, “A Robust Power Gating Structure and Power Mode Transition Strategy for MTCMOS Design,” IEEE Trans. Very Large Scale Integrated Systems, vol. 15, Jan. 2007.
[3] E. Pakbaznia, F. Fallah, and M. Pedram, “Charge recycling in MTCMOS circuits: concept and analysis,” in Proc. ACM/IEEE Des. Autom. Conf., 2006, pp. 97-102.
[4] Z. Liu and V. Kursun, “Charge Recycling between Virtual Power and Ground Lines for Low Energy MTCMOS,” Proceedings of the IEEE/ACM International Symposium on Quality Electronic Design, pp. 239-244, March 2007.
[5] J. P. Uyemura, Introduction to VLSI Circuits and Systems, Wiley student edition.
[6] S.-M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design, 3rd ed. Pearson Education.
[7] N. H. E. Weste, D. Harris, and A. Banerjee, CMOS VLSI Design: A Circuits and Systems Perspective, 3rd ed. Pearson Education.
[8] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd ed.

Forthcoming CMOS Technology in Nanoscale Era


Shashank Mishra#1, Kshitij Bhargava#2, Rohit Tripathi#3 , Piyush Jain#4
Electronics and Communication Engineering (Microelectronics and Embedded Technology) Department
Jaypee Institute of Information Technology, Noida-201307, U.P., India
shashankmishra05EI44@gmail.com
kbhargava3@gmail.com
rohit.tri2008@gmail.com
piyush.oct@rediffmail.com

Abstract— CMOS technology has reached the sub-45 nm range. Nano-CMOS technology is expected to govern IC manufacturing for at least another couple of decades. Though there are many challenges ahead, further downsizing of the device to a few nanometers is still on the schedule of the International Technology Roadmap for Semiconductors (ITRS). Several technological options for manufacturing nano-CMOS microchips are available or will be available very soon. This paper reviews the challenges of nano-CMOS downsizing and focuses on recent developments in the key technologies for nano-CMOS in the years to come.

I. INTRODUCTION
Among the numerous great inventions made in the 20th century, electronics is the most important one. Almost everything related to human activities, such as power generation, transportation, entertainment, and medical care, is now provided and controlled by electronics. Semiconductors are a strategically important technological area for all nations. Electronic circuit development has been accomplished through the downscaling of component size since the replacement of vacuum tubes with transistors 40 years ago, and circuit characteristics have benefited greatly from the downsizing. We are now able to integrate millions of nanoscale CMOS transistors on a silicon chip occupying only a few square centimetres. The operating speed of recently developed microprocessors has already reached up to 5 GHz and is expected to increase further, although recent trends indicate that the increase in clock frequency may gradually saturate. CMOS integrated circuits and their core device technology are expected to evolve further for at least a couple of decades, and their importance will increase further in future intelligent systems. CMOS device dimensions have been reduced to a millionth at the production level in the past 100 years. A hundred years ago, no one could have imagined that the mankind of our time would be able to make electronic circuits consisting of billions of components, each with dimensions smaller than a bacterium, and that those circuits would fulfil the different needs of society. Future scaling trends have been predicted by the International Technology Roadmap for Semiconductors (ITRS) for 30 years up to 2040, when the physical gate length is expected to be 1 nm (as shown in Figure 1) [2]. It is believed that CMOS device downsizing will approach the physical limit.

Figure 1: Feature size versus time in silicon ICs.

II. CHALLENGES IN SCALING
Device downsizing from 10 μm to the sub-45-nm range has brought many benefits in terms of speed, power, and cost. Apart from these improvements, however, one of the major causes of performance degradation in ultra-large-scale circuits is the interconnect delay, which arises from the increase in the (parasitic) resistance and capacitance of narrow and dense interconnection metal lines. Furthermore, the performance improvement is also questionable for the ultra-small MOSFET itself. According to scaling theory, the drain current per unit gate width should remain constant. However, a significant reduction of the drain current per unit gate width for sub-45-nm gate length MOSFETs was reported recently (as in Fig. 2) [2]. This phenomenon is due to the non-optimized MOSFET structure and process. In addition, the small drain current (several tens of microamperes per micrometer) at the scaled supply voltage becomes a major concern. Besides, the fringing capacitance of the gate electrode and the inversion layer capacitance will also degrade the performance of ultra-small MOSFETs (as in Fig. 3) [2]. It is still doubtful at this moment that such a small MOSFET can be used for high-speed devices. Hence, without any new technology support, further downscaling may only result in performance degradation.
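The interconnect-delay argument above can be made concrete with a rough estimate. The minimal Python sketch below uses the textbook wire resistance R = ρL/(W·T) and the Elmore estimate 0.5·R·C for a distributed RC line. The resistivity, wire length, cross-sections, and capacitance per unit length are assumed illustrative values, not data from this paper, and the capacitance per unit length is held constant for simplicity.

```python
def wire_resistance(resistivity, length, width, thickness):
    """Resistance of a rectangular wire: R = rho * L / (W * T), in ohms."""
    return resistivity * length / (width * thickness)


def elmore_delay(resistance, capacitance):
    """Elmore delay estimate of a distributed RC line: 0.5 * R * C, in seconds."""
    return 0.5 * resistance * capacitance


# Assumed illustrative values (not taken from the paper).
RHO = 1.7e-8        # ohm*m, roughly bulk copper
LENGTH = 1e-3       # m, a 1 mm long wire
CAP_PER_M = 2e-10   # F/m, i.e. 0.2 fF per micrometer, held constant here


def delay_for(width, thickness):
    """Estimated delay of the wire for a given cross-section (meters)."""
    r = wire_resistance(RHO, LENGTH, width, thickness)
    c = CAP_PER_M * LENGTH
    return elmore_delay(r, c)


delay_wide = delay_for(100e-9, 100e-9)    # 100 nm x 100 nm cross-section
delay_narrow = delay_for(50e-9, 50e-9)    # 50 nm x 50 nm cross-section
ratio = delay_narrow / delay_wide
print(f"delay (100 nm wire): {delay_wide * 1e12:.1f} ps")
print(f"delay (50 nm wire):  {delay_narrow * 1e12:.1f} ps")
print(f"ratio: {ratio:.2f}")
```

Under these assumptions, halving both the width and the thickness of a wire of fixed length quadruples its resistance and therefore its estimated delay, which is the trend the text describes for narrow, dense metal lines.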

Figure 2: Significant reductions of the unit drain currents

Figure 3: Challenging issues in further downsizing of the MOS transistor

III. IMPROVEMENTS IN CMOS
There have been proposals to change the structure of the transistor itself. Here we discuss the two most prominent structural changes: Silicon on Insulator (SOI) and Double Gate CMOS (DGCMOS). The basic concept of Silicon on Insulator is fairly simple. Rather than fabricating a transistor whose body is connected to the substrate (Fig. 4.a), which is the normal method, an insulating oxide is first deposited on the substrate and the transistor is then fabricated on top of it (Fig. 4.b). By doing this, the body can be electrically isolated from its surroundings. This means that the bulk to source voltage Vbs is now floating. This design provides a number of performance benefits. Vbs is now greater than or equal to zero, which lowers the threshold voltage, Vt, providing a performance increase. Also, there is no junction area capacitance. Finally, stacked circuits do not suffer from the reverse body effect. The new structure also lends itself to some new uses, such as using the insulating layer for a high resistance element.

Figure 4.a: Bulk CMOS Gate

Figure 4.b: SOI Gate

There are of course some disadvantages associated with the new structure as well. While the floating Vbs provides many benefits, its variability can also become problematic. The value of Vbs is a function of the present current level in the gate as well as the history of previous states which the gate has been in. This means that the threshold of a gate may vary significantly throughout its operation. Also, if Vbs climbs too high it can cause pass-gate leakage. Techniques have been developed to address some of these issues. To test this technology, IBM redesigned some of their PowerPC line chips using SOI. They were able to demonstrate a 22-33% performance increase over the bulk CMOS version of these chips. They also found that, although implementing SOI structures requires a proper understanding of the unique problems associated with this technology, it was possible to redesign existing technologies in a reasonable amount of time.

The second structure is more experimental, but promises great benefits in the future. That structure is the Double-Gate CMOS (DGCMOS). The basic idea of this structure is to add an extra gate (or more) to increase coupling between the gate and the channel. Some have called this the “ideal structure for scalability”. Most people agree that it is the design of the future, but there are some difficulties to overcome first. The difficulties arise in how to implement the DGCMOS structure. Using traditional fabrication processes, a second gate could be added below the body. However, the alignment issues of such a gate are troublesome. The proposed solution is known as the FinFET. This structure builds the drain, source, and gate up vertically (as in Fig. 5).


Figure 5: FinFET structure

This may solve the alignment issue, but there is one other challenge to overcome. In order to control short-channel effects (SCE), the body thickness must be ¼ of the gate length. This is a daunting challenge because the gate length is usually the smallest dimension that can be fabricated. There are some technologies that may address this, but more work needs to be done in this area.
The most popular idea is to use carbon nanotubes (CNTs) as transistors (a configuration example is shown in Fig. 6). This concept is very appealing because it is still a transistor and could make use of all the architectural knowledge developed for CMOS. Carbon nanotubes, however, have a long way to go before they can start replacing silicon-based MOS transistors. First of all, nanotube transistors developed to date have shown very poor performance characteristics. Many of the problems they exhibit are similar to the challenges CMOS is currently facing, such as high off-state leakage and source-to-drain tunneling. Also, despite the hopes for chemical self-assembly some day, it is still very difficult to produce nanotube transistors.

Figure 6: Basic carbon nanotube transistor

IV. CONCLUSIONS
Silicon MOSFETs have been the smallest electronic devices for several decades. The gate length used for high performance logic units is 45 nm in production and 5 nm in research. Note that a 5-nm gate length is the distance of 18 atoms, and a 0.8-nm oxide thickness is only two atomic layers. Si technology is no doubt the most successful nano-device technology. We do not see any realistic replacement for silicon devices. Even if Si devices reach the downsizing limit, whether at 10 nm, 5 nm, or 1 nm, other emerging devices such as molecular transistors will also reach their limit of downsizing at similar dimensions. This decade is a critical period for moving from 45-nm to 10-nm technology. Most of the materials and manufacturing processes used in the deep-submicron era are now being pushed to their physical limits. New materials and technologies are required for further down-scaling of the device to 10-nm technology and below. Immersion lithography for ultra-fine patterning, strained channels, nickel salicide, high-k gate dielectrics, low-k interlayers for interconnect, plasma doping, flash and laser annealing for source and drain doping, elevated source and drain, and three-dimensional MOSFETs for controlling short-channel effects would help to overcome the materials and technological constraints and improve device performance at the ultra-small scale.
The final remark concerns a non-technical issue. We anticipate that this issue will be one of the most important for nano-CMOS technology development in the next 15 years. We are aware that most of the new mega-fabs being planned or under construction are in East and Southeast Asia, and particularly in Mainland China. In 10 or 15 years' time, the share of semiconductor manufacturing sites located in Asia (including Japan) will be quite substantial. Currently, Korea and Taiwan are in first place for semiconductor memory manufacturing and semiconductor foundry, respectively. They also lead technology development in the Asian region. Mainland China seems set to be another superpower in semiconductor manufacturing. China's share of semiconductor manufacturing will keep growing fast with the support of booming IC design houses and the construction of new fabs with a remarkable increase in industrial investment, and China will be the most important large and rapidly expanding market. As in many other industries and sectors of electronic products, Mainland China will eventually become “the factory of the world” in semiconductor manufacturing in 15 years or longer and will have a great impact on future nano-CMOS technology.

REFERENCES

[1] G. E. Moore, “Cramming more components onto integrated circuits”, Electronics, vol. 38, no. 8, 1965.
[2] International Technology Roadmap for Semiconductors, 2003 Edition, Semiconductor Industry Association (SIA), Austin, Texas: SEMATECH, USA.
[3] H. Iwai, Future semiconductor manufacturing-challenges and opportunities, IEDM Tech. Dig., 2004, pp. 1-16.
[4] H. Iwai, CMOS downsizing toward sub-100 nm, Solid-State Electron., vol. 48, 2003, pp. 497-503.


[5] W. Zhao and Y. Cao, New generation of Predictive Technology Model for sub-45 nm early design exploration, IEEE Trans. Electron Devices, 2006; 11:2816-23.
[6] T. Morimoto, H. S. Momose, T. Iinuma, et al., A NiSi salicide technology for advanced logic devices, IEDM Tech. Dig., 1991, pp. 653-656.
[7] T. Iizima, A. Nishiyama, Y. Ushiku, et al., A novel selective Ni3Si contact plug technique for deep-submicron ULSIs, Symp. VLSI Technology, 1992, pp. 70-71.
[8] R. Tsuchiya, M. Horiuchi, S. Kimura, et al., Silicon on thin BOX: A new paradigm of the CMOSFET for low-power and high-performance application featuring wide-range back-bias control, IEDM Tech. Dig., 2004, pp. 631-634.
[9] T. Ghani, et al., “Scaling challenges and device design requirements for high performance sub-50 nm gate length planar CMOS transistors,” Symp. VLSI Technology, 2000, pp. 174-175.
[10] B. Yu, “Scaling towards 35 nm gate length CMOS,” in Proc. VLSI Symp., Kyoto, AMD, June 12-14, 2001, pp. 9-10.
[11] D. Connelly, C. Faulkner, and D.E. Group, “Performance advantage of Schottky source/drain in ultrathin-body silicon-on-insulator and dual gate CMOS,” IEEE Trans. Electron Devices, vol. 50, no. 5, pp. 1340-1345, May 2003.
[12] J. Knickerbocker et al., IEEE Custom Integrated Circuits Conference (CICC), p. 659 (2005).
[13] G. Anelli, Design and characterization of radiation tolerant integrated circuits in deep submicron CMOS technologies for the LHC experiments, Ph.D. Thesis, Institute National Poly-technique de Grenoble, France, December 2000, also available at http://www.cern.ch/RD49.
[14] D. Frank et al., “CMOS device and circuit limits,” Proc. IEEE, vol. 89, Mar. 2001.
[15] Davari, R. H. Dennard, and G. G. Shahidi, “CMOS scaling, the next ten years,” Proc. IEEE, vol. 83, p. 595, 1995.
C. Mead, “Scaling of MOS technology to submicrometer feature sizes,” J. VLSI Signal Processing, pp. 9-25, 1994.
[16] Y. Taur and E. Nowak, “CMOS devices below 0.1 µm: How high will performance go?” in Proc. Int. Electron Devices Meeting, 1997, p. 215.


On Demand Simulation of Input and Output Resistances of MOSFET Amplifier

Mrs. Meena Singh, Lecturer, Deptt. of ECE, University Polytechnic, B.I.T. Mesra, Ranchi (meena71_singh@rediffmail.com) +91-9279265054
Arun Kumar Singh, Deptt. of ECE, Madan Mohan Malaviya Engg. College, Gorakhpur (singh16.arun@gmail.com) +91-9312801316
Dr. B. P. Singh, Professor, Deptt. of ECE & EEE, Mody Institute of Technology & Science, Lakshmangarh (bpsingh@ieee.org) +91-9468688102
Abstract—A very simple MOSFET amplifier circuit that realizes both very high positive and very high negative resistances at its input and output terminals is presented. A mathematical model is a representation of a device or system that predicts its response under different types of excitation. The floating admittance matrix (FAM) approach is one of the neat methods for mathematical modeling of electronic devices and their use in circuits. The zero sum property of the floating admittance matrix provides a check that tells the analyst whether to proceed further or to re-examine the first equation. All transfer functions are represented as cofactors of the floating admittance matrix of the circuit.

Keywords: Amplifier, Common Source FET, Floating Admittance Matrix, Zero Sum property, Cofactors, Plots

INTRODUCTION

The input resistance of a MOSFET is supposed to be very high, yet a single-stage MOSFET amplifier is sometimes not suitable for certain applications, especially when high gain is required along with a change in resistance levels from positive to negative or from very high to very low. This type of requirement is met by cascading, cascoding, or a combination of both in different sections of the amplifier stages. Fig. 1 shows two stages of the MOSFET amplifier with RF connected between the output of the second stage and the input of the first stage. It reveals that, with proper adjustment of the feedback resistance RF, one may realize extreme values of input and output resistance, both positive and negative. The common source amplifier is the most versatile MOSFET amplifier configuration. The common-source (CS) amplifier may be viewed as a transconductance amplifier or as a voltage amplifier. As a transconductance amplifier, the input voltage is seen as modulating the current going to the load. As a voltage amplifier, the input voltage modulates the amount of current flowing through the MOSFET, changing the voltage across the output resistance accordingly. The input resistance of a conventional emitter follower, cathode follower or source follower is limited by the finite value of the passive emitter/cathode/source resistance as well as the input bias resistance. In fact, the input bias resistance shunts the input resistance of the amplifier, and hence the effective input resistance is limited to the maximum value of the input bias resistance. A number of papers in the literature describe separate circuits for the realization of positive and negative resistances. The simple single set-up here realizes both positive and negative input and output resistance and saves a large number of active and passive components. Negative resistance is important in the design of oscillators, multivibrators, and filters, and in the synthesis of driving-point functions. An attractive method for controlling the line loss in telephone lines to any extent can be achieved by introducing a resistance, covering a very large range of values, in the impedance boosting network. The realization of very high positive as well as negative resistances of any amplifier is all the more important for instrumentation.
This paper aims to develop the mathematical model of the common source amplifier in the form of a floating admittance matrix. The floating admittance matrix of the MOSFET is used to derive its voltage gain, input resistance and output resistance in the common source configuration.

MATHEMATICAL MODEL OF FET

The two-stage common source MOSFET amplifier can be represented as in Fig. 1, with feedback through RF from the output of the second stage to the input of the first stage.

Fig.1 Two-stage Common Source Amplifier

The a.c. equivalent circuit of Fig.1 is shown in Fig. 2. The matrix representation of MOSFET as two-port network (four terminals) is written as


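$$
\begin{bmatrix} i_1 \\ i_2 \\ i_3 \end{bmatrix} =
\begin{bmatrix} i_g \\ i_d \\ i_s \end{bmatrix} =
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix},
\qquad
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} =
\begin{bmatrix} v_g \\ v_d \\ v_s \end{bmatrix}
\qquad (1)
$$

where rows and columns are ordered as terminals 1, 2, 3.

Fig.2 ac circuit of two-stage Common Source Amplifier

The admittance matrix of the MOSFET as a device is expressed in [1]-[3]. Its coefficient matrix is expressed as

$$
Y =
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
$$

The gate to source resistance of the MOSFET is assumed to be very large (ideally infinite) as it is always reverse biased, hence gg = 0 S. Then the above coefficient matrix of the MOSFET of (1) reduces to (2).

$$
\begin{bmatrix}
0 & 0 & 0 \\
g_m & g_d & -(g_m + g_d) \\
-g_m & -g_d & g_m + g_d
\end{bmatrix}
\qquad (2)
$$

Thus the floating admittance matrices of the two MOSFETs (device1 and device2) connected in Fig.2 can be written as

$$
Y_{device1} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) \\
-g_{m1} & -g_{d1} & g_{m1} + g_{d1}
\end{bmatrix}
\quad \text{(rows and columns: nodes 1, 2, 3)}
\qquad (3)
$$

$$
Y_{device2} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m2} & g_{d2} & -(g_{m2} + g_{d2}) \\
-g_{m2} & -g_{d2} & g_{m2} + g_{d2}
\end{bmatrix}
\quad \text{(rows and columns: nodes 2, 4, 3)}
\qquad (4)
$$

Now the composite matrix of the two devices (device1 and device2) is written as

$$
Y_{devices} =
\begin{bmatrix}
0 & 0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) & 0 \\
-g_{m1} & -(g_{d1} + g_{m2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} & -g_{d2} \\
0 & g_{m2} & -(g_{m2} + g_{d2}) & g_{d2}
\end{bmatrix}
\quad \text{(rows and columns: nodes 1, 2, 3, 4)}
\qquad (5)
$$

The overall admittance matrix for Fig.2 is written as

$$
Y =
\begin{bmatrix}
g_s + G_{G1} + G_F & 0 & -(g_s + G_{G1}) & -G_F \\
g_{m1} & g_{d1} + G_{D1} + G_{G2} & -(g_{m1} + g_{d1} + G_{D1} + G_{G2}) & 0 \\
-(g_{m1} + g_s + G_{G1}) & -(g_{d1} + g_{m2} + G_{D1} + G_{G2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} + g_s + G_{G1} + G_{D1} + G_{G2} + G_L & -(g_{d2} + G_L) \\
-G_F & g_{m2} & -(g_{m2} + g_{d2} + G_L) & g_{d2} + G_L + G_F
\end{bmatrix}
\qquad (6)
$$

Equation (6) represents the Floating Admittance Matrix [3], [4], [5] of the two-stage common source amplifier.
Now from (6) the input impedance of the circuit in Fig.2 can be expressed as [1]-[3]

$$
Z_{in} = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
{(g_{g1} + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_L + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\qquad (7)
$$

Similarly, its output impedance and voltage gain can be expressed as [1]-[3]

$$
Z_{out} = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{g1} + G_G + G_F)}
{(g_{g1} + g_s + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\qquad (8)
$$

$$
A_V\Big|^{43}_{13} = \mathrm{Sgn}(4-3)\,\mathrm{Sgn}(1-3)\,(-1)^{11}\,\frac{Y^{13}_{43}}{Y^{13}_{13}}
$$

$$
A_V = \frac{g_{m1} g_{m2} + G_F\,(g_{d1} + G_D + G_G)}{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
\qquad (9)
$$

VERIFICATION ON MATLAB

The values of $Z_{in}$, $Z_{out}$, and $A_V|^{43}_{13}$ for different values of source conductance and load conductance (0 mS, 1 mS, and 2 mS) have been programmed through MATLAB. The outputs of the MATLAB programs have been plotted for $Z_{in}$, $Z_{out}$, and $A_V|^{43}_{13}$ with respect to the feedback conductance, GF.
If we assume that the two MOSFETs of Fig. 2 are properly biased to yield the same values of their internal parameters (gd1 = gd2 and gm1 = gm2), then for plotting the on demand values of the simulated input and output resistances, typical values of the external parameters along with the internal parameters can be given as: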


gd1 = gd2 = 0.1 mS, gm1 = gm2 = 5 mS, GL = GD = 1 mS, GG1 = GG2 = GG = 0.001 mS, gg1 = gg2 = 0.0001 mS, GF = variable (0 mS to 0.15 mS).

The plots show that the input and output resistances can be set on demand: the simulated input and output resistance can take any value, both negative and positive, controlled by the feedback conductance connected between the two stages of the amplifier.
The plot of input resistance as a function of feedback conductance is shown in Figs. 3, 4, and 5 for 0 S, 1 mS and 2 mS of load conductance respectively, as per (7).
The following observations are recorded from the plots in Figs. 3, 4 and 5:
a) For GL = 0 S, the input resistance is almost constant (1.148e+06 Ω) from the initial values of GF until GF reaches 2.7520e-05 mS; thereafter the input resistance begins to rise exponentially (from 1.148e+06 Ω to 4.837e+06 Ω) for a 2.7520e-05 mS to 2.7523e-05 mS variation in GF. It is interesting to note that Ri suddenly jumps down (from 4.837e+06 Ω to -6.828e+07 Ω) for a 2.7523e-05 mS to 2.7524e-05 mS variation in GF. Ri then increases suddenly to -4.237e+06 Ω as GF approaches 2.7525e-05 mS, after which the curve increases linearly (from -4.237e+06 Ω to -1.473e+06 Ω) from GF = 2.7525e-05 mS to GF = 2.7527e-05 mS, and Ri remains constant thereafter at -1.473e+06 Ω for higher values of GF.
b) For GL = 1 mS, the input resistance is almost constant at 3.289e+05 Ω from the initial values of GF until GF reaches 0.0004036 mS; thereafter Ri increases linearly (from 3.289e+05 Ω to 4.393e+07 Ω) from GF = 0.0004036 mS to GF = 0.0004038 mS and suddenly jumps down (to -7.805e+06 Ω) as GF reaches 0.00040381 mS. Ri then rises again (from -7.805e+06 Ω to -6.729e+05 Ω) from GF = 0.00040381 mS to GF = 0.0004039 mS, and remains constant thereafter at -6.729e+05 Ω for higher values of GF.
c) For GL = 2 mS, the input resistance rises exponentially (from 216.5 Ω to 3331 Ω) from GF = 0.0001 mS to GF = 0.0011 mS, then suddenly jumps down to Ri = -4418 Ω at GF = 0.0012 mS and again rises exponentially (to -225.4 Ω) until GF = 0.002 mS, remaining constant thereafter at -225.4 Ω for higher values of GF.

Fig.3 Input resistance as a function of feedback conductance for GL = 0 S

Fig.4 Input resistance as a function of feedback conductance for GL = 1 mS

Fig.5 Input resistance as a function of feedback conductance for GL = 2 mS

The plot of output resistance as a function of feedback conductance (GF) is shown in Figs. 6, 7, and 8 for 0 S, 1 mS and 2 mS of source conductance respectively, as per (8).
The following observations are recorded from the plots in Figs. 6, 7 and 8:
a) For gs = 0 S, the output resistance is almost constant (1.735e+04 Ω) from the initial values of GF until GF reaches 2.752e-05 mS; thereafter the output resistance rises exponentially (from 1.735e+04 Ω to 5.452e+04 Ω) for a 2.7520e-05 mS to 2.7522e-05 mS variation in GF. It is interesting to note that Ro suddenly jumps down (from 5.452e+04 Ω to -7.697e+05 Ω) for a 2.7522e-05 mS to 2.75242e-05 mS variation in GF. Ro then increases suddenly to -4.776e+05 Ω as GF reaches 2.75262e-05 mS, then increases exponentially (from -4.776e+05 Ω to -1.252e+04 Ω) from GF = 2.75262e-05 mS to GF = 2.753e-05 mS, and remains constant thereafter at -1.252e+04 Ω for higher values of GF.


b) For gs = 1 mS, the output resistance is almost constant at 237.9 Ω from the initial values of GF until GF reaches 0.03340 mS; thereafter Ro increases exponentially (from 237.9 Ω to 2829 Ω) from GF = 0.03340 mS to GF = 0.03341 mS and suddenly jumps down (to -7836 Ω) as GF reaches 0.033411 mS. Ro then rises again (from -7836 Ω to -22.83 Ω) from GF = 0.033411 mS to GF = 0.0335 mS, and remains constant thereafter at -22.83 Ω for higher values of GF.
c) For gs = 2 mS, the output resistance rises exponentially (from 0.805 Ω to 39.85 Ω) from GF = 0.09 mS to GF = 0.1 mS, then suddenly jumps down to Ro = -1.028 Ω at GF = 0.11 mS and remains constant thereafter at -1.028 Ω for higher values of GF.

Fig.6 Output resistance as a function of feedback conductance for GS = 0 S

Fig.7 Output resistance as a function of feedback conductance for GS = 1 mS

Fig.8 Output resistance as a function of feedback conductance for GS = 2 mS

The plot of voltage gain as a function of feedback conductance is shown in Figs. 9 and 10 for 0 S, 1 mS and 2 mS of load conductance respectively, as per (9).
The plots in Figs. 9 and 10 reveal that the voltage gain (AV) is an inverse function of the feedback conductance (GF); further, the voltage gain decreases as the value of source conductance (gs) increases, due to their inverse relationship given by (9).

Fig.9 Voltage gain as a function of feedback conductance for GL = 0 S

Fig.10 Voltage gain as a function of feedback conductance for GL = 1 mS and 2 mS

CONCLUSION
The plots in Figs. 3 to 8 reveal a region of very sudden change in the values of input resistance and output resistance, from very high positive values to large negative values, for a very small change of the order of 10^-5 in the value of the feedback conductance, GF. This is the zone of very high variation in input and output resistances, both negative and positive, which can be used for compensation of resistances to obtain a very high Q-factor in lossy networks.
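The location of the sudden positive-to-negative jump in the input resistance can be checked directly from the closed-form expression (7). The minimal Python sketch below is not the authors' MATLAB program: the function and parameter names are ours, the conductance values are the typical ones listed in the paper (converted to siemens), and (7) is used as read here, with a minus sign between the product term and the G_F bracket in the denominator. The jump is located by bisection on the sign of that denominator.

```python
def denominator(gf, gl, gm=5e-3, gd=0.1e-3, g_d=1e-3, g_g=0.001e-3, gg=0.0001e-3):
    """Denominator of expression (7); all conductances in siemens.

    gm, gd, gg -- internal MOSFET parameters (same for both devices)
    g_d, g_g   -- external drain and gate bias conductances GD and GG
    gf, gl     -- feedback and load conductances GF and GL
    """
    a = gg + g_g + gf
    b_m = gd + gg + gm + g_d + g_g
    c = gd + gl + gf
    return a * b_m * c - gf * (gm * gm + b_m * gf)


def input_resistance(gf, gl, gm=5e-3, gd=0.1e-3, g_d=1e-3, g_g=0.001e-3, gg=0.0001e-3):
    """Input resistance per (7), in ohms."""
    numerator = (gd + gg + g_d + g_g) * (gd + gl + gf)
    return numerator / denominator(gf, gl, gm, gd, g_d, g_g, gg)


def transition_gf(gl, lo=0.0, hi=0.15e-3, iterations=60):
    """Feedback conductance (S) at which the denominator of (7) changes sign."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if denominator(mid, gl) > 0.0:
            lo = mid  # still on the positive-resistance side
        else:
            hi = mid  # already on the negative-resistance side
    return 0.5 * (lo + hi)


for gl in (0.0, 1e-3, 2e-3):
    gf_star = transition_gf(gl)
    print(f"GL = {gl * 1e3:.0f} mS: sign change near GF = {gf_star * 1e3:.5e} mS")
```

Under this reading of (7), the sign change falls close to the transition points described in the observations for Figs. 3 to 5 (about 2.752e-05 mS for GL = 0 S and about 0.0004038 mS for GL = 1 mS, and between 0.0011 mS and 0.0012 mS for GL = 2 mS). The sketch only locates the jump; it does not attempt to reproduce the plotted resistance values.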

REFERENCES
[1] Wai-Kai Chen, On second order cofactors and null return difference in feedback amplifier theory, International Journal of Circuit Theory and Applications, Vol. 6, Issue 3, pp. 305-312, Dec. 2006.
[2] Otso Juntunen, A two port S-parameter data transformation, Circuit Theory Laboratory Report Series, CT-35, Helsinki University of Technology, Finland, Espoo 1998.
[3] B.P. Singh, Unified approach to electronics circuit analysis, IJEEE, pp. 276-285, July 1978.
[4] B.P. Singh, Active bridge for measurement of admittance parameters of the transistors, Indian Journal of Pure and Applied Physics, Vol. 15, pp. 783-786, Nov. 1976.
[5] B.P. Singh, A new active bridge for measuring FET parameters, J Phys. E. Scientific Instrument, Vol. II, pp. 667-670, 1978.
[6] Jacob Millman and Christos C. Halkias, Integrated Electronics, Analog and Digital Circuits and Systems, Tata McGraw-Hill publication, pp. 471-475, 2004.
[7] B.P. Singh, Meena Singh, Sanjay Kumar Roy and S.N. Shukla, Mathematical Modeling of Electronic Devices and its integration, Proceedings of National Seminar on Recent Advances on Information Technology, Allied Publishers Pvt. Ltd., Indian School of Mines Dhanbad University, pp. 494-502, Feb. 6-7, 2009.
[8] B.P. Singh, Arun Kumar Singh, Verification of transfer functions of BJT obtained by using MATLAB, Proceedings of IEEE National Symposium on Innovative Development in Electronics Arena, Arya College of Engineering, pp. 92-96, Dec. 12, 2009.


Performance Analysis and Comparison of PFSCL and MCML

Kirti Gupta, Ranjana Sridhar, Jaya Chaudhary

DTU (formerly Delhi College of Engineering)

ABSTRACT:
CML, or current mode logic, is a differential logic style which offers high noise immunity and high speed of operation. In this paper we compare the performance of PFSCL (positive feedback source coupled logic) with MCML (MOS current mode logic), both of which are derivatives of the CML style. We show through simulations on OrCAD PSpice using 0.18 µm technology that PFSCL offers significant advantages over MCML in terms of power consumption, area occupied and propagation delay.

Due to the growing market for digital signal processing and optical communication applications, commercial interest in high resolution mixed signal ICs has been growing. In mixed signal ICs the analog and the digital blocks are integrated on the same base, and hence the resolution of the analog block is limited by the dynamic switching noise produced by the digital block. Hence the CMOS logic style is not suitable, as it suffers from dynamic switching noise. Also, for CMOS the advantage of having zero static power consumption is lost when it is used at frequencies of hundreds of MHz to GHz. Several other logic styles have been proposed to reduce the dynamic switching noise in mixed signal ICs, such as in [2], [3] and [4]. The CML style offers an advantage in robustness to switching noise as compared to the CMOS logic style [1]. Also, at high frequencies (hundreds of MHz to GHz range) the CML style is more power efficient than CMOS logic [7]. This type of logic was first implemented using bipolar transistors [5] and later extended for application with MOS transistors. It has lower power consumption than ECL but is slower than ECL.

MCML is an extension of Current Mode Logic where a MOSFET is used as the transistor instead of a BJT. A constant current source is used to bias the differential pair of transistors, which switches the current from one transistor of the pair to the other depending upon the applied input. The differential operation suppresses the noise coupled with the signal inputs.

PFSCL is a new logic style which introduces positive feedback into single ended MCML gates [7]. This eliminates the need for a complementary second input signal while still maintaining the differential mode of operation.
In the following, the operation of MCML gates is explained in section II. The architecture of PFSCL and its operation is addressed in section III. In section IV, the results of the comparison between the performance of PFSCL and MCML are presented, together with the simulation results.

II MCML GATES
To understand the operation and the unique properties of MCML, we consider the simple case of an inverter and look at different configurations for its construction [8].

Inverters can be implemented using transistors operating as voltage controlled switches. The simplest configuration is as shown in the figure below:

[from ref 9]

When vi is low the switch is open and vo = vdd, since no current flows through the resistance R. When vi is high the switch is closed and vo = 0.

We can modify the above configuration by using a pair of complementary switches called PU and PD.

[from ref 9]

The PU switch connects the output node to vdd and the PD switch connects the output to ground. When vi is low, the PU switch is closed and the PD switch is open, establishing vo = vdd. If vi is then raised to logic high, the PU switch opens while the PD switch closes, establishing vo = 0. This circuit constitutes the basis of the CMOS inverter.

The third type of configuration can be implemented using a double-throw switch as shown below:

[from ref 9]

The switch is used to steer the constant current IEE into one of the two resistors connected to the positive supply VCC. If a logic high applied at vi results in the switch being connected to Rc1, then a logic inversion function is realized at vo1. This current steering is the basis for current mode logic circuits.

PRINCIPLE OF OPERATION AND STATIC MODEL OF MCML:

MCML is a dual rail logic circuit which uses both the applied input and its complement as an input pair. The schematic is made up of an NMOS source coupled pair in which the transistors work in saturation or cutoff. Here we consider a resistive load; however, different types of loads can be used, such as an active PMOS load. The total current IT is steered to one of the two branches and is converted to a differential output voltage by the two resistors RD1 and RD2. M1 and M2 constitute a differential pair.

If VGS (M2) is higher than VGS (M1), then current ID2 exceeds current ID1. Therefore, the output voltage Vo2 begins to drop until it reaches steady state. The output voltage swing Vswing is defined as the voltage difference between Vo1 and Vo2 at steady state.

The differential output voltage Vo is equal to

Vo = Vo1 - Vo2 = RD (iD1 - iD2)

The voltage swing is defined as the difference in the output voltage between the cutoff and saturation conditions and is given by

Vswing = RD · IT

The small signal gain Av of an MCML gate with matched gm, for a single ended output, is given by:

Av = gm · RD / 2

The noise margin is given by:

NM = (Vswing / 2)(1 - √2 / Av)

where Av >> 1/√2 was assumed [9].

The delay associated with an SCL gate is given by:

τ = 0.69 · RD · Cout

where Cout is the net parasitic capacitance at the output.
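
The static MCML expressions in this section can be evaluated numerically. The following is a minimal sketch in Python; the function name and the operating point (IT = 100 µA, RD = 4 kΩ, gm = 2 mS, Cout = 0.1 pF) are illustrative assumptions and are not values taken from the simulations in this paper.

```python
import math


def mcml_static_model(i_t, r_d, g_m, c_out):
    """Evaluate the static MCML expressions for one operating point.

    i_t   : tail (bias) current IT in amperes
    r_d   : load resistance RD in ohms
    g_m   : transconductance of the matched pair in siemens
    c_out : net parasitic capacitance at the output in farads
    """
    v_swing = r_d * i_t                      # Vswing = RD * IT
    a_v = g_m * r_d / 2                      # Av = gm * RD / 2
    noise_margin = (v_swing / 2) * (1 - math.sqrt(2) / a_v)
    delay = 0.69 * r_d * c_out               # tau = 0.69 * RD * Cout
    return {
        "v_swing": v_swing,
        "a_v": a_v,
        "noise_margin": noise_margin,
        "delay": delay,
    }


# Hypothetical operating point, chosen only to illustrate the formulas.
model = mcml_static_model(i_t=100e-6, r_d=4e3, g_m=2e-3, c_out=0.1e-12)
for name, value in model.items():
    print(f"{name}: {value:.4g}")
```

With these assumed values the swing is 0.4 V, which matches the swing used later in the simulation section, but the gain and delay figures are only as meaningful as the assumed gm and Cout.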


III PFSCL
In PFSCL, the MCML logic style is modified to include positive feedback from the drain of M1 (vo1) to the gate of M2, the second transistor of the differential pair [10].

STATIC BEHAVIOUR OF PFSCL GATE:

The bias current Iss is steered through either M1 or M2 depending on the input signal vin. The transistors M1 and M2 operate in cutoff or in saturation depending on vin. The logic high voltage level is Vdd and the logic low level is Vdd - Iss·Rd. Hence, PFSCL has the same Vswing as MCML.

The small-signal circuit around a given bias point can be represented as in the above figure, where the source voltage vx is calculated by superposition of the voltages vin and vout at the gates of M1 and M2, observing that the voltage gain between the gate and the source of M1 is equal to gm1/(gm1 + gm2) and that of M2 is gm2/(gm1 + gm2). For gm1 = gm2 = gmn, calculating Av from the small signal equivalent circuit of PFSCL gives us:


Av = (gmn·Rd/2) / (1 - gmn·Rd/2)

From this expression we see that a very high value of closed loop gain is achieved for gmn·Rd/2 tending to 1.

We see that the factor gm1·gm2/(gm1 + gm2) reduces significantly outside the transition region around VLT and, due to the high sensitivity of Av to this factor, the closed loop gain rapidly reduces to zero outside the transition region. Hence, the DC transfer characteristic has a slope that sharply tends to zero outside the transition region and can be approximated by a three segment piecewise linear function with slope -Av around VLT and zero slope for the other two segments.

Figure: Piecewise linear approximation of DC transfer characteristics of PFSCL gates

Due to positive feedback the expression for the small signal voltage gain Av changes, and the noise margin becomes

NM = (Vswing/2)(1 - 1/Av)

From the expression for NM we see that the noise margin is lower than half of Vswing and tends to it for high values of Av (i.e. gmn·Rd/2 → 1).
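
The effect of the positive feedback on the required gmn·Rd product can be checked with a short calculation. The sketch below, in Python, inverts the two gain expressions (Av = gm·RD/2 for MCML and Av = x/(1 - x) with x = gmn·Rd/2 for PFSCL) and evaluates the PFSCL noise margin. The function names are hypothetical and the numbers are illustrative only.

```python
def mcml_gm_rd_for_gain(a_v):
    """gm*Rd needed by an MCML gate: Av = gm*Rd/2  =>  gm*Rd = 2*Av."""
    return 2 * a_v


def pfscl_gain(gm_rd):
    """Closed loop PFSCL gain Av = x / (1 - x), with x = gmn*Rd/2 < 1."""
    x = gm_rd / 2
    if x >= 1:
        raise ValueError("gmn*Rd/2 must stay below 1 for this expression")
    return x / (1 - x)


def pfscl_gm_rd_for_gain(a_v):
    """Invert Av = x/(1-x): x = Av/(1+Av), so gmn*Rd = 2*Av/(1+Av)."""
    return 2 * a_v / (1 + a_v)


def pfscl_noise_margin(v_swing, a_v):
    """NM = (Vswing/2) * (1 - 1/Av)."""
    return (v_swing / 2) * (1 - 1 / a_v)


# Illustrative comparison at the gain and swing used in the simulations.
target_gain = 6.0
print("MCML  gm*Rd :", mcml_gm_rd_for_gain(target_gain))
print("PFSCL gmn*Rd:", pfscl_gm_rd_for_gain(target_gain))
print("PFSCL NM (V):", pfscl_noise_margin(0.4, target_gain))
```

For a target gain of 6 the MCML expression calls for gm·Rd = 12, while the PFSCL expression is satisfied with gmn·Rd below 2, which is the quantitative content of the statement that gmn·Rd/2 only has to approach 1.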

SUMMARY FROM THE ABOVE SECTIONS ON PFSCL AND MCML

From the expressions for Av, voltage swing and NM (noise margin) we see that the PFSCL topology offers advantages with respect to MCML:

1) Keeping all design parameters (voltage swing, biasing voltages and noise margin) constant, PFSCL achieves the same gain for lower values of gmn and Rd.

2) A lower Rd implies an area saving.

3) From the expression for Av, we see that increasing Av in MCML requires a proportional increase in gmn, or in other words in the width of transistors M1 and M2, or in the value of Rd.

4) As gmn depends directly on Iss, an increase in gmn requires an increase in the gate voltage of the transistors implementing the constant current source, or an increase in (W/L).

5) That is, an increase in power or area is required.

6) For PFSCL, by contrast, we only have to satisfy the relation gmn·Rd/2 → 1, which can be easily achieved.

7) The reduction in area of the NMOS transistors for a particular value of Vswing and Av leads to a decrease in the associated parasitic capacitances.
   a. This gives PFSCL a speed advantage over MCML circuits.

8) This increase in speed can be utilised for certain applications. It can also be traded off for a corresponding decrease in power supply voltage, which is required in low power design.

IV COMPARISON OF MCML AND PFSCL LOGIC GATES PERFORMANCE

In this section we present the results of simulations carried out on PFSCL and MCML gates. The simulations were carried out on Orcad PSPICE using a 180 nm BSIMv3 MOS model.

The values of the circuit parameters (voltage gain, gate bias voltage and voltage swing) were taken within the range used in practical applications. pMOS loads were used in both PFSCL and MCML.

For simulation purposes, the voltage swing was taken to be 400 mV, which is within the practical range of 350 mV to 650 mV. The value of the voltage gain Av is generally between 2 and 10. Simulations have been performed using Av = 2 and Av = 6. The Cload value is taken as 0.1 pF. All the results are presented for an input signal frequency of 500 MHz, with an input swing of 1.4 V to 1.8 V.

[Figure: Area required (W1+W2+W3) vs. bias current Iss for MCML and PFSCL, for Av = 2 and Vswing = 0.4 V]

This graph shows that as the bias current increases, the area occupied by MCML increases at a faster rate than the area occupied by PFSCL. The advantage in area also leads to a decrease in the associated parasitic capacitance, which in turn causes the PFSCL gate to be faster than an MCML gate.

[Figure: Propagation delay t_delay vs. bias current Iss for PFSCL and MCML]

This graph shows the advantage of the PFSCL gate over MCML in terms of speed of operation (for Av = 6, Vswing = 0.4 V and Cload = 0.1 pF). This enables the extension of the CML architecture into the GHz frequency range.

Monte Carlo simulations were also carried out on the PFSCL and MCML gates to determine the robustness of each logic style to process variations (e.g. tox) and to variations in Vth (the threshold voltage of the MOS transistors). From the simulation results it was found that PFSCL was more robust, and that its robustness increases as the bias current increases.


REFERENCES:
1) D. Allstot, S. Chee, S. Kiaei, and M. Shristawa, “Folded source-coupled logic vs. CMOS static logic for low-noise mixed-signal ICs,” IEEE Trans. Circuits Syst. I, vol. 40, pp. 553–563, Sept. 1993.
2) S. Kiaei, S. Chee, and D. Allstot, “CMOS source-coupled logic for mixed-mode VLSI,” in Proc. Int. Symp. Circuits Systems, 1990, pp. 1608–1611.
3) J. Kundan and S. Hasan, “Enhanced folded source-coupled logic technique for low-voltage mixed-signal integrated circuits,” IEEE Trans. Circuits Syst. II, vol. 47, pp. 810–817, Aug. 2000.
4) J. Kundan and S. Hasan, “Current mode BiCMOS folded source-coupled logic circuits,” in Proc. ISCAS, June 1997, pp. 1880–1883.
5) P. Gray, P. Hurst, S. Lewis, and R. Meyer, Analysis and Design of Analog Integrated Circuits, 4th ed. New York: John Wiley & Sons, 2000.
6) M. Alioto, “Design of nanometer MOS Current Mode Logic (MCML): basic concepts and perspectives” (lecture), 2007.
7) M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, “Modeling and Evaluation of Positive-Feedback Source-Coupled Logic,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 12, Dec. 2004.
8) A. Sedra and K. Smith, Microelectronic Circuits, Oxford.
9) M. Alioto and G. Palumbo, “Design strategies for source coupled logic gates,” IEEE Trans. Circuits Syst. I, vol. 50, pp. 640–654, May 2003.
10) M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, “Modeling and Evaluation of Positive-Feedback Source-Coupled Logic,” IEEE Transactions on Circuits and Systems I, vol. 51, no. 12, Dec. 2004.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

Comparative Study of Fast Adders using VHDL and FPGA

Nishi Chandra, ET Deptt., H.B.T.I., Kanpur
Rajani Bisht, Associate Professor, ET Deptt., H.B.T.I., Kanpur

Abstract: Adders are among the most widely used components in integrated circuits and are found in a wide variety of electronic applications. The major challenge for a VLSI designer is to reduce chip area; the next is to increase the speed of operation. Therefore, various adders such as the ripple carry adder, carry lookahead adder, carry select adder etc. are compared, using VHDL for their description. The comparative study used Xilinx 9.2i as the synthesis tool, Xilinx ISE Simulator as the simulation tool and the FPGA Spartan-II kit for the implementation of these adders. In this comparison study, area and delay reports are generated for these adders, and the VHDL codes can also be implemented on the FPGA Spartan-II kit.

Introduction

Adders are among the most widely used components in integrated circuits, so designing efficient adders has been a goal of research in VLSI design. Addition is a crucial arithmetic function for most digital systems. Various adder structures, such as serial and parallel structures, can be used to execute addition. Adders are used not only for addition, but also for other operations such as subtraction, multiplication, division, and address computation. They are commonly used in electronic applications such as digital signal processing, where adders are used to perform algorithms like FIR, IIR etc. [1]. In the past, the major challenge for the VLSI designer was to reduce chip area by using efficient optimization techniques.

Apart from aiding a designer in selecting an adder with favorable characteristics, the aim is to provide insight into design tradeoffs that can save power and enhance performance. The adders studied include linear time ripple carry and Manchester carry chain adders, carry skip and carry select adders, the carry lookahead adder and its variations, and carry-save adders.

Several researchers have worked on the performance analysis of adders, and others on the performance analysis of multipliers. A lot of research is also going on to reduce power consumption. There are therefore three performance parameters on which a VLSI designer has to optimize a design: area, speed and power [2]. It is very difficult to satisfy all constraints for a particular design, so some compromise between constraints has to be made depending on the demand or application.

Hence, VHDL codes have been formulated for these fast adders, and Xilinx 9.2i is used as the synthesis tool to get the area and delay reports. In addition, Xilinx ISE Simulator is used for simulation and the FPGA Spartan-II kit is used for implementation.

Fast Parallel Adders

Ripple Carry Adder (RCA)

It is possible to create a logical circuit using multiple full adders to add N-bit numbers. Each full adder takes as input a Cin, which is the Cout of the previous adder. This kind of adder is a ripple carry adder, since each carry bit "ripples" to the next full adder. A ripple carry adder is designed by cascading full adders in series, i.e. the carry from the previous full adder is connected as the input carry of the next stage. The full adder is the basic building block of the ripple carry adder. The major limitation of the ripple carry adder is that delay increases with bit length. Therefore, the ripple carry adder is not suitable if a large number of bits are to be added.

The two Boolean functions for the sum and carry are:

SUM = Ai ⊕ Bi ⊕ Ci (i)
Cout = Ci+1 = Ai · Bi + (Ai ⊕ Bi) · Ci (ii)

Fig 1. Ripple carry adder
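
The ripple carry scheme of equations (i) and (ii) can be modelled at bit level in a few lines. The following Python sketch is a behavioural illustration only; it is not the VHDL code used in this study, and the function names are hypothetical.

```python
def full_adder(a, b, c_in):
    """One-bit full adder built from equations (i) and (ii)."""
    s = a ^ b ^ c_in                      # SUM  = Ai xor Bi xor Ci
    c_out = (a & b) | ((a ^ b) & c_in)    # Cout = Ai.Bi + (Ai xor Bi).Ci
    return s, c_out


def ripple_carry_add(x, y, width=16, c_in=0):
    """Add two unsigned integers by chaining `width` full adders.

    Returns (sum modulo 2**width, carry out of the last stage).
    """
    carry = c_in
    total = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)   # carry ripples to the next stage
        total |= s << i
    return total, carry


print(ripple_carry_add(1234, 4321))    # no overflow in 16 bits
print(ripple_carry_add(0xFFFF, 1))     # carry ripples through all 16 stages
```

The loop makes the delay limitation visible: the carry for stage i is not available until stage i-1 has finished, so the number of sequential steps grows linearly with the word length.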

Condition Carry Adder (CCA)

This adder computes the sum and carry depending upon the status of the previous carry, i.e.

1. If ci = 0 then Sout = ai xor bi & ci+1 = ai and bi (iii)

2. If ci = 1 then Sout = ai xnor bi & ci+1 = ai or bi (iv)

The adder does not compute the sum and carry directly by using a full adder.

Fig 2. Condition carry adder

Carry Lookahead Adder (CLA)

In the lookahead carry algorithm, the carry for the next stages is calculated in advance based on the input signals. As a result, this algorithm speeds up the addition. If "X" and "Y" are the two inputs, "ci" is the initial carry, and "sout" and "cout" are the output sum and carry respectively, then the Boolean expressions for calculating the next carry and the sum are [3]:

Pi = xi xor yi --- Carry Propagation (v)
Gi = xi and yi --- Carry Generate (vi)
Ci+1 = Gi or (Pi and Ci) --- Next Carry (vii)

Fig 3. Carry lookahead adder

The two Boolean functions for the sum and carry are as follows [1]:

SUM = Ai ⊕ Bi ⊕ Ci (viii)
Cout = Ci+1 = Ai · Bi + (Ai ⊕ Bi) · Ci (ix)

Manchester Carry Chain Adder

The Manchester adder is also a type of carry lookahead adder. In the Manchester adder there is a slight modification in calculating the next carry to be propagated: instead of using the Boolean expression

Ci+1 = Gi + Ci.Pi

to calculate the next carry, the Manchester carry adder uses the expression:

Ci+1 = Gi + Ci.ti (x)
ti = Xi + Yi (xi)

Thus the carry recurrence can be written in terms of ti instead of Pi, which leads to a slightly faster adder because in binary addition ti is easier to produce than Pi (OR instead of XOR).

Conventional Carry Skip Adder (CSKA)

In an N-bit ripple carry adder the carry has to propagate through all N stages, which results in a large delay in performing binary addition. In a carry skip adder, on the other hand, it is possible to skip the carry over a group of n bits. This results in less delay compared to the ripple carry adder. The logic used for the carry skip is shown in the figure below and is also evident from the equation:

P(0:3) <= ((x(0) or y(0)) and (x(1) or y(1)) and (x(2) or y(2)) and (x(3) or y(3))); (xii)
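
The generate/propagate carry recurrence, and the claim that the OR-based signal ti can replace the XOR-based Pi in it, can be checked with a small behavioural model. The Python sketch below evaluates the recurrence bit by bit for clarity; a hardware lookahead unit would expand the same recurrence into parallel logic. The function names and the `use_or_propagate` flag are assumptions made for this illustration.

```python
def carries(x, y, width=16, c_in=0, use_or_propagate=False):
    """Return [C0, C1, ..., Cwidth] from Ci+1 = Gi or (Pi and Ci).

    With use_or_propagate=True the Manchester form ti = xi or yi
    is used in place of Pi = xi xor yi.
    """
    c = [c_in]
    for i in range(width):
        xi = (x >> i) & 1
        yi = (y >> i) & 1
        g = xi & yi                                    # carry generate
        p = (xi | yi) if use_or_propagate else (xi ^ yi)
        c.append(g | (p & c[i]))                       # next carry
    return c


def lookahead_add(x, y, width=16, c_in=0):
    """Form the sum bits as Pi xor Ci from the precomputed carries."""
    c = carries(x, y, width, c_in)
    total = 0
    for i in range(width):
        p = ((x >> i) ^ (y >> i)) & 1
        total |= (p ^ c[i]) << i
    return total, c[width]


# Both propagate definitions give the same carries for every 4-bit input pair.
same = all(
    carries(a, b, 4, ci) == carries(a, b, 4, ci, use_or_propagate=True)
    for a in range(16) for b in range(16) for ci in (0, 1)
)
print("XOR and OR propagate agree:", same)
print(lookahead_add(1234, 4321))
```

The two forms agree because they differ only when xi = yi = 1, and in that case Gi = 1 already forces the next carry to 1.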
Fig 4. Conventional carry skip adder

Modified Carry Skip Adder (CLSKAs)

In the conventional carry skip adder, each block consists of a ripple carry adder, and skip logic is used after each block to generate the carry for the next block. The speed of operation is affected by the method of carry propagation from one block to the next [4]. In CLSKAs, by contrast, a carry lookahead scheme is used in each block to generate the carry for the next block. As a result there is better performance in terms of speed, as the lookahead carry adder is faster than the ripple carry adder [5]. Fig 5 shows a modified CLSKA with a fixed block size of 4 bits.

Fig 5. Modified carry skip adder

Carry Select Adder (CSA)

In the carry select adder, the sum is calculated by assuming the input carry from the previous stage. One adder calculates the sum assuming an input carry of 0 while the other calculates the sum assuming an input carry of 1 [6]. Then the actual carry drives a multiplexer that selects the appropriate sum. Fig 6 shows the schematic block diagram of a 16-bit carry select adder, which consists of 4 blocks, each a 4-bit lookahead carry adder. The carry output of each block is fed into the next block as its input carry.

Fig 6. Carry select adder

Carry Save Adder (CSA)

In the carry save adder, if the sum of two 16-bit binary numbers is to be computed, 16 half adders are used at the first stage instead of 16 full adders. The carry save unit therefore consists of 16 half adders, each of which computes a single sum and carry bit based only on the corresponding bits of the two input numbers. It is used to compute the sum of three or more n-bit binary numbers. This adder has the same structure as a full adder. Let x and y be two 16-bit numbers, producing the partial sum and carry sequences s and c:

Si = xi xor yi (xiii)
Ci = xi and yi (xiv)

The final addition is then computed by:

1. Shifting the carry sequence C left by one place.

2. Placing a 0 at the front (MSB) of the partial sum sequence S.

3. Finally, using a ripple carry adder to add these two together and compute the resulting sum.

Fig 7. Carry save adder
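
The carry save procedure for two operands (bitwise partial sum and carry, shift of the carry sequence, then a final addition) can be sketched as follows in Python. This is a behavioural illustration under the assumption that Python's built-in integer addition stands in for the final ripple carry adder; the function name is hypothetical.

```python
def carry_save_add(x, y, width=16):
    """Two-operand carry save addition following the three listed steps."""
    mask = (1 << width) - 1
    x &= mask
    y &= mask

    s = x ^ y          # partial sum bits:  Si = xi xor yi
    c = x & y          # carry bits:        Ci = xi and yi

    shifted_c = c << 1  # step 1: shift the carry sequence left by one place
    # Step 2 (placing a 0 in front of S) is implicit: Python integers are
    # already zero-extended, so S can be read as a (width + 1)-bit value.
    return s + shifted_c  # step 3: final addition (stand-in for the RCA)


print(carry_save_add(1234, 4321))
print(hex(carry_save_add(0xFFFF, 0xFFFF)))
```

The result is (width + 1) bits wide, which is why the partial sum needs the extra leading 0 before the final addition.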
RESULTS AND ANALYSIS

The adders, namely the ripple carry adder, carry lookahead adder, Manchester carry chain adder, carry select adder, carry save adder, condition carry adder, conventional carry skip adder and modified carry skip adder, have been designed using VHDL (Very High Speed Integrated Circuit Hardware Description Language) for 16-bit unsigned data. In order to demonstrate the performance of these adders, they are compared on the basis of their delays and area occupied. The delay and area reports are generated for these adders.

To get the delay and area reports, the following tools are used:
1. Xilinx 9.2i is used as the synthesis tool.
2. Xilinx ISE Simulator is used for simulation.
3. FPGA Spartan-II is used for implementation.

The delay and area reports of the adders are generated with the help of the synthesis tool, i.e. Xilinx 9.2i. The VHDL codes formulated for these adders are first simulated using the Xilinx ISE Simulator and then synthesized using the synthesis tool. After the synthesis process, the synthesis tool generates a synthesis report, which provides the propagation delay and the number of 4-input LUTs used by the design out of the total number of LUTs.

Further, after being simulated and synthesized, the VHDL codes of the adders can be implemented on the FPGA kit by downloading the design onto the kit. For this, the codes are converted into the design format (i.e. the programming file) to be downloaded onto the kit.

The delay and area reports generated for these adders are given in tabular form in table 1.

ADDERS (with fixed block size = 4 bit)   DELAY (ns)   LUTs (out of 1536)
Ripple Carry Adder                       32.997       32
Carry Lookahead Adder                    22.792       17
Manchester Carry Chain Adder             31.744       32
Carry Select Adder                       26.056       45
Carry Save Adder                         35.424       46
Condition Carry Adder                    33.378       32
Conventional Carry Skip Adder            17.636       43
Modified Carry Skip Adder                27.163       69

Table 1. Delay and area report of 16-bit fast adders

CONCLUSION

The delay and area reports generated by running the simulation and synthesis processes on the VHDL codes of the adders provide the performance analysis of these 16-bit adders. Comparing the delays in these reports, the conventional carry skip adder has the minimum propagation delay (17.636 ns) while occupying 43 LUTs out of the total 1536 LUTs on the Spartan II -XC3S50-5-TQ144 FPGA kit. The carry lookahead adder has the next lowest propagation delay (22.792 ns) and the lowest number of LUTs occupied on the FPGA kit (17 LUTs out of 1536 LUTs).

From the area and delay reports of these adders, it is observed that there are trade-offs between the performance parameters, i.e. area and delay. For a delay efficient design, the conventional carry skip adder is suitable, since it can skip the carry over a group of n bits. This results in less delay than the ripple carry adder in generating the output sum and the carry bit for the next block. The result is fast operation, but at the cost of a few more LUTs due to the carry skip logic.
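
The ranking stated in the conclusion can be reproduced directly from the figures in Table 1. The Python sketch below simply transcribes the table and sorts it; the variable names are illustrative.

```python
# (delay in ns, LUTs used) for each 16-bit adder, transcribed from Table 1.
TABLE_1 = {
    "Ripple Carry Adder": (32.997, 32),
    "Carry Lookahead Adder": (22.792, 17),
    "Manchester Carry Chain Adder": (31.744, 32),
    "Carry Select Adder": (26.056, 45),
    "Carry Save Adder": (35.424, 46),
    "Condition Carry Adder": (33.378, 32),
    "Conventional Carry Skip Adder": (17.636, 43),
    "Modified Carry Skip Adder": (27.163, 69),
}

by_delay = sorted(TABLE_1, key=lambda name: TABLE_1[name][0])
by_luts = sorted(TABLE_1, key=lambda name: TABLE_1[name][1])

fastest = by_delay[0]
smallest = by_luts[0]

print("Fastest adder :", fastest, TABLE_1[fastest])
print("Smallest adder:", smallest, TABLE_1[smallest])
for name in by_delay:
    delay_ns, luts = TABLE_1[name]
    print(f"{name:32s} {delay_ns:7.3f} ns  {luts:3d} LUTs")
```

Sorting on both columns makes the area/delay trade-off explicit: the fastest design is not the smallest one, and the modified carry skip adder uses the most LUTs without being the fastest.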
References

1. R.P.P. Singh, Parveen Kumar and Balwinder Singh, “Performance Analysis of fast adders using VHDL”, 2009 International Conference on Advances in Recent Technologies in Communication and Computing.

2. Nagendra, C.; Irwin, M.J.; Owens, R.M., “Area-time-power tradeoffs in parallel adders”, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Volume 43, Issue 10, Page(s): 689-702, 1996.

3. Hasan Krad and Aws Yousif Al-Taie, “Performance Analysis of a 32-Bit Multiplier with a Carry-Look-Ahead Adder and a 32-bit Multiplier with a Ripple Adder using VHDL”, Journal of Computer Science 4 (4): 305-308, 2008.

4. Wang, Y.; Pai, C.; Song, X., “The design of hybrid carry lookahead/carry-select adders”, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Volume 49, Page(s): 16-24, 2002.

5. Min Cha and Earl E. Swartzlander, Jr, “Modified Carry Skip Adder for reducing first block delay”, Proc. 43rd IEEE Midwest Symp. on Circuits and Systems, Lansing MI, Page(s): 346-348, 2000.

6. Behnam Amelifard, Farzan Fallah, Massoud Pedram, “Closing the gap between Carry Select Adder and Ripple Carry Adder: A new class of Low-power and High-performance Adders”, Proceedings of the Sixth International Symposium on Quality Electronic Design (ISQED’05), 2005.

CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

Organic Thin Film Transistor: Materials, Structures and Operational Parameters

Poornima Mittal (1), Brijesh Kumar (2), B. K. Kaushik (3), Y. S. Negi (4) and Krishna Raj (5)
(1) Electronics and Communication Engineering, Graphic Era University, Dehradun, INDIA
(3) Electronics and Computer Engineering, Indian Institute of Technology, Roorkee, INDIA
(2,4) Polymer Science and Technology Group, DPT, Indian Institute of Technology, Roorkee, INDIA
(5) Department of Electronics Engineering, H.B.T.I., Kanpur, INDIA
[poornima2822@gmail.com, brijesh2228@gmail.com, bkk23fec@iitr.ernet.in, ynegifpt@iitr.ernet.in, kraj_biet@yahoo.com]

ABSTRACT: The performance of Organic Thin Film Transistors (OTFTs) has improved markedly over the past few years, making them very attractive for a large range of applications such as oscillators, flexible display devices, and small scale, large scale and even integrated optoelectronic devices. A transistor that uses an organic semiconductor as the active layer to control electric current flow is known as an organic thin film transistor. Over the last decade, organic/polymeric materials have been extensively investigated for the substrate, conducting semiconductor layer, dielectric and contact electrodes of thin film transistor (TFT) devices. In an organic thin film transistor, the type of semiconductor, processing, doping and structure can affect the electrical characteristics. This paper presents new insight into the structure, organic materials, conduction mechanism and performance characteristics of OTFTs. Pentacene based bottom and top contact structures have been modelled to characterise the structures adopted for organic transistors. The paper explores the current status of OTFTs in terms of various parameters such as contact resistance, effect of channel length, active layer thickness and on/off current ratio. Organic electronic products are lighter, more flexible and less expensive than their inorganic counterparts. They are also biodegradable, being made from carbon. This opens the door to many exciting applications that would be impossible using silicon. Since OTFTs provide simple and low cost processes, their application to displays is also discussed.

Keywords: Bottom and Top Contact Structures of OTFTs, Contact Resistance, Mobility, Organic Materials, Organic Thin Film Transistors.

1. INTRODUCTION

Organic electronics has the potential to create a new range of devices, circuits and applications. Some important applications are display drivers, advertising boards, smart cards, wall sized televisions, identification tags, and portable products such as modern cell phones and video games [1]. Organic material based devices like the Organic Thin Film Transistor (OTFT), Organic Field Effect Transistor (OFET), Organic Light Emitting Diode (OLED) and solar cell have the advantages of low cost, flexibility and light weight over their inorganic counterparts. Organic semiconductors can be processed at low temperatures compatible with plastic substrates, whereas higher temperatures are required for alternative Si based devices [2, 3]. Organic transistors can usually be manufactured at or near room temperature, unlike silicon based transistors, which typically require fairly high process temperatures (>800 ºC for crystalline Si transistors).

For simulation of OTFTs certain structures have been proposed. In order to enhance device speed, considerable research effort has been devoted to increasing the mobility of organic materials by improving deposition conditions [4, 5]. As a result of this effort, mobility exceeding 1 cm2/V.sec has been reported for pentacene [6], which is comparable to hydrogenated amorphous silicon (a-Si:H), and 0.1 cm2/V.sec for poly(3-hexylthiophene) (P3HT) [7]. In addition to mobility, other ways of improving the performance of OTFTs, such as channel length scaling and active layer thickness, have also attracted considerable attention [8]. This paper first describes the different structures of organic thin film transistors in section 2, the various organic and polymer materials used for the active semiconductor and dielectric layers in section 3, and operation and characteristics in section 4. Finally, parameters and display application are discussed in sections 5 and 6 respectively.

2. OTFT STRUCTURES

OTFTs adopt the architecture of the thin film transistor (TFT), which has proven its adaptability with low conductivity materials. It contains three electrodes (source, drain and gate), a dielectric layer and an active organic semiconducting (OSC) layer. The structure can be top gate or bottom gate, and both architectures can be further divided into top contact and bottom contact alternatives, as depicted in fig.1 (a) and (b). The deposition of an organic semiconductor on the insulator is much easier than the reverse due to the fragile nature of organic semiconductors; hence the bottom gate architecture is used in the majority of current OTFTs.

The well known structure for standard silicon MOSFETs is top-gate-top-contact (TGTC); for simulation of OTFTs, however, the bottom-gate-top-contact (BGTC) and bottom-gate-bottom-contact (BGBC) architectures have mostly been modeled. Certain advantages and disadvantages are associated with each OTFT structure. In terms of field effect mobility, the BGTC structure shows better performance than the BGBC structure. The better field effect mobility of the top contact OTFT is due to its lower contact resistance compared with the bottom contact one [9]. The performance of OTFTs in a BGBC device structure is generally observed to be lower by two orders of magnitude than in the top contact device configuration [10-13].

Fig.1 Schematic cross-section of OTFT structures with pentacene as active semiconductor, Al2O3 as dielectric and gold contact electrodes. (a) Bottom Gate Top Contact (BGTC) (b) Bottom Gate Bottom Contact (BGBC).

3. ORGANIC MATERIALS

The performance of OTFTs depends on their constituent organic semiconductors and insulator materials. The materials used for the different layers of OTFTs are explained below.

3.1. SUBSTRATE

For substrates, quartz, polycarbonate, polyethylene naphthalate (PEN), glass, silicon wafer and polyimide materials can be used [14, 15]. Inorganic substrates have high melting points and good flatness, whereas polymer substrates have high toughness, flexibility and light weight.

3.2. CONTACT ELECTRODE

To improve electrical characteristics, an ohmic contact can be formed between gold (Au) and the organic semiconductor, because the work function of Au is 5.0 eV and the HOMO of most of the organic

semiconducting materials is around this level. Adding nickel on gold improves the adhesion of the gold on the oxide. Platinum electrodes are inferior to gold electrodes. Aluminum shows slightly higher electron mobility (2.2 cm2/Vs) at room temperature in single crystals [15, 16].

3.3. P-TYPE ORGANIC SEMICONDUCTORS

Organic thin film transistors fabricated on light weight flexible substrates are expected to replace hydrogenated amorphous silicon TFT applications on glass substrates. Table-1 shows the mobility and on/off current ratio measured from OTFTs using p-type organic molecules deposited by different techniques. Among all investigated oligomeric and polymeric materials, pentacene thin films have demonstrated the best electrical performance. Pentacene exhibits typical p-channel semiconductor characteristics.

TABLE-1 MOBILITY (µ) AND CURRENT ON/OFF RATIO FOR SOME P-TYPE SEMICONDUCTORS [17].

Material                       Mobility (cm2/V s)   Ion/Ioff
Pentacene                      3.2                  10^9
Copper phthalocyanine          0.01-0.02            NR
Polythiophene                  10^-5                >10^2
α,ω-dihexyl-hexathiophene      0.13                 >10^4
P3HT                           0.1                  10^6

3.4. N-TYPE ORGANIC SEMICONDUCTORS

It is surprising to note that most of the work to date has focused on p-type organic materials, with only some recent effort directed towards the preparation of novel n-type semiconductor materials. When designing n-type devices, a semiconductor must be used which allows the injection of electrons into its LUMO. Gold has been optimized for source and drain electrodes [10]; it has a work function of 5.0 eV, while most n-type materials have solid state electron affinity levels of about 4.0 eV. Charge injection into the semiconductor would thus be limited by an energy barrier of approximately 1 eV, which is another issue associated with the complexity of n-type devices. Substantial effort has gone into the development of organic n-channel OTFTs because this allows the implementation of complementary circuits with low static power consumption [9, 18]. Table-2 gives the mobility and current on/off ratio for some n-type semiconductors.

TABLE-2 MOBILITY (µ) AND CURRENT ON/OFF RATIO FOR SOME N-TYPE SEMICONDUCTORS [17].

Material                              Mobility (cm2 V-1 s-1)   Ion/Ioff
Pc2Lu (lutetium bisphthalocyanine)    2×10^-4                  NR
TCNQ (tetracyanoquinodimethane)       3×10^-5                  4-450
C60                                   0.08                     10^6
F16CuPc                               0.03                     5×10^4

3.5. MATERIALS FOR DIELECTRIC

Organic polymers with good processability and dielectric properties, such as poly methyl methacrylate (PMMA), poly vinyl phenol (PVP), polyimide (PI) and poly vinyl alcohol (PVA), have been extensively employed as the gate insulator. The switching voltage of OTFTs increases as the dielectric constant of the insulator decreases. Some important dielectric materials and their dielectric constants are polyimide (2.6), PMMA (2.65), Al2O3 (9) and SOG (spin on glass, 3.9) [17].

4. OPERATION AND CHARACTERISTICS

4.1. OPERATION

TFTs cannot accommodate band bending due to the absence of a bulk region [19]. The conducting channel is formed by an inversion layer in MOSFETs, while in TFTs it is formed by accumulation. Depending upon the polarity of the gate voltage, they can operate in unipolar carrier (electron or hole) accumulation modes. In a thin film FET or accumulation type FET, the charge-voltage relation is simply given as:

ρ(x) = [V(x) - Vg] Cox (1)

where ρ and V are the local charge per unit area and the voltage in the channel, respectively. Organic materials such as pentacene act as p-type semiconductors, with holes as majority carriers.

Fig. 2 Top contact OTFT operation with pentacene as active semiconductor layer.

When a negative gate voltage is applied, an electric field is formed across the dielectric, causing an accumulation region of holes at the dielectric-semiconductor boundary, as shown in fig. 2. Applying a voltage to the source-drain terminals allows a current to flow across this accumulation layer between the contacts. Basically, an OTFT operates like a capacitor: when a voltage is applied to the gate, an equal charge (but of opposite sign) is induced at both sides of the insulator [20].

4.2. CHARACTERISTICS
Despite the fact that the transport physics in organic/polymeric TFTs is different from that in silicon MOSFETs, the current-voltage characteristics can to first order be described with the same formalism:
$$I_D = \frac{\mu W C_i}{L}\left[(V_{GS} - V_{th})\,V_{DS} - \frac{V_{DS}^2}{2}\right] \qquad (2)$$
for VGS – Vth > VDS (linear regime), and
$$I_D = \frac{\mu W C_i}{2L}\,(V_{GS} - V_{th})^2 \qquad (3)$$
for VDS > VGS – Vth > 0 (saturation regime),
where W is the channel width, L is the channel length, Ci is the gate dielectric capacitance and µ is the carrier mobility in the semiconductor. The current-voltage (ID-VDS) characteristics of an OTFT are similar to those of inorganic FETs at a gate bias voltage VGS higher than the threshold voltage Vth, as illustrated in fig. 3.

Fig. 3 Output characteristics of organic thin film transistor with Pentacene as semiconductor layer, Al2O3 dielectric material and gold as contacts.

The characteristics show a linear (ohmic) region, in which ID depends on VDS, for low drain-source bias (VDS << VGS), and saturation of ID at high drain voltages (VDS > VGS). The bias voltage and current polarities follow the behavior of the device, similar to NMOS or PMOS.

5. PARAMETERS

5.1. MOBILITY
Field effect mobility is a key parameter that determines the processing speed of organic devices. The mobility of carriers can be modulated by the gate voltage; it tends to increase when the gate bias increases [20]. Quoted values of the effective mobility of organic transistors vary by many decades, in the range of 10^-5 to 10 cm^2/V s. Mobility depends on many other factors such as gate biasing, method of fabrication and the method used to evaluate the mobility from simulation and experiments [21]. The bias dependent mobility, expressed as a power law for polymer based field effect transistors, is given by:
$$\mu(V_{GS}) = \mu_0\,(V_{GS} - V_{th})^{\gamma} \qquad (4)$$
The parameter γ is usually estimated in the range of 0.2 – 0.5 for different OTFTs/PFETs [22, 23]. TFTs exhibit mobility up to 0.4 cm^2 V^-1 s^-1 at low operating voltages (5 V) [24, 25]. The mobility increases from very low values about


0.02 cm^2 V^-1 s^-1 at VG = -14 V to 1.26 cm^2 V^-1 s^-1 at -146 V [26].

5.2. ON/OFF CURRENT RATIO
The ratio of the current in the accumulation mode to the current in the depletion mode is called Ion/Ioff. The current ratio depends upon various factors such as materials, channel length, and thickness of the semiconductor. Short channel devices show a higher on/off current ratio than devices with a long conducting channel [6]. This ratio increases as the thickness of the semiconducting layer decreases. For memory and display applications, a high on/off current ratio is a more important requirement than high mobility. An on/off current ratio of 10^8 has been reported for a BGBC thin film transistor structure with pentacene as organic semiconductor, cross linked PVP as insulator and gold contacts for source and drain [29]. A value of around 10^9 has been observed with pentacene as the active organic semiconductor [31].

5.3. THRESHOLD VOLTAGE
To extract information about impurity concentrations, interface states and traps, it is common practice to use the threshold voltage and the subthreshold current as device evaluation parameters. In MOSFETs, the subthreshold current depends exponentially on the gate bias as well as the drain-source bias, because below threshold the free carrier density depends exponentially on the local bias. The threshold voltage (Vt) of OTFTs varies with either the gate insulator capacitance [27] or the thickness of the organic film [20].

5.4. CONTACT RESISTANCE
Ideally the contact resistance should be ohmic and small, so that the whole voltage applied to the device contributes to the transport current. For top contact devices it depends strongly upon gate bias and increases sharply at low gate-source voltage, while the contact resistance appears to be almost independent of the gate bias in bottom contact structures. Necliudov et al. measured a contact resistance of 1.3×10^8 Ohm µm with a mobility of approximately 0.9 cm^2/V s for a bottom contact pentacene OTFT, consistent with an injection barrier of between 0.2 and 0.3 eV in the simulation. Additionally, it has been reported that the top contact resistance depends strongly on gate voltage [22] and is much less than the bottom contact resistance at high gate bias.

5.5. EFFECT OF CHANNEL LENGTH
Drain current depends strongly upon the semiconductor used for the channel, and it can be modulated by the length of the conducting channel. M. Austin et al. reported the dependence of drain current on channel length for P3HT (poly(3-hexylthiophene)) OTFTs with channel lengths of 1000 nm and 70 nm. A saturation region is present for the long channel (1000 nm) device, but no saturation region appears in the short channel (70 nm) device. Long channel devices are relatively immune to high contact resistance, and when they are scaled to smaller channel lengths the device performance may degrade [28]. The on/off current ratio is higher for short channel devices than for long channel devices.

5.6. EFFECT OF ACTIVE LAYER THICKNESS
The electrical parameters of an OTFT do not depend solely upon gate capacitance; they can also be modulated by film thickness and charge injection from the source electrode. There are trends which can be expressed as a function of the product of the thickness of the polymeric film and the gate capacitance per unit area. It has been observed that with increasing permittivity of the gate insulator and increasing thickness of the organic material, the mobility decreases in OTFTs [21].
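
As a minimal sketch, the first-order drain current model of Eqs. (2) and (3) and the power-law mobility of Eq. (4) can be evaluated numerically as shown below. The function names, the n-type sign convention (for a p-type material such as pentacene, pass voltage magnitudes) and the example device values are assumptions made for illustration; they are not taken from the cited measurements.

```python
def mobility(v_gs, v_th, mu0, gamma):
    """Eq. (4): bias dependent mobility, mu0 * |V_GS - V_th| ** gamma."""
    overdrive = abs(v_gs - v_th)
    return mu0 * overdrive ** gamma


def drain_current(v_gs, v_ds, v_th, mu, w, l, c_i):
    """First-order drain current from Eqs. (2) and (3).

    mu is in cm^2/V s and c_i in F/cm^2; w and l share one length unit,
    so the result is in amperes. Returns 0 below threshold.
    """
    v_ov = v_gs - v_th                      # gate overdrive V_GS - V_th
    if v_ov <= 0:
        return 0.0                          # device is off in this simple model
    if v_ds < v_ov:                         # linear regime, Eq. (2)
        return mu * w * c_i / l * (v_ov * v_ds - v_ds ** 2 / 2)
    return mu * w * c_i / (2 * l) * v_ov ** 2   # saturation regime, Eq. (3)


if __name__ == "__main__":
    # Hypothetical device: W/L = 1000/10, C_i = 10 nF/cm^2, V_th = 2 V.
    mu = mobility(v_gs=12.0, v_th=2.0, mu0=0.01, gamma=0.5)
    for v_ds in (2.0, 5.0, 10.0, 20.0):
        i_d = drain_current(12.0, v_ds, 2.0, mu, w=1000.0, l=10.0, c_i=1e-8)
        print(f"V_DS = {v_ds:5.1f} V  ->  I_D = {i_d:.3e} A")
```

The two branches meet at VDS = VGS – Vth, so the sketch reproduces the transition from the linear to the saturation region described in Section 4.2.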

6. OTFT FOR DISPLAY DEVICES
Companies are currently developing a very diverse set of substrate, drive element and display mode technologies in order to realize flexible displays. The e-paper display market is expected to show a 46.9% average annual growth rate, from US$ 260 million in 2010 to US$ 2.1 billion in 2015 and US$ 7 billion in 2020. OTFTs can be used to make good LCD or e-paper displays, as these need a high on/off current ratio [29].

TABLE-3 DISPLAY APPLICATIONS WITH OTFTS WITH PENTACENE AS OSC [29]
App.    Specification                    Organization
OLED    4×4 pixel on PC                  NHK (Japan)
OLED    8×8 pixel on glass               Pioneer (Japan)
LCD     64×128 on plastic                ERSO (Taiwan)
LCD     15 in. full color XGA on glass   Samsung (Korea)
LCD     1.4 in. 80×80 RGB on glass       Hitachi (Japan)

Table-3 summarizes the display prototypes using OTFTs: LCDs (made with an OTFT matrix array) and active matrix organic light emitting diode (AMOLED) displays with dot matrix patterns. Organic/polymer LED displays have the potential to replace LCDs and become the next dominant force in flat panel displays, because they require fewer fabrication steps and have lower material costs than LCDs [30].

7. CONCLUSION
Organic/polymer electronics is a very promising alternative to crystalline, polycrystalline and amorphous silicon processes. Moreover, there are no restrictions as to the dimensions of the device. It has been observed that with increasing permittivity of the gate insulator and increasing thickness of the organic material, the mobility decreases in OTFTs. The effect of channel length has been discussed; long channel devices are relatively immune to high contact resistance. A top contact OTFT shows better field effect mobility than a bottom contact one because of its lower contact resistance.
It has been reported that the on/off current ratio is higher for short channel devices than for long channel devices. For memory and display applications, a high on/off current ratio is a more important requirement than high mobility, and this ratio should be more than 10^8. In spite of numerous advantages such as large area coverage, structural flexibility and especially low cost, certain limitations of organic material based devices, such as instability, lower carrier mobility and shorter lifetimes, need to be resolved in order to commercialize OTFT based applications.

REFERENCES
[1] M. Jamal Deen, “Plastic microelectronics with organic and polymeric thin film transistors,” Proc. 26th International Conference on Microelectronics, MIEL, 2008.
[2] Yoshiro Yamashita, “Organic semiconductors for organic field effect transistor,” Sci. Technol. Adv. Mater., vol. 10, pp. 024313, 2009.
[3] H. Klauk, D. J. Gundlach, and T. N. Jackson, “Fast organic thin-film transistor circuits,” IEEE Electron Device Lett., vol. 20, pp. 289-291, 1999.
[4] A. R. Brown, A. Pomp, C. M. Hart, and D. M. De Leeuw, “Logic gates made from polymer transistors and their use in ring oscillators,” Science, vol. 270, pp. 972-974, 1995.
[5] Y. Sun, Y. Liu and D. Zhu, “Advances in organic field-effect transistors,” J. Mater. Chem., vol. 15, pp. 53-65, 2005.
[6] Y. Y. Lin, D. J. Gundlach, S. F. Nelson, and T. N. Jackson, “Stacked pentacene layer organic thin film transistors,” IEEE Electron Device Lett., vol. 18, pp. 606–608, Dec. 1997.
[7] Z. Xie, M. Abdou, A. Lu, M. J. Deen, S. Holdcroft, “Electrical Characteristics of Poly (3-Hexylthiophene) Thin Film MISFETs,” Canadian J. of Physics, vol. 70, no. 10–11, pp. 1171-1177, 1992.
[8] O. Marinov, M. J. Deen, and R. Datars, “Compact modeling of charge mobility in organic thin-film transistors,” J. Appl. Phys., vol. 106, no. 6, pp. 064501-1–064501-13, Sep. 2009.

[9] H. Klauk, “Organic thin film transistor,” Chem. Soc. Rev., 39, pp. 2643-2666, 2010.
[10] O. Marinov, M. J. Deen, and B. Iniguez, “Charge transport in organic and polymer thin-film transistors: Recent issues,” Proc. Inst. Elect. Eng. Circuits Devices Syst., vol. 152, no. 3, pp. 189–209, Jun. 2005.
[11] N. Karl, “Charge Carrier Transport in Organic Semiconductors,” Synth. Met., vol. 649, pp. 133-134, 2002.
[12] R. A. Street and A. Salleo, “Contact effects in polymer transistors,” Appl. Phys. Lett., vol. 81, no. 15, pp. 2887, 2002.
[13] S. F. Nelson, Y. Y. Lin, D. J. Gundlach and T. N. Jackson, “Temperature independent transport in high mobility pentacene transistors,” Appl. Phys. Lett., vol. 72, no. 15, pp. 1854, 1998.
[14] F. Garnier, “Thin-Film Transistors Based on Organic Conjugated Semiconductors,” Chem. Phys., 227, 253, 1998.
[15] H. Klauk, D. J. Gundlach, M. Bonse, C. C. Kuo, and T. N. Jackson, “A reduced complexity process for organic thin film transistors,” Appl. Phys. Lett., 76, 1692, 2000.
[16] J. H. Schon, Ch. Kloc, and B. Batlogg, “On the intrinsic limits of pentacene field-effect transistors,” Organic Electronics, vol. 1, no. 57, 2000.
[17] C. Shekar, T. Lee and S. W. Rhee, “Organic thin film transistors, material, processes and devices,” Korean J. Chem. Engg., vol. 21, no. 1, pp. 267-287, 2004.
[18] G. Horowitz, “Organic field-effect transistors,” Adv. Mater., vol. 5, pp. 365-377, 1998.
[19] P. Stallinga and H. L. Gomes, “Modelling electrical characteristics of thin-film-field-effect transistor, I. Trap-free materials,” Synthetic Metals, 156, pp. 1305-1315, 2006.
[20] G. Horowitz, “Organic thin film transistors: From theory to real devices,” J. Mater. Res., vol. 19, no. 7, pp. 1946-1962, Jul 2004.
[21] O. Marinov, M. J. Deen, and B. Iniguez, “Performance of organic thin film transistors,” J. Vac. Sci. Technol., vol. 24, no. 4, pp. 1728–1733, 2006.
[22] P. Necliudov, M. Shur, D. Gundlach, and T. Jackson, “Modeling of organic thin film transistors of different designs,” J. Appl. Phys., vol. 88, no. 11, pp. 6594–6597, Dec. 2000.
[23] De Leeuw, D. Gelinck, G. Geuns, T. Van Veenendaal, E. Cantatore, E. and B. Huisman, “Polymeric integrated circuits: fabrication and first characterization,” IEEE-IEDM, 2002, pp. 293–296.
[24] C. D. Dimitrakopoulos, S. Purushothaman, J. Kymissis, A. Calleggari and J. M. Shaw, “Low-Voltage Organic Transistors on Plastic Comprising High-Dielectric Constant Gate Insulators,” Science, vol. 283, no. 5403, pp. 822-824, February 5, 1999.
[25] C. D. Dimitrakopoulos, I. J. Kymissis, S. Purushothaman, D. A. Neumayer, P. R. Duncombe, and R. B. Laibowitz, “Low-Voltage, High-Mobility Pentacene Transistors with Solution-Processed High Dielectric Constant Insulators,” Adv. Mater., 11, 1372, 1999.
[26] C. D. Dimitrakopoulos and P. R. L. Malenfant, “Organic Thin Film Transistors for Large Area Electronics,” Adv. Mater., vol. 14, pp. 99-117, 2002.
[27] M. J. Deen, O. Marinov, Jianfei Yu, S. Holdcroft and W. Woods, “Low-frequency noise in polymer transistors,” IEEE Trans. on Electron Devices, vol. 48, no. 8, pp. 1688-1694, 2001.
[28] I. G. Hill, “Numerical simulations of contact resistance in organic thin-film transistors,” Appl. Phys. Lett., vol. 87, pp. 163505-1-163505-3, 2005.
[29] Jin Jang and S. H. Han, “High performance OTFT and its application,” Current Applied Physics, 6S1, pp. e17-e21, 2006.
[30] A. Afzali, C. D. Dimitrakopoulos and T. L. Breen, “High-performance, solution-processed organic thin film transistors from a novel pentacene precursor,” J. Am. Chem. Soc., vol. 124, pp. 8812, 2002.
[31] J. H. Schon, S. Berg, Ch. Kloc, and B. Batlogg, “Ambipolar pentacene field-effect transistors and inverters,” Science, vol. 287, pp. 1022, 2000.

CHARACTERIZATION OF 4T SRAM CELL
Setu Garg1, Prof. S. N. Sharan2, Garima Chandel3 Member IEEE, Hridesh Verma4
1GCET, Greater Noida, 2GNIT, Greater Noida, 3,4ABES IT Ghaziabad, India.
1gargsetu06@gmail.com, 2snathsharan@yahoo.com, 3garimachandel@rediffmail.com, 4hridesh.verma@gmail.com

ABSTRACT: The Static Random Access Memory discussed in this paper is based on a Four-Transistor SRAM cell. This paper focuses on two important parameters, namely Static Noise Margin analysis and Bit Line Leakage current analysis, to characterize the Four-Transistor SRAM cell. The maximum allowable SNM needs to be investigated for efficient operation of the SRAM cell. The purpose of this analysis is to measure the SNM of the bit cell without flipping the cell contents. Bit line leakage current analysis is also done. This analysis involves the bit cell contribution to column leakage and the margin available for the sum of total cell leakage current in a long column. The performance and results have been validated through simulations using the ELDO tool from Mentor Graphics Corporation.

Index Terms – SRAM, Bit Line, Static Noise Margin, DC Source, Word Line

I. INTRODUCTION
Static random-access memory (SRAM) is a critical component across a wide range of microelectronics applications, from consumer appliances to high-end workstation and microprocessor applications. For almost all fields of application, semiconductor memory has been a key enabling technology. It is forecast that embedded memory in SoC designs will cover up to 90% of the total chip area. A representative example is the use of cache memory in microprocessors. The operational speed can be significantly improved by the use of on-chip cache memory that temporarily stores a fraction of the data and instruction content of the main memory.

The SRAM consists of an array of static memory cells which are connected by horizontal word lines and vertical bit lines. To select one word line out of 2^h, an h-bit address has to be applied. The output data is usually organized as a word of b bits. From the architectural point of view the output word represents a b-bit input/output (I/O) port. The I/O port consists of b I/O blocks, i.e. one block per bit of the output word. Each bit of the I/O port can be connected to one out of 2^w bit lines by a 2^w-to-1 column or bit line multiplexer. Any SRAM cell can be accessed by an address word which is (h + w) bits long. This address is applied to the control logic block which controls all the memory operations, e.g. write, read, enable, data-in, data-out.

II. BASIC SRAM ARCHITECTURE
A typical static random access memory (SRAM) architecture is shown in Figure 1. It consists of a matrix of memory cells arranged in an array of 2^N rows by 2^M columns. The total size of the memory array is 2^M x 2^N bits. During a read operation, one of the 2^N rows (word lines) is selected by the row address decoders by decoding the row address. All the memory cells on the given word line are enabled. The column decoder selects one of the 2^M columns and the value of the selected memory cell is read out by the sense amplifier. The data into and out of the memory array is controlled by the Read-Write control circuit.

Figure 1. Static Random Access Memory Architecture
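
The row and column selection described above can be sketched in a few lines. This is a behavioral illustration only: the function names are hypothetical, and the convention that the upper N address bits select the row and the lower M bits select the column is an assumption, since the text does not fix a bit ordering.

```python
def split_address(address, n_row_bits, m_col_bits):
    """Split a flat address into (row, column) for a 2^N by 2^M array.

    Assumed convention: upper N bits go to the row decoder,
    lower M bits go to the column decoder.
    """
    total_bits = n_row_bits + m_col_bits
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address out of range for this array")
    row = address >> m_col_bits                 # selects one of 2^N word lines
    col = address & ((1 << m_col_bits) - 1)     # selects one of 2^M columns
    return row, col


def read_bit(array, address, n_row_bits, m_col_bits):
    """Behavioral read: enable one word line, then pick one column."""
    row, col = split_address(address, n_row_bits, m_col_bits)
    word_line = array[row]      # all cells on the selected word line are enabled
    return word_line[col]       # the column decoder routes one cell to the sense amplifier


if __name__ == "__main__":
    n, m = 3, 2                 # 8 rows by 4 columns, 32 bits in total
    cells = [[(r + c) % 2 for c in range(1 << m)] for r in range(1 << n)]
    print(split_address(0b10110, n, m))
    print(read_bit(cells, 0b10110, n, m))
```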
III. SCHEMATIC AND READ/WRITE OPERATION OF 4T SRAM CELL

In the four-transistor SRAM cell, two NMOS transistors are used as pass transistors to access the cell and two PMOS transistors are used as drivers of the cell.

Figure 2. Schematic of the Cell

A. WRITE OPERATION
In order to store a logic ‘1’ in the cell, BL is charged to Vdd and BL’ is charged to ground, and vice versa for storing ‘0’. Then the word line is switched to Vdd to turn on the NMOS access transistors. When the access transistors are turned on, the values of the bit lines are written into Q and Q’. The node that is storing logic ‘1’ will not go to full Vdd because of a voltage drop across the NMOS access transistor. After the write operation the word line voltage is reset to ground to turn off the NMOS access transistors. The node with the logic ‘1’ stored is pulled up to full Vdd through the PMOS driver transistors.

B. READ OPERATION
The read operation of the cell is different from that of the 6T cell. To read from the cell, the bit lines are charged to ground instead of Vdd and the word line voltage is set to Vdd to turn on the NMOS access transistors. The node with logic ‘1’ stored will pull the voltage on the corresponding bit line up to a high voltage level (not Vdd, because of the voltage drop across the NMOS access transistor). The other bit line is pulled to ground. The sense amplifier detects which bit line is at high voltage and which bit line is at ground. If the cell was storing a logic ‘0’, the voltage level of BL will be lower than that of BL’, so the sense amplifier will output a logic ‘0’. If the cell was storing a logic ‘1’, the voltage level of BL will be higher than that of BL’, and the sense amplifier will output a logic ‘1’.

IV. EXPERIMENTS AND RESULT

A. Static Noise Margin Analysis
SNM quantifies the maximum level of voltage noise which can be present at the internal nodes of a bit cell without flipping the cell contents. Figure 3 shows the locations Q and Q´ of the noise margin sources in the 4T SRAM cell schematic. The purpose of this analysis is to measure the SNM of the bit cell. An SRAM cell should be designed such that under all conditions some SNM is reserved to cope with dynamic disturbances caused by a particle, cross talk, voltage supply ripple and thermal noise. We have done SNM analysis for 6T and 4T cells at the 0.18 µm technology node. The method and results are shown below.

Figure 3. 4T SRAM cell simulated structure

B. Method to calculate Static Noise Margin
To analyze the Static Noise Margin, introduce a DC noise source inside the SRAM cell and see where the cell flips. Put the WL (Word Line) at Vdd. Bit Line and Bit Line’ (BL and BL’) are connected to ground. Initialize Q’ with Vdd and Q with 0. Now slowly increase VX from 0 and monitor points Q and Q’ to find where the cell flips. The Static Noise Margin is measured to be 362.3279 mV.
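
The sweep procedure above amounts to locating the noise voltage VX at which Q rises above Q’. A minimal post-processing sketch is given below; it assumes the swept values of VX, Q and Q’ have already been exported from the simulator as lists. The function name and the sample data are made up for illustration and do not reproduce the reported 362.3279 mV result.

```python
def find_flip_voltage(vx, q, q_bar):
    """Return the VX at which Q first crosses above Q', or None if no flip.

    The cell starts with Q low and Q' high, so Q - Q' is negative until
    the flip. Linear interpolation is used between the two sweep points
    that bracket the crossing.
    """
    for i in range(1, len(vx)):
        d_prev = q[i - 1] - q_bar[i - 1]
        d_curr = q[i] - q_bar[i]
        if d_prev < 0 <= d_curr:                    # sign change: cell has flipped
            frac = -d_prev / (d_curr - d_prev)
            return vx[i - 1] + frac * (vx[i] - vx[i - 1])
    return None


if __name__ == "__main__":
    # Synthetic sweep data (volts), not simulation output.
    vx = [0.0, 0.1, 0.2, 0.3, 0.4]
    q = [0.0, 0.05, 0.1, 0.2, 1.8]
    q_bar = [1.8, 1.8, 1.75, 1.6, 0.0]
    print(find_flip_voltage(vx, q, q_bar))
```

A finer VX step near the flip point reduces the interpolation error, since the node voltages change abruptly there.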
C. Bit Line Leakage Current Analysis
The purpose of this analysis is to characterize the bit cell contribution to column leakage. The main purpose of this test is to see the margin available for the sum of total cell leakage currents in a long column (from unselected WLs) during a read operation. This simulation should be used as a guideline for choosing the maximum number of physical rows in an SRAM array.

Figure 4. Bit Line Leakage Current Calculation

D. Method to calculate Bit Line Leakage Current
To do the Bit Line Leakage Current analysis, initialize the output Q to ‘0’ and Q’ to Vdd. At this time the Word Line (WL) is in the off condition and is therefore set to 0. BL and BL’ are connected to ground. The leakage current is then measured as the current through MN4 (the pass transistor facing the ‘1’). The Bit Line Leakage Current for the 4T SRAM cell is measured to be 7.1441 pA.

V. CONCLUSION
The two basic parameters, static noise margin and bit line leakage current, are successfully measured. All the simulations are done in the ELDO tool from Mentor Graphics Corporation. Both parameters discussed in this paper are very important in the characterization of the 4T SRAM cell.
An SRAM cell is designed such that under all conditions some SNM is reserved to cope with dynamic disturbances caused by a particle, cross talk, voltage supply ripple and thermal noise. The Static Noise Margin is measured to be 362.3279 mV.
The bit line leakage analysis also shows the margin available for the sum of total cell leakage currents in a long column during a read operation. The Bit Line Leakage Current for the 4T SRAM cell is measured to be 7.1441 pA. The objective is to keep the bit line leakage current as low as possible.
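
As a rough sketch of how the per-cell leakage figure can guide the choice of the number of rows, the worst-case column leakage can be compared with a leakage budget for the bit line. The function names and the 1 nA budget are hypothetical; only the 7.1441 pA per-cell value comes from the text, and the model simply assumes every unselected cell on the column leaks by that amount.

```python
def column_leakage(per_cell_leakage_a, n_rows):
    """Worst-case leakage onto one bit line during a read, in amperes.

    One row is selected; the remaining n_rows - 1 cells are assumed to
    leak the same per-cell current onto the shared bit line.
    """
    return per_cell_leakage_a * (n_rows - 1)


def max_rows(per_cell_leakage_a, leakage_budget_a):
    """Largest row count whose worst-case column leakage fits the budget."""
    unselected = int(leakage_budget_a // per_cell_leakage_a)
    return unselected + 1       # add back the one selected row


if __name__ == "__main__":
    i_cell = 7.1441e-12         # measured per-cell leakage from the text
    budget = 1e-9               # hypothetical 1 nA budget, for illustration only
    rows = max_rows(i_cell, budget)
    print(rows, column_leakage(i_cell, rows))
```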
Quantitative Analysis and Optimization Techniques
for On-Chip Cache Leakage Power

Vikas Tiwari, M.Tech (VLSI Design), ITM, Gwalior, India, e-mail: vikas.esvd@gmail.com
Shyam Akashe, Associate Professor, ITM, Gwalior, e-mail: vlsi.shyam@gmail.com
Rajkumar Rajoriya, Assist. Professor, ITM, Gwalior

Abstract: On-chip L1 and L2 caches represent a sizeable fraction of the total power consumption of microprocessors. In nanometer-scale technology, the subthreshold leakage power is becoming one of the dominant total power consumption components of those caches. In this study, we present optimization techniques to reduce the subthreshold leakage power of on-chip caches assuming that there are multiple threshold voltages, Vth's, available. First, we show a cache leakage optimization technique that examines the tradeoff between access time and subthreshold leakage power by assigning distinct Vth's to each of the four main cache components: address bus drivers, data bus drivers, decoders, and static random access memory (SRAM) cell arrays with sense amplifiers. Second, we show optimization techniques to reduce the leakage power of L1 and L2 on-chip caches without affecting the average memory access time. The key results are: 1) two additional high Vth's are enough to minimize leakage in a single cache (3 Vth's if we include a nominal low Vth for microprocessor core logic); 2) if L1 size is fixed, increasing L2 size can result in much lower leakage without reducing average memory access time; 3) if L2 size is fixed, reducing L1 size may result in lower leakage without loss of the average memory access time for the SPEC2K benchmarks; and 4) smaller L1 and larger L2 caches than are typical in today's processors result in significant leakage and dynamic power reduction without affecting the average memory access time.

Keywords: Microprocessor memory hierarchy, multiple threshold voltage, on-chip caches, SRAM, subthreshold leakage power.

I. INTRODUCTION
Until very recently, only dynamic power has been a significant source of power consumption, and Moore's law has helped to control it. Shrinking processor technology below 100 nm has allowed, and actually required, reducing the supply voltage to reduce dynamic power consumption. However, smaller geometries with a low threshold voltage exacerbate leakage, so static power is beginning to dominate the power consumption equation [1]. For example, a 90-nm Pentium 4 consumes 110 W, and roughly 40% of the total power dissipation is consumed by leakage power [2]. The excessive heat dissipation by the leakage power in the high-end 90-nm Pentium 4 processor forced Intel Corporation to adopt more expensive power delivery, cooling, and packaging systems.
A potentially important source of this power dissipation is on-chip caches, because larger on-chip caches are being integrated onto the chip. For example, an Intel processor for server applications has 1 and 6 MB on-chip L2 and L3 caches, respectively; subthreshold leakage power is dissipated by all of the subbanks even if they are not accessed, while dynamic power is dissipated only when a cache subbank is accessed. To alleviate this problem, transistors in caches could be designed for low subthreshold leakage, for example, by assigning them a higher threshold voltage or by controlling the Vth with adaptive body biasing or, if a better balance of speed and power is required, by employing dual Vth [3]–[7]. Traditionally, at most two Vth's (one low and one high Vth) have been available in high-performance process technologies, allowing cache designers only limited flexibility for suppressing subthreshold leakage current. To further improve the subthreshold leakage, several circuit and microarchitectural techniques [8]–[13] have therefore been proposed, targeted at the subthreshold leakage power reduction of L1 caches.
One consequence of the increasing importance of subthreshold leakage current is that the number of available Vth's in future process technologies will increase. Next-generation 65-nm processes are expected to support three Vth's (one low and two high Vth's) and future processes are likely to provide designers with even more choices. This increase provides new flexibility for subthreshold leakage power reduction methods, allowing new tradeoffs between the Vth of different parts of a cache and between different levels in the cache hierarchy. The availability of additional Vth's suggests a new examination of the tradeoff between cache size and Vth to reduce power loss from subthreshold leakage current.
In this study, we present systematic techniques for assigning multiple Vth's to memory hierarchies to minimize power dissipation, in particular subthreshold leakage [14]. Based on our techniques, we provide a detailed quantitative tradeoff analysis between access time and subthreshold leakage power of on-chip caches as a function of the number and the strength of Vth. Although the qualitative trends of the subthreshold leakage power versus access time tradeoff are well known, this paper provides a detailed quantitative analysis to determine the optimal number of Vth's for given design constraints and to justify the cost of extra Vth's. First, we examine optimal leakage power dissipation for various access times in on-chip SRAM caches, when more than one high Vth is available. Then, we show how many high Vth's are needed, in addition to a nominal Vth required for the processor's general logic circuits, and how much Vth should be increased for effective leakage power reduction for
various cache access time points. Second, we present how cache leakage power can be reduced while maintaining the same average memory access time of a processor memory system using L1 and L2 cache access statistics for SPEC2K workloads [15].
The remainder of this study is organized as follows. Section II explains our on-chip cache subthreshold leakage power and access time modeling methodologies. Section III presents a subthreshold leakage power optimization technique for a given access time constraint and provides a quantitative tradeoff analysis of on-chip cache subthreshold leakage power and access time. Section IV presents two-level cache leakage power optimization techniques using cache access statistics. Section V discusses future directions for this line of work and adds some concluding remarks.

II. ON-CHIP CACHE LEAKAGE POWER AND ACCESS TIME MODELS

To examine tradeoffs between subthreshold leakage power and access time of a processor cache memory system, we need circuit models to estimate the subthreshold leakage power and access time of caches. Rather than starting from scratch, we could have built on a widely used cache memory model called "CACTI" [16]. This model estimates access time, dynamic energy dissipation, and area of caches for given cache configuration parameters such as total size, line size, associativity, and number of ports. However, it is based on an outdated 0.8-µm CMOS technology and it applies linear scaling to obtain the figures for smaller process technologies. Furthermore, it does not provide access time and leakage power when multiple V_TH's are available. To address these shortcomings, we designed caches with the 70-nm Berkeley predictive technology model (BPTM) (see footnote 2) in anticipation of the next generation of process technology. Then, we derived our subthreshold leakage power and access time models based on the HSPICE simulations of the designed cache circuits.
The designed caches ranged from 16 to 1024 KB in size. The bitlines and wordlines were segmented to improve access time, and subbanks were employed to reduce dynamic power dissipation [17] as well; see Table I for the cache subbank organization used in this study. The caches were broken into four components for the purposes of assigning distinct V_TH's: address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays with sense amplifiers. Fig. 1 illustrates the cache subbank organization used in this study.

TABLE I
CACHE ORGANIZATIONS FOR EACH CACHE SIZE

Fig. 1. Cache subbank organization.

The circuit topology and the ratios of transistors in the decoder circuits are based on the CACTI model but optimized for the 70-nm technology. In addition, modern techniques for lower voltage are employed for the bitline precharge and sense-amplifier circuits. For the address and data bus interconnects, we employed an H-tree topology and inserted repeaters on each branch of the buses to optimize the interconnect delay of cache buses. To obtain the interconnect capacitance and resistance of long wires such as bitlines, wordlines, address, and data buses, the lengths of the interconnects are estimated using SRAM cell dimensions of 1.42 µm × 0.72 µm and the cache organizations in Table I. Then, for a given interconnect length, the predictor provided in footnote 2 is used to estimate the interconnect capacitance and resistance.
HSPICE simulations were run extensively to obtain leakage power and access time (or delay) models for wide ranges of cache sizes and V_TH's for their four components. We considered V_TH's between 0.2 and 0.5 V in steps of 0.05 V at 1-V nominal supply voltage. We measured the leakage power and the delay of each cache component separately.

A. Leakage Power Models

Fig. 2 shows V_TH versus leakage power of the 7×128, 8×256, and 9×512 row decoders that we designed. The HSPICE simulation results shown in Fig. 2 agree with the exponential decay in leakage power with a linear increase of V_TH that is characteristic of general CMOS circuits

(1)

To obtain an approximated analytic equation for leakage power as a function of V_TH, we measured the leakage power of the decoders at each discrete V_TH point, and we applied an exponentially decaying curve fitting method to the measured leakage power as follows:

(2)

where the coefficients are constants derived using Origin 6.1, which is a scientific graphing and analysis software curve-
fitting package (footnote 3); the R-squared error is less than 0.001 for each fitted curve.
The rest of the cache components (address driver, data driver, and 6T SRAM cell array) show the same leakage power trend characteristics as the decoder of Fig. 2; leakage power decreases exponentially with the linear increase of V_TH. Hence, an identical curve-fitting method can be applied for these components to derive leakage power models like (2). The coefficients for all of the components in (2) can be found in Table II.

TABLE II
CACHE COMPONENT LEAKAGE POWER MODEL COEFFICIENTS AT 70 °C DIE TEMPERATURE AND A TYPICAL CORNER FOR EACH CACHE SIZE

Fig. 2. Leakage power dissipation of the 7×128, 8×256, and 9×512 decoders.

Fig. 3. Delay time of 7×128, 8×256, and 9×512 decoders.

Once all of the approximated analytic leakage power models for each component are derived for a cache size, the total leakage power of the cache can be approximated as the sum of the leakage power of all the components. Assuming that we apply four distinct V_TH's, the analytic approximated equation for leakage power (LP) is

(3)

where the four variables represent the V_TH's for address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays, respectively. Each exponential term evaluates the leakage power dissipation of one of the four components.

B. Access Time Models

Fig. 3 shows V_TH versus delay time of the 7×128, 8×256, and 9×512 row decoders that we designed. Basically, the
CMOS circuit delay of ultra deep submicrometer short-channel transistors is

(4)

where the parameters (see footnote 4) are constants depending on the technology and transistor sizes. The measured delay time trends in Fig. 3 agree with (4). However, the circuit delay or access time also fits very well to an exponential growth function with a very small exponent over our V_TH range of interest. It was convenient for some of our optimizations to approximate delay this way.

Footnote 4: This constant was around 2 in submicrometer technology, but it has decreased to about 1.3 in the current generation of deep-submicrometer technology.

To obtain an approximated analytic equation for delay time as a function of V_TH, we measured the delay time of the decoders at each discrete V_TH point, and we fit the following exponential curve to the measured delay time:

(5)

where the coefficients are constants derived using the same technique as that used for the leakage power models.
The rest of the cache components show the same delay trend characteristics as the decoder case of Fig. 3. Hence, the same curve-fitting technique can be applied for those components to derive approximated delay time models as functions of V_TH like (5). The coefficients for all the components in (5) can be found in Table III.

TABLE III
CACHE COMPONENT DELAY MODEL COEFFICIENTS AT 70 °C DIE TEMPERATURE AND A TYPICAL CORNER FOR EACH CACHE SIZE

Once all of the approximated delay time models for each component are extracted for a specific cache size, total delay or access time of the cache can be approximated as a sum of the delay times of all the cache components. Assuming that we can apply four distinct V_TH's, the analytic approximated equation for the access time (AT) is

(6)

where the four variables represent the V_TH's for address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays, respectively. Each exponential term corresponds to the delay time of one of the four components.

Fig. 4. Access time and leakage power versus cache size of the baseline caches.

We also define baseline caches in which the V_TH of all the cache components is set to a low V_TH (0.2 V). Fig. 4 shows the access time and the leakage power of the baseline caches. The cache access time grows logarithmically and the leakage power increases linearly with the cache size. Those trends agree with those of earlier studies on SRAM design. In Fig. 4, we assume a direct-mapped cache organization and consider only the leakage power of data arrays, disregarding the leakage of the tag comparators and other cache control logic.
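
As a minimal sketch of the curve-fitting step described in this section, the Python code below fits a pure exponential decay to leakage samples by least squares on the logarithm, then sums per-component models into a total in the spirit of (3). The two-parameter form P = A·exp(−V_TH/τ) is an assumption (the form fitted with Origin 6.1 may carry additional coefficients), the sample values are synthetic rather than HSPICE data, and the names fit_exp_decay, leakage, and total_leakage are hypothetical.

```python
import math

def fit_exp_decay(vth_points, power_points):
    """Least-squares fit of P = A * exp(-V / tau), done as a line fit in log space.

    Assumed two-parameter model; returns the pair (A, tau).
    """
    n = len(vth_points)
    logs = [math.log(p) for p in power_points]
    mean_v = sum(vth_points) / n
    mean_l = sum(logs) / n
    sxx = sum((v - mean_v) ** 2 for v in vth_points)
    sxy = sum((v - mean_v) * (l - mean_l) for v, l in zip(vth_points, logs))
    slope = sxy / sxx                  # slope of log(P) versus V, equal to -1/tau
    intercept = mean_l - slope * mean_v
    return math.exp(intercept), -1.0 / slope

def leakage(model, vth):
    """Evaluate one fitted component model at a given V_TH."""
    a, tau = model
    return a * math.exp(-vth / tau)

def total_leakage(models, vths):
    """Sum of per-component leakage, with one V_TH per component."""
    return sum(leakage(m, v) for m, v in zip(models, vths))

# V_TH sweep from the text: 0.2 V to 0.5 V in steps of 0.05 V.
VTH_SWEEP = [0.20 + 0.05 * i for i in range(7)]

# Synthetic "measurements" for one component (hypothetical, not paper data).
TRUE_A, TRUE_TAU = 50.0, 0.04
samples = [TRUE_A * math.exp(-v / TRUE_TAU) for v in VTH_SWEEP]

fitted = fit_exp_decay(VTH_SWEEP, samples)

# Reuse the same model for all four components purely for illustration.
models = [fitted] * 4
baseline = total_leakage(models, [0.2] * 4)   # every component at low V_TH
high_vth = total_leakage(models, [0.5] * 4)   # every component at high V_TH
```

The same routine can be applied to delay samples: for an exponentially growing curve such as (5) the fitted slope is positive, so the returned tau is negative and `leakage(model, vth)` then evaluates a growing exponential.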

III. CACHE LEAKAGE OPTIMIZATION WITH MULTIPLE V_TH ASSIGNMENTS

A. Methodology
In this section, we present a leakage power optimization technique assuming that we can assign multiple V_TH's to a cache. To find the minimum leakage power of caches using a maximum of four distinct V_TH's under a specified target access time constraint, we formulate the problem as follows:

(7)

constraints

(8)

where the four variables represent the V_TH's for address bus drivers, data bus drivers, decoders, and 6T-SRAM arrays, respectively.
There exist numerous combinations of the four V_TH's satisfying a specific target access time. Among those combinations, we find the quadruple producing minimum leakage power using a numerical optimization method (e.g., Matlab's fmincon function). We allowed the combination that satisfies a specified access time error range within 5%. We can repeat this procedure with modified objective and constraint functions to find an optimal V_TH combination for cache memories that have only two or three distinct V_TH's.
Assuming that we can assign distinct V_TH's to each component of the cache, it is important to determine how many V_TH's are cost-effective because an extra mask and process step are needed for each additional V_TH. To examine the dependence of the optimization results on access time, we sweep the target access time from the fastest possible (assigning a low V_TH of 0.2 V to all the cache components) to the slowest possible (assigning a high V_TH of 0.5 V to all the cache components). We present here the summary of the V_TH assignment schemes we examined in this study.
• Scheme I: Assigning a high V_TH to all of the cache components including address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays. This requires two V_TH's if we include a nominal or low V_TH for the processor's general logic circuits.
• Scheme II: Assigning a high V_TH only to the 6T-SRAM cell arrays, which dominate leakage power but not the overall cache delay, and assigning a default or low V_TH (0.2 V) to the rest of the transistors. This requires at least two V_TH's if we include a nominal or low V_TH for the processor's logic.
• Scheme III: Assigning a high V_TH to the 6T-SRAM cell arrays and assigning another high V_TH to the peripheral components of the cache (address bus drivers, data bus drivers, and decoders). This requires at least three V_TH's if we include a nominal or low V_TH for the processor.
• Scheme IV: Assigning four distinct high V_TH's to all four cache components. This requires at least five V_TH's if we include a nominal or low V_TH for the processor logic.

B. Leakage Power Optimization and Quantitative Tradeoff Analysis
In Fig. 5, we plot the normalized optimum leakage power and V_TH at different target access times (125%, 150%, 175%, and so forth) of 512-KB caches employing schemes I and II. The optimum leakage power and the V_TH are obtained using (7) and (8) of Section III-A. The parenthesized I and II in Fig. 5 represent the schemes I and II, respectively. In the graph, the normalized minimum leakage power and the access time of 100% correspond to the access time and the leakage power of a 512-KB baseline cache designed with a low V_TH (0.2 V) for all four cache components, which is the fastest but leakiest cache. The 125% access time on the access-time axis means that the cache is 25% slower than the baseline cache.

Fig. 5. Normalized optimum LP and V_TH versus normalized AT of 512-KB caches: schemes I and II.

According to the trends shown in Fig. 5, the leakage power decreases exponentially as the V_TH increases linearly; note that the leakage power axis is a logarithmic scale. The optimization results for the different cache sizes show almost the same normalized optimum leakage power and V_TH trends as those of the 512-KB caches in Fig. 5 as long as the same V_TH assignment scheme is applied; see Table IV for the normalized leakage power of all the cache sizes. Comparing schemes I and II, the 512-KB cache with scheme II dissipates less leakage power than the one with scheme I at the same access time point when the normalized access time constraint is less than 155%. For example, at the 125% access time point, scheme II shows 6% leakage dissipation of the baseline 512-KB cache and scheme I shows 13% leakage dissipation, a 2× difference. However, scheme I shows better leakage power reduction beyond a 155% normalized access time point.
Fig. 6 shows the normalized optimum leakage power and V_TH versus normalized access time trends for a 512-KB cache of scheme III. The optimum leakage power and the V_TH's are obtained using (7) and (8) of Section III-A. In Fig. 6, the V_TH of
the SRAM cell array, denoted as array in the graph, starts to increase first. This implies that the SRAM cell array is responsible for the most significant fraction of total cache leakage power, but it has the least impact on increasing the total cache access time. After the V_TH of the SRAM cell arrays is saturated to the maximum allowed point (0.5 V), the V_TH of the peripheral components, labeled as peri in the graph, is increased further to reduce leakage power in the peripheral components. However, this just increases the access time without much further cache leakage reduction. For example, the leakage power is not decreased over the 215% access time point where the V_TH for the peripheral circuit has not reached the maximum value (0.5 V) in this 512-KB cache case.

Fig. 6. Normalized optimum LP and V_TH versus normalized AT of 512-KB caches: schemes I and III.

These leakage power and V_TH versus access time trends also explain the leakage optimization results shown in Fig. 5: scheme II shows a better optimization result than scheme I does when the normalized access time is less than 155%, but it does not beyond the 155% access time point. Recall that scheme I assigns a high V_TH to all the cache components. It sacrifices more access time unnecessarily by increasing the V_TH of the peripheral components with little leakage reduction at the same access time. However, scheme II assigns the high V_TH to just the SRAM cell arrays, which are responsible for a greater fraction of total cache leakage power but affect access time less. However, scheme II cannot reduce leakage power beyond the 155% access time point, because the leakage power of the peripheral components, where a low V_TH is used, becomes substantial beyond this point.
Fig. 7 shows the normalized optimum leakage power and V_TH versus normalized access time trends for a 512-KB cache of scheme IV. The optimum leakage power and the V_TH's are obtained again using (7) and (8) of Section III-A. In scheme IV, we can assign up to four distinct V_TH's for leakage power optimization. According to the results shown in Fig. 7, the V_TH of the 6T-SRAM cell arrays starts to increase first, similar to the scheme III case. Among the peripheral components, the V_TH for the data bus starts to increase first. This implies that the data bus consisting of 128 b (the assumed bus width between the L2 and L1 caches) has the second most significant impact on the leakage power. Even though the address bus has the same structure, the number of bits in the address bus is much smaller than in the data bus. Hence, the leakage power impact of the address bus is much less than that of the data bus. However, in the case of smaller caches (e.g., 16–64 KB caches) where the data bus width is 32 b, both the data and address bus have almost the same impact on the leakage power. Therefore, the V_TH trends for both the data and address buses will be the same. These trends suggest the direction of optimizations that reduce cache leakage power.

Fig. 7. Normalized optimum LP and V_TH versus normalized AT: scheme IV.

TABLE IV
PERCENTAGE LEAKAGE POWER OF SCHEMES I–IV NORMALIZED TO LEAKAGE POWER OF EACH CACHE SIZE AT THE 100% AT POINT

Table IV summarizes the normalized cache leakage power of schemes I–IV. As expected, we can reduce more leakage power
while achieving the same access time by having more V_TH's to control. If the access time is fixed, the caches of schemes III and IV always show 38%–72% better leakage optimization results than those of scheme I. There are a few things we should note from this comparison study. First, as the target access time is increased to more than the 150% point in scheme II, caches dissipate more leakage power than those employing scheme I. This implies that the cache peripheral components consume nonnegligible leakage power. The leakage power of those components becomes substantial when we cut down the leakage power of the 6T-SRAM arrays significantly. Second, the slowest cache access time of scheme II ends around 150% in small-size caches. This means that the peripheral components also play important roles in both cache leakage power and access time. In other words, increasing the V_TH of 6T-SRAM cell arrays alone gives us diminishing returns at some point without reducing the leakage power further. This is why the caches of scheme I give even better results than those of scheme II as V_TH increases. Third, there is a negligible difference between caches of schemes III and IV in terms of leakage power reduction. This implies that scheme III employing two distinct high V_TH's (three V_TH's if we include a nominal or low V_TH for the processor) is enough to minimize leakage. Finally, as illustrated in Figs. 5–7 and Table IV, each cache shows a wide range of optimal leakage power consumption depending on target access time. Hence, the right tradeoff point between the leakage power and the access time of the caches will be determined by either system design specifications or constraints.

IV. LEAKAGE OPTIMIZATION TECHNIQUES FOR TWO-LEVEL CACHES

A. Methodology
In a processor memory system, the average memory access time (AMAT) [18] is a key metric for measuring the overall memory system performance. To evaluate the performance or AMAT, it is essential to examine the cache miss characteristics of realistic applications, because the performance or AMAT is a function of L1 and L2 cache miss rates and cache access times. In our study, we assume that the memory system hierarchy consists of separate L1 instruction and data caches with a unified L2 cache. Then, the average performance of the processor memory system can be measured or compared with the AMAT represented by

$$\mathrm{AMAT} = \mathrm{HitTime}_{L1} + \mathrm{MissRate}_{L1} \times \left(\mathrm{HitTime}_{L2} + \mathrm{MissRate}_{L2} \times \mathrm{MissPenalty}_{L2}\right) \qquad (9)$$

where HitTime_L1 and HitTime_L2 are the access times of the L1 and L2 caches, MissRate_L1 and MissRate_L2 are the miss rates of the L1 and L2 caches, and MissPenalty_L2 is the external memory access and data transfer time. Note that the local miss rate (footnote 5) is used as MissRate_L2.

Footnote 5: This rate is simply the number of misses in a cache divided by the total number of memory accesses to this cache.

Similarly, we measure the average memory access energy (AMAE) to compare the dynamic energy dissipation of each memory system configuration. Assuming that the L1 cache is accessed every cycle, the AMAE represents the average energy dissipation per access in the entire microprocessor memory system that includes L1, L2, and main memory. We can estimate the average memory access energy as follows:

(10)

where HitEnergy is the average energy dissipation per access given in Table V. We assume a two-channel 1066-MHz 256-MB RAMBUS DRAM RIMM whose sustained transfer rate is 4.2 GB/s [19] to derive the main memory access time and dynamic energy dissipation per access. Though the sustained transfer rate is quite high, we should also consider the RAS/CAS latency of the memory, which is about 20 ns. For the energy dissipation per access, we used the number given in [20], which is 3.57 nJ per access. The dynamic energy dissipation per access can vary depending on the number of RIMMs. We assume that one RIMM is installed. See Section IV-B and note that more RIMMs are favorable for our optimization technique, because our technique prefers a larger L2 cache to a smaller one for leakage power reduction. The larger L2 cache accesses DRAM less frequently than the smaller one, resulting in less energy consumption for accessing the external DRAM. Hence, if more RIMM modules are installed, implying more energy dissipation per DRAM access, a larger L2 cache will allow even more energy to be saved.

TABLE V
CACHE DYNAMIC ENERGY CONSUMPTION PER ACCESS AND LEAKAGE POWER DISSIPATION AT 70 °C DIE TEMPERATURE AND A TYPICAL CORNER FOR EACH CACHE SIZE

To obtain L1 and L2 cache miss rates, we use the SimpleScalar/Alpha 3.0 tool set [21], which is a suite of functional and timing simulation tools for the Alpha AXP ISA. In addition, we collected the results from all 25 of the SPEC2K benchmarks [15] to perform our evaluation. All SPEC programs were compiled for a Compaq Alpha AXP-21264 processor using the Compaq C and Fortran compilers under the OSF/1 V4.0 operating system using full compiler optimizations. We completed the execution for each benchmark application to get reliable L2 cache miss rates, because L2 cache accesses are far less frequent than
L1 cache accesses; an insufficient number of L2 accesses may result in unrepresentatively higher L2 cache miss rates.
Table VI shows the average L1 and L2 cache miss rates from the entire SPEC2K benchmarks for 16-, 32-, and 64-KB L1 caches, respectively. We used direct-mapped L1 instruction caches and four-way set associative L1 data caches. Also, we used eight-way set associative L2 caches. For simplicity, each L1 cache miss rate is obtained by taking the sum of the number of total instruction and data cache misses and dividing by the sum of total instruction and data cache accesses; a 16-KB L1 means instruction and data caches are each 16 KB in size. Since an L2 miss rate is a function of the L1 cache miss rate, we measure the separate L2 cache miss rates for each L1 cache size configuration. Those cache miss characteristics will definitely affect the leakage optimization direction of two-level cache memory systems.

TABLE VI
AVERAGE L1 AND L2 CACHE MISS RATES FROM THE ENTIRE SPEC2K BENCHMARKS

B. L2 Cache Leakage Power Optimization
Since an L2 cache's contribution to leakage power dominates due to its size, we will examine the leakage power optimization of the L2 cache first. Consider caches designed with low-V_TH (0.2 V) devices and a baseline cache memory system consisting of 16 and 128 KB for L1 and L2 caches, respectively. Then, we have leakage power consumption and AMAT corresponding to this configuration. Increasing the V_TH of the 128-KB L2 cache will reduce the leakage power of the L2 cache, but it will increase the AMAT of the cache memory system because of the increased access or hit time. However, there is a way to reduce the leakage power of the cache memory system without increasing the AMAT, which significantly impacts the execution time of the system.
The key to reducing leakage power without increasing AMAT is to compensate for the increased L2 access time by reducing the cache miss rate of the cache memory system. To reduce the miss rate, we can increase the L2 cache size. The main memory access penalty is quite significant in terms of both time and energy. Hence, even a slight reduction of L2 cache miss rates results in a significant improvement in the AMAT. We note that although area was one of the most important design constraints in the past, this trend is changing and power is becoming an equally important constraint in many situations [22]. In this argument, we assume that the same AMAT will approximately give us the same execution time for a fixed processor core, L1 cache size, and benchmark program, so that we can fairly compare the total leakage energy consumption as well.
Fig. 8 shows the leakage power versus AMAT of L2 caches with a fixed L1 cache size of 16 KB. The leakage power optimization for individual caches is based on scheme III, which requires two additional distinct high V_TH's for L2. Assuming the AMAT of the fastest 128-KB L2 cache designed with low V_TH (0.2 V) as a baseline, we compare the leakage power of other caches at the same AMAT point; see the (1) and (2) points in Fig. 8. The (1) and (2) points are the leakage power consumption of the cache system with the 256- and 512-KB caches at the same AMAT as the baseline 128-KB cache system. As can be seen from the plots, the AMAT can be maintained while the leakage power can be reduced by replacing the baseline 128-KB L2 cache with a 256-KB L2 cache that is intentionally slowed down by increasing its V_TH's to reduce leakage.

Fig. 8. L2 leakage power optimization at a fixed L1 size (16 KB). (1) and (2) are the leakage power consumption of the 256- and 512-KB caches at the same AMAT as the baseline 128-KB cache, respectively.

This replacement with the double-sized L2 cache reduces the leakage power by 70% compared to the fastest but leakiest 128-KB L2 cache with the same AMAT. Similarly, the use of a 512-KB L2 cache can further reduce leakage compared to the 256-KB cache; see the vertical line in Fig. 8.
Finally, the employment of larger L2 caches also reduces the average dynamic power of the memory system, because the larger L2 caches reduce the number of external memory accesses that consume a significant amount of dynamic energy. Table VII summarizes the results for the normalized leakage power and normalized average memory access energy for each L1 cache size designed using scheme III at a fixed AMAT. To compare leakage power and AMAE, the following standard cache configurations were used: 128-KB L2 with 16-KB L1, 256-KB L2 with 32-KB L1, and 512-KB L2 with 64-KB L1. The shaded numbers represent the baseline L2 configuration, leakage power, and AMAE. Table VII shows the counterintuitive result that we can reduce both leakage power and AMAE by employing larger L2 caches while maintaining the same AMAT.
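
The trade described in this subsection can be sketched numerically. The Python code below evaluates the standard two-level AMAT expression [18], computes how much slower a larger L2 cache may become before the AMAT exceeds the baseline, and then searches a V_TH grid for the lowest-leakage assignment that fits that budget. Every number (hit times, miss rates, model coefficients) is a hypothetical placeholder rather than a value from the tables, the two-V_TH exponential model is an assumed stand-in for scheme III, and an exhaustive grid search is swapped in for the numerical optimizer (e.g., fmincon) mentioned in Section III-A.

```python
import math
from itertools import product

def amat(hit_l1, miss_l1, hit_l2, local_miss_l2, miss_penalty):
    """Two-level average memory access time (times in ns, rates as fractions)."""
    return hit_l1 + miss_l1 * (hit_l2 + local_miss_l2 * miss_penalty)

def l2_hit_time_budget(target_amat, hit_l1, miss_l1, local_miss_l2, miss_penalty):
    """Largest L2 hit time that still meets target_amat (AMAT solved for hit_l2)."""
    return (target_amat - hit_l1) / miss_l1 - local_miss_l2 * miss_penalty

# Hypothetical operating point for the baseline (small, fast L2).
HIT_L1, MISS_L1, PENALTY = 1.0, 0.05, 100.0
base = amat(HIT_L1, MISS_L1, hit_l2=5.0, local_miss_l2=0.30, miss_penalty=PENALTY)

# A larger L2 is assumed to lower the local miss rate to 0.20, which frees
# hit-time slack that can be spent on higher V_TH's.
budget = l2_hit_time_budget(base, HIT_L1, MISS_L1, 0.20, PENALTY)

# Assumed exponential models for the larger L2: (coefficient, tau) per group.
LEAK = {"array": (400.0, 0.05), "peri": (60.0, 0.05)}   # decays with V_TH
DELAY = {"array": (1.5, 0.60), "peri": (3.0, 0.30)}     # grows with V_TH

def leakage_power(v_array, v_peri):
    """Total leakage for one V_TH on the cell array and one on the periphery."""
    (a1, t1), (a2, t2) = LEAK["array"], LEAK["peri"]
    return a1 * math.exp(-v_array / t1) + a2 * math.exp(-v_peri / t2)

def access_time(v_array, v_peri):
    """Total access time as the sum of the two component delays."""
    (b1, t1), (b2, t2) = DELAY["array"], DELAY["peri"]
    return b1 * math.exp(v_array / t1) + b2 * math.exp(v_peri / t2)

# V_TH grid: 0.20 V to 0.50 V in 0.05 V steps.
GRID = [round(0.20 + 0.05 * i, 2) for i in range(7)]

def best_assignment(time_budget):
    """Exhaustive search; returns (leakage, v_array, v_peri) or None if infeasible."""
    feasible = [(leakage_power(va, vp), va, vp)
                for va, vp in product(GRID, GRID)
                if access_time(va, vp) <= time_budget]
    return min(feasible) if feasible else None

best = best_assignment(budget)
```

With these placeholder models the search raises the array V_TH to the top of the grid before the periphery, because the array contributes most of the leakage but little of the delay; this mirrors the qualitative behavior reported for scheme III but says nothing about the actual magnitudes.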
TABLE VII
L2 CACHE NORMALIZED LEAKAGE AND AMAE AT THE FIXED L1 SIZE (16 KB) AND AMAT

TABLE VIII
L1 CACHE NORMALIZED LEAKAGE AND AMAE AT THE FIXED L2 SIZE (512 KB) AND AMAT

C. L1 Cache Leakage Power Optimization
It is rather difficult to improve the L1 cache miss rates further, because they are already very low for 16-, 32-, and 64-KB caches in the case when SPEC2K benchmarks are run. Hence, the access time of caches becomes a dominant factor in determining the AMAT. For example, the access time of a 64-KB L1 cache increases by 48% compared to the fastest 16-KB L1 cache, because the access time is very sensitive to size in small caches. Essentially, cache access time increases logarithmically with size, but has a steeper slope for smaller caches than for larger caches. This observation explains why the AMAT of a cache hierarchy with a smaller L1 cache can be faster than one with a larger L1 cache for a certain range of cache sizes (e.g., 16 or 64 KB).

Fig. 9. L1 leakage power optimization at a fixed L2 size (512 KB). (1) and (2) are the leakage power consumption of the 32- and 16-KB caches at the same AMAT as the baseline 64-KB cache, respectively.

Fig. 9 shows the leakage power versus the AMAT of 16-, 32-, and 64-KB L1 caches using scheme III, each with a fixed L2 cache of size 512 KB. Like the comparison performed in Section IV-B, the leakage power of different caches is compared at the same AMAT point. The plots show that leakage power can be reduced by replacing the fastest 64-KB L1 cache with a 32-KB L1 cache that is intentionally slowed down by increasing its V_TH's to reduce the leakage power; the resulting cache memory system still has the same AMAT. Similarly, a slowed 16-KB cache with increased V_TH's can replace a 32-KB cache without changing the AMAT of the L1/L2 hierarchy. The new system consumes much less leakage power; see points (1) and (2) in Fig. 9, which are the leakage power consumption of the cache system with the 32- and 16-KB caches at the same AMAT as the baseline cache system.
Table VIII shows the results for normalized leakage power and AMAE as a percentage of each fast but leaky L1 cache size using scheme III with fixed AMATs. The comparisons were performed in the same manner as Table VII. The shaded numbers represent the baseline L1 configuration, leakage power, and AMAE. According to the comparisons, we can reduce both leakage power and AMAE by employing smaller L1 caches. This is the inverse of the case for L2 caches, where the leakage of the overall memory system can be reduced by increasing their size. However, it should be noted that these results are only valid within the specific set of sizes and simulation environment given in this discussion. First, a 4-KB L1 cache will have a cache miss rate that is much higher than a 16-KB cache, but its access time will not be sufficiently smaller to make the tradeoff worthwhile. Also, the normalized AMAE is rather high because the total power fraction of L1 caches is relatively small compared to L2 caches. Second, many SPEC2K benchmark programs have very high locality compared to real-world larger size applications. This results in quite low cache miss rates for small-size L1 caches as shown in Table VI. Third, the operating system (OS) context switching was not modeled due to our limited simulation environment. The context switching typically increases cache miss rates, because cache flushing increases cold start misses. These factors must be considered if one is to perform realistic cache leakage power optimizations with the proposed techniques.

V. CONCLUSION
In this study, we examined the leakage power and access time tradeoff for caches where multiple V_TH's are allowed. We used curve fitting techniques to model subthreshold leakage power and access time. Our results show that two extra distinct high V_TH's for caches (three V_TH's including the V_TH for the microprocessor core logic) are sufficient to yield a significant reduction in leakage power. Such an arrangement can reduce the leakage
power by as much as 91%. We also show that smaller L1 and larger L2 caches than are typical in today's processors result in significant leakage and dynamic power reduction without affecting the average memory access time. Given that the processor core may need a distinct V_TH, and each of the caches may need up to two V_TH's (scheme III), we could require up to five distinct V_TH's for the leakage power optimization of two-level cache memory systems.
Even though the modeling and optimization techniques presented in this study have been performed using continuous-domain functions, the actual cache latencies are integer numbers of processor clock cycles. Cache designers or architects can choose an appropriate discrete point from the continuous-domain results depending on their target processor core clock frequency.
Furthermore, the circuit techniques combined with microarchitectural level controls exemplified by drowsy caches [10] are designed to reduce the leakage power of L1 caches when sacrificing access time is not an option. Such an approach is less attractive for L2 caches. The same effect can be obtained more simply by using high-V_TH circuits.

REFERENCES
[1] N. S. Kim et al., "Leakage current: Moore's law meets static power," IEEE Computer, vol. 36, no. 12, pp. 68–75, Dec. 2003.
[2] G. Sery, S. Borkar, and V. De, "Life is CMOS: Why chase life after?," in Proc. IEEE Design Automation Conf., 2002, pp. 78–83.
[3] S. Mutoh et al., "1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS," IEEE J. Solid-State Circuits, vol. 30, no. 8, pp. 847–854, Aug. 1995.
[4] T. Douseki, N. Shibata, and J. Yamada, "A 0.5–1 V MTCMOS/SIMOX SRAM macro with multi-Vth memory cells," in Proc. IEEE Int. SOI Conf., 2000, pp. 24–25.
[5] K. Nii et al., "A low power SRAM using auto-backgate-controlled MT-CMOS," in Proc. IEEE Int. Symp. Low Power Electronic Device, 1998, pp. 293–298.
[6] H. Mizuno et al., "An 18-µA standby current 1.8-V, 200-MHz microprocessor with self-substrate-biased data-retention mode," IEEE J. Solid-State Circuits, vol. 34, no. 11, pp. 1492–1500, Nov. 1999.
[7] F. Hamzaoglu et al., "Analysis of dual-V_T SRAM cells with full-swing single-ended bit line sensing for on-chip cache," IEEE Trans. Very Large Scale (VLSI) Syst., vol. 10, no. 2, pp. 91–95, Apr. 2002.
[8] M. Powell et al., "Gated-V_dd: A circuit technique to reduce leakage in deep-submicron cache memories," in Proc. IEEE Int. Symp. Lower Power Electronics & Design, 2000, pp. 90–95.
[9] A. Agarwal, L. Hai, and K. Roy, "A single-V_t low-leakage gated-ground cache for deep submicron," IEEE J. Solid-State Circuits, vol. 38, no. 2, pp. 319–328, Feb. 2003.
[10] N. S. Kim et al., "Drowsy instruction caches," in Proc. IEEE Int. Symp. Microarchitecture, 2002, pp. 219–230.
[11] S. Yang et al., "An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance I-caches," in Proc. IEEE Int. Symp. High-Performance Computer Architecture, 2001, pp. 147–157.
[12] S. Kaxiras et al., "Cache decay: Exploiting generational behavior to reduce cache leakage power," in Proc. IEEE Int. Symp. Computer Architecture, 2001, pp. 240–251.
[13] H. Zhou et al., "Adaptive mode-control: A static-power-efficient cache design," in Proc. IEEE Parallel Architecture and Compilation Tech., 2001, pp. 61–70.
[14] N. S. Kim et al., "Leakage power optimization techniques for ultra deep sub-micron multi-level caches," in Proc. IEEE Int. Conf. Computer Aided Design, 2003, pp. 627–632.
[15] Standard Performance Evaluation Corporation [Online]. Available: http://www.specbench.org
[16] S. Wilton et al., "An Enhanced Access and Cycle Time Model for On-Chip Caches," Western Res. Lab. Res. Rep. 93/5, 1993.
[17] K. Ghose and M. Kamble, "Reducing power in superscalar processor caches using subbanking, multiple line buffers and bit-line segmentation," in Proc. IEEE Int. Symp. Low Power Electronic and Design, 1999, pp. 70–75.
[18] J. Hennessy et al., Computer Architecture: A Quantitative Approach, 3rd ed. San Mateo, CA: Morgan Kaufmann, 2003, pp. 406–408.
[19] 800/1066 MHz RDRAM Advanced Information (2002). [Online]. Available: http://www.rambus.com
[20] V. Delaluz et al., "Compiler-directed array interleaving for reducing energy in multi-bank memories," in Proc. IEEE Asia South Pacific Design Automation Conf., 2002, pp. 288–293.
[21] T. Austin et al., "SimpleScalar: An infrastructure for computer system modeling," IEEE Computer, vol. 35, no. 2, pp. 59–67, Feb. 2002.
[22] T. Mudge, "Power: A first class design constraint," IEEE Computer, vol. 34, no. 4, pp. 52–57, Apr. 2001.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

Fault Tolerant Design for Analog and Digital Circuits

Dr. Anand Mohan¹, Akhilesh Pathak², Tarang Agarwal³, Trailokya Nath Sasamal⁴
¹,²,³,⁴Department of Electronics Engineering
Institute of Technology, BHU
Varanasi, India
Email: {pathak.akhilesh, agarwaltarang07, sasamal.trailokyanath}@gmail.com

Abstract: Reliability is important in many applications, and it has become practical to
achieve as the size and cost of chips have been reduced drastically. Reliability in
electronic circuits is achieved through fault tolerance, where the system itself is able
to tolerate the fault and mask the error. Fault tolerance in circuits is achieved by
various redundancy methods (hardware, software, information, and time), but these
methods differ for analog and digital systems. This paper therefore discusses the
important methods for making analog and digital circuits fault tolerant. Digital fault
tolerant design is explained with majority and minority voting, together with how faults
are injected into the circuits for testing using VHDL. Analog fault tolerant design is
explained with the help of fuzzification. The platform used for digital circuits is
Xilinx-12.4i (ISE) and for analog circuits is MATLAB.

Index Terms: Triple modular redundancy, Majority & Minority Voter, Fuzzification.

I. INTRODUCTION
During the lifetime of a system it is tested and diagnosed on numerous occasions. For
the system to perform its intended mission with high availability, testing and diagnosis
must be quick and effective. A sensible way to ensure this is to specify testing as one
of the system functions, in other words, self-test. Reliability, availability, and safety
(RAS) are the major factors for consideration in system design to provide continuous
correct operation [1]. Since faults cannot be completely eliminated, critical systems
always employ fault tolerance techniques to guarantee high reliability and availability.
Fault tolerance (FT) techniques try to keep the system operational despite the presence
of faults [2]. FT can be achieved by hiding the occurrence of faults and preventing them
from generating errors (fault masking), or through fault detection and fault repair.

There are various methods to make a system fault tolerant, but the most basic is the TMR
method, where the module that has to be made reliable is made redundant by taking three
identical modules in parallel, in both hardware and software. The reliability of the
system increases because it can give the right output even when one module fails.

The basic block diagram of the TMR system is shown in figure 1:

[Figure 1: block diagram of the TMR system]

Here the most important part is the voting unit, which plays an important role in the
reliability of the system. Because the results in analog and digital systems are
different, the voting unit plays a distinct part in each. Another point is that the
voting unit is not redundant here, so what happens if it fails? These questions are the
subject of this paper.

The paper is organized as follows. Section II gives a short review of the most common
fault tolerant technique, with the mathematical expression showing how reliability is
increased, as this is the basic method for both digital and analog systems. Section III
describes fault tolerance in the digital circuit environment and how faults are injected
into FPGA circuits for testing. Section IV discusses the fault tolerant technique for
analog systems with the help of fuzzy logic. The results for both analog and digital
circuits are discussed in Section V. Finally, future work and scope are explained in
Section VI.

II. TRIPLE MODULAR REDUNDANCY
The basic block diagram of the TMR system is shown above. Let the reliability of a single
module be $R_m$. The TMR system gives the correct output if either two or all three
modules operate correctly, so if the reliability of the system is $R_S$, then

\[ R_S = \binom{3}{2} R_m^2 (1 - R_m) + \binom{3}{3} R_m^3 (1 - R_m)^0 = 3R_m^2 - 2R_m^3 \]
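As a quick numerical check of the TMR expression, the short Python sketch below evaluates $3R_m^2 - 2R_m^3$ for a few module reliabilities. The function names are illustrative only, and `tmr_with_voter` assumes the usual series model in which a single non-redundant voter of reliability `r_v` must also work; neither is taken from the paper's implementation.

```python
def tmr_reliability(r_m: float) -> float:
    """Probability that at least two of three independent modules work."""
    return 3 * r_m**2 - 2 * r_m**3


def tmr_with_voter(r_m: float, r_v: float) -> float:
    """Assumed series model: the single voter must work as well."""
    return r_v * tmr_reliability(r_m)


if __name__ == "__main__":
    for r_m in (0.4, 0.5, 0.9):
        # TMR only helps when the single-module reliability exceeds 0.5.
        print(r_m, round(tmr_reliability(r_m), 4))
```

For a module reliability of 0.4 the TMR value falls below 0.4, at 0.5 the two are equal, and at 0.9 TMR gives about 0.972.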


The reliability of the above system is greater than that of a single module if
$R_S > R_m$, that is, if $3R_m^2 - 2R_m^3 > R_m$.

So the reliability of the overall system is greater than that of the single module if
$R_m > 0.5$. There is a single voter unit in the above circuit, so if this voter unit
fails the complete circuit fails; it is therefore important to consider the reliability
of the voting unit as well.
Taking the reliability of the voter unit into account, the reliability of triplicated
TMR will be greater than that of the TMR system if:

[conditions not recoverable from the source; the surviving fragment reads "2 ... > 3 - 2 ..."]

These are the mathematical conditions for a triplicated system to be more reliable than
the TMR system.

III. FAULT TOLERANT DESIGN IN DIGITAL CIRCUITS

[Figure 2: truth table and circuit diagram of the basic TMR system]

The truth table and circuit diagram above (figure 2) show the basic TMR system. Here the
voting unit is not redundant, so if it fails the circuit cannot give the correct output.
To make the circuit more reliable, triplicated TMR with minority voting is used; its
basic circuit diagram is shown below:

[Figure 3: triplicated TMR with minority voting]

The method of using TMR with only one majority voter circuit is still flawed, because an
SEU (single event upset) can affect not only the redundant modules but also the voting
circuit itself. To alleviate this issue the majority voting circuit must also be
redundant. These redundant majority voters must be compared using minority voter
circuits (figure 3). The minority voters also take in three inputs: the primary path and
the two other redundant paths in question. If the primary path is in the majority with
one other redundant path, the output is low. If the primary path is in the minority in
comparison with the two other redundant paths, the output is high. Figure 4 below is a
schematic and truth table of the minority voting circuit.

[Figure 4: schematic and truth table of the minority voting circuit]

The minority voter output is fed into the control signal of a tri-state buffer with an
inverted control input. If the path in question is in the minority, the tri-state buffer
is placed into high impedance. If the path in question is in the majority, its
corresponding tri-state buffer allows the path to pass through to the output. The three
outputs are connected together outside the FPGA in a wired-OR fashion. Figure 3 shows the
minority voters controlling the tri-state buffers, which feed the external wired-OR gate.

IV. FAULT TOLERANT DESIGN IN ANALOG CIRCUITS
Voting on the results of redundant modules with discrete values is straightforward, and
is referred to as exact voting. The 3-input exact majority voter, for example, produces a
correct output when 2 out of 3 of its inputs are equal. However, exact voting on the
results of redundant modules with real number outputs is not appropriate. For data
derived directly from noisy sources, for outputs that are read by digital computers, for
the output of replicated remote sensors in fault tolerant data acquisition systems, or
for the output of diversely implemented software programs that handle floating point
arithmetic, an exact match is generally impossible. So for analog signals an exact match
of results from redundant modules cannot be expected. Various solutions have been
proposed. The most common is the median-selector algorithm, which selects the middle
value of the voter inputs and uses this value directly as the voter output. Another
solution for handling approximately equal redundant values is the use of inexact
(threshold) voters. In this technique, if the difference between the outputs of two
modules is less than a threshold value they are in agreement, otherwise they are in
disagreement. To make this more reliable, a dynamic threshold method is used in which
the threshold value is not fixed but varies according to the module outputs; fuzzy
voting comes in this category.
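The two inexact voting schemes mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's MATLAB implementation: the function names are hypothetical, and the rule that a threshold voter outputs the mean of the first agreeing pair is an assumption made for the example.

```python
from itertools import combinations
from typing import Optional, Sequence


def median_voter(values: Sequence[float]) -> float:
    """Median selector: use the middle value of the inputs as the output."""
    return sorted(values)[len(values) // 2]


def threshold_voter(values: Sequence[float], threshold: float) -> Optional[float]:
    """Inexact voter: two inputs agree if they differ by less than threshold."""
    for a, b in combinations(values, 2):
        if abs(a - b) < threshold:
            return (a + b) / 2  # assumed output rule: mean of the agreeing pair
    return None  # no pair agrees, so no output is produced


if __name__ == "__main__":
    readings = [5.02, 4.98, 9.7]  # third module is faulty
    print(median_voter(readings))
    print(threshold_voter(readings, 0.1))
```

With a fixed threshold, the voter returns no output as soon as the error on two modules exceeds the threshold, which is the limitation that dynamic-threshold and fuzzy voters try to relax.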
A. Fuzzy voter
The fuzzy voter described here uses fuzzy logic to calculate the weights of the modules
and the final output [10].
The basic block diagram of fuzzy voting is shown in figure 5:

[Figure 5: block diagram of the fuzzy voter]

The final output y is calculated on the basis of the weights of the voter inputs as:

[equation not reproduced in the source text]

Here the weights lie in the range [0, 1], where 0 means that the particular module is in
complete disagreement with the other modules and 1 means that the module is in complete
agreement with the other modules. The membership of the difference of input pairs [8] is
defined as:

[membership function plot not reproduced in the source text]

The membership of the output w is defined as:

[membership function plot not reproduced in the source text]

So the weight is calculated with the fuzzy rules as:

[fuzzy rule table not reproduced in the source text]

This fuzzy voting mechanism shows better availability and safety than previous methods
for small and medium errors, but it is not very effective for large errors. If, instead
of taking constant parameters [p, q, and r], we make them variable, the system will be
able to show better performance even with larger errors.

V. RESULTS
The results for digital circuits are as follows:

Implementation Results:
The basic circuit used to describe reliability is an ALU; a single ALU module, a triple
module (TMR) ALU and a triplicated TMR ALU have been implemented. The tables below show
how much area is utilized on the FPGA board in terms of slices/LUTs.
The faults are injected into the circuit by adding an extra component to the actual
circuit so that the logic of the circuit is changed; this is known as the SABOTEUR
METHOD.

Circuit Implementation without TMR (XUPV5-LX110T, Speed Grade -3):

Resource                      Used    Available    Utilization
Number of Slice LUTs            46       69,120             1%
Number of BUFG/BUFGCTRLs         1           32             3%
Number of occupied Slices       22       17,280             1%
Number of bonded IOBs           53          640             8%

Circuit Implementation with TMR (XUPV5-LX110T, Speed Grade -3):

Resource                      Used    Available    Utilization
Number of Slice LUTs            49       69,120             1%
Number of BUFG/BUFGCTRLs         1           32             3%
Number of occupied Slices       27       17,280             1%
Number of bonded IOBs           53          640             8%
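The voting logic of Section III and the saboteur idea can be modelled at the bit level as a rough sketch. Everything below is illustrative: the function names are hypothetical, the saboteur is modelled as a simple bit flip, and a tri-stated (high-impedance) line is assumed to contribute 0 to the wired-OR. The actual designs are written in VHDL and are not reproduced here.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-out-of-3 majority voter."""
    return (a & b) | (b & c) | (a & c)


def minority(primary: int, r1: int, r2: int) -> int:
    """High (1) when the primary path disagrees with both redundant paths."""
    return primary ^ majority(primary, r1, r2)


def saboteur(signal: int, inject_fault: int) -> int:
    """Assumed saboteur model: flips the signal when a fault is injected."""
    return signal ^ inject_fault


def triplicated_output(v1: int, v2: int, v3: int) -> int:
    """Gate each voter output with its minority voter, then wired-OR."""
    paths = (v1, v2, v3)
    driven = []
    for i, value in enumerate(paths):
        others = [paths[j] for j in range(3) if j != i]
        if not minority(value, *others):  # the minority path is tri-stated
            driven.append(value)
    return int(any(driven))


if __name__ == "__main__":
    # Inject a fault into the second voter while the correct value is 1.
    print(triplicated_output(1, saboteur(1, 1), 1))
```

In this model a single faulty voter is always in the minority, so its buffer is disabled and the remaining two paths drive the output.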


Circuit Implementation with Triplicated TMR (XUPV5-LX110T, Speed Grade -3):

Resource                      Used    Available    Utilization
Number of Slice LUTs            49       69,120             1%
Number of BUFG/BUFGCTRLs         1           32             3%
Number of occupied Slices       28       17,280             1%
Number of bonded IOBs          107          640            16%

Comparison of maximum path delay (XUPV5-LX110T, Speed Grade -3):

                            Without TMR    With TMR    With Triplicated TMR
Utilized area (slices)               22          27                      28
Maximum path delay (ns)           5.143       5.618                   5.150

The results for analog circuits are as follows:
A comparison of the results of the existing fuzzy voter and the improved fuzzy voter is
shown here, with performance defined by the basic formula:

Performance = (1 - [term missing in the source text])

The first graph is plotted for a correct output of 1 with errors injected in the TMR
modules; the maximum allowed error was 0.1, and errors were injected in the modules in
the range [-0.5, 0.5].

[Graph 1: performance of the existing and improved fuzzy voters, correct output 1]

The second graph is plotted for a correct output of 5 and a maximum allowed error of
0.5; errors were injected in the modules in the range [-1, +1]:

[Graph 2: performance of the existing and improved fuzzy voters, correct output 5]

Both graphs show that the improved fuzzy logic gives better results than the existing
fuzzy logic, even in the presence of larger errors, as shown in the second graph.

VI. FUTURE WORK
The demand for reliability is increasing day by day, even in less critical systems. In
future, a survivability approach will dominate as a way to make a system more reliable:
even if the system fails, the critical part should not go down. So the next step for
digital circuits in this project will be survivability, while for analog circuits the
concepts of fuzzy logic and genetic algorithms may come together to make a system more
reliable.

REFERENCES
[1] “Two Flows for Partial Reconfiguration: Module Based or Difference Based,” Xilinx.
[2] J. C. Baraza, J. Gracia, D. Gil, and P. J. Gil, “A prototype of a VHDL-based fault
injection tool: description and application.”
[3] Tobias Becker, Wayne Luk, and Peter Y. K. Cheung, “Enhancing Relocatability of
Partial Bitstreams for Run-Time Reconfiguration.”
[4] F. Lima, C. Carmichael, J. Fabula, R. Padovani, and R. Reis, “A Fault Injection
Analysis of Virtex FPGA TMR Design Methodology.”
[5] C. Carmichael, “Triple Modular Redundancy Design Techniques for Virtex FPGAs,”
Xilinx, XAPP197 (v1.0), 2001.
[6] Khaled Elshafey and Ahmed Elhosiny, “On-line testing and diagnosis of
microcontrollers.”
[7] Fabian Vargas, Alexandre, Amory Raoul, “Estimating Circuit Fault-Tolerance by Means
of Transient-Fault Injection in VHDL.”
[8] Timothy J. Ross, “Fuzzy Logic with Engineering Applications.”
[9] George J. Klir and Bo Yuan, “Fuzzy Sets and Fuzzy Logic: Theory and Applications.”
[10] G. Latif-Shabgahi and A. J. Hirst, “A fuzzy voting scheme for hardware and software
fault tolerant systems,” Fuzzy Sets and Systems, vol. 150, 2005, pp. 579–598.
[11] “Fuzzy logic tutorial,” MATLAB.

CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

Floating Point Arithmetic Operations Using VHDL

S. C. Yadav, S. S. Chauhan¹, A. R. Khan²
¹,²Electronics & Communication Engg.
Graphic Era University
566/6 Bell Road, Dehradun (India)
subhash.yadav775@gmail.com, sudakarnith@gmail.com, abdulamu@gmail.com

Abstract-
In this paper an asynchronous programmable chip capable of performing floating point
arithmetic operations has been designed. An asynchronous chip is one in which the
operations performed are not clock dependent and hence are faster. The developed chip is
operated by loading the proper values into the control and status registers. The result
is obtained by reading the result register.

I. INTRODUCTION
A OBJECTIVE
The objective is to design an asynchronous programmable chip capable of performing
floating point arithmetic operations based on the IEEE 754-1985 standard.
The complete design of the chip consists of the individual modules developed for
floating point addition/subtraction, multiplication and division.
The language of choice is VHDL.

B FLOATING POINT
The floating point system was developed to provide high resolution over a large dynamic
range. A floating point system can often provide a solution when a fixed point system,
with its limited precision and dynamic range, fails. Floating point systems comply with
the published single or double precision IEEE floating point standard.
There are basically two types of IEEE floating point representation:
(1) Single Precision
(2) Double Precision

Single Precision
The IEEE single precision floating point standard representation requires a 32 bit word,
whose bits may be numbered from 31 down to 0, left to right, as in the layout below. The
first bit is the sign bit, 'S', the next eight bits are the exponent bits, 'E', and the
final 23 bits are the fraction 'F':

S  EEEEEEEE  FFFFFFFFFFFFFFFFFFFFFFF
31 30    23  22                    0

A standard floating point word consists of
(1) Sign Bit (s)
(2) Exponent (e)
(3) Normalized Mantissa (m)

1) Sign Bit
The sign bit is as simple as it gets. 0 denotes a positive number; 1 denotes a negative
number. Flipping the value of this bit flips the sign of the number.

2) Exponent
The exponent field needs to represent both positive and negative exponents. To do this,
a bias is added to the actual exponent in order to get the stored exponent. For IEEE
single-precision floating point, this value is 127. Thus, an exponent of zero means that
127 is stored in the exponent field. A stored value of 200 indicates an exponent of
(200-127), or 73. Exponents of -127 (all 0s) and +128 (all 1s) are reserved for special
numbers, as described under Special Values. For double precision, the exponent field is
11 bits and has a bias of 1023.

3) Mantissa
The mantissa, also known as the significand, represents the precision bits of the
number. It is composed of an implicit leading bit and the fraction bits.
To find the value of the implicit leading bit, consider that any number can be
expressed in scientific notation in many different ways.
In order to maximize the quantity of representable numbers, floating-point numbers are
typically stored in normalized form. This basically puts the radix point after the first
non-zero digit. In normalized form, five is represented as 5.0 × 10^0.
A nice little optimization is available to us in base two, since the only possible
non-zero digit is 1. Thus, we can just assume a leading digit of 1, and don't need to
represent it explicitly. As a result, the mantissa has effectively 24 bits of
resolution, by way of 23 fraction bits.

Special Values
IEEE reserves exponent field values of all 0s and all 1s to denote special values in the
floating-point scheme.

i) Zero
Zero is not directly representable in the straight format, due to the assumption of a
leading 1 (we would need to specify a true zero mantissa to yield a value of zero). Zero
is a special value denoted with an exponent field of zero and a fraction field of zero.
Note that -0 and +0 are distinct values, though they both compare as equal.
In particular,
0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = -0

ii) Denormalized
If the exponent is all 0s, but the fraction is non-zero (else it would be interpreted as
zero), then the value is a denormalized number, which does not have an assumed leading 1
before the binary point. Thus, this

represents a number (-1)^s × 0.f × 2^(-126), where s is the sign bit and f is the
fraction. For double precision, denormalized numbers are of the form
(-1)^s × 0.f × 2^(-1022). From this you can interpret zero as a special type of
denormalized number.

iii) Infinity
The values +∞ and -∞ are denoted with an exponent of all 1s and a fraction of all 0s.
The sign bit distinguishes between negative infinity and positive infinity. Being able
to denote infinity as a specific value is useful because it allows operations to
continue past overflow situations. Operations with infinite values are well defined in
IEEE floating point.
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = -Infinity

iv) Not A Number
The value NaN (Not a Number) is used to represent a value that does not represent a real
number. NaNs are represented by a bit pattern with an exponent of all 1s and a non-zero
fraction.
0 11111111 00000100000000000000000 = NaN
1 11111111 00100010001001010101010 = NaN
There are two categories of NaN: QNaN (Quiet NaN) and SNaN (Signalling NaN).
a) QNaN is a NaN with the most significant fraction bit set. QNaNs propagate freely
   through most arithmetic operations. These values pop out of an operation when the
   result is not mathematically defined.
b) SNaN is a NaN with the most significant fraction bit clear. It is used to signal an
   exception when used in operations. SNaNs can be handy to assign to uninitialized
   variables to trap premature usage.
Semantically, QNaNs denote indeterminate operations, while SNaNs denote invalid
operations.

Summary:
The value V represented by the word may be determined as follows:
If E=255 and F is nonzero, then V = NaN ("Not a number")
If E=255 and F is zero and S is 1, then V = -Infinity
If E=255 and F is zero and S is 0, then V = Infinity
If 0 < E < 255, then V = (-1)^S * 2^(E-127) * (1.F), where "1.F" is intended to
represent the binary number created by prefixing F with an implicit leading 1 and a
binary point.
If E=0 and F is nonzero, then V = (-1)^S * 2^(-126) * (0.F). These are "unnormalized"
values.
If E=0 and F is zero and S is 1, then V = -0
If E=0 and F is zero and S is 0, then V = 0
In particular,
0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = -0
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = -Infinity
0 11111111 00000100000000000000000 = NaN
1 11111111 00100010001001010101010 = NaN
0 10000000 00000000000000000000000 = +1 * 2^(128-127) * 1.0 = 2
0 00000001 00000000000000000000000 = +1 * 2^(1-127) * 1.0 = 2^(-126)
0 00000000 10000000000000000000000 = +1 * 2^(-126) * 0.1 = 2^(-127)
0 00000000 00000000000000000000001 = +1 * 2^(-126) * 0.00000000000000000000001
                                   = 2^(-149) (smallest positive value)

Special Operations
Operations on special numbers are well defined by IEEE. In the simplest case, any
operation with a NaN yields a NaN result. Other operations are as follows:

Table 1
Special Operations in floating point

Operation                  Result
n ÷ ±Infinity              0
±Infinity × ±Infinity      ±Infinity
±nonzero ÷ 0               ±Infinity
Infinity + Infinity        Infinity
±0 ÷ ±0                    NaN
Infinity – Infinity        NaN
±Infinity ÷ ±Infinity      NaN
±Infinity × 0              NaN

Double Precision
The IEEE double precision floating point standard representation requires a 64 bit word,
whose bits may be numbered from 63 down to 0, left to right. The first bit is the sign
bit, 'S', the next eleven bits are the exponent bits, 'E', and the final 52 bits are the
fraction 'F'.

The value V represented by the word may be determined as follows:
If E=2047 and F is nonzero, then V = NaN ("Not a number")
If E=2047 and F is zero and S is 1, then V = -Infinity


If E=2047 and F is zero and S is 0, then V = Infinity
If 0 < E < 2047, then V = (-1)^S * 2^(E-1023) * (1.F), where "1.F" is intended to
represent the binary number created by prefixing F with an implicit leading 1 and a
binary point.
If E=0 and F is nonzero, then V = (-1)^S * 2^(-1022) * (0.F). These are "unnormalized"
values.
If E=0 and F is zero and S is 1, then V = -0
If E=0 and F is zero and S is 0, then V = 0

II WORKING PRINCIPLE
A MY CHIP
The chip consists of 2 unidirectional buses, each 32 bits wide, to accommodate the input
and the output. It has a 2 bit address bus for selecting the desired register in the
chip.
Signal description:
1) r/w (read/write) signal to perform the read or write operation. A high indicates a
   read operation and a low indicates a write operation.
2) rst (Reset) signal to reset the chip contents.
3) Int (Interrupt) signal to interrupt the processor about some abnormality in the
   functioning of the chip.

Fig. 1 Block Diagram of My Chip

Control Register:

IE  X  X  Mode  X  Op2  Op1  Op0
Fig. 2 Control Register format.

The flags of the control register are defined as:
IE stands for Interrupt Enable. When this flag is low (0) no interrupt is generated, and
when this flag is high (1) an interrupt is generated under certain conditions.
Mode flag is low when the chip is used for signed operation and high when it is used
for floating point operation.
X represents a don't care condition.
The operation to be performed by the chip is selected using the last three bits of the
control register.

Table 2
Opcodes for various operations.

Op2  Op1  Op0  Operation selected
0    0    0    Addition
0    0    1    Subtraction
0    1    0    Multiplication
0    1    1    Division

Status Register

F1F  F2F  RF  NAN  OF  UF  DE  Z
Fig. 3 Status Register Format

The flags of the status register are defined as:
F1F flag is high when operand 1 is loaded on the chip.
F2F flag is high when operand 2 is loaded on the chip.
RF flag, when high, indicates the completion of the selected operation by the chip.
NAN flag is high when the content of the result register is wrong, i.e. a NaN (not a
number) condition has been encountered.
OF flag is high when the content of the result register exceeds the upper bound limit.
UF flag is high when the content of the result register crosses the lower bound limit or
when a denormalized number is encountered.
DE flag is high when a division by zero (0) error occurs.
Z flag is high when the result of the operation is zero.

Register Mapping

Table 3
Access codes for registers in my_chip module

Read/write  Address Bus  Register
X           00           F1
X           01           F2
1           10           RES
0           10           Control Register
X           11           Status Register

When the address bus is loaded with 00, register F1 is port mapped for a read or write
operation. Register F2 is mapped using address bus code 01.
The address bus has been optimized for code 10, where the RES register is mapped only for
read operations and the control register only for write operations.
The status register is port mapped to address bus code 11.

B FLOATING POINT ARITHMETIC OPERATION
B.1 Addition & Subtraction:
Addition and subtraction are performed using module fp_ads. The steps to perform the
addition and subtraction operations are:


Step 1: Check which exponent is bigger and shift the mantissa of the smaller number
until the difference between the two exponents is made up. If the exponents are equal,
the mantissas of both numbers are checked for the bigger one.
Step 2: Add the exponents of the two numbers if the sign bits of both numbers are the
same; otherwise subtract them. The same operation is performed on the mantissas of the
input operands.
Step 3: The abnormality of negative exponents is resolved by shifting the required
number of bits to get the correct result. Boundary conditions are checked to see
whether the result has encountered an overflow error.

B.2 Multiplication
Multiplication of floating point numbers is done using module fp_mul. The steps for the
multiplication operation are as follows:
Step 1: When we multiply two numbers having the same base, their powers are added.
Similarly, here we add the exponents of the two operands.
Step 2: The Booth multiplication (shift and add) technique is employed to multiply the
mantissas of the two numbers along with the 'hidden bit'. The mantissa multiplication
result is saved in a 49 bit temporary register.
Step 3: The negative exponent abnormality is removed to get the mantissa and exponent of
the resultant number.

B.3 Division
The division operation is performed using module fp_div, utilizing a fixed point
division technique. The steps to divide p by q, both of n+1 bits, are as follows:
Step 1: Store the numbers p and q in temporary registers p_temp and q_temp of 2n+1 bits
each, respectively.
Step 2: Compare the values of p_temp and q_temp.
If p_temp > q_temp, subtract q_temp from p_temp, store 1 in the quotient register and
move to the next iteration.
If p_temp < q_temp, store 0 in the quotient register and move to the next iteration.
Step 3: After n+1 iterations the quotient is saved in the quotient register and the
remainder is saved in p_temp.

There are three components used in this design:
i) fp_ads, used for the floating point addition and subtraction operations.
ii) fp_mul, used for the floating point multiplication operation.
iii) fp_div, used for the floating point division operation.

III PROGRAMMING THE CHIP
Chip programming consists of a series of steps which must be followed for the efficient
functioning of the chip:
Step 1: The chip is made available for floating point arithmetic operations by making
the rst (reset) signal low. At this point, all the contents of the chip registers are
erased and the chip is ready afresh for a new calculation.
Step 2: To load the first operand onto the chip, register mapping is required: the
read/write signal is made low and the address bus is loaded with 00.
Step 3: To load the second operand onto the chip, the read/write signal is kept low
while the address bus is changed to 01 for the required register mapping.
Step 4: To select the operation to be performed, the last three bits of the control
register are taken into account while the address bus indicates 11 and the read/write
signal is low. Refer to Table 2 for the opcodes of the various operations.
Step 5: A start signal is generated by checking the F1F and F2F flags of the status
register to commence the selected operation while the address bus shows 10 and the
read/write signal is low.
Step 6: Completion of the operation is confirmed by checking the RF flag of the status
register, which should be high for successful completion, while the read/write signal is
high and the address bus indicates 10.
Step 7: The result of the arithmetic operation is viewed by checking the dataout signal
while the read/write signal is high and the address bus indicates 11.
Step 8: The previously entered input values can be viewed by keeping the read/write
signal high while keeping the address bus at 00 for operand 1 and 01 for operand 2.

IV RESULTS AND DISCUSSIONS
A ADDITION

Fig. 4: Floating Point Addition Simulation
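The align, add and normalize flow of Section II B.1 can be tried out with a small integer model. This is a sketch under stated assumptions, not the fp_ads module: it handles only positive normalized single-precision operands, it drops (truncates) the bits shifted out rather than rounding, and the function name is hypothetical. With those assumptions, the operands of Table 5 give the HEX result listed there.

```python
def fp_add_truncating(a: int, b: int) -> int:
    """Add two positive, normalized IEEE single-precision bit patterns."""
    exp_a, exp_b = (a >> 23) & 0xFF, (b >> 23) & 0xFF
    man_a = (a & 0x7FFFFF) | 0x800000  # restore the hidden leading 1
    man_b = (b & 0x7FFFFF) | 0x800000
    if exp_a < exp_b:  # make operand a the one with the larger exponent
        exp_a, exp_b, man_a, man_b = exp_b, exp_a, man_b, man_a
    man_b >>= exp_a - exp_b  # align the smaller operand
    total = man_a + man_b
    if total >> 24:  # carry out of 24 bits: renormalize
        total >>= 1  # assumed truncation of the dropped bit
        exp_a += 1
    return (exp_a << 23) | (total & 0x7FFFFF)


if __name__ == "__main__":
    result = fp_add_truncating(0x458AE385, 0x45AD9C7A)
    print(f"{result:08X}")  # prints 461C3FFF
```

The exact sum of the two stored operands lies halfway between 461C3FFF and 461C4000, so a round-to-nearest-even adder would return 461C4000 (exactly 10000) instead.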


TABLE 5
SIMULATION EXAMPLE FOR FP ADDITION

      Base 10   Sign Bit   Exponent Bits   Mantissa Bits                  HEX Equivalent
F1    4444.44   0          1000 1011       0001 0101 1100 0111 0000 101   458AE385
F2    5555.56   0          1000 1011       0101 1011 0011 1000 1111 010   45AD9C7A
RES   10000     0          1000 1100       0011 1000 1000 0000 0000 000   461C3FFF
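To read the HEX column of the result tables, the single-precision Summary rules of Section I can be written as a small decoder. This is an illustrative Python sketch (the function name is hypothetical); it follows the rules for NaN, infinity, normalized and unnormalized values listed there.

```python
import math


def decode_single(word: int) -> float:
    """Decode a 32-bit IEEE single-precision pattern using the Summary rules."""
    sign = (word >> 31) & 1
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if exponent == 255:
        if fraction != 0:
            return math.nan
        return -math.inf if sign else math.inf
    if exponent == 0:
        # Unnormalized value (or zero): no implicit leading 1.
        value = (fraction / 2**23) * 2.0**-126
    else:
        value = (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
    return -value if sign else value


if __name__ == "__main__":
    for name, word in (("F1", 0x458AE385), ("F2", 0x45AD9C7A), ("RES", 0x461C3FFF)):
        print(name, decode_single(word))
```

Decoding shows that the stored operands are the nearest representable neighbours of the decimal inputs, for example 4444.43994140625 for F1, rather than the decimal values themselves.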
B SUBTRACTION

Fig. 5: Floating Point Subtraction Simulation

TABLE 6
SIMULATION EXAMPLE FOR FP SUBTRACTION

      Base 10   Sign Bit   Exponent Bits   Mantissa Bits                  HEX Equivalent
F1    85.73     0          1000 0101       0101 0111 0000 1010 0011 111   42AB8517
F2    49.96     1          1000 0100       1000 1111 1010 1110 0001 010   C247D70A
RES   35.80     0          1000 1100       0001 1110 0110 0110 0110 011   420F3334

C MULTIPLICATION

Fig. 6: Floating Point Multiplication Simulation

TABLE 7
SIMULATION EXAMPLE FOR FP MULTIPLICATION

      Base 10      Sign Bit   Exponent Bits   Mantissa Bits                  HEX Equivalent
F1    148.75       0          1000 0110       0010 1001 1000 0000 0000 000   4314C000
F2    1092.86      0          1000 1001       0001 0001 0011 0111 0000 101   44889B85
RES   162562.925   0          1001 0000       0011 1101 1000 0001 0111 011   481EC0BB

D DIVISION

Fig. 7: Floating Point Division Simulation
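The compare-and-subtract division of Section II B.3 is close to the classic restoring (shift and subtract) integer division, which can be sketched as follows. This is an illustration rather than the fp_div module: the bit-by-bit shifting and the use of "greater than or equal" in the comparison are assumptions added to make the iteration complete, and the function name is hypothetical.

```python
from typing import Tuple


def restoring_divide(p: int, q: int, n_bits: int) -> Tuple[int, int]:
    """Divide p by q (q > 0, p < 2**n_bits) one quotient bit per iteration."""
    remainder = 0
    quotient = 0
    for i in range(n_bits - 1, -1, -1):
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((p >> i) & 1)
        if remainder >= q:  # assumed comparison; the text states only ">"
            remainder -= q
            quotient = (quotient << 1) | 1
        else:
            quotient <<= 1
    return quotient, remainder


if __name__ == "__main__":
    print(restoring_divide(100, 7, 8))  # prints (14, 2)
```

For floating point division the same loop would be applied to the two mantissas, with the exponents subtracted separately.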

Fig.8. Simulation of My_chip
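The multiplication steps of Section II B.2 can be cross-checked against Table 7 with a short integer model. This is a sketch under stated assumptions and not the fp_mul module: Python's built-in integer multiply stands in for the Booth (shift and add) multiplier, low-order product bits are truncated, only normalized operands are handled, and the function name is hypothetical.

```python
def fp_mul_truncating(a: int, b: int) -> int:
    """Multiply two normalized IEEE single-precision bit patterns."""
    sign = ((a ^ b) >> 31) & 1
    # Add the biased exponents and remove one copy of the bias.
    exponent = ((a >> 23) & 0xFF) + ((b >> 23) & 0xFF) - 127
    man_a = (a & 0x7FFFFF) | 0x800000  # include the hidden bit
    man_b = (b & 0x7FFFFF) | 0x800000
    product = man_a * man_b  # up to 48 bits
    if product >> 47:  # product in [2, 4): renormalize
        product >>= 1
        exponent += 1
    mantissa = product >> 23  # assumed truncation to 24 bits
    return (sign << 31) | (exponent << 23) | (mantissa & 0x7FFFFF)


if __name__ == "__main__":
    result = fp_mul_truncating(0x4314C000, 0x44889B85)
    print(f"{result:08X}")  # prints 481EC0BB
```

With the operands of Table 7 the model returns 481EC0BB, the HEX equivalent listed for RES.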


V CONCLUSION
Floating point operations are widely used in the
digital signal processing applications and can be
implemented using PDPs (Programmable Digital
Processors). But a large amount of data processing is
required because of complex computations. This
affects the cost, speed and flexibility of the DSP
systems. In this paper floating point arithmetic
operations have been successfully simulated using
ModelSim.
Future Aspects of project
Future aspects should include the following:
1) Fast Fourier Transform computation.
2) Digital Signal Processing.
3) Infinite Impulse Response (IIR) and Finite
Impulse Response (FIR) filter design.
4) Digital Image Processing.


REFERENCES
[1] Digital System Design Using VHDL by Charles H. Roth
Jr.
[2] The Design Warrior’s Guide to FPGA by Clive ‘Max’
Maxfield.
[3] FPGA Based System Design by Wayne Wolf.
[4] A VHDL Primer by Jayaram Bhaskar.
[5] Circuit Design With VHDL by Volnei A. Pedroni.

CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

A Complete CMOS Based Low Power Supply Bandgap Voltage Reference Circuit
Implemented On TSMC 0.35-μm Process
Kshitij Bhargava#1, Kirmender Singh*2
ECE Department (Microelectronics And Embedded Technology)
Jaypee Institute Of Information Technology University, Noida
India
kbhargava3@gmail.com
¹kshitijbhargava.09305160@gmail.com
²kirmender.singh@jiit.ac.in

Abstract: A complete CMOS based low power supply bandgap voltage reference circuit implemented on the TSMC 0.35 μm CMOS process is presented in this paper. The designed circuit employs a start-up circuit, a beta-multiplier circuit (PTAT circuit) and a MOS based differential amplifier. The circuit provides a nominal reference voltage of 323 mV at a 2 V supply voltage. Experimental results show that the temperature coefficient is 1.16 ppm/ºC in the temperature range from -20 ºC to +90 ºC. The PSRR achieved without any filtering capacitor is -21 dB at 10 kHz. The area occupied by the design is 0.027 mm² and the power consumption is 62.24 μW at room temperature (25 ºC).

Keywords: Bandgap voltage reference, PTAT, CMOS, PSRR.

1. INTRODUCTION
The high-precision voltage reference circuit is an important component in mixed-mode applications. A stable reference circuit provides a reliable reference voltage, and a low supply voltage makes integration with low voltage analog and digital circuits possible. Such reference circuits should exhibit little dependence on process, supply voltage, and temperature (PVT) variations. With steadily decreasing power supply voltages in deep submicron CMOS technologies, the design of any on-chip voltage/current reference becomes a non-trivial task. Numerous approaches to achieve a voltage reference with low supply-voltage drift as well as low temperature drift have been proposed to date. However, most of them use BJT devices implemented in a standard CMOS process [1-3], which occupy a large wafer area. Moreover, some implementations use a non-standard CMOS process and incur higher cost owing to extra process steps [4-5].
This paper presents a complete MOS based bandgap voltage reference circuit built on the same general working principle, in which voltages with positive and negative temperature coefficients cancel each other to give a reference voltage with a near-zero temperature coefficient, along with a suitable technique to minimize the power supply dependence of this reference voltage [6].
The major parts of the circuit are a start-up circuit, a beta-multiplier circuit made up of NMOS and PMOS current mirrors, and a differential amplifier that enhances the power supply rejection capability of the reference voltage.
Section 2 describes the proposed voltage reference circuit design along with a detailed description of its subparts, viz. the start-up circuit, the beta-multiplier circuit and the differential amplifier circuit. Section 3 presents the experimental results. Section 4 concludes the paper.


2. Proposed Reference Circuit

Figure 1: The Proposed Reference Circuit

2.1 Start-up Circuit

In any self-biased circuit, the current flowing in the circuit is zero when the power supply is first turned on. In this circuit, at this moment the gates of M1 and M2 are at ground while those of M3 and M4 are at VDD. This forces the IPTAT current to be zero initially. Since this voltage reference can be used as a precision supply voltage in many analog circuits, this unwanted state of the reference circuit can lead to undesired operating points of the transistors. Thus, a start-up circuit is required to turn on transistors M1 and M2 in the initial moments of circuit operation.
The start-up circuit used in the proposed design consists of transistors MU1, MU2 and MU3. When the supply voltage VDD is first turned on, the gate of MU1 is at zero potential, so MU1 is off. At this moment the gate terminal of MU2 is somewhere between VDD and VDD − Vth,p. The transistor MU3 acts like an NMOS switch and leaks current from the gates of M3 and M4 into the gates of M1 and M2, producing the desired value of IPTAT right from the start of circuit operation. Once all the transistors settle to stable operating points, the start-up circuit automatically stops functioning because MU1 starts conducting, which turns MU3 off. This is very important, since the start-up circuit should not obstruct the normal operation of the beta-multiplier circuit (explained in the next subsection).

2.2 Beta-Multiplier Circuit (PTAT Circuit)

The basic building block of any bandgap voltage reference circuit is a current mirror. The proposed circuit uses an NMOS current mirror stacked just below a PMOS current mirror. The purpose of this configuration is explained below.
To obtain the desired value of the IPTAT current, it is essential to force the same current through M1 and M2. This can be achieved by using a PMOS current mirror. We can write

$V_{GS1} = V_{GS2} + I_{PTAT} R_{out}$    (1)

and

$I_{PTAT} = \frac{2}{R_{out}^{2}\,\beta_1}\left[1 - \sqrt{\beta_1/\beta_2}\right]^{2}$    (2)

where

$\beta = \mu_n C_{ox} (W/L)$    (3)

Equation (1) holds only if VGS1 > VGS2. To ensure this we use a beta-multiplier circuit, which increases the transistor gain β of M2. This is generally achieved by simply increasing the width of transistor M2 such that W2 = K·W1. It helps in achieving the desired value of the IPTAT current even at a low gate-to-source voltage of M2.

2.3 Reference Voltage Generation Principle

The reference voltage is generated by adding two voltages, one with a positive temperature coefficient and the other with a negative temperature coefficient. The drop across resistor Rout, i.e. VPTAT, provides the positive temperature coefficient voltage, and the drain-to-source voltage (VDS5) of a diode connected NMOS transistor M5 gives the negative temperature coefficient voltage. Together, these two voltages of opposite temperature coefficient give a reference voltage with a very small temperature coefficient. Mathematically, the reference voltage can be expressed as

$V_{REF} = V_{PTAT} + V_{DS5}$    (4)

and

$\partial V_{REF}/\partial T = \partial V_{PTAT}/\partial T + \partial V_{DS5}/\partial T$    (5)

2.4 Differential Amplifier

To reduce the sensitivity of the reference voltage to power supply variation (PSRR improvement), we need to reduce the variation of the drain-to-source voltages of devices M1 and M2 with changes in VDD. For this purpose a MOS based differential amplifier is used, whose output is connected to the common gate terminal of M3 and M4. The differential amplifier compares the drain voltages of M1 and M2 and regulates them to become equal.

Figure 2: Differential Amplifier Circuit

TABLE 1: Component Values Of Proposed Reference Circuit

Component   Values
MU1         50/2
MU2         10/20
MU3         10/1
M1          50/2
M2          210/2
M3          100/2
M4          100/2
M5          2.85/0.35
Rout        8k

3. EXPERIMENTAL RESULTS

The proposed temperature insensitive voltage reference circuit shown in Figure 1 generates a voltage of 323 mV at room temperature (25 ºC). Figure 3 shows the reference voltage variation with temperature for the range -55 ºC to +125 ºC. Figure 4 shows the reference voltage variation with temperature at three corner conditions, viz. fast corner (FF), typical (TT) and slow corner (SS). The circuit operates at a low supply voltage of 2 V; the temperature coefficient of the reference voltage is only 1.16 ppm/ºC within the temperature range of -20 ºC to +90 ºC, and the PSRR is -21 dB at 10 kHz. The power consumption of the circuit is 62.24 μW. The area occupied by the design on the silicon wafer is 0.027 mm².
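
Equations (1) to (3) of Section 2.2 can be evaluated numerically with the component values of Table 1. The following is a minimal sketch, not the authors' design flow: it assumes ideal square-law devices with equal threshold voltages, reads the Table 1 entries as W/L ratios, and uses a hypothetical process transconductance `KP_N` (μn·Cox) that is not given in the paper, so the resulting current is illustrative only.

```python
import math

def ptat_current(r_out, beta1, beta2):
    """Equation (2): I_PTAT = 2/(R_out^2 * beta1) * (1 - sqrt(beta1/beta2))^2."""
    return 2.0 / (r_out ** 2 * beta1) * (1.0 - math.sqrt(beta1 / beta2)) ** 2

KP_N = 100e-6   # ASSUMED mu_n * C_ox in A/V^2 (hypothetical, not from the paper)
R_OUT = 8e3     # ohms, Rout from Table 1

beta1 = KP_N * 50 / 2     # equation (3) for M1, W/L = 50/2
beta2 = KP_N * 210 / 2    # equation (3) for M2, W/L = 210/2, i.e. K = 4.2

i_ptat = ptat_current(R_OUT, beta1, beta2)
v_ptat = i_ptat * R_OUT   # drop across Rout

# Consistency check against equation (1): with equal thresholds,
# VGS1 - VGS2 is the difference of the two overdrive voltages.
vov1 = math.sqrt(2 * i_ptat / beta1)
vov2 = math.sqrt(2 * i_ptat / beta2)

print(f"I_PTAT = {i_ptat * 1e6:.2f} uA, V_PTAT = {v_ptat * 1e3:.2f} mV")
print(f"VGS1 - VGS2 = {(vov1 - vov2) * 1e3:.2f} mV")
```

The two printed voltages agree by construction, which confirms that equation (2) is the solution of equation (1) under the square-law assumption. A larger K lowers VGS2 relative to VGS1 and so raises the current for a given Rout.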


Figure 3: Reference Voltage Versus Temperature Curve

Figure 4: Reference Voltage under the three corner conditions

Table 2: Performance Summary of the Proposed Design

Parameter                          [7]                [8]                This Work
Technology (μm)                    0.6                0.5                0.35
Supply Voltage                     1.4 V              2.6 V              2 V
Reference Voltage                  0.309 V            1.21 V             0.323 V
Temperature Coefficient (ppm/ºC)   36.9               613                1.16
PSRR                               -47 dB at 100 Hz   -30 dB at 100 Hz   -21 dB at 10 kHz
Active Area (mm²)                  0.055              0.045              0.027

4. CONCLUSIONS

A high precision temperature insensitive voltage reference circuit has been presented in this paper. The circuit was designed using TSMC 0.35 μm CMOS technology and experimental results were illustrated. They show that the proposed circuit provides a stable reference voltage of 323 mV within the temperature range -20 ºC to +90 ºC, with a power supply rejection of -21 dB at 10 kHz. The reference voltage has a very small temperature drift. Such a circuit can be used in applications that require a stable voltage reference, such as MEMS based temperature sensors and low dropout regulators.

REFERENCES
[1] Karel E. Kuijk, “A Precision Reference Voltage Source”, IEEE Journal Of Solid-State Circuits, Vol. SC-8, No. 3, June 1973, pp. 222-226.
[2] Allen, P.E. & Holberg, D.R. (2002). “CMOS Analog Circuit Design”. New York: Oxford.
[3] Matthew C. Guyton and Hae-Seung Lee, MIT, “Bandgap Current Reference”, March 2003.
[4] Lee, I., Kim, G., & Kim, W. (1994). “Exponential curvature compensated BiCMOS bandgap reference”, IEEE Journal Of Solid-State Circuits, 29, 1396-1403.
[5] Malcovati, P., Maloberti, F., Fiocchi, C., Pruzzi, M. (2001). “Curvature-compensated BiCMOS bandgap with 1-V supply voltage”, IEEE Journal Of Solid-State Circuits, 36(7), 1076-1081.
[6] Allen-Holberg, “CMOS Analog Circuit Design”, Second Edition.
[7] Stair, R., Connelly, J.A., & Pulkin, M. (2000). “A Current Mode CMOS Voltage Reference”. In Proceedings of Southwest Symposium on Mixed-Signal Design (pp. 23-26).
[8] Kimberly Jane S. Udy, Patricia Angela Reyes-Abu and Wen Yaw Chung, “A High Precision Temperature Insensitive Current And Voltage Reference Generator”. In Proceedings of World Academy Of Science, Engineering and Technology, 2009.


Performance Analysis of Carbon Nanotube FET

Harish Kr. Mishra1, S.P. Gangwar2, Dr. Harsh V. Singh3
1 M.Tech. Student, Department of Electronics Engineering, KNIT, Sultanpur-228 118
2,3 Assistant Professor, Department of Electronics Engineering, KNIT, Sultanpur-228 118
Phone No. (+91)9415177465, (+91)9515763939 Email: rblharish@gmail.com, harshvikram@gmail.com

Abstract: We study field-effect transistors based on individual single- and multi-wall carbon nanotubes and analyze their performance. Transport through the nanotubes is dominated by holes and, by varying the gate voltage, we successfully modulated the conductance of a single-wall device by more than 5 orders of magnitude. Multi-wall nanotubes typically show no gate effect.
Keywords: Carbon nanotubes, Semiconductor, Single-wall Nanotube, Multi-wall Nanotube, FET

Carbon nanotubes (NTs) are a new form of carbon with unique electrical and mechanical properties [1]. They can be considered as the result of folding graphite layers into carbon cylinders and may be composed of a single shell (single-wall nanotubes, SWNTs) or of several shells (multi-wall nanotubes, MWNTs). Depending on the folding angle and the diameter, nanotubes can be metallic or semiconducting.

The band gap of semiconducting NTs decreases with increasing diameter. In this paper we study the fabrication and performance of a SWNT-based FET and explore whether MWNTs can be utilized as the active element of carbon-based FETs. Despite their large diameter, we find that structurally deformed MWNTs may well be employed in NT-FETs. The analysis is based on the output and transfer characteristics of our NT devices.

The SWNTs used in our study were produced by laser ablation of graphite doped with cobalt and nickel catalysts [7]. For cleaning, the SWNTs were ultrasonically treated in an H2SO4/H2O2 solution. MWNTs were produced by an arc-discharge evaporation technique [8] and used without further treatment. The NTs were dispersed by sonication in dichloroethane and then spread on a substrate with predefined electrodes. A schematic cross section of a NT device is shown in Fig. 1.

FIG. 1. Schematic cross section of the FET devices. A single NT of either MW or SW type bridges the gap between two gold electrodes.

The devices consist of either an individual SWNT or an individual MWNT bridging two electrodes deposited on a 140 nm thick gate oxide film on a doped Si wafer, which is used as a back gate. The 30 nm thick Au electrodes were defined using electron beam lithography. For imaging, we used an atomic force microscope operating in the noncontact mode.

The source-drain current I through the NTs was measured at room temperature as a function of the bias voltage VSD and the gate voltage VG. Figure 2(a) shows the output characteristics I-VSD of a device consisting of a single SWNT with a diameter of 1.6 nm for several values of the gate voltage. At VG = 0 V, the I-VSD curve is linear with a resistance of R = 2.9 MΩ. For VG < 0 V, the I-VSD curves remain linear, whereas they become increasingly nonlinear for VG > 0 V up to a point where the current becomes unmeasurably small, indicating a controllable transition between a quasi-metallic and an insulating state of the NT. Figure 2(b) shows the transfer characteristics I-VG of our NT device for different source-drain voltages.

The behavior is similar to that of a p-channel metal oxide semiconductor FET [9]. The source-drain current decreases strongly with increasing gate voltage, which demonstrates not only that the NT device operates as a field-effect transistor but also that transport through the semiconducting SWNT is dominated by positive carriers (holes).

The conductance modulation of our SWNT-FET exceeds 5 orders of magnitude. For VG < 0 V, the I-VG curves saturate, indicating that the contact resistance RC at the metal electrodes starts to dominate the total resistance R = RNT + 2RC of the device. Here, RNT denotes the gate-dependent resistance of the NT. The saturation value of the current corresponds to RC ≈ 1.1 MΩ. Similar contact resistances were previously found for metallic SWNTs [4]. The origin of the holes is an important question to address. One possibility is that the carrier concentration is inherent to the NT.


FIG. 2. Output and transfer characteristics of a SWNT-FET: (a) I-VSD curves measured for VG = −6, 0, 1, 2, 3, 4, 5, and 6 V. (b) I-VG curves for VSD = 10-100 mV in steps of 10 mV. The inset shows that the gate modulates the conductance by 5 orders of magnitude (VSD = 10 mV).

Alternatively, the higher work function of gold could lead to the generation of holes in the NT by electron transfer from the NT to the gold electrodes [2]. Assume that the band-bending length in the SWNT is neither very short nor very long. At VG = 0 V, the device is "on" and the Fermi energy is close to the valence-band edge throughout the NT. If the band-bending length is indeed comparable to the length of the SWNT, a positive gate voltage would generate an energy barrier of an appreciable fraction of eVG in the center of the tube, since the gate/NT distance is shorter than the source/drain separation. The threshold voltage VG,T required to suppress hole conduction by depleting the tube center would then be determined by the thermal energy available for overcoming this barrier. Thus, VG,T should be much lower than the observed 6 V, which favors a carrier concentration inherent to the NT.

In this case, we expect a fairly homogeneous hole distribution along the NT, independent of the gate voltage. An estimate of the hole density can then be obtained by writing the total charge on the NT as $Q = C V_{G,T}$, where C is the NT capacitance and VG,T the threshold voltage necessary to completely deplete the tube. The NT capacitance per unit length with respect to the back gate is $C/L \approx 2\pi\varepsilon\varepsilon_0/\ln(2h/r)$, with r and L being the NT radius and length, and h and ε the thickness and the average dielectric constant of the device [10]. Using L = 300 nm, r = 0.8 nm, h = 140 nm, and ε ≈ 2.5, we evaluate a one-dimensional hole density of $p = Q/eL \approx 9 \times 10^{6}$ cm⁻¹ from VG,T = 6 V. This value corresponds to about 1 hole per 250 carbon atoms in the NT. For comparison, in graphite there is only 1 hole per 10⁴ atoms [11]. The large hole density suggests that the NT is degenerate and/or that it is doped with acceptors, for example, as a result of its processing [12].
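
The hole-density estimate above is a short calculation that can be reproduced directly. The sketch below assumes SI values for the physical constants and uses hypothetical function names; it simply evaluates C/L = 2πεε0/ln(2h/r) and p = (C/L)·VG,T/e with the device parameters quoted in the text.

```python
import math

EPS0 = 8.854e-12       # vacuum permittivity, F/m
E_CHARGE = 1.602e-19   # elementary charge, C

def capacitance_per_length(h, r, eps_r):
    """C/L = 2*pi*eps_r*eps0 / ln(2h/r), in F/m (wire of radius r, oxide thickness h)."""
    return 2 * math.pi * eps_r * EPS0 / math.log(2 * h / r)

def hole_density_per_cm(v_threshold, h, r, eps_r):
    """One-dimensional hole density p = Q/(eL) = (C/L) * V_GT / e, in cm^-1."""
    per_metre = capacitance_per_length(h, r, eps_r) * v_threshold / E_CHARGE
    return per_metre / 100.0   # convert m^-1 to cm^-1

# SWNT parameters quoted in the text: h = 140 nm, r = 0.8 nm, eps = 2.5, V_GT = 6 V.
p_swnt = hole_density_per_cm(v_threshold=6.0, h=140e-9, r=0.8e-9, eps_r=2.5)
print(f"p = {p_swnt:.2e} cm^-1")
```

With these inputs the result is close to the 9 × 10⁶ cm⁻¹ quoted in the text. Because the radius enters only through the logarithm, the estimate is not very sensitive to the exact tube diameter.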

FIG. 3. I-VG curve of a typical MWNT device (curve A) in comparison with that of a collapsed MWNT of similar cross section (curve B).

We can estimate the mobility of the holes from the transconductance of the FET. In the linear region, it is given by $dI/dV_G = \mu_h (C/L^2) V_{SD}$. Subtracting the contact resistance, we obtain a NT transconductance of $dI/dV_G = 1.7 \times 10^{-9}$ A/V at VSD = 10 mV, corresponding to a hole mobility of 20 cm²/V s. This value is close to the mobility in heavily p-doped silicon of comparable hole density [9], but considerably smaller than the 10⁴ cm²/V s observed in graphite [11]. The low value of the NT mobility is consistent with our initial assumption of diffusive transport and suggests that the SWNT contains a large number of scatterers, possibly related to defects in the NT or disorder at the NT/gate-oxide interface due to roughness. Such deformations can lead to local electronic structure changes [13], which may act as scattering centers.

The low mobility is surprising in view of the coherence length of more than 1 μm reported on the basis of energy quantization along a metallic SWNT at low temperature [4]. However, we note that there have been no transport experiments on individual SWNTs that provide evidence for ballistic transport at room temperature, e.g., by observing conductance quantization [1].

Having demonstrated FET operation for a SWNT, we move on to explore whether transport through MWNTs can be controlled by a gate electrode. The band gap of NTs has been predicted to decrease with increasing tube diameter [1]. Therefore, MWNTs with diameters of 10 nm or more are expected to show metallic rather than semiconducting behavior at room temperature. We studied a number of MWNT devices with resistances of R ≈ 100 kΩ. Most of these devices showed no gate action, and a typical I-VG curve is plotted in Fig. 3 (curve A).

Structural deformations of NTs change their electronic properties. Curve B in Fig. 3 shows that this can lead to a significant gate effect in MWNTs. As is the case for the SWNT-FET, the source-drain current of this MWNT-FET decreases with increasing gate voltage, i.e. the dominant conduction process is hole transport. In contrast to the SWNT device, this MWNT-FET could not be completely depleted. The I-VSD curve remained linear independent of the gate voltage (not shown). Between VG = −35 and 25 V, the resistance increased only from R = 76 to 120 kΩ, corresponding to a conductance modulation by about a factor of 2.
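
As a numerical cross-check of the SWNT mobility estimate earlier in this section, the linear-region relation dI/dVG = μh (C/L²) VSD can be inverted for μh. The sketch below is illustrative only: the function name is hypothetical, the capacitance uses the same wire-over-plane expression as the hole-density estimate, and the geometry (L = 300 nm, r = 0.8 nm, h = 140 nm, ε ≈ 2.5) is taken from the SWNT device described in the text.

```python
import math

EPS0 = 8.854e-12   # vacuum permittivity, F/m

def hole_mobility_cm2(didvg, length, v_sd, h, r, eps_r):
    """Invert dI/dV_G = mu_h * (C/L^2) * V_SD for mu_h, returned in cm^2/(V s).

    Uses C/L = 2*pi*eps_r*eps0 / ln(2h/r), so C/L^2 = (C/L) / length.
    """
    c_per_len = 2 * math.pi * eps_r * EPS0 / math.log(2 * h / r)   # F/m
    mu_si = didvg * length / (c_per_len * v_sd)                    # m^2/(V s)
    return mu_si * 1e4                                             # cm^2/(V s)

# SWNT values quoted in the text: dI/dV_G = 1.7e-9 A/V at V_SD = 10 mV.
mu_swnt = hole_mobility_cm2(didvg=1.7e-9, length=300e-9, v_sd=10e-3,
                            h=140e-9, r=0.8e-9, eps_r=2.5)
print(f"mu_h = {mu_swnt:.1f} cm^2/(V s)")
```

The result lands near the 20 cm²/V s quoted for the SWNT. The same function can be reused for the collapsed MWNT once its geometry and transconductance are substituted.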


FIG. 4. (a) Noncontact AFM image of the MWNT-FET. (b) and (c) Close-up views showing three twists in the collapsed nanotube.

The gate effect reaches a sharp maximum between VG = −15 and 0 V [13]. To explain this peculiar behavior, we consider the AFM image of the MWNT-FET in Fig. 4(a). The device consists of a collapsed MWNT, which bridges the gap between two Au electrodes separated by about 1 μm.

This nanostripe is 3 nm high, from which we conclude that it has four or five shells, and it exhibits a number of twists [Figs. 4(b) and (c)], which allow us to determine its width to be 12 nm. Based on the structural information summarized in Fig. 4(d), we propose the following explanation for the behavior of the MWNT-FET. Since the intershell interaction in MWNTs is weak, it is reasonable to assume that transport is confined to the outermost shell of the nanostripe [12]. The conductance modulation of about 2 indicates that the bottom "plate" of the outermost shell is depleted by the gate, whereas the top layer is less affected due to screening by the inner shells and by the bottom layer, as long as it is conducting.

Our model implies that the bottom "plate" is decoupled from the top layer, which may be the consequence of lateral quantization effects perpendicular to the tube axis. Using R = RNT + 2RC for the "on" state (VG = −15 V) and R = 2RNT + 2RC for the "off" state of the MWNT-FET (VG = 0 V), we estimate a resistance of RNT = 32 kΩ for the outer shell of the NT and deduce a contact resistance of RC = 23 kΩ. Finally, we proceed analogously to the SWNT-FET analysis to evaluate the hole density and mobility of the collapsed MWNT. Numerical calculations show that the capacitance per unit length is reasonably well described by $C/L = 2\pi\varepsilon\varepsilon_0/\ln(2h/r)$ despite the slab-shaped geometry of the collapsed tube. Using L = 1.1 μm, r = 5 nm, and a threshold voltage of VG,T ≈ 8 V to deplete the bottom layer, we obtain $p \approx 1.7 \times 10^{7}$ cm⁻¹ for its hole density.


From the transconductance of $dI/dV_G = 3.5 \times 10^{-8}$ A/V at VSD = 50 mV, we estimate a mobility of μh ≈ 220 cm²/V s. The hole density is similar to that of the SWNT but the mobility is higher, which suggests a reduced number of scatterers. This may arise from the fact that the MWNTs were not ultrasonically treated in acids. Furthermore, they do not deform as much as SWNTs in order to conform to roughness at the NT/gate-oxide interface [3].

Conclusion: Transport in the nanotubes is dominated by holes and, at room temperature, it appears to be diffusive. Using the gate electrode, the conductance of a SWNT-FET could be modulated by more than 5 orders of magnitude. An analysis of the transfer characteristics of the FETs suggests that the NTs have a higher carrier density than graphite and a hole mobility comparable to heavily p-doped silicon. Large-diameter MWNTs typically show no gate effect, but structural deformations can modify their electronic structure sufficiently to allow FET behavior.

References:

[1] M. S. Dresselhaus, G. Dresselhaus, and P. C. Eklund, Science of Fullerenes and Carbon Nanotubes (Academic, San Diego, 1996).
[2] J. W. G. Wildöer, L. C. Venema, A. G. Rinzler, R. E. Smalley, and C. Dekker, Nature (London) 391, 59 (1998).
[3] T. W. Odom, J.-L. Huang, P. Kim, and C. M. Lieber, Nature (London) 391, 62 (1998).
[4] S. J. Tans, M. H. Devoret, H. Dai, A. Thess, R. E. Smalley, L. J. Geerligs, and C. Dekker, Nature (London) 386, 474 (1997).
[5] M. Bockrath, D. H. Cobden, P. L. McEuen, N. G. Chopra, A. Zettl, A. Thess, and R. E. Smalley, Science 275, 1922 (1997).
[6] S. J. Tans, A. R. M. Verschueren, and C. Dekker, Nature (London) 393, 49 (1998).
[7] T. Guo, P. Nikolaev, A. Thess, D. T. Colbert, and R. E. Smalley, Chem. Phys. Lett. 243, 49 (1995).
[8] D. T. Colbert, J. Zhang, S. M. McClure, P. Nikolaev, J. H. Hafner, D. W. Owens, P. G. Kotula, C. B. Carter, J. H. Weaver, A. G. Rinzler, and R. E. Smalley, Science 266, 1218 (1994).
[9] S. M. Sze, Physics of Semiconductor Devices (Wiley, New York, 1981).
[10] This expression was inferred from P. M. Morse and H. Feshbach, Methods of Theoretical Physics (McGraw-Hill, New York, 1953).
[11] N. B. Brandt, S. M. Chudinov, and Ya. G. Ponomarev, Semimetals, 1. Graphite and its Compounds (North-Holland, Amsterdam, 1988).
[12] H. He, J. Klinowski, M. Forster, and A. Lerf, Chem. Phys. Lett. 287, 53 (1998).
[13] J. E. Fischer, H. Dai, A. Thess, R. Lee, N. M. Hanjani, D. L. Dehaas, and R. E. Smalley, Phys. Rev. B 55, R4921 (1997).

POWER AWARE PHYSICAL MODEL FOR EMBEDDED SYSTEMS
Asstt Prof Yasmeen Hasan
Mtech (Electronic Circuits & Systems (VLSI))
DEPT OF ECE, INTEGRAL UNIVERSITY, LUCKNOW
Email: yasmeen.hasan9@gmail.com

Abstract- In this work we propose a geometric model that is employed to devise a scheme for identifying the hotspots and hot zones in a chip. These spots or zones need to be guarded thermally to ensure the performance and reliability of the embedded system. The model, namely the continuous unit sphere model, is presented under the assumption that the 3D region of the system is uniform, thereby reflecting on the possible locations of heat sources and the target observation points.

The experimental results for the continuous domain establish that a region which does not contain any heat source may become hotter than the regions containing the thermal sources. Thus a hotspot may appear away from the active sources, and placing heat sinks or a cooling system near the active thermal sources alone may not suffice to tackle thermal imbalance.

Keywords: Embedded systems, continuous model, floorplanning, Fine mesh (FM), Coarse mesh (CM), Hotspots etc.

2: Introduction

In recent years, power density in microprocessors has doubled every three years [1,2,3], and this rate is expected to increase within one to two generations as feature sizes and frequencies scale faster than operating voltages [4,7]. Because the energy consumed by the microprocessor is converted into heat, the corresponding exponential rise in heat density is creating vast difficulties in reliability and manufacturing costs. At any power dissipation level, the heat being generated must be removed from the surface of the microprocessor die, and for all but the lowest-power designs today, these cooling solutions have become expensive. For high-performance processors, cooling solutions are rising at $1-3 or more per watt of heat dissipated [3, 8], meaning that cooling costs are rising exponentially and threaten the computer industry's ability to deploy new systems.

Thermal aware floorplanning [6] reduces the on-chip hotspots by a significant amount through lateral spreading. In the traditional design methodology, worst case assumptions are used to ensure that the system operates normally in all corner cases, which results in excessive design margin by imposing extreme design constraints. With the shift in design paradigm, worst case assumptions and post-design solutions are no longer sufficient to address thermal and power issues. It has become important to take these issues into consideration right from the start and to address them at all levels of the design cycle.

In this paper we propose a geometric model which is employed to devise a scheme for identifying the hotspots in an embedded system. The model may facilitate identifying the hot spots/zones in a VLSI chip. In the continuous domain we use the concept of a unit sphere model to calculate the local thermal effect at a point due to the heat being dissipated from several point heat sources distributed over the chip plane. We establish that a point on a chip can become very hot due to the conduction effects of other heat sources, although it may not have a heat source in its immediate vicinity. In this model, the heat loss due to radiation has been ignored. If it is to be considered, an appropriate heat loss function has to be incorporated.

Fig 1: SIDEVIEW OF A TYPICAL PACKAGE [9]

2.1: Time Invariant Heat Sources

The study is made with the assumption that there are constantly active (i.e. always on) heat generating sources placed randomly throughout the chip. For continuous thermal sources, we also assume that the heat from the sources is propagated through the 3D surface of the chip without being dissipated in the ambience. The objective is to identify the zones in the chip which have heat content greater than a certain threshold.

3.2: Continuous Spatial Domain

The position of a heat source may be any point on the chip, which is assumed to be an embedded system. In the unit sphere model, the contribution of a point heat source S at any target point T is expressed as the amount of heat from S received within the unit sphere centered at the point T. This unit is the same as that of the distance between S and T, and may be related to the minimum dimension of the chip. The cumulative heat received at the point T is evaluated as the linear superposition of the amounts received at T from all heat-generating sources on the chip.

As illustrated in Fig. 2, let a heat source at a point S generate an amount Q, henceforth denoted as the strength of the source S. Let the target point T be at a Euclidean distance d from S, let CS be the sphere of radius d centered at S, and let CT be the unit sphere centered at T. Let CT and CS intersect at the two points A and B. Then the area cut out on the surface of the sphere CS is equal to the product of the solid angle with its vertex at the center of the sphere CS and the square of the sphere's radius:

$A = \Omega d^{2}$    … (1)

Fig 2: Unit Sphere Model of Heat Received at a Point T

where Ω is the solid angle formed by the conical surface of the spherical sector and d is the radius of the source sphere. A complete sphere forms a solid angle of 4π. (If the solid angle is not formed by the entire sphere, but only by a conical surface of a spherical sector, the angle in this case is equal to the ratio of the sector's spherical surface to the square of the sphere's radius [5].) By denoting the plane angle at the vertex of the spherical sector as 2θ, it is possible to express its height h as

$h = r(1 - \cos\theta)$    … (2)

where r is the radius of the source sphere. Therefore the spherical area of the sector can be represented as

$A = 2\pi r h = 2\pi r^{2}(1 - \cos\theta)$    … (3)

Fig 3: Section of a cone and a spherical cap inside a sphere

By denoting the solid angle which subtends the spherical surface of the sector as Ω we obtain

$\Omega = A/r^{2} = 2\pi(1 - \cos\theta)$    … (4)

Thus the contribution of heat from S at T is

$Q\,\frac{A}{A_S} = \frac{Q}{2}(1 - \cos\theta)$    … (5)

where $A_S = 4\pi r^{2}$ is the surface area of the sphere S.

Consider the triangle OC_TB in the figure:

$(C_T B)^{2} = (OB)^{2} + (O C_T)^{2}$
$1^{2} = (d(1 - \cos\theta))^{2} + d^{2}\sin^{2}\theta$
$1 = 2d^{2}(1 - \cos\theta)$    … (6)

Putting eqn (6) in eqn (5), the contribution of heat from S to T is

$Q\,\frac{A}{A_S} = \frac{Q}{4d^{2}}$    … (7)

Our concerns are the hottest points on the chip. Intuitively, the source points definitely belong to this class. But the more pertinent question is whether these are the only points that need to be considered. The question may be rephrased as follows: does there exist any non-source point on the floor with heat content greater than that of any of the source points?

The observations reported here answer in the affirmative. Before we proceed further, we point out two special cases of the unit sphere model based on the distance d between S and T: (1) 0.5 < d < 1 and (2) 0 < d < 0.5.

Fig 4: Special cases of the unit sphere model (Case 1: 1/2 < d < 1; Case 2: 0 < d < 1/2)

In the boundary case when S lies on CT (d = 1), θ is equal to π/3, as SAT becomes an equilateral triangle, and the contribution is $\frac{Q}{2}(1 - \cos\theta)$.
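
The unit sphere formula and the superposition rule can be sketched in a few lines. The example below is an illustration under stated assumptions, not the author's implementation: it takes the contribution of a source of strength Q at distance d to be Q/(4d²) as in eqn (7), treats the whole of Q as received when d ≤ 0.5 (the source sphere then lies inside the unit sphere), and uses a hypothetical layout with unit-strength sources at the eight vertices of a unit cube. The function names are invented for the example.

```python
import math

def received_fraction(d):
    """Fraction of a point source's heat that falls inside the unit sphere
    centered on a target at distance d from the source."""
    if d <= 0.5:
        # The source sphere of radius d lies entirely inside the unit sphere.
        return 1.0
    return 1.0 / (4.0 * d * d)   # eqn (7) with Q = 1

def cumulative_heat(target, sources):
    """Linear superposition over (position, strength) pairs."""
    return sum(q * received_fraction(math.dist(target, pos)) for pos, q in sources)

# Hypothetical layout: unit-strength sources at the 8 vertices of a unit cube.
vertices = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
sources = [(v, 1.0) for v in vertices]

at_vertex = cumulative_heat((0.0, 0.0, 0.0), sources)   # a source point
at_center = cumulative_heat((0.5, 0.5, 0.5), sources)   # a non-source point
print(f"vertex: {at_vertex:.3f}, center: {at_center:.3f}")
```

For this toy layout the cube center, which holds no source, accumulates more heat than any vertex, the same qualitative effect the model is meant to expose. Scanning `cumulative_heat` over a grid of target points would be one way to locate such points.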


$\frac{Q}{2}(1 - \cos\theta) = \frac{Q}{2}\left(1 - \cos\frac{\pi}{3}\right) = \frac{Q}{4}$    … (8)

Hence in case (1) the angle 2θ as defined earlier will be greater than 2π/3, and consequently more than one quarter of the heat emanating from S reaches the unit sphere centered at T. In case (2) T is nearer to S, and hence the sphere with radius d around S lies entirely within the unit sphere at T. Hence the unit sphere CT receives the entire heat of S in this case.

Using the formula derived above, we calculated the cumulative heat received at each point along the diagonal joining any two vertices of the geometrical 3D structure taken into consideration. In this work we proceeded by taking a regular cuboid and a regular octagonal prism structure. We worked out the formula for these 3D structures with different dimensions. In this approach we consider the medium throughout the geometrical structure to be isotropic. An active source of unit strength (Q0 = 1) was placed at each of the vertices of the 3D structure (8 in cuboids and 16 in octagonal prisms).

4: EXPERIMENTAL RESULTS

Using the continuous domain formula we try to find the hottest spot along a given direction in a 3D structure. Here we took cuboids of different dimensions and placed an active source of unit strength at each of the 8 vertices. The target points are taken along the longest diagonal.

RESULTS FOR THE CONTINUOUS DOMAIN

We performed more experiments with the continuous domain model, implemented in C, to simulate the effect of active sources placed at random points on the 3D floor. Keeping the dimensions of the 3D structure the same, we varied the number of sources from 5 to 50. We studied five trial runs, keeping the number and the range of the power strength of the active sources fixed and allowing only the position of the sources to vary.

We considered a fine grid around each source point and evaluated the cumulative power at each of those points along with the source points. Across the whole floor we also considered a relatively coarse grid and evaluated the power at all the grid points of this coarse grid.

The formula derived from the unit sphere model has been used for the calculation. The results are reported in Table 1. The threshold value is the minimum of the total power at the active source points, including the contribution from all other sources.

TABLE 1: RESULTS FOR THE CONTINUOUS DOMAIN

No. of    Threshold   Total probe   Probe points   Probe points   Hotspots   Hotspots   % Hotspots   % Hotspots
sources   value       points        in FM          in CM          in FM      in CM      in FM        in CM
5         1.24876     2029160       29160          2000000        9198       2562       0.45%        0.19%
10        1.24338     2058320       58320          2000000        16798      8376       0.81%        0.41%
20        1.26821     2116640       116640         2000000        42238      12048      1.72%        0.68%
40        1.26441     2233280       233280         2000000        93972      19030      4.21%        0.82%
50        1.20101     2291600       291600         2000000        118262     25969      5.16%        1.13%

5: CONCLUSION

In this work we have proposed a model in the continuous domain to model the thermal behavior in an embedded system. The hotspots were usually concentrated near the


active source points, but some points away from the sources were found to be much hotter than the sources themselves. The randomness of the sources did not affect the result much. One important aspect we have observed in all the models is that there are zones in the chip which become much hotter even without containing a heat source. We conclude that it may not be enough to guard only the active regions to make the chip thermally stronger. This also calls for more efficient power and thermal management techniques.

References
[1] S. Borkar. Design challenges of technology scaling. IEEE Micro, pp. 23–29, Jul.–Aug. 1999.
[2] G. Roos, B. Hoefflinger, M. Schubert, and R. Zingg, “Manufacturability of 3D-epitaxial-lateral-overgrowth CMOS circuits with three stacked channels,” Microelectron, 1991.
[3] R. Mahajan. Thermal management of CPUs: A perspective on trends, needs and opportunities, Oct. 2002. Keynote presentation, THERMINIC-8.
[4] Y. K. Cheng and S. M. Kang, “An Efficient Method for Hot-spot Identification in ULSI Circuits”, Proc. of IEEE Int. Conf. on Computer Aided Design (ICCAD), pp. 124-127, 1999.
[5] “Solid Angle”, Wikipedia, the free encyclopedia.
[6] T. Sherwood, E. Perelman, and B. Calder. Basic block distribution analysis to find periodic behavior and simulation points in applications. In Proc. PACT, Sept. 2001.
[7] SIA. International Technology Roadmap for Semiconductors, 2001.
[8] S. Gunther, F. Binns, D. M. Carmean, and J. C. Hall. Managing the impact of increasing microprocessor power consumption. Intel Tech. J., Q1 2001.
[9] Fig. 1 from: K. Skadron, S. Velusamy, K. Sankaranarayanan and D. Tarjan. “Temperature-Aware Microarchitecture”. Published in the Proceedings of the 30th International Symposium on Computer Architecture, June 9–11, 2003, San Diego, California, USA.


Efficient Power Utilisation By Controlling
Industrial And Home Appliances Using GSM and
Microcontroller
Raj Singh Yadav* and Nidhi Mishra**
*B.Tech IIIrd Year, **Assistant Professor
Krishna Institute of Engineering and Technology
Electronics and Communication Department
Ghaziabad-201206, India
yadav.raj31@gmail.com
nidhi.mishra9@gmail.com

Abstract:- In the present IT age, we are in need of a fully automatic system for remotely controlling and monitoring appliances. This paper mainly focuses on remotely controlling industrial and home appliances and making efficient utilisation of the power supply [1]. The system is SMS based, using GSM (Global System for Mobile Communication), and uses wireless technology. It provides a solution to the problem faced by home owners when they forget to switch off their home appliances while going out. It is one of the emerging applications of GSM technology. It is of great use for efficient utilisation of power in industry and for cutting down the electric bill. Here we present a design of a stand-alone embedded system that can monitor and control different appliances installed in industries and homes using built-in input and output peripherals. Basically this system allows the home owner and industry owner to control and monitor their appliances remotely via mobile phone by sending a command in the form of an SMS message and receiving the current status of the appliances. The software used for simulation is Eclipse with a Java runtime environment.

Keywords- GSM, SMS, Signal Processing and Embedded System.

I. INTRODUCTION

The objective of this paper is to control home appliances remotely and reduce power wastage by providing a cost effective solution. The motivation was to make it possible for users to automate their homes with universal access. The home appliances control system was intended to be built at a reasonable cost and to be mobile, providing remote access to the appliances. There was a need to automate home and industry so that the user can take advantage of this advancement, for example so that a person coming home from the office does not have to suffer from the hot climate. The motive of this paper is to propose a system that allows the user to control home appliances universally via SMS using GSM technology and to make efficient utilisation of the power supply. A design and implementation of SMS based control for monitoring systems is proposed in [2]. That paper has three modules: a sensing unit for monitoring the complex applications, a processing unit that is a microcontroller, and a communication module that uses a GSM module or cell phone. Primary health-care (PHC) management for the rural population is explored in [3], which proposes providing PHC services to the rural population by the use of mobile web technologies. That system involves the use of SMS and cell phone technology for information management, transactional exchange and personal communication. Internet and wireless communications have been utilized in home automation [6-8].

In this paper, we have tried to implement a method in which an acknowledgement from the receiver can be received without any additional cost. It is beneficial for the user to receive feedback from the receiver.
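The SMS control and feedback behaviour described in this paper (commands accepted only from a registered number with the correct security code, a status message returned as feedback) can be sketched as follows. This is a minimal illustration, not the authors' firmware: the message format `<code> <appliance> <on|off>`, the phone number, the security code, the function name `handle_sms`, and the choice to answer unknown commands with `ERROR` are all assumptions made for the example.

```python
# Hypothetical registered sender and security code (defined in the program code).
REGISTERED_NUMBERS = {"+910000000000"}
SECURITY_CODE = "1234"

# Display names used in the feedback message, keyed by lower-case appliance name.
DISPLAY = {"ac": "AC", "light": "Light", "fan": "Fan"}


def handle_sms(sender, text, state):
    """Process one incoming SMS and return the feedback message, or None.

    None means the system takes no action: the sender is unregistered or
    the security code is missing or wrong.
    """
    if sender not in REGISTERED_NUMBERS:
        return None
    parts = text.strip().split()
    if len(parts) != 3 or parts[0] != SECURITY_CODE:
        return None
    key, action = parts[1].lower(), parts[2].lower()
    if key not in DISPLAY or action not in ("on", "off"):
        return "ERROR"  # assumed reply for an unrecognised command
    state[key] = (action == "on")  # simulate pressing the appliance button
    return f"{DISPLAY[key]} {action}"  # feedback: current status


if __name__ == "__main__":
    state = {"ac": False, "light": False, "fan": False}
    print(handle_sms("+910000000000", "1234 AC on", state))
    print(handle_sms("+911111111111", "1234 AC off", state))
    print(state)
```

Keeping the appliance state in a plain dictionary makes the status check trivial; on a microcontroller the same role would be played by the output port bits.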


II. HOME APPLIANCE CONTROL SYSTEM WITHOUT FEEDBACK

We proposed a home and industrial appliance control system based on GSM network technology for transmission of SMS from sender to receiver. The GSM network provides a full duplex link to support the user requirement [4]. SMS sending and receiving is used for universal access to appliances, allowing remote monitoring and control of the appliances at home. The home appliance control system consists of three main components: microcontroller, GSM module and mobile device. The microcontroller stores the program code on which the system runs. The GSM module receives the message from the user. The mobile device sends the command which has to be performed by the microcontroller.

III. PROPOSED PAPER WITH FEEDBACK SYSTEM
In this proposed paper, the system is able to give feedback to the user about the condition of the home appliance according to the user's needs and requirements. The current status of the appliances can be checked. The working of the feedback system can be explained with the help of the figure below [1].

Fig:- Diagram for Home Appliances control system with feedback [1]

This system has basically two units: a transmitter and a receiver unit with a feedback system. The message consists of a set of commands to turn a specific appliance ON/OFF [5]. The working of this system can be explained in terms of the microcontroller, the GSM module and the mobile phone.

The microcontroller, being the main component, has the home appliances control system installed on it. Appliances control is responsible for everywhere access to appliances. The system works on GSM technology for transmission of commands from sender to receiver.

The GSM module is a plug and play device. It is attached to the microcontroller through the RS232 port and communicates with the microcontroller via this port. The GSM module acts as a link responsible for enabling/disabling SMS capability.

The mobile device with a GSM SIM communicates with the GSM module via radio waves. The method of communication is wireless and the mechanism works on GSM technology. The cell phone has an authorised SIM card and a GSM subscription. The sender transmits instructions via SMS and the system takes action on those instructions.

IV. CONSTRAINTS OF HOME APPLIANCES CONTROL SYSTEM
The system functionality is based on GSM technology and a microcontroller, and it needs a power supply, so the technological constraints must be kept in mind. The system is vulnerable to power failure, but this disruption can be avoided by attaching a voltage source, thus allowing users to take full advantage of the system.

V. RESULTS AND SIMULATION

The result of the system can be explained as follows.

The system runs various GSM hardware tests to check support for all the hardware components. The system then opens the serial port RS232 for communication with the GSM module. On successful port opening the


system communicates with the GSM Module, but there is no communication if the run fails.

The system checks support for battery level, signal strength, GSM Module and other components by SMS sending and receiving capability. If these tests succeed the system gives a response of 'Ok'; if they fail then 'ERROR' is returned. The remote user sent an SMS with the security code (as defined in the program code) from a cell phone to the home appliances control system to turn the specified appliance on/off, and the system performed the respective function by simulating the appliance on/off as directed by the user.

Appliances        SMS sent by user        System response                      Feedback message (current status)
Air conditioner   AC on / AC off          AC button simulated to on/off        AC on / AC off
Light             Light on / Light off    Light button simulated to on/off     Light on / Light off
Fan               Fan on / Fan off        Fan button simulated to on/off       Fan on / Fan off

Fig. Results of home appliances control system with feedback response [1].

Achieved analytical results:-

• The system provided security such that it took no action on instructions received by SMS without the security code or from an unregistered number. The required task was performed only when an SMS with the correct security code instructed the system.

• The remote controlling capability of the system allowed the user to switch appliances on/off and check their status, by simulating the appliance as directed by the incoming SMS.

• The system automatically performed tests and checked support for available features, hardware and SMS sending and receiving capability, and configured the system accordingly.

The program code is written in a high level language such as C or C++; the compiler converts it into machine code, which is stored in the microcontroller. The software used is Eclipse with a Java runtime environment. The code is transferred from the computer to the microcontroller with the help of a USB port, USBtiny and RS232 device. AVRdude is used to upload the compiled code. The program code can be edited and compiled using the Eclipse software. The sender and receiver GSM numbers and the security code are defined in the program code.

VI. CONCLUSION

In this paper a low cost, secure, universally accessible, remotely controlled solution with feedback for automation of homes has been introduced [1]. The target of controlling home appliances remotely using an SMS-based system is achieved by this system. The GSM-based solution has proved to be remotely controllable, provides home automation and is cost-effective, as it can reduce the electric bill by efficient utilisation of the home appliances. The appliances are used only when they are required. It is of great use for industrial appliances also. Hence we can conclude that the required objectives and goals of the home appliances control system have been achieved.

VII. FUTURE DIRECTION

The basic level of home appliance control and remote monitoring with feedback has been implemented. In case of remote


monitoring, other home appliances can also be monitored and controlled, so that if the temperature rises above a certain level the system generates an SMS. Sensors can also be added to detect gas, smoke or fire, so that in case of emergency the system automatically generates an SMS.
In future the system will be a small box containing the microcontroller and GSM Module with a reduced size.

REFERENCES

1) Tahmina Begum, Md. Shazzat Hossain, Md. Bashir Uddin and Md. Shaheen Hasan Chowdhury, “Design and Development of Activation and Monitoring of Home Automation System via SMS through Microcontroller”, in 2009 International Conference on Computers and Devices for Communication.

2) B. Ciubotaru-Petrescu, D. Chiciudean, R. Cioarga, D. Stanescu, “Wireless Solutions for Telemetry in Civil Equipment and Infrastructure Monitoring”, in 3rd Romanian Hungarian Joint Symposium on Applied Computational Intelligence (SACI), May 25-26, 2006.

3) Z. Alkar, U. Buhur (2005), “An Internet Based Wireless Home Automation System for Multifunctional Devices”, in IEEE Consumer Electronics, 51(4), 1169-1174.

4) A. Alheraish, W. Alomar, and M. Abu-Al-Ela, “Programmable Logic Controller System for Controlling and Monitoring Home Application Using Mobile Network”, in IMTC 2006 - Instrumentation and Measurement Technology Conference, Sorrento, Italy, 24-27 April 2006, pp. 469.

5) A. R. Al-Ali, M. Al-Rousan and M. Mohandes, “GSM-Based Wireless Home Appliances Monitoring & Control System”, IEEE, pp. 237.

6) Liang, Li-Chen Fu and Chao-Lin W, “An integrated, flexible, and Internet-based control architecture for home automation system in the Internet era”, The IEEE Proceedings of the International Conference on Robotics and Automation, Volume: 2, 2002, pp: 1101-1106.

7) W. Qinglong, F. Y. Wang and L. Yueton, “A mobile-agent based distributed intelligent control system architecture for home automation”, The IEEE International Conference on Systems, Man, and Cybernetics, Volume: 3, 2001, pp: 1599-1605.

8) R. Shepherd, “Bluetooth wireless technology in the home”, Electronics & Communication Engineering Journal, V. 13, I. 5, Oct 2001, pp: 195-203.


Design and Implementation of Radix-2 & Radix-4
Booth Multipliers Using VHDL
S. S. Chauhan1, S.C. Yadav2, A. R. Khan3
Graphic Era University (E&CE Deptt.) 1, 2, 3
sudakarnith@gmail.com, subhash.yadav775@gmail.com, abdulamu@gmail.com

Abstract: Low power consumption and smaller area are some of the most important criteria for the fabrication of DSP systems and high performance systems. Optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so that improving speed results mostly in larger areas. In this paper, we try to determine the best solution to this problem by comparing a few multipliers.
This project presents an efficient implementation of a high speed multiplier using the shift and add method and the Radix_2 and Radix_4 modified Booth multiplier algorithms. In this paper we compare the working of the three multipliers by implementing each of them separately in a Transversal FIR filter.
Index Terms-Transversal FIR Filter, Booth algorithms, VHDL, Xilinx.

1. INTRODUCTION

Multipliers are key components of many high performance systems such as FIR filters, microprocessors, digital signal processors, etc. A system's performance is generally determined by the performance of the multiplier, because the multiplier is generally the slowest element in the system. Furthermore, it is generally the most area consuming. Hence, optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so that improving speed results mostly in larger areas. As a result, a whole spectrum of multipliers with different area-speed constraints has been designed, with fully parallel multipliers at one end of the spectrum and fully serial multipliers at the other end. In between are digit serial multipliers, where single digits consisting of several bits are operated on. These multipliers have moderate performance in both speed and area. However, existing digit serial multipliers have been plagued by complicated switching systems and/or irregularities in design. Radix 2^n multipliers, which operate on digits in a parallel fashion instead of bits, bring the pipelining to the digit level and avoid most of the above problems. They were introduced by M. K. Ibrahim. These structures are iterative and modular. The pipelining done at the digit level brings the benefit of constant operation speed irrespective of the size of the multiplier. The clock speed is only determined by the digit size, which is already fixed before the design is implemented.

2. THE BASIC TRANSVERSAL FILTER

An N-tap transversal filter was assumed as the basis for this adaptive filter. The value of N is determined by practical considerations. An FIR filter was chosen because of its stability. The use of the transversal structure allows relatively straightforward construction of the filter, as shown in figure 1.

Figure 1: Transversal FIR Filter

As the input, coefficients and output of the filter are all assumed to be complex valued, the natural choice for the property measurement is the modulus, or instantaneous amplitude. If y(k) is the complex valued filter output, then |y(k)| denotes the amplitude. The convergence error p(k) can be defined as follows:

$p(k) = |y(k)| - A$

where A is the amplitude in the absence of signal degradations. The error p(k) should be zero when the envelope has the proper value, and non-zero otherwise. The error carries sign information to indicate in which direction the envelope is in error. The adaptive algorithm is defined by specifying a performance/cost/fitness function based on the error p(k) and then developing a procedure that adjusts the filter impulse response so as to minimize or maximize that performance function.

$y_k = \sum_{i=0}^{N-1} w_k(i)\, x_{k-i}$


The gradient search algorithm was selected to simplify the filter design. The filter coefficient update equation is given by:

$W_{k+1} = W_k - \mu\, e_k X_k$

where X_k is the filter input at sample k, e_k = p_k · y_k is the error term at sample k, and μ is the step size for updating the weights.

3. MULTIPLIERS

3.1. BINARY Multiplier

A binary multiplier is an electronic hardware device used in digital electronics, a computer or another electronic device to perform rapid multiplication of two numbers in binary representation. It is built using binary adders.
The rules for binary multiplication can be stated as follows:
(i) If the multiplier digit is a 1, the multiplicand is simply copied down and represents the product.
(ii) If the multiplier digit is a 0, the product is also 0.
For designing a multiplier circuit we should have circuitry to do the following things:
(a) It should be capable of identifying whether a bit is 0 or 1.
(b) It should be capable of shifting partial products left.
(c) It should be able to add all the partial products to give the product as the sum of partial products.
(d) It should examine the sign bits. If they are alike, the sign of the product will be positive; if the sign bits are opposite, the product will be negative. The sign bit of the product, stored according to the above criteria, should be displayed along with the product.
From the above discussion we observe that it is not necessary to wait until all the partial products have been formed before summing them. In fact the addition of a partial product can be carried out as soon as the partial product is formed.

Binary multiplication (e.g. n = 4)
p = a × b
a = a_{n−1} a_{n−2} … a_1 a_0
b = b_{n−1} b_{n−2} … b_1 b_0
p = p_{2n−1} p_{2n−2} … p_1 p_0
where a: multiplicand, b: multiplier, p: product

          x x x x    a
          x x x x    b
          -------
          x x x x    b0·a·2^0
        x x x x      b1·a·2^1
      x x x x        b2·a·2^2
    x x x x          b3·a·2^3
    ---------------
    x x x x x x x x  p

3.2. Multiply Accumulate Circuit

Multiplication followed by accumulation is a common operation in many digital systems, particularly highly interconnected ones like digital filters, neural networks, data quantizers, etc. One typical MAC (multiply-accumulate) architecture is illustrated in the figure. It consists of multiplying 2 values, then adding the result to the previously accumulated value, which must then be stored back in the registers for future accumulations. Another feature of the MAC circuit is that it must check for overflow, which might happen when the number of MAC operations is large. This design can be done using components, because we have already designed each of the units shown in the figure. However, since it is a relatively simple circuit, it can also be designed directly. In any case the MAC circuit, as a whole, can be used as a component in applications like digital filters and neural networks.

3.3. Architecture OF A RADIX 2^n Multiplier

The architecture of a radix 2^n multiplier is given in the figure. This block diagram shows the multiplication of two numbers with four digits each. These numbers are denoted as V and U, while the digit size was chosen as four bits. The reason for this will become apparent in the following sections. Each circle in the figure corresponds to a radix cell, which is the heart of the design. Every radix cell has four digit inputs and two digit outputs. The input digits are also fed through the corresponding cells. The dots in the figure represent latches for pipelining. Every dot consists of four latches. The ellipses represent adders which are included to calculate the higher order bits. They do not fit the regularity of the design, as they are used to “terminate” the design at the boundary. The outputs are again in terms of four bit digits and are shown by W's. The 1's denote the clock period at which the data appear.

Figure 2: Radix 2^n multiplier architecture

3.4. BOOTH MULTIPLIER

The decision to use a Radix-4 modified Booth algorithm rather than the Radix-2 Booth algorithm is that in Radix-4, the number of partial products is reduced to n/2. Though Wallace Tree structure multipliers could be used


but in this format, the multiplier array becomes very large and requires large numbers of logic gates and interconnecting wires, which makes the chip design large and slows down the operating speed.

3.5. BOOTH MULTIPLICATION ALGORITHM:

(a) Booth Multiplication Algorithm for radix-2

The Booth algorithm gives a procedure for multiplying binary integers in signed 2's complement representation. We will illustrate the Booth algorithm with the following example:
Example: 2ten × (−4)ten
0010two × 1100two

Step 1: Making the Booth table
I. From the two numbers, pick the number with the fewest changes between consecutive bits, and make it the multiplier. For 0010: from 0 to 0 no change, 0 to 1 one change, 1 to 0 another change, so there are two changes. For 1100: from 1 to 1 no change, 1 to 0 one change, 0 to 0 no change, so there is only one change. Therefore, in the multiplication 2 × (−4), 2ten (0010two) is the multiplicand and (−4)ten (1100two) is the multiplier.
II. Let X = 1100 (multiplier) and Y = 0010 (multiplicand). Take the 2's complement of Y and call it −Y: −Y = 1110.
III. Load the X value in the table.
IV. Load 0 for the X-1 value; it should be the previous first least significant bit of X.
V. Load 0 in the U and V rows, which will hold the product of X and Y at the end of the operation.
VI. Make four rows, one for each cycle, because we are multiplying four-bit numbers.

U     V     X     X-1
0000  0000  1100  0     Load the value
                        1st cycle
                        2nd cycle
                        3rd cycle
                        4th cycle

Step 2: Booth Algorithm
The Booth algorithm requires examination of the multiplier bits and shifting of the partial product. Prior to the shifting, the multiplicand may be added to the partial product, subtracted from the partial product, or left unchanged according to the following rules:
I. Look at the first least significant bit of the multiplier “X” and the previous least significant bit of the multiplier “X-1”.
   0 0  Shift only
   1 1  Shift only
   0 1  Add Y to U, and shift
   1 0  Subtract Y from U, and shift (or add (−Y) to U and shift)
II. Take U and V together and apply an arithmetic right shift, which preserves the sign bit of a 2's complement number. Thus a positive number remains positive, and a negative number remains negative.
III. Shift X with a circular right shift, because this avoids using two registers for the X value.

Repeat the same steps until the four cycles are completed.

U     V     X     X-1
0000  0000  1100  0     Load the value
0000  0000  0110  0     1st cycle: shift only
0000  0000  0011  0     2nd cycle: shift only
1110  0000  0011  0     3rd cycle: add −Y (0000 + 1110 = 1110)
1111  0000  1001  1     3rd cycle: shift
1111  1000  1100  1     4th cycle: shift only

We have finished four cycles, so the answer is shown in the last row of U and V, which is 11111000two.
Note: By the fourth cycle, the two algorithms have the same values in the product register.

(b) Booth Multiplication Algorithm for radix-4:

One of the solutions for realizing high speed multipliers is to enhance parallelism, which helps to decrease the number of subsequent calculation stages. The original version of the Booth algorithm (Radix-2) had two drawbacks. They are:
(i) The number of add/subtract operations and the number of shift operations become variable, which is inconvenient in designing parallel multipliers.
(ii) The algorithm becomes inefficient when there are isolated 1's. These problems are overcome by using


modified Radix-4 Booth algorithm, which scans strings of three bits with the algorithm given below:
1) Extend the sign bit 1 position if necessary to ensure that n is even.
2) Append a 0 to the right of the LSB of the multiplier.
3) According to the value of each vector, each Partial Product will be 0, +y, -y, +2y or -2y.
The negative values of y are made by taking the 2's complement, and in this paper Carry-look-ahead (CLA) fast adders are used. The multiplication of y by 2 is done by shifting y by one bit to the left. Thus, in any case, in designing an n-bit parallel multiplier, only n/2 partial products are generated.
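The radix-2 and radix-4 Booth schemes can be compared with a small behavioural sketch. It is not the VHDL design of this paper: the function names are hypothetical, the combined U:V register is modelled as one signed Python integer so that the arithmetic right shift is simply `>> 1`, and the radix-4 digit is computed as -2·x(i) + x(i-1) + x(i-2), which is the standard modified Booth recoding.

```python
def booth_radix2(multiplicand, multiplier, n=4):
    """Radix-2 Booth multiplication of two n-bit two's complement integers.

    p models the combined U:V register as a signed integer; X is rotated
    right each cycle and x_prev plays the role of the X-1 bit.
    """
    x = multiplier & ((1 << n) - 1)
    p, x_prev = 0, 0
    for _ in range(n):
        x0 = x & 1
        if (x0, x_prev) == (1, 0):
            p -= multiplicand << n   # subtract Y from U
        elif (x0, x_prev) == (0, 1):
            p += multiplicand << n   # add Y to U
        p >>= 1                      # arithmetic right shift of U:V
        x_prev = x0
        x = (x >> 1) | (x0 << (n - 1))  # circular right shift of X
    return p


def booth_radix4_digits(multiplier, n=4):
    """Recode an n-bit multiplier (n even) into n/2 digits in {-2,...,+2}."""
    if n % 2:
        raise ValueError("extend the sign bit first so that n is even")
    x = (multiplier & ((1 << n) - 1)) << 1  # append a 0 right of the LSB
    digits = []
    for i in range(0, n, 2):
        low, mid, high = (x >> i) & 1, (x >> (i + 1)) & 1, (x >> (i + 2)) & 1
        digits.append(-2 * high + mid + low)
    return digits


def booth_radix4(multiplicand, multiplier, n=4):
    """Sum the n/2 partial products d * y * 4^k selected by the digits."""
    digits = booth_radix4_digits(multiplier, n)
    return sum(d * multiplicand * 4 ** k for k, d in enumerate(digits))


if __name__ == "__main__":
    print(booth_radix2(2, -4), booth_radix4(2, -4), booth_radix4_digits(-4))
```

For the worked example 2 × (−4), both functions return −8, and the radix-4 version needs only two partial products instead of four cycles.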

Table 1: Radix-4 modified Booth Algorithm scheme for odd values of i.

X(i)  X(i-1)  X(i-2)  y
0     0       0       +0
0     0       1       +y
0     1       0       +y
0     1       1       +2y
1     0       0       -2y
1     0       1       -y
1     1       0       -y
1     1       1       +0

4. RESULTS & CONCLUSION

Number of Slices            229
Number of 4 input LUTs      300
Number of bounded input     16
Number of bounded output    16
CLB Logic Power             47mW

This paper gives a clear concept of different multipliers and their implementation in a tap delay FIR filter. We found that the parallel multipliers are a much better option than the serial multiplier. We concluded this from the results for power consumption and total area. The power consumption for the radix-2 and radix-4 multipliers is shown in Table 2 and Table 3 respectively.

Table 2: Results of Radix-2 multiplier

Number of Slices            130
Number of 4 input LUTs      249
Number of bounded input     16
Number of bounded output    17
CLB Logic Power             79mW

Table 3: Results of Radix-4 multiplier

Multiplier output

In case of parallel multipliers, the total area is much less than that of serial multipliers. Hence the power consumption is also less. This is clearly depicted in our results. This speeds up the calculation and makes the system faster. While comparing the radix 2 and the radix 4 Booth multipliers we found that radix 4 consumes less power than radix 2. This is because it uses almost half the number of iterations and adders when compared to radix 2. When all the three multipliers were compared we found that array multipliers are the most power consuming and have the maximum area. This is because they use a large number of adders. As a result the system slows down, because it has to do a lot of calculation.
Multipliers are one of the most important components of many systems, so we always need to find a better solution for multipliers. Our multipliers should always consume less power and cover less area. So through our project we try to determine which of the three algorithms works the best. In the end we determine that the radix 4 modified Booth algorithm works the best.

REFERENCES
1. Y. C. Lim, “Single-Precision Multiplier with Reduced Circuit Complexity for Signal Processing Applications,” IEEE Trans. Computers, vol. 41, no. 10, pp. 1333-1336, Oct. 1992.
2. J. Isoaho, J. Pasanen, O. Vainio, and H. Tenhunen, “DSP System Integration and Prototyping with FPGAs,” Journal of VLSI Signal Processing, vol. 6, pp. 155-172, 1993.
3. S. S. Kidambi, F. El-Guibaly, and A. Antoniou, “Area-Efficient Multipliers for Digital Signal Processing Applications,” IEEE Trans. Circuits and Systems-II: Analog and Digital Signal Processing, vol. 43, no. 2, pp. 90-95, Feb. 1996.
4. J. E. Stine and O. M. Duverne, “Variations on Truncated Multiplication,” in Proc. Euromicro Symposium on Digital System Design, 2003, pp. 112-119.
5. C. Ebeling, C. Fisher, G. Xing, M. Shen, and H. Liu, “Implementing an OFDM Receiver on the RaPiD Reconfigurable Architecture,” IEEE Trans. on Computers, vol. 53, no. 11, pp. 1436-1448, 2004.
6. Xilinx Staff, “Celebrating 20 years of innovation,” Xcell Journal, no. 48, Spring 2004.
7. S. Knapp, “Using Programmable Logic to Accelerate DSP Functions,” http://www.xilinx.com/appnotes/dspintro.pdf


A Novel Approach to Design of a Multiplier Using
Reversible Logic Gates
S. S. Chauhan1, S.C. Yadav2, A. R. Khan3
Graphic Era University (E&CE Deptt.) 1, 2, 3
sudakarnith@gmail.com, subhash.yadav775@gmail.com, abdulamu@gmail.com

Abstract: Reversible logic gates are very much in demand for future computing technologies, as they are known to produce zero power dissipation under ideal conditions. This paper proposes an improved design of a multiplier using reversible logic gates. Multipliers are essential for the construction of various computational units of a quantum computer. The quantum cost of a reversible logic circuit can be minimized by reducing the number of reversible logic gates. For this purpose, two 4*4 reversible logic gates, called the DPG gate and the BVF gate, are used.

Index Terms- Reversible logic circuits; Quantum computing; Nanotechnology.

1. INTRODUCTION

Reversible logic has received great attention in recent years due to its ability to reduce power dissipation, which is the main requirement in low power VLSI design. Quantum computers are constructed using reversible logic circuits. Reversible logic has wide applications in low power CMOS and optical information processing, quantum computation and nanotechnology. R. Landauer [1] demonstrated that in high technology circuits and systems constructed using irreversible hardware, the loss of one bit of information dissipates KTln2 joules of energy, where K is Boltzmann's constant and T is the absolute temperature at which the operation is performed. The heat generated by the loss of one bit of information is very small at room temperature, but when the number of bits is large, as in high speed computational work, the heat dissipated becomes large enough to affect performance and reduce the lifetime of the components. Furthermore, Bennett [2] showed that reversible circuits do not lose information, owing to the one-to-one mapping between inputs and outputs; hence there is no extra energy loss.

In the design of reversible circuits two restrictions should be considered:
- Fan-out is not permitted
- Loops are not permitted
Due to these restrictions, synthesis of reversible circuits can be carried out from the inputs towards the outputs and vice versa.

2. BACKGROUND OF REVERSIBLE CIRCUITS

An n×n reversible circuit consists of n inputs and n outputs, with a mapping of each input assignment to a unique output assignment and vice versa. In the synthesis of reversible circuits direct fan-out is not allowed, as the one-to-many concept is not reversible. However, fan-out in reversible circuits is achieved using additional gates. A reversible circuit should be designed using the minimum number of reversible logic gates.

A. Reversible Gates and Circuits

There are two main types of reversible gates: Toffoli [3] and Fredkin [4]. An n×n Toffoli gate passes the first (n-1) inputs to the outputs unaltered (as control signals), and the nth input (the target signal) is inverted at the last output if all the previous (n-1) signals are '1'. Assuming $x_i$ as input and $y_i$ as output, then [3]:

$y_i = x_i, \quad 1 \le i \le n-1$
$y_n = x_n \oplus (x_1 x_2 \cdots x_{n-1})$

Toffoli Gate: A 3*3 Toffoli gate [3] is shown in figure 1. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The outputs are defined by P=A, Q=B, R=AB xor C. Quantum cost of a Toffoli gate is 5.

Fig.1 Toffoli Gate

A Toffoli gate with one (two) input(s) is also known as a NOT (CNOT or Feynman) gate respectively.

Fredkin Gate: A 3*3 Fredkin gate [4] is shown in figure 2. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The output is defined by P=A, Q=A′B xor AC and R=A′C xor AB. Quantum cost of a Fredkin gate is 5.
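The one-to-one mapping that makes these gates reversible can be checked by enumeration. The following is a minimal sketch in Python that uses the output equations given above for the 3*3 Toffoli and Fredkin gates; the helper names `is_reversible` and `and_gate` are hypothetical and serve only as illustration.

```python
from itertools import product

def toffoli(a, b, c):
    """3*3 Toffoli gate: P = A, Q = B, R = AB xor C."""
    return a, b, (a & b) ^ c

def fredkin(a, b, c):
    """3*3 Fredkin gate: P = A, Q = A'B xor AC, R = A'C xor AB."""
    not_a = a ^ 1
    return a, (not_a & b) ^ (a & c), (not_a & c) ^ (a & b)

def and_gate(a, b):
    """Ordinary AND padded to two outputs, included only as a contrast."""
    return a & b, 0

def is_reversible(gate, n_inputs):
    """True when every input vector maps to a distinct output vector."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_inputs)}
    return len(outputs) == 2 ** n_inputs

print("Toffoli reversible:", is_reversible(toffoli, 3))
print("Fredkin reversible:", is_reversible(fredkin, 3))
print("AND reversible:", is_reversible(and_gate, 2))
```

The AND gate fails the check because three different input pairs share the same output, which is exactly the information loss that the reversible gates avoid.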


Fig.2 Fredkin Gate

BVF Gate: A 4*4 BVF gate is shown in figure 3. This is a reversible double XOR gate and can be used for duplication of the required inputs to meet the fan-out requirements. The input vector is I (A, B, C, D), the output vector is O (P, Q, R, S) and the output is defined by P = A, Q = A xor B, R = C and S = C xor D. Quantum cost of a BVF gate is 2. In the proposed design this gate is used to copy the operand bits, and it is shown that the number of gates required for copying is reduced by 50% with the same quantum cost.

Fig.3 BVF Gate

Peres Gate: A 3*3 Peres gate [10] is shown in figure 4. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The output is defined by P = A, Q = A xor B and R = AB xor C. Quantum cost of a Peres gate is 4. In the proposed design the Peres gate is used because of its lowest quantum cost.

Fig.4 Peres Gate

Double Peres gate: A Double Peres Gate (DPG) is shown in figure 5. The inputs and outputs are as shown in Table-1. The full adder using DPG is obtained with C=0 and D=Cin, and its quantum cost is calculated to be equal to 6 from its quantum realization [11] shown in figure 5.

Fig.5 Double Peres Gate

Table 1 Truth Table of Double Peres Gate

Inputs     Outputs
A B C D    P Q R S
0 0 0 0    0 0 0 0
0 0 0 1    0 0 1 0
0 0 1 0    0 0 0 1
0 0 1 1    0 0 1 1
0 1 0 0    0 1 1 0
0 1 0 1    0 1 0 1
0 1 1 0    0 1 1 1
0 1 1 1    0 1 0 0
1 0 0 0    1 1 1 0
1 0 0 1    1 1 0 1
1 0 1 0    1 1 1 1
1 0 1 1    1 1 0 0
1 1 0 0    1 0 0 1
1 1 0 1    1 0 1 1
1 1 1 0    1 0 0 0
1 1 1 1    1 0 1 0

B. REVERSIBLE GATES IMPLEMENTED USING ELEMENTARY QUANTUM GATES

Reversible implementations of 3×3 Toffoli, Peres and Fredkin gates using elementary quantum gates are shown in figure 6, figure 7, and figure 8 respectively.

Fig.6 Implementation of the 3×3 Toffoli gate [11]

Fig.7 Implementation of the 3×3 Peres gate [12]

Fig.8 Implementation of the 3×3 Fredkin gate [11, 13]

3. PARALLEL MULTIPLIERS

There are two types of multipliers, known as sequential and parallel multipliers. The first type iteratively computes the final product. It needs feedback and loops for the iterative portion. This design is too slow and not suitable for reversible implementation. The second type (i.e., the parallel multiplier) conventionally consists of two main steps:
- Partial product generation
- Multi-operand addition

Algorithm 1 (The n×n parallel multiplier):
Inputs: Two n-bit operands
X: $x_{n-1} \ldots x_1, x_0$, Y: $y_{n-1} \ldots y_1, y_0$
Output: A 2n-bit product Z: $z_{2n-1} \ldots z_1, z_0$
I. Generate n partial products
$P_i$: $p_{i,n-1} \ldots p_{i1}, p_{i0}$ where $0 \le i \le n-1$,
such that $p_{ij} = x_j \cdot y_i$
II. Produce the final product $Z = \sum_i P_i \, 2^i$
where $0 \le i \le n-1$

The operation of a 4*4 reversible multiplier is shown in figure 9. It consists of 16 partial product bits of the X and Y inputs to perform 4*4 multiplication. However, it can be extended to any other n*n reversible multiplier.
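The two steps of Algorithm 1 can be sketched in ordinary (irreversible) Python as follows. This illustrates only the arithmetic, not the reversible circuit itself; the function name `parallel_multiply` and the default width n = 4 are assumptions made for the example.

```python
def parallel_multiply(x, y, n=4):
    """Multiply two n-bit unsigned integers using the two steps of Algorithm 1."""
    assert 0 <= x < 2 ** n and 0 <= y < 2 ** n
    x_bits = [(x >> j) & 1 for j in range(n)]
    y_bits = [(y >> i) & 1 for i in range(n)]

    # Step I: partial product bits p[i][j] = x_j AND y_i (n*n bits in total).
    p = [[x_bits[j] & y_bits[i] for j in range(n)] for i in range(n)]

    # Step II: multi-operand addition; partial product row i has weight 2**i.
    z = 0
    for i in range(n):
        row_value = sum(bit << j for j, bit in enumerate(p[i]))
        z += row_value << i
    return p, z

partial_products, product_value = parallel_multiply(13, 11)
print("partial product bits:", sum(len(row) for row in partial_products))
print("product:", product_value)
```

For n = 4 the sketch produces the 16 partial product bits mentioned above, and the weighted sum of the four rows gives the 8-bit product.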

                x3  x2  x1  x0
                y3  y2  y1  y0
Partial Product Generation:
                p03 p02 p01 p00
            p13 p12 p11 p10
        p23 p22 p21 p20
    p33 p32 p31 p30
Multi-Operand Addition:
z7  z6  z5  z4  z3  z2  z1  z0

Fig.9 The operation of the 4×4 parallel multiplier

3.1 Partial Product Generation

Partial products can be generated in parallel using 16 Peres gates as shown in figure 10.

Fig.10 Partial product generator using Peres gates

An important point that should be considered is that in an n×n parallel multiplier (in reversible logic), n copies of each bit of the operands are needed to generate the partial products in parallel. Therefore, some fan-out gates are needed. The number of fan-out gates needed for the reversible 4×4 multiplier is 24. It uses 4*4 BVF gates with two constant inputs as shown in figure 11.

Fig.11 Fan-out circuit to duplicate the operand bits

3.2 Multi-operand Addition (MOA)

As discussed in the previous section, the next step is an n-operand addition. To implement this part of the circuit, we use a carry save adder (CSA). The CSA tree reduces the four operands to two. Thereafter, a Carry Propagating Adder (CPA) adds these two operands and produces the final 8-bit product. The proposed four-operand adder shown in figure 12 uses the Double Peres Gate (DPG) as a reversible full adder and the Peres gate as a half adder.

Fig.12 Four-operand Addition

The proposed reversible multiplier circuit uses 8 reversible DPG gates and 4 Peres gates. The Peres gate half adder has a quantum cost of 4 and the DPG adder has a quantum cost of 6, and the total quantum cost of this circuit is 64.

4. RESULTS & DISCUSSION

We have encountered three different designs for reversible multipliers in the literature, all of which, for the sake of simplicity, implement a 4-bit multiplier. Therefore, in this section, we compare our proposed multiplier with prior counterparts based on the 4-bit reversible multiplier. In order to have a reasonable comparison, we first examine the detailed implementation of the previous works. Next, we compare the proposed design, based on the quantum cost and the number of garbage outputs, with the previously mentioned cases as follows:


A. Reversible 4-bit multiplier of [10]

For the partial product generation phase of their multiplier, they used 24 2×2 Toffoli (TOF2) gates to prepare the essential fan-outs. Moreover, 16 Fredkin gates are used to generate the partial products. For the multi-operand addition phase they used three 4-bit binary adders, each composed of 4 TSG gates, plus an extra TSG for the generation of the most significant bit of the final product.
By and large, the overall gate consumption of their reversible multiplier is equal to (24×TOF2) + (16×Fredkin) + (13×TSG). The overall critical path of their multiplier consists of two TOF2 gates, a Fredkin gate, and seven TSG gates. Unfortunately, there is no reference for how the TSG can be implemented. Moreover, there is nothing mentioned in [14] about how a TSG can be built by means of elementary 2×2 reversible/quantum gates. For the sake of a fair comparison we assume the QC and GO of a TSG gate to be equal to those of a full-adder (FA). Nevertheless, we believe that the QC and GO of a TSG gate are much more than those of a FA.

B. Reversible 4-bit multiplier of [11]

For the partial product generation phase of their multiplier, like that of [10], they used 24 TOF2 gates to prepare the necessary fan-outs. Moreover, 16 Peres gates are used to generate the partial products. For the multi-operand addition phase they used 12 MKG gates, where an MKG gate is a 4×4 reversible gate. Therefore, the overall gate count of their reversible multiplier is (24×TOF2) + (16×Peres) + (12×MKG). The overall critical path of their multiplier consists of two TOF2 gates, a Peres gate, and seven MKG gates. As is the case for TSG, there is also no reference for the implementation of the MKG. Therefore, although we believe that the QC, Depth, and GO of a TSG gate are much more than those of a FA, we assume, for the sake of a fair comparison, the QC, Depth, and GO of an MKG gate to be the same as those of a full-adder.

C. Reversible 4-bit multiplier of [12]

This multiplier and that of [11] are largely the same, except for the multi-operand addition phase, which is implemented in [12] by means of 8 HNG gates along with four Peres gates. This modification leads to the following critical path: (2×TOF2) + (2×Peres) + (6×HNG).

D. The proposed reversible 4-bit multiplier

In the proposed design, for the partial product generation phase, like those of [11] and [12], we take advantage of Peres gates to generate the partial products. For the multi-operand addition phase, as shown in Fig. 15, we use 8 full-adders and 4 half-adders. The critical path of this new design consists of two TOF2 gates plus a Peres gate for the partial product generation phase, and 5 full-adders plus a half-adder for the multi-operand addition phase. Table-2 gives the comparative study of partial product generation of the circuit.

TABLE.2 Partial product generation

Partial product generation   No. of gates (N)   No. of garbage outputs (GO)   Quantum cost (QC)
Proposed                     20                 32                            88
TSG [10]                     40                 32                            104
MKG [11]                     40                 32                            88
HNG [12]                     40                 32                            88

Table-3 gives the comparative study of multi-operand addition of the proposed design with other existing designs.

TABLE.3 Multi-operand addition (MOA)

Multi-operand addition (MOA)   No. of gates (N)   No. of garbage outputs (GO)   Quantum cost (QC)
Proposed                       12                 20                            62
TSG [10]                       13                 26                            130
MKG [11]                       12                 24                            120
HNG [12]                       12                 20                            64

A comparative study of the different reversible multipliers is shown in Table-4.

TABLE.4 Comparative study of different reversible multipliers

Reversible multipliers   No. of gates (N)   No. of garbage outputs (GO)   Quantum cost (QC)
Proposed                 40                 50                            150
TSG [10]                 13                 26                            130
MKG [11]                 52                 56                            208
HNG [12]                 53                 58                            234

From the above study, in our opinion the proposed design is better than the other existing designs, as its total circuit cost is much lower.

5. CONCLUSION

The multiplier is a basic arithmetic cell in computer arithmetic units. Furthermore, a reversible implementation of this unit is necessary for quantum computers. For this purpose, various designs can be found in the literature. In this paper we proposed a novel reversible multiplier with no increase in quantum cost or in the number of garbage outputs with respect to previous counterparts. In the proposed design, partial products were generated using Peres gates. Next, the final product was obtained using a multi-operand adder including a CSA tree and carry propagate addition.
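The statement in Section 2 that the DPG gate acts as a full adder when C = 0 and D = Cin can be checked against Table 1. The sketch below assumes the output equations P = A, Q = A xor B, R = A xor B xor D and S = (A xor B)D xor AB xor C. These equations are not stated in the text; they are an assumption chosen because they reproduce all sixteen rows of Table 1. The sketch also tallies the quantum cost of the multi-operand adder from the gate counts and per-gate costs given in Section 3.2.

```python
from itertools import product

def dpg(a, b, c, d):
    """Double Peres gate; output equations are assumed, chosen to match Table 1."""
    q = a ^ b
    return a, q, q ^ d, (q & d) ^ (a & b) ^ c

# Rows of Table 1, written as "ABCD" -> "PQRS".
TABLE_1 = {
    "0000": "0000", "0001": "0010", "0010": "0001", "0011": "0011",
    "0100": "0110", "0101": "0101", "0110": "0111", "0111": "0100",
    "1000": "1110", "1001": "1101", "1010": "1111", "1011": "1100",
    "1100": "1001", "1101": "1011", "1110": "1000", "1111": "1010",
}

def matches_table_1():
    """Compare the assumed equations with every row of Table 1."""
    for bits in product((0, 1), repeat=4):
        key = "".join(str(bit) for bit in bits)
        out = "".join(str(bit) for bit in dpg(*bits))
        if TABLE_1[key] != out:
            return False
    return True

def full_adder(a, b, cin):
    """Full adder from one DPG with C = 0 and D = Cin: sum on R, carry on S."""
    _, _, total, carry = dpg(a, b, 0, cin)
    return total, carry

# Multi-operand adder: 8 DPG gates (cost 6 each) plus 4 Peres gates (cost 4 each).
moa_quantum_cost = 8 * 6 + 4 * 4

print("matches Table 1:", matches_table_1())
print("MOA quantum cost:", moa_quantum_cost)
```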
REFERENCES


[1] R. Landauer, "Irreversibility and heat generation in the computing
process", IBM J. Res. Develop., Vol. 5, pp. 183–191, July 1961.
[2] C.H. Bennett, “Logical Reversibility of Computation”, IBM
Research and Development, pp. 525-532, November 1973.
[3] T. Toffoli, "Reversible computing", MIT, Tech. Rep., 1980.
[4] E. Fredkin and T. Toffoli, “Conservative logic,” Int'l J. Theoretical
Physics, Vol. 21, pp. 219–253, 1982.
[5] A. Peres, “Reversible logic and quantum computers”, Physical
Review A, Vol 32, pp. 3266-3276, 1985.
[6] J. A.Smolin and D. P.DiVincenzo, “Five Two-Bit Quantum Gates are
Sufficient to Implement the Quantum Fredkin Gate”, Physical
Review A (Atomic, Molecular, and Optical Physics), Vol. 53, No. 4,
pp. 2855-2856, April 1996.
[7] D. Maslov, G. W. Dueck and D. M. Miller, “Simplification of Toffoli
Networks via Templates”. Proc. 16th Symposium on Integrated
Circuits and Systems Design, pp. 53-58, September 2003.
[8] W. N. N. Hung, X. Song, G. Yang, J. Yang and M. A Perkowski,
“Quantum Logic Synthesis by Symbolic Reachability Analysis”,
Proc. 41st annual conference on Design automation DAC,
pp.838-841, January 2004.
[9] D. Maslov, C. Young, D. M. Miller, and G. W. Dueck, “Quantum
Circuit Simplification Using Templates”, Proc. Design Automation
and Test in Europe (DATE), Vol 2, pp.1208-1213, March 2005.
[10] H. Thapliyal and M.B. Srinivas, “Novel Reversible Multiplier
Architecture Using Reversible TSG Gate”, Proc. IEEE International
Conference on Computer Systems and Applications, pp. 100-103,
March 2006.
[11] M. Shams, M. Haghparast and K. Navi, “Novel Reversible
Multiplier Circuit in Nanotechnology”, World Applied Science
Journal Vol. 3, No. 5, pp. 806-810, 2008.
[12] M. Haghparast, S. Jafarali Jassbi, K. Navi and O. Hashemipour,
“Design of a Novel Reversible Multiplier Circuit Using HNG Gate
in Nanotechnology”, World Applied Science Journal Vol. 3, No. 6, pp.
974-978, 2008.
[13] M.S. Islam et al., “Low cost quantum realization of reversible
multiplier circuit”, Information technology journal, 8 (2009) 208.

VHDL environment for floating point Arithmetic Logic Unit - ALU design and simulation

Rajit Ram Singh (1), Vinay Kumar Singh (2), Poornima Shrivastav (3), Dr. GS Tomar (4)
singhrajitram@gmail.com, VINDHYA, Indore, India
vinay.singh@tatatechnologies.com, TATA Motors Ltd., Lucknow, India
shrivastava.poornima@gmail.com, MIST Gwalior, India

ABSTRACT

A VHDL environment for floating point arithmetic and logic unit (ALU) design using pipelining is introduced; pipelining is the novelty in the ALU design. Pipelining provides a high performance ALU and is used to execute multiple instructions simultaneously. In a top-down design approach, four arithmetic modules (addition, subtraction, multiplication and division) are combined to form a floating point ALU. Each module is divided into sub-modules. Two selection bits are combined to select a particular operation. Each module is independent of the others. All modules in the ALU design are realized using VHDL, and design functionalities are validated through VHDL simulation. All components and modules were successfully synthesized and simulated in the Xilinx 12.1i software.

Keywords: ALU (Arithmetic Logic Unit), Top-Down design, Validation, Floating point, Test-Vector

I. INTRODUCTION

Floating point describes a system for representing numbers that would be too large or too small to be represented as integers. Floating point representation is able to retain its resolution and accuracy compared to fixed point representation. Numbers are in general represented approximately to a fixed number of significant digits and scaled using an exponent. The base for the scaling is normally 2, 10 or 16. The typical number that can be represented exactly is of the form:

Significant digits × Base^exponent, i.e. $S \times B^{e}$

The IEEE 754 standard for floating point representation was introduced in 1985. Based on this standard, floating point representation for digital systems should be platform-independent, and data are interchanged freely among different digital systems.

• An arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations. The ALU is a fundamental building block of the central processing unit (CPU) of a computer. The inputs to the ALU are the data to be operated on (called operands) and a code from the control unit indicating which operation to perform. Its output is the result of the computation.

In many designs the ALU also takes or generates as inputs or outputs a set of condition codes from or to a status register. These codes are used to indicate cases such as carry-in or carry-out, overflow, divide-by-zero, etc. A floating point unit also performs arithmetic operations between two values, but it does so for numbers in floating point representation. An ALU with floating point operations is called an FPU.

• The top-down approach (also known as step-wise design) is essentially the breaking down of a system to gain insight into its compositional sub-systems. In a top-down approach an overview of the system is formulated, specifying but not detailing any first-level subsystems. Each subsystem is then refined in yet greater detail, sometimes in many additional subsystem levels, until the entire specification is reduced to base elements. A top-down model is often specified with the assistance of "black boxes", which make it easier to manipulate. However, black boxes may fail to elucidate elementary mechanisms or be detailed enough to realistically validate the model.

• In order to stimulate a device off board, a series of logical vectors must be applied to the device inputs. These vectors are called test vectors and are mostly used to stimulate the design inputs and check the outputs against the expected values.

• A pipeline is a technique used in the design of computers and other digital electronic devices to increase their instruction throughput (the number of instructions that can be executed in a unit of time).

The fundamental idea is to split the processing of a computer instruction into a series of independent steps, with storage at the end of each step. This allows the computer's control circuitry to issue instructions at the processing rate of the slowest step, which is much faster than the time needed to perform all steps at once. The term pipeline refers to the fact that each step is carrying data at once (like water), and each step is connected to the next (like the links of a pipe). The origin of pipelining is thought to be the IBM Stretch project (1954). Implementing a pipeline requires that the various phases of floating point operations be separated and pipelined into sequential stages.

We propose a VHDL environment for floating point ALU design and simulation, to ease the description, verification, simulation and hardware realization. VHDL is a widely adopted standard and has numerous capabilities that are suited for designs of this sort. The use of VHDL for modeling is especially appealing since it provides a formal description of the system and allows the use of specific description styles to cover the different abstraction levels (architectural, register transfer and logic level) employed in design.

II. MATERIAL AND METHODS

The main objective of this paper is to describe the implementation of pipelining in the design of the floating point ALU using VHDL. The first sub-objective is to design a 16-bit floating point ALU operating on IEEE 754 standard floating point representations, supporting the four basic arithmetic operations: addition, subtraction, multiplication and division. The second sub-objective is to model the behavior of the ALU design using VHDL.

Specifications for a 16-bit floating-point ALU design:
i. Inputs A and B and the output result are 16-bit binary floating point.
ii. Operands A and B operate as follows: A (operation) B = result. The operation can be addition (+), subtraction (-), multiplication (*) or division (/).
iii. 'Selection' is a 2-bit input signal that selects the ALU operation, as shown in Table 1.
iv. 'Status' is a 4-bit output signal that works as a flag, as in a microprocessor.

Selection   Operation
00          Addition
01          Subtraction
10          Multiplication
11          Division

Table1: select ALU operation.

Output   Status
0000     Normal operation
0001     Overflow
0010     Underflow
0100     Result zero
1000     Divide by zero

v. The clock pulse is only provided to the module which is selected using the demux.
vi. Concurrent processes are used to allow processes to run in parallel, hence pipelining.

Fig:1 top level view of the ALU design

The ALU is separated into smaller modules: addition, subtraction, multiplication and division, demux and mux. Each arithmetic module is further divided into smaller modules. Fig. 1 shows the top level view of the ALU. It consists of four functional arithmetic modules, three demultiplexers and two multiplexers. The demuxes and muxes are used to route input operands and the clock signal to the correct functional modules. They also route outputs and status signals based on the selector pins.
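The selection and status interface specified above can be sketched behaviourally in Python. This is only an illustration under stated assumptions: the 16-bit operands are assumed to use the IEEE 754 binary16 layout (via the `e` format of Python's `struct` module), the function names are hypothetical, and the arithmetic is done with ordinary Python floats instead of the pipelined modules described in the paper.

```python
import struct

# Selection codes from Table 1 and status codes from the output status table.
OPERATIONS = {0b00: "add", 0b01: "sub", 0b10: "mul", 0b11: "div"}
NORMAL, OVERFLOW, UNDERFLOW = 0b0000, 0b0001, 0b0010
RESULT_ZERO, DIVIDE_BY_ZERO = 0b0100, 0b1000

def to_half_bits(value):
    """Pack a Python float into a 16-bit pattern (binary16 layout assumed)."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def from_half_bits(bits):
    """Unpack a 16-bit pattern into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def alu(a_bits, b_bits, selection):
    """Return (result_bits, status) for A (operation) B."""
    a, b = from_half_bits(a_bits), from_half_bits(b_bits)
    operation = OPERATIONS[selection]
    if operation == "div" and b == 0.0:
        return 0, DIVIDE_BY_ZERO
    if operation == "add":
        exact = a + b
    elif operation == "sub":
        exact = a - b
    elif operation == "mul":
        exact = a * b
    else:
        exact = a / b
    if exact == 0.0:
        return 0, RESULT_ZERO
    try:
        result_bits = to_half_bits(exact)
    except OverflowError:
        # struct refuses values too large for the 16-bit format.
        return 0, OVERFLOW
    if from_half_bits(result_bits) == 0.0:
        # A nonzero result that rounds to zero is reported as underflow.
        return 0, UNDERFLOW
    return result_bits, NORMAL

bits, status = alu(to_half_bits(1.5), to_half_bits(2.25), 0b00)
print(from_half_bits(bits), format(status, "04b"))
```

Returning a zero result word together with a one-hot status code is a choice made for the sketch; the paper does not specify the result value in the exceptional cases.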
Fig: 2 view of selection of an add module

After a module completes its task, outputs and status signals are sent to the muxes, where they are multiplexed with other outputs from the corresponding modules to produce the output result. Selector pins are routed to these muxes such that only the output from the currently operating functional module is sent to the output port. The clock is specifically routed rather than tied permanently to each module, since only the selected functional module needs a clock signal. This provides power savings, since the clock is supplied to the required modules only, and avoids invalid results at the output, since the clock is used as a trigger in every process.

Pipelining floating point addition module:

The addition module has two 16-bit inputs and one 16-bit output. The selection input is used to enable or disable the module. This module is further divided into 4 sub-modules: zero check, align, add_sub and normalize.

Fig: 3 pipeline floating point addition

Zero check module:

This module detects zero operands early in the operation and, based on the detection result, sets two status signals. This eliminates the need for subsequent processes to check for the presence of zero operands. Table 1 summarizes the algorithm.

I/P a   I/P b   Zero_a1   Zero_b1
0       0       1         1
0       NZ      1         0
NZ      0       0         1
NZ      NZ      0         0

Tab:1 setting zero check bit

Align module

In this module, operations are performed based on the status signals from the previous stage. Zero operands are checked in the align module as well. This module introduces the implied bit into the operands, as shown in Table 2.

Zero_a1 xor zero_b1   a_sign           Implied bit for a   Implied bit for b
0                     X (don't care)   0                   0
1                     1                0                   1
1                     0                1                   0

Tab:2 setting of implied bit

Add_sub module

This module performs the actual addition and subtraction of operands. First, the operands are checked via the status signals. Results are obtained automatically if either of the operands is zero, as shown in Table 3; no normalization is needed when no calculation is done here. Otherwise, the operation is carried out based on the signs and the relative magnitudes of the mantissas, as summarized in Table 4, and a status signal is set to one to indicate the need for normalization by the next stage.

Zero_a2 & zero_b2   Zero_a1 xor zero_b1   Zero_a2   Result
0                   0                     X         Perform add_sub
0                   1                     1         b stage2
0                   1                     0         a stage2
1                   X                     X         0

Tab:3 check for add_sub module

Operation   a_sign xor b_sign   a>b   Result   Sign
a+b         0                   X     a+b      +ve
(-a)+(-b)   0                   X     a+b      -ve
a+(-b)      1                   Yes   a-b      +ve
a+(-b)      1                   No    b-a      -ve
(-a)+b      1                   Yes   a-b      -ve
(-a)+b      1                   No    b-a      +ve

Tab: 4 add_sub operation
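The zero check of Table 1 and the sign and magnitude rules of Table 4 can be written as a short Python sketch. Integer magnitudes stand in for aligned mantissas here, and the function names are hypothetical; the sketch only mirrors the table logic, not the VHDL processes.

```python
def zero_check(a_mag, b_mag):
    """Return (zero_a1, zero_b1): each flag is 1 when the operand is zero (Table 1)."""
    return int(a_mag == 0), int(b_mag == 0)

def add_sub(a_sign, a_mag, b_sign, b_mag):
    """Signed-magnitude addition following Table 4.

    Signs use 0 for positive and 1 for negative. Magnitudes are assumed
    to be aligned already. Returns (result_sign, result_magnitude).
    """
    if a_sign ^ b_sign == 0:
        # Same signs: add the magnitudes and keep the common sign.
        return a_sign, a_mag + b_mag
    if a_mag > b_mag:
        # Different signs, a > b: compute a - b and keep the sign of a.
        return a_sign, a_mag - b_mag
    # Different signs, a not greater than b: compute b - a, keep the sign of b.
    return b_sign, b_mag - a_mag

print(zero_check(0, 7))
print(add_sub(0, 5, 1, 3))   # a + (-b) with a > b
print(add_sub(1, 3, 0, 5))   # (-a) + b with a < b
```

When the magnitudes are equal and the signs differ, the sketch follows the "No" row of Table 4 and returns a zero magnitude with the sign of b; how the hardware treats that case is not stated in the text.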


Normalize module

The input is normalized and packed into the IEEE 754 floating point representation. If the normalize status signal is set, normalization is performed; otherwise the MSB is dropped.

Pipeline floating point subtraction module:

The subtraction module has two 16-bit inputs and one 16-bit output. The selection input is used to enable or disable the entity depending on the operation. This module is divided further into four sub-modules: zero-check, align, add_sub and normalize. The subtraction algorithm differs only in the add_sub module, where the subtraction operator changes the sign of the result. The remaining three modules are similar to those in the addition module. Tab 5 and Tab 6 summarize the operation.

Zero_a2 & zero_b2   Zero_a2 xor zero_b2   Zero_a2   b_sign   Result            Sign
0                   0                     X         X        Perform add_sub   NA
0                   1                     1         0        b_stage2          b_sign=1
0                   1                     1         1        b_stage2          b_sign=0
0                   1                     0         X        a_stage2          a_sign
1                   X                     X         X        0                 NA

Tab: 5 checks for add_sub module

Operation   a_sign xor b_sign   a>b   Result   Sign
(-a)-b      1                   X     a+b      -ve
a-(-b)      1                   X     a+b      +ve
(-a)-(-b)   0                   Yes   a-b      -ve
(-a)-(-b)   0                   No    b-a      +ve
a-b         1                   Yes   a-b      +ve
a-b         1                   No    b-a      -ve

Tab: 6 add_sub operation and sign fixing

Pipelined floating point multiplication module

The multiplication entity has three 16-bit inputs and two 16-bit outputs. The selection input is used to enable or disable the entity. The multiplication module is divided into check-zero, check-sign, add-exponent and normalize-and-concatenate modules, which are executed concurrently. The status signal indicates special result cases such as overflow, underflow and result zero. In this project, pipelined floating point multiplication is divided into three stages (Fig. 4). Stage 1 checks whether the operand is zero and reports the result accordingly. Stage 2 determines the product sign, adds the exponents and multiplies the fractions. Stage 3 normalizes and concatenates the product.

Fig 4. Pipeline structure of multiplication module

Check-zero module

Initially the two operands are checked to determine whether they contain a zero. If one of the operands is zero, the zero_flag is set to 1 and the output result is zero. If neither of them is zero, the inputs in IEEE 754 format are unpacked and assigned to the check sign, add exponent and multiply mantissa modules. The mantissa is packed with the hidden bit 1.

Add exponent module

The module is activated if the zero flag is set; otherwise zero is passed to the next stage and exp_flag is set to 0. Two extra bits are added to the exponent to indicate overflow and underflow.

Multiply mantissa module

In this stage the zero_flag is checked first. If the zero_flag is set to 0, no calculation or normalization is performed and the mant_flag is set to 0. If both operands are nonzero, the multiplication is done and the mant_flag is set to 1 to indicate that this operation has been executed.

Check sign module

This module determines the product sign of the two operands. The product is positive when the two operands have the same sign; otherwise it is negative. The sign bits are compared using an XOR circuit. The sign_flag is set to 1.

Normalize and concatenate module

This module checks for overflow and underflow. Underflow occurs if the 9th bit is 1. Overflow occurs if the 8th bit is 1. If exp_flag, sign_flag and mant_flag are set, the normalization is carried out. Otherwise, 16 zero bits are assigned to the result.
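The three multiplication stages described above can be sketched as follows. The bit layout is an assumption, since the paper does not give the field widths of its 16-bit format: the sketch uses the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits with bias 15, 10 fraction bits). It handles normal operands only, truncates instead of rounding, and reports status as a plain string; the function names are hypothetical.

```python
BIAS = 15          # exponent bias (binary16 layout assumed)
FRAC_BITS = 10     # stored fraction bits
EXP_MAX = 31       # all-ones exponent, reserved for infinities and NaNs

def unpack(bits):
    """Split a 16-bit word into (sign, biased exponent, fraction)."""
    return bits >> 15, (bits >> FRAC_BITS) & 0x1F, bits & 0x3FF

def multiply(a_bits, b_bits):
    """Multiply two normal 16-bit floats; returns (result_bits, status)."""
    sign_a, exp_a, frac_a = unpack(a_bits)
    sign_b, exp_b, frac_b = unpack(b_bits)

    # Stage 1: check-zero.
    if (exp_a == 0 and frac_a == 0) or (exp_b == 0 and frac_b == 0):
        return 0, "result zero"

    # Stage 2: product sign by XOR, exponent addition, mantissa multiplication.
    sign = sign_a ^ sign_b
    exponent = exp_a + exp_b - BIAS
    mant_a = (1 << FRAC_BITS) | frac_a        # restore the hidden bit
    mant_b = (1 << FRAC_BITS) | frac_b
    mantissa = mant_a * mant_b                # value in [1, 4), scaled by 2**20

    # Stage 3: normalize and concatenate.
    if mantissa >> (2 * FRAC_BITS + 1):       # product is 2 or more: shift right once
        mantissa >>= 1
        exponent += 1
    fraction = (mantissa >> FRAC_BITS) & 0x3FF  # drop hidden bit, truncate low bits
    if exponent >= EXP_MAX:
        return 0, "overflow"
    if exponent <= 0:
        return 0, "underflow"
    return (sign << 15) | (exponent << FRAC_BITS) | fraction, "normal"

result, status = multiply(0x3E00, 0x4000)     # 1.5 * 2.0 in the assumed layout
print(hex(result), status)
```

Because the product of two normalized mantissas lies between 1 and 4, at most one right shift is needed here; the left-shift normalization described in the text applies when the leading bit is not already set.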
During the normalization operation, if the mantissa MSB is 1, no normalization is needed; the hidden bit is dropped and the remaining bits are packed and assigned to the output port. Otherwise the normalization module sets the mantissa MSB to 1: the current mantissa is shifted left until a 1 is encountered, and for each shift the exponent is decreased by 1. Once the mantissa MSB is 1, normalization is completed and the first bit, which is the implied bit, is dropped. The remaining bits are packed and assigned to the output port. The final normalized product with the correct biased exponent is concatenated with the product sign.

Pipelined floating point division module

The division entity has three 16-bit inputs and two 16-bit outputs. The selection input is used to enable or disable the entity. The division module is divided into six modules: check zero, align dividend, check sign, subtract exponent, divide mantissa and normalize concatenate. Each module is executed concurrently. The status signal indicates special cases such as overflow, underflow, result zero and divide by zero. Fig. 5 shows the pipeline structure of the division module.

Fig: 5 pipeline structure of the division module

Check-zero module:

Initially the two operands are checked to determine whether they contain a zero. If one of the operands is zero, the zero_flag is set to 1 and the output result is zero. If neither of them is zero, the inputs in IEEE 754 format are unpacked and assigned to the check sign, add exponent and multiply mantissa modules. The mantissa is packed with the hidden bit 1.

Add exponent module:

The module is activated if the zero flag is set; otherwise zero is passed to the next stage and exp_flag is set to 0. Two extra bits are added to the exponent to indicate overflow and underflow.

Multiply mantissa module:

In this stage the zero_flag is checked first. If the zero_flag is set to 0, no calculation or normalization is performed and the mant_flag is set to 0. If both operands are nonzero, the multiplication is done and the mant_flag is set to 1 to indicate that this operation has been executed.

Check sign module:

This module determines the product sign of the two operands. The product is positive when the two operands have the same sign; otherwise it is negative. The sign bits are compared using an XOR circuit. The sign_flag is set to 1.

Align dividend module:

This module compares both mantissas. If mant_a is greater than or equal to mant_b, then mant_a must be aligned. For every one-bit right shift of the mant_a mantissa, the mant_a exponent is increased by 1. This increase may result in an exponent overflow, in which case an overflow flag is set. Otherwise, the process continues with the parallel operation of exponent subtraction and mantissa division. The align_flag is set to 1.

Subtract exponent module

This module is activated if the zero flag is set. If not, a zero value is passed to the next stage and exp_flag is set to 0. Two extra bits are added to the exponent to indicate overflow. Here the two exponents are subtracted and the bias is added back. After this the exp_flag is set to 1.

Divide mantissa module

In this stage, the align flag is checked first. If the align flag is 0, no mantissa division is performed and mant_flag is set to 0. If both operands are not zero, mant_a is divided by mant_b. In the division algorithm, comparison between the two mantissas is done by subtracting the two values and checking the sign of the output.

III. SIMULATION AND DISCUSSION

The design is verified through simulation, which is done in a bottom-up fashion. Small modules are simulated in separate test benches before they are integrated and tested as a whole.

Align RTL1:
Simulation Result of Align:

Demux wave:

RTL of Demux:

RTL division:

RTL of division:

Simulation of division:

Multiplexer:

Simulation result of Mux:

IV. CONCLUSION

By simulating with various test vectors, the proposed approach of pipelined floating point ALU design using VHDL has been successfully designed, tested and implemented. Currently, we are conducting further research that considers further reduction in the hardware complexity in terms of synthesis, and the full download of the code into an Altera FLEX10K: EPF10K10LC FPGA chip in an LC84 package for hardware realization.

Reference:
[1] ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic, IEEE, New York, 1985.
[2] M. Daumas, C. Finot, "Division of Floating Point Expansions with an Application to the Computation of a Determinant", Journal of Universal Computer Science, vol. 5, no. 6, pp. 323-338, June 1999.
[3] AMD Athlon Processor technical brief, Advanced Micro Devices Inc., Publication no. 22054, Rev. D, Dec. 1999.
[4] S. Chen, B. Mulgrew, and P. M. Grant, "A clustering technique for digital communications channel equalization using radial basis function networks," IEEE Trans. Neural Networks, vol. 4, pp. 570-578, July 1993.
[5] Mamun Bin Ibne Reaz, MIEEE, Md. Shabiul Islam, MIEEE, Mohd. S. Sulaiman, MIEEE, ICSE2002 Proc., 2002, Penang, Malaysia.

OTRA based Grounded Inductor and its application


Rajeshwari Pandey(member IEEE),Neeta Pandey(member IEEE),Ajay Singh,
B.Sriram, Kaushalendra Trivedi
Delhi Technological University, Delhi

Abstract: In this paper a lossless grounded inductor has been proposed using the Operational Transresistance Amplifier (OTRA). PSPICE simulation results have been included to demonstrate the performance and verify the theoretical analysis.

Index Terms: Inductor simulators, OTRA, grounded inductor.

I. INTRODUCTION
The operational transresistance amplifier (OTRA) is gaining considerable attention amongst analog integrated circuit designers as it inherits all the advantages offered by current-mode techniques. The OTRA is a high gain current input, voltage output device. The input terminals of the OTRA are internally grounded, thereby eliminating response limitations due to parasitic capacitances and resistances at the input [1]. Although the OTRA is commercially available from several sources under the name of current differencing amplifier or Norton amplifier, it has not gained attention until recently. These commercial realizations do not provide internal ground at the input port and they allow the input current to flow in one direction only. The former disadvantage limits the functionality of the OTRA, whereas the latter forces the use of an external DC bias current, leading to complex and unattractive designs [2]. Several high performance CMOS OTRA topologies have been proposed in the literature [1,2,3,4], leading to growing interest in OTRA based analog signal processing circuits. In the recent past the OTRA has been extensively used as an analog building block for realizing a number of signal processing circuits such as filters [5,6,7,8], oscillators [9,10,11], multivibrators [12,13] and immittance simulation circuits [9,14,15,16], an application which is dealt with in this paper. A lossy grounded inductor simulation is presented in [14], whereas a negative inductance has been proposed in [15]. Lossless grounded inductor topologies have been presented in [9,16]. In this paper another lossless grounded inductor topology, with its applications, is proposed, which will give further flexibility to analog circuit designers.

II. CIRCUIT DESCRIPTION
The OTRA is a three terminal device, shown symbolically in Fig. 1, and its port relations can be characterized by matrix (1).

(1)

Fig. 1 OTRA circuit symbol

For ideal operation the transresistance gain Rm approaches infinity and forces the input currents to be equal. Thus the OTRA must be used in a negative feedback configuration. The proposed circuit is shown in Fig. 2.
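For reference, the ideal OTRA is commonly described in the literature by the port relations below, where $V_p$, $V_n$, $I_p$ and $I_n$ are the voltages and currents at the two input terminals, $V_o$ and $I_o$ belong to the output terminal, and $R_m$ is the transresistance gain. This is the usual textbook form and is only assumed to correspond to matrix (1) of this paper.

```latex
\begin{bmatrix} V_p \\ V_n \\ V_o \end{bmatrix}
=
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ R_m & -R_m & 0 \end{bmatrix}
\begin{bmatrix} I_p \\ I_n \\ I_o \end{bmatrix}
```

Both inputs sit at ground potential and $V_o = R_m (I_p - I_n)$, so as $R_m \to \infty$ a finite output under negative feedback requires $I_p = I_n$.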


Fig. 2. Grounded Inductor

Routine analysis yields

(2)

subject to the condition

(3)

For simulation, the CMOS implementation of the OTRA proposed in [4] and reproduced in Fig. 3 was used. The aspect ratios used for the different transistors are the same as in [4] and are given in Table 1. The supply voltages taken are ±1.5 V for SPICE simulation.

Fig. 3. CMOS Implementation of OTRA [4]

Table 1 Aspect ratio of the transistors in OTRA circuit

Transistor    W(µm)/L(µm)
M1-M3         100/2.5
M4            10/2.5
M5, M6        30/2.5
M7            10/2.5
M8-M11        50/2.5
M12, M13      100/2.5
M14           50/0.5

III. APPLICATION

The proposed inductor is used to design (i) a high pass filter and (ii) an LC oscillator.

A. High Pass Filter
A high pass filter, as shown in Fig. 4(a), can be constructed using the proposed inductor. The transfer function for the high pass response is

(4)

where

(5)

Fig. 4(a) High Pass Filter

To verify the theoretical propositions, a HP filter with cutoff frequency 159 KHz is designed, for which the component values are computed as R = 1 KΩ, C = 1 nF and Leq = 1 mH. For this value of Leq the inductor component values are chosen as 1 KΩ and 1 nF. The frequency response of the filter simulated using PSPICE is depicted in Fig. 4(b) and is found to be in close agreement with the theoretical response.
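The quoted design values can be cross-checked numerically. The sketch below assumes, since the transfer function of equations (4) and (5) is not reproduced here, that the filter's characteristic frequency follows the usual second-order form f0 = 1/(2π√(Leq·C)); the function name is illustrative only.

```python
import math

def resonant_frequency_hz(l_eq, c):
    """Characteristic frequency of an LC-type second-order section (assumed form)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_eq * c))

# Component values quoted in the text.
R = 1e3      # ohms
C = 1e-9     # farads
L_EQ = 1e-3  # henries

f0 = resonant_frequency_hz(L_EQ, C)
f_rc = 1.0 / (2.0 * math.pi * R * C)  # RC corner frequency, for comparison

print(f"f0 from Leq and C: {f0 / 1e3:.2f} kHz")
print(f"1/(2*pi*R*C):      {f_rc / 1e3:.2f} kHz")
```

With these values both expressions evaluate to about 159.15 kHz, consistent with the 159 KHz cutoff quoted above.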

Fig.5 (a) Oscillator

Fig. 4 (b) HP Response

B. LC Oscillator
An LC oscillator is designed as a signal generating application, employing the proposed inductor, and is shown in Fig. 5(a). The condition of oscillation and frequency of oscillation are given as

(6)

(7)

A typical simulation for inductor component values of 1 KΩ and 10 pF, which results in Leq = 0.1 mH, and C = 1 nF is shown in Fig. 5(b). The simulated frequency of oscillation is 775 KHz and is in close agreement with the theoretically calculated value of 795.77 KHz. Fig. 5(c) shows the output frequency spectrum. Total harmonic distortion is measured as 4.906%.

Fig. 5(b) Oscillator Output.

Fig. 5(c) Frequency Spectrum


V. CONCLUSION
A new OTRA based lossless grounded inductor topology is presented. A high pass filter and an oscillator are realized to illustrate the applications of the proposed topology. PSPICE simulation results are included to verify the theoretical propositions.
It is expected that the proposed circuits will be useful in the design of analog signal processing and generation applications and will provide further possibilities to the designer in the field.

References
[1] J.-J. Chen, H.-W. Tsao and C.-C. Chen, “Operational Transresistance Amplifier using CMOS Technology,” Electronics Letters, Vol. 28, No. 22, pp. 2087-2088, October 1992.
[2] K. N. Salama and A. M. Soliman, “CMOS OTRA for analog signal processing applications,” Microelectron. J., 30, pp. 235–245, 1999.
[3] Hasan Mostafa, Ahmed M. Soliman, “A Modified realization of the OTRA,” Frequenz, 60 (2006), pp. 70-76.
[4] Abedelrahman K. Kafrawy and Ahmed M. Soliman, “A modified CMOS differential OTRA,” Int. J. Elect. Comm. (AEU), Vol. 63, issue 12, Dec 2009, pp. 1067-1071.
[5] Selcuk Kilinc, Ugur Cam, “Cascadable allpass and notch filters employing single operational transresistance amplifier,” Computers and Electrical Engineering, 31 (2005), pp. 391-401.
[6] Cem Cakir, Ugur Cam and Oguzhan Cicekoglu, “Novel All pass Filter Configuration Employing Single OTRA,” IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol. 52, No. 3, March 2005, pp. 122-125.
[7] J.-J. Chen, H.-W. Tsao and S.-I. Liu, “Parasitic-capacitance-insensitive current-mode filters using OTRA,” IEE Proc.-Circuits Devices Syst., Vol. 142, No. 3, June 1995.
[8] Ahmet Gokcen, Ugur Cam, “MOS-C single amplifier biquads using the OTRA,” Int. J. Elect. Commun. (AEU), Vol. 63, (2009), pp. 660-664.
[9] K. N. Salama and A. M. Soliman, “Novel oscillators using operational transresistance amplifier,” Microelectron. J., 31, pp. 39-47, 2000.
[10] U. Cam, “A Novel Single-Resistance-Controlled Sinusoidal Oscillator Employing Single Operational Transresistance Amplifier,” Analog Integrated Circuits and Signal Processing, Vol. 32, pp. 183-186, August 2002.
[11] Rajeshwari Pandey, Mayank Bothra, “Multiphase Sinusoidal oscillator using Operational Transresistance Amplifier,” IEEE Symposium on Industrial Electronics and Applications (ISIEA-2009), pp. 371-376, Oct 2009.
[12] C. L. Hou, H. C. Chien and Y. K. Lo, “Squarewave generators employing OTRAs,” IEE Proc.-Circuits Devices Syst., Vol. 152, no. 6, Dec 2005.
[13] Y. K. Lo, H. C. Chien, H. G. Chiu, “Switch Controllable OTRA Based Bistable Multivibrator,” IET Circuits Devices Syst., 2008, Vol. 2, No. 4, pp. 373–382.
[14] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, “Novel grounded parallel immittance simulator topologies employing single OTRA,” AEU- Int. J. Electronics and Communications, vol. 57, no. 4, pp. 287-290, 2003.
[15] Selcuk Kilinc, Khaled N. Salama, and Ugur Cam, “Realization of fully Controllable negative Inductance with single operational Transresistance Amplifier,” Circuits Systems Signal Processing, Vol. 25, no. 1, pp. 47-57, 2006.
[16] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, “Novel two OTRA-based grounded Immittance simulator topologies,” Analog Integrated Circuits and Signal Processing, Vol. 39, pp. 169-175, 2004.


OTRA based Precision Full Wave Rectifier


Rajeshwari Pandey (member IEEE) , Ajay Singh, B.Sriram, Kaushalendra Trivedi
Department of Electronics and Communication Engineering, Delhi Technological University, Delhi

Abstract: This paper presents an operational transresistance amplifier based precision full-wave rectifier using an all-pass filter as a 90° phase shifter. The circuit gives a dc output voltage that is almost the same as the peak input voltage over a frequency range of 50 Hz–30 MHz, with a very low ripple voltage and low harmonic distortion.

Index Terms: OTRA, all-pass filter, harmonic distortions, precision rectifier, ripple voltage.

I. INTRODUCTION
State-of-the-art analog integrated circuit design is receiving a tremendous boost due to the development and application of current-mode processing [1]. It is well known that the key performance features of the current-mode technique are an inherently wide bandwidth which is virtually independent of closed loop gain, greater linearity and a large dynamic range. Recently the operational transresistance amplifier (OTRA) has emerged as an effective alternative analog building block. It is a high gain current input, voltage output amplifier [2]. The OTRA, being a current processing building block, inherits all the advantages of the current mode technique. It is also free from parasitic input capacitances and resistances, as its input terminals are virtually grounded, thus eliminating response limitations due to parasitics. The OTRA is now being used as an analog building block for realizing a number of circuits having applications in signal processing and generation [2-6].
Precise rectification is one of the important requirements in instrumentation and measurement. It finds applications in ac voltmeters, ammeters, signal-polarity detectors, averaging circuits, sample-and-hold circuits, peak value detectors and amplitude-modulated signal detectors [7-10]. In general, diodes are used as rectifiers; they have the drawback of a threshold voltage, and hence rectification is not possible below a voltage of ∼0.7 V for a silicon diode and ∼0.3 V for a germanium diode. Low-voltage rectification is required in applications such as amplitude modulated signal detectors. Slew rate limitation prevents the fast turning on of the diodes in the high frequency range and thus results in distortion. In view of the above, a precision rectifying circuit using OTRAs is proposed in this paper. The performance of the circuit has been verified in the frequency range 50 Hz–30 MHz using P-SPICE.

II. PROPOSED RECTIFIER CIRCUIT
The circuit symbol of the OTRA is shown in Fig. 1 and its port relations can be characterized by the following matrix:

(1)

Fig. 1 OTRA circuit symbol


For ideal operation the transresistance gain Rm approaches infinity and forces the input currents to be equal. Thus the OTRA must be used in a negative feedback configuration.
Fig. 2(a) shows the block diagram of the proposed rectifier circuit. It consists of an all-pass filter that acts as a 90° phase shifter, two squaring circuits, one summer, and one square rooter. The phase of the input sinusoidal signal Vin = A sin(2πft) is shifted by 90° by adjusting the resistance (R) and capacitor (C) of the RC network of the all-pass filter in accordance with equation (2). The amplitude of the phase-shifted output of the all-pass filter remains the same as that of the input signal.

φ = −2 tan⁻¹(2πfRC) = 90°.    (2)

The output of the all-pass filter can be written as Vp = A cos(2πft). The squaring of Vin and Vp is done using analog multipliers. These squared signals are summed using a summer circuit implemented with an OTRA. The summed signal, after square rooting, becomes ~A, which provides a rectified output.

Fig. 2(a) Block diagram of proposed circuit

Fig. 2(b) Circuit diagram of proposed circuit

III. SIMULATION RESULTS
To verify the theoretical propositions the rectifier circuit is simulated using the P-SPICE program. For simulation, the CMOS implementation of the OTRA proposed in [11] and reproduced in Fig. 3 was used. Simulation was carried out for the frequency range 50 Hz–30 MHz and the results are compared with a diode based full wave rectifier circuit.

Fig. 3 CMOS implementation of OTRA [11]

A. Rectified Output
The waveform tests were performed for both the proposed circuit and previously reported circuits. Fig. 4(a) shows an input sinusoidal signal of frequency 100 Hz; the rectified output of the proposed circuit is shown in Fig. 4(b).

Fig. 4(a) Sinusoidal input

Fig. 4(b) Rectified output of proposed circuit

It is seen that the output of the proposed circuit contains less ripple in comparison to the previously reported circuit [7], in which one diode conducts for one half cycle and the other diode conducts for the other half cycle, as shown in Fig. 4(c).
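The signal chain of Fig. 2(a) can be checked numerically. The sketch below is an idealised model, not the OTRA circuit itself: the all-pass filter is replaced by an exact 90° shift, and the function names and sample values are illustrative assumptions. It also derives the RC product for which the magnitude of φ in equation (2) equals 90°.

```python
import math

def rc_for_quadrature(f_hz):
    """RC product for which 2*atan(2*pi*f*R*C) equals 90 degrees."""
    return 1.0 / (2.0 * math.pi * f_hz)

def rectified_output(amplitude, f_hz, t):
    """Ideal chain: 90-degree shift, two squarers, summer, square rooter."""
    v_in = amplitude * math.sin(2.0 * math.pi * f_hz * t)
    v_p = amplitude * math.cos(2.0 * math.pi * f_hz * t)  # phase-shifted copy
    return math.sqrt(v_in ** 2 + v_p ** 2)

A, F = 10e-3, 100.0  # 10 mV, 100 Hz, as in the low-voltage test
samples = [rectified_output(A, F, n / (F * 50)) for n in range(50)]
print(min(samples), max(samples))  # both close to A
print(rc_for_quadrature(F))        # RC in seconds for the 100 Hz case
```

Because sin² + cos² = 1, the output of this ideal chain equals the peak amplitude A at every instant, which is why no diode threshold enters the result.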


Fig. 4(c) Rectified output of diode based circuit

In the proposed circuit, rectification is not performed by diodes, and therefore it has fewer ripples. Low voltage rectification, i.e. below the threshold level of the diode, was also carried out. Fig. 5 shows the typical output of the proposed circuit for 100 Hz frequency.

Fig. 5(a) Sinusoidal input of frequency 100 Hz and amplitude 10 mV along with 90 degrees phase shifted signal

Fig. 5(b) Rectified output with input signal

Similarly a high frequency signal of frequency 100 KHz and amplitude 10 mV is analyzed and the result is shown in Fig. 6; Fig. 6(b) is the rectified output.

Fig. 6(a) 10 mV, 100 KHz input signal and 90 degrees phase shifted signal

Fig. 6(b) Rectified output with input signal

B. Harmonic Distortion
The harmonics in the signal cause distortion in the output of the circuit. Thus the harmonic components need to be examined for circuit performance analysis. Being periodic in nature, these harmonic components can be analyzed by Fourier series. The magnitude of each harmonic of a waveform is obtained with the fast Fourier transform using PSPICE. Fig. 7(a) shows the FFT of the input signal of frequency 100 Hz along with the rectified output, whereas Fig. 7(b) shows the FFT for an input of frequency 100 kHz.
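As a rough illustration of this kind of harmonic analysis, the sketch below computes the magnitude of the first few harmonics of a sampled periodic waveform with a direct Fourier sum and derives the total harmonic distortion from them. It is a generic pure-Python example with a made-up test signal, not a reproduction of the PSPICE analysis; all names are illustrative.

```python
import math

def harmonic_magnitudes(samples, n_harmonics):
    """Amplitude of harmonics 1..n_harmonics for one period of samples."""
    n = len(samples)
    mags = []
    for k in range(1, n_harmonics + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        mags.append(2.0 * math.hypot(re, im) / n)
    return mags

def thd_percent(mags):
    """THD as the ratio of harmonic content to the fundamental, in percent."""
    fundamental, harmonics = mags[0], mags[1:]
    return 100.0 * math.sqrt(sum(m * m for m in harmonics)) / fundamental

# Hypothetical test signal: a fundamental plus a 5 % third harmonic.
N = 256
signal = [math.sin(2 * math.pi * i / N) + 0.05 * math.sin(2 * math.pi * 3 * i / N)
          for i in range(N)]
print(f"THD = {thd_percent(harmonic_magnitudes(signal, 9)):.2f} %")
```

For the synthetic signal the computed THD is 5 %, matching the amplitude ratio that was put in; a simulator's FFT output can be post-processed the same way.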


Fig. 7(a)

Fig. 7(b)

Fig. 8(a)

Fig. 8(b)

Previously reported circuit gives a ripple factor of 0.483 [12].
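The 0.483 figure can be reproduced from the definition of ripple factor used in this paper, r = Vrms/VDC. The sketch below applies it to an ideal full-wave rectified sine (no diode drop, no filtering), which is an assumption about how the reference value was obtained; the helper name is illustrative.

```python
import math

def ripple_factor(samples):
    """r = (rms of the AC component) / (DC component) of a sampled waveform."""
    v_dc = sum(samples) / len(samples)
    v_ac_rms = math.sqrt(sum((s - v_dc) ** 2 for s in samples) / len(samples))
    return v_ac_rms / v_dc

# One period of an ideal full-wave rectified sine, |sin(x)|.
N = 10000
full_wave = [abs(math.sin(2 * math.pi * (i + 0.5) / N)) for i in range(N)]

print(f"numerical ripple factor:      {ripple_factor(full_wave):.3f}")
print(f"closed form sqrt(pi^2/8 - 1): {math.sqrt(math.pi ** 2 / 8 - 1):.3f}")
```

Both lines print 0.483, so the quoted value is consistent with an unfiltered diode full-wave rectifier.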
Fig. 7(a) shows the frequency spectrum of the rectified output and the 100 Hz input. The frequency spectrum of the rectified output and input at a frequency of 100 kHz is shown in Fig. 7(b).

C. Ripple Factor
The ripple factor of the output has been computed. Ripple factor is given by

$$ r = \frac{V_{rms}}{V_{DC}} $$

where
r = ripple factor,
Vrms = rms value of the AC component of the output,
VDC = DC component present in the output.

In Fig. 8(a) the ripple factor is shown for an input of 100 Hz, and it is clearly seen that the maximum value of the ripple factor is 0.316 while its average value is 0.03. The ripple factor for an input of frequency 100 kHz is shown in Fig. 8(b), having an average value of 0.035.

V. CONCLUSION
In this paper, a precision full wave rectifier is implemented using the operational transresistance amplifier (OTRA). The circuit provides an output voltage amplitude almost equal to the input voltage. The circuit works well in the frequency range of 50 Hz–30 MHz. The excellent performance of the circuit is obtained by using the OTRA, which makes it work over a much higher frequency range than the previously reported circuit.

REFERENCES:
[1] C. Toumazou, F. J. Lidgey, “Analog IC design: The current mode approach,” Peter Peregrinus Ltd., 1990.


[2] Salama Khaled N., Soliman Ahmed M., “CMOS operational transresistance amplifier for analog signal processing,” Microelectronics Journal, Vol. 30, No. 9, pp. 235-245, March 1999.
[3] U. Cam, “A Novel Single-Resistance-Controlled Sinusoidal Oscillator Employing Single Operational Transresistance Amplifier,” Analog Integrated Circuits and Signal Processing, Vol. 32, pp. 183-186, August 2002.
[4] Rajeshwari Pandey, Mayank Bothra, “Multiphase Sinusoidal Oscillators Using Operational Trans-Resistance Amplifier,” IEEE Symposium on Industrial Electronics and Applications (ISIEA 2009), pp. 371-376, October 4-6, 2009.
[5] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, “Novel grounded parallel immittance simulator topologies employing single OTRA,” AEU- Int. J. Electronics and Communications, vol. 57, no. 4, pp. 287-290, 2003.
[6] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, “Novel two OTRA-based grounded Immittance simulator topologies,” Analog Integrated Circuits and Signal Processing, Vol. 39, pp. 169-175, 2004.
[7] S. J. G. Gift and B. Maundy, “Versatile precision full-wave rectifiers for instrumentation and measurement,” IEEE Trans. Instrum. Meas., vol. 56, no. 5, pp. 1703–1710, Oct. 2007.
[8] S. R. Djukic, “Full-wave current conveyor precision rectifier,” Serbian J. Elect. Eng., vol. 5, no. 2, pp. 263–271, Nov. 2008.
[9] P. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and Design of Analog Integrated Circuits. New York: Wiley, 2001.
[10] S. J. G. Gift, “A high-performance full-wave rectifier circuit,” Int. J. Electron., vol. 87, no. 8, pp. 925–930, Aug. 2000.
[11] Hasan Mustafa, Ahmed M. Soliman, “A Modified realization of the OTRA,” Frequenz, 60 (2006), pp. 70-76.
[12] R. A. Gayakwad, Op-Amps and Linear Integrated Circuits, 3rd ed. New Delhi, India: Prentice-Hall, 2007, pp. 316–318.


GaN-based HEMTs for Communication Circuits


T R Lenka1 and A K Panda2
National Institute of Science and Technology
Palur Hills, Berhampur, Odisha, Pin-761008
E-mail: trlenka@gmail.com1 and akpanda62@hotmail.com2

Abstract:
In this paper the role of GaN-based high electron mobility transistors (HEMTs) in microwave communication circuits is discussed. Due to superior material properties, GaN-based devices produce a record maximum frequency of oscillation of around 300 GHz and a high cutoff frequency. They have become one of the prime candidates for solid-state power amplifiers at frequencies up to 50 GHz. The unique properties of GaN include a peak saturation velocity of 2.5×10^7 cm/s, a high breakdown electric field of 3.3 MV/cm, and output power densities in excess of 10 W/mm at 40 GHz and more than 2 W/mm at 80.5 GHz. Recent wide-spread R&D to advance HEMT technology has led to high-speed low-power LSI circuits and ultra-low noise amplifiers. In this paper the microwave characteristics of the HEMT, which include available gain (GA), maximum available gain (MAG/GMax), unilateral gain (GU), Maximum Stable Gain (MSG), Noise Figure (NF) and minimum noise figure (NFmin), are discussed. The potential usability of the HEMT as an amplifier and as an oscillator is also discussed.

Key Words: GaN, HEMT, Microwave, MMIC, Gain

1. INTRODUCTION
GaN-based semiconductor devices are currently the main focus of great interest in academia as well as industry because of their very interesting material properties [1]. These semiconductor alloys have a wide bandgap (>3.4 eV), high temperature sustainability and high electric breakdown fields, which allow them to be used for the fabrication of short-wavelength (blue, UV) optical devices and high-frequency, high power electronics [2].
Due to the conduction band discontinuity, a two dimensional electron gas (2DEG) channel is created at the heterointerface between two undoped materials by piezoelectric and spontaneous polarizations [3]. The 2DEG is the heart of the HEMT. The modeling of GaN-based HEMTs still presents many challenges to the worldwide research community. Due to the lack of scattering effects, the mobility of the electrons is very high in the 2DEG, which leads the device towards microwave applications [4]. Advanced HEMT Monolithic Millimeter-wave Integrated Circuits (MMIC) for millimeter and sub-millimeter-wave power sources and power amplifiers, for applications in heterodyne receivers, transmitters, and communication circuits, are highly popular and dominated by GaN based devices [5]. In discrete device applications, low-noise HEMTs are commercially available and are in use in broadcast satellite and radio telescope systems. This paper reviews the state-of-the-art HEMT technology for communication systems.
The commonly used HEMT structure is discussed in section 2. The microwave characteristics of GaN-based HEMTs are discussed in section 3 and finally the conclusion is drawn in section 4.

2. HEMT STRUCTURE
The AlGaN/GaN heterostructure is generally grown on a sapphire/SiC substrate by molecular beam epitaxy (MBE) or metal organic vapor phase epitaxy (MOVPE) [6]. For Schottky ohmic contacts Ti/Al/Ni/Au is mostly used. The TCAD simulated structure of this device is shown in figure 1. Schrödinger’s wave equation and the Poisson equation are solved self-consistently to give rise to a two dimensional electron gas (2DEG), which is created at the heterointerface of AlGaN/GaN due to the growth of wide bandgap material over narrow bandgap material and is the heart of any heterostructure device [7]-[8]. The electron concentration in the 2DEG is dependent upon the conduction band discontinuity. However, in order to reduce the scattering in the 2DEG formed at the heterointerface, a binary nanoscale AlN layer is epitaxially grown at the heterointerface of the AlGaN/GaN heterostructure [7]-[8].


Fig. 1 Simulated Structure of AlGaN/GaN-based HEMT

The two dimensional electron gas (2DEG) created at the heterointerface of AlGaN/GaN with a mole fraction of 0.3 is shown in figure 2.

Fig. 2 Formation of 2DEG at the heterointerface

3. MICROWAVE CHARACTERISTICS

3.1 HEMT as an Amplifier

Fig. 3 Small Signal Model of HEMT

Two port network analyses have been done with Microwave Office to understand the microwave characteristics of the HEMT [9]. When embarking on any amplifier design it is very important to understand the stability of the device chosen, otherwise the amplifier may well turn into an oscillator. The microwave parameters include available gain (GA), maximum available gain (MAG), unilateral gain (GU), Maximum Stable Gain (MSG), Noise Figure (NF) and minimum noise figure (NFmin) [9]. The small signal model of the HEMT is shown in figure 3.
The main way of determining the stability of a device is to calculate Rollett’s stability factor (K), which is calculated using a set of S-parameters for the device at the frequency of operation. We can calculate two stability parameters, K and |Δ|, to give an indication of whether a device is likely to oscillate or not, or whether it is conditionally or unconditionally stable [9].

$$ K = \frac{1 - |S_{11}|^2 - |S_{22}|^2 + |\Delta|^2}{2\,|S_{12} S_{21}|} > 1 \qquad (1) $$

where $|\Delta| = |S_{11} S_{22} - S_{12} S_{21}| < 1$.
The parameters must satisfy K > 1 and |Δ| < 1 for a transistor to be unconditionally stable. Once the K factor is calculated and we find that the device is unconditionally stable, we can calculate the maximum available gain (MAG).

$$ \mathrm{MAG} = G_{Max} = \left|\frac{S_{21}}{S_{12}}\right| \left( K - \sqrt{K^2 - 1} \right) \qquad (2) $$

When K is on the limit of unity the above equation reduces to

$$ \mathrm{MSG} = \left|\frac{S_{21}}{S_{12}}\right| \qquad (3) $$

In this case the MAG is known as the maximum stable gain (MSG) and is shown in figure 4.

Fig. 4 MAG/GMax and MSG of HEMT
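The stability test and gain expressions of equations (1) to (3) can be evaluated directly from a set of S-parameters. The following sketch uses the standard Rollett formulas; the S-parameter values and helper names are invented for illustration and do not come from the simulated device.

```python
import cmath
import math

def rollett_stability(s11, s12, s21, s22):
    """Return (K, |delta|) for a two-port described by complex S-parameters."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    return k, abs(delta)

def max_gain_db(s12, s21, k):
    """MAG in dB when K >= 1, otherwise the maximum stable gain |S21/S12|."""
    ratio = abs(s21) / abs(s12)
    gain = ratio * (k - math.sqrt(k * k - 1)) if k >= 1 else ratio
    return 10 * math.log10(gain)

def polar(mag, deg):
    """Complex number from magnitude and phase in degrees."""
    return cmath.rect(mag, math.radians(deg))

# Hypothetical S-parameters at a single frequency.
s11, s12 = polar(0.60, -60), polar(0.05, 30)
s21, s22 = polar(4.00, 120), polar(0.50, -40)

k, mag_delta = rollett_stability(s11, s12, s21, s22)
stable = k > 1 and mag_delta < 1
print(f"K = {k:.3f}, |delta| = {mag_delta:.3f}, unconditionally stable: {stable}")
print(f"gain = {max_gain_db(s12, s21, k):.2f} dB")
```

Falling back to |S21/S12| when K < 1 mirrors the common practice of quoting MSG for a potentially unstable device, since MAG is only defined for K ≥ 1.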


As frequency increases from 1 to 50 GHz the maximum available gain (GMax) and maximum stable gain (MSG) decrease and the two coincide. It means K=1 and the device is unconditionally stable. The various gains at different frequencies are shown in figure 4.

Fig. 5 Available Gain (GA) and Unilateral Gain (GU) of HEMT

Mason’s unilateral gain (MUG/GU) and the available gain are plotted in figure 5, in a frequency range of 50 GHz. It is seen from figure 5 that the available gain reaches a peak of 29.7 dB at 1 GHz and the unilateral gain (GU) varies from 56 dB at 1 GHz to 24 dB at 50 GHz.

Fig. 6 S21 and S12 with respect to frequency

The two-port network is connected to load impedance ZL and source impedance ZS, and is characterized by a scattering matrix [S]. The S parameters S21 and S12 are the forward voltage gain and reverse voltage gain respectively, and are shown in figure 6. As per the values of the lumped elements of the small signal model, the forward gain of the device is measured to be 22.94 dB at 1 GHz, and then it decreases with frequency, whereas the reverse gain takes negative values. By taking suitable values of the lumped elements of the small signal circuit, the forward gain can be increased to the desired value.

Fig. 7 Noise Figure (NF) and NFMin of HEMT

The microwave noise figure (NF) and NFMin are shown in figure 7. It is seen from this figure that the NF increases with the frequency of operation and is at its minimum up to 5.5 GHz, whereas NFMin is negligibly small over the span of frequency from 1 to 50 GHz. The NF can be optimized to the required value by tuning the values of the lumped elements of the small signal circuit.

3.2 HEMT as an Oscillator
In spite of the great progress in performance achieved during the last few years, there are still several important issues that need to be overcome to further increase the performance of GaN HEMTs at millimeter-wave frequencies (30-300 GHz). One of the key challenges in achieving high-gain millimeter-wave power amplification is to increase the maximum power-gain cutoff frequency (fmax), which is the maximum frequency at which the transistor still provides a power gain and can be expressed as [6]-[11]

$$ f_{max} = \frac{f_T}{2\sqrt{(R_i + R_s + R_g)/R_{ds} + 2\pi f_T R_g C_{gd}}} \qquad (4) $$

where $f_T$ is the current-gain cutoff frequency and $C_{gd}$ is the gate-drain (depletion region) capacitance, while $R_i$, $R_s$, $R_g$ and $R_{ds}$ represent the gate-charging, source, gate and output resistance, respectively. To maximize $f_{max}$, each parameter needs to be carefully optimized. In FETs the short-channel effects play an important role in the high frequency characteristics [6]. Gate-recess technology can suppress the short-channel effects, which leads to an improvement of the high frequency characteristics.
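Equation (4) can be evaluated numerically to see how each parasitic affects fmax. The sketch below uses the form fmax = fT / (2·sqrt((Ri + Rs + Rg)/Rds + 2π·fT·Rg·Cgd)); all element values are placeholders chosen only for illustration, not values extracted from the device in this paper.

```python
import math

def f_max(f_t, r_i, r_s, r_g, r_ds, c_gd):
    """Maximum power-gain cutoff frequency from small-signal elements (SI units)."""
    loss = (r_i + r_s + r_g) / r_ds + 2 * math.pi * f_t * r_g * c_gd
    return f_t / (2 * math.sqrt(loss))

# Placeholder small-signal values (hypothetical).
params = dict(f_t=70e9, r_i=0.5, r_s=0.6, r_g=0.4, r_ds=150.0, c_gd=20e-15)

base = f_max(**params)
halved_rg = f_max(**{**params, "r_g": params["r_g"] / 2})
print(f"f_max          = {base / 1e9:.1f} GHz")
print(f"f_max (Rg / 2) = {halved_rg / 1e9:.1f} GHz")
```

Halving the gate resistance raises the result because Rg appears in both terms under the square root, which is one reason gate geometry is a common optimization target.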


The general figure of merit for comparing microwave circuits, given in equation 5, is the cut-off frequency (FC), which is defined by the on resistance (Ron) and off-state capacitance (Coff) of the device [10]-[11].

$$ F_C = \frac{1}{2\pi R_{on} C_{off}} \qquad (5) $$

The on resistance of the HEMT is governed by the total source-drain resistances at microwave frequencies for voltages higher than threshold. Below threshold voltage the 2DEG is suppressed under the gate and the resistance increases dramatically.
The general channel resistance $R_{DS}$ is composed of several resistance components and may be written as

$$ R_{DS} = R_g + R_{sg} + R_{dg} \qquad (6) $$

where $R_g$ is the interface (or channel) resistance under the gate, and $R_{sg}$ and $R_{dg}$ are the source-gate and drain-gate channel resistances respectively. The contribution of $R_{sg}$ and $R_{dg}$ to the total on-state resistance Ron depends on the gate-drain and gate-source electrode spacing. This spacing governs the breakdown voltage, with wider spacing yielding higher breakdown voltages. The resistances making up $R_{DS}$ are governed by the 2DEG that is induced at the heterointerface. Below the threshold voltage, the 2DEG carrier density goes to zero and $R_{DS}$ approaches its maximum Rmax due to carriers in the GaN material.
Since the 2DEG governs the resistance in the conductive channel, the resistance of each element may be estimated as [10]

$$ R_i = R_s \frac{L_i}{W} \qquad (7) $$

where $R_s$ is the sheet resistance of the interface channel, W is the gate width of the HEMT and $L_i$ is the approximate geometrical length. The value of the sheet resistance is dependent on the density of the 2DEG and the mobility of the carriers in the channel and can be written as [11]

$$ R_s = \frac{1}{q\, n_s\, \mu_n} \qquad (8) $$

where q is the single charge, $\mu_n$ is the low-field mobility of the 2DEG and $n_s$ is the 2DEG density.
In estimating the resistance directly under the gate, $R_g$, the 2DEG is assumed to be under the influence of the gate voltage, making $n_s$ a function of the gate voltage $V_g$ [11]. The resistance elements $R_{sg}$ and $R_{dg}$ are assumed not to be controlled by the applied gate voltage, and thus $n_s$ is not a function of $V_g$ in the source-gate and drain-gate regions.
The capacitance model includes both voltage-dependent and parasitic capacitances. The voltage-dependent capacitances used in modeling the GaN HEMT are the source-gate and drain-gate capacitances $C_g$ and the capacitances between the gate and the inner side of the source and drain electrodes, $C_{ig}$. The total capacitance $C_{DS}$ can be written as [11]

$$ C_{DS} = C_g + C_{ig} + C_{par} \qquad (9) $$

where $C_{par}$ is the total parasitic capacitance.

4. CONCLUSION
The small signal model of the HEMT is designed for two-port network analysis using Microwave Office, and its corresponding GaN-based HEMT is simulated using a TCAD tool. Various microwave parameters such as MSG, MUG, MAG, NF and NFMin are discussed. The amplifier and oscillator behavior of the HEMT is also discussed.

ACKNOWLEDGEMENT
The authors acknowledge the DST-FIST and DST-SERC fund received by National Institute of Science and Technology from the Department of Science & Technology (DST), Government of India.

REFERENCES
1. David F. Brown et al: N-Polar InAlN/AlN/GaN MIS-HEMTs, IEEE Electron Device Letters, Vol. 31, No. 8, Aug, 2010.
2. T R Lenka and A. K. Panda, “Role of Nanoscale AlN and InN for the Microwave Characteristics of AlGaN/(Al, In)N/GaN-based HEMT,” accepted for publication in “Fizika i Tehnika Poluprovodnikov”/ Semiconductors (Springer) (2011).


3. Haifeng Sun et al: 205-GHz (Al, In)N/GaN HEMTs, IEEE Electron Device Letters, Vol. 31, No. 9, Sept, 2010.
4. T R Lenka and A. K. Panda, “Characteristics Study of
2DEG Transport Properties of AlGaN/GaN and
AlGaAs/GaAs-based HEMT,” “Fizika i Tehnika
Poluprovodnikov”/ Semiconductors (Springer), Vol.
45, No 5, 2011, pp.660-665.
5. Haifeng Sun et al: 102 GHz AlInN/GaN HEMTs on
Silicon With 2.5-W/mm Output Power at 10GHz,
IEEE Electron Device Letters, Vol. 30, No.8, Aug,
2009.
6. Jinwook W. Chung et al: AlGaN/GaN HEMT with 300-GHz fmax, IEEE Electron Device Letters, Vol. 31, No. 3, Mar 2010.
7. T R Lenka and A. K. Panda, “Self-consistent
Subband Calculations of AlxGa1-xN/(AlN)/GaN-based
High Electron Mobility Transistor,” Advanced
Materials Research, Vol. 159, pp 342-347, 2011.
8. T R Lenka and A. K. Panda, “Effect of Nanoscale
AlN layer for improving 2DEG Transport properties
in AlGaN/AlN/GaN-based HEMT,” International
Journal of Pure and Applied Physics (IJPAP), Vol. 6,
No.4, pp.419-427, 2010.
9. Microwave Office Manuals.
10. Kelson D. Chabak et al: Strained AlInN/GaN HEMTs
on SiC with 2.1-A/mm Output Current and 104GHz
Cutoff Frequency, IEEE Electron Device Letters,
Vol. 31, No.6, June, 2010.
11. Nikolai V. Drozdovski et al: GaN-Based High
Electron-Mobility Transistors for Microwave and RF
Control Applications, IEEE Trans on Microwave
Theory and Techniques, Vol. 50, No.1, Jan, 2002.


Ternary logic in digital communication for high speed and performance

Email: vinay.ic01@gmail.com,vinay.ic1@rediffmail.com

Abstract - This paper, “Digital Transceiver using Advance Ternary Technique”, describes a digital transmitter and receiver built around a ternary line code. In this scheme each byte of computer data is converted into base-3 data elements. Line codes are widely used in data transmission networks and in information recording and storage systems; applications include local and wide area networks, both wireless and wired. A coding technique named advanced ternary line code can be derived from three popular line codes: NRZ-L, NRZ and polar RZ. In this scheme six ternary signal elements represent each group of eight binary data bits.

I INTRODUCTION

This scheme focuses on electric signal and data processing. Implementing it improves the means of encoding a binary data word as a ternary codeword; at decoding time the ternary codeword is converted back to recover the binary data word. The main advantage of the scheme is that it maintains DC balance during ternary data word transmission. Another advantage is that ternary coding carries more data per digit than binary coding: six binary bits can represent 64 different values (0-63), whereas six ternary digits can represent 365 non-negative values (0-364, written 000000 to 111111), or 3^6 = 729 values in total when negative numbers are included.

Line coding is the process of converting digital data to digital signals. We assume that data, in the form of text, numbers, graphical images, audio or video, are stored in computer memory as sequences of bits. Line coding converts a sequence of bits to a digital signal. At the sender, digital data are encoded into a digital signal; at the receiver, the digital data are recreated by decoding the digital signal [1].

Fig. 1: Digital data to digital signal encoding

Line codes for data transmission fall into three categories. The first type is binary in nature. The second type comprises ternary codes, which operate on three signal levels (+, 0 and -). The third type comprises multilevel codes, which have more than three output levels. The encoder and decoder circuits can be simulated and implemented using simple combinational logic circuits.


Figure 2: Unipolar NRZ, Polar NRZ, Unipolar RZ

II TERNARY REPRESENTATION

(a) DECIMAL TO TERNARY CONVERSION

S.NO.  Decimal   Ternary
 1.       0       0  0  0  0
 2.       1       0  0  0  1
 3.       2       0  0  1 -1
 4.       3       0  0  1  0
 5.       4       0  0  1  1
 6.       5       0  1 -1 -1
 7.       6       0  1 -1  0
 8.       7       0  1 -1  1
 9.       8       0  1  0 -1
10.       9       0  1  0  0
11.      10       0  1  0  1
12.      11       0  1  1 -1
13.      12       0  1  1  0
14.      13       0  1  1  1
15.      14       1 -1 -1 -1
16.      15       1 -1 -1  0
17.      16       1 -1 -1  1
18.      17       1 -1  0 -1
19.      18       1 -1  0  0
20.      19       1 -1  0  1
21.      20       1 -1  1 -1
22.      21       1 -1  1  0
23.      22       1 -1  1  1
24.      23       1  0 -1 -1
25.      24       1  0 -1  0
26.      25       1  0 -1  1
27.      26       1  0  0 -1
28.      27       1  0  0  0
29.      28       1  0  0  1
30.      29       1  0  1 -1
31.      30       1  0  1  0
32.      31       1  0  1  1
33.      32       1  1 -1 -1
34.      33       1  1 -1  0
35.      34       1  1 -1  1
36.      35       1  1  0 -1
37.      36       1  1  0  0
38.      37       1  1  0  1
39.      38       1  1  1 -1
40.      39       1  1  1  0
41.      40       1  1  1  1
Table 1: Decimal - Ternary

(b) DECIMAL TO TERNARY CONVERSION
The decimal (base 10) numeral system has ten possible values (0, 1, 2, 3, 4, 5, 6, 7, 8 or 9) for each place value. In contrast, the ternary (base 3) numeral system has three possible values, often represented as -1, 0 or 1, for each place value. Like decimal-to-binary conversion, the conversion takes the following steps:

Algorithm:

Step-1: Write the decimal number.
Step-2: Divide the decimal value by three (3); write the quotient and the remainder.
Step-3: If the remainder is 2, increase the quotient by one and decrease the remainder by 3 (so that it becomes -1).
Step-4: Repeat step 2 on the quotient; keep repeating until the quotient becomes zero.
Step-5: Write all remainder digits in reverse order (last remainder first) to form the final result.

Example: (25)10 = (X)3

25 / 3 = 8, remainder 1
 8 / 3 = 2, remainder 2, so the quotient becomes 3 and the remainder -1
 3 / 3 = 1, remainder 0
 1 / 3 = 0, remainder 1

(25)10 = (1 0 -1 1)3

(c) TERNARY TO DECIMAL CONVERSION

(1 0 -1 1)3 = (X)10

1*3^3 + 0*3^2 + (-1)*3^1 + 1*3^0
27 + 0 + (-3) + 1 = 25
(1 0 -1 1)3 = (25)10

(d) TERNARY ADDITION
Ternary addition can be performed by the following rules:

  A    B    C   Carry  Sum
  0    0          0     0
  0    1          0     1
  0   -1          0    -1
  1    0          0     1
  1    1          1    -1
  1   -1          0     0
 -1    0          0    -1
 -1    1          0     0
 -1   -1         -1     1
 -1   -1   -1    -1     0
  1    1    1     1     0
Table 2: Rules of Ternary addition

Example: [Figure: worked example of ternary addition]

(e) TERNARY SUBTRACTION

If negative numbers are considered, changing all +1's to -1's and vice versa, leaving all zeroes unchanged, gives the negative of the corresponding number. Hence addition and subtraction may be performed with the same hardware in the balanced ternary system by changing the sign of the addend or subtrahend, respectively.

(i) A - B = X
(ii) X = A + B', where B' is B with every +1 changed to -1 and vice versa

[Figure: worked example of ternary subtraction]

Here there is no need to convert the negative magnitude separately: a negative number such as (-28) can be represented directly as (0 0 -1 0 0 -1).

(f) TERNARY MULTIPLICATION

Ternary multiplication can be performed in a way similar to binary multiplication. The following basic rules are applied:

S. No.   A    B   AxB
1        0    0    0
2        0    1    0
3        0   -1    0
4        1    0    0
5        1    1    1
6        1   -1   -1
7       -1    0    0
8       -1    1   -1
9       -1   -1    1
Table 3: Rules for Ternary Multiplication

Example:
(i) (37)10 x (4)10 = (148)10

(1 1 0 1)3 * (0 0 1 1)3 = (X)3

           1  1  0  1
  X        0  0  1  1
---------------------
           1  1  0  1
        1  1  0  1  x
     0  0  0  0  x  x
  0  0  0  0  x  x  x
---------------------
  0  1 -1 -1  1  1  1
---------------------
(0 1 -1 -1 1 1 1)3 = (148)10

(ii) (14)10 x (15)10 = (210)10

(1 -1 -1 -1)3 * (1 -1 -1 0)3 = (X)3

           1 -1 -1 -1
  X        1 -1 -1  0
---------------------
           0  0  0  0
       -1  1  1  1  x
    -1  1  1  1  x  x
  1 -1 -1 -1  x  x  x
---------------------
  0  1  0 -1 -1  1  0
---------------------
(0 1 0 -1 -1 1 0)3 = (210)10
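
The conversion, negation, addition and multiplication rules of Section II can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, digits are stored most significant first as the integers -1, 0 and 1, and the carry handling simply applies the sum and carry rules of Table 2 one digit at a time.

```python
def to_balanced_ternary(n):
    """Convert an integer to balanced-ternary digits (-1, 0, 1), most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        n, r = divmod(n, 3)   # Step 2: quotient and remainder
        if r == 2:            # Step 3: remainder 2 becomes -1, quotient grows by one
            r = -1
            n += 1
        digits.append(r)
    return digits[::-1]       # Step 5: last remainder first


def from_balanced_ternary(digits):
    """Evaluate the positional sum of digit * 3**position."""
    value = 0
    for d in digits:
        value = value * 3 + d
    return value


def negate(digits):
    """Swap +1 and -1, leave zeroes unchanged (the subtraction rule of (e))."""
    return [-d for d in digits]


def add(a, b):
    """Digit-wise addition with a carry of -1, 0 or +1, as in Table 2."""
    width = max(len(a), len(b))
    a = [0] * (width - len(a)) + list(a)
    b = [0] * (width - len(b)) + list(b)
    result, carry = [], 0
    for x, y in zip(reversed(a), reversed(b)):
        s = x + y + carry
        carry = 0
        if s > 1:             # e.g. 1 + 1 = 2  -> sum -1, carry 1
            s, carry = s - 3, 1
        elif s < -1:          # e.g. -1 + -1 = -2 -> sum 1, carry -1
            s, carry = s + 3, -1
        result.append(s)
    result.append(carry)      # may leave a leading zero
    return result[::-1]


def multiply(a, b):
    """Shift-and-add multiplication using the digit products of Table 3."""
    acc = [0]
    for shift, d in enumerate(reversed(b)):
        partial = [d * x for x in a] + [0] * shift
        acc = add(acc, partial)
    return acc
```

For instance, `to_balanced_ternary(25)` gives `[1, 0, -1, 1]`, and subtraction A - B is `add(A, negate(B))`, which mirrors the point made in (e) that one adder serves both operations.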
III PRINCIPLES OF TERNARY DATA PATTERNS ENCODING

The method for transmitting 8-bit binary data as a 6-ternary (six ternary digit) code comprises an encoder and a decoder. Each 8-bit binary data word has a unique 6-ternary codeword that is optimized for communication. Ternary data patterns are encoded as patterns of three signal levels [3], represented as -, 0 and +. For example, the 8-bit binary pattern 10100011 is converted to (163)10, which is encoded as the 6-ternary word (1 -1 0 0 0 1)3. This data pattern is then encoded as the signal pattern (+ - 0 0 0 +).

Figure 3: Binary Data Communication

Figure 4: Ternary Data Communication

The logic circuitry of this method is optimized to accomplish the translation using a small number of combinational logic gates. Implementing ternary communication increases speed and performance over 8-bit data word communication and also decreases the size of the encoder.

IV PRINCIPLES OF TERNARY DATA PATTERNS DECODING

The principle of the decoding system is simple: it is the reverse of the encoding process. The decoding system receives the 6-ternary pattern, and the decoder circuit converts it into the 8-bit binary pattern, a format the receiver can understand.
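
The encoding and decoding principle described above can be illustrated with a short Python sketch. It assumes that each byte is mapped to its six-digit balanced-ternary representation, as in the 10100011 example; this is not the table-driven 8B6T encoder of reference [8], whose mapping is not given in the paper. The function names and the use of the characters '+', '0' and '-' for the three signal levels are choices made here for illustration.

```python
LEVELS = {1: "+", 0: "0", -1: "-"}                 # trit -> signal level
TRITS = {symbol: trit for trit, symbol in LEVELS.items()}


def encode_byte(value):
    """Encode an 8-bit value (0-255) as a six-symbol ternary signal pattern."""
    if not 0 <= value <= 255:
        raise ValueError("expected an 8-bit value")
    trits = []
    for _ in range(6):                              # six ternary digits per byte
        value, r = divmod(value, 3)
        if r == 2:                                  # remainder 2 becomes -1 with a carry
            r = -1
            value += 1
        trits.append(r)
    return "".join(LEVELS[t] for t in reversed(trits))


def decode_pattern(pattern):
    """Decode a six-symbol ternary signal pattern back to the 8-bit value."""
    if len(pattern) != 6:
        raise ValueError("expected six ternary symbols")
    value = 0
    for symbol in pattern:
        value = value * 3 + TRITS[symbol]
    if not 0 <= value <= 255:
        raise ValueError("pattern does not correspond to a byte")
    return value


if __name__ == "__main__":
    signal = encode_byte(0b10100011)                # the example byte, (163)10
    print(signal, decode_pattern(signal))
```

Because six balanced-ternary digits cover 0 to 364, every byte has a codeword and decoding is a plain positional sum. A direct conversion like this does not by itself guarantee DC balance; a production line code would need a codeword mapping chosen for that purpose.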

V ADVANTAGES OF TERNARY CODES

The concept of 6-ternary data communication between two devices makes the system fast and high-performing and also reduces the size of the overall circuitry. This concept relates generally to electric signal and data processing [5]. The encoder converts the 8-bit binary data word into a 6-ternary data word, and the decoder converts the 6-ternary data word back into an 8-bit binary data word. Ternary data transmission can be used in a high-speed network, and it maintains DC balance in transmission. Binary-to-ternary data word conversion is also beneficial for placing data on an electromagnetic channel.

Ternary data communication increases the data-carrying capacity and can be used to increase the speed of data transmission. In future, it could also increase the data capacity of storage media [8].

VI CONCLUSION

In this paper we have discussed ternary logic in digital communication as a means of providing high speed and performance. We have discussed the encoding of a binary data word as a ternary data word to shorten the data word transmission time and correspondingly speed up communication compared with the binary logic generally used in digital communication. For the implementation of this scheme, we have discussed the main algorithms and the various conversion methods that are useful for understanding this logic. We have also tried to shed light on the principle of ternary data pattern encoding, which underlies ternary data communication.

VII REFERENCES

[1] Glass, A., Ali, B. and Bastaki, E., “Design and modeling of H-Ternary line encoder for digital data transmission”, International Conference on Info-Tech & Info-Net, Beijing, China, 2001, pp. 503-507.
[2] A. Mahadevan, “Digital Transceiver using H Ternary Line Coding Technique”, Proceedings of the World Congress on Engineering 2007, Vol. I.
[3] A. Srivastava and K. Venkatapathy, “Design and Implementation of a Low Power Ternary Full Adder”, 1996, OPA (Overseas Publishers Association) Amsterdam B.V., published in The Netherlands under license by Gordon and Breach Science Publishers SA.
[4] Abdullatif Glass, Bahman Ali and Nidhal Abdulaziz, “H-Ternary Line Decoder for Digital Data Transmission: Circuit Design and Modelling”.
[5] Bylanski, P. and Ingram, D., “Digital transmission systems,” Peter Peregrinus, 1976, pp. 216-246.
[6] Lathi, P., “Modern digital and analog communication systems (3rd Ed),” Oxford University Press, 1998, pp. 294-353.
[7] Takasaki, Y., “Digital transmission design and jitter analysis,” Artech House, 1991, pp. 35-60.
[8] Sandeep Patel, Howard W. Johnson, “Methods and apparatus for implementing a type 8B6T encoder and decoder”, 1996, patent no. 5,525,983.

“Embedded Implementation of Space Vector PWM using FPGA”
Ashish Gupta
Assistant Professor
Department of Electronics Engineering,
MPEC, Kanpur
ashish3179@rediffmail.com

Abstract - This paper introduces the working principle of space vector pulse width modulation (SVPWM) and presents a new circuit realization of an SVPWM generator based on a flexible, fast and cost-effective field programmable gate array (FPGA) embedded technique. Control of machines using vector control techniques is becoming increasingly popular. The need for extensive computation is no longer an objection to vector control implementation, owing to the wide availability of high-speed digital processors. The method of decoupling the variables and controlling them independently is known as vector control. To relieve the controller of the time-consuming computational task of PWM signal generation, a new method of space vector PWM signal generation is implemented in an FPGA using the hardware description language VHDL. The space vector PWM pulses are first designed in the MATLAB/SIMULINK environment, where the relevant code is written to generate the pulses; a software conversion tool then converts the M-files into VHDL code. The triggering pulses are then applied to the inverter circuit, and the resulting switching pattern reduces the harmonic content and switching losses.

Keywords: FPGA (Field Programmable Gate Array), SVM, Space Vector PWM, VHDL, Induction motor drive

1 Introduction

The pulse width modulation (PWM) technique called “vector modulation”, which is based on space vector theory, is the most important development of the last few years [1]. Although several PWM methods have been created in the past, the vector modulation technique appears to be the best alternative. FPGA development has reached a level of maturity that makes FPGAs a good choice for implementation in many fields [2]. An FPGA-based embedded implementation of SVPWM combines the computing power of a processor with the logic processing power of a hardware circuit, so that both the processing efficiency of the CPU and the utilization of the logic units can be improved. Figure 1 shows an SVPWM control system based on the FPGA-embedded technique.

Figure 1: SVPWM control system based on FPGA-embedded technique

Recent applications of FPGAs in industrial electronics include mobile-robot path planning and intelligent transportation [3], current control applied to power converters, real-time hardware-in-the-loop testing for control design, controller implementation, separation and recovery of independent source signals, and neural computation. Since the concept of the multilevel PWM converter was introduced, various modulation strategies have been developed and studied in detail, such as multilevel sinusoidal PWM, multilevel selective harmonic elimination and space vector modulation. Among these strategies, space vector PWM (SVPWM) [4] stands out because it offers significant flexibility to optimize switching waveforms and is well suited for digital implementation. The complexity and computational cost of traditional SVPWM techniques increase with the number of levels of the converter, and most of them use trigonometric functions or pre-computed tables. A symmetrical space vector modulation PWM pattern is proposed
in this paper; it shows the advantage of lower THD without increasing the switching losses. This paper thus demonstrates that a more efficient and faster solution is the use of field programmable gate arrays (FPGAs), and it investigates how to generate a variable PWM waveform on a Xilinx FPGA [5]. The rest of the paper is organized as follows. Section 2 introduces the principle of the symmetrical space vector PWM method. Section 3 gives details on the FPGA. Section 4 shows the m-file coding/Simulink blocks required to generate space vector pulses. Section 5 explains the experimental results and Section 6 is the conclusion.

2. Principle of Space Vector PWM

In vector coordinates, the combinations of three-phase inverter output voltages form eight space vectors, shown in Figure 2. There are six nonzero space vectors forming an origin-centered hexagon, and two zero space vectors (V0 and V7) located at the origin. The hexagon is the maximum boundary of the space vector, and the circle is the maximum trajectory of the regular sinusoidal outputs in linear modulation. The figure also shows the PWM output patterns in the six regions (denoted sectors I–VI) separately. In accordance with the three-phase to two-phase transformation, the three-phase inputs (Va, Vb, Vc) are transformed into (Vα, Vβ) as the reference vector.

Figure 2: Basic Eight Switching Vector and Vector Representing of Sector 1.

As shown in Figure 3, there are eight possible combinations of on and off patterns for the three upper power switches. The on and off states of the lower power devices are opposite to those of the upper ones and so are easily determined once the states of the upper power transistors are known. According to the above equations, the eight switching vectors, the output line-to-neutral (phase) voltages and the output line-to-line voltages, in terms of the DC-link voltage Vdc, are given in Table 1, which lists the eight inverter voltage vectors (V0 to V7).

Figure 3: Circuit model of PWM inverter with center-tapped grounded DC bus.

Voltage    Switching vectors    Line-to-neutral voltage    Line-to-line voltage
vector       a    b    c          Van     Vbn     Vcn         Vab    Vbc    Vca
V0           0    0    0           0       0       0           0      0      0
V1           1    0    0          2/3    -1/3    -1/3          1      0     -1
V2           1    1    0          1/3     1/3    -2/3          0      1     -1
V3           0    1    0         -1/3     2/3    -1/3         -1      1      0
V4           0    1    1         -2/3     1/3     1/3         -1      0      1
V5           0    0    1         -1/3    -1/3     2/3          0     -1      1
V6           1    0    1          1/3    -2/3     1/3          1     -1      0
V7           1    1    1           0       0       0           0      0      0

Table 1: Details of different phase and line voltages for the eight states.

3. Field Programmable Gate Array

A field-programmable gate array (FPGA) is a silicon chip containing an array of configurable logic blocks (CLBs). Unlike an application-specific integrated circuit (ASIC), which performs a single specific function for the lifetime of the chip, an FPGA can be reprogrammed to perform a different function in a matter of microseconds. The design used Xilinx development tools and is realized in a single FPGA chip with no external memory. The benefits of this design are as follows:

• The whole system is implemented in a single chip; consequently the circuit is very compact.
• FPGA-based systems are more reliable because they do not need any control software.
• Faster design and verification time; design changes without penalty.

In this paper, FPGA programming with a hardware description language is used to generate the space vector modulation for the inverter circuit. The point to remember is that, instead of writing VHDL code directly, the M-file code is first written to generate the SVPWM pulses, and VHDL code is then generated with the software converter. Hence the work requires less time. The MATLAB/SIMULINK environment is familiar to a large number of software programmers, and since m-file coding is common to most of them, it is easier to work in this software. A very attractive high-level design/simulation tool for FPGAs is provided by Xilinx. It is a very flexible design tool, which allows testing of a high-level structural description of the design and makes quick changes and corrections possible. The circuit description structure is very similar to the way the design could be implemented later. A mapping tool that converts such a structure into VHDL code therefore saves the designer time that would otherwise be spent rewriting the same structure in VHDL, and probably making mistakes that need debugging.

4. Simulation Steps:

(1) Initialize system parameters in MATLAB/SIMULINK.

(2) Perform M-file coding to
    (i) Determine the sector.
    (ii) Determine the time durations T1, T2, T0.
    (iii) Determine the switching time (Ta, Tb and Tc) of each transistor (S1 to S6).
    (iv) Generate the inverter output voltages (VAB, VBC, VCA).
    (v) Generate VHDL code through the software conversion tool.
    (vi) Burn the program into the FPGA kit.

(3) View the SVPWM waveform in XILINX.

4.1 Simulink Model to generate Space Vector PWM

Figure 4.1: Simulink Model for Overall System

Figure 4.2: Subsystem Simulink Model for “Space Vector PWM Generator”
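
Steps (i), (ii) and (iv) of the simulation procedure, together with the entries of Table 1, can be sketched in Python. The paper does not list its transformation or dwell-time equations, so this sketch assumes the textbook amplitude-invariant three-phase to two-phase transformation and the standard dwell-time formulas T1 = √3·Ts·|Vref|/Vdc·sin(π/3 − θ), T2 = √3·Ts·|Vref|/Vdc·sin θ and T0 = Ts − T1 − T2, where θ is the angle of the reference vector inside its sector. All function names are hypothetical, and this is not the authors' M-file code.

```python
import math
from fractions import Fraction


def inverter_voltages(a, b, c):
    """Phase and line voltages, as multiples of Vdc, for upper-switch states a, b, c (0 or 1)."""
    van = Fraction(2 * a - b - c, 3)
    vbn = Fraction(2 * b - c - a, 3)
    vcn = Fraction(2 * c - a - b, 3)
    return van, vbn, vcn, a - b, b - c, c - a


def three_to_two_phase(va, vb, vc):
    """Assumed amplitude-invariant transformation of (Va, Vb, Vc) into (V_alpha, V_beta)."""
    v_alpha = (2.0 * va - vb - vc) / 3.0
    v_beta = (vb - vc) / math.sqrt(3.0)
    return v_alpha, v_beta


def sector_and_angle(v_alpha, v_beta):
    """Return the sector (1 to 6) and the angle of the reference vector inside it."""
    angle = math.atan2(v_beta, v_alpha) % (2.0 * math.pi)
    sector = min(int(angle // (math.pi / 3.0)) + 1, 6)   # guard against rounding at 2*pi
    return sector, angle - (sector - 1) * math.pi / 3.0


def dwell_times(v_alpha, v_beta, vdc, ts):
    """Return sector, T1, T2 and T0 for one switching period ts (textbook formulas)."""
    sector, theta = sector_and_angle(v_alpha, v_beta)
    k = math.sqrt(3.0) * ts * math.hypot(v_alpha, v_beta) / vdc
    t1 = k * math.sin(math.pi / 3.0 - theta)   # time on the first active vector
    t2 = k * math.sin(theta)                   # time on the second active vector
    return sector, t1, t2, ts - t1 - t2        # remainder is shared by V0 and V7


if __name__ == "__main__":
    # Parameters taken from Table 2: Vdc = 600 V, 10 kHz switching (Ts = 100 us).
    print(inverter_voltages(1, 0, 0))
    print(dwell_times(200.0, 100.0, 600.0, 1e-4))
```

With these assumptions, the state (1, 0, 0) reproduces row V1 of Table 1, and a reference vector lying exactly on V1 with magnitude 2/3·Vdc spends the whole period on that vector (T1 = Ts, T2 = T0 = 0). Step (iii), the mapping of T1, T2 and T0 onto the per-switch times Ta, Tb and Tc, depends on the chosen switching sequence and is left out.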
Figure 4.3: Subsystem Simulink Model for “Making Switching Time”

5. Results and Discussions

The control scheme is simple in architecture and thus facilitates the realization of the developed SVPWM controller using an FPGA-based circuit design approach. The designed SVPWM control IC has been realized using a single FPGA. The simulation results of the internal modules and the final output of the space vector PWM switching pattern have been obtained at a fundamental frequency of 50 Hz. Such wide frequency control with very high switching frequency is only possible by utilizing the state-of-the-art VLSI digital circuit design approach. The results show that the generated switching pattern reduces the harmonic content and switching losses. A comparison between SPWM and SVPWM for varying modulation index is given in Table 2: at every modulation index, SVPWM gives a higher peak output line voltage and a lower THD than SPWM, which shows the advantage of controlling the drive by the SVPWM technique. Figure 5 shows the locus comparison of maximum linear control voltage in sine PWM and SVPWM. Figures 6, 7 and 8 show the axis converter, the delay time and the output of each inverter, respectively. Figures 9 and 10 show the simulation results of Van, Vab and Vac and the simulation results of the pulse patterns.

Technique          SPWM                        SVPWM
M. I.    Output line      THD        Output line      THD
(M)      voltage          (%)        voltage          (%)
         (peak V)                    (peak V)
0.4      180.80          162.11      192.70          154.07
0.5      266.50          123.35      312.20          108.78
0.6      289.40          117.12      318.10          105.69
0.7      369.20           94.52      436.60           81.19
0.8      396.10           89.73      442.90           78.56
0.9      472.90           70.69      552.30           53.62
1.0      502.40           64.83      567.90           49.15
Parameters used: fundamental frequency 50 Hz, switching frequency 10 kHz, DC voltage 600 V
Table 2: Comparison between SPWM and SVPWM for varying modulation index.

Figure 5: Locus comparison of maximum linear control voltage in Sine PWM and SVPWM.

Fig 6: Three to Two axis converter. (Va, Vb, Vc) are transformed into (Vα, Vβ)
Fig 7: Delay time

Fig 8: Output of each inverter

Fig 9: Simulation results of Van, Vab and Vac

Fig 10: Simulation results of pulse patterns

6. Conclusion
In this paper, a theoretical study of the SVPWM control strategy for an FPGA-based voltage inverter is presented. On the one hand, it aims to show the effectiveness of SVPWM in reducing switching power losses; SVPWM is among the best solutions for achieving good voltage transfer and reduced harmonic distortion at the output of an inverter. On the other hand, since field programmable gate arrays (FPGAs) have advantages over microprocessor and DSP control, this modulation technique is implemented in an FPGA by first generating an m-file in the Matlab-Simulink environment. The FPGA coding makes it easier to design the vector modulation pattern generator using a field programmable gate array. Moreover, the MATLAB/SIMULINK environment is familiar to a large number of software programmers, and since m-file coding is common to most of them, it is easier for individuals to work in this software. The generated switching pattern reduces the harmonic content, provides efficient and flexible control and reduces the total size of the system. This SVPWM IC can be used as a modulator for high-performance AC drives and power conditioning equipment.

References
[1] Ying-yu Tzou, Hau-Jean Hsu and Tien-Sung Kuo, “FPGA based SVPWM control IC for 3-phase PWM inverters”, Proceedings of the 1996 IEEE IECON 22nd International Conference on Industrial Electronics, Control, and Instrumentation, Volume 1, 5-10 Aug 1996, pp. 138-143.

[2] J.J. Rodriguez-Andina, M.J. Moure, and M.D. Valdes, “Features, design tools, and application domains of FPGAs”, IEEE Trans. Ind. Electron., vol. 54, no. 4, pp. 1810-1823, Aug. 2007.

[3] K. Sridharan and T. Priya, “The design of a hardware accelerator for realtime complete visibility graph construction and efficient FPGA implementation,” IEEE Trans. Ind. Electron., vol. 52, no. 4, pp. 1185-1187, Aug. 2005.
[4] L. Franquelo, M. Prats, R. Portillo, J. Galvan, M. Perales, J. Carrasco, E. Diez, and J. Jimenez, “Three-dimensional space-vector modulation algorithm for four-leg multilevel converters using abc coordinates”, IEEE Trans. Ind. Electron., vol. 53, no. 2, pp. 459-466, Apr. 2006.

[5] Xilinx Inc., “Foundation Series ISE 3.11 User Guide”, 2000.
