Abstract— Analog VLSI design is an essential part of any electronic system because the real world is analog. This paper presents a low-power amplifier for a CCD array [1]. CCDs are used to capture images: modern digital cameras and high-resolution cameras are built around a CCD array, but the performance of the array depends on the performance of the on-chip amplifier placed at the end of the array. In this paper, single-stage and two-stage amplifiers are simulated, and results for power and bandwidth are presented for different transistor sizes. All results are verified using the Tanner tool (version 7.1) [11]. Several analyses in the literature aim to reduce power dissipation, but most of the proposed structures compromise on either area or bandwidth. Here we achieve lower power dissipation while maintaining a good bandwidth; the detailed results supporting this claim are presented in the results section.
Keywords: Gain, power dissipation, bandwidth, capacitance
INTRODUCTION
1. Full-Frame (FF)
2. Frame-Transfer (FT)
[...] parallel array, which acts as the image plane. The architecture is shown in Fig. 1.
FT CCDs are very much like FF architectures. The difference is that a separate and identical parallel register, called a storage array, is added, which is not light sensitive. The idea is to shift a captured scene very quickly from the photosensitive (image) array to the storage array [5]. Readout off chip from the storage register is then performed as described previously for the FF device, while the image array is integrating the next frame. The architecture is shown in Fig. 2.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
Both of the above architectures are widely used, but the performance of both depends on the type and quality of the on-chip (output) amplifier, which is fabricated at the last stage of the structure, as shown in the figures above.
ARCHITECTURE OF ON-CHIP AMPLIFIER
The output amplifier also has two types of architecture:
1. Single stage amplifier
2. Two stage amplifier
[Schematics of the single and two stage amplifiers. Labels recoverable from the figures: Mr, M1, M2, M3, Mc, VRD, VRG (reset gate pulse), VDD, VCS, FD (detection node), output; every transistor is drawn with W = 22u, L = 2u.]
Fig. 3 Single Stage CCD On-Chip amplifier
The single stage amplifier consists of source follower M1 and load transistor Mc for biasing. The reset FET is connected to the detection node, which consists of the floating diffusion [6, 7] and the gate of M1. In the ON state it resets the detection node to a reference voltage (VRD), and in the OFF state the floating diffusion can receive the next charge packet. The voltage source between the gate and source of the current sink transistor Mc determines the bias current of the first stage and can be used as a signal injection point to measure the ratio between the total capacitance and the effective sense capacitance, and the bandwidth in the off state.
The two stage amplifier further improves the characteristics of the amplifier and gives better results, which are shown in the results section of the paper. The two stage amplifier also improves the sensitivity of the amplifier and reduces the noise level of the overall CCD.
OPTIMIZATION
For optimization of the on-chip amplifier, the length and width of the individual transistors are varied and the various optimization results are obtained. The effect of increasing and decreasing the length and width of each transistor is given below.
To achieve maximum gain:
Transistor 'M1': The gain can be maximized by increasing the width of this transistor, as this increases the difference in the output voltage amplitude.
Transistor 'MC': The gain can be maximized by decreasing the width of this transistor, as this increases the difference in the output voltage amplitude.
To achieve maximum bandwidth:
Transistor 'M1': The bandwidth of the circuit can be increased by increasing the width of this transistor, as the increase in width increases the transconductance, which lowers the impedance and so increases the bandwidth.
Transistor 'MC': The bandwidth of the circuit can be increased by increasing the width of this transistor, as the increase in width increases the transconductance, which lowers the impedance and so increases the bandwidth.
Transistor 'M2': The bandwidth of the circuit can be increased by increasing the width of this transistor.
Transistor 'M3': The bandwidth of the circuit can be increased by increasing the width of this transistor, as the increase in width increases the transconductance, which lowers the impedance and so increases the bandwidth, although the change obtained is not that large.
Table 1: When the width of transistor M3 is varied
M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power Dissipation (mW)   Bandwidth BM (MHz)
15×25         12×10         20×10         10×25         5.9                      302
15×25         12×10         20×10         12×25         5.95                     320
15×25         12×10         20×10         15×25         6.0                      242
15×25         12×10         20×10         18×25         6.1                      207
Table 2: When the width of transistor M2 is varied
M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power Dissipation (mW)   Bandwidth BM (MHz)
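The sizing trends stated for M1 and Mc can be illustrated with a minimal first-order sketch. It assumes a simple square-law MOSFET model, neglects the body effect, treats Mc as an ideal current sink with a fixed overdrive voltage, and uses a single load capacitance. The function name and all parameter values (`kp`, `vov_c`, `lam`, `c_load`) are hypothetical and are not taken from the paper or from the Tanner simulations.

```python
import math


def source_follower(w1_um, wc_um, l_um=2.0, kp=100e-6, vov_c=0.3,
                    lam=0.05, c_load=1e-12):
    """Square-law estimate of source-follower gain and -3 dB bandwidth.

    w1_um, wc_um : widths of M1 and Mc in micrometres
    l_um         : common channel length in micrometres (assumed)
    kp           : process transconductance in A/V^2 (assumed)
    vov_c        : overdrive voltage of Mc set by VCS, in volts (assumed)
    lam          : channel-length modulation in 1/V (assumed)
    c_load       : load capacitance in farads (assumed)
    """
    i_bias = 0.5 * kp * (wc_um / l_um) * vov_c ** 2      # current set by Mc
    gm1 = math.sqrt(2.0 * kp * (w1_um / l_um) * i_bias)  # transconductance of M1
    g_out = 2.0 * lam * i_bias                           # gds of M1 plus gds of Mc
    gain = gm1 / (gm1 + g_out)                           # always below 1
    bandwidth_hz = (gm1 + g_out) / (2.0 * math.pi * c_load)
    return gain, bandwidth_hz


if __name__ == "__main__":
    for w1, wc in [(22, 22), (44, 22), (22, 11), (22, 44)]:
        gain, bw = source_follower(w1, wc)
        print(f"W1={w1:>2}u Wc={wc:>2}u gain={gain:.4f} bandwidth={bw / 1e6:.1f} MHz")
```

Under these assumptions the gain rises when M1 is widened or Mc is narrowed, and the bandwidth rises when either transistor is widened, which is the same direction as the trends listed above. The absolute numbers are not comparable with the tabulated simulation results.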
INTRODUCTION
The most commonly used amplifier configuration of MOSFETs is the common source amplifier. The common-source (CS) amplifier may be viewed as a transconductance amplifier or as a voltage amplifier. As a transconductance amplifier, the input voltage is seen as modulating the current going to the load. As a voltage amplifier, the input voltage modulates the amount of current flowing through the FET, changing the voltage across the output resistance accordingly.
This paper aims to develop the mathematical model of the common source amplifier. The floating admittance matrix of the FET is used to derive its voltage gain, input resistance and output resistance in the common source configuration.
The two stage common source FET amplifier can be represented as in Fig. 1.
Fig. 1 Two-stage Common Source Amplifier
The a.c. equivalent circuit of Fig. 1 is shown in Fig. 2.
[Fig. 2: a.c. equivalent circuit with nodes 1 to 4 and elements rs, RG1, RD1, RG2, RD2 and RF]
The matrix representation of the FET as a two-port network (four terminals) is written as
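\[
\begin{bmatrix} i_1 \\ i_2 \\ i_3 \end{bmatrix}
=
\begin{bmatrix} i_g \\ i_d \\ i_s \end{bmatrix}
=
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix},
\qquad v_1 = v_g,\ v_2 = v_d,\ v_3 = v_s
\tag{1}
\]
The admittance matrix of the FET as a device is expressed in (1). Its coefficient matrix, with rows and columns ordered as nodes 1, 2, 3, is expressed as
\[
Y =
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 \\
g_m & g_d & -(g_m + g_d) \\
-g_m & -g_d & g_m + g_d
\end{bmatrix}
\tag{2}
\]
The gate to source resistance of the FET is assumed to be very large (ideally infinite) as it is always reverse biased, hence $g_g = 0$ S. Then the coefficient matrix of the FET in (1) reduces to (2). Thus, the admittance matrices of the two FETs (device 1 and device 2) connected as in Fig. 2 can be written as follows, with device 1 on nodes 1, 2, 3 and device 2 on nodes 2, 4, 3:
\[
Y_{device1} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) \\
-g_{m1} & -g_{d1} & g_{m1} + g_{d1}
\end{bmatrix}
\quad \text{(rows and columns: nodes 1, 2, 3)}
\tag{3}
\]
\[
Y_{device2} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m2} & g_{d2} & -(g_{m2} + g_{d2}) \\
-g_{m2} & -g_{d2} & g_{m2} + g_{d2}
\end{bmatrix}
\quad \text{(rows and columns: nodes 2, 4, 3)}
\tag{4}
\]
Now the composite matrix of the two devices (device 1 and device 2), with rows and columns ordered as nodes 1, 2, 3, 4, is written as
\[
Y_{devices} =
\begin{bmatrix}
0 & 0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) & 0 \\
-g_{m1} & -(g_{d1} + g_{m2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} & -g_{d2} \\
0 & g_{m2} & -(g_{m2} + g_{d2}) & g_{d2}
\end{bmatrix}
\tag{5}
\]
The overall admittance matrix for Fig. 2 is written as
\[
Y =
\begin{bmatrix}
g_s + G_{G1} + G_F & 0 & -(g_s + G_{G1}) & -G_F \\
g_{m1} & g_{d1} + G_{D1} + G_{G2} & -(g_{m1} + g_{d1} + G_{D1} + G_{G2}) & 0 \\
-(g_{m1} + g_s + G_{G1}) & -(g_{d1} + g_{m2} + G_{D1} + G_{G2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} + g_s + G_{G1} + G_{D1} + G_{G2} + G_L & -(g_{d2} + G_L) \\
-G_F & g_{m2} & -(g_{m2} + g_{d2} + G_L) & g_{d2} + G_L + G_F
\end{bmatrix}
\tag{6}
\]
Equation (6) represents the Floating Admittance Matrix [3], [4], [5] of the two stage common source amplifier. Now from (6) the input impedance of the circuit in Fig. 2 can be expressed as [1], [2]
\[
Z_{in} = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
{(g_{g1} + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_L + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\tag{7}
\]
Similarly, its output impedance and voltage gain can be expressed as [1], [2]
\[
Z_{out} = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{g1} + G_G + G_F)}
{(g_{g1} + g_s + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\tag{8}
\]
\[
A_V\big|^{43}_{13} = \frac{g_{m1} g_{m2} + G_F\,(g_{d1} + G_D + G_G)}
{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
\tag{9}
\]
VERIFICATION ON MATLAB
The values of the input impedance, the output impedance and $A_V|^{43}_{13}$ for different values of source conductance and load conductance (0 mS, 1 mS, and 2 mS) have been programmed through MATLAB. The outputs of the MATLAB programs have been plotted for the input impedance, the output impedance and $A_V|^{43}_{13}$ with respect to the feedback conductance, Gf. If we assume that the two MOSFETs of Fig. 2 are properly biased to yield the same values of their internal parameters ($g_{d1} = g_{d2}$ and $g_{m1} = g_{m2}$), then for plotting on demand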
The plot of output resistance as a function of feedback conductance (Gf) is shown in Figs. 6, 7, and 8 for 0 S, 1 mS and 2 mS of source conductance respectively, as per (8). The following observations are recorded from the plots in Figs. 6, 7 and 8:
[...] mS; thereafter Ro starts increasing exponentially (from 237.9 Ω to 2829 Ω) from Gf = 0.03340 mS to Gf = 0.03341 mS and suddenly jumps down (to -7836 Ω) as Gf reaches 0.033411 mS. Again, Ro rises (from -7836 Ω to -22.83 Ω) from Gf = 0.033411 mS to Gf = 0.0335 mS, and remains constant thereafter at -22.83 Ω for higher values of Gf.
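The sign change in the resistances can be illustrated with a minimal numerical sketch. It grounds node 3 of the admittance matrix (6), so that only nodes 1, 2 and 4 remain, and evaluates the driving-point resistances and the voltage gain from determinants. Python is used here in place of the authors' MATLAB programs, and the function names and all parameter values in `PARAMS` are assumptions chosen for illustration, so the numbers do not reproduce the plotted values.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def reduced_matrix(p, gf):
    """Matrix (6) with node 3 grounded; rows and columns are nodes 1, 2, 4."""
    a = p["Gs"] + p["GG1"] + gf
    b = p["gd1"] + p["GD1"] + p["GG2"]
    c = p["gd2"] + p["GL"] + gf
    return [[a, 0.0, -gf],
            [p["gm1"], b, 0.0],
            [-gf, p["gm2"], c]]


def amplifier_figures(p, gf):
    """Return (resistance seen at node 1, resistance seen at node 4, v4/v1)."""
    y = reduced_matrix(p, gf)
    a, b, c = y[0][0], y[1][1], y[2][2]
    d = det3(y)
    r_in = b * c / d     # cofactor (1,1) over determinant
    r_out = a * b / d    # cofactor (4,4) over determinant
    a_v = (p["gm1"] * p["gm2"] + gf * b) / (b * c)
    return r_in, r_out, a_v


def critical_gf(p):
    """Feedback conductance at which the determinant crosses zero.

    The determinant is linear in gf: b*a0*c0 + gf*(b*(a0 + c0) - gm1*gm2).
    """
    a0 = p["Gs"] + p["GG1"]
    b = p["gd1"] + p["GD1"] + p["GG2"]
    c0 = p["gd2"] + p["GL"]
    return a0 * b * c0 / (p["gm1"] * p["gm2"] - b * (a0 + c0))


# Hypothetical small-signal values in siemens (not from the paper).
PARAMS = {"gm1": 2e-3, "gm2": 2e-3, "gd1": 2e-5, "gd2": 2e-5,
          "Gs": 0.0, "GG1": 1e-6, "GD1": 1e-4, "GG2": 1e-6, "GL": 1e-4}

if __name__ == "__main__":
    gf_c = critical_gf(PARAMS)
    for gf in (0.9 * gf_c, 1.1 * gf_c):
        r_in, r_out, a_v = amplifier_figures(PARAMS, gf)
        print(f"Gf={gf:.3e} S  Rin={r_in:.3e} ohm  Rout={r_out:.3e} ohm  Av={a_v:.1f}")
```

In this reduced model the determinant is linear in the feedback conductance, so both resistances pass through a pole and change sign at a single critical value. This is the same qualitative behaviour as the jump from large positive to large negative resistance reported above.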
CONCLUSION
Plots in Figs. 3 to 8 reveal a region of very sudden change in the values of input resistance and output resistance, from very high positive values to large negative values, for a very small change, of the order of 10^-5, in the value of the feedback conductance, Gf. This zone of very high variation in input and output resistances can be used for compensation of resistances to obtain a very high Q-factor in lossy networks.
REFERENCES
[1] Wai-Kai Chen, On second order cofactors and null return difference in
feedback amplifier theory, International Journal of circuit theory and
application, Vol. 6, Issue 3, pp. 305-312, Dec. 2006.
[2] Otso Juntunen , A two port S-parameter data transformation, circuit
theory laboratory report series, CT-35, Helsinki University of technology,
Finland, Espoo 1998.
[3] B.P. Singh, Unified Approach to electronics circuit analysis, IJEEE, pp.
276-285, July 1978.
[4] B.P. Singh, Active bridge for measurement of admittance parameters of
the transistors, Indian Journal of Pure and Applied Physics, Vol. 15, pp.
783-786, Nov. 1976.
[5] B.P. Singh, A new active bridge for measuring FET parameters, J Phys.
E. Scientific Instrument, Vol. II, pp. 667-670, 1978.
[6] Jacob Millman and Christos C. Halkias, Integrated Electronics, Analog
and Digital Circuits and Systems, TATA McGRAW-HILL publication, pp.
471-475, 2004.
[7] B.P. Singh, Meena Singh, Sanjay Kumar Roy and S.N. Shukla,
Mathematical Modeling of Electronic Devices and its integration;
Proceedings of National Seminar on Recent Advances on Information
Technology, Allied Publishers Pvt. Ltd., Indian School of Mines Dhanbad
University, pp.494-502, Feb. 6-7, 2009
[8] B.P. Singh, Arun Kumar Singh, Verification of transfer functions
of BJT obtained by using MATLAB, Proceedings of IEEE National
Symposium on Innovative Development in Electronics Arena, Arya
College of Engineering, pp. 92-96, Dec. 12, 2009.
sksingh@doeaccgkp.edu.in
Abstract—Due to the increasing importance of power electronics in the control of devices, particularly in electric vehicles, reliability analysis becomes important. The reliability of a component is the probability that this component will perform its intended function after a time 't' in a given operating condition. Component reliability cannot be assessed by considering only the power losses: for predicting the reliability of power electronics components, the temperature and the temperature cycles have to be determined.
The Military Handbook [3], released by the US Department of Defense, is generally accepted and often used to determine reliability [1]. However, the handbook is no longer revised, new components like IGBTs are not considered in it, and its values are too conservative for available devices. Some manufacturers give reliability information that extends only to switching losses and total power losses; very few of them give the thermal model of the devices. The calculation of power losses and thermal modelling based on a PWM reconstruction technique is presented in [5]. This method is useful for large simulation time steps and particularly for long mission profiles. D. Hirschman presented an approach with simple formulas for reliability prediction of inverters in HEVs. Work presented in the literature so far has developed reliability models for power electronics components but has not considered the effect of the PWM method on reliability. This work presents a comparison between a six-step PWM based inverter and an SVPWM based IGBT inverter in terms of reliability. Reliability is found by the conventional method and also by considering thermal cycles. MATLAB/Simulink based models for finding the switching losses and temperature cycles are developed.
Inverters are used in hybrid electric vehicles to convert the DC supply coming from the battery into AC for the motor that runs the vehicle. Inverters are made up of semiconductors and capacitors, so it is important to assure the reliability of these components, because malfunctioning of any of the power electronic components may prevent the vehicle from operating.
Mainly three phase voltage source inverters are used in these types of applications. Here IGBTs are used as switching devices. For designing an inverter, it is important to make a good thermal design such that on the one hand the temperature of the components never exceeds their specified maximum temperature and on the other hand the cooling system is not oversized.
II. BASICS OF RELIABILITY CALCULATION
2.1 INTRODUCTION
"The reliability of a component is the probability that this component will perform its intended function after a time t in a given working condition."
The global reliability of the system is the product of all reliabilities.
Let $N_0$ be the total number of identical components considered for reliability. $N_s(t)$ is the number of components surviving at time t, and $N_f(t)$ is the number of components failed at time t. Then at any time, $N_s(t) + N_f(t) = N_0$, and reliability is given as
\[ R(t) = \frac{N_s(t)}{N_0} \tag{2.1} \]
Reliability is characterized by the failure rate $\lambda$. The failure rate is the probability that a component, which is still operational at time $t$, fails in the time interval $[t, t + \Delta t]$, where $\Delta t \to 0$. Thus, it gives the fraction of failures in a certain time interval for defined boundary conditions. The unit of the failure rate is FIT (failures in time):
\[ 1\ \text{FIT} = \frac{1\ \text{failure}}{10^{9}\ \text{h}} \tag{2.2} \]
The total failure rate for a system consisting of k components is the sum of all single failure rates $\lambda_i$, given as
\[ \lambda_{total} = \sum_{i=1}^{k} \lambda_i \tag{2.3} \]
The mean time to failure (MTTF) is also used to characterize the reliability. MTTF, the mean time elapsed before the first failure occurs, is equal to the area under the reliability curve:
\[ MTTF = \int_{0}^{\infty} R(t)\,dt \tag{2.4} \]
It can be calculated easily as
\[ MTTF = \frac{1}{\lambda} \tag{2.5} \]
Different approaches can be used to calculate reliability. A well known method is to use failure rate catalogs. Various failure rate catalogs are available, e.g. the Military Handbook (MIL-HDBK-217F) and the Recueil de Données de Fiabilité (RDF 2000).
2.2 Military Handbook (MIL-HDBK-217F) Method
Military Handbook 217F was released in 1995 by the US Department of Defense, Washington DC. This revised version is also the last version, as the Department of Defense has discontinued updating this standard. Hence, new electronic devices like IGBTs are not considered in this standard, and many reference values are too conservative for the currently available devices. Regardless, MIL-HDBK-217F is generally accepted and often used to determine reliability. The models have been developed based on historical part failure rates.
The component failure rate is computed by multiplying a component base failure rate with application specific $\pi$-factors:
\[ \lambda_p = \lambda_b\,\pi_T\,\pi_A\,\pi_Q\,\pi_E \quad \text{Failures}/10^{6}\ \text{h} \tag{2.6} \]
Here $\lambda_b$ is the Base Failure Rate
$\pi_T$ is the Temperature Factor
$\pi_A$ is the Application Factor
$\pi_Q$ is the Quality Factor
$\pi_E$ is the Environmental Factor
However, no $\pi$-factor exists which takes temperature cycles into consideration.
III. ELECTRICAL MODELING AND CALCULATION OF POWER LOSSES
During the design phase of an inverter, it is important to make a good thermal design such that on the one hand the temperatures of the components never exceed their specified maximum temperature and on the other hand the cooling system is not oversized. In hybrid electric vehicles, the inverter load cannot directly be derived from the current load status. Instead, the inverter load is computed by a complex algorithm that considers the motor speed, the required torque, the state of charge of the traction battery, etc.
The electrical simulation includes the inverter model and computes the currents and voltages at the terminals of the inverter. These values are stored in a file which is used as input for the thermal simulation. The advantage of this procedure is that the results of the electrical simulation can be reused for different thermal simulations, if nothing in the model is changed.
SIMULATION AND RESULTS
5.1 INTRODUCTION
A block diagram representation of the whole work is shown in Fig. 5.1. The figure shows a three-phase inverter with IGBT/diode switching devices, a constant DC supply as the input to the inverter model, and a three-phase load.
The losses in the IGBT, i.e. conduction loss and switching loss, are calculated and fed to the thermal model. Note that the switching losses in an IGBT can be found from datasheets. The thermal model gives the junction temperature as an output, which is later used in calculating the reliability of the devices.
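The reliability relations of Section II (system failure rate as a sum, system reliability as a product, MTTF as the reciprocal of a constant failure rate, and the MIL-HDBK-217F style product of a base rate with π-factors) can be sketched as follows. The exponential form R(t) = exp(-λt) is an assumption that follows from a constant failure rate and is not written out in the text. The function names and every numerical value are hypothetical placeholders and are not handbook data.

```python
import math


def part_failure_rate(lambda_b, pi_t, pi_a, pi_q, pi_e):
    """Part failure rate in failures per 1e6 h: base rate times the pi-factors."""
    return lambda_b * pi_t * pi_a * pi_q * pi_e


def reliability(rate_per_hour, hours):
    """R(t) = exp(-lambda * t), assuming a constant failure rate."""
    return math.exp(-rate_per_hour * hours)


def system_failure_rate(rates):
    """Total failure rate of k components in series: sum of the single rates."""
    return sum(rates)


def mttf(rate_per_hour):
    """Mean time to failure for a constant failure rate."""
    return 1.0 / rate_per_hour


if __name__ == "__main__":
    # Hypothetical inverter: six switches and six diodes (placeholder factors).
    switch = part_failure_rate(0.002, 3.0, 10.0, 5.5, 9.0) / 1e6  # per hour
    diode = part_failure_rate(0.001, 2.5, 1.0, 5.5, 9.0) / 1e6    # per hour
    rates = [switch] * 6 + [diode] * 6

    total = system_failure_rate(rates)
    mission_h = 5000.0
    product = math.prod(reliability(r, mission_h) for r in rates)

    print(f"system failure rate: {total * 1e9:.0f} FIT")
    print(f"R(system) from product of parts: {product:.6f}")
    print(f"R(system) from total rate:       {reliability(total, mission_h):.6f}")
    print(f"MTTF: {mttf(total):.0f} h")
```

With a constant failure rate, the product of the component reliabilities equals the reliability computed from the summed failure rate, so the two printed reliabilities agree.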
TABLE 5.1
Calculated and
IGBT DIODE
= 0.238 °C/W = 0.650 °C/W
= 0.362 °C/W =0.650 °C/W
= 0.095 J/°C = 0.064 J/°C
= 0.240 J/°C = 0.133 J/°C
= -0.499 = -1.099
= 7.772 = 6.901
= 69.518 = 40.212
[Figure: stator current (A), rotor speed (rpm) and torque (Nm) versus time (sec), 0 to 5 s]
Fig. 5.4 Stator Current, Speed and Torque Curve
In fig. 5.5 the changes in torque and speed values are shown, which clearly indicate the changes that occur at 1 s, 1.5 s, 2.5 s and 4 s. The stator current curve changes in accordance with the changes in speed and torque values.
A speed reference step from 0 to 1800 rpm is applied at t = 0. The speed set point does not go to 1800 rpm instantaneously but follows the acceleration ramp. The motor reaches steady state at t = 1 s.
At t = 1.5 s, a decelerating torque is applied to the motor's shaft, and a speed decrease can be observed. Since the rotor speed is higher than the synchronous speed, the motor is working in generator mode. The braking energy is transferred to the DC link and the bus voltage tends to increase. However, the overvoltage limits [...].
At t = 2.5 s, the torque applied to the motor's shaft steps from 30 Nm to 0 Nm. A drop in DC bus voltage and speed can be observed. At this point, the DC bus controller switches from braking to motoring mode.
[Figures: temperature curves, temperature (in degree Celsius) versus time (in seconds), each with a histogram of the number of detected temperature cycles (number of times versus difference in temperature)]
[Figure: power loss (W) and temperature (deg C) versus time (sec), 0 to 5 s]
∆T    n(reldata)   N             N(Total) = N × n(reldata)   1/N(Total)
7     4            7.0469e+006   2.82E+07                    3.55E-08
8     4            6.7032e+006   2.68E+07                    3.73E-08
9     9            6.3763e+006   5.74E+07                    1.74E-08
10    5            6.0653e+006   3.03E+07                    3.30E-08
11    5            5.7695e+006   2.88E+07                    3.47E-08
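The arithmetic behind the last two columns of the table (N(Total) as the product of N and the number of detected cycles, and its reciprocal) can be reproduced with a short sketch. The helper names are hypothetical. The sketch also checks an observation about the numbers: the tabulated N values match 10^7·exp(-0.05·∆T) to the printed precision. That relation is inferred from the table only and is not a model stated in the text.

```python
import math

# Rows copied from the table: (delta_T, n detected cycles, N).
ROWS = [
    (7, 4, 7.0469e6),
    (8, 4, 6.7032e6),
    (9, 9, 6.3763e6),
    (10, 5, 6.0653e6),
    (11, 5, 5.7695e6),
]


def n_total(n_detected, n_value):
    """N(Total) column: N multiplied by the number of detected cycles."""
    return n_value * n_detected


def inferred_n(delta_t):
    """Curve that happens to match the tabulated N values (an inference only)."""
    return 1e7 * math.exp(-0.05 * delta_t)


if __name__ == "__main__":
    for delta_t, n_detected, n_value in ROWS:
        total = n_total(n_detected, n_value)
        print(f"{delta_t:>3} {n_detected:>2} {n_value:.4e} "
              f"{total:.2e} {1.0 / total:.2e} {inferred_n(delta_t):.4e}")
```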
[...] TECHNIQUE
[Figure: stator current (A), rotor speed (rpm) and electromagnetic torque (Nm) versus time (sec), 0 to 5 s]
Fig. 5.8 Stator Current, Speed and Torque
At time t = 0 s, the speed set point is 1800 rpm. The speed follows the acceleration ramp precisely and reaches a steady state at t = 1 s.
[Figure: power loss (W) and temperature (deg C) versus time (sec), 0 to 5 s]
Fig. 5.9 Power Loss and Temperature Curves
[Figure: temperature curve, temperature (in degree Celsius) versus time (in seconds), with a histogram of the number of detected temperature cycles (number of times versus difference in temperature)]
[Figure: temperature curve versus time (in seconds), with a histogram of the number of detected temperature cycles (number of times)]
∆T    n(reldata2)   N             N(Total) = N × n(reldata2)   1/N(Total)
3     3             8.6071e+006   2.58E+07                     3.87E-08
4     3             8.1873e+006   2.46E+07                     4.07E-08
5     3             7.7880e+006   2.34E+07                     4.28E-08
6     4             7.4082e+006   2.96E+07                     3.37E-08
11    3             5.7695e+006   1.73E+07                     5.78E-08
12    3             5.4881e+006   1.65E+07                     6.07E-08
13    3             5.2205e+006   1.57E+07                     6.39E-08
MTTFs
The results for the SVPWM technique for a simulation time of 50 s are given in Appendix II.
REFERENCES
[...] inverter," in Proc. 37th IEEE Power Electron. Specialists Conf., 2006, PESC '06, Jun. 2006, pp. 1–5.
[14] Semikron Application Handbook. Berlin, Germany: ISLE Verlag, 1998. ISBN 3-932633-24-5.
Abstract— Multi-threshold CMOS (MTCMOS) technology features MOSFETs with low threshold voltage (for speed enhancement) and high threshold voltage (for suppressing standby leakage current during the sleep period). In such a design, frequent mode transitions, i.e. active to sleep and sleep to active, may occur, which consume a significant amount of energy. This paper presents a charge recycling concept between virtual supply and virtual ground to reduce dynamic energy consumption during mode transition. It presents the simulation of a two-bit ripple carry adder used in a 2-bit accumulator, showing a 75% reduction in dynamic energy consumption during mode transition compared to a ripple adder with conventional MTCMOS.
Index Terms— Charge recycling, Gated ground, Gated-power, Multi-threshold voltage, Virtual power node
I. INTRODUCTION
[...] calculation in section III, simulation results in section IV and conclusion in section V.
[Figure: two CMOS full adders linked by the carry signal. Labels recoverable from the figure: Vdd, Virtual Vdd (VP), Virtual Gnd (VG)]
[Figure: two CMOS full adders linked by the carry signal. Labels recoverable from the figure: VDD, Sleep, VCR, Virtual Vdd (VP), Virtual Gnd (VG)]
Fig. 2. Virtual ground voltage VG = 1.3 V and virtual supply voltage VP = 0 V during sleep mode
TABLE I.
Process parameters of the TSMC 180 nm process for VDD = 1.8 V
Parameters       NMOS   PMOS
CGDO (fF/µm)     0.37   0.33
CJ (fF/µm²)      0.77   0.85
CJSW (fF/µm)     0.18   0.33
[Figure: MOSFET parasitic capacitances CGD, CDB, CGB, CGS, CSB]
[Bar chart: energy (fJ), axis from 0 to 50, with bar labels "Conventional MTCMOS", "Gated - MTCMOS" and "Charge Recycling"]
Fig. 8. The energy overheads of the MTCMOS 2-bit ripple adders
V. CONCLUSION
In this paper, a charge recycling MTCMOS technique for a two-bit ripple adder is proposed to reduce the dynamic energy overhead during mode transitions, from sleep to active and from active to sleep. A transmission gate is used for charge recycling between the virtual rails. We have shown a 75% reduction of the energy overhead during mode transition, i.e. active to sleep and sleep to active, with the charge recycling technique compared to the conventional one. In standby mode, the circuit loses its data, so future work can add a data retention circuit to this design.
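The idea of recycling charge between the virtual rails can be illustrated with an idealized charge-sharing sketch. It treats the virtual ground and virtual supply rails as two lumped capacitors that are connected through a lossless transmission gate before wake-up. The sleep-mode voltages (VG = 1.3 V, VP = 0 V) and VDD = 1.8 V are taken from the figure caption and table above; the rail capacitances and the function names are hypothetical. This simple model covers only the charge drawn to bring the virtual supply rail up to VDD and is not expected to reproduce the 75% figure obtained from simulation.

```python
def shared_voltage(c_ground, v_ground, c_supply, v_supply):
    """Common rail voltage after charge sharing (charge conservation)."""
    return (c_ground * v_ground + c_supply * v_supply) / (c_ground + c_supply)


def wakeup_supply_energy(c_supply, v_start, vdd):
    """Energy drawn from VDD to charge the virtual supply rail from v_start to VDD."""
    return c_supply * vdd * (vdd - v_start)


if __name__ == "__main__":
    vdd, v_g, v_p = 1.8, 1.3, 0.0   # volts, from the sleep-mode description
    c_g = c_p = 1e-12               # farads, assumed equal rail capacitances

    v_eq = shared_voltage(c_g, v_g, c_p, v_p)
    e_plain = wakeup_supply_energy(c_p, v_p, vdd)     # no recycling
    e_recycled = wakeup_supply_energy(c_p, v_eq, vdd)  # after charge sharing
    saving = 1.0 - e_recycled / e_plain

    print(f"rail voltage after sharing: {v_eq:.2f} V")
    print(f"supply energy saved on wake-up: {saving * 100:.1f} %")
```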
REFERENCES
[1] S. Mutoh, T. Douseki, Y. Matsuya, T. Aoki, S. Shigematsu, and J. Yamada, "1 V power supply high-speed digital circuit technology with multi-threshold-voltage CMOS," IEEE J. Solid-State Circuits, vol. 30, no. 8, pp. 847-854, Aug. 1995.
[2] A. Abdollahi, F. Fallah, M. Pedram, "A Robust Power Gating Structure and Power Mode Transition Strategy for MTCMOS Design," IEEE Trans. Very Large Scale Integration (VLSI) Systems, vol. 15, Jan. 2007.
[3] E. Pakbaznia, F. Fallah, and M. Pedram, "Charge recycling in MTCMOS circuits: concept and analysis," in Proc. ACM/IEEE Des. Autom. Conf., 2006, pp. 97-102.
[4] Z. Liu and V. Kursun, "Charge Recycling between Virtual Power and Ground Lines for Low Energy MTCMOS," Proceedings of the IEEE/ACM International Symposium on Quality Electronic Design, pp. 239-244, March 2007.
[5] J. P. Uyemura, Introduction to VLSI Circuits and Systems, Wiley Student Edition.
[6] S. M. Kang, Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design, 3rd ed. Pearson Education.
[7] N. H. E. Weste, D. Harris, A. Banerjee, CMOS VLSI Design: A Circuits and Systems Perspective, 3rd ed. Pearson Education.
[8] J. M. Rabaey, A. P. Chandrakasan, B. Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd ed.
Abstract— CMOS technology has reached the sub-45-nm range. It is expected that nano-CMOS technology will govern IC manufacturing for at least another couple of decades. Though there are many challenges ahead, further downsizing of the device to a few nanometers is still on the schedule of the International Technology Roadmap for Semiconductors (ITRS). Several technological options for manufacturing nano-CMOS microchips are available or will be available very soon. This paper reviews the challenges of nano-CMOS downsizing and focuses on recent developments in the key technologies for nano-CMOS in the years to come.
I. INTRODUCTION
Among the numerous great inventions made in the 20th century, electronics is the most important one. Almost everything related to human activities, such as power generation, transportation, entertainment and medical care, is now provided and controlled by electronics. Semiconductors are a strategically important technological area for all nations. Electronic circuit development has been accomplished with the downscaling of component size since the replacement of vacuum tubes with transistors 40 years ago. The circuit characteristics have benefited a lot from the downsizing. We are now able to integrate millions of CMOS transistors at the nanoscale level on a silicon chip occupying only a few square centimetres. The operating speed of recently developed microprocessors has already reached up to 5 GHz and is expected to increase further, although recent trends indicate that the increase in clock frequency may gradually saturate. CMOS integrated circuits, as well as their core device technology, are expected to evolve further for at least a couple of decades, and their importance will increase further in future intelligent systems. CMOS device dimensions have been reduced to a millionth at the production level in the past 100 years. A hundred years ago, no one could have imagined that the mankind of our time would be able to make electronic circuits consisting of billions of components with dimensions smaller than a bacterium, and that those circuits would fulfil the different needs of society. Future scaling trends have been predicted by the International Technology Roadmap for Semiconductors (ITRS) for 30 years up to 2040, when the physical gate length is expected to be 1 nm (as shown in Figure 1) [2]. It is believed that CMOS device downsizing will approach the physical limit.
Figure 1: Feature size versus time in silicon ICs.
II. CHALLENGES IN SCALING
Device downsizing from 10 μm to the sub-45-nm range has brought many benefits in terms of speed, power, and cost. But apart from the improvements reported above, one of the major causes of performance degradation in ultra-large scale circuits is the interconnect delay due to the increase in the (parasitic) resistance and capacitance of narrow and dense interconnection metal lines. Furthermore, the performance improvement is also questionable for the ultra-small MOSFET itself. According to scaling theory, the drain current per unit gate width should remain constant. However, a significant reduction of the drain current per unit gate width for sub-45-nm gate length MOSFETs was reported recently (as in Fig. 2) [2]. This phenomenon is due to the non-optimized MOSFET structure and process. On the other hand, the small drain current (of several tens of microamperes per micrometer) at the scaled supply voltage becomes a major concern. Besides, the fringing capacitance of the gate electrode and the inversion layer capacitance will also degrade the performance of ultra-small MOSFETs (as in Fig. 3) [2]. It is still doubtful at this moment that such a small MOSFET can be used for high-speed devices. Hence, without
Figure 5: FinFET structure

This may solve the alignment issue, but there is one other challenge to overcome. In order to control SCE, the body thickness must be ¼ of the gate length. This is a daunting challenge because the gate length is usually the smallest dimension that can be fabricated. There are some technologies that may address this, but more work needs to be done in this area.

The most popular idea is to use carbon nanotubes (CNTs) as transistors (a configuration example is shown in Fig. 6). This concept is very appealing because it is still a transistor and could make use of all the architectural knowledge developed for CMOS. Carbon nanotubes, however, have a long way to go before they can start replacing silicon-based MOS transistors. First of all, nanotube transistors developed to date have shown very poor performance characteristics. Many of the problems they exhibit are similar to the challenges CMOS is currently facing, such as high off-state leakage and source-to-drain tunneling. Also, despite the hopes for chemical self-assembly some day, it is still very difficult to produce nanotube transistors.

IV. CONCLUSIONS
Silicon MOSFETs have been the smallest electronic device for several decades. The gate length used for high-performance logic units is 45 nm in production and 5 nm in research. Note that the 5-nm gate length is the distance of 18 atoms and the 0.8-nm oxide thickness is only two atomic layers. Si technology is no doubt the most successful nano-device technology. We do not see any realistic replacement for silicon devices. Even if Si devices reach the downsizing limit, whether at 10 nm, 5 nm, or 1 nm, other emerging devices such as molecular transistors will also reach their downsizing limit at similar dimensions. It is a critical period for moving from 45-nm to 10-nm technology within this decade. Most of the materials and manufacturing processes used in the deep-submicron era are now being pushed to their physical limits. New materials and technologies are required for further down-scaling the device to 10-nm technology and below. Immersion lithography for ultra-fine patterning, strained channels, nickel salicide, high-k gate dielectric, low-k interlayer for interconnect, plasma doping, flash and laser annealing for source and drain doping, elevated source and drain, and three-dimensional MOSFETs for controlling short-channel effects would help to overcome the materials and technological constraints and improve the device performance at the ultra-small scale.

The final remark is a non-technical issue. We anticipate that this issue will be one of the most important for nano-CMOS technology development in the next 15 years. We are aware that most of the new mega-fabs being planned or under construction are in East and Southeast Asia, and particularly in Mainland China. In 10 or 15 years' time, the distribution of semiconductor manufacturing sites in Asia (including Japan) will be quite substantial. Currently, Korea and Taiwan are in first place for semiconductor memory manufacturing and semiconductor foundry, respectively. They also lead technology development in the Asia region. Mainland China seems to be another superpower for semiconductor manufacturing. China's share of semiconductor manufacturing will keep growing fast with the support of booming IC design houses and the construction of new fabs with a remarkable increase in industrial investment, and China will be the most important large and rapidly expanding market. As in many other industries and other sectors of electronic products, Mainland China will eventually become “the factory of the world” in semiconductor manufacturing in 15 years or longer and will have great impact on the future nano-CMOS technology.
REFERENCES
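$$
\begin{bmatrix} i_1 = i_g \\ i_2 = i_d \\ i_3 = i_s \end{bmatrix}
=
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
\begin{bmatrix} v_1 = v_g \\ v_2 = v_d \\ v_3 = v_s \end{bmatrix}
\tag{1}
$$

[Fig. 2: circuit diagram; component labels rs, RG1, RD1, RG2, RD2, RF (GF); nodes 1, 2, 3, 4]

Thus the floating admittance matrix of two MOSFETs (device1 and device2) connected in Fig. 2 can be written as (rows and columns ordered as nodes 1, 2, 3)

$$
Y_{device1} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) \\
-g_{m1} & -g_{d1} & g_{m1} + g_{d1}
\end{bmatrix}
\tag{3}
$$

and (rows and columns ordered as nodes 2, 4, 3)

$$
Y_{device2} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m2} & g_{d2} & -(g_{m2} + g_{d2}) \\
-g_{m2} & -g_{d2} & g_{m2} + g_{d2}
\end{bmatrix}
\tag{4}
$$

Now the composite matrix of two devices (device1 and device2) is written as (rows and columns ordered as nodes 1, 2, 3, 4)

$$
Y_{devices} =
\begin{bmatrix}
0 & 0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) & 0 \\
-g_{m1} & -(g_{d1} + g_{m2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} & -g_{d2} \\
0 & g_{m2} & -(g_{m2} + g_{d2}) & g_{d2}
\end{bmatrix}
\tag{5}
$$

The overall admittance matrix for Fig. 2 is written as

$$
Y =
\begin{bmatrix}
g_s + G_{G1} + G_F & 0 & -(g_s + G_{G1}) & -G_F \\
g_{m1} & g_{d1} + G_{D1} + G_{G2} & -(g_{m1} + g_{d1} + G_{D1} + G_{G2}) & 0 \\
-(g_{m1} + g_s + G_{G1}) & -(g_{d1} + g_{m2} + G_{D1} + G_{G2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} + g_s + G_{G1} + G_{D1} + G_{G2} + G_L & -(g_{d2} + G_L) \\
-G_F & g_{m2} & -(g_{m2} + g_{d2} + G_L) & g_{d2} + G_L + G_F
\end{bmatrix}
\tag{6}
$$

Equation (6) represents the Floating Admittance Matrix [3], [4], [5] of the two-stage common source amplifier. Now from (6) the input impedance of the circuit in Fig. 2 can be expressed as [1]-[3]

[…] (8)

$$
A_V\big|^{43}_{13} = \operatorname{Sgn}(4-3)\,\operatorname{Sgn}(1-3)\,(-1)^{11}\,\frac{\left|Y^{43}_{13}\right|}{\left|Y^{13}_{13}\right|}
$$

$$
A_V = \frac{g_{m1}\, g_{m2} + G_F\,(g_{d1} + G_D + G_G)}{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
\tag{9}
$$

VERIFICATION ON MATLAB

The values of the input resistance, the output resistance, and the voltage gain $A_V|^{43}_{13}$ for different values of source conductance and load conductance (0 mS, 1 mS, and 2 mS) have been programmed through MATLAB. The outputs of the MATLAB programs have been plotted for the input resistance, the output resistance, and $A_V|^{43}_{13}$ with respect to the feedback conductance, $G_F$.
If we assume that the two MOSFETs of Fig. 2 are properly biased to yield the same values of their internal parameters ($g_{d1} = g_{d2}$ and $g_{m1} = g_{m2}$), then for plotting the simulated input and output resistances, typical values of the external parameters along with the internal parameters can be given as: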
g_d1 = g_d2 = 0.1 mS, g_m1 = g_m2 = 5 mS, G_L = G_D = 1 mS, G_G1 = G_G2 = G_G = 0.001 mS, g_g1 = g_g2 = 0.0001 mS, G_F

225.4 Ω) till GF = 0.002 mS and remains constant thereafter at -225.4 Ω for higher values of GF.
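As a rough cross-check of the voltage gain expression (9), the calculation can be sketched in Python rather than MATLAB. This is only an illustrative sketch under stated assumptions: node 3 is taken as the common reference, the gate conductances g_g1 and g_g2 are neglected, the external conductances are assumed to connect as g_s and G_G1 between nodes 1 and 3, G_D1 and G_G2 between nodes 2 and 3, G_L between nodes 4 and 3, and G_F between nodes 1 and 4, and the source conductance value of 1 mS is one of the swept values mentioned above. The function names are hypothetical.

```python
def floating_admittance_matrix(gm1, gm2, gd1, gd2, gs, GG1, GD1, GG2, GL, GF):
    """Assumed 4x4 floating admittance matrix of the two-stage amplifier.

    Rows and columns are ordered as nodes 1, 2, 3, 4 (all values in mS).
    Gate conductances are neglected (assumption).
    """
    return [
        [gs + GG1 + GF, 0.0, -(gs + GG1), -GF],
        [gm1, gd1 + GD1 + GG2, -(gm1 + gd1 + GD1 + GG2), 0.0],
        [-(gm1 + gs + GG1), -(gd1 + gm2 + GD1 + GG2),
         gm1 + gd1 + gm2 + gd2 + gs + GG1 + GD1 + GG2 + GL, -(gd2 + GL)],
        [-GF, gm2, -(gm2 + gd2 + GL), gd2 + GL + GF],
    ]


def det2(m, rows, cols):
    """Determinant of the 2x2 submatrix picked by zero-based rows and cols."""
    (r0, r1), (c0, c1) = rows, cols
    return m[r0][c0] * m[r1][c1] - m[r0][c1] * m[r1][c0]


def voltage_gain(Y):
    """Gain from input pair (1, 3) to output pair (4, 3) via second-order cofactors."""
    # Rows 1 and 3 are deleted in both minors, leaving rows 2 and 4.
    numerator = det2(Y, (1, 3), (0, 1))    # columns 4 and 3 deleted
    denominator = det2(Y, (1, 3), (1, 3))  # columns 1 and 3 deleted
    sign = (+1) * (-1) * (-1) ** 11        # Sgn(4-3) * Sgn(1-3) * (-1)^11
    return sign * numerator / denominator


if __name__ == "__main__":
    for GF in (0.0, 0.001, 0.002, 0.01):
        Y = floating_admittance_matrix(5, 5, 0.1, 0.1, 1, 0.001, 1, 0.001, 1, GF)
        print(f"GF = {GF} mS, A_V = {voltage_gain(Y):.4f}")
```

With the gate conductance neglected, the cofactor ratio reduces to the closed form of (9) with g_g2 = 0, which gives a quick consistency check between the matrix and the gain expression.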
REFERENCES
[1] Wai-Kai Chen, On second order cofactors and null return difference in
feedback amplifier theory, International Journal of circuit theory and
application, Vol. 6, Issue 3, pp. 305-312, Dec. 2006.
[2] Otso Juntunen , A two port S-parameter data transformation, circuit
theory laboratory report series, CT-35, Helsinki University of technology,
Finland, Espoo 1998.
[3] B.P. Singh, Unified Approach to electronics circuit analysis, IJEEE, pp.
276-285, July 1978.
[4] B.P. Singh, Active bridge for measurement of admittance parameters of
the transistors, Indian Journal of Pure and Applied Physics, Vol. 15, pp.
783-786, Nov. 1976.
[5] B.P. Singh, A new active bridge for measuring FET parameters, J Phys.
E. Scientific Instrument, Vol. II, pp. 667-670, 1978.
[6] Jacob Millman and Christos C. Halkias, Integrated Electronics, Analog
and Digital Circuits and Systems, TATA McGRAW-HILL publication, pp.
471-475, 2004.
[7]B.P. Singh, Meena Singh, Sanjay Kumar Roy and S.N. Shukla,
Mathematical Modeling of Electronic Devices and its integration;
Proceedings of National Seminar on Recent Advances on Information
Technology, Allied Publishers Pvt. Ltd., Indian School of Mines Dhanbad
University, pp.494-502, Feb. 6-7, 2009
[8]B.P. Singh, Arun Kumar Singh, verification of transfer functions
of BJT obtained by using MATLAB, Proceedings of IEEE National
Symposium on Innovative Development in Electronics Arena, Arya
College of Engineering, pp. 92-96, Dec. 12, 2009.
ABSTRACT:
CML, or current mode logic, is a differential logic style which offers high noise immunity and high speed of operation. In this paper we compare the performance of PFSCL (positive feedback source coupled logic) with MCML (MOS current mode logic), both of which are derivatives of the CML style. We show through simulations on Orcad PSPICE using 0.18 µm technology that PFSCL offers significant advantages over MCML in terms of power consumption, area occupied and propagation delay.

Due to the growing market for digital signal processing and optical communication applications, commercial interest in high-resolution mixed-signal ICs has been growing. In mixed-signal ICs the analog and the digital blocks are integrated on the same base, and hence the resolution of the analog block is limited by the dynamic switching noise produced by the digital block. Hence the CMOS logic style is not suitable, as it suffers from dynamic switching noise. Also, for CMOS the advantage of having zero static power consumption is lost when it is used at frequencies of hundreds of MHz to GHz. Several other logic styles have been proposed to reduce the dynamic switching noise in mixed-signal ICs, such as in [2], [3] and [4]. The CML style offers an advantage in robustness to switching noise as compared to the CMOS logic style [1]. Also, at high frequencies (hundreds of MHz to the GHz range) the CML style is more power efficient than CMOS logic [7]. This type of logic was first implemented using bipolar transistors [5] and extended for application with MOS transistors. It has less power consumption than ECL but is slower than ECL.

MCML is an extension of Current Mode Logic where a MOSFET is used as the transistor instead of a BJT. A constant current source is used to bias the differential pair of transistors, which switches the current from one transistor of the pair to the other depending upon the applied input. The differential operation suppresses the noise coupled with the signal inputs.

PFSCL is a new logic style which introduces positive feedback into single-ended MCML gates [7]. This eliminates the need for a complementary second input signal while still maintaining the differential mode of operation.
In the following, the operation of MCML gates is explained in section II. The architecture of PFSCL and its operation is addressed in section III. In section IV, result of comparison between the performance of PFSCL and MCML is
MCML GATES:
To understand the operation and the unique properties of MCML, we consider the simple case of an inverter and look at different configurations for its logic construction [8].

Inverters can be implemented using transistors operating as voltage-controlled switches. The simplest configuration is as shown in the figure below:
[from ref 9]

ground. When vi is low, the PU switch is closed and the PD switch is open, establishing vo = vdd. Next, if vi is raised to logic high, the PU switch opens while the PD switch closes, thus establishing vo = 0. This circuit constitutes the basis of the CMOS inverter.

The third type of configuration can be implemented using a double-throw switch as shown below:
[from ref 9]
III PFSCL
In PFSCL, the MCML logic style is modified to include positive feedback from the drain of M1 (vo1) to the gate of M2, the second transistor of the differential pair [10].

MCML.
NM = Vswing/2 (1-1/Av)
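The noise margin expression above can be evaluated numerically as a quick illustration. The sketch below assumes the formula reads NM = (Vswing/2)(1 − 1/Av); the function name and the sample values (0.4 V swing, gain of 4) are hypothetical and are not taken from the paper's simulations.

```python
def noise_margin(v_swing, a_v):
    """Noise margin NM = (v_swing / 2) * (1 - 1 / a_v), in the units of v_swing.

    v_swing: logic swing (volts); a_v: small-signal voltage gain (must be > 0).
    """
    if a_v <= 0:
        raise ValueError("a_v must be positive")
    return 0.5 * v_swing * (1.0 - 1.0 / a_v)


if __name__ == "__main__":
    # Hypothetical example values: 0.4 V swing, gain of 4.
    print(noise_margin(0.4, 4.0))
```

Under this reading, the noise margin approaches half the logic swing as the gain grows and falls to zero when the gain drops to 1.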
[Figure: area required (W1+W2+W3) versus bias current Iss, for MCML and PFSCL]
This graph shows that as the bias current value increases, the area occupied by MCML increases at a faster rate than the area occupied by PFSCL. The advantage in area also leads to a decrease in the associated parasitic capacitance, which in turn causes
[Figure: propagation delay t_delay versus bias current Iss, for PFSCL and MCML]
This graph shows the advantage of PFSCL gate vs MCML in terms of speed of operation.
This enables the extension of CML architecture into the GHz frequency range.
Monte Carlo simulations were also carried out on PFSCL vs MCML gates to determine the robustness of the logic style to process variations (e.g. tox) and variations in Vth (the threshold voltage of the MOS).

From the simulation results it was found that PFSCL was more robust and that its robustness increases as the bias current increases.
REFERENCES:
1) D. Allstot, S. Chee, S. Kiaei, and M. Shristawa, “Folded source-coupled logic vs. CMOS static logic for low-noise mixed-signal ICs,” IEEE Trans. Circuits Syst. I, vol. 40, pp. 553–563, Sept. 1993.
2) S. Kiaei, S. Chee, and D. Allstot, “CMOS source-coupled logic for mixed-mode VLSI,” in Proc. Int. Symp. Circuits Systems, 1990, pp. 1608–1611.
3) J. Kundan and S. Hasan, “Enhanced folded source-coupled logic technique for low-voltage mixed-signal integrated circuits,” IEEE Trans. Circuits Syst. II, vol. 47, pp. 810–817, Aug. 2000.
4) J. Kundan and S. Hasan, “Current mode BiCMOS folded source-coupled logic circuits,” in Proc. ISCAS, June 1997, pp. 1880–1883.
5) P. Gray, P. Hurst, S. Lewis, and R. Meyer, Analysis and Design of Analog Integrated Circuits, 4th ed. New York: John Wiley & Sons, 2000.
6) … basic concepts and perspectives (lecture), Massimo Alioto, 2007.
7) M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, “Modeling and evaluation of positive-feedback source-coupled logic,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 12, Dec. 2004.
8) A. Sedra and K. Smith, Microelectronic Circuits, Oxford.
9) M. Alioto and G. Palumbo, “Design strategies for source coupled logic gates,” IEEE Trans. Circuits Syst. I, vol. 50, pp. 640–654, May 2003.
10) M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, “Modeling and evaluation of positive-feedback source-coupled logic,” IEEE Transactions on Circuits and Systems I, vol. 51, no. 12, Dec. 2004.
Comparative Study of Fast Adders using VHDL and FPGA
Abstract: Adders are one of the most widely used components in integrated circuits and are commonly used in various electronic applications. The major challenge for a VLSI designer is to reduce the chip area, and the next phase is to increase the speed of operation to achieve fast operations. Therefore, various adders such as the ripple adder, carry-look-ahead adder, carry select adder etc. are compared, and VHDL is used in their comparison. The comparative study used Xilinx 9.2i as the synthesis tool, Xilinx ISE Simulator as the simulation tool and the FPGA Spartan-II kit for the implementation of these adders. In this comparison study, area and delay reports are generated for these adders, and the VHDL codes can also be implemented on the FPGA Spartan-II kit.

Adders are among the most widely used components in integrated circuits, so designing efficient adders has been a goal of research in VLSI design. Addition is a crucial arithmetic function for most digital systems. Various adder structures can be used to execute addition, such as serial and parallel structures. They are used not only for addition, but also for other operations such as subtraction, multiplication, division, and address computation. Adders are commonly used in various electronic applications, e.g. digital signal processing, in which adders are used to perform various algorithms like FIR, IIR etc. [1]. In the past, the major challenge for a VLSI designer was to reduce the chip area by using efficient optimization techniques.
Apart from aiding a designer in selecting an adder with favorable characteristics, the aim is to provide insight into design tradeoffs that can save power and enhance performance. The adders studied include linear time ripple carry and Manchester carry chain adders, carry skip and carry select adders, carry lookahead adder and its variations, and carry-save adders.
Several researchers have worked on the performance analysis of adders and other researchers on the performance analysis of multipliers. Therefore, a lot of research is going on to reduce power consumption. There are thus three performance parameters on which a VLSI designer has to optimize a design, i.e. Area, Speed and Power [2]. It is very difficult to meet all constraints for a particular design; therefore, depending on the demand or application, some compromise between constraints has to be made.
Hence, VHDL codes have been formulated for these fast adders, and Xilinx 9.2i is used as the synthesis tool to get the area and delay reports. In addition to this, Xilinx ISE Simulator is used for simulation and the FPGA Spartan-II kit is used for implementation.

Ripple Carry Adder (RCA)

It is possible to create a logical circuit using multiple full adders to add N-bit numbers. Each full adder inputs a Cin, which is the Cout of the previous adder. This kind of adder is a ripple carry adder, since each carry bit "ripples" to the next full adder. A ripple carry adder can be designed by cascading full adders in series, i.e. the carry from the previous full adder is connected as the input carry for the next stage. The full adder is the basic building block of the ripple carry adder. The major limitation of the ripple carry adder is that as the bit length increases, the delay also increases. Therefore, the ripple carry adder is not suitable if a large number of bits are to be added.

The two Boolean functions for the sum and carry are:

SUM = Ai ⊕ Bi ⊕ Ci (i)

Cout = Ci+1 = Ai · Bi + (Ai ⊕ Bi) · Ci (ii)

Fig 1. Ripple carry adder
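The ripple carry structure can be illustrated with a short behavioral model. The sketch below is written in Python purely for illustration (it is not the VHDL used in the study); the helper names and the LSB-first bit ordering are assumptions. Each stage applies the SUM and Cout functions (i) and (ii), and the carry of one stage feeds the next.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: returns (sum, carry_out) per equations (i) and (ii)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | ((a ^ b) & c_in)
    return s, c_out


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists (least significant bit first).

    The carry ripples from each full adder to the next one.
    Returns (sum_bits, carry_out).
    """
    carry = c_in
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width):
    """Integer to LSB-first bit list of the given width."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    sum_bits, carry = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
    print(from_bits(sum_bits), carry)
```

Because the loop processes one bit per iteration and each iteration needs the previous carry, the model mirrors the delay limitation noted above: the number of sequential steps grows with the bit length.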
Ci+1=Gi+Ci.ti (x)
ti = Xi + Yi (xi)
Si = xi xor yi (xii)
Ci = xi and yi (xiii)
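The carry recurrence in (x) and (xi) can be checked with a small behavioral sketch. It is written in Python for illustration only. The definition of the generate term as Gi = Xi · Yi is an assumption, since the surrounding text that defines Gi is not present here; the function names and LSB-first bit order are also hypothetical.

```python
def carry_chain(x_bits, y_bits, c0=0):
    """Carries C0..Cn from C(i+1) = Gi + Ci * ti, with ti = Xi + Yi.

    Gi = Xi AND Yi is assumed (generate term). Bits are LSB first.
    """
    carries = [c0]
    for x, y in zip(x_bits, y_bits):
        g = x & y                           # assumed generate term Gi
        t = x | y                           # ti = Xi + Yi, equation (xi)
        carries.append(g | (carries[-1] & t))  # equation (x)
    return carries


def add_with_carry_chain(x_bits, y_bits, c0=0):
    """Sum bits formed as Xi xor Yi xor Ci, plus the final carry."""
    carries = carry_chain(x_bits, y_bits, c0)
    sum_bits = [x ^ y ^ c for x, y, c in zip(x_bits, y_bits, carries)]
    return sum_bits, carries[-1]


def bits_of(value, width):
    """Integer to LSB-first bit list."""
    return [(value >> i) & 1 for i in range(width)]


def value_of(bits):
    """LSB-first bit list to integer."""
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    s, c = add_with_carry_chain(bits_of(9, 4), bits_of(7, 4))
    print(value_of(s), c)
```

This model evaluates the carries sequentially; a hardware carry lookahead adder would expand the same recurrence so that all carries are computed in parallel from the inputs.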
Carry Lookahead Adder    22.792    17
References
ABSTRACT: Organic Thin Film Transistors (OTFTs) have improved markedly in performance over the past few years and are becoming very attractive for a large range of applications such as oscillators, flexible display devices, small and large scale and even integrated optoelectronic devices. A transistor that uses an organic semiconductor as the active layer to manage electric current flow is known as an organic thin film transistor. For the last decade organic/polymeric materials have been extensively investigated for the substrate, conducting semiconductor layer, dielectric and contact electrodes of thin film transistor (TFT) devices. In an organic thin film transistor, the type of semiconductor, processing, doping and structure can affect the electrical characteristics. This paper presents new insight into the structure, organic materials, conduction mechanism and performance characteristics of OTFTs. Pentacene-based bottom and top contact structures have been modelled to characterise the adopted structures for organic transistors. The paper explores the current status of OTFTs in terms of various parameters such as contact resistance, effect of channel length, active layer thickness and on/off current ratio. Organic electronic products are lighter, more flexible and less expensive than their inorganic counterparts. They are also biodegradable, being made from carbon. This opens the door to many exciting applications that would be impossible using silicon. Since OTFTs provide simple and low-cost processes, their application to displays has been discussed.

Keywords: Bottom and Top Contact Structures of OTFTs, Contact Resistance, Mobility, Organic Materials, Organic Thin Film Transistors.

1. INTRODUCTION

Organic electronics has the potential to create a new range of devices, circuits and their applications. Some important applications are display drivers, advertising boards, smart cards, wall-sized televisions, identification tags, and portable products such as modern cell phones and video games [1]. Organic material based devices like the Organic Thin Film Transistor (OTFT), Organic Field Effect Transistor (OFET), Organic Light Emitting Diode (OLED) and Solar Cell have the advantages of low cost, flexibility and light weight over their inorganic counterparts. Organic semiconductors can be processed at low temperatures compatible with plastic substrates, whereas higher temperatures are required for alternative Si-based devices [2, 3]. Organic transistors can usually be manufactured at or near room temperature, unlike silicon-based transistors, which typically require fairly high process temperatures (>800ºC for crystalline Si transistors).
For simulation of OTFTs certain structures have been proposed. In order to enhance the device speed, considerable research effort has been devoted to increasing the mobility of organic materials by improving deposition conditions [4, 5]. As a result of this effort, mobility exceeding 1 cm²/V·s has been reported for pentacene [6], which is comparable to that of hydrogenated amorphous silicon (a-Si:H), and 0.1 cm²/V·s for poly(3-hexylthiophene) (P3HT) [7]. In addition to mobility, other ways of improving the performance of OTFTs, such as channel length scaling and active layer thickness, have also attracted considerable attention [8]. This paper first
2. OTFT STRUCTURES
semiconducting materials is around this level. Adding nickel on gold improves adhesion of the gold on the oxide. Platinum electrodes are inferior to gold electrodes. Aluminum shows slightly higher electron mobility (2.2 cm²/Vs) at room temperature in single crystals [15, 16].

effort has gone into the development of organic n-channel OTFTs because this allows the implementation of complementary circuits with low static power consumption [9, 18]. Table-2 gives mobility and current on/off ratio for some n-type semiconductors.
6. OTFT FOR DISPLAY DEVICES

Companies have developed a very diverse set of substrate, drive element and display mode technologies in order to realize flexible displays. The E-paper display market is expected to show a 46.9% annual average growth rate, from US$ 260 million in 2010 to US$ 2.1 billion in 2015 and US$ 7 billion in 2020. OTFTs can be used to make good LCD or E-paper displays, as these need a high on/off current ratio [29].

TABLE-3 DISPLAY APPLICATIONS WITH OTFTS WITH PENTACENE AS OSC [29]

App.    Specification                     Organization
OLED    4*4 pixel on PC                   NHK (Japan)
OLED    8*8 pixel on glass                Pioneer (Japan)
LCD     64*128 on plastic                 ERSO (Taiwan)
LCD     15 in. full color XGA on glass    Samsung (Korea)
LCD     1.4 in. 80*80 RGB on glass        Hitachi (Japan)

Table-3 summarizes the display prototypes using OTFTs: LCD (made with an OTFT matrix array) and active matrix organic light emitting diode (AMOLED) displays with dot matrix patterns. Organic/polymer LED displays have the potential to replace LCDs and become the next dominant force in flat panel displays because they require fewer fabrication steps and have lower material costs than LCDs [30].

7. CONCLUSION

Organic/polymer electronics is a very promising alternative to crystalline, polycrystalline and amorphous silicon processes. Moreover, there are no restrictions as to the dimensions of the device. It has been observed that with increasing permittivity of the gate insulator and thickness of the organic material, the mobility decreases in OTFTs. The effect of channel length has been discussed; long channel devices are relatively immune to high contact resistance. A top contact OTFT shows better field effect mobility than a bottom contact one, due to lower contact resistance.

It has been reported that the on/off current ratio is higher for short channel devices than for long channel devices. For memory and display applications a high on/off current ratio is a more important requirement than high mobility, and this ratio should be more than 10^8. In spite of numerous advantages such as large area coverage, structural flexibility and especially low cost, certain limitations associated with organic material based devices, like instability, lower carrier mobility, and shorter lifetimes, need to be resolved to commercialize OTFT-based applications.

REFERENCES

[1] M. Jamal Deen, “Plastic microelectronics with organic and polymeric thin film transistors,” Proc. 26th International Conference on Microelectronics, MIEL, 2008.
[2] Yoshiro Yamashita, “Organic semiconductors for organic field effect transistor,” Sci. Technol. Adv. Mater., vol. 10, pp. 024313, 2009.
[3] H. Klauk, D. J. Gundlach, and T. N. Jackson, “Fast organic thin-film transistor circuits,” IEEE Electron Device Lett., vol. 20, pp. 289-291, 1999.
[4] A. R. Brown, A. Pomp, C. M. Hart, and D. M. De Leeuw, “Logic gates made from polymer transistors and their use in ring oscillators,” Science, vol. 270, pp. 972-974, 1995.
[5] Y. Sun, Y. Liu and D. Zhu, “Advances in organic field-effect transistors,” J. Mater. Chem., vol. 15, pp. 53-65, 2005.
[6] Y. Y. Lin, D. J. Gundlach, S. F. Nelson, and T. N. Jackson, “Stacked pentacene layer organic thin film transistors,” IEEE Electron Device Lett., vol. 18, pp. 606–608, Dec. 1997.
[7] Z. Xie, M. Abdou, A. Lu, M. J. Deen, S. Holdcroft, “Electrical Characteristics of Poly (3-Hexylthiophene) Thin Film MISFETs,” Canadian J. of Physics, vol. 70, no. 10–11, pp. 1171-1177, 1992.
[8] O. Marinov, M. J. Deen, and R. Datars, “Compact modeling of charge mobility in organic thin-film transistors,” J. Appl. Phys., vol. 106, no. 6, pp. 064501-1–064501-13, Sep. 2009.
[9] H. Klauk, “Organic thin film transistor,” Chem. Soc. Rev., 39, pp. 2643-2666, 2010.
[10] O. Marinov, M. J. Deen, and B. Iniguez, “Charge transport in organic and polymer thin-film transistors: Recent issues,” Proc. Inst. Elect. Eng. Circuits Devices Syst., vol. 152, no. 3, pp. 189–209, Jun. 2005.
[11] N. Karl, “Charge Carrier Transport in Organic Semiconductors,” Synth. Met., vol. 649, pp. 133-134, 2002.
[12] R. A. Street and A. Salleo, “Contact effects in polymer transistors,” Appl. Phys. Lett., vol. 81, no. 15, pp. 2887, 2002.
[13] S. F. Nelson, Y. Y. Lin, D. J. Gundlach and T. N. Jackson, “Temperature independent transport in high mobility pentacene transistors,” Appl. Phys. Lett., vol. 72, no. 15, pp. 1854, 1998.
[14] F. Garnier, “Thin-Film Transistors Based on Organic Conjugated Semiconductors,” Chem. Phys., 227, 253, 1998.
[15] H. Klauk, D. J. Gundlach, M. Bonse, C. C. Kuo, and T. N. Jackson, “A reduced complexity process for organic thin film transistors,” Appl. Phys. Lett., 76, 1692, 2000.
[16] J. H. Schon, Ch. Kloc, and B. Batlogg, “On the intrinsic limits of pentacene field-effect transistors,” Organic Electronics, vol. 1, no. 57, 2000.
[17] C. Shekar, T. Lee and S. W. Rhee, “Organic thin film transistors, material, processes and devices,” Korean J. Chem. Engg., vol. 21, no. 1, pp. 267-287, 2004.
[18] G. Horowitz, “Organic field-effect transistors,” Adv. Mater., vol. 5, pp. 365-377, 1998.
[19] P. Stallinga and H. L. Gomes, “Modelling electrical characteristics of thin-film-field-effect transistor, I. Trap-free materials,” Synthetic Metals, 156, pp. 1305-1315, 2006.
[20] G. Horowitz, “Organic thin film transistors: From theory to real devices,” J. Mater. Res., vol. 19, no. 7, pp. 1946-1962, Jul. 2004.
[21] O. Marinov, M. J. Deen, and B. Iniguez, “Performance of organic thin film transistors,” J. Vac. Sci. Technol., vol. 24, no. 4, pp. 1728–1733, 2006.
[22] P. Necliudov, M. Shur, D. Gundlach, and T. Jackson, “Modeling of organic thin film transistors of different designs,” J. Appl. Phys., vol. 88, no. 11, pp. 6594–6597, Dec. 2000.
[23] D. De Leeuw, G. Gelinck, T. Geuns, E. Van Veenendaal, E. Cantatore, and B. Huisman, “Polymeric integrated circuits: fabrication and first characterization,” IEEE-IEDM, 2002, pp. 293–296.
[24] C. D. Dimitrakopoulos, S. Purushothaman, J. Kymissis, A. Calleggari and J. M. Shaw, “Low-Voltage Organic Transistors on Plastic Comprising High-Dielectric Constant Gate Insulators,” Science, vol. 283, no. 5403, pp. 822-824, February 5, 1999.
[25] C. D. Dimitrakopoulos, I. J. Kymissis, S. Purushothaman, D. A. Neumayer, P. R. Duncombe, and R. B. Laibowitz, “Low-Voltage, High-Mobility Pentacene Transistors with Solution-Processed High Dielectric Constant Insulators,” Adv. Mater., 11, 1372, 1999.
[26] C. D. Dimitrakopoulos and P. R. L. Malenfant, “Organic Thin Film Transistors for Large Area Electronics,” Adv. Mater., vol. 14, pp. 99-117, 2002.
[27] M. J. Deen, O. Marinov, Jianfei Yu, S. Holdcroft and W. Woods, “Low-frequency noise in polymer transistors,” IEEE Trans. on Electron Devices, vol. 48, no. 8, pp. 1688-1694, 2001.
[28] I. G. Hill, “Numerical simulations of contact resistance in organic thin-film transistors,” Appl. Phys. Lett., vol. 87, pp. 163505-1-163505-3, 2005.
[29] Jin Jang and S. H. Han, “High performance OTFT and its application,” Current Applied Physics, 6S1, pp. e17-e21, 2006.
[30] A. Afzali, C. D. Dimitrakopoulos and T. L. Breen, “High-performance, solution-processed organic thin film transistors from a novel pentacene precursor,” J. Am. Chem. Soc., vol. 124, pp. 8812, 2002.
[31] J. H. Schon, S. Berg, Ch. Kloc, and B. Batlogg, “Ambipolar pentacene field-effect transistors and inverters,” Science, vol. 287, pp. 1022, 2000.
CHARACTERIZATION OF 4T SRAM CELL
Setu Garg1, Prof.S.N.Sharan2, Garima Chandel3 Member IEEE, Hridesh Verma4
1GCET, Greater Noida, 2GNIT, Greater Noida, 3,4ABES IT Ghaziabad, India.
1gargsetu06@gmail.com, 2snathsharan@yahoo.com, 3garimachandel@rediffmail.com, 4hridesh.verma@gmail.com
ABSTRACT: The Static Random Access Memory discussed in this paper is based on a Four-Transistor SRAM cell. This paper focuses on important parameters, viz. Static Noise Margin (SNM) analysis and Bit Line Leakage current analysis, to characterize the Four-Transistor SRAM cell. The maximum allowable SNM needs to be investigated for efficient operation of the SRAM cell. The purpose of this analysis is to measure the SNM of the bit cell without flipping the cell contents. Bit line leakage current analysis is also done. The analysis involves the bit cell contribution to column leakage and the margin available for the sum of total cell leakage current in a long column. The performance and results have been validated through simulations using the ELDO tool from Mentor Graphics Corporation.

Index Terms – SRAM, Bit Line, Static Noise Margin, DC Source, Word Line

I. INTRODUCTION
Static random-access memory (SRAM) is a critical component across a wide range of microelectronics applications, from consumer appliances to high-end workstation and microprocessor applications. For almost all fields of application, semiconductor memory has been a key enabling technology. It is forecast that embedded memory in SoC designs will cover up to 90% of the total chip area. A representative example is the use of cache memory in microprocessors. The operational speed can be significantly improved by the application of on-chip cache memory that temporarily stores a fraction of the data and instruction content of the main memory.

represents a b-bit input/output (I/O-port). The I/O-port consists of b I/O-blocks, i.e. one block per bit of the output word. Each bit of the I/O-port can be connected to one out of 2^w bit lines by a 2^w-to-1 column or bit line multiplexer. Any SRAM cell can be accessed by an address word which is (h + b) bits long. This address is applied to the control logic block which controls all the memory operations, e.g. write, read, enable, data-in, data-out.

II. BASIC SRAM ARCHITECTURE

A typical static random access memory (SRAM) architecture is as shown in Figure 1. It consists of a matrix of memory cells arranged in an array of 2^N rows by 2^M columns. The total size of the memory array is 2^M x 2^N bits. During a read operation, one of the 2^N rows (word lines) is selected by the row address decoders by decoding the row addresses. All the memory cells in the given word line are enabled. The column decoder selects one of the 2^M columns and the value of the selected memory cell is read out by the sense amplifier. The data into and out of the memory array is controlled by the Read-Write control circuit.
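The row and column selection described for the memory array can be illustrated with a short sketch. It models only the address decoding step, in Python, for an array of 2^N rows by 2^M columns. The function name is hypothetical, and the choice to place the row bits in the most significant part of the address is an assumption, since the text does not state the bit ordering.

```python
def decode_address(address, n_row_bits, m_col_bits):
    """Split a cell address into (row, column) for a 2^N x 2^M array.

    Assumes the N row bits are the most significant bits and the
    M column bits are the least significant bits of the address.
    """
    total_bits = n_row_bits + m_col_bits
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address out of range for the array")
    row = address >> m_col_bits               # selects one of 2^N word lines
    col = address & ((1 << m_col_bits) - 1)   # selects one of 2^M columns
    return row, col


if __name__ == "__main__":
    # Hypothetical array: N = 3 row bits (8 rows), M = 2 column bits (4 columns).
    print(decode_address(0b10110, 3, 2))
```

Every address in the range maps to a distinct (row, column) pair, so an array with N row bits and M column bits holds 2^N x 2^M individually addressable cells.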
A. WRITE OPERATION
B. READ OPERATION
The read operation of the cell is different from that of the 6T cell. To read from the cell, the bit lines are precharged to ground instead of Vdd, and the word line voltage is set to Vdd to turn on the NMOS access transistors. The node with logic '1' stored will pull the voltage on the corresponding bit line up to a high level (not Vdd, because of the voltage drop across the NMOS access transistor). The other bit line remains at ground. The sense amplifier detects which bit line is at high voltage and which bit line is at ground.

Figure 3. 4T SRAM cell simulated structure

B. Method to calculate Static Noise Margin

To analyze the Static Noise Margin, introduce a DC noise source inside the SRAM cell and see where the cell flips. Put the WL (Word Line) at Vdd. Bit Line and Bit Line' (BL and BL') are connected to ground. Initialize Q' with Vdd and Q with 0. Now slowly increase VX from 0 and monitor points Q and Q' to find where the cell flips. The Static Noise Margin is measured to be 362.3279 mV.

C. Bit Line Leakage Current Analysis

The purpose of this analysis is to characterize the bit cell contribution to column leakage. The main purpose of this test is to see the margin available for the sum of total cell leakage currents in a long column (from unselected WLs) during a read operation. This simulation should be used as a guideline for designing the maximum number of physical rows in an SRAM array.

V. CONCLUSION

voltage supply ripple and thermal noise. The Static Noise Margin is measured to be 362.3279 mV.
For BLCC, the margin available for the sum of total cell leakage currents in a long column during a read operation is also examined.
The Bit Line Leakage Current for the 4T SRAM cell is measured to be 7.1441 pA. The objective is also to keep the bit line leakage current as low as possible.

VI. REFERENCES
Abstract—On-chip L1 and L2 caches represent a sizeable fraction of the total power consumption of microprocessors. In nanometer-scale technology, the subthreshold leakage power is becoming one of the dominant total power consumption components of those caches. In this study, we present optimization techniques to reduce the subthreshold leakage power of on-chip caches assuming that there are multiple threshold voltages, V_th's, available. First, we show a cache leakage optimization technique that examines the tradeoff between access time and subthreshold leakage power by assigning distinct V_th's to each of the four main cache components: address bus drivers, data bus drivers, decoders, and static random access memory (SRAM) cell arrays with sense amplifiers. Second, we show optimization techniques to reduce the leakage power of L1 and L2 on-chip caches without affecting the average memory access time. The key results are: 1) two additional high V_th's are enough to minimize leakage in a single cache (3 V_th's if we include a nominal low V_th for microprocessor core logic); 2) if L1 size is fixed, increasing L2 size can result in much lower leakage without reducing average memory access time; 3) if L2 size is fixed, reducing L1 size may result in lower leakage without loss of the average memory access time for the SPEC2K benchmarks; and 4) smaller L1 and larger L2 caches than are typical in today's processors result in significant leakage and dynamic power reduction without affecting the average memory access time.

Keywords—Microprocessor memory hierarchy, multiple threshold voltage, on-chip caches, SRAM, subthreshold leakage power.

I. INTRODUCTION
Until very recently, only dynamic power has been a significant source of power consumption, and Moore's law has helped to control it. Shrinking processor technology below 100 nm has allowed, and actually required, reducing the supply voltage to reduce dynamic power consumption. However, smaller geometries with a low-threshold voltage exacerbate leakage, so static power is beginning to dominate the power consumption equation [1]. For example, a 90-nm Pentium 4 consumes 110 W, and roughly 40% of the total power dissipation is consumed by leakage power [2]. The excessive heat dissipation by the leakage power in the high-end 90-nm Pentium 4 processor forced Intel Corporation to adopt more expensive power delivery, cooling, and packaging systems.

A potentially important source of this power dissipation is on-chip caches, because larger on-chip caches are being integrated onto the chip. For example, an Intel processor for server applications has 1 and 6 MB on-chip L2 and L3 caches, respectively; subthreshold leakage power is dissipated by all of the subbanks even if they are not accessed, while dynamic power is dissipated when a cache subbank is accessed. To alleviate this problem, transistors in caches could be designed for low subthreshold leakage, for example, by assigning them a higher threshold voltage or by controlling the V_th with adaptive body biasing or, if a better balance of speed and power is required, by employing dual V_th [3]–[7]. Traditionally, at most two V_th's (one low V_th and one high V_th) have been available in high-performance process technologies, allowing cache designers only limited flexibility for suppressing subthreshold leakage current. To further improve the subthreshold leakage, several circuit and microarchitectural techniques [8]–[13] have therefore been proposed targeted at the subthreshold leakage power reduction of L1 caches.

One consequence of the increasing importance of subthreshold leakage current is that the number of available V_th's in future process technologies will increase. Next-generation 65-nm processes are expected to support three V_th's (one low and two high V_th's) and future processes are likely to provide designers with even more choices. This increase provides new flexibility for subthreshold leakage power reduction methods, allowing new tradeoffs between the V_th's of different parts of a cache and between different levels in the cache hierarchy. The availability of additional V_th's suggests a new examination of the tradeoff between cache size and V_th to reduce power loss from subthreshold leakage current.

In this study, we present systematic techniques for assigning multiple V_th's to memory hierarchies to minimize power dissipation, in particular subthreshold leakage [14]. Based on our techniques, we provide a detailed quantitative tradeoff analysis between access time and subthreshold leakage power of on-chip caches as a function of the number and the strength of V_th. Although the qualitative trends of subthreshold leakage power versus access time tradeoff are well known, this paper provides a detailed quantitative analysis to determine the optimal number of V_th's for given design constraints and to justify the cost of extra V_th's. First, we examine optimal leakage power dissipation for various access times in on-chip SRAM caches, when more than one high V_th is available. Then, we show how many high V_th's are needed, in addition to a nominal V_th required for the processor's general logic circuits and how much V_th should be increased for effective leakage power reduction for
TABLE I
CACHE ORGANIZATIONS FOR EACH CACHE SIZE

Fig. 2. Leakage power dissipation of the 7 × 128, 8 × 256, and 9 × 512 decoders.
Fig. 3. Delay time of 7 × 128, 8 × 256, and 9 × 512 decoders.

(4)
where , and are constants derived using the same technique as that used for the leakage power models.

The rest of the cache components show the same delay trend characteristics as the decoder case of Fig. 3. Hence, the same curve-fitting technique can be applied for those components to derive approximated delay time models as functions of V_th like (5). The coefficients for all the components in (5) can be found in Table III.

Once all of the approximated delay time models for each component are extracted for a specific cache size, total delay or access time of the cache can be approximated as a sum of the delay times of all the cache components. Assuming that we can apply four distinct V_th's, the analytic approximated equation for the access time (AT) is

(6)
where , and represent the V_th's for address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays, respectively. Each exponential term corresponds to the delay time of one of the four components.

We also define baseline caches in which the V_th of all the cache components is set to a low V_th (0.2 V). Fig. 4 shows the access time and the leakage power of the baseline caches. The cache access time grows logarithmically and the leakage power increases linearly with the cache size. Those trends agree with those of earlier studies on SRAM design. In Fig. 4, we assume a direct-mapped cache organization and consider only the leakage power of data arrays, disregarding the leakage of the tag comparators and other cache control logic.

(7)

4 was around 2 in submicrometer technology, but it has been decreased to about 1.3 in the current generation deep-submicrometer technology.

Fig. 5. Normalized optimum LP and V_th versus normalized AT of 512-KB caches: schemes I and II.
Fig. 6. Normalized optimum LP and V_th versus normalized AT of 512-KB caches: schemes I and III.
Fig. 7. Normalized optimum LP and V_th versus normalized AT: scheme IV.
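The tradeoff that these models support, namely lowering leakage by raising V_th while the summed component delay stays within an access-time limit, can be illustrated with a small numerical sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the two-component split ("array" and "peri", loosely in the spirit of using two distinct high V_th's), the functional forms (leakage falling and delay rising exponentially with V_th), all coefficient values, and the brute-force grid search used in place of the paper's own optimization.

```python
import math
from itertools import product

# Hypothetical curve-fit coefficients, chosen only for illustration.
# leakage(vth) = a0 * exp(-a1 * vth); delay(vth) = b0 + b1 * exp(b2 * vth)
COMPONENTS = {
    "array": {"a0": 800.0, "a1": 12.0, "b0": 0.20, "b1": 0.05, "b2": 3.0},
    "peri": {"a0": 200.0, "a1": 12.0, "b0": 0.30, "b1": 0.10, "b2": 4.0},
}
VTH_LOW, VTH_MAX, STEP = 0.20, 0.50, 0.01  # volts


def leakage(name: str, vth: float) -> float:
    c = COMPONENTS[name]
    return c["a0"] * math.exp(-c["a1"] * vth)


def delay(name: str, vth: float) -> float:
    c = COMPONENTS[name]
    return c["b0"] + c["b1"] * math.exp(c["b2"] * vth)


def totals(v_array: float, v_peri: float):
    """Return (total leakage, total access time) for one V_th assignment."""
    lp = leakage("array", v_array) + leakage("peri", v_peri)
    at = delay("array", v_array) + delay("peri", v_peri)
    return lp, at


def optimize(at_limit: float):
    """Grid-search the (array, peri) V_th pair with least leakage under at_limit."""
    steps = round((VTH_MAX - VTH_LOW) / STEP)
    grid = [round(VTH_LOW + i * STEP, 2) for i in range(steps + 1)]
    best = None
    for v_array, v_peri in product(grid, grid):
        lp, at = totals(v_array, v_peri)
        if at <= at_limit + 1e-12 and (best is None or lp < best[0]):
            best = (lp, v_array, v_peri)
    return best


# Baseline: every component at the low V_th, as in the baseline caches.
BASE_LP, BASE_AT = totals(VTH_LOW, VTH_LOW)

if __name__ == "__main__":
    for pct in (100, 120, 150, 200):
        lp, v_array, v_peri = optimize(BASE_AT * pct / 100)
        print(pct, round(lp / BASE_LP, 3), v_array, v_peri)
```

Running the script prints, for each normalized access-time limit, the normalized leakage and the chosen V_th pair; with real fitted coefficients the same loop would trace out curves of the kind plotted against normalized AT.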
the SRAM cell array, denoted as array in the graph, starts to increase first. This implies that the SRAM cell array is responsible for the most significant fraction of total cache leakage power, but it has the least impact on increasing the total cache access time. After the V_th of the SRAM cell arrays is saturated to the maximum allowed point (0.5 V), the V_th of the peripheral components labeled as peri in the graph is increased further to reduce further leakage power in the peripheral components. However, this just increases the access time without much further cache leakage reduction. For example, the leakage power is not decreased over the 215% access time point where the V_th for the peripheral circuit has not reached the maximum value (0.5 V) in this 512-KB cache case.

These leakage power and V_th versus access time trends also explain the leakage optimization results shown in Fig. 5: scheme II shows a better optimization result than scheme I does when the normalized access time is less than 155%, but it does not beyond the 155% access time point. Recall that scheme I assigns a high V_th to all the cache components. It sacrifices more access time unnecessarily by increasing the V_th of the peripheral components with little leakage reduction at the same access time. However, scheme II assigns the high V_th to just the SRAM cell arrays that are responsible for a greater fraction of total cache leakage power but affect access time less. However, scheme II cannot reduce leakage power beyond the 155% access time point, because the leakage power of the peripheral components, where a low V_th is used, becomes substantial beyond this point.

Fig. 7 shows the normalized optimum leakage power and V_th versus normalized access time trends for a 512-KB cache of scheme IV. The optimum leakage power and the V_th's are obtained again using (7) and (8) of Section III-A. In scheme IV, we can assign up to 4 distinct V_th's for leakage power optimization. According to the results shown in Fig. 7, the V_th of the 6T-SRAM cell arrays starts to increase first, similar to the scheme III case. Among the peripheral components, the V_th for the data bus starts to increase first. This implies that the data bus consisting of 128 b (the assumed bus width between the L2 and L1 caches) has the second most significant impact on the leakage power. Even though the address bus has the same structure, the number of bits in the address bus is much smaller than the data bus. Hence, the leakage power impact of the address bus is much less than that of the data bus. However, in the case of smaller caches (e.g., 16–64 KB caches) where the data bus width is 32 b, both the data and address bus have almost the same impact on the leakage power. Therefore, the V_th trends for both the data and address buses will be the same. These trends suggest the direction of optimizations that reduce cache leakage power.

Table IV summarizes the normalized cache leakage power of schemes I–IV. As expected, we can reduce more leakage power
while achieving the same access time by having more V_th's to control. If the access time is fixed, the caches of schemes III and IV always show 38%–72% better leakage optimization results than those of scheme I. There are a few things we should note from this comparison study. First, as the target access time is increased to more than the 150% point in scheme II, caches dissipate more leakage power than those employing scheme I. This implies that the cache peripheral components consume nonnegligible leakage power. The leakage power of those components becomes substantial when we cut down the leakage power of the 6T-SRAM arrays significantly. Second, the slowest cache access time of scheme II ends around 150% in small-size caches. This means that the peripheral components also play important roles in both cache leakage power and access time. In other words, increasing the V_th of 6T-SRAM cell arrays alone gives us diminishing returns at some point without reducing the leakage power further. This is why the caches of scheme I give even better results than those of scheme II as V_th increases. Finally, there is a negligible difference between caches of schemes III and IV in terms of leakage power reduction. This implies that scheme III employing two distinct high V_th's (three V_th's if we include a nominal or low V_th for the processor) is enough to minimize leakage. Finally, as illustrated in Figs. 5–7 and Table IV, each cache shows a wide range of optimal leakage power consumption depending on target access time. Hence, the right tradeoff point between the leakage power and the access time of the caches will be determined by either system design specifications or constraints.

TABLE V
CACHE DYNAMIC ENERGY CONSUMPTION PER ACCESS AND LEAKAGE POWER DISSIPATION AT 70 °C DIE TEMPERATURE AND A TYPICAL CORNER FOR EACH CACHE SIZE

IV. LEAKAGE OPTIMIZATION TECHNIQUES FOR TWO-LEVEL CACHES

A. Methodology

In a processor memory system, the average memory access time (AMAT) [18] is a key metric for measuring the overall memory system performance. To evaluate the performance or AMAT, it is essential to examine the cache miss characteristics of realistic applications, because the performance or AMAT is a function of L1 and L2 cache miss rates and cache access times. In our study, we assume that the memory system hierarchy consists of separate L1 instruction and data caches with a unified L2 cache. Then, the average performance of the processor memory system can be measured or compared with the AMAT represented by

$$\mathrm{AMAT} = \mathrm{HitTime}_{L1} + \mathrm{MissRate}_{L1} \times \left(\mathrm{HitTime}_{L2} + \mathrm{MissRate}_{L2} \times \mathrm{MissPenalty}_{L2}\right) \qquad (9)$$

where HitTime_L1 and HitTime_L2 are the access times of L1 and L2 caches, MissRate_L1 and MissRate_L2 are the miss rates of L1 and L2 caches, and MissPenalty_L2 is the external memory access and data transfer time. Note that the local miss rate (footnote 5) is used as the MissRate_L2.

Similarly, we measure the average memory access energy (AMAE) to compare the dynamic energy dissipation of each memory system configuration. Assuming that the L1 cache is accessed every cycle, the AMAE represents the average energy dissipation per access in the entire microprocessor memory system that includes L1, L2, and main memory. We can estimate average memory access energy as follows:

(10)

where Hit Energy is the average energy dissipation per access given in Table V. We assume a two-channel 1066-MHz 256-MB RAMBUS DRAM RIMM whose sustained transfer rate is 4.2 GB/s [19] to derive the main memory access time and dynamic energy dissipation per access. Though the sustained transfer rate is quite high, we should also consider the RAS/CAS latency of the memory, which is about 20 ns. For the energy dissipation per access, we used the number given in [20], which is 3.57 nJ per access. The dynamic energy dissipation per access can vary depending on the number of RIMMs. We assume that one RIMM is installed. See Section IV-B and note that more RIMMs are favorable for our optimization technique, because our technique prefers a larger L2 cache to a smaller one for leakage power reduction. The larger L2 cache accesses DRAM less frequently than the smaller one, resulting in less energy consumption for accessing the external DRAM. Hence, if more RIMM modules are installed, implying more energy dissipation per DRAM access, a larger L2 cache will allow even more energy to be saved.

Footnote 5: This rate is simply the number of misses in a cache divided by the total number of memory accesses to this cache.

TABLE VI
AVERAGE L1 AND L2 CACHE MISS RATES FROM THE ENTIRE SPEC2K BENCHMARKS

Fig. 8. L2 leakage power optimization at a fixed L1 size (16 KB). (1) and (2) are the leakage power consumption of the 256- and 512-KB caches at the same AMAT as the baseline 128-KB cache, respectively.

To obtain L1 and L2 cache miss rates, we use the SimpleScalar/Alpha 3.0 tool set [21], which is a suite of functional and timing simulation tools for the Alpha AXP ISA. In addition, we collected the results from all 25 of the SPEC2K benchmarks [15] to perform our evaluation. All SPEC programs were compiled for a Compaq Alpha AXP-21264 processor using the Compaq C and Fortran compilers under the OSF/1 V4.0 operating system using full compiler optimizations. We completed the execution for each benchmark application to get reliable L2 cache miss rates, because L2 cache accesses are far less frequent than L1 cache accesses; an insufficient number of L2 accesses may result in unrepresentatively higher L2 cache miss rates.
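As a worked illustration of the AMAT metric, the sketch below evaluates the standard two-level form, AMAT = HitTime_L1 + MissRate_L1 × (HitTime_L2 + MissRate_L2 × MissPenalty_L2), with the local L2 miss rate. The `amae` function is an assumption: it simply mirrors the AMAT structure with energies and is not claimed to be the paper's equation (10). All numeric inputs are hypothetical except the 3.57 nJ main-memory access energy quoted in the text.

```python
def amat(hit_l1: float, miss_l1: float, hit_l2: float,
         miss_l2_local: float, miss_penalty: float) -> float:
    """Average memory access time for separate L1 and unified L2 caches.

    miss_l2_local is the local L2 miss rate: L2 misses / L2 accesses.
    """
    return hit_l1 + miss_l1 * (hit_l2 + miss_l2_local * miss_penalty)


def amae(e_l1: float, miss_l1: float, e_l2: float,
         miss_l2_local: float, e_mem: float) -> float:
    """Average memory access energy (assumed form, by analogy with amat)."""
    return e_l1 + miss_l1 * (e_l2 + miss_l2_local * e_mem)


if __name__ == "__main__":
    # Hypothetical configuration: times in ns, energies in nJ.
    t = amat(hit_l1=1.0, miss_l1=0.05, hit_l2=5.0,
             miss_l2_local=0.20, miss_penalty=100.0)
    e = amae(e_l1=0.1, miss_l1=0.05, e_l2=0.5,
             miss_l2_local=0.20, e_mem=3.57)
    print(f"AMAT = {t:.3f} ns, AMAE = {e:.4f} nJ")
```

Because the L2 terms are scaled by the L1 miss rate, the main-memory penalty contributes to the result only through the product of both miss rates.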
Table VI shows the average L1 and L2 cache miss rates from the entire SPEC2K benchmarks for 16-, 32-, and 64-KB L1 caches, respectively. We used direct-mapped L1 instruction caches and four-way set associative L1 data caches. Also, we used eight-way set associative L2 caches. For simplicity, each L1 cache miss rate is obtained by taking the sum of the number of total instruction and data cache misses and dividing by the sum of total instruction and data cache accesses; a 16-KB L1 means instruction and data caches are each 16 KB in size. Since an L2 miss rate is a function of the L1 cache miss rate, we measure the separate L2 cache miss rates for each L1 cache size configuration. Those cache miss characteristics will definitely affect the leakage optimization direction of two-level cache memory systems.

B. L2 Cache Leakage Power Optimization

Since an L2 cache's contribution to leakage power dominates due to its size, we will examine the leakage power optimization of the L2 cache first. Consider caches designed with low-V_th (0.2 V) devices and a baseline cache memory system consisting of 16 and 128 KB for L1 and L2 caches, respectively. Then, we have leakage power consumption and AMAT corresponding to this configuration. Increasing the V_th of the 128-KB L2 cache will reduce the leakage power of the L2 cache, but it will increase the AMAT of the cache memory system because of the increased access or hit time. However, there is a way to reduce the leakage power of the cache memory system without increasing the AMAT, which significantly impacts the execution time of the system.

The key to reducing leakage power without increasing AMAT is to compensate for the increased L2 access time by reducing the cache miss rate of the cache memory system. To reduce the miss rate, we can increase the L2 cache size. The main memory access penalty is quite significant in terms of both time and energy. Hence, even a slight reduction of L2 cache miss rates results in a significant improvement in the AMAT. We note that although area was one of the most important design constraints in the past, this trend is changing and power is becoming an equally important constraint in many situations [22]. In this argument, we assume that the same AMAT will approximately give us the same execution time for a fixed processor core, L1 cache size, and benchmark program, so that we can fairly compare the total leakage energy consumption as well.

Fig. 8 shows the leakage power versus AMAT of L2 caches with a fixed L1 cache size (16 KB). The leakage power optimization for individual caches is based on scheme III, which requires two additional distinct high V_th's for L2. Assuming the AMAT of the fastest 128-KB L2 cache designed with low V_th (0.2 V) as a baseline, we compare the leakage power of other caches at the same AMAT point; see the (1) and (2) points in Fig. 8. The (1) and (2) points are the leakage power consumption of the cache system with the 256- and 512-KB caches at the same AMAT as the baseline 128-KB cache system. As can be seen from the plots, the AMAT can be maintained while the leakage power can be reduced by replacing the baseline 128-KB L2 cache with a 256-KB L2 cache that is intentionally slowed down by increasing its V_th's to reduce leakage.

This replacement with the double-sized L2 cache reduces the leakage power by 70% compared to the fastest but leakiest 128-KB L2 cache with the same AMAT. Similarly, the use of a 512-KB L2 cache can further reduce leakage compared to the 256-KB cache; see the vertical line in Fig. 8.

Finally, the employment of larger L2 caches also reduces the average dynamic power of the memory system, because the larger L2 caches reduce the number of external memory accesses that consume a significant amount of dynamic energy. Table VII summarizes the results for the normalized leakage power and normalized average memory access energy for each L1 cache size designed using scheme III at a fixed AMAT. To compare leakage power and AMAE, the following standard cache configurations were used: 128-KB L2 with 16-KB L1, 256-KB L2 with 32-KB L1, and 512-KB L2 with 64-KB L1. The shaded numbers represent the baseline L2 configuration, leakage power, and AMAE. Table VII shows the counterintuitive results that we can reduce both leakage power and AMAE by employing larger L2 caches while maintaining the same AMAT.

TABLE VII
L2 CACHE NORMALIZED LEAKAGE AND AMAE AT THE FIXED L1 SIZE (16 KB) AND AMAT

TABLE VIII
L1 CACHE NORMALIZED LEAKAGE AND AMAE AT THE FIXED L2 SIZE (512 KB) AND AMAT
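The compensation argument of Section IV-B can be made concrete by solving the two-level AMAT expression for the largest L2 hit time that still meets a target AMAT. The sketch below is illustrative only: the function name `max_l2_hit_time` and every numeric value are hypothetical, and the miss rates are not taken from Table VI.

```python
def max_l2_hit_time(amat_target: float, hit_l1: float, miss_l1: float,
                    miss_l2_local: float, miss_penalty: float) -> float:
    """Largest L2 hit time that keeps AMAT at or below amat_target.

    Obtained by rearranging
    AMAT = hit_l1 + miss_l1 * (hit_l2 + miss_l2_local * miss_penalty).
    """
    return (amat_target - hit_l1) / miss_l1 - miss_l2_local * miss_penalty


if __name__ == "__main__":
    # Hypothetical baseline: 5 ns L2 hit time, 20% local L2 miss rate.
    baseline_amat = 1.0 + 0.05 * (5.0 + 0.20 * 100.0)
    # Hypothetical larger L2 whose local miss rate drops to 15%.
    budget = max_l2_hit_time(baseline_amat, hit_l1=1.0, miss_l1=0.05,
                             miss_l2_local=0.15, miss_penalty=100.0)
    print(f"baseline AMAT = {baseline_amat:.2f} ns, "
          f"allowed L2 hit time = {budget:.2f} ns")
```

In this made-up case the lower miss rate lets the larger L2 be twice as slow at the same AMAT, and that extra hit-time budget is what a higher V_th would spend to cut leakage.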
REFERENCES
[1] N. S. Kim et al., “Leakage current: Moore's law meets static power,” IEEE Computer, vol. 36, no. 12, pp. 68–75, Dec. 2003.
[2] G. Sery, S. Borkar, and V. De, “Life is CMOS: Why chase life after?,” in Proc. IEEE Design Automation Conf., 2002, pp. 78–83.
[3] S. Mutoh et al., “1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS,” IEEE J. Solid-State Circuits, vol. 30, no. 8, pp. 847–854, Aug. 1995.
[4] T. Douseki, N. Shibata, and J. Yamada, “A 0.5–1 V MTCMOS/SIMOX SRAM macro with multi-Vth memory cells,” in Proc. IEEE Int. SOI Conf., 2000, pp. 24–25.
[5] K. Nii et al., “A low power SRAM using auto-backgate-controlled MT-CMOS,” in Proc. IEEE Int. Symp. Low Power Electronic Device, 1998, pp. 293–298.
[6] H. Mizuno et al., “An 18-μA standby current 1.8-V, 200-MHz microprocessor with self-substrate-biased data-retention mode,” IEEE J. Solid-State Circuits, vol. 34, no. 11, pp. 1492–1500, Nov. 1999.
[7] F. Hamzaoglu et al., “Analysis of dual-V_T SRAM cells with full-swing single-ended bit line sensing for on-chip cache,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 10, no. 2, pp. 91–95, Apr. 2002.
[8] M. Powell et al., “Gated-V_dd: A circuit technique to reduce leakage in deep-submicron cache memories,” in Proc. IEEE Int. Symp. Low Power Electronics & Design, 2000, pp. 90–95.
[9] A. Agarwal, L. Hai, and K. Roy, “A single-V_t low-leakage gated-ground cache for deep submicron,” IEEE J. Solid-State Circuits, vol. 38, no. 2, pp. 319–328, Feb. 2003.
[10] N. S. Kim et al., “Drowsy instruction caches,” in Proc. IEEE Int. Symp. Microarchitecture, 2002, pp. 219–230.
[11] S. Yang et al., “An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance I-caches,” in Proc. IEEE Int. Symp. High-Performance Computer Architecture, 2001, pp. 147–157.
[12] S. Kaxiras et al., “Cache decay: Exploiting generational behavior to reduce cache leakage power,” in Proc. IEEE Int. Symp. Computer Architecture, 2001, pp. 240–251.
[13] H. Zhou et al., “Adaptive mode-control: A static-power-efficient cache design,” in Proc. IEEE Parallel Architecture and Compilation Tech., 2001, pp. 61–70.
[14] N. S. Kim et al., “Leakage power optimization techniques for ultra deep sub-micron multi-level caches,” in Proc. IEEE Int. Conf. Computer Aided Design, 2003, pp. 627–632.
[15] Standard Performance Evaluation Corporation [Online]. Available: http://www.specbench.org
[16] S. Wilton et al., “An Enhanced Access and Cycle Time Model for
[18] J. Hennessy et al., Computer Architecture: A Quantitative Approach, 3rd ed. San Mateo, CA: Morgan Kaufmann, 2003, pp. 406–408.
[19] 800/1066 MHz RDRAM Advanced Information (2002). [Online]. Available: http://www.rambus.com
[20] V. Delaluz et al., “Compiler-directed array interleaving for reducing energy in multi-bank memories,” in Proc. IEEE Asia South Pacific Design Automation Conf., 2002, pp. 288–293.
[21] T. Austin et al., “SimpleScalar: An infrastructure for computer system modeling,” IEEE Computer, vol. 35, no. 2, pp. 59–67, Feb. 2002.
[22] T. Mudge, “Power: A first class design constraint,” IEEE Computer, vol. 34, no. 4, pp. 52–57, Apr. 2001.
Abstract—Reliability has been important in many applications, and it has become convenient to provide as the size and cost of chips have been reduced drastically. Reliability in electronic circuits is achieved through fault tolerance, where the system itself is able to tolerate the fault and mask the error. Fault tolerance in circuits is achieved by various redundancy methods (hardware, software, information, and time), but these redundancy methods are different for analog and digital systems, so in this paper we discuss the important methods for making analog and digital circuits fault tolerant. Digital fault tolerant design is explained with majority and minority voting, together with how faults are injected into the circuits for testing using VHDL. Analog fault tolerant design is explained with the help of fuzzification. The platform used for digital circuits is Xilinx-12.4i (ISE) and for analog is MATLAB.

Index Terms: Triple modular redundancy, Majority & Minority, Voter Fuzzification.

I. INTRODUCTION

During the lifetime of a system it is tested and diagnosed on numerous occasions. For the system to perform its intended mission with high availability, testing and diagnosis must be quick and effective. A sensible way to ensure this is to specify testing as one of the system functions, in other words, self-test. Reliability, availability, and safety (RAS) are the major factors for consideration in system design to provide continuous correct operation [1]. Since faults cannot be completely eliminated, critical systems always employ fault tolerance techniques to guarantee high reliability and availability. Fault tolerance (FT) techniques try to keep the system operational despite the presence of faults [2]. FT can be achieved through hiding the occurrence of faults and preventing them from generating errors (fault-masking), or through fault detection and fault repairing.

There are various methods to make a system fault tolerant, but the most basic is the TMR method, where the module which has to be made reliable is made redundant by taking three identical modules in parallel in both hardware and software, so the reliability of the system increases as it can give the right output even on failure of one module.

The basic block diagram of the TMR system is shown in Figure 1.

Figure 1

Here the most important part is the voting unit, which plays an important role in the reliability of the system. As the results in analog systems and digital systems are different, this voting unit plays a distinct part in each of these systems. Another point is that the voting unit is not redundant here, so what happens if it fails? These questions are discussed in this paper.

The paper is organized as follows. In Section II, we make a short review of the most common fault tolerant technique, with the mathematical expression showing how reliability is increased, as this is the basic method for both digital and analog systems. Section III describes fault tolerance in the digital circuit environment and how faults are injected in FPGA circuits for testing. In Section IV, the fault tolerant technique for analog systems is discussed with the help of fuzzy logic. The discussion of the results for both analog and digital circuits is provided in Section V. Finally, the future work and scope are explained in Section VI.

II. TRIPLE MODULAR REDUNDANCY

The basic block diagram of the TMR system has been shown above. Let the reliability of a single module be R_m. The TMR system gives the correct output if either two or three modules perform the correct operation, so if the reliability of the system is R_S, then

$$R_S = \binom{3}{2} R_m^2 (1 - R_m) + \binom{3}{3} R_m^3 (1 - R_m)^0 = 3R_m^2 - 2R_m^3$$
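The TMR reliability expression, R_S = 3R_m^2 - 2R_m^3, can be checked numerically. The sketch below sums the binomial terms for "two or three modules working" and compares the result with the closed form; it assumes independent module failures and a perfect voter, and the function names are illustrative.

```python
from math import comb


def tmr_reliability(r_m: float) -> float:
    """Probability that at least 2 of 3 independent modules work.

    Assumes a perfect voter; r_m is the reliability of one module.
    """
    return sum(comb(3, k) * r_m**k * (1 - r_m) ** (3 - k) for k in (2, 3))


def tmr_closed_form(r_m: float) -> float:
    """Closed form of the same sum: 3*R_m^2 - 2*R_m^3."""
    return 3 * r_m**2 - 2 * r_m**3


if __name__ == "__main__":
    for r_m in (0.4, 0.5, 0.9):
        r_s = tmr_reliability(r_m)
        print(f"R_m = {r_m:.2f}  R_S = {r_s:.4f}  improves: {r_s > r_m}")
```

The loop shows the crossover at R_m = 0.5: below it, triplication makes the system less reliable than a single module, and above it, more reliable.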
The reliability of the above system will be greater than that of a single module if

$$3R_m^2 - 2R_m^3 > R_m$$

So the reliability of the overall system will be greater than that of the single module if R_m > 0.5. There is a single voter unit in the above circuit, so if this voter unit fails the complete circuit will fail; it is therefore important to consider the reliability of the voting unit also.

Let the reliability of voter unit is , so the reliability of triplicated TMR will be greater than TMR system if:
Or 2
2 >3-2
These are the mathematical conditions for a triplicated system to be more reliable than the TMR system.

be compared using minority voter circuits (Figure 3). The minority voters also take in three inputs: the primary path and the two other redundant paths in question. If the primary path is in the majority with one other redundant path, then the output is low. If the primary path is in the minority in comparison with the two other redundant paths, then the output is high. Figure 4 below is a schematic and truth table of the minority voting circuit.
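The voting behaviour described above can be written down for single-bit signals. The sketch below is a behavioural Python model rather than the VHDL used in the paper; the function names are illustrative. The minority voter follows the stated rule: low when the primary path agrees with at least one redundant path, high when it disagrees with both.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority vote on single-bit inputs."""
    return (a & b) | (b & c) | (a & c)


def minority(primary: int, r1: int, r2: int) -> int:
    """Return 1 only when the primary path disagrees with both redundant paths."""
    return int(primary != r1 and primary != r2)


if __name__ == "__main__":
    print("primary r1 r2 | majority minority")
    for p, r1, r2 in product((0, 1), repeat=3):
        print(f"   {p}     {r1}  {r2} |    {majority(p, r1, r2)}        {minority(p, r1, r2)}")
```

For single bits the minority output equals the primary input XOR the majority output, so a high output flags that the primary path was outvoted.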
Figure 5

The final output y will be calculated on the basis of the weights of the voter inputs as:

Here the value of the weights will lie in the range [0, 1], where 0 means that the particular module is in complete disagreement with the other modules, while 1 means that the module is in complete agreement with the other modules. The membership of the difference of input pairs [8] has been defined as:

The membership of output w has been defined as:

So the weight will be calculated with the fuzzy rules as:

V. RESULTS

The results for digital circuits are as follows:

Implementation Results:
The basic circuit used to describe reliability is an ALU; a single module of the ALU, a triple module of the ALU, and a triplicated TMR of the ALU have been implemented. The tables below show how much area is utilized on the FPGA board in terms of slices/LUTs.

The faults have been injected into the circuit by adding an extra component to the actual circuit so that the logic of the circuit is changed; this is known as the SABOTEUR METHOD.

Circuit Implementation without TMR:

XUPV5-LX110T, Speed Grade-3    Used    Available    Utilization
Number of Slice LUTs           46      69,120       1%
Number of BUFG/BUFGCTRLs       1       32           3%
Number of occupied Slices      22      17,280       1%
Number of bonded IOBs          53      640          8%

Circuit Implementation with TMR:

Number of BUFG/BUFGCTRLs       1       32           3%
Number of occupied Slices      27      17,280       1%
Number of bonded IOBs          53      640          8%
Circuit Implementation with Triplicated TMR:

Number of Slice LUTs           49      69,120       1%
Number of BUFG/BUFGCTRLs       1       32           3%
Number of occupied Slices      28      17,280       1%
Number of bonded IOBs          107     640          16%

XUPV5-LX110T, Speed Grade-3    Without TMR    With TMR    With Triplicated TMR
Utilized area (slices)         22             27          28
Maximum path delay (ns)        5.143          5.618       5.150

Both graphs show that the improved fuzzy logic gives better results than the existing fuzzy logic, even in the presence of larger errors, as shown in the second graph.
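The saboteur method mentioned in the implementation results, in which an extra component is inserted into a signal path to alter the logic, can be illustrated behaviourally. The sketch below is a Python stand-in for the VHDL flow and rests on several assumptions: the 4-bit `alu` is a toy module, the saboteur models a single stuck-at bit, and the voter is a bitwise 2-of-3 majority. None of these names or choices are taken from the paper's design.

```python
from typing import List, Optional, Tuple


def alu(op: str, a: int, b: int) -> int:
    """Toy 4-bit ALU standing in for the module under test (assumption)."""
    ops = {"add": a + b, "and": a & b, "or": a | b, "xor": a ^ b}
    return ops[op] & 0xF


def saboteur(value: int, stuck_bit: Optional[int] = None, stuck_at: int = 0) -> int:
    """Pass the signal through unchanged, or force one bit when a fault is enabled."""
    if stuck_bit is None:
        return value
    mask = 1 << stuck_bit
    return (value | mask) if stuck_at else (value & ~mask)


def bitwise_majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority applied independently to every bit."""
    return (a & b) | (b & c) | (a & c)


def tmr_alu(op: str, a: int, b: int, faulty_module: Optional[int] = None,
            stuck_bit: int = 0, stuck_at: int = 1) -> Tuple[int, List[int]]:
    """Run three ALU copies, optionally sabotage one, and vote on the outputs."""
    outputs = []
    for module in range(3):
        out = alu(op, a, b)
        if module == faulty_module:
            out = saboteur(out, stuck_bit, stuck_at)
        outputs.append(out)
    return bitwise_majority(*outputs), outputs


if __name__ == "__main__":
    voted, raw = tmr_alu("add", 5, 6, faulty_module=1, stuck_bit=2, stuck_at=1)
    print("module outputs:", raw, "voted output:", voted)
```

With a fault injected in one module, the voted output still matches the fault-free ALU result, which is the masking behaviour TMR is meant to provide.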
represents a number (-1)^s × 0.f × 2^(-126), where s is the sign bit and f is the fraction. For double precision, denormalized numbers are of the form (-1)^s × 0.f × 2^(-1022). From this you can interpret zero as a special type of denormalized number.

iii) Infinity
The values +∞ and -∞ are denoted with an exponent of all 1s and a fraction of all 0s. The sign bit distinguishes between negative infinity and positive infinity. Being able to denote infinity as a specific value is useful because it allows operations to continue past overflow situations. Operations with infinite values are well defined in IEEE floating point.
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = -Infinity

iv) Not A Number
The value NaN (Not a Number) is used to represent a value that does not represent a real number. NaN's are represented by a bit pattern with an exponent of all 1s and a non-zero fraction.
0 11111111 00000100000000000000000 = NaN
1 11111111 00100010001001010101010 = NaN
There are two categories of NaN: QNaN (Quiet NaN) and SNaN (Signalling NaN).
a) QNaN is a NaN with the most significant fraction bit set. QNaN's propagate freely through most arithmetic operations. These values pop out of an operation when the result is not mathematically defined.
b) SNaN is a NaN with the most significant fraction bit clear. It is used to signal an exception when used in operations. SNaN's can be handy to assign to uninitialized variables to trap premature usage.
Semantically, QNaN's denote indeterminate operations, while SNaN's denote invalid operations.

Summary:
The value V represented by the word may be determined as follows:
If E=255 and F is nonzero, then V=NaN ("Not a number")
If E=255 and F is zero and S is 1, then V= -Infinity
If E=255 and F is zero and S is 0, then V= Infinity
If 0<E<255 then V=(-1)^S * 2^(E-127) * (1.F), where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.
If E=0 and F is nonzero, then V=(-1)^S * 2^(-126) * (0.F). These are "unnormalized" values.
If E=0 and F is zero and S is 1, then V= -0
If E=0 and F is zero and S is 0, then V= 0

0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = -0
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = -Infinity
0 11111111 00000100000000000000000 = NaN
1 11111111 00100010001001010101010 = NaN
0 10000000 00000000000000000000000 = +1 * 2^(128-127) * 1.0 = 2
0 00000001 00000000000000000000000 = +1 * 2^(1-127) * 1.0 = 2^(-126)
0 00000000 10000000000000000000000 = +1 * 2^(-126) * 0.1 = 2^(-127)
0 00000000 00000000000000000000001 = +1 * 2^(-126) * 0.00000000000000000000001 = 2^(-149) (Smallest positive value)

Special Operations
Operations on special numbers are well-defined by IEEE. In the simplest case, any operation with a NaN yields a NaN result. Other operations are as follows:

Table 1
Special Operations in floating point
Operation                  Result
n ÷ ±Infinity              0
±Infinity × ±Infinity      ±Infinity
±nonzero ÷ 0               ±Infinity
Infinity + Infinity        Infinity
±0 ÷ ±0                    NaN
Infinity – Infinity        NaN
±Infinity ÷ ±Infinity      NaN
±Infinity × 0              NaN

Double Precision
The IEEE double precision floating point standard representation requires a 64 bit word, whose bits may be numbered from 0 to 63, left to right. The first bit is the sign bit, S, the next eleven bits are the exponent bits, 'E', and the final 52 bits are the fraction 'F'.

The value V represented by the word may be determined as follows:
If E=2047 and F is nonzero, then V=NaN ("Not a number")
If E=2047 and F is zero and S is 1, then V= -Infinity
If E=2047 and F is zero and S is 0, then V= Infinity
If 0<E<2047 then V=(-1)^S * 2^(E-1023) * (1.F), where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.
If E=0 and F is nonzero, then V=(-1)^S * 2^(-1022) * (0.F). These are "unnormalized" values.
If E=0 and F is zero and S is 1, then V= -0
If E=0 and F is zero and S is 0, then V= 0

II WORKING PRINCIPLE
A MY CHIP
The chip consists of 2 unidirectional buses, each 32 bits wide, to accommodate the input and the output. It has a 2 bit address bus for selecting the desired register in the chip.
Signal description:
1) r/w (read/write) signal to perform the read or write operation. A high indicates a read operation and a low indicates a write operation.
2) rst (Reset) signal to reset the chip contents.
3) Int (Interrupt) signal to interrupt the processor about some abnormality in the functioning of the chip.

Fig. 1 Block Diagram of My Chip

Control Register:
IE   X   X   Mode   X   Op2   Op1   Op0
Fig. 2 Control Register format.

The flags of the control register are defined as:
IE stands for Interrupt Enable. When this flag is low (0) no interrupt is generated, and when this flag is high (1) an interrupt is generated under certain conditions.
Mode flag is low when the chip is being used for signed operation and high when it is being used for floating point operation.
X represents a don't care condition.
The operation to be performed by the chip is selected using the last three bits of the control register.

Table 2
Opcodes for various operations.
Op2   Op1   Op0   Operation selected
0     0     0     Addition
0     0     1     Subtraction
0     1     0     Multiplication
0     1     1     Division

Status Register
F1F   F2F   RF   NAN   OF   UF   DE   Z
Fig. 3 Status Register Format

The flags of the status register are defined as:
F1F flag is high when operand 1 is loaded on the chip.
F2F flag is high when operand 2 is loaded on the chip.
RF flag when high indicates the completion of the selected operation by the chip.
NAN flag is high when the content of the result register is wrong, i.e. a NaN (not a number) condition has been encountered.
OF flag is high when the content of the result register exceeds the higher bound limit.
UF flag is high when the content of the result register crosses the lower bound limit or when a denormalized number is encountered.
DE flag is high when a division by zero (0) error occurs.
Z flag is high when the result of the operation is zero.

Register Mapping
Table 3
Access codes for registers in my_chip module
Read/write   Address Bus   Register
X            00            F1
X            01            F2
1            10            RES
0            10            Control Register
X            11            Status Register

When the address bus is loaded with 00, register F1 is port mapped for a read or write operation. The mapping of register F2 has been done using address bus code 01.
The address bus has been optimized for the code 10, where the RES register is mapped only for the read operation and the control register only for the write operation.
The status register has been port mapped for address bus code 11.

B FLOATING POINT ARITHMETIC OPERATION
B.1 Addition & Subtraction:
Addition and subtraction are performed using module fp_ads. Steps to perform the addition & subtraction operation are:
Step 1: Check which exponent is bigger and shift the mantissa of the smaller number until the difference between the two exponents is reached. If the exponents are equal, then the mantissas of both numbers are checked for the bigger one.
Step 2: Add the exponents of the two numbers if the sign bits of both numbers are the same; otherwise subtract them. The same operation is performed with the mantissas of the input operands.
Step 3: The abnormality of negative exponents is resolved by shifting the required number of bits to get the correct result. To see whether the result has encountered an overflow error, boundary conditions are checked.

B.2 Multiplication
Multiplication of floating point numbers is done by using module fp_mul. Steps for the multiplication operation are as follows:
Step 1: When we multiply two numbers having the same base, their powers are added. Similarly, here we add the exponents of the two operands.
Step 2: The Booth multiplication (shift and add) technique is employed to multiply the mantissas of the two numbers along with the 'hidden bit'. The mantissa multiplication result is saved in a 49 bit temporary register.
Step 3: The negative exponent abnormality is removed to get the mantissa and exponent of the resultant number.

B.3 Division
The division operation is performed using module fp_div, utilizing the fixed point division technique. Steps to divide p by q, both of n+1 bits, are as follows:
Step 1: Store the numbers p & q in temporary registers p_temp & q_temp of 2n+1 bits each, respectively.
Step 2: Compare the values of p_temp & q_temp.
If p_temp >= q_temp, subtract q_temp from p_temp, store 1 in the quotient register and move to the next iteration.
If p_temp < q_temp, store 0 in the quotient register and move to the next iteration.
Step 3: After n+1 iterations the quotient is saved in the quotient register and the remainder is saved in p_temp.

At this point, all the contents of the chip registers are erased and the chip is ready afresh for a new calculation/computation.
Step 2: To load the first operand onto the chip, the register mapping is done by making the read/write signal low and loading the address bus with 00.
Step 3: To load the second operand onto the chip, the read/write signal is kept low while the status of the address bus is changed to 01 for the required register mapping.
Step 4: To select the operation to be performed, the last three bits of the control register are taken into account while the address bus indicates 11 and the read/write signal is low. Refer to Table 2 for the opcodes of the various operations.
Step 5: A start signal is generated by checking the F1F and F2F flags of the status register to commence the selected operation while the address bus shows 10 and the read/write signal is low.
Step 6: The completion of the operation is confirmed by the status of the RF flag of the status register, which should be high for successful completion of the operation, while the read/write signal is high and the address bus indicates 10.
Step 7: The result of the arithmetic operation is viewed by checking the dataout signal while the read/write signal is high and the address bus indicates 11.
Step 8: The previously entered input values can be viewed by keeping the read/write signal high while keeping the address bus at 00 for operand 1 and 01 for operand 2.

IV RESULTS AND DISCUSSIONS
A ADDITION
Table 5.
SIMULATION EXAMPLE FOR FP ADDITION.
       Base 10    Sign Bit   Exponent Bits   Mantissa Bits                   HEX Equivalent
F1     4444.44    0          1000 1011       0001 0101 1100 0111 0000 101    458AE385
F2     5555.56    0          1000 1011       0101 1011 0011 1000 1111 010    45AD9C7A
RES    10000      0          1000 1100       0011 1000 1000 0000 0000 000    461C3FFF
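The hexadecimal words in Table 5 can be cross-checked in software by applying the single precision summary rules (sign S, exponent E, fraction F) directly. The following Python sketch is an illustration only; the function name decode_float32 is hypothetical and is not part of the chip design.

```python
import math


def decode_float32(word):
    """Decode a 32-bit word with the single precision rules from the summary."""
    s = (word >> 31) & 0x1          # sign bit S
    e = (word >> 23) & 0xFF         # 8 exponent bits E
    f = word & 0x7FFFFF             # 23 fraction bits F
    sign = -1.0 if s else 1.0
    if e == 255:                    # all-ones exponent: NaN or infinity
        return math.nan if f else sign * math.inf
    if e == 0:                      # denormalized value or signed zero: 0.F * 2^-126
        return sign * (f / 2**23) * 2.0**-126
    return sign * (1 + f / 2**23) * 2.0**(e - 127)   # normalized: 1.F * 2^(E-127)


for name, word in (("F1", 0x458AE385), ("F2", 0x45AD9C7A), ("RES", 0x461C3FFF)):
    print(name, hex(word), decode_float32(word))
```

Decoded this way, F1 and F2 are 4444.43994140625 and 5555.5595703125, and their exact sum 9999.99951171875 lies halfway between the two neighbouring single precision values 0x461C3FFF and 0x461C4000. The tabulated result 0x461C3FFF (9999.9990234375) is therefore consistent with the low-order bits being truncated rather than rounded to nearest even.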
B SUBTRACTION
Fig. 5: Floating Point Subtraction Simulation
TABLE 6:
SIMULATION EXAMPLE FOR FP SUBTRACTION.
C MULTIPLICATION
D DIVISION
REFERENCES
[1] Digital System Design Using VHDL by Charles H. Roth
Jr.
[2] The Design Warrior’s Guide to FPGA by Clive ‘Max’
Maxfield.
[3] FPGA Based System Design by Wayne Wolf.
[4] A VHDL Primer by Jayaram Bhaskar.
[5] Circuit Design With VHDL by Volnei A. Pedroni.
Abstract— A complete CMOS based low power supply bandgap voltage reference circuit implemented on the TSMC 0.35μm CMOS process is presented in this paper. The designed circuit employs a start-up circuit, a beta-multiplier circuit (PTAT circuit) and a MOS based differential amplifier. This circuit provides a nominal reference voltage of 323 mV at 2 V supply voltage. Experimental results show that the temperature coefficient is 1.16 ppm/ºC in the temperature range from -20 ºC to +90 ºC. The value of PSRR achieved without any filtering capacitor is -21 dB at 10 kHz. The area occupied by the design is 0.027 mm² and the power consumption is 62.24 μW at room temperature (25 ºC).

Keywords— Bandgap voltage reference, PTAT, CMOS, PSRR.

1. INTRODUCTION
The high-precision voltage reference circuit is an important component in mixed-mode applications. A stable reference circuit provides a reliable reference voltage, and low supply voltage makes the integration with low voltage analog and digital circuits possible. Such reference circuits should exhibit little dependence on process, supply voltage, and temperature variations (PVT). With steadily decreasing power supply voltages in deep submicron CMOS technologies, the design of any voltage/current reference on-chip becomes a non-trivial task. Numerous approaches to achieve a voltage reference with low supply voltage drift as well as low temperature drift have been proposed to date. But most of them have used BJT devices implemented in a standard CMOS process to implement reference circuits [1-3], which occupy a large wafer area. Moreover, some of the implementations using a non-standard CMOS process require higher cost owing to extra process steps [4-5].
This paper presents a complete MOS based bandgap voltage reference circuit with the same general working principle of positive and negative temperature coefficient voltages nullifying each other to give a reference voltage with a temperature coefficient close to zero, along with a suitable technique to minimize the power supply dependence of this reference voltage [6].
The major parts of the circuit are a start-up circuit, a beta-multiplier circuit made up of NMOS and PMOS current mirror circuits, and a differential amplifier to enhance the power supply rejection capability of the reference voltage.
Section II describes the proposed voltage reference circuit design along with the detailed description of its subparts, viz. the start-up circuit, the beta-multiplier circuit and the differential amplifier circuit.
Section III illustrates the experimental results.
Section IV concludes the paper.
2. Proposed Reference Circuit

MU3 turns off. This is very important since the start-up circuit should not obstruct the normal operation of the beta-multiplier circuit (which is explained in the next subsection).
will give a negative temperature coefficient voltage. These two opposite temperature coefficient voltages will give a reference voltage of very small temperature coefficient value. Mathematically, the reference voltage can be expressed as,

Figure 2: Differential Amplifier Circuit

TABLE 1: Component Values Of Proposed Reference Circuit
Component    Values
Figure 3: Reference Voltage Versus Temperature Curve

Table 2: Performance Summary of the Proposed Design

Parameter                          [7]                 [8]                 This Work
Technology (μm)                    0.6                 0.5                 0.35
Supply Voltage                     1.4 V               2.6 V               2 V
Reference Voltage                  0.309 V             1.21 V              0.323 V
Temperature Coefficient (ppm/ºC)   36.9                613                 1.16
PSRR                               -47 dB at 100 Hz    -30 dB at 100 Hz    -21 dB at 10 kHz
Active Area (mm²)                  0.055               0.045               0.027

Figure 4: Reference Voltage under the three corner conditions

4. CONCLUSIONS
A high precision temperature insensitive voltage reference circuit has been presented in this paper. The circuit was designed using TSMC 0.35μm CMOS technology and experimental results were illustrated. They show that the proposed circuit can provide a stable reference voltage of 323 mV within the temperature range -20 ºC to +90 ºC with a power supply rejection value of -21 dB at a frequency of 10 kHz. The proposed reference circuit provides a stable reference voltage having very small temperature drift. Such a circuit can be used for applications which require a stable voltage reference, such as MEMS based temperature sensors and low dropout regulators.

REFERENCES
[1] Karel E. Kuijk, “A Precision Reference Voltage Source”, IEEE Journal Of Solid-State Circuits, Vol. SC-8, No. 3, June 1973, pp. 222-226.
[2] Allen, P.E. & Holberg, D.R. (2002). “CMOS Analog Circuit Design”. New York: Oxford.
[3] Matthew C. Guyton and Hae-Seung Lee, MIT, “Bandgap Current Reference”, March 2003.
[4] Lee, I., Kim G., & Kim, W. (1994) “Exponential curvature compensated BiCMOS bandgap reference”, IEEE Journal Of Solid-State Circuits, 29, 1396-1403.
[5] Malcovati, P., Maloberti, F., Fiocchi, C., Pruzzi, M. (2001). “Curvature-compensated BiCMOS bandgap with 1-V supply voltage”, IEEE Journal Of Solid-State Circuits, 36(7), 1076-1081.
[6] Allen-Holberg, “CMOS Analog Circuit Design”, Second Edition.
[7] Stair, R., Connelly, J.A., & Pulkin M. (2000) “A Current Mode CMOS Voltage Reference”. In proceedings of Southwest Symposium on Mixed-Signal Design (pp. 23-26).
[8] Kimberly Jane S. Udy, Patricia Angela Reyes-Abu and Wen Yaw Chung, “A High Precision Temperature Insensitive Current And Voltage Reference Generator”. In proceedings of World Academy Of Science, Engineering and Technology 2009.
Abstract: We study field-effect transistors based on individual single- and multi-wall carbon nanotubes and analyze their performance. Transport through the nanotubes is dominated by holes, and by varying the gate voltage we successfully modulated the conductance of a single-wall device by more than 5 orders of magnitude. Multi-wall nanotubes typically show no gate effect.
Keywords: Carbon nanotubes, Semiconductor, Singlewall Nanotube, Multiwall Nanotube, FET
The SWNTs used in our study were produced by laser ablation of graphite doped with cobalt and nickel catalysts [7]. For cleaning, the SWNTs were ultrasonically treated in an H2SO4/H2O2 solution. MWNTs were produced by an arc-discharge evaporation technique [8] and used without further treatment. The NTs were dispersed by sonication in dichloroethane and then spread on a substrate with predefined electrodes. A schematic cross section of a NT device is shown in Fig. 1.

They consist of either an individual SWNT or MWNT bridging two electrodes deposited on a 140 nm thick gate oxide film on a doped Si wafer, which is used as a back gate. The 30 nm thick Au electrodes were defined using electron beam lithography. For imaging, we used an atomic force microscope operating in the noncontact mode.

The source–drain current I through the NTs was measured at room temperature as a function of the bias voltage VSD and the gate voltage VG. Figure 2 a shows the output

The behavior is similar to that of a p-channel metal oxide semiconductor FET [9]. The source–drain current decreases strongly with increasing gate voltage, which not only demonstrates that the NT device operates as a field-effect transistor but also that transport through the semiconducting SWNT is dominated by positive carriers (holes).

The conductance modulation of our SWNT-FET exceeds 5 orders of magnitude. For VG < 0 V, the I–VG curves saturate, indicating that the contact resistance RC at the metal electrodes starts to dominate the total resistance R = RNT + 2RC of the device. Here, RNT denotes the gate-dependent resistance of the NT. The saturation value of the current corresponds to RC ≈ 1.1 MΩ. Similar contact resistances were previously found for metallic SWNTs [4]. The origin of the holes is an important question to address. One possibility is that the carrier concentration is inherent to the NT.
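The saturation argument, R = RNT + 2RC with RC of about 1.1 MΩ, can be illustrated numerically. The Python sketch below assumes purely ohmic behaviour; the bias voltage and the nanotube resistances fed into it are hypothetical values chosen for illustration and are not measurements from this work.

```python
R_C = 1.1e6  # contact resistance per electrode in ohms (value quoted in the text)


def total_resistance(r_nt):
    """Series model from the text: R = R_NT + 2 * R_C."""
    return r_nt + 2.0 * R_C


def drain_current(v_sd, r_nt):
    """Ohmic estimate of the source-drain current (assumption: linear I-V)."""
    return v_sd / total_resistance(r_nt)


V_SD = 0.1  # hypothetical bias voltage in volts
for r_nt in (1e11, 1e9, 1e7, 1e5, 0.0):   # hypothetical gate-dependent tube resistances
    print(f"R_NT = {r_nt:8.1e} ohm -> I = {drain_current(V_SD, r_nt):.3e} A")
```

As RNT falls well below 2RC, the current stops growing and approaches VSD / (2RC), which is the saturation of the I–VG curves described above.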
per 250 carbon atoms in the NT. For comparison, in graphite there is only 1 hole per 10^4 atoms [11]. The large hole density suggests that the NT is degenerate and/or that it is doped with acceptors, for example, as a result of its processing [12].
References :
which may facilitate in identifying the hot spots/zones in a VLSI chip. In the continuous domain we have used the concept of a unit sphere model to calculate the local thermal effect at a point due to the heat being dissipated from several point heat sources distributed over the chip plane. We establish that a point on a chip can become very hot due to the conduction effects of other heat sources, although it may not have a heat source in its immediate vicinity. In this model, the heat loss due to radiation has been ignored. If it is to be considered, an appropriate heat loss function has to be incorporated.

amount of heat from S received within the unit sphere centered at the point T. This unit is the same as that of the distance between S and T, and may be related to the minimum dimension of the chip. The cumulative heat received at the point T is evaluated as the linear superposition of the amounts received at T from all heat-generating sources on the chip.
As illustrated with Fig. 2, let a heat source at a point S generate an amount Q, henceforth denoted as the strength of the source S. Let the target point T be at a Euclidean distance d from S. Let CT and CS intersect at the two points A and B.
Then the area cut out on the surface of the sphere CS is equal to the product of the solid angle with its vertex at the center of the sphere CS and the square of the sphere's radius:
A = Ω r² …. (1)
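One way to turn the unit sphere picture into numbers is sketched below in Python. It rests on assumptions that are not stated in the surviving text: CS is taken to be the sphere of radius d centered at S (so that T lies on it), CT is the unit sphere centered at T, each source radiates isotropically, and the heat credited to T is the share of Q crossing the cap of CS that lies inside CT. The function names and the source coordinates are hypothetical, and this is not claimed to be the expression used in the paper.

```python
import math


def received_fraction(d):
    """Share of an isotropic source's output crossing the cap of C_S inside C_T.

    Assumes C_S has radius d (centered at S) and C_T has radius 1 (centered at T).
    """
    if d < 0.5:
        raise ValueError("C_S and C_T do not intersect for d < 0.5")
    half_angle = 2.0 * math.asin(1.0 / (2.0 * d))          # cone half-angle at S
    solid_angle = 2.0 * math.pi * (1.0 - math.cos(half_angle))
    return solid_angle / (4.0 * math.pi)                    # equals 1 / (4 d^2)


def heat_at(target, sources):
    """Linear superposition of the contributions of (x, y, Q) point sources."""
    tx, ty = target
    total = 0.0
    for sx, sy, q in sources:
        d = math.hypot(sx - tx, sy - ty)
        total += q * received_fraction(d)
    return total


sources = [(1.0, 0.0, 8.0), (0.0, 2.0, 16.0)]   # hypothetical (x, y, strength) sources
print(heat_at((0.0, 0.0), sources))
```

Under these assumptions the cap area of equation (1) is Ω d² = π for every d, so each source contributes Q / (4 d²) and a point with no source of its own can still accumulate a large total from several neighbours.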
active source points, but some points away from the source were found to be much hotter than the sources themselves. The randomness of the source did not affect the result much. One important aspect we have observed in all the models is that there are zones in the chip which become much hotter even without containing a heat source. We conclude that it may not be enough to guard only the active regions to make the chip thermally stronger. This also calls for more efficient power and thermal management techniques.

References
[1] S. Borkar. Design challenges of technology scaling. IEEE Micro, pp. 23–29, Jul.–Aug. 1999.
[2] G. Roos, B. Hoefflinger, M. Schubert, and R. Zingg, “Manufacturability of 3D-epitaxial-lateral-overgrowth CMOS circuits with three stacked channels,” Microelectron, 1991.
[3] R. Mahajan. Thermal management of CPUs: A perspective on trends, needs and opportunities, Oct. 2002. Keynote presentation, THERMINIC-8.
[4] Y.K. Cheng and S.M. Kang, “An Efficient Method for Hot-spot Identification in ULSI Circuits”, Proc. of IEEE Int. Conf. on Computer Aided (ICCAD), pp. 124-127, 1999.
[5] “Solid Angle”, on the Wikipedia, the free encyclopedia website.
[6] T. Sherwood, E. Perelman, and B. Calder. Basic block distribution analysis to find periodic behavior and simulation points in applications. In Proc. PACT, Sept. 2001.
[7] SIA. International Technology Roadmap for Semiconductors, 2001.
[8] S. Gunther, F. Binns, D. M. Carmean, and J. C. Hall. Managing the impact of increasing microprocessor power consumption. Intel Tech. J., Q1 2001.
[9] Fig. 1 from: K. Skadron, S. Velusamy, K. Sankaranarayanan and D. Tarjan, “Temperature-Aware Microarchitecture”. Published in the Proceedings of the 30th International Symposium on Computer Architectures, June 9–11, 2003 in San Diego, California, USA.
Abstract:- In the present IT age, we are in need of a fully automatic system for remotely controlling and monitoring appliances. This paper mainly focuses on remotely controlling industrial and home appliances and making efficient utilisation of the power supply [1]. This system is SMS based, using GSM (Global System for Mobile Communication), and uses a wireless technology. It provides a solution to the problem faced by home owners when they forget to switch off their home appliances while going out of the home. It is one of the emerging new applications of GSM technology. It is of great use for efficient utilisation of power in industry and for cutting down the electric bill. Here we present a design of a stand-alone embedded system that can monitor and control different appliances installed at industries and homes using built-in input and output peripherals. Basically, this system allows the home owner and industry owner to control and monitor their appliances remotely via mobile phone by sending a command in the form of an SMS message and receiving the current status of the appliances. The software used for simulation is Eclipse with a Java runtime environment.

Keywords- GSM, SMS, Signal Processing and Embedded System.

I. INTRODUCTION

The objective of this paper is to control home appliances remotely and reduce power wastage by providing a cost effective solution. The motivation was to make it possible for users to automate their homes with universal access. The home appliances control system was intended to be built at a reasonable cost and to be mobile, providing remote access to the appliances. There was a need to automate home and industry so that the user can take advantage of the advancement in such a way that a person leaving the office does not have to suffer from the hot climate. The motive of this paper is to propose a system that allows the user to control home appliances universally via SMS using GSM technology and to make efficient utilisation of the power supply. A design and implementation of SMS based control for monitoring systems is proposed in [2]. That paper has three modules: a sensing unit for monitoring the complex applications, a processing unit that was a microcontroller, and a communication module that used a GSM module or cell phone. Primary health-care management for the rural population is explored in [3]. Providing PHC services to the rural population by the use of mobile web technologies was proposed in the paper [3]. That system involves the use of SMS and cell phone technology for information management, transactional exchange and personal communication. Internet and wireless communications have been utilized in home automation [6-8].
In this paper, I have tried to implement a method in which an acknowledgement from the receiver can be received without any additional cost. It is beneficial from the user's point of view to receive feedback from the receiver.
system communicates with the GSM Module, but there is no communication if the run fails.

The system checks support for battery level, signal strength, GSM Module and other components by its SMS sending and receiving capability. If these tests succeed the system gives a response of 'Ok'; if they fail, 'ERROR' is returned. The remote user sent an SMS with the security code (as defined in the program code) from a cell phone to the home appliances control system to turn on/off the specified appliance, and the system performed the respective function by simulating the appliance on/off as directed by the user.

Appliances        SMS sent by User        System Response                     Feedback Message (current status)
Air conditioner   AC on / AC off          AC button simulated to on/off       AC on / AC off
Light             Light on / Light off    Light button simulated to on/off    Light on / Light off
Fan               Fan on / Fan off        Fan button simulated to on/off      Fan on / Fan off

Fig. Results of home appliances control system with feedback response [1].

Achieved analytical results:-

The system allowed the provision of security such that it took no action on instructions received by SMS without the security code or from an unregistered number. The required task was performed only when an SMS with the correct security code instructed the system.

The remote controlling capability of the system allowed the user to switch appliances on/off and check their status through simulating the appliance as directed by the incoming SMS.

The system automatically performed tests, checked support for available features, hardware and SMS sending and receiving capability, and configured the system accordingly.

The program code is written in a high level language such as C or C++; the compiler converts it into machine code, which is stored in the microcontroller. The software used is Eclipse with a Java runtime environment. The code is transferred from the computer to the microcontroller with the help of the USB port, a USBtiny and an RS232 device. The tool used to upload the code is AVRDUDE. The program code can be edited and compiled using the Eclipse software. The sender and receiver GSM numbers and the security code are defined in the program code.

VI. CONCLUSION

In this paper a low cost, secure, universally accessible, remotely controlled solution with feedback for the automation of homes has been introduced [1]. The target of achieving control over home appliances remotely using the SMS-based system is possible with this system. The GSM technology based solution has proved to be remotely controllable, provides home automation and is cost-effective, as it can reduce the electric bill by efficient utilisation of the home appliances. The appliances are used only when they are required. It is of great use for industrial appliances also. Hence we can conclude that the required objectives and goals of the home appliances control system have been achieved.

VII. FUTURE DIRECTION

The basic level of home appliance control and remote monitoring with feedback has been implemented. In case of remote
monitoring other home appliances can also be monitored and controlled such that if the temperature rises above a certain level the system should generate an SMS, or sensors can be applied that can detect gas, smoke or fire, so that in case of emergency the system will automatically generate an SMS.
In future the system will be a small box containing the microcontroller and GSM Module with a reduced size.

Appliances Monitoring & Control System IEEE. Pp.237

6) Liang, Li-Chen Fu and Chao-Lin W, “An integrated, flexible, and Internet-based control architecture for home automation system in the Internet era”, The IEEE Proceedings of the International Conference on Robotics and Automation, Volume: 2, 2002, pp: 1101-1106.
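The command handling reported in the results (no action without the security code or from an unregistered number, and a status feedback message otherwise) can be sketched as follows. This is an illustration in Python rather than the microcontroller code described in the paper; the message layout "code appliance state", the phone number, the security code and the function name are all hypothetical.

```python
REGISTERED_NUMBER = "+10000000000"   # hypothetical registered sender number
SECURITY_CODE = "1234"               # hypothetical security code
appliances = {"AC": "off", "LIGHT": "off", "FAN": "off"}


def handle_sms(sender, text):
    """Return the feedback SMS text, or None when the message is ignored."""
    parts = text.strip().upper().split()
    if sender != REGISTERED_NUMBER or len(parts) != 3:
        return None                              # unregistered sender or malformed SMS
    code, name, state = parts
    if code != SECURITY_CODE:
        return None                              # missing or wrong security code
    if name not in appliances or state not in ("ON", "OFF"):
        return None                              # unknown appliance or command
    appliances[name] = state.lower()             # simulate the appliance button
    return f"{name} {appliances[name]}"          # feedback with the current status


print(handle_sms(REGISTERED_NUMBER, "1234 AC on"))
print(handle_sms("+19999999999", "1234 FAN on"))
```

Returning None models the system taking no action, so an ignored message neither changes an appliance nor triggers a feedback SMS.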
Abstract - Low power consumption and smaller area are some of the most important criteria for the fabrication of DSP systems and high performance systems. Optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so that improving speed mostly results in larger area. In this paper, we try to determine the best solution to this problem by comparing a few multipliers. The paper presents an efficient implementation of a high speed multiplier using the shift-and-add method and the Radix_2 and Radix_4 modified Booth multiplier algorithms. We compare the working of the three multipliers by implementing each of them separately in a transversal FIR filter.

Index Terms - Transversal FIR Filter, Booth algorithms, VHDL, Xilinx.

1. INTRODUCTION

the benefit of constant operation speed irrespective of the size of the multiplier. The clock speed is only determined by the digit size, which is already fixed before the design is implemented.

2. THE BASIC TRANSVERSAL FILTER

An N-tap transversal filter was assumed as the basis for this adaptive filter. The value of N is determined by practical considerations. An FIR filter was chosen because of its stability. The use of the transversal structure allows relatively straightforward construction of the filter, as shown in figure 1.
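As a rough illustration of the transversal structure and of the coefficient update W_{k+1} = W_k − μ e_k X_k used in this section, the following Python sketch adapts an N-tap filter towards a known target response. It is only a sketch under stated assumptions: the error is taken as filter output minus desired value (so that the minus sign in the update reduces the error), and the names `fir_output`, `lms_update`, `identify` and the target coefficients are hypothetical, not taken from the paper.

```python
import random

def fir_output(weights, window):
    """Output of an N-tap transversal filter: sum of w_i * x[k-i]."""
    return sum(w * x for w, x in zip(weights, window))

def lms_update(weights, window, desired, mu):
    """One update W_{k+1} = W_k - mu * e_k * X_k.

    Assumption: e_k = output - desired, so the minus sign moves the
    weights in the direction that reduces the squared error.
    """
    error = fir_output(weights, window) - desired
    new_weights = [w - mu * error * x for w, x in zip(weights, window)]
    return new_weights, error

def identify(target, n_samples=2000, mu=0.1, seed=0):
    """Adapt a filter until it mimics `target` (a hypothetical unknown system)."""
    rng = random.Random(seed)
    n_taps = len(target)
    weights = [0.0] * n_taps
    window = [0.0] * n_taps  # most recent input sample first
    for _ in range(n_samples):
        window = [rng.uniform(-1.0, 1.0)] + window[:-1]
        desired = fir_output(target, window)
        weights, _ = lms_update(weights, window, desired, mu)
    return weights
```

Calling `identify([0.5, -0.25, 0.125])` should return weights close to the target coefficients; each update costs one multiply-accumulate per tap, which is why the multiplier dominates the filter hardware.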
The gradient search algorithm was selected to simplify the filter design. The filter coefficient update equation is given by:

W_{k+1} = W_k − μ e_k X_k

where X_k is the filter input at sample k, e_k is the error term at sample k = p_k . y_k, and μ is the step size for updating the weight values.

3. MULTIPLIERS

3.1. Binary Multiplier

A binary multiplier is an electronic hardware device used in digital electronics, a computer or another electronic device to perform rapid multiplication of two numbers in binary representation. It is built using binary adders.
The rules for binary multiplication can be stated as follows:
(i) If the multiplier digit is a 1, the multiplicand is simply copied down and represents the product.
(ii) If the multiplier digit is a 0, the product is also 0.
A multiplier circuit needs circuitry to do the following:
It should be capable of identifying whether a bit is 0 or 1.
It should be capable of shifting the partial products left.
It should be able to add all the partial products to give the product as the sum of the partial products.
It should examine the sign bits. If they are alike, the sign of the product is positive; if they are opposite, the product is negative. The sign bit of the product, determined by this criterion, should be displayed along with the product.
From the above discussion we observe that it is not necessary to wait until all the partial products have been formed before summing them. In fact, the addition of a partial product can be carried out as soon as it is formed.

Binary multiplication (e.g. n = 4):
p = a × b
a = a_{n−1} a_{n−2} … a_1 a_0
b = b_{n−1} b_{n−2} … b_1 b_0
p = p_{2n−1} p_{2n−2} … p_1 p_0
where a is the multiplicand, b the multiplier and p the product.

        x x x x    a
        x x x x    b
        -------
        x x x x    b0·a·2^0
      x x x x      b1·a·2^1
    x x x x        b2·a·2^2
  x x x x          b3·a·2^3
  ---------------
  x x x x x x x x  p

3.2. Multiply Accumulate Circuit

Multiplication followed by accumulation is a common operation in many digital systems, particularly highly interconnected ones such as digital filters, neural networks, data quantizers, etc. One typical MAC (multiply-accumulate) architecture is illustrated in the figure. It consists of multiplying two values, then adding the result to the previously accumulated value, which must then be stored back in the registers for future accumulations. Another feature of a MAC circuit is that it must check for overflow, which might happen when the number of MAC operations is large. This design can be built from components, because each of the units shown in the figure has already been designed. However, since it is a relatively simple circuit, it can also be designed directly. In any case the MAC circuit, as a whole, can be used as a component in applications like digital filters and neural networks.

3.3. Architecture of a Radix 2^n Multiplier

The architecture of a radix 2^n multiplier is given in Figure 2. This block diagram shows the multiplication of two numbers with four digits each. These numbers are denoted V and U, and the digit size was chosen as four bits. The reason for this will become apparent in the following sections. Each circle in the figure corresponds to a radix cell, which is the heart of the design. Every radix cell has four digit inputs and two digit outputs. The input digits are also fed through the corresponding cells. The dots in the figure represent latches for pipelining. Every dot consists of four latches. The ellipses represent adders, which are included to calculate the higher order bits. They do not fit the regularity of the design as they are used to “terminate” the design at the boundary. The outputs are again in terms of four-bit digits and are shown by W’s. The 1’s denote the clock period at which the data appear.

Figure 2: Radix 2^n multiplier architecture

3.4. BOOTH MULTIPLIER

A Radix-4 modified Booth algorithm was chosen rather than the Radix-2 Booth algorithm because in Radix-4 the number of partial products is reduced to n/2. Wallace Tree structure multipliers could be used, but in that format the multiplier array becomes very large and requires large numbers of logic gates and interconnecting wires, which makes the chip design large and slows down the operating speed.

3.5. BOOTH MULTIPLICATION ALGORITHM:

X circular right shifts because this will prevent us from using two registers for the X value.

III Shift
U     V     X     X-1
0000  0000  1100  0
0000  0000  0110  0
0000  0000  0011  0
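The claim that Radix-4 Booth recoding halves the number of partial products can be checked with a small software model. The sketch below is not the authors' VHDL; it is a plain Python illustration of the standard radix-4 recoding, in which each pair of multiplier bits (plus the bit to its right) is mapped to a digit in {-2, -1, 0, 1, 2}. The function names are hypothetical.

```python
def booth_radix4_digits(multiplier, n_bits):
    """Recode an n_bits two's-complement multiplier into radix-4 Booth digits.

    Returns the digits least significant first; each digit is in {-2..2}.
    An odd n_bits is sign-extended by one bit so the bits pair up.
    """
    if n_bits % 2:
        n_bits += 1
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, n_bits, 2):
        b0 = (multiplier >> i) & 1
        b1 = (multiplier >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

def booth_radix4_multiply(multiplicand, multiplier, n_bits):
    """Sum the n_bits/2 partial products digit * multiplicand * 4^i."""
    digits = booth_radix4_digits(multiplier, n_bits)
    return sum(d * multiplicand * 4 ** i for i, d in enumerate(digits))
```

For a 4-bit multiplier such as 6 (0110) the recoding yields two digits, so only two partial products are added, compared with four in the plain shift-and-add scheme. In hardware the factors ±1 and ±2 are obtained by negation and a one-bit shift of the multiplicand.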
Multiplier output
Abstract - Reversible logic gates are very much in demand for future computing technologies as they are known to produce zero power dissipation under ideal conditions. This paper proposes an improved design of a multiplier using reversible logic gates. Multipliers are essential for the construction of various computational units of a quantum computer. The quantum cost of a reversible logic circuit can be minimized by reducing the number of reversible logic gates. For this, two 4*4 reversible logic gates called a DPG gate and a BVF gate are used.

Index Terms - Reversible logic circuits; Quantum computing; Nanotechnology.

1. INTRODUCTION

Reversible logic has received great attention in recent years due to its ability to reduce power dissipation, which is the main requirement in low power VLSI design. Quantum computers are constructed using reversible logic circuits. It has wide applications in low power CMOS and optical information processing, quantum computation and nanotechnology. R. Landauer [1] demonstrated that in high technology circuits and systems constructed using irreversible hardware, the loss of one bit of information dissipates KT ln 2 joules of energy, where K is Boltzmann's constant and T is the absolute temperature at which the operation is performed. The heat generated due to the loss of one bit of information is very small at room temperature, but when the number of bits is large, as in high speed computational work, the heat dissipated will be so large that it affects performance and reduces the lifetime of the components. Furthermore, Bennett [2] showed that reversible circuits do not lose information, due to the one-to-one mapping between inputs and outputs; hence there is no extra energy loss.

In the design of reversible circuits two restrictions should be considered:
Fan-out is not permitted
Loops are not permitted
Due to these restrictions, synthesis of reversible circuits can be carried out from the inputs towards the outputs and vice versa.

2. BACKGROUND OF REVERSIBLE CIRCUITS

An n×n reversible circuit consists of n inputs and n outputs with a mapping of each input assignment to a unique output assignment and vice versa. Also, in the synthesis of reversible circuits direct fan-out is not allowed, as the one-to-many concept is not reversible. However, fan-out in reversible circuits is achieved using additional gates. A reversible circuit should be designed using the minimum number of reversible logic gates.

A. Reversible Gates and Circuits

There are two main types of reversible gates: Toffoli [3] and Fredkin [4]. An n×n Toffoli gate passes the first (n-1) inputs to the outputs unaltered (as control signals), and the nth input (the target signal) is inverted at the last output if all the previous (n-1) signals are '1'. Assuming x_i as input and y_i as output, then [3]:
y_i = x_i,  1 ≤ i ≤ n-1
y_n = x_n ⊕ (x_1 x_2 … x_{n-1})

Toffoli Gate: A 3*3 Toffoli gate [3] is shown in figure 1. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The outputs are defined by P=A, Q=B, R=AB xor C. Quantum cost of a Toffoli gate is 5.

Fig.1 Toffoli Gate

A Toffoli gate with one (two) input(s) is also known as a NOT (CNOT or Feynman) gate.

Fredkin Gate: A 3*3 Fredkin gate [4] is shown in figure 2. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The output is defined by P=A, Q=A′B xor AC and R=A′C xor AB. Quantum cost of a Fredkin gate is 5.
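The one-to-one mapping that makes these gates reversible can be checked exhaustively in software. The following Python sketch encodes the 3*3 Toffoli and Fredkin output equations given above and tests that every input combination maps to a distinct output. The helper name `is_reversible` is hypothetical and the code is only an illustration, not part of the proposed design.

```python
from itertools import product

def toffoli(a, b, c):
    """3*3 Toffoli gate: P = A, Q = B, R = AB xor C."""
    return a, b, (a & b) ^ c

def fredkin(a, b, c):
    """3*3 Fredkin gate: P = A, Q = A'B xor AC, R = A'C xor AB."""
    not_a = 1 - a
    return a, (not_a & b) ^ (a & c), (not_a & c) ^ (a & b)

def is_reversible(gate, n_inputs):
    """True if the gate maps the 2^n input vectors to 2^n distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_inputs)}
    return len(outputs) == 2 ** n_inputs
```

An ordinary two-input AND gate fails the same check because four input vectors collapse onto two output values, which is the information loss that Landauer's bound refers to.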
Fig.2 Fredkin Gate

BVF Gate: A 4*4 BVF gate is shown in figure 3. This is a reversible double XOR gate and can be used for duplication of the required inputs to meet the fan-out requirements. The input vector is I (A, B, C, D), the output vector is O (P, Q, R, S) and the output is defined by P = A, Q = A xor B, R = C and S = C xor D. Quantum cost of a BVF gate is 2. In the proposed design this gate is used to copy the operand bits, and it is shown that the number of gates required to copy is reduced by 50% with the same quantum cost.

Truth table rows (four input bits followed by four output bits):
0 1 0 1   0 1 0 1
0 1 1 0   0 1 1 1
0 1 1 1   0 1 0 0
1 0 0 0   1 1 1 0
1 0 0 1   1 1 0 1
1 0 1 0   1 1 1 1
1 0 1 1   1 1 0 0
1 1 0 0   1 0 0 1
1 1 0 1   1 0 1 1
1 1 1 0   1 0 0 0
1 1 1 1   1 0 1 0

B. REVERSIBLE GATES IMPLEMENTED USING ELEMENTARY QUANTUM GATES

Reversible implementations of 3×3 Toffoli, Peres and Fredkin gates using elementary quantum gates are shown in figure 6, figure 7, and figure 8 respectively.

Fig.7 Implementation of the 3×3 Peres gate [12]
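The use of the BVF gate for copying operand bits follows directly from its output equations: with the inputs B and D tied to 0, the outputs Q and S become copies of A and C. A minimal Python sketch of this behaviour, written only as an illustration (the function name `bvf` is an assumption, not the authors' code), is:

```python
from itertools import product

def bvf(a, b, c, d):
    """4*4 BVF gate: P = A, Q = A xor B, R = C, S = C xor D."""
    return a, a ^ b, c, c ^ d

def copies_with_zero_ancilla(a, c):
    """Tie B and D to 0 so that one gate duplicates two operand bits."""
    return bvf(a, 0, c, 0)
```

One BVF gate therefore duplicates two bits at once, which is consistent with the stated 50% reduction in the number of copying gates compared with using one Feynman (CNOT) gate per bit.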
3. PARALLEL MULTIPLIERS
VHDL environment for floating point Arithmetic Logic Unit (ALU) design and simulation

Rajit Ram Singh (1), Vinay Kumar Singh (2), Poornima Shrivastav (3), Dr. GS Tomar (4)

singhrajitram@gmail.com, VINDHYA, Indore, India
vinay.singh@tatatechnologies.com, TATA Motors Ltd., Lucknow, India
shrivastava.poornima@gmail.com, MIST, Gwalior, India
Keywords: ALU (Arithmetic Logic Unit), Top-Down design, Validation, Floating point, Test-Vector

I. INTRODUCTION

Floating point describes a system for representing numbers that would be too large or too small to be represented as integers. Floating point representation is able to retain its resolution and accuracy compared to fixed point representation. Numbers are in general represented approximately to a fixed number of significant digits and scaled using an exponent. The base for the scaling is normally 2, 10 or 16. The typical number that can be represented exactly is of the form:

Significant digits × Base^exponent, i.e. S × B^e

The IEEE 754 standard for floating point representation was introduced in 1985. Based on this standard, floating point representation for digital systems should be platform-independent, and data can be interchanged freely among different digital systems.

An arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations.

The top-down approach (also known as step-wise design) is essentially the breaking down of a system to gain insight into its compositional sub-systems. In a top-down approach an overview of the system is formulated, specifying but not detailing any first-level subsystems. Each subsystem is then refined in yet greater detail, sometimes in many additional subsystem levels, until the entire specification is reduced to base elements. A top-down model is often specified with the assistance of "black boxes", which make it easier to manipulate. However, black boxes may fail to elucidate elementary mechanisms or be detailed enough to realistically validate the model.

In order to stimulate a device off board, a series of logical vectors must be applied to the device inputs. These vectors are called test vectors and are mostly used to stimulate the design inputs and check the outputs against the expected values.

A pipeline is a technique used in the design of computers and other digital electronic devices to increase their instruction throughput (the number of instructions that can be executed in a unit of time).
The fundamental idea is to split the processing of a computer instruction into a series of independent steps, with storage at the end of each step. This allows the computer's control circuitry to issue instructions at the processing rate of the slowest step, which is much faster than the time needed to perform all steps at once. The term pipeline refers to the fact that each step is carrying data at once (like water), and each step is connected to the next (like the links of a pipe). The origin of pipelining is thought to be the IBM Stretch project (1954). Implementing a pipeline requires the various phases of floating point operations to be separated and pipelined into sequential stages.

We propose a VHDL environment for floating point ALU design and simulation, to ease the description, verification, simulation and hardware realization. VHDL is a widely adopted standard and has numerous capabilities that are suited to designs of this sort. The use of VHDL for modeling is especially appealing since it provides a formal description of the system and allows the use of specific description styles to cover the different abstraction levels (architectural, register transfer and logic level) employed in the design.

v. Clock pulse is only provided to the module which is selected using demux.
vi. Concurrent processes are used to allow processes to run in parallel, hence pipelining.

Table 1: Select ALU operation.

Selection   Operation
00          Addition
01          Summation
10          Multiplication
11          Division

Output status
0000   Normal operation
0001   Overflow
0010   Underflow
0100   Result zero
1000   Divide by zero
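The selection codes and the one-hot output status can be illustrated with a small behavioural model. The Python sketch below is not the authors' VHDL and makes several assumptions that are not in the paper: code 01 (listed as "Summation" in the table) is treated as subtraction, ordinary Python floats stand in for the 16-bit format, and the limits `MAX_MAGNITUDE` and `MIN_MAGNITUDE` are hypothetical range bounds chosen only to make the flags observable.

```python
from enum import IntEnum

class Status(IntEnum):
    NORMAL = 0b0000
    OVERFLOW = 0b0001
    UNDERFLOW = 0b0010
    RESULT_ZERO = 0b0100
    DIVIDE_BY_ZERO = 0b1000

MAX_MAGNITUDE = 65504.0      # hypothetical largest representable magnitude
MIN_MAGNITUDE = 2.0 ** -14   # hypothetical smallest non-zero magnitude

def alu(select, a, b):
    """Behavioural model: dispatch on the 2-bit select code, return (result, status)."""
    if select == 0b11 and b == 0:
        return None, Status.DIVIDE_BY_ZERO
    if select == 0b00:
        result = a + b
    elif select == 0b01:
        result = a - b       # assumption: code 01 modelled as subtraction
    elif select == 0b10:
        result = a * b
    elif select == 0b11:
        result = a / b
    else:
        raise ValueError("select must be a 2-bit code")
    magnitude = abs(result)
    if magnitude == 0:
        return result, Status.RESULT_ZERO
    if magnitude > MAX_MAGNITUDE:
        return result, Status.OVERFLOW
    if magnitude < MIN_MAGNITUDE:
        return result, Status.UNDERFLOW
    return result, Status.NORMAL
```

In the hardware described here the select code also gates the clock through the demux, so only the chosen module is active; the software model has no equivalent of that power-saving step.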
Fig. 2: View of selection of the add module

After a module completes its task, outputs and status signals are sent to the muxes, where they are multiplexed with other outputs from the corresponding modules to produce the output result. Selector pins are routed to these muxes such that only the output from the currently operating functional module is sent to the output port. The clock is specifically routed rather than tied permanently to each module, since only the selected functional modules need clock signals. This provides power savings, since the clock is supplied to the required modules only, and avoids invalid results at the output, since the clock is used as a trigger in every process.

Pipelining floating point addition module:

The addition module has two 16-bit inputs and one 16-bit output. The selection input is used to enable or disable the module. This module is further divided into 4 sub-modules: zero check, align, add_sub and normalize.

Tab. 1: Setting of zero check bit

Align module

In this module, operations are performed based on the status signal from the previous stage. Zero operands are checked in the align module as well. This module introduces the implied bit into the operands, as shown in the table.

zero_a1 xor zero_b1   a_sign           Implied bit for a   Implied bit for b
0                     X (don't care)   0                   0
1                     1                0                   1
1                     0                1                   0

Tab. 2: Setting of implied bit

Add_sub module

This module performs the actual addition and subtraction of the operands. Firstly, the operands are checked via the status signals. Results are automatically obtained if either of the operands is zero, as shown in table 3. Normalization is needed if no calculation is done here. The operation is done based on the signs and the relative magnitude of the mantissas, as summarized in table 4. The status signal is set to one to indicate the need for normalization by the next stage.
The division entity has three 16-bit inputs and two 16-bit outputs. The selection input is used to enable or disable the entity. The division module is divided into six modules: check zero, align dividend, check sign, subtract exponent, divide mantissa and normalize concatenate. Each module is executed concurrently. Status indicates the special cases such as overflow, underflow, result zero and divide by zero. The figure shows the pipeline structure of the division module.

Align dividend module:

This module compares both mantissas. If mant_a is greater than or equal to mant_b, then mant_a must be aligned. For every one-bit right shift of the mant_a mantissa, the mant_a exponent is increased by 1. This increase may result in an exponent overflow, in which case an overflow flag is set. Otherwise, the process continues with the parallel operation of exponent subtraction and mantissa division. Align_flag is set to 1.
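The align dividend step can be sketched in a few lines of Python. This is a behavioural illustration only, with assumptions that go beyond the text: mantissas are plain non-negative integers, the shift is repeated until mant_a is smaller than mant_b, `exp_max` is a hypothetical largest exponent value, and the align flag is set only when at least one shift took place.

```python
def align_dividend(mant_a, exp_a, mant_b, exp_max):
    """Shift mant_a right (incrementing exp_a) until mant_a < mant_b.

    Returns (mant_a, exp_a, align_flag, overflow).
    Assumption: mant_b is non-zero because zero operands are caught
    by the earlier check-zero stage.
    """
    if mant_b <= 0:
        raise ValueError("mant_b must be positive")
    align_flag = 0
    overflow = False
    while mant_a >= mant_b:
        mant_a >>= 1          # one-bit right shift of the dividend mantissa
        exp_a += 1            # compensate so the represented value is kept
        align_flag = 1
        if exp_a > exp_max:   # exponent no longer fits: raise the flag
            overflow = True
            break
    return mant_a, exp_a, align_flag, overflow
```

For example, `align_dividend(0b1100, 5, 0b1010, 31)` shifts once and returns a mantissa of `0b0110` with exponent 6. A real datapath would keep the shifted-out bit in a wider register rather than discard it as this sketch does.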
Demux wave:
RTL of Demux:
Multiplexer:
RTL division:
Simulation result of Mux:
IV. CONCLUSION

Reference:

[1] ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic, IEEE, New York, 1985.
[2] M. Daumas, C. Finot, "Division of Floating Point Expansions with an Application to the Computation of a Determinant", Journal of Universal Computer Science, vol. 5, no. 6, pp. 323-338, June 1999.
[3] AMD Athlon Processor technical brief, Advanced Micro Devices Inc., Publication no. 22054, Rev. D, Dec. 1999.
[4] S. Chen, B. Mulgrew, and P. M. Grant, "A clustering technique for digital communications channel equalization using radial basis function networks," IEEE Trans. Neural Networks, vol. 4, pp. 570-578, July 1993.
[5] Mamun Bin Ibne Reaz, MIEEE, Md. Shabiul Islam, MIEEE, Mohd. S. Sulaiman, MIEEE. ICSE2002 Proc. 2002, Penang, Malaysia.
Simulation of division:
Transistor W(µm)/L(µm)
M1-M3 100/2.5
M4 10/2.5
M5,M6 30/2.5
M7 10/2.5
M8-M11 50/2.5
M12,M13 100/2.5
M14 50/0.5
(3) (4)
Fig.4 (a) High Pass Filter
B. LC Oscillator
An LC oscillator, employing the proposed inductor, is designed as a signal generating application and is shown in Fig. 5(a). The condition of oscillation and the frequency of oscillation are given as
(6)
Fig.5 (b) Oscillator Output.
(7)
Abstract - This paper presents an operational transresistance amplifier based precision full-wave rectifier using an all-pass filter as a 90° phase shifter. The circuit gives a dc output voltage that is almost the same as the peak input voltage over a frequency range of 50 Hz–30 MHz, with a very low ripple voltage and low harmonic distortion.

Index Terms - OTRA, All-pass filter, harmonic distortions, precision rectifier, ripple voltage.

I. INTRODUCTION

State-of-the-art analog integrated circuit design is receiving a tremendous boost due to the development and application of current-mode processing [1]. It is well known that the key performance features of the current-mode technique are an inherently wide bandwidth which is virtually independent of closed loop gain, greater linearity and large dynamic range. Recently the operational transresistance amplifier (OTRA) has emerged as an effective alternative analog building block. It is a high gain current input, voltage output amplifier [2]. The OTRA, being a current processing building block, inherits all the advantages of the current mode technique. It is also free from parasitic input capacitances and resistances, as its input terminals are virtually grounded, thus eliminating response limitations due to parasitics. The OTRA is now being used as an analog building block for realizing a number of circuits with applications in signal processing and generation [2-6].

Precise rectification is one of the important requirements in instrumentation and measurement. It finds applications in ac voltmeters, ammeters, signal-polarity detectors, averaging circuits, sample-and-hold circuits, peak value detectors and amplitude-modulated signal detectors [7-10]. In general, diodes used as rectifiers have the drawback of a threshold voltage, and hence rectification is not possible below a voltage of ∼0.7 V for a silicon diode and ∼0.3 V for a germanium diode. Low-voltage rectification is required in applications such as amplitude modulated signal detectors. Slew rate limitation prevents the fast turning on of the diodes in the high frequency range and thus results in distortion. In view of the above, a precision rectifying circuit using OTRAs is proposed in this paper. The performance of the circuit has been verified in the frequency range 50 Hz–30 MHz using P-SPICE.

II. PROPOSED RECTIFIER CIRCUIT

The OTRA is a three terminal device, shown symbolically in Fig.1, and its port relations can be characterized by matrix (1).

(1)

Fig.1 OTRA Circuit symbol
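For reference, the port relations of an ideal OTRA are commonly written in the literature in the following form. The symbols V_p, V_n, V_o for the terminal voltages, I_p, I_n, I_o for the terminal currents and R_m for the transresistance gain are the usual textbook names and are an assumption about the notation intended for matrix (1).

```latex
\begin{bmatrix} V_p \\ V_n \\ V_o \end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
R_m & -R_m & 0
\end{bmatrix}
\begin{bmatrix} I_p \\ I_n \\ I_o \end{bmatrix}
```

The first two rows express the virtually grounded inputs (V_p = V_n = 0), and the third row gives V_o = R_m (I_p − I_n); for an ideal device R_m approaches infinity, so the OTRA is used with external negative feedback.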
Fig.2(b) Circuit diagram of proposed circuit

III. SIMULATION RESULTS

To verify the theoretical propositions, the rectifier circuit is simulated using the P-SPICE program. For simulation C-MOS

It is seen that the output of the proposed circuit contains less ripple in comparison to the previously reported circuit [7], in which one diode conducts for one half cycle and the other diode conducts for the other half cycle, as shown in fig.4(c).
Fig.4(c) Rectified output of diode based

In the proposed circuit, rectification is not performed by diodes and therefore it has fewer ripples. Low voltage rectification, i.e. below the threshold level of the diode, was also carried out. Fig. 5 shows typical output of the proposed circuit for 100 Hz frequency.

Fig.5(a) Sinusoidal input of frequency 100 Hz and amplitude 10 mV along with 90 degrees phase shifted signal

Fig.5(b) Rectified output with input signal

Fig.6(a) 10 mV, 100 kHz input signal and 90 degrees phase shifted signal

Similarly, a high frequency signal of frequency 100 kHz and amplitude 10 mV is analyzed and the result is shown in Fig. 6; (b) is the rectified output.

B. Harmonic Distortion

The harmonics in the signal cause distortion in the output of the circuit. Thus the harmonic components need to be examined for circuit performance analysis. Being periodic in nature, these harmonic components can be analyzed by Fourier series. The magnitude of each harmonic of a waveform is obtained with the fast Fourier transform using PSPICE. Fig. 7(a) shows the FFT of the input signal of frequency 100 Hz along with the rectified output, whereas Fig. 7(b) shows the FFT for an input of frequency 100 kHz.
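The harmonic analysis described in the Harmonic Distortion subsection can be reproduced outside PSPICE with a direct discrete Fourier sum over one period of a sampled waveform. The Python sketch below is a generic illustration, not the simulation used in the paper; the function name `harmonic_magnitudes` and the ideal test waveforms (a 10 mV sine and its ideal full-wave rectified version) are assumptions.

```python
import cmath
import math

def harmonic_magnitudes(samples, n_harmonics):
    """Amplitudes of the DC term and the first n_harmonics harmonics.

    `samples` must cover exactly one period of the waveform.
    Index 0 is the DC component, index k the k-th harmonic amplitude.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n_harmonics + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        scale = 1 if k == 0 else 2  # single-sided spectrum
        magnitudes.append(scale * abs(acc) / n)
    return magnitudes

N = 256
sine = [0.01 * math.sin(2 * math.pi * i / N) for i in range(N)]
rectified = [abs(x) for x in sine]
```

For the ideal rectified waveform the sum gives a DC level of about 2/π times the peak and a dominant second harmonic of about 4/(3π) times the peak, with no component at the input frequency. A rectifier whose dc output is close to the peak value, as claimed for the proposed circuit, would show correspondingly smaller harmonic terms.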
Fig8 (a)
Fig7(a)
Fig8 (b)
[2] Salama Khaled N., Soliman Ahmed M., "CMOS operational transresistance amplifier for analog signal processing", Microelectronics Journal, Vol. 30, No. 9, pp. 235-245, March 1999.
[3] U. Cam, “A Novel Single-Resistance-Controlled Sinusoidal Oscillator Employing Single Operational Transresistance Amplifier”, Analog Integrated Circuits and Signal Processing, Vol. 32, pp. 183-186, August 2002.
[4] Rajeshwari Pandey, Mayank Bothra, “Multiphase Sinusoidal Oscillators Using Operational Trans-Resistance Amplifier”, IEEE Symposium on Industrial Electronics and Applications (ISIEA 2009), pp. 371-376, October 4-6, 2009.
[9] P. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and Design of Analog Integrated Circuits. New York: Wiley, 2001.
[10] S. J. G. Gift, “A high-performance full-wave rectifier circuit,” Int. J. Electron., vol. 87, no. 8, pp. 925–930, Aug. 2000.
[11] Hasan Mustafa, Ahmed M. Soliman, “A Modified realization of the OTRA”, Frequenz 60 (2006), pp. 70-76.
[12] R. A. Gayakwad, Op-Amps and Linear Integrated Circuits, 3rd ed. New Delhi, India: Prentice-Hall, 2007, pp. 316–318.
As frequency increases from 1 to 50 GHz, the maximum available gain (GMax) and maximum stable gain (MSG) decrease, and the two coincide. This means K=1 and the device is unconditionally stable. The various gains at different frequencies are shown in figure 4.

small signal circuit, the forward gain can be increased to the desired value.
The general figure of merit for comparing microwave circuits, given in equation 5, is the cut-off frequency (F_C), which is defined by the on resistance (R_on) and off-state capacitance (C_off) of the device [10]-[11].

F_C = 1 / (2π R_on C_off)    (5)

The on resistance of the HEMT is governed by the total source-drain resistance at microwave frequencies for voltages higher than threshold. Below the threshold voltage the 2DEG is suppressed under the gate and the resistance increases dramatically.
The general channel resistance R_DS is composed of several resistance components and may be written as

R_DS = R_g + R_sg + R_dg    (6)

where R_g is the interface (or channel) resistance under the gate, and R_sg and R_dg are the source-gate and drain-gate channel resistances respectively.

In estimating the resistance directly under the gate, R_g, the 2DEG is assumed to be under the influence of the gate voltage, making n_s a function of the gate voltage V_g [11]. The resistance elements R_sg and R_dg are assumed not to be controlled by the applied gate voltage, and thus n_s is not a function of V_g in the source-gate and drain-gate regions.
The capacitance model includes both voltage-dependent and parasitic capacitances. The voltage-dependent capacitances used in modeling the GaN HEMT are the source-gate and drain-gate capacitances C_g and the capacitances between the gate and the inner side of the source and drain electrodes, C_ig. The total capacitance C_DS can be written as [11]

C_DS = C_g + C_ig + C_par    (9)

where C_par is the total parasitic capacitance.
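To give a feel for the orders of magnitude involved, the cut-off frequency of equation (5) and the series channel resistance of equation (6) can be evaluated numerically. The Python sketch below uses illustrative component values that are assumptions, not data from the paper.

```python
import math

def channel_resistance(r_g, r_sg, r_dg):
    """Equation (6): total source-drain resistance as a series sum, in ohms."""
    return r_g + r_sg + r_dg

def cutoff_frequency(r_on, c_off):
    """Equation (5): F_C = 1 / (2 * pi * R_on * C_off), in hertz."""
    return 1.0 / (2.0 * math.pi * r_on * c_off)

# Hypothetical example values: three resistance components and a 0.1 pF off-state capacitance.
r_on = channel_resistance(r_g=1.0, r_sg=0.5, r_dg=0.5)   # 2.0 ohm
f_c = cutoff_frequency(r_on, c_off=0.1e-12)              # roughly 8e11 Hz
```

With these assumed values the figure of merit is close to 0.8 THz; halving either the on resistance or the off-state capacitance doubles it, which is why both quantities are modeled separately in this section.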
Email: vinay.ic01@gmail.com,vinay.ic1@rediffmail.com
S. No. A B AxB
1 0 0 0
2 0 1 0
3 0 -1 0
4 1 0 0
5 1 1 1
6 1 -1 -1
7 -1 0 0
8 -1 1 -1
9 -1 -1 1
Table 3: Rules for Ternary Multiplication
Example:

(i) (37)10 x (4)10 = (148)10
(1 1 0 1)3 * (0 0 1 1)3 = [X]3

               1  1  0  1
   X           0  0  1  1
  -------------------------
               1  1  0  1
            1  1  0  1  x
         0  0  0  0  x  x
      0  0  0  0  x  x  x
  -------------------------
      0  1 -1 -1  1  1  1
  -------------------------
(0 1 -1 -1 1 1 1)3 = (148)10

(ii) (14)10 x (15)10 = (210)10
(1 -1 -1 -1)3 * (1 -1 -1 0)3 = [X]3

               1 -1 -1 -1
   X           1 -1 -1  0
  -------------------------
               0  0  0  0
           -1  1  1  1  x
        -1  1  1  1  x  x
      1 -1 -1 -1  x  x  x
  -------------------------
      0  1  0 -1 -1  1  0
  -------------------------
(0 1 0 -1 -1 1 0)3 = (210)10

(e) TERNARY SUBTRACTION

If negative numbers are considered, then changing all +1's to -1's and vice versa, leaving all zeroes unchanged, gives the negative of the corresponding number. Hence it follows that addition and subtraction may be performed with the same hardware in the balanced ternary system by sign changes of the addend or subtrahend, respectively.

(i) A - B = X
(ii) X = A + B', where in B' all +1 are changed to -1 and vice versa

Here there is no need to convert the negative
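The two worked multiplications and the negation rule can be checked with a short balanced-ternary model. The Python sketch below represents a number as a list of digits from {-1, 0, 1}, most significant digit first; the function names are hypothetical and the code is an illustration rather than the hardware algorithm.

```python
def bt_to_int(digits):
    """Value of a balanced-ternary digit list (most significant digit first)."""
    value = 0
    for d in digits:
        value = value * 3 + d
    return value

def int_to_bt(n):
    """Balanced-ternary digits of an integer, most significant digit first."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        r = n % 3
        if r == 2:          # a remainder of 2 becomes digit -1 with a carry
            r = -1
        digits.append(r)
        n = (n - r) // 3
    return digits[::-1]

def bt_negate(digits):
    """Negation: swap +1 and -1, leave zeroes unchanged."""
    return [-d for d in digits]

def bt_multiply(a, b):
    """Sum of partial products: each digit of b times a, shifted by its position."""
    value_a = bt_to_int(a)
    total = 0
    for shift, digit in enumerate(reversed(b)):
        total += digit * value_a * 3 ** shift
    return int_to_bt(total)
```

`bt_multiply([1, 1, 0, 1], [0, 0, 1, 1])` gives `[1, -1, -1, 1, 1, 1]`, which is the result of example (i) without the leading zero, and subtraction A - B can be formed as the sum of A and `bt_negate(B)`.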
III PRINCIPLES OF TERNARY DATA PATTERNS ENCODING

Ternary data communication increases the data-carrying capacity and can be used to increase the speed of data transmission. In future, it can also increase the data capacity of storage media [8].

VI CONCLUSION

VII REFERENCES

[7] Takasaki, Y., “Digital transmission design and jitter analysis,” Artech House, 1991, pp. 35-60.
[8] Sandeep Patel, Howard W. Johnson, “Methods and apparatus for implementing a type 8B6T encoder and decoder”, 1996, patent no. 5,525,983.
“Embedded Implementation of Space Vector PWM using FPGA”
Ashish Gupta
Assistant Professor
Department of Electronics Engineering,
MPEC, Kanpur
ashish3179@rediffmail.com
Abstract - This Paper introduces the working implementation in many fields [2]. FPGA based
principle of space vector pulse width modulation embedded implement of SVPWM can make the
(SVPWM), and presents a new circuit realization of computing power of processor and the logical
SVPWM generator based on a flexible, high processing power of hardware circuit combined,
computation speed and cost effective field thus the processing efficiency of CPU and the
programmable gate array (FPGA) embedded logical units utilization can be improved . Figure 1
technique. Controlling of the machines using the shows a SVPWM control system based on FPGA-
vector control techniques is becoming more popular nowadays. The need for
extensive computations is no longer an objection to implementing vector
control, owing to the wide availability of high-speed digital processors.
The method of decoupling the variables and controlling them independently
is known as vector control. To relieve the controller of the
time-consuming computational task of PWM signal generation, a new method
of Space Vector PWM signal generation is implemented in an FPGA using the
hardware description language VHDL. The Space Vector PWM pulses are first
designed in the MATLAB/SIMULINK environment, where the relevant code is
written to generate the pulses; the M-files are then converted into VHDL
code using a software conversion tool. The triggering pulses are then
applied to the inverter circuit, and the switching pattern generated
reduces the harmonic content and switching losses.

Keywords: FPGA (Field Programmable Gate Array), SVM, Space Vector PWM,
VHDL, Induction motor drive

1 Introduction

The Pulse Width Modulation (PWM) technique called "Vector Modulation",
which is based on space vector theory, is the most important development
of the last few years [1]. Although several PWM methods have been created
in the past, the vector modulation technique appears to be the best
alternative. FPGA development has reached a level of maturity that makes
FPGAs a good choice of embedded technique.

Figure 1: SVPWM control system based on FPGA-embedded technique

Recent applications of FPGAs in industrial electronics include
mobile-robot path planning and intelligent transportation [3], current
control applied to power converters, real-time hardware-in-the-loop
testing for control design, controller implementation, separating and
recovering independent source signals, and neural computation. Since the
concept of the multilevel PWM converter was introduced, various
modulation strategies have been developed and studied in detail, such as
multilevel sinusoidal PWM, multilevel selective harmonic elimination and
space vector modulation. Among these strategies, space vector PWM
(SVPWM) [4] stands out because it offers significant flexibility to
optimize switching waveforms and is well suited for digital
implementation. The complexity and computational cost of traditional
SVPWM techniques increase with the number of levels of the converter, and
most of them use trigonometric functions or pre-computed tables. A
symmetrical space vector modulation PWM pattern is proposed
in this paper; it shows the advantage of lower THD without increasing the
switching losses. This paper thus demonstrates that a more efficient and
faster solution is the use of Field Programmable Gate Arrays (FPGAs), and
it investigates how to generate a variable PWM waveform based on a Xilinx
FPGA [5]. The rest of the paper is organized as follows. Section II
introduces the principle of the symmetrical space vector PWM method.
Section III gives details on the FPGA. Section IV shows the m-file
coding/Simulink blocks required to generate Space Vector Pulses. Section
V explains the experimental results and Section VI is the conclusion.

to the above equations, the eight switching vectors, output
line-to-neutral voltages (phase voltages), and output line-to-line
voltages in terms of the DC-link voltage Vdc are given in Table 1, which
shows the eight inverter voltage vectors (V0 to V7).
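
As an illustration of the eight inverter voltage vectors (V0 to V7), the following Python sketch enumerates the switch states of a three-phase two-level inverter and derives the phase and line voltages in terms of the DC-link voltage. It assumes the standard relations for a balanced star-connected load and the conventional vector numbering; the names `SWITCH_STATES` and `inverter_voltages` are hypothetical, and the numbering may differ from the paper's Table 1.

```python
# Conventional numbering of the eight switch states (a, b, c) of a
# three-phase two-level inverter; 1 = upper switch on, 0 = lower switch on.
# This numbering is an assumption and may differ from the paper's Table 1.
SWITCH_STATES = {
    "V0": (0, 0, 0),
    "V1": (1, 0, 0),
    "V2": (1, 1, 0),
    "V3": (0, 1, 0),
    "V4": (0, 1, 1),
    "V5": (0, 0, 1),
    "V6": (1, 0, 1),
    "V7": (1, 1, 1),
}


def inverter_voltages(state, vdc):
    """Return (phase voltages, line voltages) for one switch state."""
    a, b, c = state
    # Line-to-neutral voltages, assuming a balanced star-connected load.
    van = vdc * (2 * a - b - c) / 3
    vbn = vdc * (2 * b - c - a) / 3
    vcn = vdc * (2 * c - a - b) / 3
    # Line-to-line voltages.
    vab = vdc * (a - b)
    vbc = vdc * (b - c)
    vca = vdc * (c - a)
    return (van, vbn, vcn), (vab, vbc, vca)


if __name__ == "__main__":
    VDC = 600.0  # DC voltage quoted with the paper's SPWM/SVPWM comparison
    for name, state in SWITCH_STATES.items():
        phase, line = inverter_voltages(state, VDC)
        print(name, state, phase, line)
```

V0 and V7 are the two zero vectors (all outputs tied to the same rail), while V1 to V6 are the active vectors that bound the six sectors of the space vector hexagon.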
Faster design and verification time, design change without penalty.

In this paper, an FPGA programmed through hardware description language
code is used to generate the Space Vector Modulation for the inverter
circuit. The point to remember here is that, instead of writing the VHDL
code directly, the M-file code is first written to generate the SVPWM
pulses, and the VHDL code is then generated using the software converter.
Hence the work requires less time and the operation is fast. The
MATLAB/SIMULINK environment is familiar to a large number of software
programmers, and since m-file coding is common to most programmers, it
becomes easier to work in this software. Xilinx provides a very
attractive high-level design/simulation tool for FPGAs. It is a very
flexible design tool, which allows testing of a high-level structural
description of the design and makes quick changes and corrections
possible. The circuit description structure is very similar to the way
the design could be implemented later. Therefore a mapping tool that
converts such a structure into VHDL code saves the designer's time, which
would otherwise be spent rewriting the same structure in VHDL and
probably making mistakes that would need debugging.

4.1 Simulink Model to generate Space Vector PWM
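
The core calculation behind SVPWM pulse generation can be sketched as follows. This is a minimal Python illustration, not the authors' m-file: it assumes the textbook dwell-time equations for a two-level inverter, and the function name `svpwm_dwell_times` and its parameters are hypothetical. Given a reference vector of magnitude `v_ref` at angle `theta`, it finds the sector and the times spent on the two adjacent active vectors and on the zero vectors within one switching period `ts`.

```python
import math


def svpwm_dwell_times(v_ref, theta, vdc, ts):
    """Return (sector, t1, t2, t0) for a reference voltage vector.

    v_ref : magnitude of the reference voltage vector (V)
    theta : angle of the reference vector (rad)
    vdc   : DC-link voltage (V)
    ts    : switching period (s)
    """
    theta = theta % (2 * math.pi)
    # Six 60-degree sectors, numbered 1..6 (clamped against rounding).
    sector = min(int(theta // (math.pi / 3)), 5) + 1
    # Angle measured from the start of the current sector.
    alpha = theta - (sector - 1) * math.pi / 3
    k = math.sqrt(3) * v_ref / vdc
    t1 = ts * k * math.sin(math.pi / 3 - alpha)  # first active vector
    t2 = ts * k * math.sin(alpha)                # second active vector
    t0 = ts - t1 - t2                            # shared by V0 and V7
    if t0 < -1e-9 * ts:
        raise ValueError("reference vector is outside the linear region")
    return sector, t1, t2, max(t0, 0.0)


if __name__ == "__main__":
    TS = 1.0 / 10e3  # 10 kHz switching frequency, as in the paper
    print(svpwm_dwell_times(v_ref=200.0, theta=math.radians(30), vdc=600.0, ts=TS))
```

In a symmetrical pattern, the zero-vector time `t0` is typically split between V0 and V7 and the sequence is mirrored about the middle of the switching period, which is what keeps the pulses centred.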
Technique   SPWM                            SVPWM
M. I.       Output line voltage   THD       Output line voltage   THD
(M)         (peak V)              (%)       (peak V)              (%)
0.4         180.80                162.11    192.70                154.07
0.5         266.50                123.35    312.20                108.78
0.6         289.40                117.12    318.10                105.69
0.7         369.20                94.52     436.60                81.19
0.8         396.10                89.73     442.90                78.56
0.9         472.90                70.69     552.30                53.62
1.0         502.40                64.83     567.90                49.15

Parameters used: Fundamental frequency: 50 Hz, Switching frequency: 10 kHz,
DC voltage: 600 V

Table 2: Comparison between SPWM and SVPWM for varying modulation index.
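
To make the comparison in Table 2 easier to read, the short Python sketch below computes, for each modulation index, the ratio of SVPWM to SPWM output line voltage and the THD reduction in percentage points. The numbers are copied from Table 2; the names `TABLE_2` and `compare` are hypothetical helpers introduced only for this illustration.

```python
# Rows of Table 2: (modulation index, SPWM peak line voltage, SPWM THD %,
#                   SVPWM peak line voltage, SVPWM THD %)
TABLE_2 = [
    (0.4, 180.80, 162.11, 192.70, 154.07),
    (0.5, 266.50, 123.35, 312.20, 108.78),
    (0.6, 289.40, 117.12, 318.10, 105.69),
    (0.7, 369.20, 94.52, 436.60, 81.19),
    (0.8, 396.10, 89.73, 442.90, 78.56),
    (0.9, 472.90, 70.69, 552.30, 53.62),
    (1.0, 502.40, 64.83, 567.90, 49.15),
]


def compare(rows):
    """Return (M, SVPWM/SPWM voltage ratio, THD reduction in points) per row."""
    return [
        (m, v_sv / v_sp, thd_sp - thd_sv)
        for m, v_sp, thd_sp, v_sv, thd_sv in rows
    ]


if __name__ == "__main__":
    for m, ratio, thd_drop in compare(TABLE_2):
        print(f"M = {m:.1f}: voltage ratio = {ratio:.3f}, "
              f"THD reduction = {thd_drop:.2f} points")
```

For every row the voltage ratio is above 1 and the THD reduction is positive, which is the pattern the table is meant to show.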
Fig 7: Delay time

Fig 8: Output of each inverter

Fig 9: Simulation results of Van, Vab and Vac

6. Conclusion

In this paper, a theoretical study of the SVPWM control strategy for an
FPGA-based voltage inverter is presented. On the one hand, it aims to
prove the effectiveness of SVPWM in reducing switching power losses.
SVPWM is among the best solutions for achieving good voltage transfer and
reduced harmonic distortion in the output of an inverter. On the other
hand, since Field Programmable Gate Arrays (FPGAs) have advantages over
microprocessor and DSP control, this modulation technique is implemented
in an FPGA by initially generating an m-file in the MATLAB/SIMULINK
environment. FPGA coding makes it easier to design the vector modulation
pattern generator. Moreover, the MATLAB/SIMULINK environment is familiar
to a large number of software programmers, and since m-file coding is
common to most programmers, it becomes easier for individuals to work in
this software. The switching pattern generated reduces the harmonic
content, provides efficient as well as flexible control, and reduces the
total size of the system. This SVPWM IC can be used as a modulator for
high-performance AC drives and power conditioning equipment.

References

[1] Ying-Yu Tzou, Hau-Jean Hsu, and Tien-Sung Kuo, "FPGA based SVPWM
control IC for 3-phase PWM inverters", Proceedings of the 1996 IEEE IECON
22nd International Conference on Industrial Electronics, Control, and
Instrumentation, vol. 1, pp. 138-143, 5-10 Aug. 1996.
[4] L. Franquelo, M. Prats, R. Portillo, J. Galvan, M. Perales,
J. Carrasco, E. Diez, and J. Jimenez, "Three-dimensional space-vector
modulation algorithm for four-leg multilevel converters using abc
coordinates", IEEE Trans. Ind. Electron., vol. 53, no. 2, pp. 459-466,
Apr. 2006.