Professional Documents
Culture Documents
Huifang Qin, Yu Cao, Dejan Markovic, Andrei Vladimirescu, and Jan Rabaey
Department of EECS, University of California at Berkeley, Berkeley, CA 94720, USA
180
V2 (V)
0.2 V =0.18V
DRV (mV)
DD 170
160 M1
0.1 VTC1
M3
VTC2
150 M4
0 Model
0 0.1 0.2 0.3 0.4 140
0 1 2 3
V1 (V)
Width Scaling Factor
Figure 2. Deterioration of inverter VTC under low-VDD, Figure 3. Modeled and simulated DRV as a function of
with zero SRAM cell noise margins at DRV. transistor width scaling.
bi is about 10mV, depending on bias condition. As demon- disrupt the state of the memory cell. More particurly, these
strated in Fig. 3, Eq. (13) fully captures the impact of transis- are the noise on the supply rail (mostly caused by the output
tor sizing on DRV. Due to the competing behavior of PMOS voltage ripple of the switched-capacitor converter), as well
devices M1 and M3, they have different roles in state preser- as radiation particles. To offset these effects, an appropriate
vation. Fig. 3 infers that transistor sizing is an effective noise margin has to be provided. Simulation shows that
means in tuning the DRV of an SRAM cell, which will be assigning a guard band of 100mV above DRV for standby
further explored in Section 5. Temperature coefficient c is VDD gives 60mV in SRAM cell Static Noise Margin (SNM),
extracted as 0.169mV/°C, which predicts an increase of and that the SNM degrades linearly with the VDD guard band
12.3mV in DRV when T rises from 27°C to 100°C. (becoming zero at the DRV). Here, the SNM is defined as
With DRV obtained, the total leakage of an SRAM cell the edge of the maximum square that can fit into the cross
in the sub-Vth region can be calculated as: section of the VTC diagram of the cross-coupled inverters
I leak = I 2 + I 4 [6]. A guard band of 100mV over the DRV is sufficient to
(14) overcome the 20mV peak-to-peak ripple on the standby VDD
V2 V (with worst case SNM of 55mV).
= S 2i2 exp ⋅ 1 − exp − 1
n2 kT q kT q The radiation particle events pose a more serious hazard.
V1 DRV − V2 With data storage node parasitic capacitance around 1fF in a
+ S4i4 exp ⋅ 1 − exp −
n
1 kT q kT q 0.13µm technology SRAM cell, the critical charge (Qcritical)
for a 1V VDD is approximately 3fC. For a reduced VDD at 100
where i=I0·exp[-Vth/(n·kT/q)]; i2 = 2.95µA and i4 = 6.78µA;
mV above the DRV, Qcritical is reduced to 0.5fC. To ensure
DRV, V1, and V2 are calculated from Eqs. (10-13). Thus, the
reliable state preservation, future memories may have to add
leakage power Pleak under DRV is:
additional storage capacitance [7]. Another option to combat
Pleak = DRV ⋅ I leak . (15) the effect of soft errors is to apply error-correction schemes.
Leakage current Ileak as provided in Eq. (14) represents the
3.1.2. Dual voltage scheme design concerns. Main
minimum achievable leakage while preserving state.
considerations in a dual supply scheme include the operation
delay overhead due to the power switch resistance, memory
3. Ultra low voltage SRAM standby: wake up delay and the power penalty during mode transition.
Design and implementation Targeted for ultra low-power applications, the system
To obtain silicon verification of the presented DRV requirements of this design are much more stringent on
model and explore the potential of SRAM leakage suppres- power than performance [1]. In such a situation the concern
sion with ultra low standby VDD, a 4KB SRAM test chip with of the operation delay overhead is not that crucial. A 200µm
dual rail standby control was implemented in a 0.13µm tech- wide PMOS power switch with 30Ω conducting resistance is
nology. Designed for ultra low-power applications, this used to connect the memory module to a 1V operation
scheme puts the entire SRAM into deep sleep during the supply voltage. With the same switch the memory wake up
system standby period. As shown in Fig. 4, the SRAM sup- time is simulated to be within 10ns, which is typically a
ply rail are connected to the standard VDD and the standby small fraction of the target application system cycle time.
VDD through two big power switches. A customized on-chip The wake up power penalty incurred during switching
SC converter generates the standby VDD with high conver- from the standby mode to the full-VDD mode determines the
sion efficiency. Compared to existing SRAM leakage control minimum standby time for the scheme if net power saving
techniques, the simplicity of this scheme leads to minimized over one standby period are to be acheieved. This break-even
design effort and maximum leakage saving during standby time is an important system-design parameter, as it helps the
mode, which is desired for battery-supported systems. power control algorithm to decide when a power-down
would be beneficial. With the parasitic capacitance
3.1. Design considerations information attained from process model, the minimum
3.1.1. Standby stability and noise margin analysis. In standby time in this design is estimated to be around tens of
an actual implementation, reducing VDD all the way down to microseconds, which is much shorter than the typical system
the DRV is not really an option, as other mechanisms may idle time in a battery-supported system.
5:1 3.2. Test chip implementation
VDD SC
Conv Layout of the 0.13µm SRAM test chip is shown in
Fig. 5. The two main components are a 4KB SRAM module
and a Switch Capacitor (SC) converter. This memory is an
Stby IP module with no modifications from its original design. As
shown in Fig. 6, a representative five-stage step down dc-dc
switch capacitor topology is selected to implement the on-
4k Bytes chip standby VDD generator [8]. Compared to magnetic-based
SRAM voltage regulators, SC converter provides higher efficiency,
smaller output current ripple, and easier on-chip integration
Figure 4. Standby leakage suppression scheme. for small loads in the microwatt range. The design challenge
(a) (b)
Data
output
Figure 5. A 0.13µm SRAM leakage-control test chip. SRAM
supply
(a) SC Conv design C lk Clk
10 pF 10 pF 10 pF 10 pF 10 pF t1 t2 t1 t2
0.66 0.66 0.50 0.50 0.50
End of State “1” End of State “0”
0.24 0.35 0.24 0.35 0.24 0.35 0.24 0.35
standby restored standby restored
Clk
Figure 7. Waveform of DRV measurement.
(a) DRV = 190mV in SRAM cell 1 with state “1”,
(b) DRV = 180mV in SRAM cell 2 with state “0”.
Rmem (numbers indicate transistor w idth in microns) 15
(b) Operation phases
Clk
40 ~13mV due
DRV (mV)
200 to temperature
30 Measured fluctuation
DRV range 150
20 ~100mV due
to process
10 100 variation (3σ)
0 0 1 2 3 4 5 6
0 0.2 0.4 0.6 0.8 1
Supply Voltage (V)
Process variation (σ)
Figure 10. DRV under Process and T variations.
Figure 9. Measured SRAM leakage current.
As observed from measurement and simulation results in
state preservation as analyzed in section 3.1.1. With the re-
previous sections, minimizing process variation is the most
sulting 350mV standby VDD, SRAM leakage current can still
effective method to reduce DRV. While process variation
be reduced by over 80%. Subsequently the leakage power, as
control becomes more difficult in future technology nodes, it
the product of VDD and leakage current, is efficiently reduced
will define an effective lower bound on SRAM VDD. Another
by more than 90%.
way to improve data preservation under low VDD is to ensure
4.3. Dual rail standby scheme measurement low temperature operation (e.g., at room temperature), which
is easier for ultra low-power applications.
The dual rail scheme is shown to be fully functional
through the DRV measurements. With 10MHz switch con- 5.2. Sizing optimization
trol signal, the SC converter generates the standby VDD with While sizing has long been an effective tool in conven-
less than 20mV peak-to-peak ripple. Wake up time of 10ns is tional power and speed optimization, taking DRV into ac-
observed during mode transition, while the sleep time spans count further improves future ultra low-power SRAM de-
around 10µs. Delay overhead during active operation is signs. This section presents DRV-aware SRAM cell optimi-
measured to be about 2X, which is reasonable for an ultra zation analysis based on transistor sizing. Here, leakage
low-power application where the system clock period is typi- power is considered as the power metric since this analysis
cally 10 times the operation cycle of a low leakage SRAM. targets ultra low-power design.
5.2.1. Read delay and area cost models. To facilitate
analytical SRAM cell optimization, delay and area models as
5. SRAM DRV minimization: Analysis a function of transistor sizing are developed. In this model,
While SRAM designs have been well optimized for the SRAM access delay is considered as the time required to
speed and power metrics, improving DRV for future ultra pull down the bit-line voltage by 15% of VDD. Starting from
low-voltage applications poses a new challenge for low- a simple hand-calculation model, a curve-fitting approach is
power SRAM designers. Results from previous sections have used to develop a more accurate and scalable delay model as
illustrated the importance of process variations, chip tem- given by analytical derivation:
perature, and transistor sizing on DRV. A thorough under- K ⋅ SW + (C BL 0 + K BL ⋅ SW ) , (16)
standing of their impacts on DRV and other performance t read =
metrics provides insight into SRAM cell design for ultra 3 2 2 1
K read ⋅ A ⋅ SW + B ⋅ SW S N − C ⋅ SW S N −
low-voltage and ultra low-power designs. 2
In this model SW, SN represent the (W/L) ratio of the
5.1. Process variations and temperature impacts write access transistor and the NMOS pull down transistor,
Process variation and temperature fluctuation are the respectively, while parameters K, Kread, A, B, and C are fit-
main imperfections in a real environment that cause degrada- ting parameters. They are related, but not equal to the fol-
tions in circuit performance. Figure 10 shows the simulated lowing physical parameters:
effect of these two parameters on SRAM DRV. Data shows 2 2 , V − VthW , (17.a)
K read = k '⋅V n ⋅DSAT A = DD
that process variations play a critical role in determining VDD V DSAT
DRV, which increases by 100mV in the presence of 3σ mis- V DSAT , V DD − VthW − V DSAT 2 . (17.b)
B= C=
matches in Vth and channel length (L). On the other hand, 2(V DD − VthN ) V DD − V DSAT
temperature affects DRV in a less severe manner. When T Fitting parameters CBL0 and KBL define a relationship be-
changes from 27°C to 100°C, DRV rises about 13mV, which tween the bit-line capacitance CBL and the W/L ratio SW of
was verified by test chip measurement. The difference in access transistors. The delay model fits simulated results as
process variation and T effects on DRV is because fluctua- shown in Fig. 11. Due to a high layout density of an SRAM,
tion in T affects all transistors in an SRAM cell uniformly, the SRAM cell area is simply modeled as a linear function of
while local process variations may change the drive strength the total transistor area.
ratio between transistors, resulting in an exacerbated sub-Vth
VTC mismatch and thus, a substantial DRV increment. Both 5.2.2. DRV minimization. In conventional performance-
effects are well captured by the analytical models in Sec. 2. optimized SRAM cell design, the pull-down NMOS devices
1.5 area reduced at the same time. Therefore, reducing size of
S N=1.06 the cell both lowers the area and the leakage power at the
expense of increased delay. At a 30% delay increase, a 50%
tread (ns) S N=1.42 leakage power reduction can be achieved. Furthermore, since
1 S N=1.77 Ileak is relatively insensitive to VDD in the range around DRV
S N=2.12 as compared to high-VDD condition (see Fig. 10), there is
S N=2.48 only slight difference in leakage power for the cases with no
Simulation process variation and 3σ variation.
Model
0.5
0 1 2 3 4
Width factor SW of access transistors 6. Conclusions and future work
Figure 11. Read delay model fitting at VDD = 1V.
This paper explores the limit of SRAM data preservation
under ultra-low standby VDD. An analytical model of the
are sized substantially larger than the PMOS devices. This SRAM DRV is developed and verified with measurement
imbalance in the pull-up and pull-down paths leads to exac- results. A commercial SRAM module with high-Vth process
erbated VTC deterioration at low VDD, and degrades DRV. It is shown to be capable of sub-300mV standby data preserva-
can be observed from Fig. 3 that by either decreasing the tion. Under this low standby VDD, leakage power saving of
NMOS size or increasing the PMOS size DRV can be effec- more than 90% can be achieved with a dual-rail standby
tively reduced due to better balance in VTC. Further taking scheme designed for ultra low-power applications. The DRV
delay and area costs into account, NMOS tuning proves to be is observed to be a strong function of process variation and
more desirable. This is because smaller NMOS improves SRAM cell sizing, while the design tradeoffs between read
both DRV and area at certain delay penalty, while larger delay, area and leakage power can be optimized.
PMOS only improves DRV and pays large area cost, which
is expensive in memory design. While existing approaches generally involve significant
tradeoffs with other performance metrics or technology lim-
SRAM cell optimization tradeoffs are shown in Fig. 12, its, more opportunities exist on architectural level innova-
where the marked nominal case is a commercial perform- tions. As an example, more SRAM leakage savings can be
ance-optimized cell design. Fig. 12a plots the lower bound achieved with assistance from error tolerant schemes when
on DRV for both 3σ process variation (DRV0 = 170mV) and the standby supply voltage is scaled down beyond the limit
perfect matching (DRV0 = 78mV) versus NMOS size. In of DRV. Future work will be focused on exploiting design
both cases DRV can be reduced by 20~30mV at some delay techniques to achieve even lower power and higher reliabil-
penalty. Tradeoffs between delay, leakage power and area ity in memory design.
are illustrated in Fig. 12b. It can be observed that around the
nominal design point, area is best traded off for performance. Acknowledgements
However, leakage power is then best traded for performance The sponsorship of the GSRC MARCO center and fabri-
at delay of about 20% higher than the nominal point, with cation support from ST Microelectronics are greatly ac-
200
knowledged. The authors would also like to thank to Profes-
DRV0=170mV sor Seth Sanders and Dr. Bhusan Gupta for their enlighten-
150 (3σ variation) ing technical advice.
DRV (mV)
100
References
DRV0=78mV
50 [1] J. Rabaey et al, “PicoRadios for Wireless Sensor Networks: The
(no variaiton)
Next Challenge in Ultra-Low-Power Design,” Proc. of ISSCC,
0
pp. 200-201, Feb 2002.
0 1 2 3 4 [2] K. Itoh, “Low Voltage Memories for Power-Aware Systems,”
nom
(a) SN / SN Proc. ISLPED, pp. 1-6, Aug. 2002.
2
[3] S. Kaxiras, Z. Hu, and M. Martonosi, “Cache decay: Exploiting
generational behavior to reduce cache leakage power,” Proc. of
nom
1.5 [4] K. Flautner et al, “Drowsy caches: simple techniques for reduc-
Cell Area ing leakage power,” Proc. of ISCA, pp. 148-157, May 2002.
1 [5] J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated
Circuits: A Design Perspective, 2nd edition, Prentice-Hall 2002.
nom
pLk
0.5 DRV0=170mV [6] E. Seevinck, F. J. List, and J. Lohstroh, “Static-noise margin
analysis of MOS SRAM cells”, IEEE J. Solid-State Circuits,
DRV0=78mV vol. SC-22, No. 5, pp. 748-754, Oct. 1987.
0
0 0.5 1 1.5 2 2.5 [7] C. Lage et al., “Soft error rate and stored charge requirements in
nom advanced high-density SRAMs,” Proc. of IEDM, pp. 821-824,
(b) tread / tread
Dec.1993.
Figure 12. SRAM cell design tradeoffs: [8] K.D.T. Ngo and R. Webster, “Steady-state analysis and design
(a) DRV vs. NMOS transistor size, of a switched-capacitor DC-DC converter,” Proc. of PESC, pp.
(b) Leakage power and area vs. read delay. 378-385, Jun-Jul 1992.