
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 25, NO. 12, DECEMBER 2017 3473

A 32-nm Subthreshold 7T SRAM Bit Cell With Read Assist
Shourya Gupta, Student Member, IEEE, Kirti Gupta, Member, IEEE,
and Neeta Pandey, Senior Member, IEEE

Abstract: The implementation of the six-transistor (6T) static random access memory cell in deep submicrometer region has become difficult due to the compromise between area, power, and performance, with local and global variations only exacerbating the problem further. To impede the read–write conflict of the 6T cell, the seven-transistor (7T) cell with a noise-margin-free read operation has previously been proposed. But it severely lags in terms of its write ability at lower voltages due to its single-ended write operation. Its single-ended read operation also degrades severely in performance when operating in subthreshold (ST) region. To combat these problems, we propose a 7T cell which operates in the ST region down to 0.4 V with improved dynamic write ability. The novel topology also helps reduce power consumption by achieving a lower data retention voltage point. A read assist has been proposed to greatly enhance the performance of the single-ended read operation in ST region. Large improvements in various performance metrics of the proposed cell have been attained while simultaneously achieving a low area of 0.254 µm² per bit cell on the 32-nm technology node.

Index Terms: Data retention voltage (DRV), dual port, leakage current, low power, read assist, seven-transistor (7T) static random access memory (SRAM), subthreshold (ST), write margin.

Manuscript received May 24, 2017; revised July 17, 2017; accepted August 12, 2017. Date of publication September 8, 2017; date of current version November 22, 2017. (Corresponding author: Shourya Gupta.)
S. Gupta and K. Gupta are with the ECE Department, Bharati Vidyapeeth’s College of Engineering, New Delhi 110063, India (e-mail: shourya.gupta94@gmail.com).
N. Pandey is with the ECE Department, Delhi Technological University, New Delhi 110042, India.
Digital Object Identifier 10.1109/TVLSI.2017.2746683

I. INTRODUCTION

SCALING of transistors in digital design to decrease power and improve performance has become a significant challenge on recent deep submicrometer technology nodes. The circuits become more vulnerable to variability and noise with decrease in technology node [1], [2]. Because of energy constraints of battery-powered devices, research in low power consumption circuits has become more imperative than ever before. The scaling of supply voltage to decrease power consumption has been a popular choice due to its effect of quadratic reduction in power. This includes subthreshold (ST) and near threshold applications, which attempt to reach the minimum energy point [3] to save power but pose challenges like increased susceptibility to noise and loss in performance.

Fig. 1. Schematic of standard SRAM cells. (a) 6T cell. (b) 7T cell [14]. (c) 8T cell.

One of the main components of every system-on-chip (SoC) is the static random access memory (SRAM), which occupies significant area of SoC [2]. The six-transistor (6T) cell, as shown in Fig. 1(a), which forms the SRAM array has been the industry standard due to its fast differential sensing and very low area. However, the extensive scaling of supply voltage has affected the performance of the read and write operations in SRAMs, thereby making it difficult to implement the conventional 6T cell. Although at strong inversion, device sizing is enough to ensure proper functioning of the memory cell, at low voltages (weak inversion), process-voltage-temperature (PVT) variations and local mismatch cause the memory to malfunction. Many structures have been proposed to solve this problem. The decoupling of the read and write ports to have a read static noise-margin (RSNM)-free read, as in the seven-transistor (7T) and eight-transistor (8T) cell [Fig. 1(b) and (c)], has been a viable approach to improve noise margins but it comes at the expense of increased area and degraded read performance due to single-ended sensing. Such single-ended sensing degrades severely in ST region due to loss of drive of nMOS and thus reduced read currents. The 7T cell, as shown in Fig. 1(b), suffers from the reduced performance of a single-ended write operation due to the absence of a complementary bitline. The write ability of the 8T cell, although being a major improvement over the 6T cell in terms of stability, is still insufficient for performing a write operation at low supply voltages. Write ability in all such cells is determined by the pull up to access transistor ratio. The access transistor (ACL) is made wider to make it stronger than the pull-up transistor for a successful write operation. Dimensions are made large enough to ensure proper functioning under PVT variations. While this is sufficient at strong inversion, this approach fails in ST operation due to nonmaintenance of a desired pull up to access transistor
1063-8210 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

strength ratio. This is primarily due to the fact that in ST region, write operation is limited by cells at SF corner (slow nMOS and fast pMOS). At SF corner, the pMOS becomes stronger than an equivalently sized nMOS. This effect also depends on temperature because of the inverse dependence of current on the ST slope ($S = nV_{th}\ln 10$) [2]. When the temperature is high, the slope is approximately equal for both nMOS and pMOS, but as the temperature is lowered, the slope falls faster for a pMOS than for an nMOS, which in turn further increases pMOS’s drive in ST region [2]. Since pMOS current becomes greater than nMOS current, the required access to pull-up transistor strength ratio is severely degraded. Also, the threshold voltage variation due to device sizing and local mismatch in deep submicrometer region, explained by Pelgrom’s Law [4], contributes further to this problem and the write operation becomes impossible to perform at lower voltages without the help of assists.

Many assist techniques have been proposed to enable reading and writing into the cell at low voltages. Some of these techniques include reduced cell supply voltage and word line underdrive [5], negative bitline [6], boosted word line [7], cell ground boost [8], and cell negative ground [9]. However, the cell supply collapse, ground boost, and bitline boost techniques worsen the partial write disturbance situation of half-selected cells by degrading the hold noise margin. The word line underdrive makes writing difficult at lower voltages. And the implementation of a negative ground is difficult because it requires sinking of read current from an entire column into a regulated negative level. The work in [10] uses a capacitively coupled negative bitline write assist, and in [11], the column-wise cell supply is floated during a write operation to weaken the pull-up transistor in the cell. Asymmetrical sizing [12] and multi-Vth devices have also been used to enhance read and write ability and improve noise margins. However, increased sizing due to asymmetric sizing may increase leakage currents and bitline/word line capacitances, thus increasing power consumption. Also, the implementation of multiple threshold devices in close proximity is both difficult and expensive [13]. The implementation of all such assists requires silicon area, which is a significant tradeoff, given the recent demand for larger memories and smaller devices.

In this paper, we propose a 7T cell on the 32-nm technology node as an improvement over the 7T cell in [14], also shown in Fig. 1(b), by using a dual-port architecture to impede the well-known 6T read–write conflict and implementing architectural changes in the cell to enable a write operation in the ST region for low-power applications, without the implementation of any write assist. The single-ended read operation, which degrades in the ST region, has also been improved by implementing a novel read assist technique. The novel cell and assist, described subsequently, also permit minimum sized transistors as part of the cell, which is optimal for reducing energy per operation [3]. This, in contrast to the 7T cell in Fig. 1(b), makes a high-density low-power SRAM.

This paper has been constructed as follows. Section II describes the proposed cell architecture, Section III describes the write ability and write operation of the proposed cell, Section IV describes the proposed read assist technique, Section V describes the hold state of cell, and Section VI discusses the layout and area of the proposed cell. Section VII summarizes and concludes this paper. Apart from the functional description and performance metrics analysis of the proposed cell, we also present an extensive comparison of the proposed 7T cell with the industry standard 6T cell and the 7T cell in [14] on the basis of stability, performance, power consumption, and area. The comparison has also been extended to the 8T cell because it performs a single-ended read operation like the proposed cell. Also, the 7T cell in [14], shown in Fig. 1(b), will henceforth be referred to as the 7T-C cell in the rest of the paper.

II. PROPOSED 7T CELL

The architecture of the proposed asymmetric 7T cell is shown in Fig. 2. It comprises an inverter (PUR-PDR) and a pull-up pMOS (PUL), which are coupled together to store one-bit information. An access transistor (ACL) is used for single-ended write operation and two nMOS (R1, R2) are used to perform single-ended read operation. The write bitline (WBL) and write word line (WWL) are used for performing write operations, and the read bitline (RBL) and read word line (RWL) are used for performing a read operation. An nMOS (DL) with its gate terminal connected to ground potential is implemented to provide stability through leakage currents. This implementation is explained in the next section.

Fig. 2. Schematic of the proposed 7T cell.

III. WRITE ABILITY

A single-ended write operation is more difficult to perform than the double-ended one in the conventional 6T cell. This is because a conventional 6T cell uses complementary bit lines to perform a write operation and either of the nodes (X or X̄) in the cell is discharged quickly through its corresponding bitline. The 7T-C cell accomplishes a write operation through its single bitline, by relying entirely on the mutual effect of inverters to flip the values. Thus, write ability in the 7T-C cell is achieved by modifying the voltage transfer characteristics (VTC) of each inverter. The trip point of one inverter is increased while the trip point of the other is decreased. The modification of trip point is done by resizing the transistors, which increases overall area. Moreover, the resizing of transistors to facilitate a write operation becomes less effective at lower voltages due to diminished margins and increased susceptibility to process variations. By dynamically analyzing the 7T-C cell’s write “one” operation, we can observe the X node’s inability to rise quickly in the beginning due to the pull-down effect of
GUPTA et al.: 32-nm ST 7T SRAM BIT CELL WITH READ ASSIST 3475

the PDL nMOS. Given enough time, the mutual feedback effect of the inverters eventually flips the cells. However, this writing process eventually fails for single-ended cells at lower voltages. To overcome this situation, the PDL nMOS’s pull-down effect in the 7T-C cell, as shown in Fig. 1(b), has been eliminated to form the proposed cell (Fig. 2). The result is enhanced write ability, even at lower voltages. The bistability is achieved with the help of a pull-up pMOS (PUL) and an inverter (PUR-PDR). However, because of the elimination of the pull-down effect of PDL, the hold “zero” state turns unstable due to the flow of ST leakage currents into the X node. To overcome this problem, the WBL is kept low during hold mode and is pulled up only during a write operation. This type of alteration can be accomplished easily without making much change to the array architecture. Such a change helps avoid the flow of leakage current into the X node through the access transistor by an otherwise high bitline. To further strengthen the hold “zero” state, the DL nMOS, as shown in Fig. 2, is low-threshold voltage (LVT) implanted. This LVT nMOS provides an additional path for leakage currents to flow from X node to GND. During the write operation, the WBL is pulled up, which enables the flow of leakage current into the X node of unselected cells that are in hold “zero” state. But with the help of the LVT nMOS (DL), the leakage current ratio is maintained and only a small temporary rise in the X node voltage is observed. This also provides leakage suppression in hold “zero” state as described in subsequent sections. A detailed analysis for X node “zero” hold state and leakage ratio maintenance is also presented in Section V.

Fig. 3. Setup for measuring the write margin of the proposed 7T cell.

Now to understand the write operation, let us suppose “zero” is stored in the cell. To write “one” into the cell, WBL is pulled high and WWL is enabled. As opposed to the 7T-C cell, where the pull-down nMOS is initially conducting, no such mechanism is present in the proposed cell. This means that the X node can rise much more quickly. Such a configuration decreases the “zero” to “one” transition time and enables faster write operations with better dynamic write ability. Overall, the proposed cell enables low-voltage writes, down to 400 mV under 6σ global process and local mismatch variations.
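As a purely logical illustration of the signalling described in this section (WBL held low in hold mode, WBL driven and WWL enabled for a write), the following Python sketch models one cell as a single stored bit. The class and method names are hypothetical and do not come from the paper, and the model deliberately ignores voltages, timing, leakage, and process variation.

```python
from dataclasses import dataclass


@dataclass
class SevenTCellModel:
    """Logic-level toy model of the proposed cell (hypothetical names).

    Only 0/1 states are tracked; no voltages, timing or leakage.
    """
    x: int = 0    # storage node X
    wbl: int = 0  # write bitline, kept low in hold mode
    wwl: int = 0  # write word line, low in hold mode

    def hold(self) -> int:
        # Hold mode: WBL low and WWL disabled, stored bit unchanged.
        self.wbl, self.wwl = 0, 0
        return self.x

    def write(self, bit: int, row_selected: bool = True) -> int:
        # WBL carries the data for the whole column; only the selected
        # row has WWL enabled, so only that cell takes the new value.
        self.wbl = bit
        self.wwl = 1 if row_selected else 0
        if self.wwl:
            self.x = self.wbl  # access transistor ACL passes WBL to X
        return self.hold()     # WBL returns low after the write


cell = SevenTCellModel()
after_write_one = cell.write(1)                        # selected cell stores "one"
after_half_select = cell.write(0, row_selected=False)  # unselected cell keeps "one"
after_write_zero = cell.write(0)                       # selected cell stores "zero"
print(after_write_one, after_half_select, after_write_zero)
```

The unselected case is included only to show that a cell sharing the column keeps its value at the logic level; the small temporary rise of the X node discussed above is an analog effect that this sketch cannot represent.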

A. Static Write Margin

A higher static write margin signifies ease of writing while a lower margin means a harder write operation. A balanced margin is necessary because too high a margin increases susceptibility to noise while too low a margin means much harder writes. The bitline sweep method [15] proposed a new approach for measuring write margin and provided a better correlation with temperature, threshold voltage, and supply voltage than the conventionally used butterfly curve method. In [16], the word line sweep method was proposed as an improvement over the bitline sweep method. Thus, the word line sweep method has been accordingly implemented here to measure the write margin of the dual-port proposed 7T cell.

Fig. 4. (a) 74.2-mV write margin for the proposed 7T cell at VDD = 0.4 V under 6σ global process and 1σ local mismatch variations (10 000-point MC simulation). (b) Worst-case write margin for different SRAM cells.

Fig. 3 shows the setup for measuring the write margin. The cell is made to store “one,” the WBL is pulled down to low level, and the access transistor is enabled by sweeping the write word line from zero to VDD. Fig. 4(a) shows the write margin of the proposed cell for a 10 000-point Monte Carlo (MC) simulation under 6σ global process and 1σ local mismatch variations. Fig. 4(b) compares the write margins of the 6T, 7T-C, 8T, and proposed 7T cell with varying supply voltages. Simulations showed that the conventional 6T and 8T cells lose write ability below 600 mV. The 7T-C cell also performs single-ended write operations like the proposed cell. However, it is able to maintain write ability only down to 800 mV, while the proposed cell maintains a 72.3-mV write margin even at 400 mV. Also, the method of improving write margins for the 7T-C cell by resizing the transistors increases area. In contrast, the proposed 7T cell enables writing at much lower voltages while simultaneously decreasing area per bit cell.
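To make the word line sweep measurement concrete, the sketch below extracts a write margin from a sweep of WWL against the X node voltage. It assumes the common convention that the margin is VDD minus the WWL voltage at which the cell flips, and it takes "flip" to mean X crossing VDD/2; both are assumptions made here, not definitions quoted from [16]. The function name and the sample sweep data are hypothetical and do not reproduce any result of the paper.

```python
def write_margin_from_wwl_sweep(v_wwl, v_x, vdd):
    """Return VDD minus the WWL voltage at which X falls through VDD/2.

    v_wwl: increasing WWL sweep voltages; v_x: X node voltage at each point.
    Returns None if the cell never flips during the sweep.
    """
    threshold = vdd / 2.0
    for i in range(1, len(v_wwl)):
        hi, lo = v_x[i - 1], v_x[i]
        if hi > threshold >= lo:
            # Linear interpolation between the two sweep points.
            frac = (hi - threshold) / (hi - lo)
            v_flip = v_wwl[i - 1] + frac * (v_wwl[i] - v_wwl[i - 1])
            return vdd - v_flip
    return None


# Made-up sweep at VDD = 0.4 V: the cell stores "one" and WBL is held low.
vdd = 0.4
v_wwl = [0.0, 0.1, 0.2, 0.3, 0.4]
v_x = [0.40, 0.39, 0.36, 0.04, 0.01]
margin = write_margin_from_wwl_sweep(v_wwl, v_x, vdd)
print(f"write margin = {margin * 1e3:.1f} mV")
```

In a Monte Carlo setting, the same extraction would be applied to each sample and the smallest value taken as the worst-case margin; a cell that never flips is reported as None rather than as a zero margin so that the two cases stay distinguishable.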

Fig. 5. Trajectory of X during write “zero” operation for varying WWL pulsewidth.

B. Dynamic Write Ability

The write margin is a static measure which assumes an infinite-width write pulse and does not correlate the write ability of the cell with a real-time write operation. The static noise-margin methods also do not stipulate read and write operation ability and their correlation with Vmin, the lowest supply voltage at which the cell retains its information and performs both read and write operations successfully. Thus, the static approach of measuring stability does not account for appropriate Vmin determination, both in terms of static and dynamic ability [17]. To characterize the dynamic write ability, the TCRIT parameter has been used. It is the minimum pulsewidth required for a successful write operation. A pulsewidth shorter than TCRIT results in an unsuccessful write operation and the flipping of the node back to its original value. A pulsewidth longer than TCRIT results in the nodes staying at the changed values. A pulsewidth equal to TCRIT is prone to a metastable condition and the cell may flip to any value. The TCRIT also decreases with increase in temperature. Fig. 5 shows the write zero operation at VDD = 0.4 V for varying pulse widths. The TCRIT at 0.4 V supply voltage and 27 °C temperature was found to be 62 ps.

Fig. 6. Shmoo plots for (a) 7T-C and proposed 7T and (b) 6T and 8T cells.

The shmoo plot for varying supply voltages and write pulse widths for the write operation in the 7T-C, proposed 7T, 6T, and 8T cell is shown in Fig. 6. Since the single-ended write mechanism of the proposed cell is entirely different from that of double-ended cells like the conventional 6T and 8T cells, it is only fair to compare its dynamic write ability with that of the 7T-C cell. As evident from the shmoo plots, there is great improvement in the write ability of the proposed cell over the 7T-C cell. While the 7T-C cell fails to write below 800 mV, the proposed 7T cell is able to write even in the ST region down to 400 mV. At equivalent voltages at strong inversion, the proposed 7T cell is able to perform a write operation with shorter write pulses as compared to the 7T-C cell.

Fig. 7(a) shows the write time as a function of write pulsewidth for the 7T-C cell and the proposed 7T cell at 0.8-V supply voltage. For the proposed cell at strong inversion, the write operation takes longest to complete at the SS corner followed by the SF corner. But as we approach the ST region, the time difference between the write operations at SF and SS corners shrinks as shown in Fig. 7(b), and in ST region, the write operation at the SF corner takes much longer than at the SS corner. This is because, when operating at SF corner and near ST region or below, pMOS’s current starts to overtake nMOS’s current, which is exactly opposite to what is needed for a successful write operation, i.e., a strong access nMOS and a weak pull-up pMOS. Therefore, the write time for the proposed cell in the ST region is greatest at SF corner. Nonetheless, the proposed 7T cell showed about 92% reduction in the write time over the 7T-C cell on the SF corner at 0.8 V.

IV. PROPOSED READ ASSIST TECHNIQUE

During a read operation in the 6T cell, the low-value node rises due to the flow of read current through it. If the node value rises too much during the read operation, the cell flips, leading to destruction of data. This is only worsened by process variations and lowering supply voltages, which cause the read margin to fall even lower. To combat this problem, the 7T-C and 8T cells decouple the read port from their write port to perform RSNM-free single-ended read operation. Both the read ports of the 7T-C and 8T cells being architecturally

identical, use R1 and R2 to pull down the long RBL with large capacitance. The R1 and R2 nMOS are widened to pull down the RBL faster but that in turn increases the area and leakage currents. Also, for higher supply voltages, the RBL is pulled down sufficiently fast due to the transistors being in strong inversion. But in ST region, the read current is greatly degraded and the delay becomes quite large. The ratio of read current to leakage current is also degraded, prohibiting longer columns and low-voltage read operations. Therefore, to curb this problem the following read assist technique has been implemented.

Fig. 7. (a) Write time at VDD = 0.8 V with percentage reduction in write time for proposed cell at each corner. (b) Write time for proposed cell near and below ST region.

During a read operation in the proposed cell, the RBL is precharged to high and the RWL is enabled. The R1 nMOS is switched ON or OFF by the value stored in X̄ node. If the cell stores “one,” R1 is switched OFF and the bitline remains high. If the cell stores “zero,” the bitline discharges through R1 and R2. A charge pump circuit has been used to overdrive the R2 nMOS by biasing its gate to a voltage level of nearly twice the supply voltage. This ensures that R2 remains at strong inversion and is able to sink larger read currents even when the supply voltage is low. Since the drive capability of the R2 nMOS has been greatly increased, its width can be reduced, thereby decreasing the area. This technique thus decreases the delay and improves overall performance for both below and above threshold regimes. Fig. 8 shows the charge pump circuit for the read port of the proposed 7T cell. A charge pump circuit with multiple cascaded nMOS and capacitors (Dickson charge pump) to increase voltage level is very slow in reaching a pumped value. It also uses multiple capacitors which occupy a very large area. The proposed charge pump circuit uses a pair of cross coupled inverters (T1-T2 and T3-T4) and two large nMOS (Q1 and Q2) as capacitors instead of conventional capacitors. During ST operation, the CLK signal and its complement pass through Q1 and Q2 to switch the inverter pair at nearly the CLK voltage level. The RWL_x then reaches a dc voltage level nearly equivalent to the CLK level (i.e., RWL voltage level). When the RWL goes high, the RWL_x is lifted by the dc level of the RWL to give an overall 1.8× boost to the read enable signal. However, this pumped signal switches between the original CLK level and 1.8× the CLK level. Therefore, to provide a full swing RWL_x signal to R2, two inverters are used in cascade. During hold mode, the RWL_x returns to clock level and switches OFF the very weak T7 nMOS. Consequently, the N1 node goes high and the output of inverter T8-T9 goes to zero, thus providing a full swing RWL signal.

Fig. 8. Proposed read assist circuit for the read port in the proposed 7T cell.

The read assist can be implemented row wise in order to enable the read transistor of all cells in a row during a read operation. The versatility of the read assist can also be noted in the fact that it can be implemented in all such cells which perform a read operation similar in fashion to the proposed 7T cell. The overall area of the proposed read assist circuit is low, occupying an area of approximately 4000λ² (area of ∼5 cells). The area of a row comprising 64 proposed 7T bit cells is 50 176λ². Therefore, the total area of each row supplemented with a read assist circuit becomes 54 176λ². However, if the read assist circuit is not implemented, the read port MOS (R1 and R2) has to be widened for viable read operation performance in ST region, thereby increasing the area of a row with 64 bit cells to 55 296λ². Therefore, it can be seen that the implementation of the read assist helps save an area of approximately 1120λ² (up to 2% saving). For a row with 128 bit cells, the area savings increase to approximately 7%. Conversely, for a row with 32 bit cells, the area overhead is approximately 5%, which is a permissible tradeoff given the advantages of the assist method.

The implementation of the proposed read assist circuit makes the read operation much more feasible in the ST region, since the read current remains high. In contrast to the slow (∼μs) Dickson charge pump method, the charge pumped RWL signal using the proposed circuit can be produced in just a few nanoseconds.
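The row-area comparison quoted for the read assist can be checked with a few lines of arithmetic. The Python sketch below takes the three areas stated in the text (4000λ² for the assist, 50 176λ² and 55 296λ² for 64-cell rows without and with widened read transistors) and extrapolates to other row lengths. The extrapolation assumes that row area scales linearly with the number of cells and that the assist area stays fixed; those assumptions and all names are introduced here for illustration.

```python
ASSIST_AREA = 4000          # read assist area in lambda^2 (approximate, from the text)
ROW64_PROPOSED = 50176      # 64 proposed cells, lambda^2
ROW64_WIDENED = 55296       # 64 cells with widened R1/R2, lambda^2

CELL_AREA = ROW64_PROPOSED / 64        # per-cell area with the assist in use
WIDE_CELL_AREA = ROW64_WIDENED / 64    # per-cell area without the assist


def row_saving(n_cells):
    """Return (saving in lambda^2, saving as a fraction of the widened row)."""
    with_assist = n_cells * CELL_AREA + ASSIST_AREA
    widened = n_cells * WIDE_CELL_AREA
    saving = widened - with_assist
    return saving, saving / widened


for n in (32, 64, 128):
    saving, frac = row_saving(n)
    print(f"{n:4d} cells: saving = {saving:8.0f} lambda^2 ({frac:+.1%})")
```

With these inputs the 64-cell row reproduces the quoted 1120λ² (about 2%) saving, and the 32-cell row gives a negative saving of about 5%, matching the quoted overhead. For 128 cells this linear model gives about 5.6%, somewhat below the approximately 7% quoted, so the quoted figure presumably reflects details that this simple model does not capture.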

At 0.4-V VDD, the time required by the long RBL (1024 cells) to drop by 90% of its original value was measured to be 97 ns for the proposed 7T cell and 216 ns for the 8T cell. Fig. 9 shows the read current distribution for different cells at strong inversion under 6σ global process variations. As evident from Fig. 9, the proposed 7T cell with the help of the read assist outperforms the 8T cell. However, it lags behind the 6T cell because the read path of the 6T cell comprises a wide and therefore strong access transistor and an even stronger pull-down transistor. The ST read current distribution for the 8T cell and the proposed 7T cell with read assist is shown in Fig. 10. Fig. 10 shows that there is approximately a 2.5× increase in the mean read current, and the worst case read currents for the proposed 7T cell are comparable to the best case for the 8T cell. For the proposed 7T cell operating in ST region, with just a 25% increase in the length of the R2 transistor, the leakage current is further reduced immensely (up to 4 times) and the mean read current is affected by only 5%. But in the case of 7T-C or 8T cells, the same small increase in length of the R2 transistor causes the mean read current to reduce to less than half.

Fig. 9. Read current distribution for different cells at strong inversion (VDD = 0.7 V) at 27 °C under 6σ global process variations (20 000-point MC simulation each).

Fig. 10. Read current distribution for the 8T cell and the proposed 7T cell with read assist below threshold (VDD = 0.4 V) at 27 °C under 6σ global process variations (20 000-point MC simulation each).

During the read operation, a constant RBL leakage occurs through R1 and R2. The combined leakage from all the cells in a single column may inadvertently pull down the bitline and cause a false read. The relationship between the read current and leakage current is called the ION/IOFF ratio. A better ION/IOFF ratio allows the designer to create larger SRAM arrays with greater number of bit cells per column. Fig. 11 shows the ION/IOFF ratio for the proposed 7T cell with read assist and the 8T cell with respect to varying supply voltages. The scale is logarithmic and the dark bar is the improvement over the 8T cell’s read port. While there is about 100% improvement across all cases, up to 500% improvement can also be seen. For the 8T cell’s read port, the ratio deteriorates steadily with decrease in supply voltage due to degradation in read currents. And in ST region the ratio might be affected immensely to the point where an array becomes unfeasible. But for the proposed 7T cell, the ratio improves with lowering supply voltage due to significantly maintained levels of read current with the help of the read assist. A deteriorated ratio is seen when operating below or near ST region due to a slight dip in read currents. The ratio also deteriorates with increase in temperature due to increase in leakage currents.

Fig. 11. ION/IOFF ratio for the proposed 7T cell and the 8T cell. The improvement over the 8T cell is shown in dark.

As the supply voltage is lowered, read ability is also affected. There is an increase in delay with decrease in supply voltage or with increase in number of cells per bitline. For single-ended reading, full swing inverter-based sensing can be used. However, this is very slow in comparison to the differential sensing of conventional 6T cells because it takes time for the full swing to develop. Due to the single-ended nature of the read port, a sense amplifier which needs two bit lines to compare cannot be used. Instead, pseudo differential sensing, which involves creating a reference voltage, can be used to amplify small signals. AC-coupled-based sensing and trip point precharge sensing [18] have also been proposed to perform read operation at lower voltages with better performance. Nevertheless, it is obvious that single-ended sensing has a lot of variables like bitline capacitances (number of cells), topology, architecture, transistor strengths, etc., and it is up to the designer to choose an appropriate sensing scheme as per situation and requirement.

V. HOLD STATE

In the proposed cell, as the DL transistor remains in cutoff, the X node rises by a small voltage during hold “zero” state, reducing the hold noise margin and making the hold “zero”

state more vulnerable. If the X node voltage were to rise beyond the trip point of the PUR-PDR inverter, the cell data would be destroyed. In order to achieve the lowest X node voltage during hold “zero” state, the effective OFF-state resistance from X node to GND must be minimal in comparison to the OFF-state resistance of PUL. This can be achieved by implementing a combination of a wider access transistor (ACL), a longer pull-up transistor (PUL), and an LVT DL. The OFF-state resistance of PUL and the resultant OFF-state resistance of ACL and DL have been measured and plotted in Fig. 12. As seen in Fig. 12, there is a 1000× difference in the OFF-state resistance of PUL in comparison to the resultant OFF-state resistance of ACL and DL. This ensures that the X node voltage remains low at all times. In order to further analyze the stability of the X node, a 10 000-point MC simulation (6σ global process variations) was performed and the X node voltage distribution was plotted at VDD = 0.4 V and VDD = 1.1 V. As seen in Fig. 13, the worst case X node voltage at VDD = 0.4 V and VDD = 1.1 V remains below 10 and 100 mV, respectively, which is far below the trip point (470 mV) of the PUR-PDR inverter. Therefore, the proposed 7T cell provides stability in both above and below threshold regions.

Fig. 12. OFF-state resistance of (a) PUL and (b) ACL and DL (resultant) in the proposed 7T cell at VDD = 0.4 V and VDD = 1.1 V.

Fig. 13. X node voltage distribution at (a) VDD = 0.4 V and (b) VDD = 1.1 V under 6σ global process variations (10 000-point MC simulation).

A. Data Retention Voltage

The data retention voltage (DRV) is the minimum supply voltage at which the cell can still preserve its data. Due to the strong effect of lowering supply voltage on leakage currents, the memory can be maintained at the DRV point to lower static power consumption. The DRV of the entire array is determined by the DRV of the worst case cell, i.e., the tail of the distribution. A protection margin of a few millivolts is added to safeguard it further from temperature fluctuations and noise.

While many methods have been proposed to determine the DRV [19], the conventional approach is to run an MC simulation until it reaches the desired probability. In some cases, a few errors might be tolerated because they can be fixed using error correction codes by adding a few parity cells in each word, allowing further reduction in Vmin and thus reduction in static power. However, this paper does not implement such a topology and the DRV has been measured accordingly.

In symmetrical SRAM cells, when the supply is lowered below the DRV, the cell goes into a metastable state, making the zero and one states indistinguishable from each other. But in asymmetric cells like the 7T-C cell, one state is more stable than the other, and the cell flips to this state when the supply is lowered below the DRV. Therefore, asymmetric cells have the disadvantage of having higher DRVs. Fig. 14 shows the contents of the proposed 7T cell, taking into consideration both “zero” and “one” hold states as the power supply is lowered. At the DRV point, the values flip, destroying the data stored in the cell.

The DRV point is essentially when the static noise margin of the cell becomes zero, assuming no noise. The static hold noise margin, measured using the conventional Seevinck method, is quite pessimistic as it assumes infinite time duration for sources of noise. Nonetheless, it gives a fair idea for comparison between the noise margins and similarly DRV of different cells. Fig. 15 shows the butterfly curve for the proposed cell in hold state. The curve was obtained after a 20 000-point MC simulation under 6σ global process and 1σ local mismatch variations. As seen in Fig. 15, one VTC curve undergoes
larger variations than the other. This is due to the absence of a pull-down nMOS at the X node in the proposed cell. Due to the asymmetric architecture of the proposed cell, one lobe of the butterfly curve is larger than the other. The side length of the smaller lobe determines the noise margin because the smaller lobe deflates more quickly to the point of collapse as the supply is lowered. In contrast, the lobes of the butterfly curve for symmetrical cells like the conventional 6T and 8T cells are similar in size. In case of the asymmetrical cells such as the 7T-C cell, the difference in lobe size is improved further by increasing the transistor sizing, thereby considerably increasing the cell area and power, at which point alternate options open up. Also, unlike the proposed cell, this approach of resizing transistors works with limited efficacy. The smaller lobe for the 7T-C cell becomes nonexistent very quickly (at around 370 mV) while the proposed 7T cell maintains a butterfly curve even at 305 mV. The DRV for the proposed 7T cell is thus improved by about 70 mV over the 7T-C cell.

Fig. 14. DRV for the proposed 7T cell at 72 °C under 6σ global process variations (10 000-point MC simulation).

Fig. 15. Butterfly curve for the proposed cell at 0.4-V supply voltage under 6σ global process and 1σ local mismatch variations (20 000-point MC simulation).

The DRV of the proposed 7T cell and other conventional cells is shown in Fig. 16. The DRV was measured under 6σ global process variations for a 10 000-point MC simulation at 72 °C, since DRV degrades with increase in temperature. As seen in Fig. 16, the proposed 7T cell provides large improvement in DRV over the 7T-C cell.

Fig. 16. Worst-case DRV for different cells at 72 °C under 6σ global process variations (10 000-point MC simulation).

B. Hold Power
Since the SRAM array remains in hold state during most of its operating time, it is necessary to reduce its power consumption during that state. This makes the reduction of leakage currents that make up the hold state power consumption, the primary area of concern.

Fig. 17. (a) Read port leakage in conventional 8T cell. (b) Read port leakage in proposed cell.

The leakage current through the read port in dual-port cells is a concern due to the hazard of unwanted pull down of the RBL due to the combined leakage current effect in a single SRAM column. In the worst case scenario, when the entire array stores “zero,” the nMOS (R2) in the read port is enabled as shown in Fig. 17(a). Both the source and drain terminals of R2 are low and thus gate tunneling leakage is at its maximum. The ST leakage current through R1 is also greatest. To help lower this flow of leakage currents, the reconfiguration of the read port of 7T-C and 8T cells, shown in Fig. 17(b), was implemented in [14]. The improvement was realized due to the reduction in the voltage difference between drain and source of the data driven nMOS R1. However, in this paper, the effect of the read assist has been noted on the
same modified read port as well. With the help of the read assist, R2 can be made longer without affecting much of its drive capability even when operating in ST region, thereby greatly reducing the ST leakage current. Fig. 18 shows the leakage current comparison between the conventional 8T cell, 8T cell with modified read port and the proposed 7T cell with read assist for varying supply voltages and temperatures. As evident from Fig. 18, the modified 8T cell improves greatly upon the conventional 8T cell. However, the proposed cell, with the help of the read assist is able to further reduce the ST leakage and gate tunneling leakage by many folds.

Fig. 18. Comparison of (a) ST leakage in read port and (b) gate tunneling leakage for the data driven nMOS in 8T cell, 8T cell with modified read port and the proposed 7T cell.

The combined effect of the novel topology of the cell and the proposed read assist technique help the cell reach a lower operating point and thus help reduce static power consumption. Fig. 19 shows the comparison of hold power for different SRAM cells when they store zero and one. The power consumed by each cell at its respective DRV point and the percentage improvement in the proposed 7T cell with respect to the 7T-C cell is also shown. As evident from Fig. 19, the proposed 7T cell consumes much lower power than the conventional 6T and 8T cells during hold “zero.” There is also large improvement over the 7T-C cell because unlike the minimally sized PDR transistor of the proposed 7T, the PDR transistor of the 7T-C cell is too wide, which increases ST leakage during hold “zero.” The X node in the proposed cell rises by a very small voltage during the hold “zero” state, thereby helping reduce leakage current through the PUL pMOS. However, the small increase in the X node voltage also results in an increase in the leakage current through the PUR-PDR inverter. This increases the leakage power consumption during the hold “zero” state. It can also be observed in Fig. 19(a) that the direct consequence of this effect is the steady decline in percentage improvement in leakage power consumption of the proposed 7T cell over other cells as the supply voltage is increased. Despite the slight increase in leakage current through the PUR-PDR inverter, the proposed 7T cell consumes less power in comparison to other cells as shown in Fig. 19(a).
The hold “one” state is when the proposed cell consumes more power than the 7T-C cell. During the hold “one” state, leakage current flows through the access transistor and the DL nMOS, increasing the overall hold “one” power. However, the difference in power in this state is not much due to the enhanced DRV point of the proposed 7T cell over the 7T-C cell. The proposed 7T cell also consumes about the same power as the conventional 6T cell in this hold state. Overall, the proposed 7T cell with read assist helps achieve lower power consumption than other conventional cells.

Fig. 19. Comparison of hold. (a) “Zero” power. (b) “One” power for different cells.

VI. CELL LAYOUT AND AREA
Conventional SRAM cells like the 6T and 8T cells occupy lower area and provide higher write margin and lower leakage power consumption when their pull down to access transistor strength ratio is decreased [14]. Therefore, the proposed 7T cell with its enhanced write ability has been compared to 6T and 8T cells with lower pull down to access transistor strength ratios. The aspect ratios of transistors in the asymmetric 7T-C cell have to be decided while taking write ability into consideration. This is achieved by increasing the trip point of one inverter and decreasing the trip point of the other inverter in the 7T-C cell. While taking these criteria
into consideration, the aspect ratios of transistors for all cells included in this paper have been mentioned in Table I.

TABLE I. CELL DIMENSIONS AND AREA

Fig. 20. Layout of (a) 6T, (b) proposed 7T, (c) 7T-C, and (d) 8T cell.

Fig. 21. Layout area comparison of different SRAM cells on the 32-nm technology node.

For SRAM cells, the layout poses design challenges such as the evaluation of appropriate aspect ratio of cell to decrease bit/word line parasitic capacitances for maximum noise margin and minimal power and delay. Due to the lithography problems faced with decreasing technology nodes, the lithographically symmetrical (LS) layout for the 6T SRAM cell was proposed in [20]. This LS-cell, as shown in Fig. 20(a), avoids bends in its layout to avoid mask misalignment. It also reduces effective cell width and reduces bitline capacitance at the expense of longer word lines. It also provides minimal area among other layouts for the 6T cell and thus has been the industry standard since 65 nm. Fig. 20 shows the cell layout for different SRAM cells, drawn in a similar fashion to the LS-cell in order to achieve maximum performance. The cells have been drawn using lambda-based design rules on the 32-nm technology node. Further reduction for a more compact cell can be realized using stricter micrometer rules. As seen in Fig. 19, the 6T cell occupies an area of 0.2384 μm² and the 8T cell occupies an area of 0.3369 μm². The layout for the 7T-C cell is shown in Fig. 20(c). It occupies a greater area of 0.3629 μm² (1.43× the proposed 7T cell) because it requires larger transistors to maintain ratios for proper functionality. On the other hand, the proposed 7T cell occupies a smaller area of 0.254 μm². Thus, as shown in Fig. 21, the proposed 7T cell occupies 30% less area than the 7T-C cell and makes for a compact SRAM array.
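The area comparison in this section can be checked with a few lines of arithmetic. The Python sketch below simply recomputes the ratios from the four areas quoted in the text; the dictionary and helper names are illustrative assumptions and do not come from the paper.

```python
# Cell areas in square micrometres, copied from the text (32-nm node).
# AREA_UM2, area_ratio and percent_saving are illustrative names only.
AREA_UM2 = {
    "6T": 0.2384,
    "proposed 7T": 0.254,
    "7T-C": 0.3629,
    "8T": 0.3369,
}


def area_ratio(cell: str, reference: str) -> float:
    """Return the area of `cell` divided by the area of `reference`."""
    return AREA_UM2[cell] / AREA_UM2[reference]


def percent_saving(cell: str, reference: str) -> float:
    """Return how much smaller `cell` is than `reference`, in percent."""
    return 100.0 * (1.0 - area_ratio(cell, reference))


if __name__ == "__main__":
    print(f"7T-C / proposed 7T: {area_ratio('7T-C', 'proposed 7T'):.2f}x")
    print(f"proposed 7T vs 7T-C: {percent_saving('proposed 7T', '7T-C'):.1f}% smaller")
    print(f"proposed 7T vs 8T: {percent_saving('proposed 7T', '8T'):.1f}% smaller")
    print(f"proposed 7T / 6T: {area_ratio('proposed 7T', '6T'):.3f}x")
```

With these inputs the 7T-C cell comes out at about 1.43 times the proposed cell, which corresponds to the roughly 30% saving stated in the text. The same figures place the proposed cell about 25% below the 8T cell and about 6.5% above the 6T cell.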
VII. CONCLUSION
In this paper, we presented a novel 7T cell, capable of performing a write operation in the ST region down to 400 mV for low-power applications. We compared the drastically improved static and dynamic write ability of the proposed 7T cell with the 7T-C cell, as well as with other conventional cells. The single-ended read operation which degrades severely in ST region was also improved by implementing a novel read assist technique. The read assist improved the performance of read operation in ST region by almost a factor of three. The assist also helped to improve the ION/IOFF ratio, which helps create larger SRAM arrays. It also helped reduce leakage currents through the read port, thereby reducing the static power consumption. The novel topology of the proposed cell improved the DRV point by about 70 mV over the 7T-C cell. The proposed 7T cell also provided a very low power “zero” hold state and on the whole, helped achieve lower static power consumption in comparison to other conventional cells. Overall, the proposed 7T cell improved on various performance parameters while simultaneously decreasing the area per bit cell by 30% in comparison to the 7T-C cell on the 32-nm technology node.

REFERENCES
[1] S. O. Toh, Z. Guo, T.-J. K. Liu, and B. Nikolic, “Characterization of dynamic SRAM stability in 45 nm CMOS,” IEEE J. Solid-State Circuits, vol. 46, no. 11, pp. 2702–2712, Nov. 2011.
[2] B. H. Calhoun and A. P. Chandrakasan, “Static noise margin variation for sub-threshold SRAM in 65-nm CMOS,” IEEE J. Solid-State Circuits, vol. 41, no. 7, pp. 1673–1679, Jul. 2006.
[3] B. H. Calhoun, A. Wang, and A. Chandrakasan, “Modeling and sizing for minimum energy operation in subthreshold circuits,” IEEE J. Solid-State Circuits, vol. 40, no. 9, pp. 1778–1786, Sep. 2005.
[4] A. Sheikholeslami, “Process variation and Pelgrom’s law [circuit intuitions],” IEEE Solid-State Circuits Mag., vol. 7, no. 1, pp. 8–9, Feb. 2015.
[5] V. P.-H. Hu, M.-L. Fan, P. Su, and C.-T. Chuang, “Analysis of GeOI FinFET 6T SRAM cells with variation-tolerant WLUD read-assist and TVC write-assist,” IEEE Trans. Electron Devices, vol. 62, no. 6, pp. 1710–1715, Jun. 2015.
[6] S. Mukhopadhyay, R. M. Rao, J.-J. Kim, and C.-T. Chuang, “SRAM write-ability improvement with transient negative bit-line voltage,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 1, pp. 24–32, Jan. 2011.
[7] M. Yabuuchi et al., “16 nm FinFET high-k/metal-gate 256-kbit 6T SRAM macros with wordline overdriven assist,” in IEDM Tech. Dig., Dec. 2014, pp. 3.3.1–3.3.3.
[8] A. J. Bhavnagarwala et al., “A sub-600-mV, fluctuation tolerant 65-nm CMOS SRAM array with dynamic cell biasing,” in IEEE Symp. VLSI Circuits Dig., Nov. 2007, pp. 78–79.
[9] M. Yabuuchi, K. Nii, Y. Tsukamoto, S. Ohbayashi, Y. Nakase, and H. Shinohara, “A 45 nm 0.6 V cross-point 8T SRAM with negative biased read/write assist,” in Proc. IEEE Symp. VLSI Circuits, Jun. 2009, pp. 158–159.
[10] Y.-H. Chen et al., “A 16 nm 128 Mb SRAM in high-κ metal-gate FinFET technology with write-assist circuitry for low-VMIN applications,” in IEEE ISSCC Dig. Tech. Papers, Sep. 2014, pp. 238–239.
[11] M. Yamaoka et al., “Low-power embedded SRAM modules with expanded margins for writing,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2005, pp. 480–481.
[12] S. Nalam and B. H. Calhoun, “5T SRAM with asymmetric sizing for improved read stability,” IEEE J. Solid-State Circuits, vol. 46, no. 10, pp. 2431–2442, Oct. 2011.
[13] S. Borkar, “Design challenges for 22 nm CMOS and beyond,” in IEDM Tech. Dig., Dec. 2009, p. 1.
[14] S. A. Tawfik and V. Kursun, “Low power and robust 7T dual-Vt SRAM circuit,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2008, pp. 1452–1455.
[15] K. Takeda, H. Ikeda, Y. Hagihara, M. Nomura, and H. Kobatake, “Redefinition of write margin for next-generation SRAM and write-margin monitoring circuit,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2006, pp. 630–631.
[16] N. Gierczynski, B. Borot, N. Planes, and H. Brut, “A new combined methodology for write-margin extraction of advanced SRAM,” in Proc. IEEE Int. Conf. Microelectron. Test Struct. (ICMTS), Mar. 2007, pp. 97–100.
[17] J. Wang, S. Nalam, and B. H. Calhoun, “Analyzing static and dynamic write margin for nanometer SRAMs,” in Proc. 13th Int. Symp. Low Power Electron. Design (ISLPED), New York, NY, USA, 2008, pp. 129–134.
[18] H. Jeong, T. Kim, T. Song, G. Kim, and S.-O. Jung, “Trip-point bit-line precharge sensing scheme for single-ended SRAM,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 7, pp. 1370–1374, Jul. 2015.
[19] N. Edri, S. Fraiman, A. Teman, and A. Fish, “Data retention voltage detection for minimizing the standby power of SRAM arrays,” in Proc. IEEE 27th Conv. Elect. Electron. Eng. Israel (IEEEI), Nov. 2012, pp. 1–5.
[20] K. Osada et al., “Universal-Vdd 0.65–2.0-V 32-kB cache using a voltage-adapted timing-generation scheme and a lithographically symmetrical cell,” IEEE J. Solid-State Circuits, vol. 36, no. 11, pp. 1738–1744, Nov. 2001.

Shourya Gupta (S’17) was born in New Delhi, India, in 1994. He is currently pursuing the B.Tech. degree in electronics and communication engineering with Guru Gobind Singh Indraprastha University, New Delhi.
His current research interests include the design of low-power logic and static random access memory circuits in emerging and exploratory technologies.

Kirti Gupta (M’15) received the B.Tech. degree in electronics and communication engineering from the Indira Gandhi Institute of Technology, New Delhi, India, in 2002, the M.Tech. degree in information technology from the School of Information Technology, New Delhi, in 2006, and the Ph.D. degree in electronics and communication engineering from Delhi Technological University, New Delhi, in 2017.
From 2002 to 2008, she was a Lecturer with the Bharati Vidyapeeth’s College of Engineering, New Delhi, where she has been an Associate Professor since 2007. She has authored more than 45 technical papers in various international conferences and journals. Her current research interests include digital integrated circuit design.
Dr. Gupta is a Life Member of ISTE.

Neeta Pandey (M’04–SM’14) received the M.E. degree in microelectronics from the Birla Institute of Technology and Sciences, Pilani, India, and the Ph.D. degree from Guru Gobind Singh Indraprastha University, New Delhi, India.
She has served in Central Electronics Engineering Research Institute, Pilani; Indian Institute of Technology, New Delhi; Priyadarshini College of Computer Science, Noida; and Bharati Vidyapeeth’s College of Engineering, New Delhi in various capacities. She is currently an Associate Professor with the Electronics and Communication Engineering Department, Delhi Technological University, New Delhi. She has authored more than 150 technical papers in reputed national and international conferences and journals. Her current research interests include analog and digital VLSI design.
Dr. Pandey is a Life Member of ISTE.
