
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 47, NO. 1, JANUARY 2012

A 62 mV 0.13 µm CMOS Standard-Cell-Based Design Technique Using Schmitt-Trigger Logic
Niklas Lotze, Student Member, IEEE, and Yiannos Manoli, Senior Member, IEEE
Abstract: Supply voltage reduction beyond the minimum
energy per operation point is advantageous for supply voltage
constrained applications, but is limited by the degradation of
on-to-off current ratios with decreasing supply. In this work,
we show that the effective on-to-off ratio can be considerably
improved by the use of Schmitt Trigger structures, which effectively reduce the leakage from the gate output node and thereby
stabilize the output level. A method for applying this concept
to general logic is presented. Design rules concerning transistor
sizing, gate selection and layout necessary to further minimize the
required supply voltage are outlined and applied to the design
of a chip implementing 8 × 8 bit multipliers as test structures.
The only custom design step is the creation of the Schmitt Trigger
standard-cell library; otherwise a regular digital tool chain is used.
The multipliers exhibit full functionality down to supply voltages
of 84 mV to 62 mV, depending on the area overhead invested. No
process or post-silicon tuning like body biasing is used. At the
minimum possible supply voltage of 62 mV, a power consumption
of 17.9 nW at an operation frequency of 5.2 kHz is measured for
an 8 × 8 bit multiplier.
Index Terms: Sub-threshold, ultra-low voltage logic, low power,
Schmitt trigger, process variations.
I. INTRODUCTION
THE ongoing demand for reduction in energy consumption
has motivated the design of sub-threshold digital circuits,
which were shown to be reliable even for complex systems
[1]-[3] and memories [4]-[7]. Environmental monitoring or
emerging miniaturized energy-autonomous systems e.g., for
healthcare [8], [9] especially require circuits with extremely
tight energy budgets to enable functionality at practical device sizes, motivating further research in this direction. The
discussion on sub-threshold design mostly focuses on circuits
optimized for the minimum energy per operation point [10],
[11], but there are applications where functionality at supply
voltages $V_{DD}$ below this point is advantageous. This is
especially true for systems where only low supply voltages
are available, i.e., systems powered by energy harvesting devices, e.g., thermoelectric generators [12] or fuel cells [13],
[14]. In these applications, the minimum $V_{DD}$ required by the
electronics often defines the instant when an active operation
can start [15], [16], making $V_{DD}$ minimization worthwhile even at the
cost of additional area, energy per operation or leakage. Furthermore, power consumption keeps on decreasing
with $V_{DD}$ beyond the minimum energy per operation point.
This can, e.g., be exploited in systems where most components are in sleep mode, but which have some blocks that
cannot be switched off, e.g., wake-up/surveillance circuits or
state-holding elements. Reducing the system $V_{DD}$ to the very
minimum $V_{DD}$ required to keep the active blocks operational
can herein yield considerable standby power reductions [17].
Efficient DC-DC conversion to low voltage and output power
levels is critical in this context, but some efficient approaches
for sub-threshold circuits have been demonstrated [18], [19].

Manuscript received May 23, 2011; revised June 27, 2011; accepted July 26,
2011. Date of publication October 31, 2011; date of current version December
23, 2011. This paper was approved by Guest Editor Alice Wang.
N. Lotze is with the Fritz Huettinger Chair of Microelectronics, Department
of Microsystems Engineering (IMTEK), University of Freiburg, Freiburg
79110, Germany (e-mail: lotze@imtek.de).
Y. Manoli is with the Fritz Huettinger Chair of Microelectronics, Department
of Microsystems Engineering (IMTEK), University of Freiburg, Germany, and
also with the Hahn-Schickard Gesellschaft Institute of Micromachining and Information Technology (HSG-IMIT), Villingen-Schwenningen, Germany.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSSC.2011.2167777

Various approaches to reduce $V_{DD}$ requirements have been
demonstrated: An early work [20] is implemented in an optimized process with near-zero threshold devices, achieving
supply voltages of 125 mV. An FFT processor, presented in
[21], operates at 180 mV due to careful transistor sizing for the
gates combined with topology optimizations. A memory with
a $V_{DD}$ of 135 mV is demonstrated in [22], using a dynamic
forward body biasing scheme that is, however, not applicable to general logic. Body biasing [23] is often employed to mitigate global
variations, e.g., in [24] for a microcontroller fully functional
down to 160 mV (210 mV without body biasing). A body
biasing scheme optimized for $V_{DD}$ minimization [25] makes
the relative strengths of PMOS and NMOS blocks equal (an
approach also proposed in [26]), achieving supply voltages
of 85 mV for an FIR filter (160 mV without body biasing).
A drawback of body biasing, though, is the required supply
voltage for bias generation, which is usually higher than the
digital $V_{DD}$, limiting its usefulness for supply-voltage limited
applications.
The fundamental limit for supply voltage reduction in digital
circuits is the excessive degradation of the transistor on-to-off
current ratio as $V_{DD}$ decreases. The sub-threshold drain current
is given by [27]

$$I_D = I_0 \, e^{\frac{V_{GS} - V_{th} + \eta V_{DS}}{n V_T}} \left(1 - e^{-\frac{V_{DS}}{V_T}}\right) \qquad (1)$$

with the threshold voltage $V_{th}$, DIBL coefficient $\eta$, sub-threshold ideality factor $n$, thermal voltage $V_T = kT/q$ and the
constant $I_0$ summarizing factors setting the transistor
strength. This can be used to derive the on-to-off current ratio
$I_{on}/I_{off} = e^{V_{DD}/(n V_T)}$, exhibiting an exponential decrease with $V_{DD}$. The ratio between the maximum current that
can be delivered from the active block (NMOS or PMOS part)
of a CMOS gate and the leakage through the complementary
block therefore is reduced to a point where the output levels
start to deviate from the ideal $V_{DD}$ and GND levels [28], finally
resulting in a logic failure when the fan-out logic can no longer
correctly interpret the logic levels.

0018-9200/$26.00 © 2011 IEEE

Fig. 1. (a) Modeling of gate chain with feedback configuration. (b) The VTCs of the gate pair are used in butterfly plots to determine SNMs. (c) A Monte Carlo simulation (5000 runs) gives the SNM distribution to extract failure probabilities.

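
To make the exponential degradation concrete, the following minimal Python sketch evaluates the on-to-off current ratio of a transistor whose sub-threshold current depends exponentially on the gate-source voltage, comparing V_GS = V_DD (on) with V_GS = 0 (off) at equal drain-source voltage. The ideality factor `n = 1.5` and the temperature of 300 K are assumptions chosen for illustration, not values taken from this work.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K
Q_E = 1.602176634e-19   # elementary charge in C


def thermal_voltage(temp_kelvin=300.0):
    """Thermal voltage kT/q in volts."""
    return K_B * temp_kelvin / Q_E


def on_off_ratio(vdd, n=1.5, temp_kelvin=300.0):
    """I_on/I_off for V_GS = V_DD (on) versus V_GS = 0 (off) at equal V_DS.

    With I_D proportional to exp(V_GS / (n * V_T)), all terms that depend
    only on V_DS cancel, leaving exp(V_DD / (n * V_T)).
    The default n = 1.5 is an assumed, illustrative value.
    """
    return math.exp(vdd / (n * thermal_voltage(temp_kelvin)))


if __name__ == "__main__":
    for vdd_mv in (400, 200, 100, 62):
        ratio = on_off_ratio(vdd_mv / 1000.0)
        print(f"V_DD = {vdd_mv:3d} mV -> I_on/I_off = {ratio:.1f}")
```

Under these assumed parameters the ratio falls to roughly 5 at 62 mV, which illustrates why output levels can no longer be taken for granted at such supply voltages.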
In this work, we therefore propose to improve the on-to-off
current ratio of gates by the use of Schmitt Trigger (ST) structures, effectively reducing the leakage current drawn from the
critical output node. It should be noted that the proposed technique does not reduce the absolute value of leakage, but rather
shifts leakage paths to make them less critical in terms of output
voltage deviation. Similar uses of ST structures have been proposed before: So-called T-structures in [29] are used for leakage
suppression in charge-based analog circuits, static noise margin
in dynamic gates is optimized by applying ST structures in the
evaluation blocks [30] and an SRAM operational at 160 mV
uses ST structures in the inverter NMOS blocks [31].
Major contributions of this work are the concept of applying
the ST structure to general CMOS gates with the aim of supply
voltage minimization and its successful implementation in a
standard 0.13 µm CMOS process. The minimum supply voltage
achieved for the 8 × 8 bit multipliers used as test structures is
62 mV, which is to the authors' knowledge the lowest supply
voltage reported to date for CMOS digital circuits implementing
general logic (i.e., logic more complex than inverter chains). No
post-silicon optimizations like body biasing have been applied.
It should be noted that this work aims to explore the minimum
possible supply voltages for digital circuits in a standard silicon
CMOS technology. Therefore several overheads are accepted
which will not be beneficial in all possible applications. The
corresponding trade-offs are mentioned throughout the paper
though and would need to be evaluated for a specific implementation. The rest of the paper is organized as follows: Section II
introduces the modeling concept used to estimate the minimum
$V_{DD}$ achievable with a given set of gates, motivating crucial
design rules for supply voltage minimization. Section III explains the concept of leakage quenching in ST structures and
highlights properties which are important for implementation.

Optimizations for supply voltage minimization are discussed in
Section IV, followed by a description of the implemented test
chip in Section V and measurement results in Section VI.
II. MINIMUM SUPPLY VOLTAGE MODELING
Two fundamental problems make it challenging to predict
the supply voltage where a digital circuit will fail: First, a
logic failure can usually not be attributed to a single gate,
but rather occurs by a gradual level degradation over multiple
gates. Second, in relation to the first point, it is impossible to
predict how much output level degradation of a gate is critical,
as any level is acceptable as long as it is evaluated correctly by
the fan-out logic. To address these points, we use a modeling
approach proposed in [18] and [32], where cross-coupled gate
pairs are used to mimic a gate chain of infinite length (Fig. 1(a))
and analyzed in a similar way as the stability analysis in
memory cells. The amount of headroom to a failure is measured with the worst-case static noise margin (SNM) present
in the gate pair, which can be determined using the DC noise
source configuration shown in Fig. 1(a), or equivalently from
a butterfly plot shown in Fig. 1(b). Here the voltage transfer
curve (VTC) of the first gate is plotted combined with the
inverse VTC of the second one. Positive SNM corresponds to
the existence of two areas entirely enclosed by the VTC curves,
corresponding to the two stable states of the gate pair. The
SNM values are the side lengths of the largest squares which
can be inscribed into the areas (Fig. 1(b)).
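
The largest-square construction can be sketched numerically. The example below is a minimal illustration, not the extraction flow used in this work: it assumes idealized, monotonically falling VTCs with a tanh shape (the helper `tanh_inverter` and its `gain` parameter are hypothetical) and finds, for each 45° diagonal y = x + c, the square whose opposite corners lie on the VTC of the first gate and on the mirrored VTC of the second gate.

```python
import math


def tanh_inverter(vdd, gain):
    """Idealized inverter VTC with the given mid-point gain (assumed shape)."""
    half = vdd / 2.0

    def vtc(vin):
        return half * (1.0 - math.tanh(gain * (vin - half) / half))

    return vtc


def _root_decreasing(func, lo, hi, iterations=80):
    """Bisection for a strictly decreasing function with func(lo) > 0 > func(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def square_side(vtc1, vtc2, vdd, c):
    """Signed side length of the square whose diagonal lies on y = x + c.

    Curve 1 is y = vtc1(x); curve 2 is the mirrored VTC x = vtc2(y).
    """
    x1 = _root_decreasing(lambda x: vtc1(x) - x - c, -vdd, 2.0 * vdd)
    y2 = _root_decreasing(lambda y: vtc2(y) - y + c, -vdd, 2.0 * vdd)
    x2 = y2 - c
    return x1 - x2


def worst_case_snm(vtc1, vtc2, vdd, steps=400):
    """Smaller of the two lobe SNMs; 0 means no bistable region exists."""
    offsets = [vdd * k / steps for k in range(steps + 1)]
    upper = max(square_side(vtc1, vtc2, vdd, c) for c in offsets)
    lower = max(-square_side(vtc1, vtc2, vdd, -c) for c in offsets)
    return min(upper, lower)


if __name__ == "__main__":
    vdd = 0.1
    for gain in (0.8, 2.0, 5.0, 10.0):
        inv = tanh_inverter(vdd, gain)
        snm_mv = 1000.0 * worst_case_snm(inv, inv, vdd)
        print(f"gain {gain:4.1f}: worst-case SNM = {snm_mv:.2f} mV")
```

For a mid-point gain below one the two curves enclose no area and the sketch returns zero, mirroring the statement that a positive SNM requires two enclosed regions.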
For gates with more than one input, multiple possible VTCs
exist depending on the input configuration. In [18], [32], this
problem is solved by considering only gate combinations which
are expected to be critical due to their topology, also resulting
in an obvious choice for the critical VTC (e.g., the combination NAND2 and NOR2 with one delivering a weak low and the
other a weak high value due to the stacked transistors in the corresponding output path). This approach fails for gates optimized
for low voltage operation though, as these usually use PMOS
and NMOS blocks equalized in drive strength. It is however
possible to typically find two worst-case VTCs corresponding
to the weakest pull-up/pull-down conditions in a gate, which
form a worst-case envelope for all VTCs that can occur. E.g.,
for a NAND2 gate, the worst-case pull-up VTC occurs for one
input stable at $V_{DD}$, the worst-case pull-down VTC for both inputs switching, corresponding to the VTC envelope depicted in
Fig. 2. For more complex gates using more than one transistor
size within the NMOS or PMOS block, it is also possible that
consideration of more than two worst-case VTCs is necessary to
account for different behaviors with regard to process variations.

Fig. 2. VTC envelope comprising all possible VTCs for NAND2 with corresponding worst-case input configurations.

Fig. 3. Illustration of leakage quenching in the Schmitt Trigger inverter. Graph shows simulated on-to-off current ratios for different topologies and the result expected from modeling (5).

To estimate the minimum $V_{DD}$ that can be achieved with
a given gate library, the worst-case VTCs of all gates are extracted, followed by a step to determine the worst-case SNM
value that occurs from checking all possible VTC combinations. This is possible in acceptable simulation time due to the
small number of gates in the libraries used here and because
VTC simulation and SNM calculation are separated instead of
using the simulation approach depicted in Fig. 1(a). Mismatch
and process variations are considered using Monte-Carlo (MC)
simulations, which are carried out at a set of supply voltages
(temperature is kept constant). The result is a worst-case SNM
distribution at each supply voltage; an example is shown in
Fig. 1(c). The occurrence of a failure corresponds to $\mathrm{SNM} < 0$.
The corresponding probability is extracted by approximating the
worst-case SNMs as normal distributions. This is not the real
distribution shape and slightly optimistic in terms of achievable
minimum $V_{DD}$ due to a less pronounced tail section, but sufficient for the estimation made here. As illustrated in Fig. 1(c), the
modeling yields the failure probability $P_{fail,set}$ for the number
of gates $N_{set}$ used in a single MC step for VTC extraction. To
apply this result to a circuit with $N_{circuit}$ gates, the acceptable
set failure probability is approximated as

$$P_{fail,set,acc} \approx (1 - Y) \, \frac{N_{set}}{N_{circuit}} \qquad (2)$$

with the desired circuit yield $Y$. The estimated minimum $V_{DD}$
is the supply voltage where $P_{fail,set}$ (which increases with decreasing $V_{DD}$) approaches $P_{fail,set,acc}$.
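
The failure-probability estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions: the worst-case SNM samples are treated as normally distributed (as in the text), gate sets are assumed to fail independently, and all function names, parameter names and sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def set_failure_probability(snm_samples_mv):
    """P(SNM < 0) with the worst-case SNM samples approximated as normal."""
    distribution = NormalDist(mean(snm_samples_mv), stdev(snm_samples_mv))
    return distribution.cdf(0.0)


def acceptable_set_failure_probability(target_yield, gates_per_set, gates_in_circuit):
    """First-order estimate of the tolerable failure probability per gate set.

    Assumes independent failures: yield = (1 - p) ** (N_circuit / N_set),
    which for small p gives p ~ (1 - yield) * N_set / N_circuit.
    """
    return (1.0 - target_yield) * gates_per_set / gates_in_circuit


def exact_acceptable_set_failure_probability(target_yield, gates_per_set, gates_in_circuit):
    """Same quantity without the first-order approximation."""
    return 1.0 - target_yield ** (gates_per_set / gates_in_circuit)


if __name__ == "__main__":
    samples_mv = [5.0, 10.0, 15.0]  # made-up SNM samples in mV
    print("set failure probability:", set_failure_probability(samples_mv))
    print("acceptable (approx.):   ", acceptable_set_failure_probability(0.99, 10, 1000))
    print("acceptable (exact):     ", exact_acceptable_set_failure_probability(0.99, 10, 1000))
```

Sweeping the supply voltage and comparing the two probabilities at each step would then give the estimated minimum supply voltage as the point where the extracted value reaches the acceptable one.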
III. SCHMITT TRIGGER LOGIC

Schmitt Trigger logic offers a very effective suppression of
leakage currents from the output node of a gate, which is best understood using the example of a Schmitt Trigger inverter with a
low input value as shown in Fig. 3. The ST inverter has the same
basic functionality as a normal inverter, with the two series-connected PMOS driving transistors
pulling the output voltage to a high value, whereas
the two series-connected NMOS driving transistors are off, though still conducting a
leakage current which may potentially degrade the output level
at low supply voltages as discussed in Section I. The improvement introduced by the ST structure is due to the effect of
the NMOS and PMOS feedback transistors, which also cause the hysteresis
typical for the Schmitt Trigger at super-threshold voltages. The
feedback transistor is active in the block where the driving transistors are off (i.e., the block which is leaking), as the gates of
the feedback transistors are controlled by the output Z. For a low input,
the NMOS feedback transistor pulls the middle node of the NMOS stack to a high potential. This forces
the drain-source voltage of the NMOS driving transistor connected to the output close to zero and, more importantly, its gate-source voltage into the negative region, reducing
the leakage through this transistor exponentially. The result is an effective
leakage quenching from the output node Z, consequently minimizing the output level degradation.
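
The quenching effect can be illustrated numerically with a simple sub-threshold current model. The sketch below is an assumption-laden illustration rather than the analysis of this work: DIBL and body effect are neglected, the output is assumed to sit at the full supply voltage, the ideality factor `n = 1.5`, the temperature and the relative transistor strengths `i0_outer` and `i0_feedback` are hypothetical, and the middle node voltage is found by bisection instead of a closed-form approximation.

```python
import math

V_T = 0.025852  # thermal voltage at about 300 K in volts (assumed temperature)


def subthreshold_current(i0, vgs, vds, n=1.5):
    """Sub-threshold drain current without DIBL; the threshold is folded into i0."""
    return i0 * math.exp(vgs / (n * V_T)) * (1.0 - math.exp(-vds / V_T))


def middle_node_voltage(vdd, i0_outer=1.0, i0_feedback=1.0, n=1.5):
    """Middle node voltage of the switched-off NMOS stack, found by bisection.

    Balance: the feedback transistor (gate and drain at V_DD, source at the
    middle node) supplies the current that the outer transistor (gate and
    source at GND, drain at the middle node) sinks. Leakage through the
    inner transistor is neglected.
    """
    lo, hi = 0.0, vdd
    for _ in range(100):
        vx = 0.5 * (lo + hi)
        i_feedback = subthreshold_current(i0_feedback, vdd - vx, vdd - vx, n)
        i_outer = subthreshold_current(i0_outer, 0.0, vx, n)
        if i_feedback > i_outer:
            lo = vx  # feedback still dominates, so the node settles higher
        else:
            hi = vx
    return 0.5 * (lo + hi)


def quenching_ratio(vdd, i0_outer=1.0, i0_feedback=1.0, n=1.5):
    """Leakage of a single off transistor divided by that of the inner ST transistor."""
    vx = middle_node_voltage(vdd, i0_outer, i0_feedback, n)
    single = subthreshold_current(1.0, 0.0, vdd, n)
    inner = subthreshold_current(1.0, -vx, vdd - vx, n)
    return single / inner


if __name__ == "__main__":
    for vdd_mv in (62, 100, 150, 200):
        vdd = vdd_mv / 1000.0
        print(f"V_DD = {vdd_mv:3d} mV: V_X = {1000 * middle_node_voltage(vdd):6.1f} mV, "
              f"leakage reduction = {quenching_ratio(vdd):8.1f}x")
```

In this simplified model a stronger feedback transistor raises the middle node and increases the leakage reduction, at the cost of the additional leakage paths discussed later in this section.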
For a more quantitative analysis, the ratio of the leakage current through a single transistor equivalent to the NMOS driving transistor connected to the output (with $V_{GS} = 0$, $V_{DS} = V_{DD}$) compared to the same transistor in the ST structure (with $V_{GS} = -V_X$, $V_{DS} = V_{DD} - V_X$, where $V_X$ is the middle node voltage) is calculated using (1), neglecting
DIBL for simplicity (with $n$ the sub-threshold ideality factor and $V_T$ the thermal voltage):

$$\frac{I_{single}}{I_{ST}} = e^{\frac{V_X}{n V_T}} \cdot \frac{1 - e^{-V_{DD}/V_T}}{1 - e^{-(V_{DD} - V_X)/V_T}} \qquad (3)$$

$V_X$ is derived by equating the currents through the NMOS driving transistor connected to GND and the NMOS feedback transistor (assuming leakage through the driving transistor connected to the output is negligible), again using (1), with the output at $V_{DD}$. A closed-form solution can be obtained with an approximation:

(4)

The constants in (4) represent the transistor strength of the driving and the feedback transistor (analogous to $I_0$ in (1)) and are impacted by the respective W/L ratio. Combining (3) and (4) gives

(5)

A plot comparing the on-to-off current ratio of the ST structure with that of a single transistor and a two-transistor stack is
given in Fig. 3 along with the result expected from (5), showing
that it approximates the general behavior well even though there
is an error due to neglecting DIBL.

A. Schmitt Trigger Gates
To exploit the advantages of the ST structure for the implementation of a low-voltage digital circuit, it is necessary to apply
this principle to more complex functions than simple inversion.
The fundamental requirement of the ST structure is the existence of a middle node within the NMOS and PMOS blocks
which can be tied to the required voltage for leakage quenching.
For correct operation, and to avoid shorts, both NMOS blocks
connected to this node have to be conducting when a low output
value is required and non-conducting otherwise. The same is
true for the PMOS blocks and a high output. The solution used
here (similar to that proposed in [30] for dynamic gates) is to
replace the NMOS and PMOS driving transistors in the ST
inverter with the corresponding NMOS/PMOS function blocks
of the gate function to be implemented, which allows an implementation of arbitrary functions as ST gates. For example,
to implement an ST NAND2 gate, the PMOS driving transistors
are replaced with the parallel PMOS structure and the NMOS
driving transistors with the series NMOS structure of a conventional NAND2 CMOS gate, as shown in Fig. 4.

Fig. 4. Derivation of a ST NAND2 gate from the Schmitt Trigger inverter.

Fig. 5. SNM plots for both possible states of a gate pair with hysteresis. Two VTC pairs occur due to the differing rise/fall VTCs for each gate. One of the SNMs in each pair is degraded due to the reduced ability to change state.

B. Hysteresis
Hysteresis is a characteristic effect of the Schmitt Trigger
operated at normal supply voltages, but is not present at very
low voltage operation. The existence of an instability is a
prerequisite for the occurrence of hysteresis, as otherwise
the VTC cannot differ between a rising and falling output
transition. For a falling transition, the instability occurs in
the feedback loop via the NMOS feedback transistor and the NMOS driving transistor connected to the output (the corresponding PMOS transistors for a rising
output transition) as follows: Lowering the output voltage
causes a reduction of the gate-source voltage of the feedback transistor, which
results in a reduced middle node voltage. This increases the current through the driving transistor connected to the output,
further reducing the output voltage. If this resulting change
is larger than the initial reduction of the output voltage (i.e., the loop gain exceeds one when opening the
loop at the output node for analysis), the process is self-amplifying,
resulting in a steep high to low output voltage transition in the
VTC at the input voltage where the loop instability condition
is first met. At low $V_{DD}$ though (typically below approx. 100
mV) this condition is never met due to the decreasing overall
amplification, therefore also no hysteresis occurs.
It is important to note though that an absence of hysteresis is
actually advantageous for $V_{DD}$ minimization, which becomes
obvious from the butterfly plots of a gate pair with hysteresis
as shown in Fig. 5. Due to the hysteresis-induced shift of the
VTCs, the worst-case SNM is reduced compared to a gate pair
where the VTCs are centered at $V_{DD}/2$. A more illustrative explanation is the fact that a gate with hysteresis requires a better
voltage level at its input to change its state (which needs to be
delivered by the driving gate), therefore the smaller SNMs in
Fig. 5 can be interpreted as an increased risk to be stuck in one
state. The SNM degradation due to hysteresis is also observed
from the fact that the SNM of conventional CMOS logic exceeds
that of ST logic at supply voltages above approx. 350-400
mV.
C. Output Level Deviations
As discussed above, leakage quenching on the critical
(output) node of the ST structure effectively reduces output
level deviations caused by leakage from the complementary
logic block. A new leakage path is introduced in the ST structure though, which is due to the leaking feedback transistor
in the active block (the PMOS feedback transistor in Fig. 3). The resulting output level
deviation is therefore due to the relative strength of the feedback transistor compared to the outer transistor (the PMOS feedback transistor and the PMOS driving transistor connected to $V_{DD}$
in Fig. 3), which appears similar to the situation in a standard
CMOS gate. The ST structure is nonetheless advantageous in
terms of level deviations as follows.
• In a standard CMOS gate, increasing PMOS block drive
strength improves the logic high output, but at the same
time degrades the possible low output level due to the increased PMOS leakage when switched off. The ST gates
considerably reduce this PMOS/NMOS interdependence
due to the reduced leakage from the switched-off block.
Instead, output level deviation is traded off against leakage
quenching efficiency, as a weak feedback transistor optimizes the first, and a strong feedback transistor the latter, as
seen from (5). A good compromise is found though using a
weaker feedback due to the high inherent efficiency of the
leakage quenching effect.
• Global variations mainly affect the relative NMOS to
PMOS drive strength. As output level deviations in an
ST gate primarily depend on the relative strength of two
NMOS or two PMOS blocks, instead of the relative NMOS
to PMOS strength (as is the case for standard CMOS), the
sensitivity to global variations is considerably reduced, as
shown in Fig. 6 for an inverter.
Regarding the ratio of the middle to the outer transistor (i.e., the driving transistor connected to the output
and the one connected to the supply rail in Fig. 3), the outer one is usually sized stronger as the
leakage from both the feedback and the complementary block
flows over this transistor when the block is on, making it the
more critical one.

Fig. 6. Impact of global variations on switching point (input voltage where output voltage is 50% $V_{DD}$) and high/low output level for ST and CMOS inverter, exhibiting the reduced sensitivity of the ST inverter to global variations. CMOS inverter is derived from ST by removing feedback transistors.

D. Leakage Currents
When comparing the leakage currents in an ST gate to the one
in a CMOS gate with equal drive-strength, the following can be
observed.
• The ST gate has an additional leakage path via the feedback
transistor in the active circuit block (the PMOS feedback transistor in Fig. 3).
• The leakage in the switched-off block is increased, which
becomes obvious as follows: In the transistor stack of the switched-off block
without the feedback transistor, the mid-point voltage
is such that the leakage current through both transistors is
equal. In order for the feedback to be effective, the feedback transistor
has to pull the mid-point voltage above this value to reduce the leakage
through the driving transistor connected to the output. This means though that leakage through the driving transistor connected to the supply rail
is increased due to the increased drain-source voltage.

Fig. 7. Comparison of ST and CMOS structures (derived from ST structure by removing feedback transistors) in terms of achievable on-to-off current ratios and leakage overhead (overall leakage in ST in multiples of leakage in CMOS) for different sizing configurations at $V_{DD} = 150$ mV. The leakage ratio is adjusted by the relative drive strength of the feedback to the driving transistors.

For both leakage paths, reducing the drive-strength of the feedback transistor reduces the leakage at the cost of lower ST efficiency. This trade-off is illustrated in Fig. 7, making it obvious
that a leakage overhead of at least approx. 2x is necessary for the
leakage suppression from the output node to be efficient. The ST
technique by itself therefore is no leakage current but a supply
voltage reduction technique.
IV. OPTIMIZATION FOR MINIMUM SUPPLY VOLTAGE
A. Equalization of VTC and Transistor Sizing
The relative size of the transistors in the ST has a great impact
on the achievable minimum $V_{DD}$. Apart from the ST-specific
sizing strategies discussed in Section III-C for minimization of
output deviation, the position of the VTCs is also crucial. As
obvious from the discussion in Section II, the worst-case SNM
is maximized if the worst-case VTC envelope is symmetric to
$V_{DD}/2$ (or slightly off this point if variability tends to shift the
VTC into one direction with higher probability). This condition
is similar to the equalization of NMOS and PMOS blocks discussed in other publications (e.g., [25], [26]). Sizing of any transistor in the ST structure is similarly effective to equalize the
VTC positions, with changing the strength of the driving transistors having a similar effect. Not so for the feedback transistor:
while upsizing an NMOS driving transistor in Fig. 3 shifts the VTC transition to
lower input voltages, upsizing the NMOS feedback transistor leads to higher ones.
Apart from the use of the design guidelines described above,
transistor dimensions for the ST gates have been optimized by
analyzing the most common reasons for failures occurring in
the minimum $V_{DD}$ estimation as described in Section II, and
a corresponding readjustment of transistor sizes. The resulting
dimensions for the NAND2 gate are shown in Table I, column
S1. In the technology used, the drive strength of a min-sized
NMOS is approx. 4x that of a min-sized PMOS at $V_{DD}$ = ... mV.
It should be noted that for NMOS transistors, increasing
channel length initially results in increased drive strength due
to a threshold voltage reduction caused by the reverse short
channel effect (RSCE) at sub-threshold voltages [33].

TABLE I
TRANSISTOR DIMENSIONS IN NAND2 GATES (MULTIPLES OF MIN. DIMENSIONS WITH W = 160 nm, L = 120 nm)

Fig. 8. Comparison of AOI21 and NAND2 in terms of worst-case P/N drive strength ratio and corresponding VTCs. It becomes obvious that the number of inputs, not the stack size, is relevant for the width of the worst-case VTC envelope. Calculations use equivalent resistances (inversely related to drive strength) and neglect transistors connected to $V_{GS}$ = GND. Unit drive strength for all transistors is assumed. The unusual representation of AOI21 is used to illustrate the optimum sizing for a minimum-width VTC envelope; transistor sizing instead of doubling would be used in an actual implementation.

Fig. 9. Gate-based DFF used to avoid tri-state structures in the implementation.

Fig. 10. Reduction of minimum $V_{DD}$ due to increases in transistor area used in gates. Transistor dimensions used for library implementations are marked as S1/S4/S16.

B. Gate Library
Apart from the design of the individual gates, the set of gates
used in a circuit greatly impacts its reliability at low supply
voltages. It has been pointed out in several publications that the
use of long transistor stacks degrades reliability (e.g., [1]). More
relevant for gates optimized for low voltage operation is the
number of inputs though. The discussion in Section II implies
that not only the position of the worst-case VTC envelope, but
also its width should be optimized, which is dependent on the
worst-case ratio of pull-up/pull-down drive strength. This value
depends on the number of inputs rather than stack size, which becomes obvious from the example shown in Fig. 8 (using CMOS
gates for simplicity, the situation is similar for ST gates), where
a NAND2 gate is compared to an AOI21 gate with the function
$Z = \overline{A \cdot B + C}$. Assuming transistors of equal drive strength,
and using simple series/parallel connection drive strength approximations, the resulting ratio between the strongest pull-up
and strongest pull-down input configuration is 4 for the NAND2
gate whereas it is 9 for the AOI21, which is equal to the result
for a NAND3 or NOR3 gate including 3-transistor stacks. The
corresponding simulated VTC envelopes are shown in Fig. 8,
again illustrating the similar result for the different 3-input gates.
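
The series/parallel drive strength approximation behind these numbers can be reproduced in a few lines. The sketch below assumes unit drive strength for every transistor and treats drive strength like a conductance (parallel strengths add, series strengths combine reciprocally); the helper names are hypothetical, and only plain NAND gates are covered since the specially drawn AOI21 of Fig. 8 is not reproduced here.

```python
def parallel(*strengths):
    """Drive strength of parallel-connected transistors (strengths add)."""
    return sum(strengths)


def series(*strengths):
    """Drive strength of series-connected transistors (reciprocals add)."""
    return 1.0 / sum(1.0 / s for s in strengths)


def nand_drive_ratio(num_inputs, unit=1.0):
    """Strongest pull-up over pull-down strength of a NAND gate with unit transistors."""
    pull_up = parallel(*[unit] * num_inputs)   # all PMOS transistors conducting
    pull_down = series(*[unit] * num_inputs)   # NMOS stack conducting
    return pull_up / pull_down


if __name__ == "__main__":
    for inputs in (2, 3):
        print(f"NAND{inputs}: pull-up/pull-down ratio = {nand_drive_ratio(inputs):.1f}")
```

The ratio grows with the square of the number of inputs in this approximation, which matches the values of 4 and 9 quoted for two-input and three-input gates.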
To minimize the achievable supply voltage, the gate library
therefore is limited to inverters, NAND2 and NOR2 gates and
flip-flops. Most CMOS flip-flop architectures use transmission
gates or tri-state structures to implement multiplexers. These
multiplexers are structurally equivalent to complex gates and
also exhibit similar VTC envelopes, resulting in the same
restrictions as the use of gates with more than two inputs. Therefore the
gate-based edge-triggered flip-flop architecture shown in Fig. 9
[34] is used. The gates used within the flip-flop are also Schmitt Trigger gates with a sizing equal to the library gates, but the
flip-flop is nonetheless provided as an additional standard-cell
to the synthesis and layout tools.
It should be noted that even though the extensive limitation of
library gates is clearly advantageous for $V_{DD}$ minimization, it increases the number of required gates and therewith the leakage.
Its usefulness is therefore highly application-dependent.
C. Impact of Random Variability
Despite an effective suppression of global variations in the
ST gates, transistor-to-transistor variations greatly limit supply
voltage reduction. The most dominant variability component is
random dopant fluctuations (RDF) [35] impacting the threshold voltage $V_{th}$, hence
causing severe discrepancies between designed and actual transistor drive strengths.
The only effective approach to reduce RDF is an increase in
transistor area, the effect of which has been analyzed with the
minimum $V_{DD}$ estimation approach described in Section II. The
results in Fig. 10 exhibit an almost linear relationship between
minimum $V_{DD}$ and $1/\sqrt{A}$ (with $A$ the transistor area used in the gates) in good accordance with Pelgrom's
Law [36]. The extra transistor area was used to either increase
transistor width only (width scaling in Fig. 10) or both transistor
length and width (square scaling in Fig. 10). The scaling strategy
has only minor impact on the minimum $V_{DD}$ (with slight advantages for square scaling due to the improved sub-threshold
swing in long-channel transistors); this degree of freedom can
therefore be used to trade off circuit speed against leakage.

Fig. 11. Layout of ST NAND2 standard cells in library S1/S4/S16. To limit systematic $V_{th}$ shifts, no transistor chaining is used (length of diffusion effect) and safety margins from well borders are kept (well proximity effect). The larger margin for NMOS is due to a stronger NMOS WPE in the process used. Graph shows area overhead compared to a NAND2 from a commercial standard cell library.

To test the impact of increasing transistor area in the implemented chip, three different gate libraries have been created,
called S1, S4 and S16 (Fig. 10). Square scaling is used to keep
leakage power consumption approximately constant and due to
its smaller layout area caused by lower transistor drain/source
area overheads. Transistor dimensions for the NAND2 gate in
each sizing are given in Table I. The given values result from
individual optimizations for each sizing necessary due to the
highly nonlinear dependencies between transistor dimensions
and drive strength at sub-threshold voltages.
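
The area dependence expected from Pelgrom's Law can be sketched as below. This is an illustration under explicit assumptions: the matching coefficient `a_vt_mv_um` is a hypothetical value, the library names S4 and S16 are taken to mean 4x and 16x the transistor area of S1, and ideal square scaling of the minimum dimensions from the Table I caption is used, whereas the actual library dimensions result from individual optimizations.

```python
import math

W_MIN_NM = 160.0  # minimum width from the Table I caption
L_MIN_NM = 120.0  # minimum length from the Table I caption


def sigma_vth_mv(width_nm, length_nm, a_vt_mv_um=4.0):
    """Pelgrom's law: sigma(V_th) = A_VT / sqrt(W * L).

    a_vt_mv_um (in mV*um) is a hypothetical matching coefficient.
    """
    area_um2 = (width_nm / 1000.0) * (length_nm / 1000.0)
    return a_vt_mv_um / math.sqrt(area_um2)


def square_scaled(area_factor):
    """Width and length after scaling both by sqrt(area_factor)."""
    scale = math.sqrt(area_factor)
    return W_MIN_NM * scale, L_MIN_NM * scale


if __name__ == "__main__":
    reference = sigma_vth_mv(W_MIN_NM, L_MIN_NM)
    for name, factor in (("S1", 1), ("S4", 4), ("S16", 16)):
        width, length = square_scaled(factor)
        relative = sigma_vth_mv(width, length) / reference
        print(f"{name}: W = {width:.0f} nm, L = {length:.0f} nm, "
              f"relative sigma(V_th) = {relative:.2f}")
```

Each quadrupling of the area halves the threshold voltage spread in this model, which is consistent with an approximately linear dependence of the minimum supply voltage on the inverse square root of the area.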
D. Standard-Cell Layout
Since layout-induced systematic variations degrade the minimum $V_{DD}$ similar to process variations, a careful layout considering these effects is indispensable. As can be seen in the layout
of the NAND2 gates in Fig. 11, safety margins from the well
borders are maintained to limit the impact of the well proximity
effect (WPE) [37], and no transistor chaining is used to avoid
threshold voltage shifts due to the length of diffusion (LOD) effect [37].
The resulting area overhead for a NAND2 gate compared
to a NAND2 in a commercial standard cell library ranges between 3x (S1) and 7.5x (S16) as depicted in Fig. 11. It should
be noted that the overhead required for the reduction of systematic variations could be avoided if a parasitic extraction modeling the corresponding $V_{th}$ shifts was available to adapt transistor strengths accordingly, which is not the case for the process
used here.
V. TEST CHIP
The structure of the implemented test chip is shown in Fig. 12.
8 × 8 bit multipliers using gate libraries S1, S4 and S16 are employed as test structures to determine the achievable minimum
supply voltages in a circuit of reasonable size. Additionally, four
identical blocks (SG0 to SG3) giving access to individual gates
for VTC extraction are implemented, each containing 16 sets
of all gates available in S1, S4 and S16. The reason to use four
blocks instead of a single large one is the improved floor planning flexibility.
The multiplier blocks contain a combinational feedback allowing operation in a ring-oscillator mode. It requires appropriate input vectors where an LSB flip in the B input results in
an MSB flip in the output. The
resulting delay is typically close to the worst-case delay, making
this mode useful for speed measurements. An additional feedback including a register is implemented for S16 to test the
flip-flops at the lowest supply voltages possible. A synchronous
reset presets the register to value 1, thus the complete structure
outputs the $n$-th power of the applied input operand after $n$ clock cycles as long as no overflow occurs.
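
Both test modes can be mimicked with a small behavioral model. The sketch below is only an illustration: it checks the stated input-vector condition (an LSB flip in B toggles the output MSB) for an unsigned 8 × 8 bit multiplication, and it models the register feedback under the assumption that the registered product is fed back to one 8-bit input, so that an overflow occurs once the product no longer fits into 8 bits. All function names are hypothetical.

```python
def product16(a, b):
    """16-bit result of an unsigned 8 x 8 bit multiplication."""
    return ((a & 0xFF) * (b & 0xFF)) & 0xFFFF


def lsb_flip_toggles_msb(a, b):
    """True if flipping the LSB of operand b toggles bit 15 of the product."""
    msb_before = (product16(a, b) >> 15) & 1
    msb_after = (product16(a, b ^ 1) >> 15) & 1
    return msb_before != msb_after


def find_ring_oscillator_vectors(limit=5):
    """Search for input vectors (a, b) that satisfy the LSB-to-MSB condition."""
    found = []
    for a in range(256):
        for b in range(0, 256, 2):
            if lsb_flip_toggles_msb(a, b):
                found.append((a, b))
                if len(found) == limit:
                    return found
    return found


def register_feedback(operand, cycles):
    """Multiplier with a register in the feedback path, preset to 1.

    Assumption: the registered product drives one 8-bit multiplier input,
    so the model raises once the product exceeds 8 bits.
    """
    register = 1
    for _ in range(cycles):
        register = product16(register, operand)
        if register > 0xFF:
            raise OverflowError("product exceeds the assumed 8-bit feedback width")
    return register


if __name__ == "__main__":
    print("candidate vectors:", find_ring_oscillator_vectors())
    print("3 applied for 5 cycles:", register_feedback(3, 5))
```

In this model the register content after a given number of cycles equals the corresponding power of the applied operand, which gives a simple reference value for checking the flip-flops.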
The input vectors to the multipliers are applied via down level
shifters to avoid any gate overdrive effects, which would not be
critical in terms of functionality, but might yield overly optimistic results for minimum $V_{DD}$.

Fig. 12. Top-level organization of test chip and details of multiplier blocks, test structures for individual gates and level shifters. AM are analog multiplexers, LQ-LS leakage quenching level shifters shown in Fig. 14(b).

Analog outputs are implemented to track the degradations at
the multiplier outputs and to record VTCs. To limit the number
of necessary output pads for the multipliers, analog multiplexers
are used as shown in Fig. 12, followed by two-stage source followers as output drivers. The additional analog input shown in
Fig. 12 is necessary to characterize amplification and bias level
of the output stages prior to the actual measurements.
The control logic block (labeled test/IO interface in Fig. 12)
implements a serial interface to set parameters, apply multiplier
input vectors, and read out multiplier results via up level shifters,
simplifying verification compared to the use of the analog outputs. Delay measurements are enabled by presetting the expected multiplier output value, with an output pad being set
when this value is detected via the level shifters.
A. Analog Multiplexers
The gates operating at low supply voltages exhibit a high output resistance, making the design of pass gates used within the analog multiplexers (AM in Fig. 12) critical, as their off resistance should still be orders of magnitude higher to avoid any interference between the gates to be measured. The pass gate therefore comprises 4 series-connected high-$V_{TH}$ transistors, which are controlled with a negative voltage of approx. −200 mV in the off state to further reduce leakage by a factor of approx. 40 compared to the application of GND. The negative supply may cause excess cross currents in the control logic as illustrated in Fig. 13(b). This is avoided by the use of a level-shifter structure shown in Fig. 13(a).
Fig. 13. (a) Structure used for implementation of analog multiplexers with reduced leakage. A negative supply voltage is used to switch the pass gates more effectively. (b) Illustration of excess cross currents occurring in control logic if no level shift structure is applied.
Fig. 14. (a) Conventional level shifter and (b) proposed implementation using ST structures. (c) Output $V_{DD}$ regions where functionality can be guaranteed for similar pull-up and pull-down drive strength in architectures (a) and (b).
B. Level Shifters
The level shifters at the output of the multipliers use a three-stage architecture as shown in Fig. 12. The first stage is especially critical due to the low input $V_{DD}$. There is a minimum and a maximum output $V_{DD}$ for the level shifter shown in Fig. 14(a): If the output $V_{DD}$ is too small, the PMOS drive currents are too weak relative to the leakage of the NMOS transistors; if it is too large, the NMOS transistors cannot pull down the middle nodes sufficiently for the level shifter state to flip. Reducing the input $V_{DD}$ makes the problem more severe, as the on-to-off current ratio of the NMOS transistors is reduced exponentially. Especially when considering process variations, the output supply voltage window where functionality can be guaranteed narrows considerably, and for low input supply voltages, no safe output supply is found as shown in Fig. 14(c).
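The existence of a lower and an upper bound on the output supply can be illustrated with a strongly simplified model. The sketch below is an assumption-laden toy, not the analysis used for Fig. 14(c): it assumes purely exponential sub-threshold currents, ignores DIBL and the drain-source voltage dependence, and uses hypothetical parameters (`strength_ratio` for the NMOS-to-PMOS current ratio at zero gate drive, `margin` for the required ratio between competing currents, and a slope factor `n`).

```python
import math


def output_vdd_window(v_in, strength_ratio, margin=2.0, n=1.5, v_t=0.026):
    """Return (v_min, v_max) for the output supply in volts, or None if no window exists.

    Toy conditions (all currents exponential in the gate drive):
      lower bound: PMOS on-current  >= margin * NMOS leakage
      upper bound: NMOS on-current  >= margin * PMOS on-current
    """
    slope = n * v_t
    v_min = slope * math.log(margin * strength_ratio)
    v_max = v_in + slope * math.log(strength_ratio / margin)
    if v_max <= v_min:
        return None
    return v_min, v_max


if __name__ == "__main__":
    for v_in in (0.20, 0.10, 0.03):
        print(v_in, output_vdd_window(v_in, strength_ratio=100.0))
```

In this toy model the window width equals the input supply minus a constant, so it shrinks as the input supply is lowered and eventually vanishes, which mirrors the qualitative behavior described above.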
An approach similar to the ST gates is therefore used: The driving transistors of the level shifter are replaced with corresponding ST structures, thereby greatly improving the on-to-off current ratio. As shown in Fig. 14(c), output $V_{DD}$ sensitivity is reduced considerably and the safe minimum input $V_{DD}$ is lowered by approx. 30 mV. Contrary to the ST gates, the feedback transistors in the NMOS blocks are connected to the inverted input signal instead of the gate output, as shown in Fig. 14(b), to avoid parameter shifts if the output $V_{DD}$ is changed. Nevertheless, long PMOS transistors are necessary to compensate the differences in drive strength between the NMOS and PMOS block. A combination of level shifter designs weakening the pull-up structure, as proposed e.g., in [38], [39], and the approach shown here might therefore be even more advantageous, but is left for future research.
C. Top-Level Layout
Blocks implemented as custom layout are the standard cells, input and output blocks, single gate blocks and the top level structure. In contrast, the multipliers are designed with a fully automatic standard digital tool chain, using a high-level VHDL source, synthesis (SYNOPSYS Design Compiler) and standard-cell based place&route (CADENCE Encounter). The only specific low-voltage optimization is the addition of guard rings around the multiplier blocks to limit the coupling of noise which might be present in the regular digital blocks. A die micrograph of the resulting test chip is shown in Fig. 15.
Fig. 15. Die micrograph of implemented chip and layout of S1.
VI. MEASUREMENT RESULTS
The described test chip has been implemented in a 0.13 µm standard CMOS process with … 250 mV. Out of ten chips delivered, nine are fully functional; one failed due to a bonding error. The accuracy of the statistical data shown in this section is therefore limited by the small sample size; it is nevertheless presented to illustrate general trends. Measurements are shown for maximum supply voltages of 300 mV, even though the ST logic is operational up to the maximum core $V_{DD}$ of the process. The peripherals, i.e., analog multiplexers and level shifters, are not optimized for high supply voltages though, and therefore severely limit performance in this region.
The intermediate supply voltages for the level shifters have been set to 160 mV and 460 mV, resulting in a measured safe operability down to input supply voltages of 33 mV, except for one outlier chip where the minimum input $V_{DD}$ is 54 mV.
A. Individual Gates
VTCs of the gates in blocks SG0-SG3 described in Section V have been extracted by applying slow triangular input signals at the gate inputs. The example VTCs for S1 and S16 in Fig. 16 illustrate the VTC degradation with decreasing $V_{DD}$. The disappearing hysteresis below approx. 100 mV also becomes apparent, which is furthermore verified in Fig. 17 where the maximum difference in … crossing points for the rising and falling edge is extracted. The measured VTCs can furthermore be used for an analysis similar to the minimum $V_{DD}$ estimation described in Section II by randomly choosing sets of measured VTCs (corresponding to the MC simulation), extracting the corresponding worst-case SNMs, and applying the statistical analysis described in Section II to the resulting SNM distributions. The results for the individual chips are shown in Fig. 18, exhibiting an approximately linear dependency between … and the minimum $V_{DD}$ estimation, which is similar to the simulation (Fig. 10). The minimum $V_{DD}$ vs. area slope for most chips is smaller than the simulated one, though. This can be explained by global/systematic variations being of greater importance than predicted by the simulation (resulting in the increased values at all transistor areas, i.e., also at large areas), combined with a lower impact of random variations (explaining the reduced slope, i.e., reduced values for small areas).
Fig. 16. Examples of measured VTCs in sizing S1 and S16 at $V_{DD}$ = 250 mV, 120 mV, 90 mV and 60 mV. The strongest pull-up and pull-down VTCs forming the worst-case VTC envelope are shown.
Fig. 17. Measured minimum and maximum (dotted: mean) difference in switching point from rising and falling input transition for S1/S4/S16, showing that no hysteresis occurs below $V_{DD}$ ≈ 100 mV.
Fig. 18. Estimated minimum $V_{DD}$ generated by extracting SNM distributions from measured VTC curves. Shown are results for individual chips (solid lines fitted to the points for S1, S4 and S16 of each chip) compared to simulation (dotted line).
Fig. 19. Measured degradation of output levels with decreasing supply voltage for the 16 multiplier output bits in S1 and S16, normalized to ideal $V_{DD}$ and GND levels.
B. Multiplier Output Levels
Typical plots for the $V_{DD}$ dependence of the multiplier output levels are shown in Fig. 19, exhibiting the behavior observed in most cases: The levels of the primary outputs first start to deviate gradually from the ideal values as $V_{DD}$ is decreased, until a steep transition occurs on one of the outputs at a well-defined supply voltage (85 mV for S1 and 61 mV for S16 in Fig. 19), corresponding to the first critical logic failure. This behavior is due to the failure being caused by a set of inner nodes followed by a path with high amplification to the output where it can be observed. Keeping the circuit close to the minimum supply voltage, a transition region is observed where the critical output is flipping randomly as it is controlled by circuit noise; this region is typically as small as 1 mV, though. As the supply voltage is reduced beyond this point, many failing outputs flip back to the correct logic value due to canceling failures.
TABLE II. SELECTED MEASUREMENT DATA FOR INDIVIDUAL CHIPS AT DIFFERENT SUPPLY VOLTAGES
C. Multiplier Minimum Supply Voltages
To find the minimum supply voltage of the multipliers, a proper input test vector set is required. As discussed in Section II, failures in digital circuits typically are not caused by a single gate, but by a critical set of gates. This makes the test challenging, as all possible gate set configurations need to be tested, corresponding to all possible input vectors. The multipliers therefore have been tested with both a full input vector set and a stuck-at test vector set comprising 24 vectors. The maximum difference in minimum $V_{DD}$ found is only 2 mV, indicating that the stuck-at test is sufficient for most applications. This result may not be applicable to different designs, though. The operability of the flip-flops has been verified by cycling a test vector set in the feedback loop including a register shown in Fig. 12, forcing each flip-flop into high and low state, typically resulting in a stuck-at error beginning at the minimum $V_{DD}$.
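A minimum supply voltage search of this kind can be sketched as a downward sweep that stops at the first failing voltage. The sketch is hypothetical: `passes` stands in for whatever measurement setup reports whether the multiplier output is correct, and `toy_dut` is an invented stand-in, not measured behavior. A downward sweep is used instead of a bisection because failing outputs may appear correct again at even lower voltages due to canceling failures.

```python
from typing import Callable, Iterable, Optional, Tuple

Vector = Tuple[int, int]


def find_min_vdd_mv(passes: Callable[[int, int, int], bool],
                    vectors: Iterable[Vector],
                    start_mv: int,
                    stop_mv: int,
                    step_mv: int = 1) -> Optional[int]:
    """Sweep the supply downward; return the lowest voltage (in mV) reached
    before the first failure, or None if the start voltage already fails."""
    vector_list = list(vectors)
    last_good = None
    vdd = start_mv
    while vdd >= stop_mv:
        if not all(passes(vdd, a, b) for a, b in vector_list):
            break  # stop at the first failure; lower "passes" are not trusted
        last_good = vdd
        vdd -= step_mv
    return last_good


def toy_dut(vdd_mv: int, a: int, b: int) -> bool:
    """Invented device model: fails between 56 mV and 61 mV and appears
    correct again below that, mimicking canceling failures."""
    return not (56 <= vdd_mv <= 61)


if __name__ == "__main__":
    demo_vectors = [(0, 0), (255, 255), (129, 254), (129, 255)]
    print(find_min_vdd_mv(toy_dut, demo_vectors, start_mv=300, stop_mv=40))
```

The same routine can be run once with a full vector set and once with a reduced set to compare the resulting minimum voltages.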
The minimum supply voltages are 84 mV/68 mV/62 mV for S1/S4/S16 for the best chips and 88 mV/71 mV/66 mV on average, as shown in Fig. 20 and detailed in Table II. The flip-flops implemented in sizing S16 are operational in the same voltage range with a minimum $V_{DD}$ of 61 mV and 64 mV on average, also shown in Fig. 20 and Table II.
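The relative supply voltage reduction gained by upsizing can be worked out directly from the multiplier values quoted above. The helper name below is hypothetical; only the voltages are taken from the text.

```python
# Minimum multiplier supply voltages in mV as quoted in the text.
best_mv = {"S1": 84, "S4": 68, "S16": 62}
average_mv = {"S1": 88, "S4": 71, "S16": 66}


def relative_reduction(reference_mv: float, reduced_mv: float) -> float:
    """Fractional reduction of reduced_mv relative to reference_mv."""
    return (reference_mv - reduced_mv) / reference_mv


if __name__ == "__main__":
    for label, data in (("best", best_mv), ("average", average_mv)):
        for sizing in ("S4", "S16"):
            r = relative_reduction(data["S1"], data[sizing])
            print(f"{label}: S1 -> {sizing}: {100 * r:.1f} %")
```

For the average values, going from S1 to S16 corresponds to a reduction of one quarter of the S1 supply voltage.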
Comparing the measurement to the simulation discussed in Section II and revisited in Fig. 20, it is apparent that the measured minimum supply voltages are below the predicted values and also show a non-linear dependence on …. There are two reasons for this effect: First, the characteristic of variations found on most chips seems different than predicted by the simulation (see Section IV-A). Second, the simulation approach using cross-coupled gate pairs introduces an inherent correlation: The most critical SNM values are typically caused by pairs where both gates exhibit relatively high VTC shifts. Even though the probability that a similar gate pair exists in an actual circuit is equally high, the feedback used in the simulation corresponds to an equally bad gate following in the fan-out, which is much less likely. This effect also explains the difference in the shape of the curve, as it is much more relevant for gate sets with high variability where the differences between individual gates are high. This results in a variability-limited minimum $V_{DD}$, as opposed to circuits with large transistor areas which are ultimately limited by sub-threshold slope.
Fig. 20. Minimum $V_{DD}$ of multipliers and flip-flops versus gate transistor area, compared to Monte Carlo SNM simulation results. Error bars depict measured value ranges instead of standard deviations due to the low accuracy of the latter.
D. Operation Speed and Power Consumption
Operation frequencies have been measured using the ring oscillator configuration described in Section V. For power measurements, a random set of input vectors is applied at different frequencies. Active and leakage currents are extracted using the slope and offset of a linear fit for the resulting current-frequency curve. The corresponding data is shown in Fig. 21 and exhibits the general behavior expected for a normal sub-threshold circuit: Maximum frequency is exponentially, active energy per operation quadratically and leakage power approximately linearly dependent on $V_{DD}$. Above approx. 250 mV the onset of transistor saturation can be observed, resulting in reduced increase rates for on currents and higher gate capacitances, which is expressed by the reduced slope of the maximum frequency curve and the increased slope of the active energy curve.
Fig. 21. Dependence of maximum operation frequency, active energy per operation, and leakage power consumption on supply voltage. Error bars depict measured value ranges. Dotted line in the leakage power graph is for a corresponding multiplier synthesized with a commercial standard cell library (modeling). Points outline the supply voltages where its leakage power would be equal to the minimum leakage powers measured for the ST multipliers.
Fig. 22. Plots of measured leakage currents for sizings S1 and S16, showing that leakage reduces down to the minimum possible $V_{DD}$ in all chips.
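The separation of active and leakage currents by a linear fit of the current-frequency curve can be sketched as follows. This is a minimal illustration using an ordinary least-squares fit and assuming one operation per clock cycle; the function names and the example numbers are hypothetical and are not measurement data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + offset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return slope, offset


def split_supply_current(freqs_hz, currents_a, vdd_v):
    """Split the supply current into a frequency-independent leakage part
    (offset of the fit) and a frequency-proportional active part (slope)."""
    slope, offset = fit_line(freqs_hz, currents_a)
    return {
        "leakage_current_a": offset,
        "leakage_power_w": offset * vdd_v,
        "charge_per_operation_c": slope,
        "active_energy_per_operation_j": slope * vdd_v,
    }


if __name__ == "__main__":
    # Synthetic example: 200 nA leakage plus 1 pC per operation.
    freqs = [1000.0, 2000.0, 3000.0, 4000.0]
    currents = [2.0e-7 + 1.0e-12 * f for f in freqs]
    print(split_supply_current(freqs, currents, vdd_v=0.1))
```

The offset of the fit gives the leakage current, while the slope is the charge drawn per operation, which yields the active energy per operation when multiplied by the supply voltage.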
At low supply voltages, leakage power consumption is especially important as it dominates the active one even if operating at the maximum frequency; e.g., for S16 at minimum $V_{DD}$, active power consumption only contributes 2%. Two effects occur as $V_{DD}$ is decreased. First, leakage currents are reduced due to the decreasing drain-source voltages as obvious from (1). Second, gate output levels deviate considerably from the ideal $V_{DD}$ and GND values when getting close to the minimum possible $V_{DD}$, which in theory might cause an overall increase in leakage in this region. The measurements, though, exhibit decreasing values for both leakage power and currents down to the minimum possible $V_{DD}$ for all chips as illustrated in Fig. 22.
For comparison, the leakage power of a multiplier synthesized with a commercial standard cell library is also shown in Fig. 21. As standard-cell schematics for this library are not available to the authors, only the leakage power at nominal $V_{DD}$ can be determined, which has been extrapolated using the DIBL coefficient extracted for a gate with minimum transistor length. At their minimum functional $V_{DD}$, the S1, S4 and S16 ST multipliers consume the same leakage current as a standard CMOS multiplier if it can operate at 195 mV, 155 mV and 135 mV respectively. However, judging from previous publications (e.g., [21], [1], [40], [24]), it is unlikely that a standard CMOS multiplier can operate below 180 mV (a precise value for minimum $V_{DD}$ cannot be given due to non-availability of the schematics). Therefore, the S4 and S16 designs should consume less leakage than a standard CMOS design when all circuits are operated at their minimum functional $V_{DD}$.
The minimum energy per operation point typically occurs at 260 mV/233 mV/226 mV for S1/S4/S16 as shown in Fig. 23 and Table II. The circuit is not optimized for operation at this point; the relatively low operation speeds therefore cause the minimum energy point to occur at comparatively high supply voltages and at increased energy values.
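The link between low operation speed and a high minimum energy voltage can be illustrated with a simple energy model built from the trends stated in this section (exponential frequency, quadratic active energy, leakage power roughly linear in the supply). This is a sketch under those assumptions; all parameter values are hypothetical and chosen only for illustration.

```python
import math


def energy_per_operation(vdd, c_eff, i_leak, f0, slope_v):
    """Active plus leakage energy per operation at maximum frequency.

    f_max = f0 * exp(vdd / slope_v)   (exponential speed dependence)
    E_active = c_eff * vdd**2         (quadratic dependence)
    E_leak = i_leak * vdd / f_max     (leakage power integrated over one cycle)
    """
    f_max = f0 * math.exp(vdd / slope_v)
    return c_eff * vdd ** 2 + i_leak * vdd / f_max


def minimum_energy_point(c_eff, i_leak, f0, slope_v=0.039, v_range_mv=(60, 500)):
    """Grid search (1 mV steps) for the supply voltage with minimum energy."""
    best_mv = min(
        range(v_range_mv[0], v_range_mv[1] + 1),
        key=lambda mv: energy_per_operation(mv / 1000.0, c_eff, i_leak, f0, slope_v),
    )
    return best_mv / 1000.0


if __name__ == "__main__":
    fast = minimum_energy_point(c_eff=2e-11, i_leak=3e-7, f0=1e3)
    slow = minimum_energy_point(c_eff=2e-11, i_leak=3e-7, f0=1e2)
    print("fast circuit:", fast, "V, slow circuit:", slow, "V")
```

A slower circuit accumulates more leakage energy per operation, so in this model its energy minimum moves to a higher supply voltage, in line with the observation above.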
Comparing the different sizing implementations (S1, S4, S16), the following can be observed. Even though at iso-$V_{DD}$ the leakage current of S16 is higher, the lowest leakage power consumption of all implementations at minimum $V_{DD}$ is measured for one implementation of S16. S4 achieves the highest speed as its slightly increased current levels outweigh the increased gate capacitances. Regarding active energy per operation, minimum values occur for S1, but due to the additional interconnect loads, the increase for S4 and S16 is much lower than what could be expected regarding gate capacitances only.
E. Temperature Dependence
The temperature dependence of minimum $V_{DD}$, maximum operation frequency and leakage power is shown for one chip in Fig. 24 (active energy is far less temperature dependent). Both circuit speed and leakage power consumption increase approximately exponentially with temperature due to the decreasing … and increasing …. The minimum $V_{DD}$ for S1/S4/S16 increases by approx. 3 mV/9 mV/11 mV between room temperature and 80 °C, which is mainly due to a degradation of sub-threshold slope with increasing temperature. The significantly higher temperature sensitivity of S4 and S16 can again be explained by the minimum $V_{DD}$ being more sub-threshold slope-limited than variability-limited here, which also results in an increased sensitivity to sub-threshold slope changes.
VII. CONCLUSION
It has been shown that operation of digital circuits at supply voltages as low as 62 mV is possible without any need for process or post-silicon tuning. The on-to-off current ratio improvement achieved by the proposed Schmitt Trigger technique thus proves to be effective for lowering supply voltage requirements and for mitigating global variations. Nevertheless, careful gate design in terms of sizing and layout is also necessary. Random variations are shown to be an important limitation for supply voltage reduction, which can be compensated by an increase in gate sizes, resulting in a supply voltage reduction of 25% at the cost of additional layout area. The fact that a simple standard-cell design approach is used makes the proposed technique interesting for practical applications in supply voltage limited circuits and system power reduction.
Fig. 23. Measured minimum energy per operation plots for S1 and S16.
Fig. 24. Temperature dependence of minimum $V_{DD}$, leakage power consumption and maximum operation frequency.
REFERENCES
[1] Y. Pu, J. P. de Gyvez, H. Corporaal, and Y. Ha, "An ultra-low-energy multi-standard JPEG co-processor in 65 nm CMOS with sub/near threshold supply voltage," IEEE J. Solid-State Circuits, vol. 45, no. 3, pp. 668–680, Mar. 2010.
[2] A. Agarwal, S. Mathew, S. Hsu, M. Anders, H. Kaul, F. Sheikh, R. Ramanarayanan, S. Srinivasan, R. Krishnamurthy, and S. Borkar, "A 320mV-to-1.2V on-die fine-grained reconfigurable fabric for DSP/media accelerators in 32nm CMOS," in 2010 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2010, pp. 328–329.
[3] J. Kwong, Y. Ramadass, N. Verma, M. Koesler, K. Huber, H. Moormann, and A. Chandrakasan, "A 65nm sub-Vt microcontroller with integrated SRAM and switched-capacitor DC-DC converter," in 2008 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2008, pp. 318–616.
[4] B. Calhoun and A. Chandrakasan, "A 256-kb 65-nm sub-threshold SRAM design for ultra-low-voltage operation," IEEE J. Solid-State Circuits, vol. 42, no. 3, pp. 680–688, Mar. 2007.
[5] B. Zhai, S. Hanson, D. Blaauw, and D. Sylvester, "A variation-tolerant sub-200 mV 6-T subthreshold SRAM," IEEE J. Solid-State Circuits, vol. 43, no. 10, pp. 2338–2348, Oct. 2008.
[6] T.-H. Kim, J. Liu, J. Keane, and C. Kim, "A 0.2 V, 480 kb subthreshold SRAM with 1 k cells per bitline for ultra-low-voltage computing," IEEE J. Solid-State Circuits, vol. 43, no. 2, pp. 518–529, Feb. 2008.
[7] I. J. Chang, J.-J. Kim, S. Park, and K. Roy, "A 32 kb 10 T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS," IEEE J. Solid-State Circuits, vol. 44, no. 2, pp. 650–658, Feb. 2009.
[8] G. Chen, H. Ghaed, R.-U. Haque, M. Wieckowski, Y. Kim, G. Kim, D. Fick, D. Kim, M. Seok, K. Wise, D. Blaauw, and D. Sylvester, "A cubic-millimeter energy-autonomous wireless intraocular pressure monitor," in 2011 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2011, pp. 310–312.
[9] S. Hanson, M. Seok, Y.-S. Lin, Z. Y. Foo, D. Kim, Y. Lee, N. Liu, D. Sylvester, and D. Blaauw, "A low-voltage processor for sensing applications with picowatt standby mode," IEEE J. Solid-State Circuits, vol. 44, no. 4, pp. 1145–1155, Apr. 2009.
[10] B. Calhoun, A. Wang, and A. Chandrakasan, "Modeling and sizing for minimum energy operation in subthreshold circuits," IEEE J. Solid-State Circuits, vol. 40, no. 9, pp. 1778–1786, Sep. 2005.
[11] B. Zhai, D. Blaauw, D. Sylvester, and K. Flautner, "Theoretical and practical limits of dynamic voltage scaling," in Proc. 41st Design Automation Conf., 2004, pp. 868–873.
[12] H. Bottner, J. Nurnus, A. Gavrikov, G. Kuhner, M. Jagle, C. Kunzel, D. Eberhard, G. Plescher, A. Schubert, and K.-H. Schlereth, "New thermoelectric components using microsystem technologies," J. Microelectromechan. Syst., vol. 13, pp. 414–420, Jun. 2004.
[13] S. Kerzenmacher, J. Ducree, R. Zengerle, and F. von Stetten, "Energy harvesting by implantable abiotically catalyzed glucose fuel cells," J. Power Sources, vol. 182, no. 1, pp. 1–17, 2008.
[14] M. Frank, M. Kuhl, G. Erdler, I. Freund, Y. Manoli, C. Muller, and H. Reinecke, "An integrated power supply system for low power 3.3 V electronics using on-chip polymer electrolyte membrane (PEM) fuel cells," IEEE J. Solid-State Circuits, vol. 45, no. 1, pp. 205–213, Jan. 2010.
[15] Y. Ramadass and A. Chandrakasan, "A batteryless thermoelectric energy-harvesting interface circuit with 35 mV startup voltage," in 2010 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2010, pp. 486–487.
[16] E. Carlson, K. Strunz, and B. Otis, "A 20 mV input boost converter with efficient digital control for thermoelectric energy harvesting," IEEE J. Solid-State Circuits, vol. 45, no. 4, pp. 741–750, Apr. 2010.
[17] B. Calhoun and A. Chandrakasan, "Standby power reduction using dynamic voltage scaling and canary flip-flop structures," IEEE J. Solid-State Circuits, vol. 39, no. 9, pp. 1504–1511, Sep. 2004.
[18] J. Kwong, Y. Ramadass, N. Verma, and A. Chandrakasan, "A 65 nm sub-Vt microcontroller with integrated SRAM and switched capacitor DC-DC converter," IEEE J. Solid-State Circuits, vol. 44, no. 1, pp. 115–126, Jan. 2009.
[19] Y. Ramadass and A. Chandrakasan, "Minimum energy tracking loop with embedded DC-DC converter enabling ultra-low-voltage operation down to 250 mV in 65 nm CMOS," IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 256–265, Jan. 2008.
[20] J. Burr, "Cryogenic ultra low power CMOS," in Proc. 1995 IEEE Symp. Low Power Electronics, Oct. 1995, pp. 82–83.
[21] A. Wang and A. Chandrakasan, "A 180-mV subthreshold FFT processor using a minimum energy design methodology," IEEE J. Solid-State Circuits, vol. 40, no. 1, pp. 310–319, Jan. 2005.
[22] M.-E. Hwang and K. Roy, "A 135 mV 0.13 µW process tolerant 6 T subthreshold DTMOS SRAM in 90 nm technology," in Proc. IEEE Custom Integrated Circuits Conf. (CICC), 2008, pp. 419–422.
[23] S. Jayapal and Y. Manoli, "Minimizing energy consumption with variable forward body bias for ultra-low energy LSIs," in Proc. Int. Symp. VLSI Design, Automation and Test (VLSI-DAT 2007), 2007, pp. 1–4.
[24] S. Hanson, B. Zhai, M. Seok, B. Cline, K. Zhou, M. Singhal, M. Minuth, J. Olson, L. Nazhandali, T. Austin, D. Sylvester, and D. Blaauw, "Exploring variability and performance in a sub-200-mV processor," IEEE J. Solid-State Circuits, vol. 43, no. 4, pp. 881–891, Apr. 2008.
[25] M.-E. Hwang, A. Raychowdhury, K. Kim, and K. Roy, "A 85 mV 40 nW process-tolerant subthreshold 8 × 8 FIR filter in 130nm technology," in 2007 IEEE Symp. VLSI Circuits Dig., 2007, pp. 154–155.
[26] G. Ono and M. Miyazaki, "Threshold-voltage balance for minimum supply operation [LV CMOS chips]," IEEE J. Solid-State Circuits, vol. 38, no. 5, pp. 830–833, May 2003.
[27] S. Henzler, "Power management of digital circuits in deep sub-micron CMOS technologies," in Advanced Microelectronics. Berlin, Germany: Springer, 2007.
[28] J. Chen, L. Clark, and Y. Cao, "Maximum-ultra-low voltage circuit design in the presence of variations," IEEE Circuits and Devices Mag., vol. 21, no. 6, pp. 12–20, 2005.
[29] K. Ishida, K. Kanda, A. Tamtrakarn, H. Kawaguchi, and T. Sakurai, "Managing subthreshold leakage in charge-based analog circuits with low-VTH transistors by analog T-switch (AT-switch) and super cut-off CMOS (SCCMOS)," IEEE J. Solid-State Circuits, vol. 41, no. 4, pp. 859–867, Apr. 2006.
[30] L. Wang and N. Shanbhag, "An energy-efficient noise-tolerant dynamic circuit technique," IEEE Trans. Circuits Syst. II: Analog Digital Signal Process., vol. 47, no. 11, pp. 1300–1306, Nov. 2000.
[31] J. Kulkarni, K. Kim, and K. Roy, "A 160 mV robust Schmitt trigger based subthreshold SRAM," IEEE J. Solid-State Circuits, vol. 42, no. 10, pp. 2303–2313, Oct. 2007.
[32] J. Kwong and A. Chandrakasan, "Variation-driven device sizing for minimum energy sub-threshold circuits," in Proc. IEEE Int. Symp. Low Power Electronics and Design (ISLPED'06), Oct. 2006, pp. 8–13.
[33] T.-H. Kim, J. Keane, H. Eom, and C. Kim, "Utilizing reverse short-channel effect for optimal subthreshold circuit design," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 7, pp. 821–829, Jul. 2007.
[34] J. F. Wakerly, Digital Design: Principles and Practices. Upper Saddle River, NJ: Pearson Education, 2006.
[35] G. Roy, A. R. Brown, F. Adamu-Lema, S. Roy, and A. Asenov, "Simulation study of individual and combined sources of intrinsic parameter fluctuations in conventional nano-MOSFETs," IEEE Trans. Electron Devices, vol. 53, pp. 3063–3070, Dec. 2006.
[36] M. Pelgrom, A. Duinmaijer, and A. Welbers, "Matching properties of MOS transistors," IEEE J. Solid-State Circuits, vol. 24, no. 10, pp. 1433–1439, Oct. 1989.
[37] P. Drennan, M. Kniffin, and D. Locascio, "Implications of proximity effects for analog design," in Proc. IEEE Custom Integrated Circuits Conf. (CICC'06), Sep. 2006, pp. 169–176.
[38] I. J. Chang, J.-J. Kim, and K. Roy, "Robust level converter design for sub-threshold logic," in Proc. IEEE Int. Symp. Low Power Electronics and Design (ISLPED'06), Oct. 2006, pp. 14–19.
[39] H. Shao and C.-Y. Tsui, "A robust, input voltage adaptive and low energy consumption level converter for sub-threshold logic," in Proc. 33rd European Solid State Circuits Conf. (ESSCIRC 2007), 2007, pp. 312–315.
[40] J. Kao, M. Miyazaki, and A. Chandrakasan, "A 175-mV multiply-accumulate unit using an adaptive supply voltage and body bias architecture," IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1545–1554, Nov. 2002.
Niklas Lotze (S'10) received the Dipl.-Ing. (M.Sc.)
degree in Microsystems Engineering from the University of Freiburg, Freiburg, Germany, in 2004. He
was part of the Ph.D. program "Embedded Microsystems" of the University of Freiburg (2005–2008).
He is now a research assistant at the Fritz Huettinger Chair of Microelectronics of the Department
of Microsystems Engineering (IMTEK), University
of Freiburg. His research interests lie in the field of
ultra-low-power, ultra-low-voltage digital circuits.
Yiannos Manoli (M'82–SM'08) was born in Famagusta, Cyprus, in 1954. As a Fulbright scholar,
he received the B.A. degree (summa cum laude) in
physics and mathematics from Lawrence University,
Appleton, WI, in 1978, and the M.S. degree in
electrical engineering and computer science from
the University of California, Berkeley, in 1980. He
received the Dr.-Ing. degree in electrical engineering
from the Gerhard Mercator University, Duisburg,
Germany, in 1987.
From 1980 to 1984, he was a research assistant at
the University of Dortmund, Germany, in the field of A/D and D/A converters.
In 1985, he joined the newly founded Fraunhofer Institute of Microelectronic
Circuits and Systems, Duisburg, Germany, where he established a design group
working on mixed-signal CMOS circuits especially for monolithic integrated
sensors and application specific microcontrollers. From 1996 to 2001, he held
the Chair of Microelectronics as a full Professor with the Department of Electrical Engineering, University of Saarland, Saarbrücken, Germany. In 2001, he
joined the Department of Microsystems Engineering (IMTEK) in the Faculty of
Engineering of the University of Freiburg, Germany, where he established the
Chair of Microelectronics. With an endowment of the Fritz Hüttinger Foundation and in memory of the founder of today's Hüttinger Elektronik, the University of Freiburg named the Chair "Fritz Huettinger Chair of Microelectronics" in
2010. Since 2008, he has been Vice-Dean of the Faculty of Engineering. Since
2005, he has also served as one of the three directors at the Institute of Micromachining and Information Technology of the Hahn-Schickard Gesellschaft
(HSG-IMIT) in Villingen-Schwenningen, Germany. His current research interests are the design of low-voltage/low-power mixed-signal CMOS circuits, energy harvesting electronics, sensor read-out circuits, and analog-to-digital converters. His additional research activities concentrate on motion and vibration
energy transducers and on the field of inertial sensors and sensor fusion. In 2000,
he had the opportunity to spend half a year on a research project with Motorola
(now Freescale) in Phoenix, AZ. In 2006, he spent his sabbatical semester with
Intel, Santa Clara, CA, working on the read-out electronics for a high-resolution
accelerometer.
Prof. Manoli and his group have received best paper awards at ESSCIRC 1988
and 2009, PowerMEMS 2006, MWSCAS 2007, and MSE-2007. The MSE-2007 award was granted for SpicyVOLTsim (www.imtek.de/svs), a web-based
application for the animation and visualization of analog circuits for which Prof.
Manoli also received the Media Prize of the University of Freiburg in 2005.
He was the first to receive the Best Teaching Award of the Faculty of Engineering when it was introduced in 2008. For his creative and effective contributions to the teaching of microelectronics, he has also received the Excellence
in Teaching Award of the University of Freiburg and the Teaching Award of
the State of Baden-Württemberg, both in 2010. Prof. Manoli is a Distinguished
Lecturer of the IEEE. He is on the Senior Editorial Board of the IEEE JOURNAL
ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS and on the
Editorial Board of the Journal of Low Power Electronics. He served as guest
editor of the IEEE TRANSACTIONS ON VLSI in 2002 and the IEEE JOURNAL
OF SOLID-STATE CIRCUITS in 2011. He has served on the committees of several
conferences such as ISSCC, ESSCIRC, IEDM, and ICCD, and was Program
Chair (2001) and General Chair (2002) of the IEEE International Conference
on Computer Design (ICCD). He is a member of VDE, Phi Beta Kappa, Mortar
Board, and a senior member of the IEEE.
