
2011ME10040

EEL-201 IIT Delhi 1st Semester 2013-14

Clock Gating & Power Gating For Low Power Circuits


Srejan Goyal (2011ME10040), IIT Delhi

Abstract: CMOS circuits are used in most computation devices, and the recent growth of portable devices has led to a demand for power-efficient circuits. The clock is the most common signal in sequential circuits and is also responsible for unnecessary switching in latches. In addition, a larger circuit contains many blocks that are nominally ON but perform no operation for long durations. In this paper I describe techniques for saving power in electronic circuits. Starting from the causes of power dissipation in CMOS circuitry, I develop an understanding of how power can be saved by gating the clock signal or by gating the power supply directly. For both techniques, possible and optimal implementations are discussed. I also elucidate the main design parameters and, most importantly, analyze feasibility in terms of the trade-off between power saving and the compromise on area, performance and reliability of the circuit.

I. INTRODUCTION

Electronic circuit designers have long been occupied with optimizing circuits for performance and area. Power consumption wasn't really an attribute that was given much weight when judging efficiency. But the advent of portable computation devices has compelled designers to ensure that circuits consume the available power optimally.

Sequential circuits account for a major share of power dissipation since they have an input signal, namely the clock, which keeps changing at a very high frequency. So even when there is no change in the relevant inputs or states of the FSM (Finite State Machine), the clock causes large-scale gate switching. Also, because this signal is used throughout the architecture, the entire clock net must be designed very rigorously, and it inevitably acts as a large capacitance for the system. In realistic terms this is a major waste of power that can be avoided. Apart from this, at the architectural level several components remain powered on even during periods of no activity, which also adds to the overall power liability.

In this paper I briefly describe the nature of power dissipation in CMOS circuitry and then, in Sections III and IV, illustrate two powerful techniques for countering these issues: selective clocking through appropriate gating of the signal supplied as clock to the flip-flops, and gating of power to dormant sections of the architecture depending upon the expected activity level. Apart from the implementation, I also carry out a feasibility analysis with respect to the performance and power trade-off on the basis of the available literature on the subject.

II. POWER DISSIPATION IN CMOS CIRCUITS

CMOS circuits dissipate power primarily through two routes: static power, which is due to the resistive path from the power supply to ground, and dynamic power, which arises from gate switching and the corresponding switching of capacitive loads between the 0 and 1 voltage states. Since dynamic power is dissipated only when nodes change value and is nearly zero otherwise, it is frequency dependent. Static power, on the other hand, is independent of frequency and is dissipated whenever the chip is turned ON. Even when no circuit intentionally draws static current, static power is never really zero, because leakage current flows through nominally off transistors and is set by their sub-threshold current. Since static power is generally small, dynamic power forms the major component of the power consumed by a CMOS circuit.

To charge a load capacitance $C$ by $\Delta V$ volts and then discharge it to its original voltage, a gate pulls a charge $C \Delta V$ from the $V_{dd}$ supply and then sinks this charge to ground to discharge the node. So the power consumed is

$$P = \alpha \, C \, \Delta V \, V_{dd} \, f$$

Here $f$ is the operating (clock) frequency and $\alpha$ is the number of times this node cycles each clock cycle, usually called the activity ratio. [1] The total dynamic power for the whole chip is summed over all the nodes. So to reduce the dynamic power, we can reduce the capacitance, the voltage swing, the power supply voltage, the activity ratio or the operating frequency.

Gating the clock reduces this power by bringing down the activity ratio, since it removes redundant transitions. When the clock is turned off, none of the latch outputs change state, and thus the logic outputs are also stable. Gating the clock has the added advantage that it reduces the clock load on the master clock tree that toggles each cycle, since the clocks in the inactive blocks are effectively turned off and isolated from the master. But it does not address static power dissipation, which although small can accumulate over long periods. One common approach to this problem is to make sure that idle blocks do not use any power. This can be achieved by gating the power supply to the nodes with sleep transistors, which effectively create a virtual supply that draws energy from the actual supply and is activated as a function of the level of activity of that component of the system.
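The dependence of dynamic power on the activity ratio can be illustrated numerically. The following is a minimal sketch, assuming the per-node relation P = αCΔVVdd·f from Reference [1]; the function names and the node values (10 fF load, 1.0 V supply, 1 GHz clock, activity ratios of 1.0 and 0.25) are hypothetical and chosen only for illustration.

```python
def dynamic_power(alpha, cap_farads, delta_v, vdd, freq_hz):
    """Dynamic power of one node in watts: P = alpha * C * dV * Vdd * f."""
    return alpha * cap_farads * delta_v * vdd * freq_hz


def total_dynamic_power(nodes, vdd, freq_hz):
    """Sum the per-node power over (alpha, C, dV) tuples for the whole chip."""
    return sum(dynamic_power(a, c, dv, vdd, freq_hz) for a, c, dv in nodes)


# Hypothetical node values, not taken from the paper.
C_LOAD = 10e-15   # 10 fF load capacitance
VDD = 1.0         # supply voltage; the node swings rail to rail
F_CLK = 1e9       # 1 GHz clock

ungated = dynamic_power(1.0, C_LOAD, VDD, VDD, F_CLK)   # node toggles every cycle
gated = dynamic_power(0.25, C_LOAD, VDD, VDD, F_CLK)    # clock gated 75% of the time
saving = 1 - gated / ungated

print(f"ungated: {ungated * 1e6:.2f} uW, gated: {gated * 1e6:.2f} uW, "
      f"saving: {saving:.0%}")
```

Because the relation is linear in α, the fractional saving equals the fraction of cycles in which the clock is gated, as long as the other terms stay fixed.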

III. CLOCK GATING

Recent studies indicate that the clock signals in digital computers consume a large percentage (15-45%) of the system power. As a result, the circuit power can be greatly reduced if clock power dissipation can be limited. Most efforts at clock power reduction have centered on ideas such as reduced voltage swings, buffer insertion, and clock routing. Usually, switching of the clock causes a great deal of unnecessary gate activity. For that reason, circuits are being developed with controllable clocks. This means that local clocks are derived from the master clock and, based on certain conditions of the environment they feed, can be slowed down or stopped completely with respect to the master clock. This scheme results in power savings due to the following factors.

1) It reduces the load that the master clock has to drive every time it switches, usually at high frequency, which also reduces the need for buffers in the clock tree and hence the power dissipation of the clock tree.
2) The flip-flop/latch receiving this derived clock is not triggered in self-loop cycles, and the corresponding dynamic power dissipation is thus saved.
3) The next-state function of the flip-flop/latch triggered by the derived clock may be simplified by using don't-care conditions whenever the clock isn't switching. [3]

The foundational idea behind a gated clocking routine is that the environment around a functional block of the FSM produces control signals that, when asserted, turn off the clock to the FSM. Although gated clocks can lead to an increase in clock skew, causing problems for high-performance designs, many synthesis tools provide effective clock equalization schemes that eliminate this problem. To generate the activation function I follow the approach of Reference [2].

A. Activation Function Generation

Consider a Moore-type FSM (this does not lose generality, since free conversion between Moore and Mealy machines is possible), which can be described by a sextuple

$$M = (X, Y, S, s_0, \delta, \lambda)$$

where $X$ is the set of inputs, $Y$ is the set of outputs, $S$ is the set of states and $s_0$ is the initial state. The equations $s_{i+1} = \delta(X, s_i)$ and $Y = \lambda(s_i)$ define the next-state and output functions respectively. The self-loop function of a state $s$ of the given FSM is

$$\mathit{Self}_s(x) = \begin{cases} 1 & \text{if } \delta(x, s) = s \\ 0 & \text{otherwise} \end{cases}$$

representing the conditions under which the FSM is in a self-loop, i.e. when there is clock switching without any change in the state.

Whenever the circuit is in a self-loop it should deactivate the clock. So the activation function can be defined as

$$f_a(x, s) = 1 \iff \delta(x, s) = s$$

and it can be expressed as the union of all these self-loops:

$$f_a(x, s) = \sum_{s_j \in S} \mathit{Self}_{s_j}(x) \cdot e_j(s)$$

where the sum denotes Boolean OR and $e_j(s)$ is 1 exactly when the current state is $s_j$.

The activation function here is defined as a Boolean function whose inputs are the FSM's primary inputs and current state. Since any unreachable state is absent from the inputs to $f_a$, we can use the unreachable codes as a don't-care (DC) set to reduce the cover size. We then use $f_a$ to selectively gate the clock for power savings. Because the activation function $f_a$ takes stable-one signals as inputs, $f_a$ is also a stable-one signal and can be used to control the clock switching.

Not all sequential circuits are designed starting from a state diagram. For a circuit given as a gate-level description, first generating the state diagram and then the activation function is of exponential order. However, a self-loop occurs exactly when $\delta(x, s) = s$, i.e. when every bit of the next-state vector equals the corresponding bit of the current-state vector, so the self-loops can be represented implicitly by the single equation

$$f_a(x, v) = \prod_{i=1}^{n} \overline{\delta_i(x, v) \oplus v_i}$$

where $v = (v_1, \dots, v_n)$ is the binary code of the current state, $\delta_i$ is the next-state function of bit $i$, and the product denotes Boolean AND.
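The bitwise form of the self-loop condition can be checked on a small machine. The sketch below is an illustration only: the FSM (a 2-bit counter with a single enable input `en`) and the function names are hypothetical and are not the example used in Reference [2]. It evaluates the activation function as the AND, over all state bits, of the XNOR of each next-state bit with the corresponding current-state bit.

```python
from itertools import product

N_STATE_BITS = 2


def next_state(en, v):
    """Hypothetical FSM: a 2-bit counter that advances only when en == 1.

    v is the current state code as a tuple (bit0, bit1).
    """
    value = (v[1] << 1) | v[0]
    value = (value + en) % 4
    return (value & 1, (value >> 1) & 1)


def activation(en, v):
    """Return 1 when every next-state bit equals the current-state bit."""
    nxt = next_state(en, v)
    return int(all(n == c for n, c in zip(nxt, v)))


# Enumerate the activation function over every input/state combination.
table = {(en, v): activation(en, v)
         for en in (0, 1)
         for v in product((0, 1), repeat=N_STATE_BITS)}

for (en, v), fa in sorted(table.items()):
    print(f"en={en} state={v} f_a={fa}")
```

For this counter the activation function reduces to the complement of `en`, so the clock could be stopped whenever the enable input is low. Explicit enumeration is only workable for tiny machines; it stands in here for the symbolic manipulation that a real flow would use.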


Thus, we can generate the activation function easily using binary decision diagram-based (BDD-based) symbolic manipulation of logic networks, even if the state diagram is too large to be explicitly represented. [2] Not needing the complete state diagram to generate $f_a$ widens the applicability of this technique and makes it suitable for resynthesis and low-power optimization of existing large sequential circuits.

B. Feasibility

There is no guarantee that the above implementation of the activation function is efficient: its complexity may come out to be of the same order as that of the combinational part of the FSM, whose power we are trying to save. We must therefore reduce the size of the activation function to realize the most power savings. So instead of using the entire activation function, we look for a reduced function that has a small overhead but still saves most of the power. The most appropriate trade-off would come from the actual switching probabilities, by covering the highly likely states, but in the absence of that information, and on the basis of the functional specification of the circuit alone, no closed-form solution is possible. The problem can, however, be solved iteratively if we fix the number of minterms in the reduced activation function. Varying this upper limit on minterms and finding the optimal solution for each limit yields the most appropriate balance of power consumption.

Another important issue is that of hazards (i.e. unwanted glitches at the output of a gate when its inputs change). Usually they only consume power and rarely affect the performance. But on a gated clock they can be catastrophic, since a glitch can propagate along the entire clock line and spoil the desired functioning. Hence we must be careful when applying this principle in circuits with tightly timed clocks.
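The hazard concern can be made concrete with a toy waveform model. The sketch below assumes the simplest possible gating, a plain AND of the clock with an enable signal, and uses hypothetical sampled waveforms in which the enable changes while the clock is high. Neither the waveforms nor the helper name come from the references.

```python
def pulse_widths(wave):
    """Return the length of every run of consecutive 1s in a sampled waveform."""
    widths, run = [], 0
    for bit in wave:
        if bit:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Hypothetical waveforms: 4 samples per clock period, high for 2 samples.
clk = [0, 0, 1, 1] * 3
# The enable rises at sample 7, in the middle of a clock-high phase.
enable = [0] * 7 + [1] * 5

# Naive gating: AND the clock with the enable.
gated_clk = [c & e for c, e in zip(clk, enable)]

full_width = max(pulse_widths(clk))
glitches = [w for w in pulse_widths(gated_clk) if w < full_width]

print("clock pulse widths:", pulse_widths(clk))
print("gated pulse widths:", pulse_widths(gated_clk))
print("narrow pulses (glitches):", glitches)
```

The gated clock contains a pulse narrower than a normal clock-high phase. Every flip-flop on the derived clock would see that runt pulse, which is why the timing of the activation signal relative to the clock matters.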

IV. POWER GATING

In the standard chip, all logic circuits are connected to Vdd through a hierarchical power mesh. As a result the entire chip is fully powered at all times. Power gating can be achieved by using a suitably sized header/footer sleep transistor for a circuit block. When the control logic detects the onset of a sufficiently long idle period of the target circuit block, it generates a sleep signal for the gate of the header or footer transistor to turn off the supply voltage to the circuit block. Similarly, once it is determined that the circuit block is returning to active operation, the sleep signal is removed to restore the voltage at the virtual Vdd.

Figure 1. Source: Reference [4]

Figure 2. Source: Reference [4]

A. Timing Characteristics

Figure 2 shows the key intervals of power gating, assuming power gating with a header device. The interval of inactivity begins at t = 0, and at t = T1 (T_idle_detect) the control circuit makes a decision to power-gate the unit. Until this point the unit still operates in the active mode, dissipating leakage energy. During the interval [T1, T2] (T2 - T1 = T_idle_delay) the sleep signal is re-buffered and distributed to the header device, incurring an overhead energy E_overhead1. When the sleep signal is delivered to the gate of the header device at t = T2, the voltage at the virtual Vdd starts going down. If the leakage current were independent of the power supply, the voltage drop at the virtual Vdd would be linear in time, and savings in leakage energy would begin only after the virtual Vdd is totally discharged (t = T4 = T_full_discharge). In reality, as the voltage at the virtual Vdd goes down, the leakage current also reduces, and the savings in leakage energy begin as soon as the sleep signal is asserted. As the voltage at the virtual Vdd goes down, the amount of leakage energy saved per cycle increases, resulting in a super-linear growth in the aggregate saved leakage energy.

At t = T5 the control logic detects the next busy interval, and the sleep signal is de-asserted, resulting in an energy overhead E_overhead2 dissipated in generating and driving the signal. At t = T6 (T6 - T5 = T_busy_delay) the header device is turned on, and during the interval [T6, T7] (T7 - T6 = T_wakeup) the virtual Vdd is charged up to the Vdd level. As the virtual Vdd is charged up, the amount of leakage energy saved per cycle gradually reduces, reaching zero at T7.

B. Design Constraints

Some design limitations are posed by this scheme of power gating. Firstly, the sleep transistor must be large enough to ensure that the current drawn by the macro does not cause an excessive voltage drop. As a result, to maintain the performance level in the gated chip, Vdd has to be raised by some amount to compensate for the voltage drop across the transistor. However, raising the supply voltage increases both dynamic and leakage power consumption, which may nullify the intended power savings.

Also, when sleep transistors are switched, considerable power supply currents are drawn in a short period of time, resulting in large voltage fluctuations on the power grid. This power gating noise (PGN) affects the reliability of systems-on-chips (SoCs). Extra decoupling capacitance (decap) is therefore needed to counter the PGN, and the additional leakage current associated with these intentionally introduced capacitors must in turn be compensated for. Due to all this, the switching power required for the sleep transistor increases considerably.

Normally, in a chip without power gating, the non-switching part of the circuit serves as a natural decoupling capacitor that helps reduce the power grid noise. But in the power-gated chip, since the gated macros are disconnected from the power supply during their idle state, the ability of the circuit to act as a self decap for the active circuit is restricted. Thus the non-gated macros adjacent to the gated circuitry may need extra decap to keep the power grid noise within tolerable margins. However, gated macros not only stop providing decap to other macros, but also act as an obstacle to decap insertion. Extra decap will enlarge the chip size and again consume more leakage power.

These constraints alter the power consumption in practice but cannot be avoided. As a consequence, a feasibility analysis is needed for any such subsystem.

C. Feasibility

Figure 3 represents the power schedule of power-gated circuits alongside that of normal chips. The dark line represents the gated power and the other line the ungated power. As discussed, the power savings are mainly in the sleep portion of the cycle, while some extra power is dissipated in the active zone. So if we define Erf and Etf as the energy dissipated in the time periods tr and tf, we can establish the relation for economy of operation of such a system as:
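A break-even condition of this kind can be sketched as follows. Only $E_{rf}$ and $E_{tf}$ come from the text above. The symbols $\Delta P_{sleep}$ (power saved while the block is asleep), $\Delta P_{active}$ (extra power drawn by the gated chip while active), $t_{sleep}$ and $t_{active}$ are illustrative assumptions and are not notation taken from Reference [5].

```latex
\Delta P_{sleep} \, t_{sleep} \;>\; \Delta P_{active} \, t_{active} + E_{rf} + E_{tf}
```

In this form the left side grows with the sleep time, while the right side grows with the active time and with the two transition energies.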

From this equation we can see that not only should the sleep time be large but also the active time should not be very large. [6] So we can define a design parameter in the form of an activity ratio, which alone can determine the performance of such a system. Depending upon these values we can judge the feasibility of a particular power gating scheme for a given circuit. Reference [4] provides a rigorous analysis of feasibility and of the consequent selection of parameters such as Tidle (the idle time after which sleep mode is activated) depending upon the circuit, along with other dynamic timing techniques, to give an optimal implementation.

V. CONCLUSION

As we have seen, both these techniques are powerful and effective in their own right. The crux of this analysis lies in the trade-off between energy saving on one side and performance, reliability, area and cost on the other. Generally they yield substantial savings in energy at little or manageable cost, and therefore find use in a large number of circuits. However, any analysis carried out with the techniques discussed in this paper is highly situation specific, and feasibility likewise varies from circuit to circuit.

REFERENCES
[1] Horowitz, Mark, Thomas Indermaur, and Ricardo Gonzalez. "Low-power digital design." Low Power Electronics, 1994. Digest of Technical Papers., IEEE Symposium. IEEE, 1994.

[2] Benini, Luca, Polly Siegel, and Giovanni De Micheli. "Saving power by synthesizing gated clocks for sequential circuits." Design & Test of Computers, IEEE 11.4 (1994): 32-41.

[3] Wu, Qing, Massoud Pedram, and Xunwei Wu. "Clock-gating and its application to low power design of sequential circuits." Circuits and Systems I: Fundamental Theory and Applications, IEEE Transactions on 47.3 (2000): 415-420.

[4] Hu, Zhigang, et al. "Microarchitectural techniques for power gating of execution units." Proceedings of the 2004 International Symposium on Low Power Electronics and Design. ACM, 2004.

[5] Jiang, Hailin, Malgorzata Marek-Sadowska, and Sani R. Nassif. "Benefits and costs of power-gating technique." Computer Design: VLSI in Computers and Processors, 2005. ICCD 2005. Proceedings. 2005 IEEE International Conference on. IEEE, 2005.

[6] Agarwal, Kanak, et al. "Power gating with multiple sleep modes." Proceedings of the 7th International Symposium on Quality Electronic Design. IEEE Computer Society, 2006.

Figure 3. Source: Reference [5]
