VHDL Synchronous Counters

Synchronous Counters - Final Report 26/4/17, 5*16 am
by Lee Chin Wei

and Andrew Long
Contents
1. An Overview
2. Different types of Synchronous Counters
Binary Up Counters
Binary Down Counters
Binary Up/Down Counters
MOD-N/Divide-by-N Counters
BCD Counters
Ring Counters
Johnson/Twisted-Ring Counters
Loadable/Presettable Counters
3. A Comparison between Synchronous and Asynchronous Counters
4. Synchronous Counter Design
Multiplexer
Read-Only Memory
Programmable Logic Array
5. Making Fast Counters
Where speed is a concern...
General Structure of a Synchronous Binary Counter
Prescaling
Pipelining
Fast Counters From Xilinx
6. References
An Overview
The purpose of the survey is to collate information on Digital Synchronous Counters. Particular emphasis was placed on the following
areas :
1. Types of Synchronous Counters and How they work

2. Fast Counter Techniques
3. Fast Counters from Xilinx
4. Implementation of Counters :
Dedicated Hardware and Alternative Devices
The material is presented in a manner suitable for a teaching tool. It seeks to enlighten and to spark interest in the design of counters.
As R.S.S Obermann remarks, "....design of counters has, in my experience, always been an excellent proving ground for anyone who has
mastered Boolean algebra..." Have fun reading!
Different types of Synchronous Counters

Binary Up Counters
A synchronous binary counter counts from 0 to 2^N - 1, where N is the number of bits/flip-flops in the counter. Each flip-flop is used to
represent one bit. The flip-flop in the lowest-order position is complemented/toggled with every clock pulse and a flip-flop in any other
position is complemented on the next clock pulse provided all the bits in the lower-order positions are equal to 1.
Take for example A4 A3 A2 A1 = 0011. On the next count, A4 A3 A2 A1 = 0100. A1, the lowest-order bit, is always complemented. A2
https://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cwl3/report.html Page 1 of 28
is complemented because all the lower-order positions (A1 only in this case) are 1's. A3 is also complemented because all the
lower-order positions, A2 and A1, are 1's. But A4 is not complemented because the lower-order positions, A3 A2 A1 = 011, do not give an
all-1 condition.
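The toggling rule described above can be sketched in a few lines of Python. This is only a behavioural model, not a hardware description; the function name `next_up` and the list-of-bits representation (index 0 holds A1, the lowest-order bit) are assumptions made for illustration.

```python
def next_up(bits):
    """Return the next state of a synchronous binary up counter.

    bits[0] is the lowest-order bit (A1). Every toggle decision is made
    from the present state, as it would be on a common clock edge.
    """
    nxt = list(bits)
    for i in range(len(bits)):
        # Bit i toggles only if every lower-order bit is currently 1.
        # all() of an empty slice is True, so bit 0 always toggles.
        if all(bits[:i]):
            nxt[i] = 1 - bits[i]
    return nxt


# A4 A3 A2 A1 = 0011, stored lowest-order bit first.
state = [1, 1, 0, 0]
print(next_up(state))  # [0, 0, 1, 0], i.e. A4 A3 A2 A1 = 0100
```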
To implement a synchronous counter, we need a flip-flop for every bit and an AND gate for every bit except the first and the last bit. The
diagram below shows the implementation of a 4-bit synchronous up-counter.
4-bit Synchronous Binary Up-Counter
From the diagram above, we can see that although the counter is synchronous and all its bits are supposed to change simultaneously, the
propagation delays through the AND gates add up to give an overall propagation delay which is proportional to the number of bits
of the counter. To overcome this problem, we can feed the outputs from the flip-flops directly to a many-input AND gate as follows :
4-bit Synchronous Binary Up Counter using speedup technique
This method does overcome the problem of additive propagation delay but introduces some other problems of its own. From the
diagram above, we can see that the third flip-flop gets its J-K input from the output of a 2-input AND gate, the fourth flip-flop gets
its input from a 3-input AND gate and so on. If we have a counter with, for example, 16 bits, we will need to have :
1 * 15-input AND gate,
1 * 14-input AND gate,
...
...
...
1 * 3-input AND gate and
1 * 2-input AND gate.
This method obviously uses a lot more resources than the first method. Not only that, in the first method, the output from each flip-flop
is only used as an input to one AND gate. In the second method, the output from each flip-flop is used as an input to the gates of all the
higher-order bits. If we have a 12-bit counter, the output of the first flip-flop will have to drive 10 gates (the number of gates driven is
called the fan-out). The output from the flip-flop may not have the power to do this.
The "solution" to this is to use a compromise between the two methods. Say we have a 12-bit counter; we can organise it into 3 groups
of 4. Within each group of 4, we use the second method and between the 3 groups, we use the first method. This way, the overall gate
propagation delay and the maximum fan-out are both about 3, instead of the 10 incurred by the first and second methods respectively.
There are many variations to the basic binary counter. The one described above is the binary up counter (counts upwards). Besides the
up counter, there is the binary down counter, the binary up/down counter, binary-coded-decimal (BCD) counter etc. Any counter that
counts in binary is called a binary counter.
Binary Down Counters

In a binary up counter, a particular bit, except for the first bit, toggles if all the lower-order bits are 1's. The opposite is true for binary
down counters. That is, a particular bit toggles if all the lower-order bits are 0's and the first bit toggles on every pulse.
Taking an example, A4 A3 A2 A1 = 0100. On the next count, A4 A3 A2 A1 = 0011. A1, the lowest-order bit, is always complemented.
A2 is complemented because all the lower-order positions (A1 only in this case) are 0's. A3 is also complemented because all the
lower-order positions, A2 and A1, are 0's. But A4 is not complemented because the lower-order positions, A3 A2 A1 = 100, do not give an
all-0 condition.
4-bit Synchronous Binary Down Counter
The implementation of a synchronous binary down counter is exactly the same as that of a synchronous binary up counter except that
the inverted output from each flip-flop is used. All the methods used to improve a binary up counter can be similarly applied here.
Binary Up/Down Counters

The similarities between the implementation of a binary up counter and a binary down counter lead to the possibility of a binary
up/down counter, which is a binary up counter and a binary down counter combined into one. Since the difference is only in which
output of the flip-flop to use, the normal output or the inverted one, we use two AND gates for each flip-flop to "choose" which of the
outputs to use.
3-bit Synchronous Binary Up/Down Counter
From the diagram, we can see that COUNT-UP and COUNT-DOWN are used as control inputs to determine whether the normal flip-
flop outputs or the inverted ones are fed into the J-K inputs of the following flip-flops. If neither is at logic level 1, the counter doesn't
count and if both are at logic level 1, all the bits of the counter toggle at every clock pulse. The OR gate allows either of the two outputs
which have been enabled to be fed into the next flip-flop. As with the binary up and binary down counter, the speed up techniques apply.
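As a rough behavioural sketch of the selection between normal and inverted outputs, the Python function below models only the case where exactly one of COUNT-UP and COUNT-DOWN is high. The single `up` flag, the function name and the list-of-bits layout (index 0 is the lowest-order bit) are assumptions for illustration; the both-high and both-low cases depend on the gate wiring in the diagram and are not modelled.

```python
def next_count(bits, up):
    """Next state of an up/down counter; bits[0] is the lowest-order bit.

    up=True feeds the normal flip-flop outputs forward (count up),
    up=False feeds the inverted outputs forward (count down).
    """
    nxt = list(bits)
    for i in range(len(bits)):
        lower = bits[:i] if up else [1 - b for b in bits[:i]]
        # A bit toggles when all the selected lower-order outputs are 1.
        if all(lower):
            nxt[i] = 1 - bits[i]
    return nxt


print(next_count([1, 1, 0], up=True))   # [0, 0, 1]: 011 counts up to 100
print(next_count([0, 0, 1], up=False))  # [1, 1, 0]: 100 counts down to 011
```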
MOD-N/Divide-by-N Counters
A normal binary counter counts from 0 to 2^N - 1, where N is the number of bits/flip-flops in the counter. In some cases, we want it to
count to numbers other than 2^N - 1. This can be done by allowing the counter to skip states that are normally part of the counting
sequence. There are a few methods of doing this. One of the most common methods is to use the CLEAR input on the flip-flops.
3-bit Synchronous Binary MOD-6 Counter
In the example above, we have a MOD-6 counter. Without the NAND gate, it is a MOD-8 counter. With the NAND gate, the
output from the NAND gate is connected to the asynchronous CLEAR inputs of each flip-flop. The inputs to the NAND gate are the
outputs of the B and C flip-flops. So, all the flip-flops will be cleared when B = C = 1 (binary 110 = decimal 6). When the counter goes
from state 101 to state 110, the NAND output will immediately clear the counter to state 000. Once the flip-flops have been cleared, the
B = C = 1 condition no longer exists and the NAND output goes back to high. The counter will therefore count from 000 to 101, and for a
very short period of time, be in state 110 before the counter is cleared. This state is called the temporary state and the counter usually
only remains in a temporary state for a few nanoseconds. We can essentially say that the counter skips 110 and 111 so that it goes
through only six different states; thus, it is a MOD-6 counter. We also have to note that the temporary state causes a spike or glitch on
the output waveform of B. This glitch is very narrow and will not normally be a problem unless it is used to drive other circuitry outside
the counter. The 111 state is the unused state here. In a state machine with unused states, we need to make sure that the unused states
do not cause the system to hang, i.e. leave it with no way to get out of the state. We don't have to worry about this here because even if
the system does go to the 111 state, B = C = 1 holds there too, so the NAND gate clears the counter to state 000 (a valid state).
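The clearing behaviour can be illustrated with a small Python model. It is a sketch under simplifying assumptions: the temporary state 110 is treated as lasting no time at all, the state is held as an integer with C as the most significant bit, and the name `mod6_sequence` is made up for this example.

```python
def mod6_sequence(pulses):
    """Stable states of a 3-bit counter whose flip-flops are cleared
    asynchronously whenever B = C = 1 (C is the most significant bit)."""
    state = 0
    seen = [state]
    for _ in range(pulses):
        state = (state + 1) % 8      # normal MOD-8 count on the clock edge
        b = (state // 2) % 2
        c = (state // 4) % 2
        if b and c:                  # NAND output goes low
            state = 0                # temporary state is cleared at once
        seen.append(state)
    return seen


print(mod6_sequence(7))  # [0, 1, 2, 3, 4, 5, 0, 1]
```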
Binary Coded Decimal (BCD) Counters

The BCD counter is just a special case of the MOD-N counter (N = 10). BCD counters are very commonly used because most human
beings count in decimal. To make a digital clock which can tell the hour, minute and second for example, we need 3 BCD counters (for
the second digit of the hour, minute and second), two MOD-6 counters (for the first digit of the minute and second), and one MOD-2
counter (for the first digit of the hour).
Ring Counters
Ring counters are implemented using shift registers. A ring counter is essentially a circulating shift register connected so that the last
flip-flop shifts its value into the first flip-flop. There is usually only a single 1 circulating in the register, as long as clock pulses are applied.
4-bit Synchronous Ring Counter
In the diagram above, assume a starting state of Q3 = 1 and Q2 = Q1 = Q0 = 0. At the first pulse, the 1 shifts from Q3 to Q2 and the
counter is in the 0100 state. The next pulse produces the 0010 state and the third, 0001. At the fourth pulse, the 1 at Q0 is transferred
back to Q3, resulting in the 1000 state, which is the initial state. Subsequent pulses will cause the sequence to repeat, hence the name
ring counter.
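The circulation just described can be mimicked with a short Python sketch. The string representation (written Q3 Q2 Q1 Q0) and the helper name `ring_step` are assumptions chosen for readability, not part of the original design.

```python
def ring_step(q):
    """Shift a ring counter one place; q is written Q3 Q2 Q1 Q0."""
    # The last flip-flop (Q0) feeds the first (Q3); the others shift right.
    return q[-1] + q[:-1]


state = "1000"
for _ in range(4):
    state = ring_step(state)
    print(state)  # 0100, 0010, 0001, then back to 1000
```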
The ring counter above functions as a MOD-4 counter since it has four distinct states and each flip-flop output waveform has a
frequency equal to one-fourth of the clock frequency. A ring counter can be constructed for any MOD number. A MOD-N ring counter
will require N flip-flops connected in the arrangement as the diagram above.
A ring counter requires more flip-flops than a binary counter for the same MOD number. For example, a MOD-8 ring counter requires 8
flip-flops while a MOD-8 binary counter only requires 3 (2^3 = 8). So if a ring counter is less efficient in the use of flip-flops than a
binary counter, why do we still need ring counters? One main reason is that ring counters are much easier to decode. In fact, ring
counters can be decoded without the use of logic gates. The decoding signal is obtained at the output of its corresponding flip-flop.
For the ring counter to operate properly, it must start with only one flip-flop in the 1 state and all the others at 0. Since it is not possible
to expect the counter to come up to this state when power is first applied to the circuit, it is necessary to preset the counter to the
required starting state before the clock pulses are applied. One way to do this is to apply a pulse to the PRESET input of one of the flip-
flops and the CLEAR inputs of all the others. This will place a single 1 in the ring counter.
Johnson/Twisted-Ring Counters
The Johnson counter, also known as the twisted-ring counter, is exactly the same as the ring counter except that the inverted output of
the last flip-flop is connected to the input of the first flip-flop.
4-bit Synchronous Johnson Counter
The Johnson counter works in the following way : Take the initial state of the counter to be 000. On the first clock pulse, the inverse of
the last flip-flop will be fed into the first flip-flop, producing the state 100. On the second clock pulse, since the last flip-flop is still at
level 0, another 1 will be fed into the first flip-flop, giving the state 110. On the third clock pulse, the state 111 is produced. On the
fourth clock pulse, the inverse of the last flip-flop, now a 0, will be shifted to the first flip-flop, giving the state 011. On the fifth and
sixth clock pulses, using the same reasoning, we will get the states 001 and 000, which is the initial state again. Hence, this Johnson
counter has six distinct states : 000, 100, 110, 111, 011 and 001, and the sequence is repeated as long as clock pulses are applied. Thus
this is a MOD-6 Johnson counter.
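The sequence above can be reproduced with a minimal Python sketch. States are written with the first flip-flop on the left and the last on the right, and the names `johnson_step` and `johnson_states` are hypothetical helpers introduced only for this illustration.

```python
def johnson_step(q):
    """One clock pulse: the inverse of the last flip-flop enters the first."""
    inverted_last = "1" if q[-1] == "0" else "0"
    return inverted_last + q[:-1]


def johnson_states(n):
    """All states visited by an n-flip-flop Johnson counter starting at 0."""
    state = "0" * n
    states = []
    while state not in states:
        states.append(state)
        state = johnson_step(state)
    return states


print(johnson_states(3))  # ['000', '100', '110', '111', '011', '001']
```

With four flip-flops the same helper visits eight states, in line with the rule that the MOD number is twice the number of flip-flops.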
The MOD number of a Johnson counter is twice the number of flip-flops. In the example above, three flip-flops were used to create the
MOD-6 Johnson counter. So for a given MOD number, a Johnson counter requires only half the number of flip-flops needed for a ring
counter. However, a Johnson counter requires decoding gates whereas a ring counter doesn't. As with the binary counter, one logic gate
(AND gate) is required to decode each state, but with the Johnson counter, each gate requires only two inputs, regardless of the number
of flip-flops in the counter. Note that we are comparing with the binary counter using the speed up technique discussed above. Two
inputs are enough because, for each state, two of the N flip-flops will be in a combination of states that is unique to it. In the example
above, writing the states as Q2 Q1 Q0, the combination Q2 = Q0 = 0 occurs only once in the counting sequence, at the count of 0, because
the state 010 does not occur. Thus, an AND gate with inputs (not Q2) and (not Q0) can be used to decode this state. The same
characteristic is shared by all the other states in the sequence.
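The two-input decoding property can be checked by brute force. The Python sketch below searches, for each state of a 3-flip-flop Johnson counter, for a pair of flip-flops whose values occur together in that state only. Index 0 is the first flip-flop and index 2 the last; the function names are assumptions for this example.

```python
from itertools import combinations


def johnson_states(n):
    """States of an n-flip-flop Johnson counter, first flip-flop on the left."""
    state, states = "0" * n, []
    while state not in states:
        states.append(state)
        state = ("1" if state[-1] == "0" else "0") + state[:-1]
    return states


def two_input_decoders(states):
    """Map each state to a pair of flip-flop indices that identifies it."""
    n = len(states[0])
    decoders = {}
    for s in states:
        for i, j in combinations(range(n), 2):
            matches = [t for t in states if t[i] == s[i] and t[j] == s[j]]
            if matches == [s]:       # this pair of values is unique to s
                decoders[s] = (i, j)
                break
    return decoders


print(two_input_decoders(johnson_states(3)))
```

For state 000 the search returns the first and last flip-flops, so a gate fed by the inverted outputs of those two flip-flops decodes the count of 0.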
Johnson counters represent a middle ground between ring counters and binary counters. A Johnson counter requires fewer flip-flops
than a ring counter but generally more than a binary counter; it has more decoding circuitry than a ring counter but less than a binary
counter. Thus, it sometimes represents a logical choice for certain applications.
Loadable/Presettable Counters
Many synchronous counters available as ICs are designed to be presettable. This means that they can be preset to any desired starting
value. This can be done either asynchronously (independent of the clock signal) or synchronously (on the active transition of the clock
signal). This presetting operation is also known as loading, hence the name loadable counter. The diagram below shows a 3-bit
asynchronously presettable synchronous up counter.
3-bit Synchronous Binary Presettable Counter
In the diagram above, the J, K and CLK inputs are wired the same way as in a synchronous up counter. The asynchronous PRESET and
CLEAR inputs are used to perform the asynchronous presetting. The counter is loaded by applying the desired binary number to the
inputs P2, P1 and P0 and applying a LOW pulse to the PARALLEL LOAD input, not(PL). This will asynchronously transfer P2, P1
and P0 into the flip-flops. This transfer occurs independently of the J, K, and CLK inputs. As long as not(PL) remains in the LOW state,
the CLK input has no effect on the flip-flops. After not(PL) returns to HIGH, the counter resumes counting, starting from the number that
was loaded into the counter.
For the example above, say that P2 = 1, P1 = 0, and P0 = 1. When not(PL) is HIGH, these inputs have no effect. The counter will perform
normal count-up operations if there are clock pulses. Now let's say that not(PL) goes LOW at Q2 = 0, Q1 = 1 and Q0 = 0. This will
produce LOW states at the CLEAR input of Q1, and the PRESET inputs of Q2 and Q0. This will make the counter go to state 101
regardless of what is occurring at the CLK input. The counter will remain at state 101 until not(PL) goes back to HIGH. The counter will
then continue counting from 101.
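The loading behaviour in this example can be sketched as a small Python class. It is a behavioural model only: the class name `PresettableCounter` and its methods `load` and `clock` are invented for illustration, and the asynchronous PRESET/CLEAR wiring is reduced to "while not(PL) is low, the preset value is forced in".

```python
class PresettableCounter:
    """3-bit up counter with an asynchronous, active-LOW parallel load."""

    def __init__(self, q=0):
        self.q = q
        self.pl_n = 1              # not(PL), inactive when HIGH

    def load(self, pl_n, p=0):
        """Drive not(PL); while it is LOW the preset value P is forced in."""
        self.pl_n = pl_n
        if pl_n == 0:
            self.q = p % 8

    def clock(self):
        """Active clock transition; ignored while not(PL) is LOW."""
        if self.pl_n == 1:
            self.q = (self.q + 1) % 8


c = PresettableCounter(q=0b010)    # Q2 Q1 Q0 = 010
c.load(pl_n=0, p=0b101)            # counter jumps to 101 at once
c.clock()                          # no effect, not(PL) is still LOW
c.load(pl_n=1)                     # release the load input
c.clock()                          # counting resumes from 101
print(format(c.q, "03b"))          # 110
```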
A Comparison between Synchronous and Asynchronous Counters
Asynchronous counters, also known as ripple counters, are not clocked by a common pulse and hence the flip-flops in the counter
change at different times. Each flip-flop in an asynchronous counter is usually clocked by the output pulse of the preceding flip-flop.
The first flip-flop is clocked by an external event. A synchronous counter, however, has an internal clock, and the external event is used
to produce a pulse which is synchronised with this internal clock. The diagram of a ripple counter is shown below.
4-bit Ripple Counter
It can be seen that a ripple counter requires less circuitry than a synchronous counter. No logic gates are used at all in the example
above. Although the asynchronous counter is easier to construct, it has some major disadvantages over the synchronous counter.
First of all, the asynchronous counter is slow. In a synchronous counter, all the flip-flops will change states simultaneously while for an
asynchronous counter, the propagation delays of the flip-flops add together to produce the overall delay. Hence, the more bits or number
of flip-flops in an asynchronous counter, the slower it will be.
Secondly, there are certain "risks" when using an asynchronous counter. In a complex system, many state changes occur on each clock
edge and some ICs respond faster than others. If an external event is allowed to affect a system whenever it occurs (unsynchronised),
there is a small chance that it will occur near a clock transition, after some ICs have responded, but before others have. This
intermingling of transitions often causes erroneous operations. Worse still, these problems are difficult to foresee and test
for because of the random time difference between the events.
Synchronous Counter Design

A synchronous counter usually consists of two parts: the memory element and the combinational element. The memory element is
implemented using flip-flops while the combinational element can be implemented in a number of ways. Using logic gates is the
traditional method of implementing combinational logic and has been applied for decades. Since this method often results in minimum
component cost for many combinational systems, it is still a popular approach. However, there are other methods of implementing
combinational logic which offer other advantages. Some of the alternative methods which are discussed here are: multiplexers (MUX),
read-only memory (ROM) and programmable logic array (PLA).
Multiplexer
The multiplexer, also called the data selector, has n select inputs, 2^n input lines and 1 output line (and usually also a complement of
the output). Each of the 2^n possible combinations of the select inputs connects one of the input lines to the output. When used as a
combinational logic device, the n select inputs represent n variables and the 2^n input lines represent all the minterms of the n variables.
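To make the minterm idea concrete, here is a small Python sketch of a multiplexer used as a combinational logic device. The function `mux` and the majority function chosen as the example are assumptions for illustration; any truth table of n variables could be wired to the data inputs in the same way.

```python
def mux(select, data):
    """Multiplexer with n select bits (most significant first) and a list
    of 2**n data inputs; returns the selected data input."""
    index = 0
    for bit in select:
        index = index * 2 + bit
    return data[index]


# Hypothetical example: F(a, b, c) = 1 when at least two inputs are 1.
# Each data input is tied to the value of F for the matching minterm.
MAJORITY_TABLE = [0, 0, 0, 1, 0, 1, 1, 1]   # minterms 000 ... 111

print(mux((1, 0, 1), MAJORITY_TABLE))  # 1
print(mux((0, 1, 0), MAJORITY_TABLE))  # 0
```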
Read-Only Memory
The ROM is usually used as a storage unit for fixed programs in a computer. However, it can also be used to implement combinational
logic. It is useful for systems requiring changeable functions. When a different function is required, a different ROM producing this
function can be plugged into the circuit. No wiring change is necessary. The ROM has n input lines pointing to 2^n locations within the
ROM that store words of M bits. As with the MUX, each input line is used to represent a variable and the 2^n locations represent the
minterms.
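As a sketch of how a ROM could supply the combinational element of a counter, the Python fragment below stores the next state of a MOD-6 counter at the address given by the present state. The table contents and names are assumptions for illustration; swapping in a different table changes the counting sequence without any other change, which mirrors the "plug in a different ROM" idea.

```python
# ROM contents: address = present state (3 bits), word = next state.
# Addresses 6 and 7 are unused states; sending them to 0 avoids lock-up.
ROM = [1, 2, 3, 4, 5, 0, 0, 0]


def clock(state):
    """One clock pulse: the flip-flops latch the word the ROM outputs."""
    return ROM[state]


state = 0
trace = []
for _ in range(7):
    trace.append(state)
    state = clock(state)
print(trace)  # [0, 1, 2, 3, 4, 5, 0]
```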
Programmable Logic Array

The PLA is very similar to the ROM. It can be thought of as a ROM with a large percentage of its locations deleted. A ROM with 16
input address lines must have 2^16, or 65,536 storage locations, and all the words stored in these have to be decoded. The PLA only
decodes a small percentage of the minterms. The PLA is sometimes used to produce a system with a small number of chips in a
minimum time.
More information on these devices is given in article 2 of cwl3.
Making Fast Counters

Where speed is a concern....
In certain applications, speed is an important factor affecting the choice of a counter. For example, counters used in communication and
certain instrumentation applications are necessarily fast. We will be looking at some techniques commonly used to improve the speed of
a counter. To reinforce the concepts presented, some commercial counters (by Xilinx) will be considered.
General Structure of a Synchronous Binary Counter

There are two common ways in which a synchronous binary counter is structured. These are, namely, the series carry synchronous
counter and the parallel carry synchronous counter. These two counters are illustrated as follows :
Series Carry Synchronous Counter
Parallel Carry Synchronous Counter
Both counters depicted above are binary-up counters.
The T implies a T flip-flop. The flip-flop complements/toggles its output on the rising edge of a clock pulse provided its enable (EN)
input is high.
From the diagrams, it can be seen that the least significant bit Q0 toggles on every clock pulse, and subsequent bits toggle when all
preceding bits are high. The important distinction between the two counters is the way the EN signals propagate from Q0 to Q3. This is
illustrated by the highlighted paths. The signals are propagated serially in the first case and in parallel (to each AND gate) in the
second case.
The parallel carry scheme results in a much faster counter. This difference in speed is accounted for by the delay encountered during the
propagation of the EN signals. To illustrate the worst case delay in both cases, we consider a change in Q0 from 0 to 1. (see diagrams
above)
In the series carry scheme, the time to propagate the change in Q0 must take into account the propagation delays of the 3 AND gates (A,
B, C). In the parallel carry scheme, only the propagation delay of 1 AND gate has to be considered. Therefore, the minimum clock
period of the parallel scheme is shorter. Thus, the parallel synchronous carry counter operates at a greater maximum frequency. This
structure is believed to be the fastest synchronous binary counter structure. In applications that require speed, this scheme is commonly
used.
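The effect of the two carry schemes on the minimum clock period can be estimated with a back-of-envelope Python sketch. All timing figures below (flip-flop delay, AND gate delay, setup time) are made-up example values, and the simple additive model is an assumption; only the gate counts (three gates in series for 4 bits, one gate in parallel) come from the discussion above.

```python
def min_clock_period(and_gates_in_path, t_ff=5, t_and=2, t_setup=1):
    """Minimum clock period in ns for a given number of AND gates on the
    longest enable path. The three timing figures are invented examples."""
    return t_ff + and_gates_in_path * t_and + t_setup


def series_carry_period(n_bits):
    # The enable for the last bit ripples through n_bits - 1 gates
    # (gates A, B and C in the 4-bit case).
    return min_clock_period(n_bits - 1)


def parallel_carry_period(n_bits):
    # Every enable is produced by a single (wider) AND gate.
    return min_clock_period(1)


print(series_carry_period(4), parallel_carry_period(4))    # 12 8
print(series_carry_period(16), parallel_carry_period(16))  # 36 8
```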
This structure does have limitations. From the diagrams, it can be seen that a single flip-flop output (consider Q0) has to drive a number
of subsequent AND gates. The output current of a flip-flop may not be large enough to drive that many gates. It becomes a problem
when the counter gets bigger. To overcome this, a tree of AND gates is usually used. Exactly what this tree looks like is an
engineering choice. This choice will reflect the trade-off between speed requirements and the constraint mentioned above.
Although the series carry scheme is slower, it does not suffer the same drawback as the parallel carry scheme. This makes it a suitable
basis for making big counters. Its speed can be improved by using some form of Prescaling. This technique will be considered in
subsequent sections.
Prescaling
" The Concept "
The idea of prescaling is to provide a "prescaling" stage between the incoming clock frequency and the counting circuit. The prescaling
stage is sometimes provided by a dedicated prescaling device known as the Prescaler. This device/circuit is designed primarily using
Emitter Coupled Logic (ECL). ECL benefits from very fast switching capabilities. This makes it suitable for high speed counting work.
Despite its suitability to high speed counting work, it has little or no counting features since such features will only impede its operating
speed. The reader does not have to concern himself or herself with the implementation of the prescaler. The reader should, however,
understand the function it performs in the overall counter.
A prescaler generates a "clock" pulse after it has received a number of input pulses. This "clock" pulse is then fed to the counting
circuit. For example, a divide-by-n prescaler will generate a pulse when it has received n input pulses. At present, there are prescalers
that can accept frequencies ranging from a few hundred Megahertz to a few Gigahertz. The point of the prescaler is to divide
an incoming clock and thereby provide a clock to a larger, slower counting circuit.
The curious reader would probably be wondering how the actual (and faster) incoming clock frequency is actually reflected in the
slower counting circuit. There are a number of ways in which a prescaler can be used, but one sophisticated setup is the "pulse
swallowing" counter. The characteristic of a "pulse swallowing" counter is that it stops counting when a predetermined number of
pulses has been received.
The following diagram shows a down-counting Binary Coded Decimal (BCD) counter in a simplified "pulse swallowing" setup.
BCD Pulse Swallowing Counter
In the above setup, the Tens and Units sections of a BCD counter are shown. Note that a section stops counting when zero has been
reached. Consequently, a carry is also generated (UC and TC). Both sections are presettable via P3-P0. The outputs (Q3-Q0) reset to
the preset values when Pe is high. UC is fed back to the prescaler as the Mode (M) input signal. When M is high or low, the prescaler
divides by 10 or 11 respectively before generating a "clock" pulse. To demonstrate the principle of "pulse swallowing", let's consider an
example.
Suppose we preset a value of 32 (0011 0010). The outputs will have values as shown below :
Tens    Units   Mode(M)   Decimal Value
---------after 0 clock pulses-----------
0011    0010    0         32
---------after 11 clock pulses----------
0010    0001    0         21
0001    0000    1         10
0000    0000    1         00
----------------------------------------
Effectively, a "pulse swallowing" counter "swallows up" fast incoming clock pulses. This is reflected in the slower counter by
simultaneously driving the Tens and Units section. Therefore the net effect of such a combination (of prescaler and counter) is a counter
operating at a much higher speed than what it was capable of alone.
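The bookkeeping behind the worked example can be traced with a short Python sketch. It assumes, as in the example, that the Units preset does not exceed the Tens preset, and it models each prescaler output pulse as one step; the function name `pulse_swallow` and the trace format are assumptions for illustration.

```python
def pulse_swallow(tens, units):
    """Count the input pulses needed to bring a preset BCD value to 00.

    Returns the total and a trace of (input pulses so far, tens, units).
    """
    if units > tens:
        raise ValueError("this sketch assumes units does not exceed tens")
    total = 0
    trace = [(total, tens, units)]
    while tens != 0:
        if units != 0:
            total += 11          # M low: prescaler divides by 11
            units -= 1
        else:
            total += 10          # M high: prescaler divides by 10
        tens -= 1
        trace.append((total, tens, units))
    return total, trace


total, trace = pulse_swallow(3, 2)
print(total)   # 32
print(trace)   # [(0, 3, 2), (11, 2, 1), (22, 1, 0), (32, 0, 0)]
```

The total of 11 + 11 + 10 = 32 input pulses equals the preset value, which is how the fast input clock is reflected in the slower counter.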
Pipelining
Pipelining is a "predict and store" technique. It "predicts" an event (usually) one clock cycle before it is to occur. Upon prediction,
certain output value(s) (resulting from that event) are set. These new value(s) are stored/latched using flip-flops (usually D type).
They appear at the outputs on the next clock pulse when the event actually occurs. How does this actually help in speeding things up?
Let's say the detection of an event and the setting of the required outputs take 20 ns. The propagation of the outputs takes another 10 ns.
Consider the two situations where pipelining is used and not used. If the above actions had to be performed in one clock cycle, the
minimum clock period would be 30 ns (without pipelining). If these two sets of actions were performed in two separate clock periods,
the minimum clock period would be 20 ns (with pipelining). With pipelining, the overall frequency/speed of the circuit is improved. This is
illustrated schematically as follows :
" Pipelining speeds things up!! "
Fast Counters From Xilinx

# Synchronous Presettable Counter
(Xilinx Application Notes XAPP 003.002)
Maximum Clock Frequency

8 bits : 71 MHz
16 bits : 55 MHz
This counter demonstrates the parallel carry synchronous counter structure and the pipelining technique.
Let's consider an up-counting version of this counter.
Presettable Up Counter
Q- counter bits
D- preset via these inputs
On first sight, it looks complicated but the reader may have noticed that there are many similar blocks of logic circuitry. Let's take a
look at some of these blocks and see how they work.
Consider block producing Q0
Block Producing Q0 (least significant bit)
TERMINAL COUNT high
As seen below, an inverted version of Q0 is propagated through AND gate A. D0 is not propagated through B because an inverted
version of TERMINAL COUNT is fed into B. Therefore the output of B is low. With this setup, Q0 toggles on every rising edge of the
clock pulse.
when TERMINAL COUNT high
TERMINAL COUNT low
As seen below, the inverted version of Q0 is not propagated through A. D0 is propagated through B because an inverted version of
TERMINAL COUNT is fed into B. The output of the OR gate will have the value of D0. Therefore, Q0 will have the value of D0 on the
next clock pulse. It is noted that the preset value D appears as the output Q on the next clock pulse (after terminal count). This applies to
all bit stages.
when TERMINAL COUNT low
Consider block producing Q3
Block Producing Q3
TERMINAL COUNT high
As seen below, the output of the EX-OR gate C will be propagated through A. The output of C is high when either (not both) T3 or Q3 is
high. T3 is the ANDed version of all preceding outputs (Q0-Q2). (Note that in the Q1 stage, the T input is replaced by Q0.) Effectively,
Q3 stays the same when T3 is low. When T3 is high, Q3 toggles. Therefore, in all the bit stages, an output bit toggles when all the
preceding bits are high.
when TERMINAL COUNT high
TERMINAL COUNT low
The preset value is loaded on the next clock pulse as before.
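The logic of one bit stage, as described for the Q0 and Q3 blocks, can be written as a single Boolean expression. The Python sketch below is an interpretation of that description rather than a transcription of the Xilinx circuit; the function name `next_bit` and the convention of passing t = 1 for the Q0 stage (where the EX-OR reduces to an inverter) are assumptions.

```python
def next_bit(q, t, d, terminal_count):
    """Value latched by one bit stage on the next clock pulse.

    q: present output, t: AND of all lower-order outputs (1 for Q0),
    d: preset input, terminal_count: high = count, low = load preset.
    """
    count_path = (q ^ t) & terminal_count     # EX-OR gate C into AND gate A
    load_path = d & (1 - terminal_count)      # AND gate B
    return count_path | load_path             # OR gate feeding the flip-flop


print(next_bit(q=0, t=1, d=0, terminal_count=1))  # 1: bit toggles
print(next_bit(q=1, t=0, d=0, terminal_count=1))  # 1: bit holds
print(next_bit(q=1, t=1, d=0, terminal_count=0))  # 0: preset value loaded
```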
Consider the Carry connections
The Carry Connections
Let's focus our attention on the generation of the T outputs(see above). This counter uses an adapted version of the parallel carry scheme
by employing an AND gate tree. The different outputs driving the AND gates are summarised schematically as follows :
AND Gate Tree Diagrams
In this setup, Q0 is fed directly to the next bit stage and in parallel to all the T(T2-T6) AND gates. This minimises the worst case delay
(compare with the series carry scheme). Subsequent bits feed in parallel to the relevant AND gates. The additional gate delay introduced
by Tx does not affect the critical paths from Q0 to Q7 because of the way the numbers change.
Consider the Pipelining block
Pipelining TERMINAL COUNT
When the counter output is 11111110, the NAND gate output is low. Therefore, when the value preceding terminal count is detected, the
required TERMINAL COUNT value (low) is fed to the input of the flip-flop. On the next clock pulse (when it is terminal count),
TERMINAL COUNT is low ("load preset value"). This propagates the preset values (D0-D7) to the inputs of the flip-flops. Note : when
any other values are detected, the NAND gate output is high. Thus TERMINAL COUNT is high ("do not load preset value").
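The timing of the pipelined TERMINAL COUNT signal can be followed in a cycle-by-cycle Python sketch. This is a behavioural model under simplifying assumptions (one integer for the counter value, one registered flag for TERMINAL COUNT); the function name `simulate` and the example preset value 0x10 are invented for illustration.

```python
def simulate(preset, start, pulses, width=8):
    """Trace a presettable up counter with a pipelined TERMINAL COUNT."""
    top = 2 ** width - 1               # terminal count, 11111111 for 8 bits
    q, tc = start, 1                   # TERMINAL COUNT high means "count"
    trace = [q]
    for _ in range(pulses):
        # Flip-flop inputs are formed from the present state ...
        next_q = (q + 1) % (top + 1) if tc else preset
        next_tc = 0 if q == top - 1 else 1   # NAND detects 11111110
        # ... and latched together on the clock pulse.
        q, tc = next_q, next_tc
        trace.append(q)
    return trace


print([hex(v) for v in simulate(preset=0x10, start=0xFD, pulses=4)])
# ['0xfd', '0xfe', '0xff', '0x10', '0x11']
```

The preset value appears on the clock pulse after terminal count, as stated above, because the low TERMINAL COUNT was already registered one cycle earlier.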
# High-Speed Synchronous Prescaler Counter

(Xilinx Application Notes 001.002)
Max Clock Frequency

8 bits : 200 MHz
16 bits : 115 MHz
The counter demonstrates prescaling and pipelining.
The counter can be represented in a block diagram as follows :
Non-Loadable Binary Counter
The counter is implemented on a Field Programmable Gate Array (FPGA). This requires it to be implemented as tri-bit blocks (TB1
and TB2) for optimal resource usage. The reader does not need to concern himself/herself with this.
The counter employs the concept of prescaling but does not use a dedicated prescaler (ECL device). Instead, the least significant (LS)
tri-bit (Q0-Q2) provides the prescaling function. Each tri-bit responds (increments) to a clock pulse if its Count-Enable inputs (CE, CEP,
CET) are high. The CEO of the LS tri-bit is high once in every 8 clock cycles, when all its outputs are high. The "prescaler" pulse
effectively reduces the clock rate to the rest of the tri-bits by a factor of 8. Note that there is no change in the original clock rate. The 7
clock cycles when the LS CEO is low give the CEO-CET ripple chain (of subsequent tri-bits) time to settle. If this prescaling were not
done, the settling time would have to be taken into account when determining the minimum clock period of the counter. This would
significantly lengthen the minimum clock period, thereby slowing the counter down. This will become clearer when we examine the
actual implementation of this counter.
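The prescaling role of the LS tri-bit can be checked with a small Python model. It is a sketch under assumptions: the higher-order tri-bits are lumped into one integer `upper` with no width limit, and the function name `step` is made up. The loop confirms that gating the upper bits with CEO still yields an ordinary binary count.

```python
def step(ls, upper):
    """One clock pulse. ls is the least significant tri-bit (0 to 7),
    upper is the value held by all the higher-order tri-bits."""
    ceo = 1 if ls == 0b111 else 0    # high once in every 8 clock cycles
    next_ls = (ls + 1) % 8           # the LS tri-bit counts on every pulse
    next_upper = upper + ceo         # the rest count only when CEO is high
    return next_ls, next_upper


ls, upper = 0, 0
for n in range(1, 101):
    ls, upper = step(ls, upper)
    assert upper * 8 + ls == n       # same count as an ordinary counter
print(upper, ls)  # 12 4, since 100 = 12 * 8 + 4
```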
TB1
TB2
Note : all clock inputs are assumed to be driven by a common clock. Qa and Qc represent the LSB and MSB of a tri-bit respectively.
Generation of CEO in TB1(A) and TB2(B)
CEOs are high only when the CEs (or CETs) and the outputs Q (Qa-Qc) are high.
Generation of Qa in TB1(A) and TB2(B)
When the Count-Enable inputs are high, the EX-OR gate complements Qa. Therefore, the Count-Enable inputs effectively "enable" or
"disable" the complementing function.
Generation of Qb in TB1(A) and TB2(B)
When both Qa and the Count-Enable inputs are high, the complementing function is enabled.
Generation of Qc
The generation of Qc is similar to the generation of Qb except that the value of Qb is also fed into the AND gate.
The Ripple Chain
The delay of this ripple chain is the sum of all the gate delays presented by the chain of AND gates. To appreciate the significance of this
delay, let's consider an example. Suppose the current value of Q23-Q3 is 01111...(all ones)...110. On the next "prescaler" pulse, Q3
will become 1. This "information" about the change in Q3 has to be propagated to subsequent tri-bits before the next "prescaler" pulse. This is
necessary to ensure the correct changes (on that pulse) to subsequent bits. As evident from the diagrams, the worst case delay
is the propagation of this "information" from Q3 through the ripple chain and to the flip-flop input of Q23. This delay is a major and
common problem with most binary counters. The "prescaler" accommodates this delay by allowing time for the propagation of this
"information". This does not affect the effective operating frequency of the counter because the "prescaler" (LS tri-bit) still operates at
the faster clock rate.
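As a rough worked constraint, let $T_{clk}$ be the clock period and $t_{chain}$ the worst-case delay through the AND-gate chain from Q3 to the flip-flop input of Q23. Both symbols are introduced here for illustration, and flip-flop clock-to-output and set-up times are ignored, so this is only a first-order sketch:

$$
\text{without prescaling: } T_{clk} \ge t_{chain}
\qquad\qquad
\text{with the LS tri-bit as prescaler: } 7\,T_{clk} \ge t_{chain}
$$

Under these assumptions the chain may take about seven clock periods to settle (the cycles in which the LS CEO is low), so the chain delay no longer sets the minimum clock period directly.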
The speed of the counter can be improved further by pipelining the LS CEO signal :
Pipelining CEO
We see that when 110 is detected, CEO is set and fed to the flip-flop input. This value appears as CEO on the next clock pulse (when 111
occurs). CEO is therefore driven directly by a flip-flop output rather than by a gate that decodes 111.
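A minimal sketch of this pipelining step, assuming a free-running LS tri-bit (its enable always high) and a hypothetical function name `run_pipelined_ceo`: the count 110 is decoded one cycle early and registered, so the registered CEO is high exactly while the tri-bit holds 111.

```python
def run_pipelined_ceo(n_clocks):
    """Record (tri-bit value, registered CEO) for each clock cycle."""
    q = 0          # LS tri-bit value (Q2..Q0), assumed to count on every clock
    ceo_reg = 0    # flip-flop that holds the pipelined CEO
    trace = []
    for _ in range(n_clocks):
        trace.append((q, ceo_reg))
        d = 1 if q == 0b110 else 0   # detect 110, one count before 111
        q = (q + 1) & 0b111          # the counter and the CEO flip-flop
        ceo_reg = d                  # are clocked by the same edge
    return trace
```

In the returned trace the registered CEO is 1 only in the entries where the tri-bit value is 7, which matches what a combinational decode of 111 would give, but without the decoding gate delay after the clock edge.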
The actual implementation of the LS tri-bit with pipelining is seen below :
Implementation of LS tri-bit with Pipeline
# Ultra-Fast Synchronous Counter

(Xilinx Application Notes XAPP 014.001)
Maximum Clock Frequency

8 bits : 256MHz
16 bits : 108MHz
Ultra-Fast Counter
In this counter, the LS bit Q0 acts as the "prescaler". The effective "clock" rate provided by this prescaler is 1/2 of the actual clock rate.
In the previous example, the distribution of the CEO signal (from the LS to the MS tri-bit) introduces a line transmission delay. This
counter eliminates the delay by replicating Q0 for bits after Q1. This is done by the following chain/network of flip-flops :
Network To Replicate Q0
To best describe the function of such a network, let's take a look at the timing diagram depicting the output values :
The Timing Diagram
It is seen that all QX0 outputs are in sync with Q0 after the initial delays. The effect of this is that bits after Q1 appear to be driven
directly by Q0 and without the line transmission delay. This shortens the minimum clock period.
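The replication idea can be sketched as follows. The exact wiring of the Xilinx network is not reproduced here, so this is an assumed arrangement rather than the actual circuit: each replica flip-flop loads the complement of the previous stage, which works because Q0 toggles on every clock. The function name `replicate_q0` is hypothetical.

```python
def replicate_q0(initial_replicas, n_clocks):
    """Record (Q0, replica outputs) per clock for an assumed chain of replica flip-flops."""
    q0 = 0
    replicas = list(initial_replicas)   # arbitrary power-up values
    history = []
    for _ in range(n_clocks):
        history.append((q0, tuple(replicas)))
        # All flip-flops sample on the same clock edge: the first replica takes
        # NOT Q0, and every later replica takes NOT of the stage before it.
        replicas = [1 - q0] + [1 - r for r in replicas[:-1]]
        q0 = 1 - q0                     # Q0 toggles on every clock
    return history
```

With three replicas starting from the arbitrary values `[1, 0, 1]`, the outputs disagree with Q0 for the first clocks and all equal Q0 from the third clock onwards, which corresponds to the "initial delays" in the timing diagram.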
The Second "Prescaler"
Here Q1 and Q2 act as the second "prescaler". This additional prescaler is needed to accommodate a large counter (more bits). The
effective "clock" rate provided by this prescaler to the rest of the counter is 1/8 of the actual clock rate. This second prescaling stage
allows the rest of the counting circuit to employ the series carry scheme. The use of such a carry scheme allows a larger counter to be
constructed.
From the diagram, CEP2 is pipelined. When the LS three bits are 101 (Q2-Q0), the output of AND gate A is high. Since Q0 is high at
this point, the value of A is selected and appears at the output of multiplexer B. This value is fed to the flip-flop input. On the next
clock cycle, the value appears as CEP2 (high). At this point, the LS three bits are 110. Since QY01 is low, CEP2 is selected by the
multiplexer. Thus, on the next clock cycle, CEP2 is high again. This is summarised below :
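| Q2 | Q1 | Q0 | A | D-input (flip-flop) | CEP2 |
|----|----|----|---|---------------------|------|
| 1  | 0  | 0  | 1 | 0                   | 0    |
| 1  | 0  | 1  | 1 | 1                   | 0    |
| 1  | 1  | 0  | 0 | 1                   | 1    |
| 1  | 1  | 1  | 0 | 0                   | 1    |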
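The rows of the table above can be reproduced with a short sketch. Two points are assumptions inferred from the table rather than taken from the diagram: AND gate A is modelled as Q2 AND (NOT Q1), and the multiplexer select is modelled as Q0 itself (the text also names a signal QY01, treated here as equal to Q0). The function name `cep2_trace` is hypothetical.

```python
def cep2_trace(n_clocks):
    """Return rows (Q2, Q1, Q0, A, D, CEP2) for each clock cycle, counting up from 000."""
    q = 0       # the three low-order bits, Q2..Q0
    cep2 = 0    # output of the pipeline flip-flop
    rows = []
    for _ in range(n_clocks):
        q2, q1, q0 = (q >> 2) & 1, (q >> 1) & 1, q & 1
        a = q2 & (1 - q1)          # assumed AND gate A: Q2 AND (NOT Q1)
        d = a if q0 else cep2      # multiplexer B: select A when Q0 is high, else hold CEP2
        rows.append((q2, q1, q0, a, d, cep2))
        cep2 = d                   # the flip-flop loads D on the clock edge
        q = (q + 1) & 0b111
    return rows
```

Entries 4 to 7 of `cep2_trace(8)` match the four rows of the table: CEP2 is loaded while the bits are 101, held while they are 110, and cleared again after 111.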
References
1. Title: Counting and Counters
Author(s): R M M Oberman
Source: The Macmillan Press Ltd
2. Title: Electronic Counters
Author(s): R M M Oberman
Source: The Macmillan Press Ltd
3. Title: Logic Design Principles
Author(s): Edward J. McCluskey
Source: Prentice Hall International
4. Title: Digital System Design
Author(s): Barry Wilkinson with Rafic Makki
5. Title: Digital Design : Principles and Practices
Author(s): John F. Wakerly
6. Title: Digital Systems : Principles and Practices
Author(s): Ronald J. Tocci
7. Title: Practical Digital Design Using ICs
Author(s): Joseph D. Greenfield
Source: John Wiley & Sons
8. Title: Digital Logic Design
Author(s): Brian Holdsworth
Source: Butterworth-Heinemann Ltd
9. Title: Digital Design
Author(s): M. Morris Mano
Source: Prentice Hall International
10. Title: Logic Design Principles
Author(s): Edward J. McCluskey
11. Title: Digital Electronics
Author(s): Christopher E. Strangio
12. Title: Digital Logic and State Machine Design
Author(s): David J. Comer
Source: Saunders College Publishing
13. Title: Digital Circuits and Microprocessors
Author(s): Herbert Taub
14. Title: The Programmable Logic Data Book
Author(s): Xilinx Inc.
Source: Xilinx Inc.
15. Title: Digital Logic and Computer Design
Author(s): M. Morris Mano
16. Title: ISE Second Year Digital Electronics Notes
Author(s): Mike Brookes
Source: Mike Brookes
