You are on page 1of 20

- 1 -

Advanced Digital Signal Processing




Term paper Tutorial

Fast Algorithms for Discrete Cosine
and Sine Transforms




R99943006




- 2 -

Contents
I. Abstract ............................................................................................................. 3

II. Introduction of DCT/DST ........................................................................ 3

III. Fast Algorithm for 1-D DCT/DST
1. The explicit forms of orthonormal DCT/DST .............................................. 5
2. 1-D DCT-I/DST-I ................................................................................................. 8
A. The split-radix DCT-I algorithm ..................................................................... 8
B. The split-radix DST-I algorithm ............................................................... 11
3. 1-D DCT-II/DST-II
A. The split-radix DCT-II algorithm ............................................................. 13
B. DCT-II computation via Walsh Transform ............................................ 15

IV. Fast Algorithm for 2-D DCT
1. Introduction of 2-D DCT ................................................................................. 17
2. Fast Algorithm for 2-D DCT-II ...................................................................... 18

V. Conclusion .................................................................................................... 19

VI. Reference ...................................................................................................... 19






- 3 -

I. Abstract
Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST) are
widely utilized in different area. Since the DCT became the standard of image
and video compression and processing, the fast algorithm for DCT/DST has been
more and more essential nowadays. In this tutorial, we first analyze the properties
of DCT and DST. After that, we not only introduce split-radix fast algorithm for
1-D DCT-I, DST-I, DCT-II/DST-II, but also show another fast transform for
DCT-II via Walsh transform. In addition to the 1-D algorithm, we also extend the
concept of 1-D fast algorithm to 2-D algorithm.
II. Introduction of DCT and DST
A DCT/DST uses sum of cosine/sine functions oscillating at different
frequencies to express a sequence of data. DCTs are important in many areas like
lossy compressions of audio and images and spectral methods for the numerical
solution of partial differential equations. Besides, when input is even or function,
either DCT-I or DST-I is capable of replacing Discrete Fourier Transform (DFT)
in spectral analysis to reduce computation complexity [2]. There are eight types
of DCTs and DSTs respectively, and four types of DCT/DST are more common.
Four common types of DCT are shown in matrix form and denoted by
I
N+1
C ,
II
N
C ,
III
N
C ,
IV
N
C . And they are defined as









- 4 -








Their corresponding four types of DST in matrix form denoted by
I
N-1
S ,
II
N
S ,
III
N
S ,
IV
N
S , are respectively defined as















There are some important properties of the four-type DCTs and four-type
DSTs and are shown as follows:
They are real-valued and orthonormal.
They are intrinsically related to generalized DFT. [3]
The inverse of DCT/DST matrix is simply the transpose matrix of
original matrix.
- 5 -

Since the range is large if we illustrate all of DCT and DST mentioned above,
we only focus on type-I and type-II DCT/DST which have been discussed in our
class. The application of type-I DCT/DST is the replacement of DFT to reduce
computation complexity. And the type-II of DCT is mainly used in image
compression because of its excellent capability in energy compaction.
Moreover, there are relations between DCT and corresponding DST, which
can be used to reduce the implementation complexity. Specifically, the matrix
II
N
C is related to matrix
II
N
S by [4, 5]

where
N
J is the cross-identity (reflection) matrix and =diag{( 1) },
k
N
D
0,1, , 1 k N = . Therefore, the efficient implementation of DST-II can be
obtained from the one of DCT-II by proper order reversal and sign changes.
Consequently, in the following chapter, only DCT-I, DST-I and DCT-II will be
discussed.
III. Fast Algorithm for 1-D DCT/DST
1 The explicit forms of orthonormal DCT/DST
Before we introduce the fast algorithm for DCT/DST, we first derive the
explicit forms of DCT/DST matrices for N=2, 4 and 8. By deriving the explicit
form, the properties of symmetry and anti-symmetry in DCT/DST are observed,
which can be used in the fast algorithm. Moreover, the higher-order matrices (ie.
the matrices with N>8) can be generated from lower-order ones (ie. the matrices
with N = 2, 4 and 8) by recursive sparse matrix factorization.
For N = 2, 4 and 8, the explicit forms of DCT-I matrix
I
N+1
C defined in
Chapter II are as follow:

- 6 -
















The elements of the DST-I matrix
I
N-1
S are defined in Chapter II. For N = 2,
the matrix
( )
I
1
S 1 = , and for N = 4 and 8, the explicit form are as following:











Finally, For N = 2, 4 and 8, the explicit forms of DCT-II matrix
II
N
C
- 7 -

defined in Chapter II are as follow:












By investigating the rows of the matrices
I
N+1
C ,
I
N-1
S and
II
N
C , the properties
of symmetry can be found. We observe that the location of the symmetry center is
determined by the number of the row (or the number of column because the two
are the same). Here, we call the number of the row of a matrix as the order of a
matrix. When the order of the matrix is even, the symmetry center is in the
midpoint between two center elements of the row. And when the order of the
matrix is odd, the symmetry center is in the midpoint of the row. We denote the
-th n element of the -th k row in the matrix by
kn
+ . Then, the row is symmetric if
, 1 k N n kn
+ = + , and anti-symmetric if
, 1
, 0,1, , 2 1.
k N n kn
n N

+ = + = We
can see that even-indexed rows are symmetric and odd-indexed rows are
anti-symmetric, which can be expressed as follow:
( )
, 1
1 , 0,1, , 2 1
k
k N n kn
n N

+ = + =
By Ref. [6][7], after proper bit-reversal, matrices
N
T with even order and
symmetric property may be factorized into sparse matrices as two following
forms:

- 8 -




Where and are matrices of order 2 N with symmetry and anti-symmetry.
And and are respectively the results of the multiplication of and
by reflection matrix
In addition to even order matrices, it is possible for odd order matrices to use
sparse matrix factorization as well, except for the center rows and columns. The
symmetry and anti-symmetry in the rows of the DCT and DST matrices are
significant characteristics in factorizing the matrices.
2 1-D DCT-I/DST-I
As we have mentioned before, DCTs/DSTs are intrinsically related to
generalized DFT. The fast algorithm for DFT, usually called fast Fourier
transform (FFT) algorithm, has been investigated and studied for decades. Hence,
the concepts of FFT can also be exploited in Fast algorithm of DCT/DST. In the
following sections, one of the FFT called split-radix FFT is going to be extended
to DCT-I, DST-I and DCT-II. Therefore, we introduce its concepts first.
Spit-radix FFT algorithm is a variant of the Cooley-Tukey FFT algorithm.[8]
Instead of one radix, a blend of radices 2 and 4 is used in spit-radix FFT. It
recursively decomposes a N-point DFT into one N/2-point DFT and two
N/4-point DFTs. With the use of different radix in the decomposition, split-radix
FFT achieves lower computation than radix-2
m
FFT, for m is positive integer.
A. The split-radix DCT-I algorithm
The idea of split-radix FFT algorithm can be extended to the DCT-I [9]. The
formulae of the split-radix fast DCT-I algorithm without normalized factors can
be derived as follows:
- 9 -













Therefore, the first stage decomposes the (N+1)-point DCT-I into one
(N/2+1)-point DCT-I, one (N/4+1)-point DCT-I and (N/4-1)-point DST-I. The
decomposition works recursively and the final output sequence is in bit-reversed
order. The generalized signal flow graph with regular structure is shown in
Fig.3.1 for N = 2, 4 and 8 with =1 2 o .
Based on the split-radix algorithm, the matrix
I
N+1
C can be decomposed
recursively and is shown as follows:










1 N
P
+
is a permutation matrix for reordering from bit-reversal to natural order.
- 10 -

















Fig. 3.1.
And and are un-normalized DST-I and DCT-I matrices
respectively shown as










- 11 -

where and are bit-reversal permutation matrices. And is a
rotation matrix as follow:








B. The split-radix DST-I algorithm
The idea of split-radix FFT algorithm used in last Section can be extended to
the DST-I as well. And the formulae of split-radix fast DST-I algorithm without
normalization factors can be expressed as follows:













So, we can see that the first stage decomposes the (N+1)-point DST-I into
one (N/2+1)-point DST-I, one (N/4+1)-point DCT-I and (N/4-1)-point DST-I. The
decomposition works recursively and the final output sequence is in bit-reversed
- 12 -

order. Since the DST-I matrix for N = 2 is trivial, the generalized signal flow
graph with regular structure is only shown for N = 4 and 8 with =1 2 o in Fig.
3.2.









Fig. 3.2
Based on the split-radix algorithm, the matrix
I
N-1
S can be decomposed
recursively and is shown as follows:











- 13 -






-1 N
P is a permutation matrix for reordering from bit-reversal to natural order.
And and are un-normalized DST-I and DCT-I matrices respectively
shown as



where and are bit-reversal permutation matrices. And is a
rotation matrix as follow:








3 1-D DCT-II/DST-II
A. The split-radix DCT-II algorithm
In addition to 1-D DCT-I and 1-D DST-I, the idea of split-radix FFT
algorithm used can be extended to the DST-II as well. And the formulae of
split-radix fast DCT-II algorithm without normalization factors can be expressed
as follows:


- 14 -































Fig. 3.3
- 15 -

Hence, the first stage decomposes the (N+1)-point DCT-II into one
N/2-point DCT-II, one N/4-point DCT-II and N/4-point DST-II. The
decomposition works recursively and the final output sequence is in bit-reversed
order. The generalized signal flow graph with regular structure is shown in Fig.
3.3 for N = 2, 4 and 8. Moreover, the matrix
II
N
C can be decomposed recursively
and is shown as follows:


where is bit-reversal matrices reordering from bit-reversal to natural order .
And matrix is given as follow:





where is a rotation matrix as follow:








B. DCT-II computation via Walsh Transform
Besides split-radix algorithm, Walsh transform (WT), another kind of
generalized DFT, can also be exploited to fast algorithm of DCT. The concept of
fast algorithm of DCT-II via WT is that the DCT-II matrix with
symmetry/anti-symmetry can be realized via other simpler matrix with
- 16 -

symmetry/anti-symmetry. Since the properties of symmetry/anti-symmetry in WT
matrix are the same with the DCT-II matrix and the element in WT matrix
consists only 1 , WT matrix is an ideal matrix to replace the DCT-II matrix.
Denote the DCT-II and WT respectively in matrix-vector notation without
the normalized factors as

If we rearrange the rows of
II
N
C and
N
W , the formulae can be expressed as

Since

N
W is an orthonormal matrix, we can get the following results by
substitute the relation

T
N N N
W W I = into the last equation:

where
N
T is the conversion matrix transferring Walsh domain vector to the
DCT-II domain. Therefore, there are two steps in this fast algorithm: first,
compute the WT of the input vector; second, convert from Walsh domain to the
DCT domain via
N
T . The signal flow graph for the DCT-II computation via WT
for N = 8 is shown in Fig.3.4(a) and (b).











Fig.3.4(a). Step1: Compute the WT of the input vector.

- 17 -













Fig.3.4(b). Step2: Convert from Walsh domain to the DCT domain via
N
T .
IV. Fast Algorithm for 2-D DCT
1 Introduction of 2-D DCT
The most common and significant use of 2-D DCT algorithm is the
application of DCT-II (typically 4 4, 8 8, 16 16 ) in compression of digital
image and video. Therefore, among eight types of DCT and their corresponding
DST, 2-D DCT-II is the only one that research efforts have been concentrated on.
And there is no exception for this tutorial that we only focus on the topic of 2-D
DCT-II fast algorithm.
The 2-D DCT-II for an N N input data matrix { }
mn
x , m, 0,1, , 1 n N = ,
is defined as follow


and the inverse 2-D DCT-II as follow
- 18 -






2 Fast algorithm for 2-D DCT-II
Generally, there are two kinds of approach of existing fast 2-D DCT-II
algorithm: indirect and direct. In the indirect approach, the 2-D DCT is computed
via other 2-D discrete orthogonal transforms like 2-D DFT of the same size. In
the direct approach, there are two methods based on the direct 2-D DCT
computation. The first one is called row-column method. It utilizes the
separability property of 2-D DCT kernel by applying any fast 1-D DCT algorithm
to the rows and columns of the input matrix sequentially. Hence, for an N N
input data matrix, where =2
n
N , the row-column method requires totally
2
2 log N N multiplication. The second method is a 2-D vector method [10]
utilizing 2-D decomposition process. Although its computation efficiency
outperforms the row-column method, in this tutorial, we only introduce the
row-column one because it combines the 1-D DCT fast algorithm we have
already learned in previous sections and will be easier to understand.
In the row-column method [11], the transform kernel of 2-D DCT-II is
derived in the form as
cos( ) cos( )=[ cos( )+cos( )] 2 o | o | o | + .
Thus, the multiplications required are reduced and only the ones for the
computation of 1-D -point N DCT-II remain. Therefore, for N N DCT-II, data
reordering and mapping before N 1-D -point N DCT-II is needed. Moreover, a
regular butterfly structure for additions after 1-D DCT-II is also included. The
corresponding signal flow graph for 4 4 DCT-II computation is shown in
- 19 -

Fig.4.1.










Fig. 4.1. Signal flow graph for 4 4 DCT-II computation.
Any 1-D DCT-II fast algorithm can be used.
V. Conclusion
In this tutorial, we first analyze the symmetric and anti-symmetric properties
of DCT and DST. After that, split-radix fast algorithm for 1-D DCT-I, DST-I,
DCT-II/DST-II is introduced. Moreover, another fast transform for DCT-II via
Walsh transform is also shown. In addition to the 1-D algorithm, we also extend
the concept of 1-D fast algorithm to 2-D algorithm utilizing the separablility
property.
VI. References
[1] V. Britanak, P. C. Yip, K. R. Rao, Discrete Cosine and Sine Transforms:
- 20 -

General Properties, Fast Algorithms and Integer Approximations, Academic
Press, 2007
[2] ,
[3] S. A. Martucci, Symmetric convolution and the discrete sine and cosine
transforms, IEEE Transactions on Signal Processing, Vol. 42, No. 5, May
1994, pp. 10381051.
[4] Z. Wang, A fast algorithm for the discrete sine transform implemented by the
fast cosine transform, IEEE Transactions on Acoustics, Speech, and Signal
Processing, Vol. ASSP-30, October 1982, pp. 814815.
[5] Z.Wang, Fast algorithms for the discreteWtransform and for the discrete
Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal
Processing, Vol. ASSP-32, August 1984, pp. 803816.
[6] S. Venkataraman, V. R. Kanchan, K. R. Rao and M. Mohanty, Discrete
transforms via the WalshHadamard transform, Signal Processing, Vol. 14,
No. 4, June 1988, pp. 371382.
[7] G. Plonka and M. Tasche, Fast and numerically stable algorithms for discrete
cosine transforms, Linear Algebra and its Applications, Vol. 394, No. 1,
January 2005, pp. 309345.
[8] Split-radix FFT algorithm , Wikipedia,
http://en.wikipedia.org/wiki/Split-radix_FFT_algorithm
[9] C. W. Sun and P. Yip, Split-radix algorithms for DCT and DST,
Proceedings of the Asilomar Conference on Signals, Systems and Computers,
Pacific Grove, CA, November 1989, pp. 508512.
[10] S. C. Chan and K. L. Ho, A new two-dimensional fast cosine transform
algorithm, IEEE Transactions on Signal Processing, Vol. 39, No. 2, February
1991, pp. 481485.
[11] N. I. Cho and S. U. Lee, Fast algorithm and implementation of 2-D discrete
cosine transform, IEEE Transactions on Circuits and Systems, Vol. 38, No. 3,
March 1991, pp. 297305.

You might also like