
Analysis of Circular Buffer Rate Matching for LTE Turbo Code

Jung-Fu (Thomas) Cheng*, Ajit Nimbalker+, Yufei Blankenship+, Brian Classon+, and T. Keith Blankenship+
* Ericsson Research, RTP, NC, USA
+ Motorola Labs, Schaumburg, IL, USA
Email: thomas.cheng@ericsson.com, {a.nimbalker, yufei.blankenship, brian.classon, keith.blankenship}@motorola.com
Abstract: This paper discusses circular buffer rate matching (CBRM) algorithms for the turbo code in the Long Term Evolution (LTE) of the WCDMA-based air interface. To enhance performance at high code rates, systematic bit puncturing is incorporated in conjunction with the CBRM. The RM algorithm is further optimized based on the algebraic properties of the QPP interleavers and the 8-state recursive systematic convolutional code of the LTE turbo code.

Index Terms: turbo codes, rate matching, circular buffers, catastrophic puncturing.

I. INTRODUCTION

The explosive growth of mobile phone users and the increasing demand for broadband wireless access have led to the development of a long term evolution (LTE) for the WCDMA-based air interface by the Third Generation Partnership Project (3GPP). Some key minimum requirements of LTE include packet data support with peak data rates of 300 Mbps on the downlink and 50 Mbps on the uplink, a low maximum latency (10 ms MAC-layer round-trip delay), and flexible bandwidth support. These requirements have led to the adoption of OFDM-based modulation and multiple access, MIMO antenna schemes, and adaptive modulation and coding with advanced channel coding and hybrid ARQ protocols.

To address the high data rate requirements of LTE, the 3GPP working group undertook a rigorous evaluation of advanced channel coding candidates (turbo and LDPC codes). Consequently, it was decided to adopt the rate-1/3 WCDMA turbo code with a new contention-free internal interleaver based on quadratic permutation polynomials (QPP) to facilitate efficient high-speed turbo decoding [1-3]. The QPP interleaver requires small parameter storage, provides excellent performance, and, most importantly, allows highly flexible parallelization due to its maximum contention-free property.

The 3GPP working group also investigated the performance of the LTE turbo code in conjunction with different rate matching (RM) algorithm proposals. An RM algorithm repeats or punctures the bits of a mother codeword to generate a requested number of bits according to a desired code rate, which may differ from the mother code rate of the turbo coder. The RM algorithm should also facilitate enhanced hybrid ARQ (HARQ) operation by minimizing repetition of coded bits (when possible) in subsequent retransmissions of a packet, in order to increase coding gains via incremental redundancy (IR). Considering that the RM algorithm in 3GPP HSDPA is lacking in performance for certain code rates and block sizes, the topic of RM was extensively studied to devise a better solution for LTE.

In this paper, we investigate the design and optimization of low-complexity, high-performance RM algorithms based on circular buffers for LTE. Section II describes the design of the circular buffer rate matching algorithms with systematic bit puncturing. The optimization of sub-block interleavers based on the congruence property of the QPP interleaver is also presented. The optimization of sub-block interleavers is further analyzed with the theory of catastrophic puncturing avoidance in Section III. Numerical analysis is presented in Section IV, and the paper is concluded in Section V.

II. CIRCULAR BUFFER RATE MATCHING DESIGN

The 3GPP turbo code is a systematic parallel concatenated convolutional code with two 8-state constituent encoders and one turbo code internal interleaver. Each constituent encoder is independently terminated by tail bits. For an input block size of K bits, the output of a turbo encoder consists of three length-K streams, corresponding to the systematic bit stream and two parity bit streams (referred to as the Systematic, Parity 1, and Parity 2 streams in the following), as well as 12 tail bits due to trellis termination. Thus, the actual mother code rate is slightly lower than 1/3. In LTE, the tail bits are multiplexed to the end of the three streams, whose lengths are hence increased to (K+4) bits each.

In the circular buffer rate matching (CBRM) method for rate-1/3 turbo codes [4], each of the three output streams of the turbo coder is rearranged with its own sub-block interleaver.
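As a quick check of the remark that the actual mother code rate is slightly lower than 1/3, the rate follows directly from the stream lengths given above. The symbol R_mother and the two example block sizes are introduced here only for illustration:

```latex
R_{\text{mother}} = \frac{K}{3(K+4)} = \frac{K}{3K+12} < \frac{1}{3},
\qquad
K = 40:\ \frac{40}{132} \approx 0.303,
\qquad
K = 6144:\ \frac{6144}{18444} \approx 0.3331 .
```

The loss from the 12 tail bits is therefore noticeable only for short blocks.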

Figure 1 Operations of circular buffer rate matching for turbo code.

Figure 2 Conceptual composition of a circular buffer for LTE turbo code rate matching. White, red and blue cells contain bits from the Systematic, Parity 1 and Parity 2 streams, respectively. Green cells mark starting points of redundancy versions defined for RM. The number of columns for each stream is 32 (not shown to scale).

Then, a single output buffer is formed by placing the rearranged systematic bits at the beginning, followed by bit-by-bit interlacing of the two rearranged parity streams (see Figure 1). Interlacing allows equal levels of protection for each constituent code. Since each sub-block interleaver is typically based on row-column permutations, the circular buffer (CB) can also be visualized in a 2-dimensional format (the 1-dimensional CB is obtained by reading bits column by column), as shown in Figure 2. Assuming 32 columns in each sub-block interleaver, the first 32 columns in Figure 2 represent the 32 columns of the systematic bits after sub-block interleaving, while the remaining 64 columns represent bit-by-bit interlaced columns of the permuted Parity 1 and Parity 2 streams.

For a desired code rate, the number of coded bits N_data to be selected for transmission is passed to the RM algorithm. The bit selection step of the CBRM simply reads out the first N_data bits from the start of the buffer. In general, the bits to be selected for transmission can be read out starting from any point in the buffer. If the end of the buffer is reached, the reading continues by wrapping around to the beginning of the buffer (hence the term circular buffer). Thus, puncturing and repetition are achieved using a unified method. The CBRM algorithm has advantages in flexibility (in code rates achieved) and also granularity (in stream sizes). Typically, low-complexity sub-block interleavers that facilitate uniform puncturing at any code rate are preferred [4].

Next, we describe two performance enhancements that were adopted in the LTE CBRM algorithm.

A. Systematic Bit Puncturing

Incremental Redundancy (IR) based HARQ operation is a key performance enabler in LTE. Thus, an LTE RM algorithm is expected to provide different subsets of the codeword, each denoted a redundancy version (RV), for different transmissions of a packet (i.e., to minimize repetition of coded bits when possible).
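The bit-selection step described above (read N_data entries from some starting point and wrap around at the end of the buffer) can be sketched as follows. This is a minimal illustration, not the procedure of the specification: the function name `select_bits` and the toy buffer are hypothetical, and the handling of dummy bits is left out.

```python
def select_bits(buffer, n_data, start=0):
    """Read n_data entries from a circular buffer, wrapping at the end."""
    if not buffer:
        raise ValueError("buffer must not be empty")
    size = len(buffer)
    return [buffer[(start + k) % size] for k in range(n_data)]


cb = list(range(12))                    # toy buffer holding 12 coded-bit labels
punctured = select_bits(cb, 8)          # n_data < buffer size: puncturing
repeated = select_bits(cb, 15)          # n_data > buffer size: repetition
offset = select_bits(cb, 6, start=9)    # a different starting point
```

The same routine covers both puncturing and repetition, and changing `start` is all that is needed to obtain a different subset of the codeword.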
In CBRM, different RVs can be specified by simply defining different starting points (from which to start reading out) in the CB. For the first transmission (RV=0), it is conventionally assumed that the bits are read out from the beginning of the CB, which means that all systematic bits are always selected and puncturing, if needed, is applied to parity bits only. However, it is well known that, for turbo codes with well-designed interleavers, most of the Hamming weight in the minimum distance (and other terms in the distance spectrum) resides in the parity streams. Therefore, if excessive puncturing is applied to the parity bits, the effective minimum distance of the punctured code can degrade significantly at high code rates. For this reason, a small amount of systematic bit puncturing has been proposed in the literature to enhance performance at high coding rates [5-7].

Initial transmissions with high code rates (with HARQ operation to improve reliability) are expected in LTE to reach the aggressive data rate targets. Therefore, for LTE, it was agreed to puncture a small portion of the systematic bits by starting the first RV (RV=0) at an offset from the beginning of the CB. With the 2-D circular buffer, systematic bit puncturing is easily enabled by skipping the first σ columns in the CB for the first RV. The setting of σ=2 adopted in LTE thus results in puncturing ~6% of the systematic bits (2 of the 32 systematic columns).

B. Sub-block Interleaver Optimization with Address Offset

In LTE, a total of 188 byte-aligned QPP interleavers are defined for the turbo code [1]. Each QPP interleaver has the even-even property, by which even indices in the input block are mapped to even indices after the permutation. With this in mind, consider for the CBRM a sub-block interleaver of length K+4 derived as follows, where K is the QPP interleaver length. The input block is written row by row, starting from the top row, into a ceil((K+4)/32) × 32 matrix (dummy bits are prepended if the matrix is not full). The columns are then permuted via the length-32 permutation given below.

ColPerm = [0, 16, 8, 24, 4, 20, 12, 28, 2, 18, 10, 26, 6, 22, 14, 30, 1, 17, 9, 25, 5, 21, 13, 29, 3, 19, 11, 27, 7, 23, 15, 31]. (1)

The output of the sub-block interleaver is then generated by reading the column-permuted matrix column by column, nominally starting from the 1st column (dummy bits are skipped). It can be seen that the sub-block interleaver places all even indices followed by all odd indices in the rearranged sub-block (ignoring the tail bits).
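The sub-block interleaver just described can be sketched in a few lines. This is an illustrative reading of the text, not reference code: the function name `subblock_order` is hypothetical, the stream is represented by its indices 0..K+3 rather than by bit values, and K=40 is used only as a small example.

```python
import math

# Length-32 inter-column permutation from (1).
COL_PERM = [0, 16, 8, 24, 4, 20, 12, 28, 2, 18, 10, 26, 6, 22, 14, 30,
            1, 17, 9, 25, 5, 21, 13, 29, 3, 19, 11, 27, 7, 23, 15, 31]


def subblock_order(k):
    """Return the input indices 0..k+3 in sub-block interleaved order."""
    length = k + 4
    rows = math.ceil(length / 32)
    n_dummy = rows * 32 - length
    # Row-by-row write, with dummy entries (None) prepended.
    cells = [None] * n_dummy + list(range(length))
    order = []
    for col in COL_PERM:                # visit columns in permuted order
        for row in range(rows):         # read each column top to bottom
            entry = cells[row * 32 + col]
            if entry is not None:       # dummy entries are skipped
                order.append(entry)
    return order


order = subblock_order(40)
half = len(order) // 2
```

For this example the first half of `order` holds only even indices and the second half only odd indices, which is the even-then-odd arrangement noted in the text.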
Thus, if the same sub-block interleaver is used for all three streams to form the CB, then, due to the even-even property of the QPP interleaver, the first K bits following the systematic bits in the CB are associated with K/2 even-indexed positions in the 1st constituent encoder and K/2 even-indexed positions in the 2nd constituent encoder, leading to unequal levels of protection for odd and even indices of the input block. This can be remedied by modifying the sub-block permutation of the second parity stream such that all odd indices are placed in the front, followed by all even indices. Thus, for index i, if sys(i) denotes the permutation of the systematic sub-block interleaver, then the permutations of the two parity sub-block interleavers are given by par0(i) = sys(i) and par1(i) = (sys(i)+δ) mod (K+4), where δ denotes an offset.

In summary, the CBRM algorithms to be investigated in this paper can be characterized by two parameters: σ, the number of systematic columns skipped for RV=0, and δ, the address offset of the Parity 2 sub-block interleaver. The baseline algorithm with systematic bit puncturing but no sub-block interleaver offset is referred to as the CBRM(σ=2,δ=0) algorithm. Exploiting the congruence property of the QPP interleavers, the enhanced CBRM(σ=2,δ=1) algorithm sets the sub-block interleaver offset to δ=1.
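The effect of the address offset on the interlaced parity part of the CB can be illustrated with a small sketch. All names here (`subblock_order`, `parity2_order`, `first_parity_entries`) are hypothetical, streams are represented by index labels rather than bit values, and K=40 is only an example size.

```python
import math

COL_PERM = [0, 16, 8, 24, 4, 20, 12, 28, 2, 18, 10, 26, 6, 22, 14, 30,
            1, 17, 9, 25, 5, 21, 13, 29, 3, 19, 11, 27, 7, 23, 15, 31]


def subblock_order(k):
    """Indices 0..k+3 in sub-block interleaved order (dummy entries skipped)."""
    rows = math.ceil((k + 4) / 32)
    n_dummy = rows * 32 - (k + 4)
    order = []
    for col in COL_PERM:
        for row in range(rows):
            index = row * 32 + col - n_dummy   # undo the prepended dummies
            if index >= 0:
                order.append(index)
    return order


def parity2_order(k, delta):
    """Parity 2 addresses: par1(i) = (sys(i) + delta) mod (k + 4)."""
    return [(p + delta) % (k + 4) for p in subblock_order(k)]


def first_parity_entries(k, delta, count):
    """First `count` (stream, index) labels of the interlaced parity section."""
    interlaced = []
    for a, b in zip(subblock_order(k), parity2_order(k, delta)):
        interlaced += [("P1", a), ("P2", b)]
    return interlaced[:count]


baseline = first_parity_entries(40, 0, 40)   # offset 0
enhanced = first_parity_entries(40, 1, 40)   # offset 1
```

With offset 0 the first K parity entries carry only even indices in both streams; with offset 1 the Parity 2 entries switch to odd indices, so both index parities are represented near the front of the parity section.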

III. AVOIDING CATASTROPHIC PUNCTURING IN CBRM

To reap the full benefits of systematic bit puncturing, it is important to avoid catastrophic puncturing. In particular, the overall encoding process (including turbo encoder and RM) could become catastrophic if all (or almost all) of the parity weight corresponding to an input bit sequence of potentially infinite weight is punctured away [8]. With low Hamming weights, small amounts of channel noise can induce the decoder to make completely wrong decisions, resulting in very high block error rates (BLER). The catastrophic puncturing patterns for the 8-state constituent codes studied in [9] are used to analyze the CBRM algorithms.

A. Catastrophic Puncturing in CBRM(σ=2,δ=0)

The operations of the CBRM algorithms have a period of 32 due to the length-32 row-column sub-block permutation. Thus, skipping the first σ=2 columns in the CB induces a periodic systematic bit puncturing in the 1st constituent code, represented by the following periodic puncturing mask:

Systematic Puncturing Mask = [01111111111111110111111111111111], (2)

where 1 and 0 indicate retaining and discarding of systematic bits at the corresponding positions, respectively. The positions in the first period are further distinguished by bold letters. The corresponding parity bit puncturing patterns to avoid are classified in the following critical bit mask [9]:

Period-16 Critical Parity Bit Mask = [xCCxxCxCCCxxCxCCxCCxxCxCCCxxCxCC], (3)

where C and x indicate critical and uncritical parity bit positions, respectively. To avoid catastrophic puncturing, at least one critical parity bit in each period of 16 should be retained by the rate matcher.

Interestingly, the QPP permutation also has some periodic properties. For a QPP interleaver with a block size divisible by 16, it can be shown (using the maximum contention-free property) that, for any integer k,

QPP(16k) mod 16 = 0. (4)

In LTE, all QPP block sizes with K ≥ 512 and half of those with K < 512 are divisible by 16. As a consequence, due to the interaction of the periodicities of the CBRM and the QPP permutation, σ=2 induces the same period-16 systematic bit puncturing pattern in the second constituent code as well. To examine the effective parity bit puncturing pattern of the CBRM(σ=2,δ=0), the critical bit mask of (3) has to be permuted by the inter-column permutation pattern of (1):

Critical Bit Mask for Parity 1 and Parity 2 in the CB = [xxCCxxCCCCxxxxCCCCCCCCxxxxxxCCCC]. (5)

This allows us to make the following analysis assuming RV=0.

For coding rates r ≥ 32/34 (~0.94), the transmitted bits consist of 30 columns of the Systematic stream and no more than 2 columns for each of the parity streams. From (5), it is clear that the encoding is catastrophic since none of the critical parity bits are retained.

For coding rates 32/38 ≤ r < 32/34, the transmitted bits consist of 30 columns of the Systematic stream and no more than 4 columns for each of the parity streams. Since only a small number of the parity bits are retained from the critical positions in both constituent codes, the performance may suffer even though total catastrophic puncturing is avoided. As more and more critical parity bits are retained in the two constituent codes (i.e., as the code rate decreases), iterative decoding enables the two constituent codes to cooperate and recover some of the critical punctured positions. However, this recovery depends on the specific QPP permutation. Although catastrophic puncturing is avoided, performance instability can cause BLER performance flooring for certain block sizes.

For coding rates r < 32/38 (~0.84), the transmitted bits consist of 30 columns of the Systematic stream and more than 4 columns for each of the parity streams. Catastrophic puncturing in both constituent codes is avoided, as at least one critical parity bit position is retained periodically.

Therefore, the CBRM(σ=2,δ=0) algorithm is expected to perform poorly for code rates r ≥ 32/34 ≈ 0.94. The performance should gradually improve as the coding rate decreases toward 32/38 ≈ 0.84. For code rates r < 0.84, its performance should become competitive with other well-designed RM algorithms.

B. Partial Catastrophic Puncturing Avoidance in CBRM(σ=2,δ=1)

The modification of the sub-block interleaver for the Parity 2 stream introduced in the last section can be examined from the angle of alleviating catastrophic puncturing problems at high coding rates. With an offset of δ=1 applied to the Parity 2 interleaving address, the critical bit mask for the Parity 2 stream becomes

Critical Bit Mask for Parity 2 CB = [CCCCCCxxxxxxCCCCCCxxxxCCxxCCCCxx]. (6)
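The masks in (2), (5) and (6) can be reproduced mechanically from the column permutation (1) and the critical mask (3). The sketch below is one way to do so under stated assumptions: positions are treated as column indices of the 32-column matrix (any shift caused by prepended dummy bits is ignored, as the masks in the text appear to do), and the function and parameter names (`sigma` for the number of skipped systematic columns, `delta` for the Parity 2 address offset) are hypothetical.

```python
COL_PERM = [0, 16, 8, 24, 4, 20, 12, 28, 2, 18, 10, 26, 6, 22, 14, 30,
            1, 17, 9, 25, 5, 21, 13, 29, 3, 19, 11, 27, 7, 23, 15, 31]

CRITICAL_16 = "xCCxxCxCCCxxCxCC" * 2   # mask (3): two periods of 16


def systematic_mask(sigma):
    """Per column index 0..31: '1' = kept, '0' = punctured, when the first
    sigma permuted columns of the Systematic stream are skipped."""
    skipped = set(COL_PERM[:sigma])
    return "".join("0" if c in skipped else "1" for c in range(32))


def cb_critical_mask(mask, delta=0):
    """Critical-bit mask in circular-buffer column order for a parity stream
    whose sub-block interleaver address is offset by delta."""
    return "".join(mask[(c + delta) % 32] for c in COL_PERM)


def critical_columns_kept(cb_mask, n_cols):
    """Number of critical columns among the first n_cols parity columns."""
    return cb_mask[:n_cols].count("C")


mask2 = systematic_mask(2)
mask5 = cb_critical_mask(CRITICAL_16)            # Parity 1, or Parity 2 with offset 0
mask6 = cb_critical_mask(CRITICAL_16, delta=1)   # Parity 2 with offset 1
```

Counting critical columns among the first retained parity columns ties the masks to the rate thresholds: with two columns per parity stream (rate 32/34) mask (5) retains no critical column while mask (6) retains two, and with four columns (rate 32/38) mask (5) retains two.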

The critical parity bit mask for the Parity 1 stream is the same as (5). We can make the following observations (for RV=0).

For coding rates r < 32/38, the transmitted bits consist of 30 columns of the Systematic stream and more than 4 columns for each of the parity streams. Catastrophic puncturing is avoided in both constituent codes.

For coding rates 32/38 ≤ r < 32/34, the transmitted bits for RV=0 consist of 30 columns of the Systematic stream and no more than 4 columns for each of the parity streams. Therefore, catastrophic puncturing is avoided in the 2nd constituent code, but such avoidance is not guaranteed in the 1st constituent code. However, as more and more critical parity bits are retained, statistical avoidance through iterative decoding can aid the recovery process, similar to that described for the CBRM(σ=2,δ=0) algorithm.

For coding rates r ≥ 32/34, the transmitted bits consist of 30 columns of the Systematic stream and no more than 2 columns of each of the parity streams. Thus, while certain critical parity bits are retained in the 2nd constituent code, the puncturing in the 1st constituent code remains catastrophic. The weakness in the 1st constituent code could be remedied by the extrinsic information supplied by the 2nd constituent decoder. However, such statistical avoidance of catastrophic puncturing depends on the QPP permutation, and some performance degradation may occur for certain block sizes.

From this analysis, for the CBRM(σ=2,δ=1) algorithm, reduced performance is expected for code rates r ≥ 0.94, though this degradation is not a major issue since the highest code rate in an initial transmission in LTE is expected to be lower than 0.92. The performance improves as the coding rate decreases, though some degradation is expected for certain block sizes when statistical catastrophic puncturing avoidance is not effective. For code rates r < 0.84, the performance should be competitive with other well-designed RM algorithms.

C. Catastrophic Puncturing Avoidance in CBRM(σ=4,δ=4)

Given the period-16 systematic bit puncturing induced by σ=2 and the column permutation given in (1), it is not possible to find a single offset δ that avoids catastrophic puncturing in both constituent codes for all LTE block sizes. The search for an alternative period for systematic bit puncturing is constrained by the properties of the LTE QPP interleavers. More specifically, if the period is not a divisor of the interleaver length K, then a periodic puncturing in the 1st constituent code becomes aperiodic in the 2nd constituent code, making a systematic optimization intractable. Therefore, period 8 is the remaining choice for the systematic bit puncturing. Toward this end, a new algorithm is devised by setting σ=4 (i.e., skipping the first four columns of the Systematic stream). This induces the following periodic puncturing mask for the Systematic stream:

Systematic Puncturing Mask = [01111111011111110111111101111111]. (7)
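The constraint stated above, that the puncturing period must divide K for the pattern to stay periodic after interleaving, can be checked numerically. The sketch below assumes the usual quadratic form of the QPP interleaver, QPP(i) = (f1·i + f2·i²) mod K, as used in [1] and [2]; the parameter pairs for K=40 and K=6144 are believed to match the table in [1] but should be treated as assumptions and verified there. The function names are hypothetical.

```python
def qpp(i, k, f1, f2):
    """Quadratic permutation polynomial: (f1*i + f2*i^2) mod k."""
    return (f1 * i + f2 * i * i) % k


def preserves_period(k, f1, f2, period):
    """True if QPP(i) mod period depends only on i mod period, so that a
    pattern with this period stays periodic after interleaving."""
    return all(
        qpp(i, k, f1, f2) % period == qpp(i % period, k, f1, f2) % period
        for i in range(k)
    )


# Assumed parameter pairs (K, f1, f2); check against the table in [1].
small = (40, 3, 10)        # K divisible by 8 but not by 16
large = (6144, 263, 480)   # K divisible by 16

period8_small = preserves_period(*small, 8)
period16_small = preserves_period(*small, 16)
period16_large = preserves_period(*large, 16)
```

Under these assumptions, period 8 is preserved for K=40 while period 16 is not, and period 16 is preserved for K=6144. This is consistent with the argument that only period 8 works for every block size that is a multiple of 8.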

Since all QPP interleaver lengths in LTE [1] are divisible by 8, we have

QPP(8k) mod 8 = 0 (8)

for any integer k. Thus, for all the LTE QPP permutations, this setting introduces identical period-8 systematic bit puncturing for both constituent codes. The corresponding critical parity bit mask is given by the following [9]:

Critical Parity Bit Mask = [CxCxCCCxCxCxCCCxCxCxCCCxCxCxCCCx]. (9)

For the Parity 1 stream, the critical bit mask after column permutation is given by

Critical Bit Mask for Parity 1 CB = [CCCCCCCCCCCCCCCCxxxxCCCCxxxxxxxx]. (10)

With offset δ=4, the critical bit mask for the Parity 2 stream becomes

Critical Bit Mask for Parity 2 CB = [CCCCCCCCCCCCCCCCCCCCxxxxxxxxxxxx]. (11)

We can make the following analysis.

For coding rates r < 32/36, the transmitted bits for RV=0 consist of 28 columns of the Systematic stream and more than 4 columns for each of the parity streams. Catastrophic puncturing is avoided in both constituent codes.

For coding rates r ≥ 32/36, the transmitted bits for RV=0 consist of 28 columns of the Systematic stream and no more than 4 columns for each of the parity streams. Catastrophic puncturing avoidance cannot be guaranteed. However, as with the previous two algorithms, iterative decoding could provide statistical avoidance for certain block sizes.

From this analysis, the performance of the CBRM(σ=4,δ=4) algorithm can be expected to be competitive for rates r < 32/36 ≈ 0.89. The performance degrades as the coding rate increases, since some critical parity bits are left out and the decoder must rely on statistical avoidance. The degradation could be severe for certain block sizes when statistical catastrophic puncturing avoidance is not effective.

IV. PERFORMANCE ANALYSIS

Extensive simulation results based on the following setup were generated to verify the analysis presented in the previous sections. The performance of the LTE turbo code with the 188 block sizes corresponding to all LTE QPP interleavers [1] is tested at six code rates, r = 0.4, 0.5, ..., 0.9, with RV=0. The simulations assume an additive white Gaussian noise channel with QPSK modulation and 8 iterations of an enhanced max-log-MAP turbo decoder with an extrinsic scaling factor of 0.75 [10]. It is noted that, during the performance analysis, the performance of RM algorithms was also found to depend on whether the simple max-log-MAP algorithm or the enhanced max-log-MAP algorithm is used.

Figure 3(a) shows the SNR required by the CBRM(σ=2,δ=0) algorithm at code rate r=0.9; results for three BLER targets (10%, 1%, and 0.2%) are plotted as blue curves. It can be observed that the required SNR varies widely, indicating inconsistent performance (or SNR spikes). Furthermore, the performance tends to deteriorate with increasing block size, thereby negating the turbo code interleaver gain that improves performance for longer blocks.

The improvement with the CBRM(σ=2,δ=1) algorithm is shown in Figure 3(b). Most of the SNR spikes are removed and the performance trends are consistent with typical turbo code behavior. Thus, the improved rate-matcher setting δ=1 results in an improvement of up to 4 dB compared to the CBRM with δ=0. However, at a low BLER such as 2e-3, the performance improvement is not substantial for a few block sizes (including K = 208, 1792 and 5824). Performance for these sizes can be improved by simply choosing a different value for δ (not preferred for implementation).

Figure 3(c) shows the performance of the CBRM(σ=4,δ=4) algorithm. There is consistent performance for all block sizes larger than 1000 bits, though some SNR spikes remain for small block sizes. It is further observed that the CBRM(σ=4,δ=4) algorithm is slightly worse than the CBRM(σ=2,δ=1) algorithm for small blocks, perhaps due to the larger amount of systematic puncturing associated with the CBRM(σ=4,δ=4) algorithm.

Figure 3 Required Eb/N0 for code rate r=0.9, with panels (a) CBRM(σ=2,δ=0), (b) CBRM(σ=2,δ=1), and (c) CBRM(σ=4,δ=4). Blue curves show the actual SNR values for the indicated BLER targets and block sizes. Red curves represent estimated trends when SNR spikes are removed.

Figure 4 Required Eb/N0 for code rate r=0.8, with panels (a) CBRM(σ=2,δ=1) and (b) CBRM(σ=4,δ=4). Blue curves show actual SNR values for the indicated BLER targets and block sizes. Red curves represent estimated trends.

Figure 5 Required Eb/N0 for code rate r=0.7.

Figure 4 shows the performance of the CBRM(σ=2,δ=1) and CBRM(σ=4,δ=4) algorithms at code rate r=0.8. The figure shows the benefits of avoiding catastrophic puncturing in both constituent codes. At code rates of r=0.7 and lower, the three RM algorithms perform identically, as shown in Figure 5. In general, the performance spikes at very low BLERs and high coding rates are not problematic because high coding rates usually occur in initial transmissions, where relatively high BLERs are targeted. Subsequent retransmissions based on IR lead to a lower coding rate and a relatively low target BLER. Thus, both CBRM(σ=2,δ=1) and CBRM(σ=4,δ=4) are suitable candidates for LTE RM due to their consistent performance across the wide range of code rates, block sizes, and BLERs.

V. CONCLUSION

Based on the algebraic properties of the QPP interleavers and the 8-state recursive systematic convolutional code, three rate-matching algorithms based on circular buffers were studied for LTE. The baseline CBRM(σ=2,δ=0) algorithm was shown to provide deteriorating performance at high coding rates. The improved CBRM(σ=2,δ=1) algorithm eliminates most of these performance instabilities and achieves good performance over a wide range of code rates. A third algorithm, CBRM(σ=4,δ=4), was also shown to provide more consistent performance for certain scenarios. In conclusion, the CBRM(σ=2,δ=1) algorithm offers balanced performance over the block size and code rate ranges of interest. This algorithm was hence adopted for the LTE system.

REFERENCES
[1] 3GPP Technical Specification 36.212, "Multiplexing and Channel Coding (Release 8)," 2008.
[2] J. Sun and O. Y. Takeshita, "Interleavers for turbo codes using permutation polynomials over integer rings," IEEE Trans. Inform. Theory, vol. 51, no. 1, pp. 101-119, Jan. 2005.
[3] O. Y. Takeshita, "On maximum contention-free interleavers and permutation polynomials over integer rings," IEEE Trans. Inform. Theory, vol. 52, no. 3, pp. 1249-1253, Mar. 2006.
[4] L. Korowajczuk, Designing CDMA2000 Systems, John Wiley and Sons, 2004.
[5] I. Land and P. Hoeher, "Partially Systematic Rate 1/2 Turbo Codes," Proc. 2nd Int'l Symp. on Turbo Codes, pp. 287-290, Sep. 2000.
[6] D. N. Rowitch and L. B. Milstein, "On the performance of hybrid FEC/ARQ systems using rate compatible punctured turbo (RCPT) codes," IEEE Trans. on Comm., vol. 48, no. 6, pp. 948-959, June 2000.
[7] S. ten Brink, "Code Doping for Triggering Iterative Decoding Convergence," Proc. ISIT 2001, Washington DC.
[8] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications, Prentice-Hall, Inc., 1983.
[9] S. Crozier, P. Guinand, and A. Hunt, "On Designing Turbo-Codes with Data Puncturing," Proc. of 2005 Canadian Workshop on Information Theory (CWIT 2005), June 2005.
[10] R. Pyndiah, A. Glavieux, A. Picart and S. Jacq, "Near-optimal decoding of product codes," Proc. IEEE GlobeCom '94, pp. 339-343, 1994.
