Low-Complexity Software Stack Decoding of Polar
Codes
Harsh Aurora, Carlo Condo, Warren J. Gross
Department of Electrical and Computer Engineering, McGill University, Montréal, Québec, Canada
Email: harsh.aurora@mail.mcgill.ca, carlo.condo@mcgill.ca, warren.gross@mcgill.ca
Abstract—Polar codes are a recent class of linear error-correcting codes that asymptotically achieve the channel capacity at infinite code length. The Successive Cancellation List (SCL) algorithm yields very good error-correction performance, at the cost of high implementation complexity. The Stack (SCS) decoding algorithm provides similar error-correction performance at a lower complexity. In this work, we propose an efficient software implementation of the SCS decoding algorithm, along with techniques to further reduce its computational complexity. In particular, we reduce the SCS memory requirements through efficient path switching, replace the stack sorting with a linear search, and explore the use of a partial CRC along with an early termination criterion. Using the proposed methods, we reduce the number of bit estimations by up to 97% with respect to SCL, while maintaining similar error-correction performance.
I. INTRODUCTION
Polar codes [1] are the first error-correcting codes that can
provably achieve channel capacity, and they have been selected
as a coding scheme for the 5th generation wireless systems
standards (5G) [2]. The first proposed decoding algorithm is
the successive-cancellation (SC) algorithm [1]. While its error-
correction performance is able to reach channel capacity at
infinite code length, it is mediocre at practical code lengths.
Thus, many improvements to SC have been proposed in the
past years: list SC (SCL) [3] and its evolutions [4]–[6] have
gathered the interest of academia and industry alike thanks
to their substantial error-correction performance gains. They
rely on multiple parallel SC decoders working on different
possible candidate codewords, and on dedicated metrics to
identify the most likely one. SCL decoders thus suffer from
high computational complexity.
Similar to the concept used in SCL, SCS has been proposed
in [7] and improved upon in [8], [9]. It relies on a set
of codeword candidates, of which only the most likely is
extended. Unlike SCL, the amount of memory required by
SCS is variable. This cannot easily lead to actual memory
reduction in hardware decoders, where memory usually is
sized at design time considering the worst case. The flexible
nature of SCS is instead well suited for software decoders,
whose inherent adaptability can be exploited in base stations.
Current polar code software decoders suffer from longer latency and lower throughput with respect to hardware decoders
[10], [11]. Fast software decoders such as [12] require parallel
implementations on powerful, power-hungry platforms.
In this work, we present an efficient software implementation of the SCS algorithm in which the decoder tree has the same memory requirement as that of SC, improving over [13].
Our software implementation replaces the stack sorting with
a linear search. We then propose an early CRC check on the message bits, which provides a reduction in computational complexity and latency. Lastly, we describe an early termination
criterion based on this CRC check, which enables us to further
reduce the computational complexity of the SCS decoder while
maintaining similar error-correction performance as SCL.
II. PRELIMINARIES
A polar code PC(N, K) of code length N and rate R = K/N is a linear block code that identifies K reliable bit-channels, used to transmit information, and N − K unreliable ones, frozen at a known value. Polar codes are encoded by multiplying the information/frozen bit vector by the generator matrix G^{⊗n}, i.e. the n-th Kronecker product of the polarization matrix G = [1 0; 1 1].
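As a concrete illustration of this encoding step, the sketch below builds G^{⊗n} with NumPy and multiplies the bit vector by it modulo 2; the function names are ours, not the paper's.

```python
import numpy as np

def polar_generator(n):
    """n-th Kronecker power of the 2x2 polarization matrix G = [[1, 0], [1, 1]]."""
    G = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    Gn = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        Gn = np.kron(Gn, G)
    return Gn

def polar_encode(u):
    """Encode a length-N bit vector u (frozen positions already set to their
    known values) as x = u * G^(kron n) mod 2, with N = 2**n."""
    n = int(np.log2(len(u)))
    return (np.asarray(u, dtype=np.uint8) @ polar_generator(n)) % 2
```

For example, with N = 4, `polar_encode([0, 0, 0, 1])` picks the last row of G^{⊗2} and yields the all-ones codeword.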
The SC decoding algorithm can be viewed as a recursive binary tree search. A node receives from its parent a vector of log-likelihood ratios (LLRs) α: at tree stage λ, nodes compute the left α^l = {α^l_0, α^l_1, . . . , α^l_{2^{λ−1}−1}} and right α^r = {α^r_0, α^r_1, . . . , α^r_{2^{λ−1}−1}} LLR vectors. These are transmitted to child nodes:

    α^l_i = sgn(α_i) sgn(α_{i+2^{λ−1}}) min(|α_i|, |α_{i+2^{λ−1}}|),   (1)
    α^r_i = α_{i+2^{λ−1}} + (1 − 2β^l_i) α_i,   (2)

with LLRs at the root node initialized as the LLRs received from the channel. The right-hand terms in Eq. (1) and (2) are also known as the f and g functions, respectively. The partial sums β received from the left and right child nodes are calculated as

    β_i = β^l_i ⊕ β^r_i, if i < 2^{λ−1};  β^r_{i−2^{λ−1}}, otherwise,   (3)

where ⊕ is the XOR operation and 0 ≤ i < 2^λ. At leaf nodes, the β value and the estimated bit vector û_0^{N−1} are computed as

    β_i = 0, when α_i ≥ 0 or i is frozen;  1, otherwise.   (4)
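In software, these update rules reduce to a few scalar operations per element. A minimal sketch with our own helper names, operating on single LLRs:

```python
import math

def f(a, b):
    """Eq. (1): min-sum check-node update, sgn(a)*sgn(b)*min(|a|, |b|)."""
    return math.copysign(min(abs(a), abs(b)), a * b)

def g(a, b, beta_l):
    """Eq. (2): variable-node update, b + (1 - 2*beta_l)*a, where beta_l is
    the partial sum fed back from the left child."""
    return b + (1 - 2 * beta_l) * a

def hard_decision(alpha, frozen):
    """Eq. (4): leaf-node bit estimate (frozen bits are forced to 0)."""
    return 0 if (frozen or alpha >= 0) else 1
```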
The SCL decoding algorithm [3] improves the error-correction performance of SC by relying on L parallel SC decoding paths. Every time an information bit is estimated, both possible values 0 and 1 are investigated and 2L paths are created. Each path is associated with a path metric PM, and the L paths with the highest PM are discarded. In the LLR-based formulation of SCL [4], the PM can be computed as

    PM^l_{−1} = 0,
    PM^l_i = PM^l_{i−1} + |α^l_i|, if û^l_i ≠ (1 − sgn(α^l_i))/2;  PM^l_{i−1}, otherwise,   (5)

where l is the path index and û^l_j is the estimate of bit j at path l. The main limitation of the SCL decoder is its high degree of complexity: it has a space complexity of O(LN) and a time complexity of O(LN log2 N).
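Eq. (5) translates directly to a scalar update: a path is penalized by |α| whenever the chosen bit disagrees with the hard decision implied by the LLR sign. A sketch, with a hypothetical helper name:

```python
def update_pm(pm, alpha, u_hat):
    """Path-metric update of Eq. (5). The hard decision implied by the LLR is
    (1 - sgn(alpha)) / 2, i.e. 0 for alpha >= 0 and 1 otherwise; choosing the
    other value costs |alpha|. Lower metrics mean more reliable paths."""
    hard = 0 if alpha >= 0 else 1
    return pm + abs(alpha) if u_hat != hard else pm
```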
The SCS algorithm addresses the high complexity of the SCL decoder by employing a priority queue (PQ) of size D, in which the candidate paths are stored. Every time a bit is estimated, the decoder only extends the most probable path from the queue. An additional list-like parameter L is used to limit the number of paths in the queue: if a path of length φ is extracted L times from the queue, all paths with length less than φ are deleted from the queue.
III. MEMORY EFFICIENT SOFTWARE STACK DECODER
In this section, we describe our software implementation of the SCS decoder. The main improvements over existing work in [7]–[9], [13] include reducing the decoding tree spatial complexity to O(N) and replacing the stack sorting step with a linear search over the stack. We calculate our bit probabilities in the LLR domain, and make use of the path metric from Eq. (5). The probability calculation and bit propagation are based on the approach in [3]. We begin by outlining the data structures used in our SCS implementation.
• P: a 2-D float array with which the LLR of a bit index is recursively calculated. It consists of n + 1 rows, where row λ is a probability array of size 2^{n−λ}, λ ∈ [0, n].
• C: a 3-D bit array in which the estimated bits are stored and recursively propagated for g function calculations.
• PM: array of size D that stores path metrics.
• PL: array of size D that stores path lengths.
• PL_hits: array of size N in which the value at each index φ indicates the number of times a path of length φ was extracted from the PQ.
• paths: a 2-D bit array that stores the paths in the PQ.
• inactive_path_indices: an integer stack of depth D that contains inactive path indices.
• active_path: a boolean array of size D that indicates whether a path is active or not.
In addition to these, the SCS decoder makes use of the following variables:
• T: total number of active paths in the stack.
• min_index: index of the path with the minimum path metric.
• max_index: index of the path with the maximum path metric.
• path_switch: boolean that indicates a path switch.
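A possible in-memory layout for these structures is sketched below; the sizes follow the descriptions above (row λ of P holding 2^{n−λ} LLRs, consistent with Algorithm 5), while the paper's actual layout may differ:

```python
import numpy as np

def init_scs_memory(n, D):
    """Allocate the SCS data structures for a code of length N = 2**n and a
    priority queue of depth D (a sketch, not the paper's exact layout)."""
    N = 2 ** n
    return {
        "P": [np.zeros(2 ** (n - lam)) for lam in range(n + 1)],  # LLR rows
        "C": np.zeros((2, n + 1, N), dtype=np.uint8),  # partial-sum bits
        "PM": np.zeros(D),                             # path metrics
        "PL": np.zeros(D, dtype=np.int64),             # path lengths
        "PL_hits": np.zeros(N, dtype=np.int64),        # extractions per length
        "paths": np.zeros((D, N), dtype=np.uint8),     # candidate paths
        "inactive_path_indices": list(range(D)),       # used as a stack
        "active_path": np.zeros(D, dtype=bool),
        "T": 0,                                        # active path count
    }
```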
The main loop of the SCS decoder is described in Algorithm 1, while the most important functions are detailed in Algorithms 2–6. First, the data structures are initialized. The memory for P, C, PL, PM and paths does not need to be initialized, as it
Algorithm 1: SCS Decoder, Main Loop
Input: received vector y_0^{N−1}
Output: estimated message bits m̂_0^{K−1}
1: initialize_data_structures();
2: min_index = assign_initial_path();
3: for φ = 0, 1, . . . , N − 1 do
4:     P[0][φ] = L_0(y_φ);
5: while (1) do
6:     recursively_calc_P(n, PL[min_index]);
7:     pm0 = calc_new_pm(PM[min_index], P[n][0], 0);
8:     pm1 = calc_new_pm(PM[min_index], P[n][0], 1);
9:     if (PL[min_index] ∈ A^c) then
10:        extend_path(min_index, 0, pm0);
11:    else
12:        if (T == D) then
13:            if (PM[max_index] > max(pm0, pm1)) then
14:                kill_path(max_index);
15:        if pm0 < pm1 then
16:            if (T < D) then
17:                max_index = clone_path(min_index);
18:                extend_path(max_index, 1, pm1);
19:            extend_path(min_index, 0, pm0);
20:        else
21:            if (T < D) then
22:                max_index = clone_path(min_index);
23:                extend_path(max_index, 0, pm0);
24:            extend_path(min_index, 1, pm1);
25:    update_min_max_index();
26:    update_length_info();
27:    if (end_check() == 1) then
28:        break;
29:    if path_switch then
30:        load_path();
31:    φ = PL[min_index] − 1;
32:    C[φ mod 2][n][0] = paths[min_index][φ];
33:    if ((φ mod 2) == 1) then
34:        recursively_update_C(n, PL[min_index] − 1);
35: for φ = 0, 1, . . . , K − 1 do
36:    m̂_φ = paths[min_index][A_φ];
is set up as new paths are created. The initial path is assigned to min_index and the channel LLRs are populated at the top of the probability tree P.
In the while loop (lines 5 to 34), the LLR for the current bit of the most reliable path is calculated. Lines 9 and 10 extend this path in the event of a frozen bit (i.e., the bit index belongs to the frozen set A^c). In the case of a message bit, lines 12-14 first check if the PQ is full and if both the new guesses are better than the worst path in the PQ. If this is true, then the
Algorithm 2: initialize_data_structures()
1: clear(inactive_path_indices);
2: for p = 0, 1, . . . , D − 1 do
3:     push(inactive_path_indices, p);
4:     active_path[p] = false;
5: for φ = 0, 1, . . . , N − 1 do
6:     PL_hits[φ] = 0;
Algorithm 3: assign_initial_path()
Output: index p of the initial path
1: p = pop(inactive_path_indices);
2: active_path[p] = true;
3: PM[p] = 0.0;
4: PL[p] = 0;
5: T = 1;
worst path is killed. Lines 15-24 extend the best path along the more reliable guess and place the other guess in the PQ if there is space.
The function update_min_max_index is then called to update min_index, max_index and path_switch. The indices of the paths with the maximum and minimum path metrics are identified in a single loop of at most O(D) complexity, which eliminates the need to sort all the paths in the PQ, since these are the only paths that will have to be extended or deleted in the current iteration of the decoder. Furthermore, by keeping track of path switching it is possible to reuse the values in the P and C memories just like an SC decoder, as long as SCS keeps extending the same path. In case of a path switch, the new path needs to be loaded into the P and C memories only once, and then they can be reused until the path switches again. This enables us to reduce the space complexity without increasing the computational complexity between switches.
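The single-pass search can be sketched as follows; the function returns both extremes so that no ordering of the PQ is ever maintained (the names are ours):

```python
def find_min_max(PM, active_path):
    """One O(D) sweep over the queue: locate the most reliable path (lowest
    path metric, to be extended) and the least reliable one (highest path
    metric, candidate for deletion). Returns (-1, -1) if no path is active."""
    min_idx = max_idx = -1
    for p, active in enumerate(active_path):
        if not active:
            continue
        if min_idx < 0 or PM[p] < PM[min_idx]:
            min_idx = p
        if max_idx < 0 or PM[p] > PM[max_idx]:
            max_idx = p
    return min_idx, max_idx
```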
Next, update_length_info is called, which checks if the current path length has been investigated L times, and kills all shorter paths if so. Then, the call to end_check causes the algorithm to break out of the while loop if the PQ is empty or if the length of the current path has reached N. Finally, a new path is loaded in case of a switch, and the last bit of the current path is updated in the C memory. Upon exiting the while loop, the index of the decoded path is in min_index: the decoder copies the bits of the unfrozen set A into the estimated message bit vector, and the algorithm terminates.
The probability and bit trees P and C have a space complexity of O(N), equal to that of the SC decoder. The PL and PM arrays have a space complexity of O(D), while the paths memory has a space complexity of O(ND). Since the frozen values are already known and only the message bits in a path need to be saved, the paths memory can be further compressed to a space complexity of O(KD), at the cost of the decoder only being able to support a maximum fixed rate.
Algorithm 4: clone_path()
Input: index p of the path to clone
Output: index p′ of the cloned path
1: p′ = pop(inactive_path_indices);
2: active_path[p′] = true;
3: PM[p′] = PM[p];
4: PL[p′] = PL[p];
5: T = T + 1;
6: for φ = 0, 1, . . . , PL[p] − 1 do
7:     paths[p′][φ] = paths[p][φ];
Algorithm 5: recursively_calc_P()
Input: layer λ and phase φ
1: if λ == 0 then
2:     return;
3: ψ = ⌊φ/2⌋;
4: if ((φ mod 2) == 0) or (path_switch == 1) then
5:     recursively_calc_P(λ − 1, ψ);
6: for β = 0, 1, . . . , 2^{n−λ} − 1 do
7:     if ((φ mod 2) == 0) then
8:         P[λ][β] = f(P[λ−1][2β], P[λ−1][2β+1]);
9:     else
10:        u = C[0][λ][β];
11:        P[λ][β] = g(P[λ−1][2β], P[λ−1][2β+1], u);
Algorithm 6: load_path()
1: for φ = 0, 1, . . . , PL[min_index] − 1 do
2:     C[φ mod 2][n][0] = paths[min_index][φ];
3:     if ((φ mod 2) == 1) then
4:         recursively_update_C(n, φ);
IV. FURTHER COMPLEXITY REDUCTION
We define an “iteration” as a decoder estimating a particular bit index in a candidate path. Thus, the SC and SCL decoding algorithms have a fixed number of iterations, N and NL respectively, while the SCS decoder has a variable number of iterations depending on Eb/N0. This number converges to N iterations as Eb/N0 increases.
Studies presented in [14] have shown that decoding failures are typically caused by a limited number of errors introduced by the channel (1-3 channel errors). These errors are more likely to occur at bit indices with low reliability, which are found early in the polar codeword and are thus decoded earlier.
We propose to protect the first γ information bits encountered along the SC decoding tree with a CRC of length Cγ. When the SCS decoder reaches a candidate path with γ message bits, it can perform a CRC check and kill the path in case the CRC fails. Paths that fail the CRC still result in an increment of PL_hits, and therefore the SCS decoder will have at most L paths that have passed this initial CRC.
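The partial CRC can be any short CRC over the first γ estimated message bits. Below is a generic bit-serial CRC-8 sketch using the 0xD5 polynomial from Sec. V; the exact bit ordering and register initialization are implementation choices, not taken from the paper:

```python
def crc8(bits, poly=0xD5):
    """Bit-serial CRC-8 (MSB-first polynomial long division). Returns the
    8-bit remainder; a message followed by its own CRC bits checks to zero."""
    reg = 0
    for b in bits:
        feedback = ((reg >> 7) ^ b) & 1
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly
    return reg

def passes_partial_crc(msg_bits, crc_bits):
    """Check performed once a candidate path has accumulated the first gamma
    message bits plus the C_gamma CRC bits (here C_gamma = 8)."""
    return crc8(list(msg_bits) + list(crc_bits)) == 0
```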
It is possible, especially at low Eb/N0, that incorrect paths
Fig. 1. FER vs. Eb/N0 for different decoding algorithms (SC, SCL with L = 32, SCS, SCS-ET), PC(512, 256).
pass this initial CRC, or that the correct path gets killed before or shortly after the CRC check, due to errors in the CRC bits. In such cases the SCS decoder performs many useless iterations only to end in a decoding failure. We propose to introduce an early termination criterion by defining a maximum number of iterations Mit the decoder is allowed to take before failure is declared. Mit is initialized to 2LN: in the event of an initial CRC failure, Mit is penalized by N iterations, corresponding to the path that has just been removed from consideration. An early termination criterion for SCS decoders has also been proposed in [15]. However, the parameters of the method described in [15] depend on channel conditions, and the early termination comes at a cost in FER; our approach (SCS-ET) is instead channel-independent and causes negligible error-correction performance degradation.
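The early termination bookkeeping amounts to a simple iteration budget; a hypothetical sketch with our own names:

```python
class IterationBudget:
    """Early termination criterion of Sec. IV (a sketch): start from
    Mit = 2*L*N and charge N iterations for every path removed by a
    partial-CRC failure. Decoding is declared failed once the iteration
    count reaches the budget."""
    def __init__(self, L, N):
        self.N = N
        self.budget = 2 * L * N   # initial Mit
        self.count = 0
    def tick(self):
        """Call once per decoder iteration; True means keep decoding."""
        self.count += 1
        return self.count < self.budget
    def on_crc_failure(self):
        """A killed path forfeits the N iterations it would have needed."""
        self.budget -= self.N
```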
V. SIMULATION RESULTS
Simulation results are presented for PC(512, 256), constructed for an AWGN channel with σ² = 0.5. The parameter L is set to 32 for SCL, SCS, and SCS-ET. The stack depth D is set to LN = 16,384 for the SCS and SCS-ET decoders. Finally, SCS-ET has its initial CRC parameters set to γ = 16, Cγ = 8, with CRC polynomial 0xD5.
Fig. 1 shows the frame error rate (FER) for the considered algorithms. It can be seen that SCS and SCS-ET provide error-correction performance similar to SCL. Fig. 2 shows that on average the SCS decoder takes fewer iterations than the SCL decoder, with a gain ranging between 48% and 97%, and at high Eb/N0 it converges to SC complexity. It can be observed that by using a CRC on γ information bits and the early termination criterion, the complexity of the SCS-ET decoder is further reduced, gaining 1% to 50% over SCS and 71% to 97% over SCL. Finally, Fig. 3 shows that the CRC check in the SCS-ET decoder reduces the number of iterations by 1% to 28% with respect to SCS in case of successful decoding, while the CRC combined with the early termination criterion yields a gain ranging between 31% and 53% in iterations over SCS in case of failed decoding. SCS-ET
Fig. 2. Average number of iterations vs. Eb/N0 for different decoding algorithms (SC, SCL with L = 32, SCS, SCS-ET), PC(512, 256).
Fig. 3. Average number of iterations for decoder success/failure (SCS and SCS-ET), with and without early termination, PC(512, 256).
thus requires 66%−97% and 75%−95% fewer iterations than
SCL in case of successful and failed decoding, respectively.
VI. CONCLUSION
In this work, we have presented an efficient software implementation of the SCS decoding algorithm for polar codes.
It replaces the stack sorting step with a linear search over
the stack, and guarantees the same spatial complexity as SC
to compute the path probabilities, with additional memory
required only for storing paths in the queue. We have also proposed a partial CRC check as an effective noise-independent method to reduce the SCS time complexity, along with an early
termination criterion. Simulation results show up to a 97%
iteration gain with respect to SCL, with negligible degradation
in error-correction performance.
REFERENCES
[1] E. Arikan, “Channel polarization: A method for constructing capacity-
achieving codes for symmetric binary-input memoryless channels,” IEEE
Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, July
2009.
[2] “Final report of 3GPP TSG RAN WG1 #87 v1.0.0,” http://www.3gpp.org/ftp/tsg_ran/WG1_RL1/TSGR1_87/Report/Final_Minutes_report_RAN1%2387_v100.zip, Reno, USA, November 2016.
[3] I. Tal and A. Vardy, “List decoding of polar codes,” IEEE Transactions
on Information Theory, vol. 61, no. 5, pp. 2213–2226, May 2015.
[4] A. Balatsoukas-Stimming, M. B. Parizi, and A. Burg, “LLR-based
successive cancellation list decoding of polar codes,” IEEE Transactions
on Signal Processing, vol. 63, no. 19, pp. 5165–5179, Oct 2015.
[5] S. A. Hashemi, C. Condo, and W. J. Gross, “Simplified successive-
cancellation list decoding of polar codes,” in 2016 IEEE International
Symposium on Information Theory (ISIT), July 2016, pp. 815–819.
[6] ——, “Fast simplified successive-cancellation list decoding of polar
codes,” in 2017 IEEE Wireless Communications and Networking Con-
ference Workshops (WCNCW), March 2017, pp. 1–6.
[7] K. Niu and K. Chen, “Stack decoding of polar codes,” Electronics
Letters, vol. 48, no. 12, pp. 695–697, June 2012.
[8] ——, “CRC-aided decoding of polar codes,” IEEE Communications
Letters, vol. 16, no. 10, pp. 1668–1671, October 2012.
[9] K. Chen, K. Niu, and J. Lin, “Improved successive cancellation decoding
of polar codes,” IEEE Transactions on Communications, vol. 61, no. 8,
pp. 3100–3107, August 2013.
[10] Y. Shen, C. Zhang, J. Yang, S. Zhang, and X. You, “Low-latency soft-
ware successive cancellation list polar decoder using stage-located copy,”
in 2016 IEEE International Conference on Digital Signal Processing
(DSP), Oct 2016, pp. 84–88.
[11] P. Giard, G. Sarkis, C. Leroux, C. Thibeault, and W. J. Gross, “Low-latency software polar decoders,” Journal of Signal Processing Systems, to appear. [Online]. Available: http://arxiv.org/abs/1504.00353
[12] B. L. Gal, C. Leroux, and C. Jego, “Multi-Gb/s software decoding of
polar codes,” IEEE Transactions on Signal Processing, vol. 63, no. 2,
pp. 349–359, Jan 2015.
[13] V. Miloslavskaya and P. Trifonov, “Sequential decoding of polar codes,”
IEEE Communications Letters, vol. 18, no. 7, pp. 1127–1130, July 2014.
[14] O. Afisiadis, A. Balatsoukas-Stimming, and A. Burg, “A low-complexity
improved successive cancellation decoder for polar codes,” in 2014 48th
Asilomar Conference on Signals, Systems and Computers, Nov 2014, pp.
2116–2120.
[15] P. Trifonov, V. Miloslavskaya, and R. Morozov, “Fast sequential
decoding of polar codes,” CoRR, vol. abs/1703.06592, 2017. [Online].
Available: http://arxiv.org/abs/1703.06592