Local Optimality Certificates for LP Decoding of Tanner Codes
Nissim Halabi, Guy Even
11th Haifa Workshop on Interdisciplinary Applications of Graph Theory, Combinatorics and Algorithms, May 2011

Error Correcting Codes: Worst Case vs. Average Case
An [N,K] linear code C is a K-dimensional subspace of the vector space {0,1}^N, with minimum distance d_min. (Figure: spheres of radius d_min/2 around codewords.)

Worst-case analysis assumes an adversarial channel: e.g., how many bit flips, in any pattern, can decoding recover from? (Up to d_min/2 flips.)

Average-case analysis assumes a probabilistic channel: e.g., given that every bit is flipped with probability p independently, what is the probability that decoding succeeds? Possibly, Pr(fail : avg. case) << Pr(fail : worst case).
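To make the average-case side concrete, here is a small sketch (the function name `flip_tail` and the [7,4] Hamming-code numbers are illustrative, not from the talk): a code with d_min = 3 corrects any single flip in the worst case, and the average-case failure probability is bounded by the probability that two or more bits flip.

```python
from math import comb

def flip_tail(N, p, t):
    # P(more than t of the N bits flip) when each flips independently w.p. p
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(t + 1, N + 1))

# A code with d_min = 3 (e.g. the [7,4] Hamming code) corrects any single
# flip; in the average case, decoding fails only if >= 2 bits flip.
pr_fail_bound = flip_tail(7, 0.01, 1)
```

For p = 0.01 this bound is about 0.2%, far below the worst-case guarantee of "at most one flip", illustrating why average-case analysis can be much more optimistic.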
Memoryless Binary-Input Output-Symmetric (MBIOS) Channel

Characterized by a conditional probability function P(y | c). Errors occur randomly and are independent from bit to bit (memoryless). Transmitted symbols are binary (binary-input). Errors affect '0's and '1's with equal probability (output-symmetric).

(Block diagram: channel encoding → codeword c ∈ C ⊆ {0,1}^N → noisy channel → noisy codeword y ∈ Λ^N → channel decoding → ĉ ∈ {0,1}^N.)

Example: Binary Symmetric Channel (BSC): y_i = c_i with probability 1 − p, and y_i = 1 − c_i with probability p.
Log-Likelihood Ratio (LLR) λ_i for a received observation y_i:

λ_i(y_i) = ln( P_{Y_i|X_i}(y_i | x_i = 0) / P_{Y_i|X_i}(y_i | x_i = 1) )

λ_i > 0 ⇒ y_i is more likely to be '0'; λ_i < 0 ⇒ y_i is more likely to be '1'. Replace the received word y by the LLR vector λ(y).

(Block diagram: codeword c ∈ C ⊆ {0,1}^N → noisy channel → noisy codeword y ∈ Λ^N → compute λ(y) → channel decoding → ĉ ∈ {0,1}^N.)
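For the BSC of the previous slide the LLR formula specializes to a single magnitude ln((1−p)/p) with a sign set by the received bit. A minimal sketch (the helper name `bsc_llr` is illustrative):

```python
import math

def bsc_llr(y, p):
    # LLR for a BSC with crossover probability p (p < 1/2):
    #   y_i = 0: lambda_i = ln((1-p)/p) > 0  ('0' more likely)
    #   y_i = 1: lambda_i = ln(p/(1-p)) < 0  ('1' more likely)
    mag = math.log((1 - p) / p)
    return [mag if yi == 0 else -mag for yi in y]

llr = bsc_llr([0, 1, 0], 0.1)
```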
Maximum-likelihood (ML) decoding for any binary-input memoryless channel:

x_ML(λ) = argmin_{x ∈ C} ⟨λ, x⟩

ML decoding formulated as a linear program:

x_ML(λ) = argmin_{x ∈ conv(C)} ⟨λ, x⟩

where C ⊆ {0,1}^N and conv(C) ⊆ [0,1]^N. Problem: conv(C) has no efficient representation.
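The first formulation can be sketched directly as a brute-force minimization over the codebook (the function name `ml_decode` and the tiny code are illustrative); its exponential cost in K is exactly what motivates the relaxation that follows.

```python
def ml_decode(codewords, llr):
    # Brute-force ML decoding: argmin over all codewords of <llr, x>.
    # Exponential in general -- the reason an LP relaxation is attractive.
    return min(codewords, key=lambda x: sum(l * xi for l, xi in zip(llr, x)))

# Tiny [3,2] single-parity-check code: all even-weight words of length 3.
code = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
# LLR favours '0' in bit 0 and '1' in bits 1 and 2.
x_ml = ml_decode(code, [2.0, -1.0, -1.0])
```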
Linear Programming (LP) decoding [Fel03, FWK05] — relaxation of the polytope conv(C) to a polytope P:
(1) All codewords x ∈ C are vertices of P.
(2) All new vertices are fractional (therefore, new vertices are not in C).
(3) P has an efficient representation.

Solve the LP:

x_LP(λ) = argmin_{x ∈ P} ⟨λ, x⟩

If x_LP(λ) is integral, then x_LP(λ) = x_ML(λ): the LP decoder finds the ML codeword. If x_LP(λ) is fractional, the LP decoder fails. (Figure: C ⊆ {0,1}^N, conv(C) ⊆ [0,1]^N ⊆ P, with fractional vertices of P outside conv(C).)
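A minimal runnable sketch of LP decoding, assuming SciPy is available (the function name `lp_decode_single_check` is illustrative): for a single parity check, P can be written with Feldman-style forbidden-set inequalities, one per odd-sized subset of the check's bits.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def lp_decode_single_check(llr):
    # Feldman-style forbidden-set inequalities for one parity check of
    # degree 3: for every odd-sized subset S of the check's bits,
    #   sum_{i in S} x_i - sum_{i not in S} x_i <= |S| - 1.
    n = 3
    A, b = [], []
    for r in (1, 3):  # odd subset sizes
        for S in itertools.combinations(range(n), r):
            A.append([1.0 if i in S else -1.0 for i in range(n)])
            b.append(len(S) - 1.0)
    # LP decoding: minimize <llr, x> over the relaxed polytope P.
    res = linprog(c=llr, A_ub=np.array(A), b_ub=np.array(b), bounds=[(0, 1)] * n)
    return res.x

x_lp = lp_decode_single_check([2.0, -1.0, -1.0])
integral = all(abs(v - round(v)) < 1e-6 for v in x_lp)
```

Here the LP optimum is integral, so the LP decoder returns the ML codeword; with other LLR vectors (on larger graphs with cycles) the optimum can be fractional, which is the failure event analyzed in the rest of the talk.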
Factor graph representation of Tanner codes: G = (I ∪ J, E), a bipartite graph with variable nodes (e.g., x_1, …, x_10) on one side and local-code nodes (e.g., C_1, …, C_5) on the other.

Every local-code node C_j is associated with a linear code of length deg_G(C_j). Tanner code C(G) and codewords x:

x ∈ C(G) ⇔ ∀j: the projection of x onto the neighborhood of C_j is in the local code C_j.

Extended local-code C̄_j ⊆ {0,1}^N: extend the local code to the bits outside it. Minimal local distance:

d* = min_j d_min(local code C_j).

Example: Expander codes [SS'96] — the Tanner graph is an expander, with a simple bit-flipping decoding algorithm.
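The membership condition above can be sketched in a few lines (the helper name `in_tanner_code` and the tiny 4-variable graph are hypothetical): each local-code node stores its neighborhood and its set of local codewords, and a word is in C(G) iff every projection lands in the corresponding local code.

```python
def in_tanner_code(x, local_codes):
    # x is in C(G) iff, for every local-code node C_j, the projection of x
    # onto the neighbourhood of C_j is a codeword of the local code C_j.
    return all(tuple(x[i] for i in nbrs) in code for nbrs, code in local_codes)

# Hypothetical tiny Tanner graph: 4 variable nodes, 2 local-code nodes,
# each local code a length-3 single-parity-check code.
parity3 = {(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)}
local_codes = [((0, 1, 2), parity3), ((1, 2, 3), parity3)]
ok = in_tanner_code((1, 1, 0, 1), local_codes)
bad = in_tanner_code((1, 0, 0, 1), local_codes)
```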
Maximum-likelihood (ML) decoding:

x_ML(λ) = argmin_{x ∈ conv(C)} ⟨λ, x⟩

Linear Programming (LP) decoding [following Fel03, FWK05]:

x_LP(λ) = argmin_{x ∈ P} ⟨λ, x⟩

where P = ∩_j conv(extended local-code C̄_j) is the generalized fundamental polytope, and conv(C) ⊆ P. (Figure: P as the intersection of conv(extended local-code C̄_1), conv(extended local-code C̄_2), ….)
Let λ ∈ ℝ^N denote an LLR vector received from the channel, and let x ∈ C(G) denote a codeword. Consider the following questions:
- Is x = ML(λ)? Is ML(λ) unique? (NP-hard)
- Is x = LP(λ)? Is LP(λ) unique?
Goal: an efficient test with one-sided error — given x and λ, answer "definitely yes" or "maybe no" — e.g., an efficient test via local computations: a "Local Optimality" criterion.
Let x ∈ C(G) ⊆ {0,1}^N and f ∈ [0,1]^N. Define the relative point x ⊕ f by (x ⊕ f)_i = |x_i − f_i| [Fel03]. Consider a finite set B ⊆ [0,1]^N.

Definition: A codeword x ∈ C is locally optimal for λ ∈ ℝ^N if for all vectors β ∈ B: ⟨λ, x ⊕ β⟩ > ⟨λ, x⟩.

Goal: find a set B such that:
(1) x ∈ LO(λ) ⇒ x = ML(λ) and ML(λ) unique;
(2) x ∈ LO(λ) ⇒ x = LP(λ) and LP(λ) unique;
(3) Pr{∀β ∈ B: ⟨λ, β⟩ > 0 | c = 0^N} = 1 − o(1).

Under the all-zeros assumption (c = 0^N), properties (1)-(3) then give

Pr{LP decoding fails} ≤ Pr{∃β ∈ B: ⟨λ, β⟩ ≤ 0 | c = 0^N} = o(1).
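For a finite, explicitly given B, the definition is directly checkable; a minimal sketch (the helper names `relative_point` and `locally_optimal`, and the tiny example set B, are illustrative):

```python
def relative_point(x, f):
    # Relative point x (+) f of [Fel03]: (x (+) f)_i = |x_i - f_i|.
    return [abs(xi - fi) for xi, fi in zip(x, f)]

def locally_optimal(x, llr, B):
    # x is locally optimal for llr iff <llr, x (+) beta> > <llr, x>
    # for every beta in the finite set B.
    base = sum(l * xi for l, xi in zip(llr, x))
    return all(
        sum(l * v for l, v in zip(llr, relative_point(x, beta))) > base
        for beta in B
    )

# Hypothetical example: all-zeros codeword, B a single fractional vector.
x = [0, 0, 0]
B = [[0.5, 0.5, 0.0]]
opt_good = locally_optimal(x, [1.0, 1.0, 1.0], B)
opt_bad = locally_optimal(x, [-1.0, 1.0, 1.0], B)
```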
Recall the goal: find a set B with properties (1)-(3). Suppose we already have properties (1) and (2). If every β ∈ B has large support, property (3) follows (e.g., via Chernoff-like bounds). If B = C, then x ∈ LO(λ) ⇒ x = ML(λ) and ML(λ) unique; however, the analysis of property (3) is then problematic, because each β has a GLOBAL structure.
Therefore, for analysis purposes, consider structures with a local nature: take B to be a set of TREES [following KV'06]. Strengthen the analysis by introducing layer weights [following ADS'09], yielding better bounds on Pr{∃β ∈ B: ⟨λ, β⟩ ≤ 0 | c = 0^N}. Finally, height(subtrees(G)) < ½·girth(G) = O(log N); instead, take path-prefix trees — whose height is not bounded by the girth!
Consider a graph G = (V, E) and a node r ∈ V:
V̂ — the set of all backtrackless paths in G emanating from node r with length at most h.
T_r^h(G) = (V̂, Ê) — the path-prefix tree of G rooted at node r with height h.
(Example figure: a graph G and its path-prefix tree T_r^4(G), rooted at r, of height 4; tree nodes are labelled by the backtrackless paths they represent.)
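The construction can be sketched as a breadth-first enumeration of backtrackless paths (the function name `path_prefix_tree` and the triangle example are illustrative); note that on a triangle the height-4 tree already contains paths revisiting nodes, showing that the tree's height is not limited by the girth.

```python
def path_prefix_tree(adj, r, h):
    # Nodes of T_r^h(G): all backtrackless paths in G starting at r with
    # length at most h; a tree node's parent is the path minus its last hop.
    nodes = [(r,)]
    frontier = [(r,)]
    for _ in range(h):
        nxt = []
        for path in frontier:
            for v in adj[path[-1]]:
                if len(path) >= 2 and v == path[-2]:
                    continue  # immediate backtracking is forbidden
                nxt.append(path + (v,))
        nodes.extend(nxt)
        frontier = nxt
    return nodes

# Triangle graph (girth 3): the height-4 tree contains paths that revisit
# nodes, so the tree can be taller than any simple path in G.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
tree = path_prefix_tree(adj, 0, 4)
```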
Consider a path-prefix tree T_r^h(G) of a Tanner graph G = (I ∪ J, E). A d-tree T = T[r, h, d] is a subgraph of T_r^h(G) such that:
- root(T) = r,
- ∀v ∈ T ∩ I: deg_T(v) = deg_G(v),
- ∀c ∈ T ∩ J: deg_T(c) = d.
A 2-tree is a skinny tree (a minimal deviation); similarly one obtains 3-trees, 4-trees, etc. (Figures: a 2-tree, a 3-tree, and a 4-tree, each rooted at v_0.) Note: a d-tree is not necessarily a valid configuration!
Consider layer weights ω: {1,…,h} → ℝ and a subtree T̂_r of a path-prefix tree T_r^{2h}(G). The layer weights induce a weight function on the subtree T̂_r: each layer l contributes with weight ω_l, normalized by the node degrees along the paths. π_G(T̂_r, ω) ∈ [0,1]^N denotes the projection of this weighted subtree to the Tanner graph G, obtained by accumulating, for each variable node of G, the weighted contributions of the tree nodes that project to it. (Example figure: a subtree of height 2 with layer weights ω_1, ω_2, its per-layer weights ω(T̂_r, 1) and ω(T̂_r, 2), and the projected vector π_G(T̂_r, ω).)
Setting: C(G) ⊆ {0,1}^N is a Tanner code with minimal local distance d*; 2 ≤ d ≤ d*; layer weights ω ∈ ℝ_+^h \ {0^h}.

B_d^(ω) ⊆ [0,1]^N \ {0^N} — the set of all vectors π_G(T, ω) corresponding to projections to G of ω-weighted d-trees T of height 2h rooted at variable nodes.

Definition: A codeword x is (h, ω, d)-locally optimal for λ ∈ ℝ^N if for all vectors β ∈ B_d^(ω): ⟨λ, x ⊕ β⟩ > ⟨λ, x⟩.
Theorem: If x ∈ C(G) is (h, ω, d)-locally optimal for λ, then:
(1) x is the unique ML codeword for λ.
(2) x is the unique optimal LP solution given λ.

(Diagram: LO(λ) is contained in ML(λ) and in {LP(λ) integral}.)

Goals achieved: (1) x ∈ LO(λ) ⇒ x = ML(λ) and ML(λ) unique; (2) x ∈ LO(λ) ⇒ x = LP(λ) and LP(λ) unique. Left to show: (3) Pr{x ∈ LO(λ)} = 1 − o(1).
Interesting outcomes of the theorem:
- It characterizes the event of LP decoding failure. For example, under the all-zeros assumption:
  Theorem: Fix h and ω ∈ ℝ_+^h. Then Pr{LP decoding fails} ≤ Pr{∃ d-tree τ: ⟨λ, π_G(τ, ω)⟩ ≤ 0}.
- Work in progress: design an iterative message-passing decoding algorithm that computes an (h, ω, d)-locally-optimal codeword after h iterations — if there exists a locally optimal codeword x for λ, a weighted message-passing algorithm computes x in h iterations, together with a guarantee that x is the ML codeword.
Previous results:
- Koetter and Vontobel '06: characterized LP solutions and provided a criterion certifying the optimality of a codeword for LP decoding of regular LDPC codes. The characterization is based on the combinatorial structure of skinny trees of height h in the factor graph (h < girth(G)).
- Arora, Daskalakis and Steurer '09: certificate for LP decoding; extended to MBIOS channels in [H-Even '10]. The certificate characterization is based on weighted skinny trees of height h (h < girth(G)). Note: bounds on the word error probability of LP decoding are computed by analyzing the certificates; by introducing layer weights, the ADS certificate occurs with high probability for much larger noise rates.
- Vontobel '10: implies a certificate for LP decoding of Tanner codes based on weighted skinny subtrees of graph covers.
Comparison of the current work with [KV06, ADS09, HE10]:
- Girth: previously h < girth(G); here h is unbounded — the characterization uses computation trees, with no dependency on the girth of the factor graph.
- Regularity: previously regular factor graphs; here irregular factor graphs — normalization factors are added according to node degrees.
- Check nodes / constraints: previously parity codes; here general linear codes — a tighter relaxation for the generalized fundamental polytope.
- Proof technique: local isomorphism.
- Deviations: previously "skinny" trees, which locally satisfy the inner parity checks; here "fat" d-trees, not necessarily valid configurations — certificates based on "fat" structures are likely to occur with high probability for larger noise rates.
- Dual / primal LP: previously analysis and characterization of LP solutions (dual LP); here a reduction to ML via a characterization of graph-cover decoding (primal LP).
Summary: combinatorial, graph-theoretic, and algorithmic methods for analyzing the decoding of (modern) error-correcting codes over any memoryless channel:
- A new local-optimality combinatorial certificate for LP decoding of irregular Tanner codes, i.e., local optimality ⇒ LP optimality; the degree of local-code nodes is not limited to 2.
- The certificate is based on weighted "fat" subtrees of computation trees of height h, where h is not bounded by the girth.
- Proofs use combinatorial decompositions and graph-cover arguments.
- An efficient dynamic-programming algorithm, running in O(|E|·h) time, computes the local-optimality certificate of a codeword given an LLR vector.
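The dynamic program can be sketched as a min-sum message-passing recursion over directed edges. The sketch below is a simplified, hypothetical variant (names `skinny_tree_certificate`, `var_adj`, `chk_adj` are illustrative) for the unweighted skinny-tree case (d = 2, ω ≡ 1) on a toy 6-cycle Tanner graph, certifying the all-zeros codeword: each message holds the minimum cost of a subtree hanging off an edge, and the codeword is certified if every rooted tree has positive cost.

```python
def skinny_tree_certificate(var_adj, chk_adj, llr, h):
    # Min-sum DP over directed edges, h rounds (O(|E| h) total work):
    #   variable -> check: own LLR plus best subtrees from the other checks
    #   check -> variable: the single cheapest child variable (d = 2)
    cost_cv = {(c, v): 0.0 for c, vs in chk_adj.items() for v in vs}
    for _ in range(h):
        cost_vc = {
            (v, c): llr[v] + sum(cost_cv[(c2, v)] for c2 in var_adj[v] if c2 != c)
            for v, cs in var_adj.items() for c in cs
        }
        cost_cv = {
            (c, v): min(cost_vc[(v2, c)] for v2 in chk_adj[c] if v2 != v)
            for c, vs in chk_adj.items() for v in vs
        }
    # Root cost at each variable node: its LLR plus all incoming subtree costs.
    roots = [llr[v] + sum(cost_cv[(c, v)] for c in cs) for v, cs in var_adj.items()]
    return all(cost > 0 for cost in roots)

# Hypothetical 6-cycle Tanner graph: variables 0..2, checks 'a'..'c'.
var_adj = {0: ['a', 'b'], 1: ['b', 'c'], 2: ['c', 'a']}
chk_adj = {'a': [0, 2], 'b': [1, 0], 'c': [2, 1]}
certified = skinny_tree_certificate(var_adj, chk_adj, [1.0, 1.0, 1.0], h=2)
refuted = skinny_tree_certificate(var_adj, chk_adj, [-5.0, 1.0, 1.0], h=2)
```

The full certificate of the talk additionally carries the layer weights ω and a degree-d choice at each check node; this unweighted d = 2 variant only illustrates the recursion's shape and cost.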
Future work: if there exists a locally optimal codeword x for λ, a weighted message-passing algorithm computes x in h iterations, together with a guarantee that x is the ML codeword. The approach is based on density-evolution techniques and the combinatorial characterization of local optimality.