Local Optimality Certificates for LP Decoding of Tanner Codes
Nissim Halabi, Guy Even
11th Haifa Workshop on Interdisciplinary Applications of Graph Theory, Combinatorics and Algorithms, May 2011

Error Correcting Codes: Worst Case vs. Average Case
An [N,K] linear code C is a K-dimensional subspace of the vector space {0,1}^N, with minimum distance d_min. (Figure: spheres of radius d_min/2 around codewords.)

Worst-case analysis assumes an adversarial channel: e.g., how many bit flips, in any pattern, can decoding recover from? (Up to d_min/2 flips.)

Average-case analysis assumes a probabilistic channel: e.g., given that every bit is flipped with probability p independently, what is the probability that decoding succeeds? Possibly, Pr(fail : avg. case) << Pr(fail : worst case).
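To make the average-case side concrete, here is a small sketch (the function name `flip_tail` and the [7,4] Hamming-code numbers are illustrative, not from the talk): a code with d_min = 3 corrects any single flip in the worst case, and the average-case failure probability is bounded by the probability that two or more bits flip.

```python
from math import comb

def flip_tail(N, p, t):
    # P(more than t of the N bits flip) when each flips independently w.p. p
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(t + 1, N + 1))

# A code with d_min = 3 (e.g. the [7,4] Hamming code) corrects any single
# flip; in the average case, decoding fails only if >= 2 bits flip.
pr_fail_bound = flip_tail(7, 0.01, 1)
```

For p = 0.01 this bound is about 0.2%, far below the worst-case guarantee of "at most one flip", illustrating why average-case analysis can be much more optimistic.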
Memoryless Binary-Input Output-Symmetric (MBIOS) Channel

Characterized by a conditional probability function P(y | c). Errors occur randomly and are independent from bit to bit (memoryless). Transmitted symbols are binary (binary-input). Errors affect '0's and '1's with equal probability (output-symmetric).

(Block diagram: channel encoding → codeword c ∈ C ⊆ {0,1}^N → noisy channel → noisy codeword y ∈ Λ^N → channel decoding → ĉ ∈ {0,1}^N.)

Example: Binary Symmetric Channel (BSC): y_i = c_i with probability 1 − p, and y_i = 1 − c_i with probability p.
Log-Likelihood Ratio (LLR) λ_i for a received observation y_i:

λ_i(y_i) = ln( P_{Y_i|X_i}(y_i | x_i = 0) / P_{Y_i|X_i}(y_i | x_i = 1) )

λ_i > 0 ⇒ y_i is more likely to be '0'; λ_i < 0 ⇒ y_i is more likely to be '1'. Replace the received word y by the LLR vector λ(y).

(Block diagram: codeword c ∈ C ⊆ {0,1}^N → noisy channel → noisy codeword y ∈ Λ^N → compute λ(y) → channel decoding → ĉ ∈ {0,1}^N.)
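For the BSC of the previous slide the LLR formula specializes to a single magnitude ln((1−p)/p) with a sign set by the received bit. A minimal sketch (the helper name `bsc_llr` is illustrative):

```python
import math

def bsc_llr(y, p):
    # LLR for a BSC with crossover probability p (p < 1/2):
    #   y_i = 0: lambda_i = ln((1-p)/p) > 0  ('0' more likely)
    #   y_i = 1: lambda_i = ln(p/(1-p)) < 0  ('1' more likely)
    mag = math.log((1 - p) / p)
    return [mag if yi == 0 else -mag for yi in y]

llr = bsc_llr([0, 1, 0], 0.1)
```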
Maximum-likelihood (ML) decoding for any binary-input memoryless channel:

x_ML(λ) = argmin_{x ∈ C} ⟨λ, x⟩

ML decoding formulated as a linear program:

x_ML(λ) = argmin_{x ∈ conv(C)} ⟨λ, x⟩

where C ⊆ {0,1}^N and conv(C) ⊆ [0,1]^N. Problem: conv(C) has no efficient representation.
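The first formulation can be sketched directly as a brute-force minimization over the codebook (the function name `ml_decode` and the tiny code are illustrative); its exponential cost in K is exactly what motivates the relaxation that follows.

```python
def ml_decode(codewords, llr):
    # Brute-force ML decoding: argmin over all codewords of <llr, x>.
    # Exponential in general -- the reason an LP relaxation is attractive.
    return min(codewords, key=lambda x: sum(l * xi for l, xi in zip(llr, x)))

# Tiny [3,2] single-parity-check code: all even-weight words of length 3.
code = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
# LLR favours '0' in bit 0 and '1' in bits 1 and 2.
x_ml = ml_decode(code, [2.0, -1.0, -1.0])
```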
Linear Programming (LP) decoding [Fel03, FWK05] — relaxation of the polytope conv(C) to a polytope P:
(1) All codewords x ∈ C are vertices of P.
(2) All new vertices are fractional (therefore, new vertices are not in C).
(3) P has an efficient representation.

Solve the LP:

x_LP(λ) = argmin_{x ∈ P} ⟨λ, x⟩

If x_LP(λ) is integral, then x_LP(λ) = x_ML(λ): the LP decoder finds the ML codeword. If x_LP(λ) is fractional, the LP decoder fails. (Figure: C ⊆ {0,1}^N, conv(C) ⊆ [0,1]^N ⊆ P, with fractional vertices of P outside conv(C).)
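A minimal runnable sketch of LP decoding, assuming SciPy is available (the function name `lp_decode_single_check` is illustrative): for a single parity check, P can be written with Feldman-style forbidden-set inequalities, one per odd-sized subset of the check's bits.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def lp_decode_single_check(llr):
    # Feldman-style forbidden-set inequalities for one parity check of
    # degree 3: for every odd-sized subset S of the check's bits,
    #   sum_{i in S} x_i - sum_{i not in S} x_i <= |S| - 1.
    n = 3
    A, b = [], []
    for r in (1, 3):  # odd subset sizes
        for S in itertools.combinations(range(n), r):
            A.append([1.0 if i in S else -1.0 for i in range(n)])
            b.append(len(S) - 1.0)
    # LP decoding: minimize <llr, x> over the relaxed polytope P.
    res = linprog(c=llr, A_ub=np.array(A), b_ub=np.array(b), bounds=[(0, 1)] * n)
    return res.x

x_lp = lp_decode_single_check([2.0, -1.0, -1.0])
integral = all(abs(v - round(v)) < 1e-6 for v in x_lp)
```

Here the LP optimum is integral, so the LP decoder returns the ML codeword; with other LLR vectors (on larger graphs with cycles) the optimum can be fractional, which is the failure event analyzed in the rest of the talk.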
Factor graph representation of Tanner codes: G = (I ∪ J, E), a bipartite graph with variable nodes (e.g., x_1, …, x_10) on one side and local-code nodes (e.g., C_1, …, C_5) on the other.

Every local-code node C_j is associated with a linear code of length deg_G(C_j). Tanner code C(G) and codewords x:

x ∈ C(G) ⇔ ∀j: the projection of x onto the neighborhood of C_j is in the local code C_j.

Extended local-code C̄_j ⊆ {0,1}^N: extend the local code to the bits outside it. Minimal local distance:

d* = min_j d_min(local code C_j).

Example: Expander codes [SS'96] — the Tanner graph is an expander, with a simple bit-flipping decoding algorithm.
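The membership condition above can be sketched in a few lines (the helper name `in_tanner_code` and the tiny 4-variable graph are hypothetical): each local-code node stores its neighborhood and its set of local codewords, and a word is in C(G) iff every projection lands in the corresponding local code.

```python
def in_tanner_code(x, local_codes):
    # x is in C(G) iff, for every local-code node C_j, the projection of x
    # onto the neighbourhood of C_j is a codeword of the local code C_j.
    return all(tuple(x[i] for i in nbrs) in code for nbrs, code in local_codes)

# Hypothetical tiny Tanner graph: 4 variable nodes, 2 local-code nodes,
# each local code a length-3 single-parity-check code.
parity3 = {(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)}
local_codes = [((0, 1, 2), parity3), ((1, 2, 3), parity3)]
ok = in_tanner_code((1, 1, 0, 1), local_codes)
bad = in_tanner_code((1, 0, 0, 1), local_codes)
```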
Maximum-likelihood (ML) decoding:

x_ML(λ) = argmin_{x ∈ conv(C)} ⟨λ, x⟩

Linear Programming (LP) decoding [following Fel03, FWK05]:

x_LP(λ) = argmin_{x ∈ P} ⟨λ, x⟩

where P = ∩_j conv(extended local-code C̄_j) is the generalized fundamental polytope, and conv(C) ⊆ P. (Figure: P as the intersection of conv(extended local-code C̄_1), conv(extended local-code C̄_2), ….)
Let λ ∈ ℝ^N denote an LLR vector received from the channel, and let x ∈ C(G) denote a codeword. Consider the following questions:
- Is x = ML(λ)? Is ML(λ) unique? (NP-hard)
- Is x = LP(λ)? Is LP(λ) unique?
Goal: an efficient test with one-sided error — given x and λ, answer "definitely yes" or "maybe no" — e.g., an efficient test via local computations: a "Local Optimality" criterion.
Let x ∈ C(G) ⊆ {0,1}^N and f ∈ [0,1]^N. Define the relative point x ⊕ f by (x ⊕ f)_i = |x_i − f_i| [Fel03]. Consider a finite set B ⊆ [0,1]^N.

Definition: A codeword x ∈ C is locally optimal for λ ∈ ℝ^N if for all vectors β ∈ B: ⟨λ, x ⊕ β⟩ > ⟨λ, x⟩.

Goal: find a set B such that:
(1) x ∈ LO(λ) ⇒ x = ML(λ) and ML(λ) unique;
(2) x ∈ LO(λ) ⇒ x = LP(λ) and LP(λ) unique;
(3) Pr{∀β ∈ B: ⟨λ, β⟩ > 0 | c = 0^N} = 1 − o(1).

Under the all-zeros assumption (c = 0^N), properties (1)-(3) then give

Pr{LP decoding fails} ≤ Pr{∃β ∈ B: ⟨λ, β⟩ ≤ 0 | c = 0^N} = o(1).
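For a finite, explicitly given B, the definition is directly checkable; a minimal sketch (the helper names `relative_point` and `locally_optimal`, and the tiny example set B, are illustrative):

```python
def relative_point(x, f):
    # Relative point x (+) f of [Fel03]: (x (+) f)_i = |x_i - f_i|.
    return [abs(xi - fi) for xi, fi in zip(x, f)]

def locally_optimal(x, llr, B):
    # x is locally optimal for llr iff <llr, x (+) beta> > <llr, x>
    # for every beta in the finite set B.
    base = sum(l * xi for l, xi in zip(llr, x))
    return all(
        sum(l * v for l, v in zip(llr, relative_point(x, beta))) > base
        for beta in B
    )

# Hypothetical example: all-zeros codeword, B a single fractional vector.
x = [0, 0, 0]
B = [[0.5, 0.5, 0.0]]
opt_good = locally_optimal(x, [1.0, 1.0, 1.0], B)
opt_bad = locally_optimal(x, [-1.0, 1.0, 1.0], B)
```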
Recall the goal: find a set B with properties (1)-(3). Suppose we already have properties (1) and (2). If every β ∈ B has large support, property (3) follows (e.g., via Chernoff-like bounds). If B = C, then x ∈ LO(λ) ⇒ x = ML(λ) and ML(λ) unique; however, the analysis of property (3) is then problematic, because each β has a GLOBAL structure.
Therefore, for analysis purposes, consider structures with a local nature: take B to be a set of TREES [following KV'06]. Strengthen the analysis by introducing layer weights [following ADS'09], yielding better bounds on Pr{∃β ∈ B: ⟨λ, β⟩ ≤ 0 | c = 0^N}. Finally, height(subtrees(G)) < ½·girth(G) = O(log N); instead, take path-prefix trees — whose height is not bounded by the girth!
Consider a graph G = (V, E) and a node r ∈ V:
V̂ — the set of all backtrackless paths in G emanating from node r with length at most h.
T_r^h(G) = (V̂, Ê) — the path-prefix tree of G rooted at node r with height h.
(Example figure: a graph G and its path-prefix tree T_r^4(G), rooted at r, of height 4; tree nodes are labelled by the backtrackless paths they represent.)
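The construction can be sketched as a breadth-first enumeration of backtrackless paths (the function name `path_prefix_tree` and the triangle example are illustrative); note that on a triangle the height-4 tree already contains paths revisiting nodes, showing that the tree's height is not limited by the girth.

```python
def path_prefix_tree(adj, r, h):
    # Nodes of T_r^h(G): all backtrackless paths in G starting at r with
    # length at most h; a tree node's parent is the path minus its last hop.
    nodes = [(r,)]
    frontier = [(r,)]
    for _ in range(h):
        nxt = []
        for path in frontier:
            for v in adj[path[-1]]:
                if len(path) >= 2 and v == path[-2]:
                    continue  # immediate backtracking is forbidden
                nxt.append(path + (v,))
        nodes.extend(nxt)
        frontier = nxt
    return nodes

# Triangle graph (girth 3): the height-4 tree contains paths that revisit
# nodes, so the tree can be taller than any simple path in G.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
tree = path_prefix_tree(adj, 0, 4)
```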
Consider a path-prefix tree T_r^h(G) of a Tanner graph G = (I ∪ J, E). A d-tree T = T[r, h, d] is a subgraph of T_r^h(G) such that:
- root(T) = r,
- ∀v ∈ T ∩ I: deg_T(v) = deg_G(v),
- ∀c ∈ T ∩ J: deg_T(c) = d.
A 2-tree is a skinny tree (a minimal deviation); similarly one obtains 3-trees, 4-trees, etc. (Figures: a 2-tree, a 3-tree, and a 4-tree, each rooted at v_0.) Note: a d-tree is not necessarily a valid configuration!
Consider layer weights ω: {1,…,h} → ℝ and a subtree T̂_r of a path-prefix tree T_r^{2h}(G). The layer weights induce a weight function on the subtree T̂_r: each layer l contributes with weight ω_l, normalized by the node degrees along the paths. π_G(T̂_r, ω) ∈ [0,1]^N denotes the projection of this weighted subtree to the Tanner graph G, obtained by accumulating, for each variable node of G, the weighted contributions of the tree nodes that project to it. (Example figure: a subtree of height 2 with layer weights ω_1, ω_2, its per-layer weights ω(T̂_r, 1) and ω(T̂_r, 2), and the projected vector π_G(T̂_r, ω).)
Setting: C(G) ⊆ {0,1}^N is a Tanner code with minimal local distance d*; 2 ≤ d ≤ d*; layer weights ω ∈ ℝ_+^h \ {0^h}.

B_d^(ω) ⊆ [0,1]^N \ {0^N} — the set of all vectors π_G(T, ω) corresponding to projections to G of ω-weighted d-trees T of height 2h rooted at variable nodes.

Definition: A codeword x is (h, ω, d)-locally optimal for λ ∈ ℝ^N if for all vectors β ∈ B_d^(ω): ⟨λ, x ⊕ β⟩ > ⟨λ, x⟩.
Theorem: If x ∈ C(G) is (h, ω, d)-locally optimal for λ, then:
(1) x is the unique ML codeword for λ.
(2) x is the unique optimal LP solution given λ.

(Diagram: LO(λ) is contained in ML(λ) and in {LP(λ) integral}.)

Goals achieved: (1) x ∈ LO(λ) ⇒ x = ML(λ) and ML(λ) unique; (2) x ∈ LO(λ) ⇒ x = LP(λ) and LP(λ) unique. Left to show: (3) Pr{x ∈ LO(λ)} = 1 − o(1).
Interesting outcomes of the theorem:
- It characterizes the event of LP decoding failure. For example, under the all-zeros assumption:
  Theorem: Fix h and ω ∈ ℝ_+^h. Then Pr{LP decoding fails} ≤ Pr{∃ d-tree τ: ⟨λ, π_G(τ, ω)⟩ ≤ 0}.
- Work in progress: design an iterative message-passing decoding algorithm that computes an (h, ω, d)-locally-optimal codeword after h iterations — if there exists a locally optimal codeword x for λ, a weighted message-passing algorithm computes x in h iterations, together with a guarantee that x is the ML codeword.
Previous results:
- Koetter and Vontobel '06: characterized LP solutions and provided a criterion certifying the optimality of a codeword for LP decoding of regular LDPC codes. The characterization is based on the combinatorial structure of skinny trees of height h in the factor graph (h < girth(G)).
- Arora, Daskalakis and Steurer '09: certificate for LP decoding; extended to MBIOS channels in [H-Even '10]. The certificate characterization is based on weighted skinny trees of height h (h < girth(G)). Note: bounds on the word error probability of LP decoding are computed by analyzing the certificates; by introducing layer weights, the ADS certificate occurs with high probability for much larger noise rates.
- Vontobel '10: implies a certificate for LP decoding of Tanner codes based on weighted skinny subtrees of graph covers.
Comparison of the current work with [KV06, ADS09, HE10]:
- Girth: previously h < girth(G); here h is unbounded — the characterization uses computation trees, with no dependency on the girth of the factor graph.
- Regularity: previously regular factor graphs; here irregular factor graphs — normalization factors are added according to node degrees.
- Check nodes / constraints: previously parity codes; here general linear codes — a tighter relaxation for the generalized fundamental polytope.
- Proof technique: local isomorphism.
- Deviations: previously "skinny" trees, which locally satisfy the inner parity checks; here "fat" d-trees, not necessarily valid configurations — certificates based on "fat" structures are likely to occur with high probability for larger noise rates.
- Dual / primal LP: previously analysis and characterization of LP solutions (dual LP); here a reduction to ML via a characterization of graph-cover decoding (primal LP).
Summary: combinatorial, graph-theoretic, and algorithmic methods for analyzing the decoding of (modern) error-correcting codes over any memoryless channel:
- A new local-optimality combinatorial certificate for LP decoding of irregular Tanner codes, i.e., local optimality ⇒ LP optimality; the degree of local-code nodes is not limited to 2.
- The certificate is based on weighted "fat" subtrees of computation trees of height h, where h is not bounded by the girth.
- Proofs use combinatorial decompositions and graph-cover arguments.
- An efficient dynamic-programming algorithm, running in O(|E|·h) time, computes the local-optimality certificate of a codeword given an LLR vector.
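The dynamic program can be sketched as a min-sum message-passing recursion over directed edges. The sketch below is a simplified, hypothetical variant (names `skinny_tree_certificate`, `var_adj`, `chk_adj` are illustrative) for the unweighted skinny-tree case (d = 2, ω ≡ 1) on a toy 6-cycle Tanner graph, certifying the all-zeros codeword: each message holds the minimum cost of a subtree hanging off an edge, and the codeword is certified if every rooted tree has positive cost.

```python
def skinny_tree_certificate(var_adj, chk_adj, llr, h):
    # Min-sum DP over directed edges, h rounds (O(|E| h) total work):
    #   variable -> check: own LLR plus best subtrees from the other checks
    #   check -> variable: the single cheapest child variable (d = 2)
    cost_cv = {(c, v): 0.0 for c, vs in chk_adj.items() for v in vs}
    for _ in range(h):
        cost_vc = {
            (v, c): llr[v] + sum(cost_cv[(c2, v)] for c2 in var_adj[v] if c2 != c)
            for v, cs in var_adj.items() for c in cs
        }
        cost_cv = {
            (c, v): min(cost_vc[(v2, c)] for v2 in chk_adj[c] if v2 != v)
            for c, vs in chk_adj.items() for v in vs
        }
    # Root cost at each variable node: its LLR plus all incoming subtree costs.
    roots = [llr[v] + sum(cost_cv[(c, v)] for c in cs) for v, cs in var_adj.items()]
    return all(cost > 0 for cost in roots)

# Hypothetical 6-cycle Tanner graph: variables 0..2, checks 'a'..'c'.
var_adj = {0: ['a', 'b'], 1: ['b', 'c'], 2: ['c', 'a']}
chk_adj = {'a': [0, 2], 'b': [1, 0], 'c': [2, 1]}
certified = skinny_tree_certificate(var_adj, chk_adj, [1.0, 1.0, 1.0], h=2)
refuted = skinny_tree_certificate(var_adj, chk_adj, [-5.0, 1.0, 1.0], h=2)
```

The full certificate of the talk additionally carries the layer weights ω and a degree-d choice at each check node; this unweighted d = 2 variant only illustrates the recursion's shape and cost.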
Future work: if there exists a locally optimal codeword x for λ, a weighted message-passing algorithm computes x in h iterations, together with a guarantee that x is the ML codeword. The approach is based on density-evolution techniques and the combinatorial characterization of local optimality.