SLIDE 1
An Upgrading Algorithm with Optimal Power Law
Or Ordentlich¹   Ido Tal²
¹Hebrew University   ²Technion
SLIDE 2
Big picture first
In this talk:
◮ An upgrading algorithm for channels with non-binary input
◮ Optimal power law
◮ Achieved by reduction to the binary-input case
◮ Important for constructing polar codes
SLIDE 3
Constructing vanilla polar codes
◮ Underlying channel: a binary-input, symmetric, and memoryless channel W : X → Y, where X = {0, 1}
◮ Derive N = 2^n synthetic channels W_j^(n) : X → Y^N × X^(j−1), where 1 ≤ j ≤ N
◮ Constructing a vanilla polar code ≡ finding which synthetic channels W_j^(n) are ‘almost noiseless’
◮ Problem: the output alphabet Y^N × X^(j−1) is intractably large
◮ Solution:
  ◮ Replace W_j^(n) with Q_j^(n), having output alphabet size L
  ◮ Have Q_j^(n) be (stochastically) degraded with respect to W_j^(n)
[Diagram: the intractably large output of W_j^(n) is fed through a map Φ to produce Q_j^(n)]
◮ Q_j^(n) almost noiseless ⟹ W_j^(n) almost noiseless
SLIDE 4
Constructing vanilla polar codes
◮ We write Q ≤ W if Q is degraded with respect to W
◮ Alternatively, we write W ≥ Q and say that W is upgraded with respect to Q
◮ Previous slide: Q_j^(n) ≤ W_j^(n)
◮ We can also approximate W_j^(n) “from above” by an upgraded channel R_j^(n) having output alphabet size at most L
◮ Sandwich property: Q_j^(n) ≤ W_j^(n) ≤ R_j^(n)
◮ In the vanilla setting, R_j^(n) is of secondary importance. . .
SLIDE 5
Constructing generalized polar codes
◮ Polar codes have been generalized beyond the vanilla setting
  ◮ Asymmetric channels (with an asymmetric input distribution)
  ◮ Wiretap channels
  ◮ Channels with memory (the input distribution can have memory as well)
◮ In all these settings, upgrading is as important as degrading for constructing the code
◮ For settings with memory, the “effective input alphabet” is non-binary
SLIDE 6
Problem statement
◮ Given: the joint distribution of channel and input, P_{X,Y}(x, y)
  ◮ x ∈ X, the input alphabet, and y ∈ Y, the output alphabet
  ◮ P_{X,Y}(x, y) = P_X(x) · P_{Y|X}(y|x), where P_X(x) is the input distribution
◮ Find: P*_{X,Z,Y}(x, z, y) such that
  ◮ Marginalization: Σ_z P*_{X,Z,Y}(x, z, y) = P_{X,Y}(x, y)
  ◮ Upgrading: X − Z − Y is a Markov chain
  ◮ Tractable output alphabet size: z ∈ Z and |Z| ≤ L
[Diagram: the channel W : X → Y is replaced by an upgraded channel R : X → Z followed by a map Φ : Z → Y that recovers W]
◮ Figure of merit: H(X|Y) − H(X|Z) = I(X; Z) − I(X; Y) should be ‘small’
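As a numerical sanity check on the figure of merit, the sketch below builds a toy joint distribution (the specific numbers are illustrative placeholders, not from the talk) in which the Markov chain X − Z − Y holds by construction, and evaluates H(X|Y) − H(X|Z):

```python
import numpy as np

def cond_entropy(p_xa):
    """H(X|A) in bits for a joint matrix p_xa[x, a]."""
    p_a = p_xa.sum(axis=0)
    h = 0.0
    for x in range(p_xa.shape[0]):
        for a in range(p_xa.shape[1]):
            if p_xa[x, a] > 0:
                h -= p_xa[x, a] * np.log2(p_xa[x, a] / p_a[a])
    return h

# Toy example: P(x, z, y) = P(x, z) * Phi(y | z), so X - Z - Y is Markov.
p_xz = np.array([[0.4, 0.1],
                 [0.1, 0.4]])          # joint of X and Z
phi = np.array([[0.9, 0.1],
                [0.2, 0.8]])           # Phi(y | z), rows indexed by z
p_xzy = p_xz[:, :, None] * phi[None, :, :]

p_xy = p_xzy.sum(axis=1)               # marginalize out Z
p_xz_marg = p_xzy.sum(axis=2)          # marginalize out Y

# Figure of merit; nonnegative because Z is upgraded with respect to Y.
gap = cond_entropy(p_xy) - cond_entropy(p_xz_marg)
print(gap)
```

Since Y is a degradation of Z here, conditioning on Z can only reduce uncertainty about X, so the gap comes out nonnegative.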
SLIDE 7
Power law
◮ Previous results:
  ◮ Recall: the output alphabet size of the upgraded channel is |Z| ≤ L
  ◮ There exists a ‘hard to upgrade’ joint distribution P_{X,Y}:
    H(X|Y) − H(X|Z) = Ω(L^{−2/(|X|−1)})
  ◮ For binary input, |X| = 2, and any P_{X,Y}, there exists an upgrading algorithm such that
    H(X|Y) − H(X|Z) = O(L^{−2}) = O(L^{−2/(|X|−1)})
◮ New result:
  ◮ Also for non-binary input, we can upgrade any P_{X,Y} and achieve
    H(X|Y) − H(X|Z) = O(L^{−2/(|X|−1)})
  ◮ Main idea: use the binary-input algorithm as a black box (a reduction)
SLIDE 8
One-hot representation
◮ Denote q = |X|. Assume X = {1, 2, . . . , q}
◮ For x ∈ X, define g(x) = (x_1, x_2, . . . , x_{q−1}), the one-hot representation:
  g(1) = (1, 0, 0, . . . , 0, 0)
  g(2) = (0, 1, 0, . . . , 0, 0)
  . . .
  g(q − 1) = (0, 0, 0, . . . , 0, 1)
  g(q) = (0, 0, 0, . . . , 0, 0)
◮ Abuse notation and write x = g(x) = (x_1, x_2, . . . , x_{q−1})
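The one-hot map g is straightforward to code; a minimal sketch (the function name and the 1-based alphabet follow the slide):

```python
def g(x, q):
    """One-hot representation of x in {1, ..., q} as a (q-1)-tuple.

    Symbols 1..q-1 each set one coordinate; symbol q maps to all zeros.
    """
    return tuple(1 if x == i else 0 for i in range(1, q))

q = 4
print(g(1, q))  # (1, 0, 0)
print(g(4, q))  # (0, 0, 0) -- the last symbol is the all-zero tuple
```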
SLIDE 9
P_{X,Y} ⟹ α^(i) ⟹ β^(i) ⟹ γ^(i) ⟹ P*_{X,Z,Y}
◮ We are given P_{X,Y}, where |X| = q
◮ Need to produce P*_{X,Z,Y} by reducing to binary-input upgrading
◮ Denote X′ = {0, 1}
◮ Let X = (X_1, X_2, . . . , X_{q−1}) and Y be distributed according to P_{X,Y}
◮ First step: define, for 1 ≤ i ≤ q − 1, the joint distribution
  α^(i)_{X_i,Y}(x′, y) = P(X_i = x′, Y = y | X_1^{i−1} = 0_1^{i−1})
◮ The joint distribution α^(i)_{X_i,Y}(x′, y) has binary input, x′ ∈ X′
◮ We may apply our binary-input upgrading procedure
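A sketch of computing α^(i) from P_{X,Y} (the matrix indexing is my own convention, not from the talk): under the one-hot map, the conditioning event X_1^{i−1} = 0_1^{i−1} is exactly the event X ≥ i, and X_i = 1 is exactly X = i.

```python
import numpy as np

def alpha(p_xy, i):
    """Binary-input joint alpha^(i)(x', y), with p_xy[x-1, y] = P(X=x, Y=y)."""
    tail = p_xy[i - 1:, :]                 # rows with X >= i, i.e. X_1^{i-1} = 0
    norm = tail.sum()                      # P(X >= i), the conditioning probability
    a = np.zeros((2, p_xy.shape[1]))
    a[1] = tail[0] / norm                  # X_i = 1  <=>  X = i
    a[0] = tail[1:].sum(axis=0) / norm     # X_i = 0  <=>  X > i
    return a

# Illustrative joint distribution for q = 3, |Y| = 2 (placeholder numbers).
p_xy = np.array([[0.20, 0.10],
                 [0.10, 0.20],
                 [0.15, 0.25]])
print(alpha(p_xy, 1))
print(alpha(p_xy, 2))
```

Each α^(i) is a valid binary-input joint distribution, so the binary upgrading procedure applies to it unchanged.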
SLIDE 10
P_{X,Y} ⟹ α^(i) ⟹ β^(i) ⟹ γ^(i) ⟹ P*_{X,Z,Y}
◮ Recall our binary-input joint distribution: for 1 ≤ i ≤ q − 1,
  α^(i)_{X_i,Y}(x′, y) = P(X_i = x′, Y = y | X_1^{i−1} = 0_1^{i−1})
◮ Define Λ = ⌊L^{1/(q−1)}⌋
◮ Second step:
  ◮ Apply our binary-input upgrading procedure to α^(i)_{X_i,Y}(x′, y), resulting in
    β^(i)_{X_i,Z_i,Y}(x′, z, y), where |Z_i| ≤ Λ
  ◮ The difference in entropies is O(Λ^{−2})
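A quick check of the alphabet-size accounting, assuming Λ = ⌊L^{1/(q−1)}⌋ (chosen so that the q − 1 per-coordinate alphabets of size Λ multiply to at most L, which by the O(Λ^{−2}) bound per coordinate gives the O(L^{−2/(q−1)}) power law):

```python
import math

def per_coordinate_budget(L, q):
    """Largest Lambda with Lambda^(q-1) <= L, via Lambda = floor(L^(1/(q-1)))."""
    lam = math.floor(L ** (1.0 / (q - 1)))
    assert lam ** (q - 1) <= L   # |Z| = prod |Z_i| <= Lambda^(q-1) <= L
    return lam

q, L = 4, 1000
lam = per_coordinate_budget(L, q)
print(lam, lam ** (q - 1))       # product alphabet stays within the budget L
```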
SLIDE 11
P_{X,Y} ⟹ α^(i) ⟹ β^(i) ⟹ γ^(i) ⟹ P*_{X,Z,Y}
◮ Recall that we have produced β^(i)_{X_i,Z_i,Y}(x′, z, y), where x′ ∈ X′ is binary
◮ Third step: define the conditional distribution
  γ^(i)_{X_i | Z_i, X_1^{i−1}}(x_i | z_i, x_1^{i−1}) =
    { β^(i)_{X_i|Z_i}(x_i | z_i)   if x_1^{i−1} = 0_1^{i−1} ,
    { 1                            if x_1^{i−1} ≠ 0_1^{i−1} and x_i = 0 ,
    { 0                            otherwise
◮ That is, if x_1^{i−1} is non-zero, force x_i to zero, in accordance with the one-hot representation
◮ Otherwise, if x_1^{i−1} is zero, use β^(i)_{X_i|Z_i}
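The case split defining γ^(i) translates directly into code; a sketch with my own array conventions (β^(i)_{X_i|Z_i} stored as a matrix whose columns are indexed by z_i and sum to one):

```python
import numpy as np

def gamma(beta_x_given_z, x_i, z_i, prefix):
    """gamma^(i)(x_i | z_i, x_1^{i-1}) for binary x_i and prefix tuple x_1^{i-1}."""
    if all(v == 0 for v in prefix):          # x_1^{i-1} = 0^{i-1}: use beta
        return beta_x_given_z[x_i, z_i]
    return 1.0 if x_i == 0 else 0.0          # nonzero prefix: force x_i = 0

# Placeholder conditional beta^(i)(x' | z), columns sum to 1 over x'.
beta = np.array([[0.3, 0.9],
                 [0.7, 0.1]])
print(gamma(beta, 1, 0, (0, 0)))   # zero prefix: read off beta -> 0.7
print(gamma(beta, 1, 0, (1, 0)))   # nonzero prefix: X_i forced to 0 -> 0.0
```

The forced-zero branch is exactly what keeps the reconstructed x a valid one-hot tuple.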
SLIDE 12
P_{X,Y} ⟹ α^(i) ⟹ β^(i) ⟹ γ^(i) ⟹ P*_{X,Z,Y}
◮ Last step: define
  P*_{X,Z,Y}(x, z, y) = P_Y(y) · Π_{i=1}^{q−1} β^(i)_{Z_i|Y}(z_i | y) · Π_{i=1}^{q−1} γ^(i)_{X_i | Z_i, X_1^{i−1}}(x_i | z_i, x_1^{i−1})
◮ A valid upgrade, with optimal power law:
  H(X|Y) − H(X|Z) = O(L^{−2/(|X|−1)})
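The final product formula can be evaluated pointwise. The sketch below uses illustrative placeholder distributions (not the output of an actual binary upgrading step) and checks that P* is a valid joint distribution in which non-one-hot x receive probability zero:

```python
import numpy as np
from itertools import product

def p_star(x, z, y, p_y, beta_z_y, beta_x_z):
    """P*(x, z, y) for (q-1)-tuples x (one-hot bits) and z (coordinates)."""
    val = p_y[y]
    for i in range(len(x)):
        val *= beta_z_y[i][z[i], y]          # beta^(i)(z_i | y)
        if all(v == 0 for v in x[:i]):       # gamma^(i): zero prefix -> beta
            val *= beta_x_z[i][x[i], z[i]]
        else:                                # nonzero prefix -> force x_i = 0
            val *= 1.0 if x[i] == 0 else 0.0
    return val

# q = 3: two binary coordinates; |Y| = 2, |Z_i| = 2 (placeholder numbers).
p_y = np.array([0.5, 0.5])
beta_z_y = [np.array([[0.8, 0.3], [0.2, 0.7]])] * 2   # beta^(i)(z_i | y)
beta_x_z = [np.array([[0.4, 0.9], [0.6, 0.1]])] * 2   # beta^(i)(x_i | z_i)

total = sum(p_star(x, z, y, p_y, beta_z_y, beta_x_z)
            for x in product((0, 1), repeat=2)
            for z in product((0, 1), repeat=2)
            for y in range(2))
print(total)   # total probability; invalid x such as (1, 1) contribute 0
```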
SLIDE 13
A graphical description of P_{X,Y}
[Diagram: Y ∼ P_Y is drawn first; for each 1 ≤ i ≤ q − 1, X̃_i is drawn from α^(i)_{X_i|Y}(· | Y), and then X_i = f_i(X̃_1^i)]
where f_i(x̃_1^i) ≜ x̃_i · 1{x̃_1^{i−1} = 0_1^{i−1}}
SLIDE 14
A graphical description of P_{X,Z,Y}
[Diagram: Y ∼ P_Y is drawn first; for each 1 ≤ i ≤ q − 1, Z_i is drawn from β^(i)_{Z_i|Y}(· | Y), then X̃_i from β^(i)_{X_i|Z_i}(· | Z_i), and X_i = f_i(X̃_1^i)]
where f_i(x̃_1^i) ≜ x̃_i · 1{x̃_1^{i−1} = 0_1^{i−1}}