INFORMATION THEORY FUNDAMENTALS AND MULTIPLE USER APPLICATIONS (PART II) - Max H. M. Costa - PowerPoint PPT Presentation

SLIDE 1

INFORMATION THEORY FUNDAMENTALS AND MULTIPLE USER APPLICATIONS (PART II)

July 2018

Max H. M. Costa Unicamp

LAWCI – Unicamp

SLIDE 2

Summary

  • Introduction
  • Entropy, K-L Divergence, Mutual Information
  • Asymptotic Equipartition Property (AEP)
  • 1. Data Compression (Source Coding)
  • 2. Transmission over Unreliable Channels (Channel Coding)
  • Differential Entropy, Gaussian Channels
  • Multiple User Information Theory:
  • Multiple Access, Broadcast, Interference, Relay Channels
  • Remarks: Applications in Biology, Economics,...
SLIDE 3

H(X) = Entropy of X

  • Let X be a discrete random variable taking values in {x1, x2, ..., xM} with probabilities p = {p1, p2, ..., pM}.

  • Definition:  H(X) = H(p) = Σ_{i=1}^{M} p_i log2 (1/p_i)  (bits)

  • = E( log2 (1/p(X)) ) bits

  • H(X) is a measure of the uncertainty of X.
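A minimal Python sketch of this definition (the example distributions are made up for illustration):

```python
import math

def entropy(p):
    """H(p) = sum_i p_i * log2(1/p_i), in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(1.0 / pi) for pi in p if pi > 0)

# Uniform on 4 symbols: H = log2(4) = 2 bits (maximum uncertainty).
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
# A skewed distribution is less uncertain, so its entropy is lower.
print(entropy([0.7, 0.1, 0.1, 0.1]))      # about 1.357 bits
```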

SLIDE 4

How can H(X) arise naturally?

  • Let X1, X2, ... be independent and identically distributed (i.i.d.) according to p(x).

  • Then  p(x1, x2, ..., xn) = p(x1) p(x2) ... p(xn) = Π_{i=1}^{n} p(x_i) =

  = 2^{ Σ_{i=1}^{n} log2 p(x_i) } = 2^{ n · (1/n) Σ_{i=1}^{n} log2 p(x_i) } ≈ 2^{−n H(X)}

Asymptotic Equipartition Property

SLIDE 5

Examples (continued)

  • Example: X ∈ {0,1}, p(X=0) = p, p(X=1) = 1−p

  • H(X) = – p log p – (1−p) log (1−p) = h(p)

[Plot: h(p) versus p: h(0) = h(1) = 0, with maximum h(½) = 1.]

h(p) is the binary entropy function

SLIDE 6

Lemma

  • ln x ≤ x − 1, x > 0

  • Proof: Taylor series with remainder

[Plot: ln x stays below the line x − 1, touching it at x = 1.]

SLIDE 7

Relative Entropy (Kullback-Leibler divergence)

  • Let p(x) and q(x) be two probability mass functions defined on alphabet X.

  • The K-L divergence of p w.r.t. q is

  D(p ‖ q) = Σ_{x ∈ X} p(x) log ( p(x) / q(x) )
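A direct numeric sketch of the definition (p and q are hypothetical examples):

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_x p(x) log2(p(x)/q(x)); assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p, q = [0.5, 0.5], [0.9, 0.1]
print(kl_divergence(p, q))  # 0.737...: always >= 0
print(kl_divergence(p, p))  # 0.0: D(p || p) = 0
print(kl_divergence(q, p))  # 0.531...: note the asymmetry, D(p||q) != D(q||p)
```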

SLIDE 8

Joint, marginal and conditional distributions

  • Joint Distribution:

[Table: rows x1, …, xM; columns y1, …, yN; entries p(xi, yj).]

Marginal distributions:

  p(xi) = Σ_j p(xi, yj),   p(yj) = Σ_i p(xi, yj)

SLIDE 9

Conditional Distributions:

  • p(yj | xi) = p(xi, yj) / p(xi)

  • p(xi | yj) = p(xi, yj) / p(yj)

The joint distribution determines the marginal and the conditional distributions. Note: The marginals do not determine the joint distribution.

SLIDE 10

Joint Entropy

  • H(X,Y) = H(p( · , · )) = Σ_{i,j} p(xi, yj) log ( 1 / p(xi, yj) )

SLIDE 11

Conditional Entropy

  • H(X|Y) = Σ_{i,j} p(xi, yj) log ( 1 / p(xi | yj) ) = − E( log p(X|Y) )

  • H(Y|X) = Σ_{i,j} p(xi, yj) log ( 1 / p(yj | xi) ) = − E( log p(Y|X) )

SLIDE 12

Chain Rule (like peeling an onion):

H(X,Y) = H(X) + H(Y|X)

  = H(Y) + H(X|Y)

  • Proof: Do it for homework. Simple algebraic manipulation.

  • Corollary (conditional form):

  H(X,Y|Z) = H(X|Z) + H(Y|X,Z) = H(Y|Z) + H(X|Y,Z)

SLIDE 13

Mutual Information

  • The Mutual Information between X and Y is the K-L divergence between the joint distribution p(x,y) and the product of the marginals p(x) p(y).

  • I(X;Y) = D( p(x,y) ‖ p(x) p(y) ) = Σ_{x,y} p(x,y) log [ p(x,y) / (p(x) p(y)) ]

SLIDE 14

Mutual Information and Entropy

  • I(X;Y) = Σ_{x,y} p(x,y) log [ p(x,y) / (p(x) p(y)) ]

  = H(X) + H(Y) – H(X,Y) (from above)

  = H(X) – H(X|Y) (from chain rule)

  = H(Y) – H(Y|X) (alternative form)

  • Note: The Mutual Information between two random variables is the reduction in uncertainty about one r.v. when the other is revealed.
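These identities are easy to verify numerically; a small sketch with a hypothetical 2×2 joint distribution:

```python
import math

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

joint = [[0.4, 0.1],   # hypothetical p(x, y): rows are x, columns are y
         [0.1, 0.4]]
px = [sum(row) for row in joint]          # marginal of X
py = [sum(col) for col in zip(*joint)]    # marginal of Y
HXY = H([p for row in joint for p in row])

I = H(px) + H(py) - HXY                   # I = H(X) + H(Y) - H(X,Y)
D = sum(joint[i][j] * math.log2(joint[i][j] / (px[i] * py[j]))
        for i in range(2) for j in range(2) if joint[i][j] > 0)
print(I, D)                               # both give 0.278...: the K-L form agrees
```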

SLIDE 15

A Venn Diagram

  • Works well for two random variables

[Venn diagram: circles H(X) and H(Y) overlap in I(X;Y); the non-overlapping parts are H(X|Y) and H(Y|X).]

SLIDE 16

Fano’s Inequality:

[Diagram: X → channel p(y|x) → Y → decoder → X̂ = g(Y)]

  Pe ≥ ( H(X|Y) − 1 ) / log |X|

  where Pe = Prob( X̂ ≠ X )

SLIDE 17

Asymptotic Equipartition Property

  • Let X1, X2, …, Xn be i.i.d. according to p(x)

Sample space = set of all sequences (x1, x2, …, xn);  A = set of typical sequences

  • 1) Pr{A} ≥ 1 − ε
  • 2) p(x) ≈ 2^{−nH(X)} for each x in A
  • 3) |A| ≈ 2^{nH(X)}

A.E.P.: This is the DNA of IT!

SLIDE 18

Examples of typical sequences

  • Let X be a biased coin with P(Head) = 0.9 and P(Tail) = 0.1.

  • Consider the set of 1000-long sequences of coin tosses.

  • Typical sequences are those that have approximately 900 Heads and 100 Tails.

  • Note: The most likely sequence, namely the one with 1000 Heads, is not typical!
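A quick numeric illustration with the same coin parameters:

```python
import math, random

random.seed(0)
n, p_head = 1000, 0.9
H = -(p_head * math.log2(p_head) + (1 - p_head) * math.log2(1 - p_head))
print(H)               # h(0.1) = 0.469 bits per toss

print(p_head ** n)     # all-heads, the single most likely sequence: ~1.7e-46
print(2 ** (-n * H))   # one typical sequence: ~2^-469, far smaller still,
                       # but there are ~2^(n*H) of them, so the typical set
                       # carries essentially all of the probability.

heads = sum(random.random() < p_head for _ in range(n))
print(heads)           # a drawn sequence has about 900 heads, never 1000
```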

SLIDE 19

Rn also tends to be non-intuitive

  • Example: Sphere "inscribed" in cube of side 1.

[Figure: inscribed circle/sphere of radius 1/2 in R2 and R3.]

  • Question: What happens in R^n ? Does Vn → 0, 1, or ∞ ?
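The answer can be checked from the closed form V_n(r) = π^(n/2) r^n / Γ(n/2 + 1); a short sketch:

```python
import math

def ball_volume(n, r=0.5):
    """Volume of the n-ball of radius r."""
    return math.pi ** (n / 2) * r ** n / math.gamma(n / 2 + 1)

# The cube has volume 1, so this is also the fraction the ball occupies.
for n in (2, 3, 10, 20, 50):
    print(n, ball_volume(n))
# 2 -> 0.785, 3 -> 0.524, 10 -> 0.0025, ...: Vn -> 0 as n grows.
```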

SLIDE 20

Transmission over Unreliable Channels

  • The Channel Coding Problem:

  • W ∈ {1, 2, …, 2^{nR}} = message set of rate R
  • X = (x1 x2 … xn) = codeword input to channel
  • Y = (y1 y2 … yn) = codeword output from channel
  • Ŵ = decoded message;  P(error) = P{ W ≠ Ŵ }

[Diagram: W → Channel Encoder → X → Channel p(y|x) → Y → Channel Decoder → Ŵ]

SLIDE 21

Shannon’s Channel Coding Theorem

  • Using the channel n times:

[Diagram: Xn = (X1, …, Xn) → channel → Yn = (Y1, …, Yn)]

SLIDE 22

Simple examples

  • Noiseless typewriter:

[Diagram: inputs X ∈ {1, 2, 3, 4} map noiselessly to outputs Y ∈ {1, 2, 3, 4}.]

Can transmit R = log2 4 = 2 bits/transmission.  Number of noise-free symbols = 4.

SLIDE 23

Simple examples

  • Noisy typewriter (type 1):

[Diagram: each input X ∈ {1, 2, 3, 4} goes to one of two outputs with probability 0.5 each.]

Can transmit R = log2 2 = 1 bit/transmission.  Number of noise-free symbols = 2.

SLIDE 24

Simple examples

  • Noisy typewriter (type 2):

[Diagram: each input X ∈ {1, 2, 3, 4} goes to one of two outputs with probability 0.5 each.]

Can transmit R ≥ log2 2 = 1 bit/transmission.  Number of noise-free symbols = 2 (apparently; surprise later).

SLIDE 25

Simple examples

  • Noisy typewriter (type 3):

[Diagram: each input X ∈ {1, 2, 3, 4} goes to one of two adjacent outputs with probability 0.5 each.]

Can transmit R = log2 2 = 1 bit/transmission.  Number of noise-free symbols = 2.  Use X = 1 and X = 3.

SLIDE 26

Simple examples

  • A tricky typewriter:

[Diagram: inputs X ∈ {1, …, 5}, outputs Y ∈ {1, …, 5}; each input goes to one of two outputs with probability 0.5 each, in a cyclic (pentagon) pattern.]

How many noise free symbols? Clearly at least 2, hopefully more.

SLIDE 27

Simple examples

  • Consider the n=2 extension of the channel:

[Diagram: the 5×5 grid of input pairs (X1, X2). Which code squares to pick?]

SLIDE 28

Simple examples

  • Consider the n=2 extension of the channel:

[Diagram: the 5×5 grid of input pairs (X1, X2).]  Let {X1, X2} be {(1,1), (2,3), (3,5), (4,2), (5,4)}

SLIDE 29

Reminder of the channel

  • A tricky typewriter:

[Diagram: inputs X ∈ {1, …, 5}; each input goes to one of two outputs with probability 0.5 each.]

How many noise free symbols? Clearly at least 2, hopefully more.

SLIDE 30

Simple examples

  • Looking at the outputs:

[Diagram: output pairs (Y1, Y2) on the 5×5 grid.]  Let {X1, X2} be {(1,1), (2,3), (3,5), (4,2), (5,4)}

SLIDE 31

Simple examples - observations

  • Note that we get 5 noise-free symbols in n=2 transmissions.

  • Thus achieve rate (log2 5)/2 ≈ 1.16 bits/transmission

  • with P(error) = 0. (Zero-error capacity, Lovász, 1979)

  • For arbitrarily small P(error) we can use long codes (n → ∞) to achieve log2(5/2) ≈ 1.32 bits/transmission, the channel capacity.

SLIDE 32

The Binary Symmetric Channel (BSC)

  • How many noise-free symbols?

[Diagram: BSC with crossover probability ε: each bit is received correctly with probability 1 − ε and flipped with probability ε.]

A.: Clearly for n=1 there are none. How about using n large?

SLIDE 33

Shannon’s Second Theorem

  • Using the channel n times:

[Diagram: Xn = (X1, …, Xn) → channel → Yn = (Y1, …, Yn)]

SLIDE 34

Shannon’s Second Theorem

  • The Information Channel Capacity of a discrete memoryless channel is

  C = max_{p(x)} I(X; Y)

  • Note: I(X; Y) is a function of p(x, y) = p(x) p(y|x).

  • But p(y|x) is fixed by the channel, so the maximization is over the input distribution p(x).

SLIDE 35

Shannon’s Second Theorem

  • Direct Part: If R < C, there exists a code with P(error) → 0.

  • Converse Part: If R > C, communication with P(error) → 0 is not possible.

SLIDE 36

Shannon’s Second Theorem

  • Proof of Converse (sketch using AEP):

[Diagram: the set of typical Yn sequences has size ≈ 2^{nH(Y)}; each codeword generates a typical "ball" of ≈ 2^{nH(Y|X)} output sequences.]

SLIDE 37

Shannon’s Second Theorem

  • Proof of Converse (sketch using AEP): Recall the sphere packing problem.

Maximum number of non-overlapping balls is bounded by

  2^{nR} ≤ 2^{nH(Y)} / 2^{nH(Y|X)} = 2^{nI(X;Y)} ≤ 2^{nC}

  • Thus 2^{nR} ≤ 2^{nC} and R ≤ C.

  • A formal proof uses Fano's inequality.

SLIDE 38

Example: The Binary Symmetric Channel

  • C = max (H(Y) – H(Y|X)) = 1 – h(ε) bits/transmission

  • Note: C = 0 for ε = ½.

[Diagram: BSC with crossover probability ε; plot of C(ε) = 1 − h(ε): C(0) = C(1) = 1 and C(½) = 0.]

SLIDE 39

Example: The Binary Erasure Channel

  • C = max (H(Y) – H(Y|X)) = 1 – α bits/transmission

[Diagram: BEC with erasure probability α: each input is received correctly with probability 1 − α or erased (E) with probability α; plot of C(α) = 1 − α.]

Note: C = 0 for α = 1. Capacity is achieved with p(X=0) = p(X=1) = ½.

SLIDE 40

Example: The Z Channel

  • C = max (H(Y) – H(Y|X)) = log2 5 − 2 = 0.322 bits/tr.

  • Note: the maximizing input has p(X = 1) = 2/5.

  • Homework: Obtain this capacity.

[Diagram: Z channel: X = 0 → Y = 0 with probability 1; X = 1 → Y = 1 with probability ½ and Y = 0 with probability ½.]
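One way to do the homework numerically: grid-search the input distribution q = p(X=1) (a sketch, assuming the Z channel above with crossover probability ½):

```python
import math

def h(p):
    return 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def z_mutual_info(q):
    """I(X;Y) with p(X=1) = q: P(Y=1) = q/2, H(Y) = h(q/2), H(Y|X) = q*h(1/2) = q."""
    return h(q / 2) - q

best_q = max((i / 10000 for i in range(10001)), key=z_mutual_info)
print(best_q)                  # 0.4 = 2/5
print(z_mutual_info(best_q))   # 0.3219...
print(math.log2(5) - 2)        # the claimed capacity, same value
```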

SLIDE 41

Changing Z channel into BEC

  • Show that the code {01, 10} can transform a Z channel into a BEC.

  • What is a lower bound to the capacity of the Z channel?

SLIDE 42

Typewriter type 2:

Sum channel: 2^C = 2^{C1} + 2^{C2}, where C1 = C2 = 0.322

C = 1.322 bits/channel use.  How many noise-free symbols?

[Diagram: two parallel Z-channel components, each with crossover probability ½.]
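A quick check of the sum-channel formula with these two Z-channel components:

```python
import math

C1 = C2 = math.log2(5) - 2          # each Z channel: 0.3219 bits
C = math.log2(2 ** C1 + 2 ** C2)    # 2^C = 2^C1 + 2^C2
print(C)                            # 1.3219 = 1 + C1: one extra bit selects the component
```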

SLIDE 43

Example: Noisy typewriter

  • C = max (H(Y) – H(Y|X)) = log2 26 − log2 2 = log2 13 bits/transmission

  • Achieved with uniform distribution on the inputs.

[Diagram: 26-letter noisy typewriter: each input letter goes to itself or the next letter with probability ½ each.]

SLIDE 44

Remark:

  • For this example, we can also achieve C = log2 13 bits/transmission with P(error) = 0 and n = 1 by transmitting alternating input symbols, i.e., X = {A, C, E, …, Y}.

SLIDE 45

Differential Entropy

  • Let X be a continuous random variable with density f(x) and support S. The differential entropy of X is

  h(X) = − ∫_S f(x) log f(x) dx

(if it exists).

Note: Also written as h(f).

SLIDE 46

Examples: Uniform distribution

  • Let X be uniform in the interval [0, a]. Then f(x) = 1/a in the interval and f(x) = 0 outside.

  • h(X) = − ∫_0^a (1/a) log (1/a) dx = log a

  • Note that h(X) can be negative, for a < 1.

  • However, 2^{h(f)} = 2^{log a} = a is the size of the support set, which is non-negative.

SLIDE 47

Example: Gaussian distribution

  • Let X ~ φ(x) = (1/√(2πσ²)) exp( −x²/(2σ²) )

  • Then h(X) = h(φ) = − ∫ φ(x) [ −x²/(2σ²) − ln √(2πσ²) ] dx

  = E[X²]/(2σ²) + ½ ln (2πσ²)

  = ½ ln (2πeσ²) nats

  • Changing the base we have h(X) = ½ log2 (2πeσ²) bits

SLIDE 48

Relation of Differential and Discrete Entropies

  • Consider a quantization of X with step Δ, denoted by X^Δ

  • Let X^Δ = x_j inside the j-th interval, so p_j ≈ f(x_j) Δ.

Then H(X^Δ) = − Σ_j p_j log p_j

  ≈ − Σ_j f(x_j) Δ log f(x_j) − log Δ ≈ h(f) − log Δ

SLIDE 49

Differential Entropy

  • So the two entropies differ by the log of the quantization level Δ.

  • We can define joint differential entropy, conditional differential entropy, K-L divergence and mutual information with some care to avoid infinite differential entropies.

SLIDE 50

K-L divergence and Mutual Information

  • D(f ‖ g) = ∫ f log ( f / g )

  • I(X; Y) = ∫∫ f(x, y) log [ f(x, y) / (f(x) f(y)) ] dx dy

  • Thus, I(X;Y) = h(X) + h(Y) – h(X,Y).

Note: h(X) can be negative, but I(X;Y) is always ≥ 0.

SLIDE 51

Differential entropy of a Gaussian vector

  • Theorem: Let X be a Gaussian n-dimensional vector with mean μ and covariance matrix K. Then

  h(X) = ½ log ( (2πe)^n |K| )

  • where |K| denotes the determinant of K.

  • Proof: Algebraic manipulation.

SLIDE 52

The Gaussian Channel

[Diagram: W → Channel Encoder → X → (+) → Y → Channel Decoder → Ŵ, with noise Z ~ N(0, N I) added at (+).]

Power Constraint: E[X²] ≤ P

SLIDE 53

The Gaussian Channel

  • W ∈ {1, 2, …, 2^{nR}} = message set of rate R
  • X = (x1 x2 … xn) = codeword input to channel
  • Y = (y1 y2 … yn) = codeword output from channel
  • Ŵ = decoded message;  P(error) = P{ W ≠ Ŵ }

[Diagram: W → Channel Encoder → X → (+) → Y → Channel Decoder → Ŵ, with Z ~ N(0, N I).]

Power constraint: E[X²] ≤ P

SLIDE 54

The Gaussian Channel

  • Using the channel n times:

[Diagram: Xn → Gaussian channel → Yn]

SLIDE 55

The Gaussian Channel

  • Capacity C = max_{f(x): E[X²] ≤ P} I(X; Y)

  • I(X; Y) = h(Y) − h(Y|X) = h(Y) − h(X + Z | X) = h(Y) − h(Z)

  ≤ ½ log 2πe(P + N) − ½ log 2πeN

  = ½ log (1 + P/N) bits/transmission

SLIDE 56

The Gaussian Channel

  • The capacity of the discrete time additive Gaussian channel:

  C = ½ log (1 + P/N) bits/transmission

  • achieved with X ~ N(0, P).
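A minimal sketch of the formula (the power and noise values are arbitrary examples):

```python
import math

def awgn_capacity(P, N):
    """C = (1/2) log2(1 + P/N) bits per channel use."""
    return 0.5 * math.log2(1 + P / N)

print(awgn_capacity(1, 1))    # 0.5 bit at SNR = 1
print(awgn_capacity(15, 1))   # 2.0 bits at SNR = 15
# Doubling the power buys less and less: the log flattens out.
for P in (1, 2, 4, 8, 16):
    print(P, awgn_capacity(P, 1))
```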

SLIDE 57

Bandlimited Gaussian Channel

  • Consider the channel with continuous waveform inputs x(t) with power constraint (1/T) ∫_0^T x²(t) dt ≤ P and bandwidth limited to W. The channel has white Gaussian noise with power spectral density N0/2 watt/Hz.

  • In the interval (0, T) we can specify the code waveform by 2WT samples (Nyquist criterion). We can transmit these samples over discrete time Gaussian channels with noise variance N0/2. This gives

  C = W log (1 + P/(N0 W)) bits/second

SLIDE 58

Bandlimited Gaussian Channel

  • C = W log (1 + P/(N0 W)) bits/second

  • Note: If W → ∞ we have C = (P/N0) log2 e bits/second.

SLIDE 59

Bandlimited Gaussian Channel

  • Let η = R/W be the spectral efficiency in bits per second per Hertz. Also let P = Eb·R, where Eb is the available energy per information bit.

  • We get

  η = R/W ≤ C/W = log (1 + η·Eb/N0)

  • Thus

  Eb/N0 ≥ (2^η − 1)/η

This relation defines the so called Shannon Bound.
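A short sketch that evaluates the bound (it reproduces the table on the next slide):

```python
import math

def eb_n0_min(eta):
    """Minimum Eb/N0 for spectral efficiency eta: (2^eta - 1)/eta; limit ln 2 as eta -> 0."""
    return math.log(2) if eta == 0 else (2 ** eta - 1) / eta

for eta in (0, 0.1, 0.25, 0.5, 1, 2, 4, 8):
    r = eb_n0_min(eta)
    print(eta, round(r, 3), round(10 * math.log10(r), 2), "dB")
# eta -> 0 gives ln 2 = 0.693, i.e. -1.59 dB: the ultimate Shannon limit.
```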

SLIDE 60

The Shannon Bound

  Eb/N0 ≥ (2^η − 1)/η

  η      Eb/N0    Eb/N0 (dB)
  0      0.69     −1.59
  0.1    0.718    −1.44
  0.25   0.757    −1.21
  0.5    0.828    −0.82
  1      1        0
  2      1.5      1.76
  4      3.75     5.74
  8      31.87    15.03

[Plot: the Shannon Bound: spectral efficiency η versus Eb/N0 in dB.]
SLIDE 61

Shannon’s Water Filling Solution

SLIDE 62

Parallel Gaussian Channels

[Figure: water filling over parallel channels with noise levels 2, 1 and 3; water level 2.5.]

SLIDE 63

Example of Water Filling

  • Channels with noise levels 2, 1 and 3.

  • Available power = 2

  • Capacity = ½ log (1 + 0.5/2) + ½ log (1 + 1.5/1) + ½ log (1 + 0/3)

  • Level of noise + signal power = 2.5

  • No power allocated to the third channel.
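A sketch of the water-filling computation for this example (bisection on the water level ν):

```python
import math

def water_fill(noise, total_power, iters=60):
    """Per-channel powers max(nu - N_i, 0), with the water level nu chosen
    so that the allocated powers add up to total_power."""
    lo, hi = min(noise), min(noise) + total_power
    for _ in range(iters):
        nu = (lo + hi) / 2
        used = sum(max(nu - N, 0.0) for N in noise)
        lo, hi = (nu, hi) if used < total_power else (lo, nu)
    return [max(nu - N, 0.0) for N in noise], nu

powers, nu = water_fill([2, 1, 3], 2.0)
print(powers, nu)   # ~[0.5, 1.5, 0.0], water level 2.5, as on the slide
C = sum(0.5 * math.log2(1 + P / N) for P, N in zip(powers, [2, 1, 3]))
print(C)            # about 0.82 bits/transmission
```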

SLIDE 64

Parallel Gaussian Channels

[Figure: water filling over parallel channels with noise levels 2, 1 and 3; water level 2.5.]

SLIDE 65

Differential capacity

Discrete memoryless channel as a band limited channel

SLIDE 66

Multiplex strategies (TDMA, FDMA)

[Figure: M users sharing the channel in orthogonal slots, each with power P.]

  C_j = ½ log (1 + P/N)

Aggregate capacity: C = Σ_{j=1}^{M} C_j

SLIDE 67

Multiplex strategies (non-orthonal CDMA)

Discrete memoryless channel as a band limited channel

[Figure: M users superimposed on the same band, each with power P.]

  R_j = ½ log ( 1 + P / (N + (j−1)P) ),  j = 1, …, M

Aggregate capacity: C = Σ_{j=1}^{M} R_j = ½ log (1 + MP/N)
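The telescoping behind the aggregate formula can be checked numerically (M, P, N are arbitrary example values):

```python
import math

# Successive decoding: when user j is decoded, j-1 not-yet-cancelled users
# are still seen as noise, so the per-user rates telescope into a single log.
M, P, N = 8, 1.0, 0.5
rates = [0.5 * math.log2(1 + P / (N + (j - 1) * P)) for j in range(1, M + 1)]
print(sum(rates))                        # aggregate of the per-user rates
print(0.5 * math.log2(1 + M * P / N))    # (1/2) log2(1 + MP/N): same value
```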

SLIDE 68

TDMA or FDMA versus CDMA

[Plot: aggregate capacity versus number of users. Orthogonal schemes saturate due to the bandwidth limitation (2WT dimensions); non-orthogonal CDMA keeps growing (the log has no cap).]

SLIDE 69

Multiple User Information Theory

  • Building Blocks:
  • Multiple Access Channels (MACs)
  • Broadcast Channels (BCs)
  • Interference Channels (IFCs)
  • Relay Channels (RCs)

  • Note: These channels have their discrete memoryless and Gaussian versions. For simplicity we will look at the Gaussian models.

SLIDE 70

Multiple Access Channel

  • A well understood model.

  • Models the uplink channel in wireless comm.

[Diagram: Encoder 1 sends X1 and Encoder 2 sends X2 over P(y | x1, x2); the decoder observes Y and outputs Ŵ1, Ŵ2.]

Capacity region obtained by Ahlswede (1971) and Liao (1972)

SLIDE 71

Capacity region - MAC

  • C = closure of convex hull of { (R1, R2) s.t.

  R1 ≤ I(X1; Y | X2),
  R2 ≤ I(X2; Y | X1),
  R1 + R2 ≤ I(X1, X2; Y)

  for all p1(x1) · p2(x2) }
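For the Gaussian MAC (the focus from here on), a sketch of the corner points of this pentagon (powers and noise are hypothetical example values):

```python
import math

def C(x):
    """Gaussian capacity term: (1/2) log2(1 + SNR)."""
    return 0.5 * math.log2(1 + x)

P1, P2, N = 1.0, 2.0, 1.0
print(C(P1 / N))             # R1 bound: I(X1; Y | X2)
print(C(P2 / N))             # R2 bound: I(X2; Y | X1)
print(C((P1 + P2) / N))      # sum-rate bound: I(X1, X2; Y)

# Corner point: decode user 2 first treating user 1 as noise, then user 1 cleanly.
R2_corner = C(P2 / (N + P1))
print(R2_corner + C(P1 / N)) # equals the sum-rate bound: the corner lies on the boundary
```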

SLIDE 72

Multiple Access Channel (MAC)

SLIDE 73

Broadcast Channel (Cover, 1972)

  • Still open in the discrete memoryless case.

  • Models the downlink channel in wireless comm.

[Diagram: the encoder maps W1, W2 to X, sent over P(y1, y2 | x); Decoder 1 observes Y1 and outputs Ŵ1; Decoder 2 observes Y2 and outputs Ŵ2.]

Cover introduced superposition coding.

SLIDE 74

Superposition coding

[Figure: message clouds. Y1 can see the individual message; Y2 can see the cloud center.]

SLIDE 75

Gaussian Broadcast Channel

SLIDE 76

Superposition coding

N2 (1-)P P 1

SLIDE 77

Superposition coding

N2 (1-)P P 1

SLIDE 78

Interference Channel

  • Gaussian Interference Channel - standard form
  • Brief history
  • Z-Interference channel
  • Symmetric Interference channel

SLIDE 79

Standard Gaussian Interference Channel

[Diagram: standard Gaussian interference channel. Sender 1 (power P1) encodes W1 and Sender 2 (power P2) encodes W2; cross-link gains a and b; each receiver decodes its own message, Ŵ1 and Ŵ2.]

SLIDE 80

Symmetric Gaussian Interference Channel

[Diagram: symmetric Gaussian interference channel: both senders have power P and equal cross gains a.]

SLIDE 81

Z-Gaussian Interference Channel

SLIDE 82

Interference Channel: Strategies

Things that we can do with interference:

1. Ignore (treat interference as noise (IAN))

2. Avoid (divide the signal space (TDM/FDM))

3. Partially decode both interfering signals

4. Partially decode one, fully decode the other

5. Fully decode both (only good for strong interference, a ≥ 1)

SLIDE 83

Interference Channel: Brief history

  • Carleial (1975): Very strong interference does not reduce capacity (a² ≥ 1 + P)

  • Sato (1981), Han and Kobayashi (1981): Strong interference (a² ≥ 1): the IFC behaves like 2 MACs

  • Motahari, Khandani (2007), Shang, Kramer and Chen (2007), Annapureddy, Veeravalli (2007): Very weak interference (2a(1 + a²P) ≤ 1): treat interference as noise (IAN)

SLIDE 84

Interference Ch.: History (continued)

  • Sason (2004): Symmetrical superposition to beat TDM

  • Etkin, Tse, Wang (2008): capacity to within 1 bit

  • Polyanskiy and Wu (2016): corner points established

SLIDE 85

Summary: Z interference Channels

  • Z-Gaussian Interference Channel as a degraded interference channel

  • Discrete Memoryless Channel as a band limited channel

  • Multiplex Region: growing Noisebergs

  • Overflow Region: back to superposition

SLIDE 86

Z-Gaussian Interference Channel

SLIDE 87

Degraded Gaussian Interference Channel

SLIDE 88

Differential capacity

Discrete memoryless channel as a band limited channel

SLIDE 89

Interference x Broadcast Channels

SLIDE 90

Superposition coding

N2 (1-)P P 1

SLIDE 91

Superposition coding

N2 (1-)P P 1

SLIDE 92

Degraded Interference Channel

  • One Extreme Point
SLIDE 93

Degraded Interference Channel

  • Another Extreme Point
SLIDE 94

Intermediary Points (Multiplex Region)

SLIDE 95

Admissible region for (α, h)

SLIDE 96

Intermediary Point (Overflow Region)

SLIDE 97

Admissible region

[Plot: admissible region in the (α, h) plane for P1 = 1, P2 = 1, a = 0.5, N2 = 3.]

SLIDE 98

The Z-Gaussian Interference Channel Rate Region

[Plot: the rate region (R1, R2) for P1 = 1, P2 = 1, a = 0.5, N2 = 3.]

SLIDE 99

Admissible region

[Plot: admissible region in the (α, h) plane for P1 = 1, P2 = 1, a = 0.99, N2 = 0.02.]

SLIDE 100

The Z-Gaussian Interference Channel Rate Region

[Plot: the rate region (R1, R2) for P1 = 1, P2 = 1, a = 0.99, N2 = 0.02.]

SLIDE 101

Some Remarks

Simple 2-D parameter space: (α, h)

The Noiseberg region is the optimized achievable region for Gaussian signaling (Gaussian Han and Kobayashi region) [Zhao et al., 2012]

SLIDE 102

Symmetric Gaussian Interference Channel

[Diagram: symmetric Gaussian interference channel; both senders have power P.]

SLIDE 103

Symmetric Interference Channels (joint work with Chandra Nair, CUHK)

  • Discrete time channel seen as a band limited channel – differential capacity

  • Concave envelopes

  • Symmetric and Asymmetric Superposition

  • Phase transitions in parameter space

SLIDE 104

Interference channel: Spectra at Y1 and Y2

[Figure: received spectra. Both at Y1 and at Y2: own signal of power P, interference of power a²P, and noise.]

IAN:  R1 + R2 ≤ log ( 1 + P/(1 + a²P) )

SLIDE 105

Interference Channel: TDM/FDM:

[Figure: TDM/FDM: the signals occupy disjoint halves of the band, so each receiver sees signal power 2P over noise and no interference.]

  R1 + R2 ≤ ½ log (1 + 2P)
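A small sketch comparing the two sum rates over a grid (unit noise assumed, as in the slides):

```python
import math

def ian_sum(P, a2):
    """Sum rate with interference treated as noise."""
    return math.log2(1 + P / (1 + a2 * P))

def tdm_sum(P):
    """Sum rate for TDM/FDM: each user bursts at power 2P half the time."""
    return 0.5 * math.log2(1 + 2 * P)

for a2 in (0.25, 0.5, 0.75):
    for P in (1, 10, 100, 1000):
        print(a2, P, "IAN" if ian_sum(P, a2) > tdm_sum(P) else "TDM")
# At a2 = 0.25 the winner switches with P; for a2 >= 0.5 TDM dominates,
# matching the "no intersection beyond a2 = 0.5" observation two slides ahead.
```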

SLIDE 106

Concave Envelope

[Plot: rate sum versus power P for IAN and TDM/FDM at a² = 0.25; tangent points mark where the concave envelope switches between IAN and TDM.]

SLIDE 107

Multiplex domination

[Plot: rate sum versus power P for IAN and TDM/FDM at a² = 0.5.]

No intersection beyond a² = 0.5.

SLIDE 108

Interference as Noise and TDM/FDM

[Plot: regions of the (a², P) plane where IAN and TDM dominate; the superposition-prone region satisfies a² + a⁴P > 1.]

SLIDE 109

Rate Sum for IAN and TDM/FDM

[3D plot: rate sum for IAN and TDM/FDM over the (a², P) plane.]

SLIDE 110

Superposition: partially decoding

[Figure: spectra at Y1 and Y2 under superposition: private parts V1, V2 with power αP (seen as a²αP at the other receiver) and common parts U1, U2 with power (1−α)P (seen as a²(1−α)P).]

  R1 + R2 ≤ log [ (1 + P + a²P) / ( (1−a²)/a² + a²(1 + a²P) ) ],  Sason (2004)

SLIDE 111

Point where Symmetric Superposition starts beating TDM/FDM

[Plot: rate sum versus a² at P = 50 for TDM/FDM, IAN, and symmetric superposition (Sason).]

SLIDE 112

Rate Sum, a2=0.05: Need convexification

[Plot: rate sum versus power P at a² = 0.05 for IAN, TDM/FDM, and Sason's superposition; the upper envelope needs convexification.]

SLIDE 113

Rate sum for P=1000, 0≤a2≤1

[Plot: R1 + R2 versus a² for P = 1000: IAN, TDM, symmetric superposition, and asymmetric superposition. Rate sum values before convexification along P.]

SLIDE 114

Symmetric superposition:

P a2 Sason’s Band

Above Sason’s Band Below Sason’s band

SLIDE 115

Symmetric Superposition (continued):

  • Optimal choice for α = α1 = α2:

  • Case 1: If (1−a²)/a⁴ ≤ P ≤ (1−a⁶)/(a⁶(1−a²))  (Sason's Band)

  then set αP = a²(1 + a²P) − 1;

  • Case 2: If P ≥ (1−a⁶)/(a⁶(1−a²))  (Above Sason's Band)

  then set αP = (1−a²)/(a²(1+a²)).  Note: Invariant with P.

SLIDE 116

Symmetric Superposition (continued):

 In Sason’s Band:  𝑆1+𝑆2 ≤ log

𝑏2 1+𝑄+𝑏2𝑄 1−𝑏2+𝑏4(1+𝑏2𝑄)

 Above Sason’s Band:  𝑆1+𝑆2 ≤

1 2 log 1+𝑏2 2 1+𝑄+𝑏2𝑄 4𝑏2

SLIDE 117

The hummingbird function:

[3D plot: rate sum over (α1, α2): the hummingbird function.]

SLIDE 118

The shroud function

[3D plot: rate sum over (α1, α2): the shroud function.]

SLIDE 119

Min (hummingbird, shroud)

[3D plot: min of the hummingbird and shroud functions over (α1, α2).]

SLIDE 120

Flapping wings

[3D plot: the resulting rate-sum surface over (α1, α2): flapping wings.]

SLIDE 121

Asymmetric-Superposition vs TDM/FDM

[Plot: the (a², P) plane split into a region where TDM/FDM is better and a region where asymmetric superposition is better.]
SLIDE 122

Phase Transitions in Weak Interference

[Plot: phase diagram in the (a², P) plane with regions for TDM/FDM, symmetric superposition, and asymmetric superposition.]

Note: Transitional regions due to convexification along P not included.

SLIDE 123

Pairwise Phase Transitions

[Plot: pairwise phase-transition curves in the (a², P) plane: Sym-Sup vs. TDM, Asym-Sup vs. TDM, Sym-Sup vs. Asym-Sup, and the lower and upper limits of Sason's band.]

SLIDE 124

A pleasant resemblance

SLIDE 125

Asymptotically as P  ∞

0 < a² < 0.087: symmetric superposition is best

0.087 < a² < 1: asymmetric superposition is best

SLIDE 126

As before: Need convexification along P

SLIDE 127

Remarks

  • Powerful tool: concave envelopes to transition from one mode to another (time sharing between modes)

  • Shown a full taxonomy of phase transitions in the (a², P) parameter space with 0 < a² < 1, P > 0:

  • 4 pure modes (IAN, TDM, Symmetric Superposition, and Asymmetric Superposition) and

  • 4 transitional regions (IAN vs. TDM, TDM vs. Sym-Sup, TDM vs. Asym-Sup, and Sym-Sup vs. Asym-Sup)

SLIDE 128

The Relay Channel

  • The least understood. Capacity not known.

  • Upper bound: Cut set bound

Recent result: the cut set bound is not tight (Wu and Özgür, 2015)

  • Lower bounds: Decode-and-Forward, Compress-and-Forward, Compute-and-Forward.

[Diagram: Sender X → channel; Relay observes Y1 and transmits X1; Receiver observes Y.]

SLIDE 129

The Relay Channel

  • The relay channel is said to be physically degraded if p(y, y1 | x, x1) = p(y1 | x, x1) p(y | y1, x1).

  • So Y is a degradation of the relay signal Y1.

  • Theorem: C = sup_{p(x, x1)} min { I(X, X1; Y), I(X; Y1 | X1) }

[Diagram: degraded relay channel; the sup is over the input distribution p(x, x1).]

SLIDE 130

I.T. - Applications to Biology

  • BCH error correcting codes have been found in DNA sequences: DNA sequences are identified as codewords of BCH codes over GF(4).

  • L.C.B. Faria, A.S.L. Rocha, J.H. Kleinschmidt, R. Palazzo Jr. and M.C. Silva-Filho

  • "The question raised by researchers in the field of mathematical biology regarding the existence of error-correcting codes in the structure of the DNA sequences is answered positively. It is shown, for the first time, that DNA sequences such as proteins, targeting sequences and internal sequences are identified as codewords of BCH codes over Galois fields."

  • Electronics Letters, vol. 46, no. 3, 4 Feb 2010

SLIDE 131
I. T. - Applications to Economics

  • Stock Market:

  • Portfolio b = (b1 b2 … bm), bi ≥ 0, Σ bi = 1

  • Stock vector X = (x1, x2, …, xm), with xi ≥ 0 for i = 1, 2, …, m.

  • xi represents the final price relative to the initial price on day i. For example, xi = 1.03 represents a 3% gain that day.

  • The wealth after n days using portfolio b is

  Sn = Π_{i=1}^{n} bᵀ Xi

SLIDE 132

Optimal portfolio

  • Def.: The growth rate of a stock portfolio b w.r.t. a stock market distribution F(x) is

  W(b, F) = E log bᵀX.

  • Def.: The optimal growth rate W*(F) is

  W*(F) = max_b W(b, F)

  • Theorem: The optimal wealth after n days behaves as Sn* ≈ 2^{nW*} with probability 1.

SLIDE 133

Proof

  • By the strong Law of Large Numbers,

  (1/n) log Sn* = (1/n) Σ_{i=1}^{n} log b*ᵀXi → W* with probability 1.

  • Thus Sn* ≈ 2^{nW*} with probability 1.
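A toy simulation of these statements (the two-asset market is a made-up example: cash plus a stock that doubles or halves with equal probability):

```python
import math, random

random.seed(1)
outcomes = [(1.0, 2.0), (1.0, 0.5)]   # price-relative vectors X, equally likely

def growth_rate(b):
    """W(b, F) = E log2(b . X)."""
    return sum(0.5 * math.log2(b[0] * x1 + b[1] * x2) for x1, x2 in outcomes)

# Grid search for the log-optimal portfolio b* = (1 - t, t).
t = max((i / 1000 for i in range(1001)), key=lambda t: growth_rate((1 - t, t)))
print(t, growth_rate((1 - t, t)))     # b* = (0.5, 0.5), W* = 0.0849 bits/day

# Wealth after n days concentrates around 2^(n W*).
n, b = 100, (0.5, 0.5)
Sn = 1.0
for _ in range(n):
    x = random.choice(outcomes)
    Sn *= b[0] * x[0] + b[1] * x[1]
print(Sn, 2 ** (n * growth_rate(b)))  # same order of magnitude
```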

SLIDE 134

Some work fronts

  • Joint source and channel coding
  • Coding for channels with side information
  • Distributed source coding
  • Network strategies
  • Merging of Network Coding and Multi User IT

SLIDE 135

Many thanks!

max@fee.unicamp.br