

SLIDE 1

An Alphabet-Size Bound for the Information Bottleneck Function

ISIT 2020 Christoph Hirche, Andreas Winter

SLIDE 2

What for?

DNNs, video processing, clustering

  • C. Hirche – IBM bounds

2/16

SLIDE 5

Sufficient Statistics

Sufficient statistics are maps or partitions of X, S(X), that capture all the information that X has on Y. Namely, I(S(X); Y) = I(X; Y).

Minimal sufficient statistics, T(X), are the simplest sufficient statistics:

$$T(X) = \arg\min_{S(X)\,:\,I(S(X);Y) = I(X;Y)} I(S(X); X).$$

Approximate minimal sufficient statistics ⇔ Information Bottleneck:

$$\min_{S(X)\,:\,I(S(X);Y) \ge a} I(S(X); X)$$
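The definitions above can be checked numerically on a small example. The following is a minimal sketch, not taken from the talk: the toy distribution, the map `S`, and all function names are illustrative assumptions. It evaluates the relevance I(S(X); Y) and the cost I(S(X); X) for a deterministic map S and compares the relevance with I(X; Y).

```python
from collections import defaultdict
from math import log2

def mutual_information(joint):
    """I(A;B) in bits for a joint distribution given as {(a, b): prob}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

def statistic_vs_y(joint_xy, s):
    """Joint distribution of (S(X), Y) for a deterministic map s on X."""
    out = defaultdict(float)
    for (x, y), p in joint_xy.items():
        out[(s(x), y)] += p
    return dict(out)

def statistic_vs_x(joint_xy, s):
    """Joint distribution of (S(X), X), used for the cost I(S(X); X)."""
    out = defaultdict(float)
    for (x, _), p in joint_xy.items():
        out[(s(x), x)] += p
    return dict(out)

# Hypothetical toy source: X uniform on {0,1,2,3}; Y depends on X only via X // 2.
P_XY = {}
for x in range(4):
    p_one = 0.1 if x // 2 == 0 else 0.8
    P_XY[(x, 1)] = 0.25 * p_one
    P_XY[(x, 0)] = 0.25 * (1 - p_one)

S = lambda x: x // 2                      # candidate statistic (a partition of X)

relevance_full = mutual_information(P_XY)                      # I(X; Y)
relevance_stat = mutual_information(statistic_vs_y(P_XY, S))   # I(S(X); Y)
cost_stat = mutual_information(statistic_vs_x(P_XY, S))        # I(S(X); X)
print(relevance_full, relevance_stat, cost_stat)
```

In this toy case S is sufficient (the two relevance values agree up to rounding) while its cost I(S(X); X) is smaller than the cost of keeping X itself, which is the trade-off the bottleneck formulation relaxes through the threshold a.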

SLIDE 6

Application in ML

SLIDE 7

IB optimality?

From Shwartz-Ziv and Tishby: the DNN layers converge to fixed points of the IB equations.

SLIDE 10

Dimension Bounds

Generally known:

|W| ≤ |X| + 1.

But can we get bounds in terms of |Y|? Maybe approximate?

$$I_{XY}(R, N) \le I_{XY}(R) \le I_{XY}(R, N) + \delta(\epsilon, |Y|)$$

for some δ(ε, |Y|) and |W| ≤ N(ε, |Y|).

SLIDE 11

Recoverability

Lemma

Given a joint distribution $P_{XY}$ of two random variables X and Y, assume that there exist N probability distributions $Q_1, \ldots, Q_N$ on Y and a function f : X → [N] with the property that, for all x,

$$\frac{1}{2} \left\| P_{Y|X=x} - Q_{f(x)} \right\|_1 \le \epsilon$$

for some ε > 0. Then there exists a recovery channel S : [N] → X such that the Markov chain $Y - X - X' - \widehat{X}$ defined by $X' = f(X)$ and $P_{\widehat{X}|X'} = S$ satisfies

$$P_X = P_{\widehat{X}} \quad\text{and}\quad \frac{1}{2} \left\| P_{XY} - P_{\widehat{X}Y} \right\|_1 \le \epsilon' = 2\epsilon.$$
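The lemma can be illustrated numerically. The sketch below assumes one natural choice of recovery channel, the posterior S(x̂ | c) = P(X = x̂ | X' = c); the slides do not say which channel the proof uses, so this choice, the toy numbers, and all names are assumptions. It builds the joint distribution of (X̂, Y) and compares it with P_XY in total variation distance.

```python
from collections import defaultdict

def tv(p, q):
    """Total variation distance (1/2)*||p - q||_1 between probability dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def recovered_joint(p_x, p_y_given_x, f):
    """Joint law of (X_hat, Y) for X' = f(X) and S(x_hat | c) = P(X = x_hat | X' = c)."""
    p_cluster = defaultdict(float)      # P(X' = c)
    p_cluster_y = defaultdict(float)    # P(X' = c, Y = y)
    for x, px in p_x.items():
        p_cluster[f[x]] += px
        for y, pyx in p_y_given_x[x].items():
            p_cluster_y[(f[x], y)] += px * pyx
    joint = {}
    for (c, y), pcy in p_cluster_y.items():
        for x_hat, px in p_x.items():
            if f[x_hat] == c:           # S puts mass only on the preimage of c
                joint[(x_hat, y)] = pcy * px / p_cluster[c]
    return joint

# Hypothetical toy instance with |X| = 3, |Y| = 2 and N = 2 reference distributions.
p_x = {0: 0.5, 1: 0.3, 2: 0.2}
p_y_given_x = {0: {0: 0.90, 1: 0.10},
               1: {0: 0.85, 1: 0.15},
               2: {0: 0.20, 1: 0.80}}
f = {0: 0, 1: 0, 2: 1}
Q = {0: {0: 0.875, 1: 0.125}, 1: {0: 0.20, 1: 0.80}}

eps = max(tv(p_y_given_x[x], Q[f[x]]) for x in p_x)
p_xy = {(x, y): p_x[x] * pyx for x in p_x for y, pyx in p_y_given_x[x].items()}
p_xhat_y = recovered_joint(p_x, p_y_given_x, f)
print(eps, tv(p_xy, p_xhat_y))
```

With this choice the marginal of X̂ equals that of X by construction, and the distance between the two joint distributions stays below 2ε on this instance, as the lemma states.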

SLIDE 14

Bounds on N?

How large does N need to be? Easy: N ≤ |X|, but that’s still too big. In the worst case, we need to choose an ε-net of the probability simplex P(Y) of all probability distributions on Y with respect to the total variation distance, which results in

$$N \le \left(\frac{2}{\epsilon}\right)^{|Y|}.$$

Generally, one can do much better (e.g. for deterministic data sets).
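To see how far a concrete data set can fall below the worst-case net size, one can cover the conditional distributions P(Y|X=x) greedily. This is only an illustrative sketch: the greedy rule, the function names, and the toy deterministic data set are assumptions, not the construction from the talk.

```python
def tv(p, q):
    """Total variation distance (1/2)*||p - q||_1 between probability dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def greedy_cover(p_y_given_x, eps):
    """Pick centers among the rows P(Y|X=x) so each row is within eps of its center.

    Returns the list of centers (playing the role of Q_1..Q_N) and the map f.
    """
    centers, f = [], {}
    for x, row in p_y_given_x.items():
        for i, q in enumerate(centers):
            if tv(row, q) <= eps:
                f[x] = i                # reuse an existing center
                break
        else:
            centers.append(row)         # open a new center at this row
            f[x] = len(centers) - 1
    return centers, f

# Hypothetical deterministic data set: each x determines its label y = x mod 3.
eps = 0.1
p_y_given_x = {x: {x % 3: 1.0} for x in range(6)}
centers, f = greedy_cover(p_y_given_x, eps)

worst_case = (2 / eps) ** 3             # (2/eps)^{|Y|} with |Y| = 3
print(len(centers), worst_case)
```

For deterministic labels the rows are point masses, so at most |Y| centers are ever needed, far fewer than the ε-net bound suggests.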

SLIDE 15

IBM Bound

Lemma

Let $Y - X - \widehat{X}$ be a Markov chain. Then the IB function of $P_{XY}$ dominates the IB function of $P_{\widehat{X}Y}$ pointwise:

$$I_{XY}(R) \ge I_{\widehat{X}Y}(R) \quad \forall R.$$
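A one-step data-processing argument suggests why the domination holds. The sketch below assumes the common rate-constrained form of the IB function, $I_{XY}(R) = \max\{ I(W;Y) : Y - X - W,\ I(W;X) \le R \}$, which the slides do not spell out.

```latex
% Assumption: I_{XY}(R) := \max\{ I(W;Y) : Y - X - W,\ I(W;X) \le R \}.
\begin{align*}
&\text{Let } W \text{ be feasible for } P_{\widehat{X}Y}:\quad
  Y - \widehat{X} - W, \qquad I(W;\widehat{X}) \le R. \\
&\text{Composing } X \to \widehat{X} \to W \text{ gives the chain }
  Y - X - \widehat{X} - W, \text{ hence} \\
&I(W;X) \le I(W;\widehat{X}) \le R \qquad \text{(data processing)}, \\
&\text{while } P_{WY} \text{ and therefore } I(W;Y) \text{ are unchanged.} \\
&\text{So every value achievable for } P_{\widehat{X}Y}
  \text{ is achievable for } P_{XY}:\quad
  I_{XY}(R) \ge I_{\widehat{X}Y}(R).
\end{align*}
```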

SLIDE 16

Alphabet-Size bounds

Corollary

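Under the assumptions of our main lemma,

$$I_{X'Y}(R) \le I_{XY}(R) \le I_{X'Y}(R) + \delta(\epsilon, |Y|),$$

where

$$\delta(\epsilon, |Y|) := \epsilon' \log |Y| + (1+\epsilon')\, h\!\left(\frac{\epsilon'}{1+\epsilon'}\right).$$

Corollary

Under the assumptions of our main lemma,

$$I_{XY}(R, N) \le I_{XY}(R) \le I_{XY}(R, N) + \delta(\epsilon, |Y|),$$

where

$$\delta(\epsilon, |Y|) := \epsilon' \log |Y| + (1+\epsilon')\, h\!\left(\frac{\epsilon'}{1+\epsilon'}\right) \quad\text{and}\quad N \le \left(\frac{2}{\epsilon}\right)^{|Y|}.$$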
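To get a feel for the trade-off between the approximation error δ and the alphabet size N, both expressions can be evaluated directly. This is a small sketch under stated assumptions: h is taken to be the binary entropy, logarithms are base 2 (the slides do not fix the base), ε′ = 2ε as in the main lemma, and the function names are illustrative.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def delta(eps, y_size):
    """delta(eps, |Y|) = eps' log|Y| + (1 + eps') h(eps' / (1 + eps')), eps' = 2 eps."""
    eps_prime = 2 * eps
    return (eps_prime * log2(y_size)
            + (1 + eps_prime) * binary_entropy(eps_prime / (1 + eps_prime)))

def net_size_bound(eps, y_size):
    """Worst-case alphabet size (2 / eps)^{|Y|}."""
    return (2 / eps) ** y_size

# Smaller eps tightens the approximation but inflates the alphabet bound.
for eps in (0.1, 0.01, 0.001):
    print(eps, delta(eps, 4), net_size_bound(eps, 4))
```

Neither quantity depends on |X|, which is the point of the corollary.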

SLIDE 17

Quantum IB

SLIDE 18

QIB

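For a quantum state $\rho_{XY}$, we define

$$R_q(a) = \inf_{\substack{\mathcal{N}_{X \to W} \\ I(Y;W)_\sigma \ge a}} I(YR; W)_\sigma,$$

with

$$\sigma_{WYR} := (\mathcal{N}_{X \to W} \otimes \mathrm{id}_{YR})\, \Psi_{XYR}$$

and $\Psi_{XYR}$ a purification of $\rho_{XY}$.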

SLIDE 19

QIB

Lemma

For X and Y quantum, and W classical, an optimal solution for the quantum information bottleneck can be achieved with $|W| \le |Y|^2 |R|^2 + 1$.

Lemma

For Y quantum, but X and W classical, an optimal solution for the quantum information bottleneck can be achieved with |W| ≤ |X| + 1.

SLIDE 20

QIB

Lemma

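Given a classical-quantum state

$$\rho_{XY} = \sum_x p(x)\, |x\rangle\langle x| \otimes \rho_Y^x, \tag{1}$$

assume that there exist N quantum states $\sigma_Y^1, \ldots, \sigma_Y^N$ and a function f : X → [N] with the property that, for all x,

$$\frac{1}{2} \left\| \rho_Y^x - \sigma_Y^{f(x)} \right\|_1 \le \epsilon, \tag{2}$$

for given ε > 0. Then there exists a recovery channel S : [N] → X such that the Markov chain $Y - X - X' - \widehat{X}$ defined by $X' = f(X)$ and $P_{\widehat{X}|X'} = S$ satisfies

$$P_X = P_{\widehat{X}} \quad\text{and}\quad \frac{1}{2} \left\| \rho_{XY} - \rho_{\widehat{X}Y} \right\|_1 \le \epsilon' = 2\epsilon.$$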
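Condition (2) is a trace-distance condition on the conditional states. As a minimal illustration restricted to qubits (an assumption made here only to keep the sketch dependency-free), the trace distance between two qubit states equals half the Euclidean distance between their Bloch vectors. The states, the reference states, and the map `f` below are hypothetical.

```python
from math import sqrt

def qubit_trace_distance(r, s):
    """(1/2)*||rho - sigma||_1 for qubit states given by Bloch vectors r and s."""
    return 0.5 * sqrt(sum((a - b) ** 2 for a, b in zip(r, s)))

# Hypothetical conditional states rho_Y^x (as Bloch vectors) and reference states.
rho = {0: (0.0, 0.0, 1.0), 1: (0.1, 0.0, 0.95), 2: (0.0, 0.0, -1.0)}
sigma = {0: (0.05, 0.0, 0.975), 1: (0.0, 0.0, -1.0)}
f = {0: 0, 1: 0, 2: 1}

# Smallest eps for which condition (2) holds with this choice of sigma and f.
eps = max(qubit_trace_distance(rho[x], sigma[f[x]]) for x in rho)
print(eps)
```

For higher-dimensional Y the same check would need the eigenvalues of the difference of the density matrices instead of Bloch vectors.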

SLIDE 21

QIB

For Y quantum, but X and W classical:

Corollary

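Under the assumptions of the previous lemma,

$$I^{cq}_{X'Y}(R) \le I^{cq}_{XY}(R) \le I^{cq}_{X'Y}(R) + \delta(\epsilon, |Y|),$$

where

$$\delta(\epsilon, |Y|) := \epsilon' \log |Y| + (1+\epsilon')\, h\!\left(\frac{\epsilon'}{1+\epsilon'}\right).$$

Corollary

Under the assumptions of the previous lemma,

$$I^{cq}_{XY}(R, N) \le I^{cq}_{XY}(R) \le I^{cq}_{XY}(R, N) + \delta(\epsilon, |Y|),$$

where δ(ε, |Y|) is as before and

$$N \le \left(\frac{3}{\epsilon}\right)^{2|Y|^2}.$$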

SLIDE 22

The End

Summary:

  • New approach to alphabet-size bounds via recoverability.
  • New bounds on approximating the IB with alphabet size limited by |Y| (instead of |X|).

Open Questions:

  • Other applications of the recoverability approach.
  • Fully quantum case. (Stay tuned for more on this soon¹.)

Thanks!!

1 M. Christandl, CH, AW, in preparation, 2020
