Application of Information Theory, Lecture 8
Kolmogorov Complexity and Other Entropy Measures
Iftach Haitner
Tel Aviv University.
December 16, 2014

◮ What is the description length of the following strings?
◮ Berry’s paradox: Let s be “the smallest positive integer that cannot be described in less than twelve English words”.
◮ The above is a definition of s, of less than twelve English words...
◮ Solution: the word “described” above in the definition of s is not well defined.

◮ For a string x ∈ {0, 1}∗, let K(x) be the length of the shortest C++ program that outputs x (and halts).
◮ Now the term “described” is well defined.
◮ Why C++?
◮ All (complete) programming languages/computational models are equivalent up to an additive constant:
◮ Let K′(x) be the description length of x in another complete language; then |K(x) − K′(x)| ≤ const, via a fixed translator program.
◮ What is K(x) for x = 0101010101 . . . 01 (n repetitions of 01)?
◮ “for i = 1 to n: print 01”
◮ K(x) ≤ log n + const
◮ This is considered to be small complexity. We typically ignore log n terms.
◮ What is K(x) for x being the first n digits of π?
◮ K(x) ≤ log n + const (a constant-size program computes the digits of π; only n must be written down).
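
Although K itself is not computable (as shown later in the lecture), any off-the-shelf compressor gives a computable upper bound in the same spirit. A minimal Python sketch of this proxy view (the helper name description_length and the use of zlib are our illustration, not part of the lecture):

    import os
    import zlib

    def description_length(x: bytes) -> int:
        # Bits in a zlib description of x: a crude, computable upper
        # bound in the spirit of K(x), up to a constant.
        return 8 * len(zlib.compress(x, 9))

    n = 10_000
    periodic = b"01" * n          # highly structured, like 0101...01
    random_x = os.urandom(2 * n)  # typically incompressible

    # The structured string compresses to far fewer bits than its length;
    # the random one does not (matching K ~ |x| for most strings).
    print(description_length(periodic), description_length(random_x))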

◮ What is K(x) for x ∈ {0, 1}^n with k ones?
◮ Recall that (n choose k) ≤ 2^{n·h(k/n)}, where h is the binary entropy function.
◮ Hence K(x) ≤ log n + n·h(k/n) + const: write down k and the index of x among the n-bit strings with k ones.
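
A quick numeric look at this bound (a sketch; the parameters n, k are arbitrary choices):

    from math import comb, log2

    def h(p: float) -> float:
        # binary entropy in bits
        return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

    n, k = 1000, 300
    index_bits = log2(comb(n, k))   # bits to name x among the weight-k strings
    print(index_bits, n * h(k / n)) # index_bits is at most n*h(k/n)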

◮ K(x) ≤ |x| + const
◮ Proof: “output x”
◮ Most sequences have high Kolmogorov complexity:
◮ At most 2^{n−1} (C++) programs of length ≤ n − 2
◮ 2^n strings of length n
◮ Hence, at least 1/2 of the n-bit strings have Kolmogorov complexity at least n − 1
◮ In particular, a random sequence has Kolmogorov complexity ≈ n (with high probability)

◮ K(x|y) — Kolmogorov complexity of x given y: the length of the shortest program that outputs x when given y as input.
◮ Chain rule: K(x, y) = K(x) + K(y|x), up to additive O(log) terms.

◮ Both quantities measure the amount of uncertainty or randomness in an object.
◮ Both measure the number of bits it takes to describe an object.
◮ Another property: Let X1, . . . , Xn be iid, then whp K(X1, . . . , Xn) ≈ n · H(X1).
◮ Proof: AEP (a typical string can be described by its index within the typical set).
◮ Example: for a coin flip with bias (0.7, 0.3), whp we get a string with complexity ≈ n · h(0.7) ≈ 0.88n.

◮ A program of length K(x) that outputs x compresses x into K(x) bits of description.
◮ Example: length of the human genome: 6 · 10^9 bits
◮ But the code is redundant
◮ The relevant number to measure the number of possible values is its Kolmogorov complexity
◮ No-one knows its value...

◮ The universal probability of x is PU(x) := ∑_{p : p()=x} 2^{−|p|} = Pr_{p←{0,1}^∞}[p() = x]
◮ Namely, the probability that if one picks a program at random, it prints x.
◮ Insensitive (up to a constant factor) to the computation model.
◮ Interpretation: PU(x) is the probability that you observe x in nature.
◮ Computer as an intelligence amplifier
◮ Theorem: 2^{−K(x)} ≤ PU(x) ≤ c · 2^{−K(x)} for some constant c. The lower bound is immediate, since the shortest program for x contributes 2^{−K(x)} to the sum; the interesting part is PU(x) ≤ c · 2^{−K(x)}.
◮ Hence, for X ∼ PU, it holds that H(X) ≈ E[K(X)], up to an additive constant.

◮ We need to find c > 0 such that K(x) ≤ log(1/PU(x)) + c for every x ∈ {0, 1}∗
◮ In other words, find a program to output x whose length is log(1/PU(x)) + c
◮ Idea: the program names the leaf of the Shannon code for PU in which x sits, at depth ≈ log(1/PU(x))
◮ Problem: PU is not computable
◮ Solution: compute better and better estimates of the tree of PU along the way

◮ The machine M: initialize T to be the infinite binary tree.
◮ M emulates all programs in parallel, maintaining the estimate P̂U(x) := ∑_{p′ : emulated p′ has output x} 2^{−|p′|}, which converges to PU(x) from below.
◮ ∀x ∈ {0, 1}∗: as P̂U(x) improves, M adds a node (·, x, ·) to T at depth 2 + ⌈log(1/P̂U(x))⌉.
◮ The program never gets stuck (it can always add the node): since ∑_x 2^{−K(x)} ≤ 1, the proof follows by the Kraft inequality.
◮ For x ∈ {0, 1}∗, let ℓ(x) be the location of its (2 + ⌈log(1/PU(x))⌉)-deep node in T.
◮ Program for printing x: run M till it assigns the node at the location ℓ(x), then output its label; its length is |ℓ(x)| + const ≤ log(1/PU(x)) + c.

◮ (Another) proof that there are infinitely many primes:
◮ Assume there are finitely many primes p1, . . . , pm
◮ Any length-n integer x can be written as x = ∏_{i=1}^m pi^{di}
◮ Each di ≤ n, hence writing di takes ≤ log n bits
◮ Hence, K(x) ≤ m · log n + const
◮ But for most n-bit numbers K(x) ≥ n − 1, a contradiction for large enough n (since m · log n + const < n − 1).
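
To see the clash between the two bounds, compare their growth (a small sketch; m = 100 is an arbitrary choice):

    import math

    # With m fixed primes, the description bound m*log2(n) grows far slower
    # than the complexity ~ n of a typical n-bit number.
    m = 100
    for n in (10**3, 10**4, 10**5):
        print(n, m * math.log2(n))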

◮ Can we compute K(x)?
◮ Answer: No.
◮ Proof: Assume K is computable by a program of length C
◮ Let s be the smallest positive integer s.t. K(s) > 2C + 10,000
◮ s can be computed by the following program: go over s = 1, 2, . . . and output the first s with K(s) > 2C + 10,000 (a sketch follows below)
◮ Thus K(s) < C + log C + log 10,000 + const < 2C + 10,000, a contradiction
◮ Berry’s paradox, revisited:
◮ s — the smallest positive number with K(s) > 10,000
◮ This is not a paradox: since K is not computable, this description does not translate into a short program for s.
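
The program from the proof, as a Python sketch. The subroutine K is the assumed (and, by this very argument, impossible) program of length C; the stub below only marks where it would go:

    def K(s: int) -> int:
        # Assumed computable by a program of length C; no such program
        # exists, which is exactly what the proof establishes.
        raise NotImplementedError

    C = 1_000  # length of the assumed program for K

    def smallest_complex_integer() -> int:
        s = 1
        while True:
            if K(s) > 2 * C + 10_000:
                return s  # this short program outputs s, so K(s) is small
            s += 1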

◮ Can we give an explicit example of a string x with large K(x)?
◮ For most strings K(x) > C + 1, but this cannot be proven even for a single string (once C is large enough).
◮ “K(x) ≥ C” is an example of a theorem that cannot be proven, yet holds for most strings x.
◮ Proof: for an integer C define the program TC: enumerate all proofs, and if a proof of “K(x) ≥ C” for some x is found, print x and halt (a sketch follows below).
◮ |TC| = log C + D, where D is a const
◮ Take C such that C > log C + D
◮ If TC stops and outputs x, then K(x) < log C + D < C, a contradiction to the proven statement “K(x) ≥ C”. Hence TC never halts, i.e., no statement “K(x) ≥ C” is provable.
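
A Python sketch of TC. The helpers enumerate_proofs and concludes_lower_bound are hypothetical stand-ins for a proof enumerator and a proof checker of the formal system; they are not real APIs:

    C = 10_000

    def enumerate_proofs():
        # Hypothetical: yields every proof of the formal system, in order.
        raise NotImplementedError

    def concludes_lower_bound(proof, c: int):
        # Hypothetical: returns x if `proof` establishes "K(x) >= c",
        # and None otherwise.
        raise NotImplementedError

    def T_C() -> str:
        # |T_C| = log C + D bits: constant-size logic plus writing C.
        for proof in enumerate_proofs():
            x = concludes_lower_bound(proof, C)
            if x is not None:
                return x  # but then K(x) <= |T_C| < C: contradiction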

◮ Recall that the Shannon entropy of X is H(X) = ∑_{x∈X} −p(x) · log p(x) = E_X[− log p(X)]
◮ Max entropy of X is H0(X) = log |Supp(X)|
◮ Min entropy of X is H∞(X) = min_{x∈X}{− log p(x)} = − log max_{x∈X}{p(x)}
◮ Collision probability of X is CP(X) = ∑_{x∈X} p(x)^2
◮ Collision entropy/Rényi entropy of X is H2(X) = − log CP(X)
◮ For α ≠ 1 — Hα = (1/(1−α)) · log ∑_i pi^α = (α/(1−α)) · log ‖p‖α
◮ H∞(X) ≤ H2(X) ≤ H(X) ≤ H0(X) (Jensen)
◮ For instance, CP(X) ≤ ∑_x p(x) · max_{x′} p(x′) = max_{x′} p(x′). Hence, H2(X) ≥ H∞(X).
◮ H2(X) ≤ 2 H∞(X)
◮ Proof: CP(X) ≥ (max_{x′} p(x′))^2. Hence, − log CP(X) ≤ −2 log max_{x′} p(x′) = 2 H∞(X).
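
These quantities are easy to compare numerically. A small Python sketch (the example distribution is arbitrary):

    import math

    def entropies(p):
        # Returns (H_inf, H_2, H, H_0) in bits for a probability vector p.
        supp = [q for q in p if q > 0]
        H0 = math.log2(len(supp))
        H = -sum(q * math.log2(q) for q in supp)
        H2 = -math.log2(sum(q * q for q in supp))  # collision entropy
        Hinf = -math.log2(max(supp))               # min-entropy
        return Hinf, H2, H, H0

    # The four values come out in non-decreasing order, as the chain predicts.
    print(entropies([0.5, 0.25, 0.125, 0.125]))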

◮ No simple chain rule.
◮ Let X = ⊥ w.p. 1/2 and uniform over {0, 1}^n otherwise, and let Y be the indicator of the event X = ⊥.
◮ H∞(X|Y = 1) = 0 and H∞(X|Y = 0) = n. But H∞(X) = 1.
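
Checking the counterexample numerically (a sketch; n = 8 is arbitrary, with ⊥ modeled as one extra symbol):

    import math

    n = 8
    # X = bottom w.p. 1/2, else uniform over {0,1}^n; Y = 1 iff X = bottom.
    p_bot = 0.5
    p_str = 0.5 / 2**n                      # probability of each n-bit string

    Hinf_X = -math.log2(max(p_bot, p_str))  # = 1
    Hinf_X_given_y1 = -math.log2(1.0)       # X is determined: 0
    Hinf_X_given_y0 = -math.log2(2**-n)     # X is uniform: n
    print(Hinf_X, Hinf_X_given_y1, Hinf_X_given_y0)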

◮ Let X1, . . . , Xn be iid copies of X ∼ p, and write X^n = (X1, . . . , Xn) and p^n(x) = ∏_{i=1}^n p(xi).
◮ A_{n,ε} := {x ∈ Supp(X^n) : 2^{−n(H(X)+ε)} ≤ p^n(x) ≤ 2^{−n(H(X)−ε)}}
◮ − log p^n(x) ≥ n · (H(X) − ε) for any x ∈ A_{n,ε}
◮ Law of large numbers: for iid Z1, . . . , Zn, whp (1/n) · ∑_{j=1}^n Zj ≈ E[Z1]
◮ Taking Zi = − log p(Xi), it follows that Pr[X^n ∉ A_{n,ε}] → 0 as n → ∞
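
An empirical illustration of the typical set (a sketch, using the lecture's (0.7, 0.3) coin; n and eps are arbitrary choices):

    import math
    import random

    p, n, eps = 0.7, 10_000, 0.05
    H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # ~0.8813 bits

    def empirical_rate() -> float:
        # -(1/n) log2 p^n(x) for one sample x of n iid biased coin flips.
        x = [random.random() < p for _ in range(n)]
        return -sum(math.log2(p if b else 1 - p) for b in x) / n

    rates = [empirical_rate() for _ in range(100)]
    typical = sum(abs(r - H) <= eps for r in rates)
    print(f"H = {H:.4f}; {typical}/100 samples fall in the typical set")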

◮ Conditional version: whp p_{X^n|Y^n}(x^n | y^n) ≤ 2^{−n·(H(X|Y)−ε)}
◮ This yields a random variable W with:
◮ SD(W_{Y^n}, Y^n) = 0, and
◮ H(W | W_{Y^n} = y) ≥ n · (H(X|Y) − ε), for any y ∈ Supp(Y^n)

◮ Pairwise-independent families: a family G of functions from D to R such that for every x ≠ x′ ∈ D and every y, y′ ∈ R: Pr_{g←G}[g(x) = y ∧ g(x′) = y′] = (1/|R|)^2.
◮ Example: for D = {0, 1}^n and R = {0, 1}^m, let G = {g_{a,b}(x) := (a · x + b)_{1,...,m} : a, b ∈ {0, 1}^n}, with the arithmetic over GF(2^n) (a standard such family).
◮ 2-universal families: Pr_{g←G}[g(x) = g(x′)] ≤ 1/|R| for every x ≠ x′.
◮ Example for universal family that is not pairwise independent?
◮ Many-wise independent families are defined analogously.
◮ We identify functions with their description.
◮ Amazingly useful tool.
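
A runnable sketch of one standard pairwise-independent family, g_{A,b}(x) = Ax XOR b with A a random m-by-n bit matrix (an illustration; the lecture's own example may differ):

    import random

    n, m = 16, 4  # D = {0,1}^16, R = {0,1}^4

    def sample_g():
        # A random function from the family, described by (A, b).
        A = [random.getrandbits(n) for _ in range(m)]  # rows of A
        b = random.getrandbits(m)
        def g(x: int) -> int:
            y = 0
            for i, row in enumerate(A):
                y |= (bin(row & x).count("1") & 1) << i  # <row, x> over GF(2)
            return y ^ b
        return g

    # Empirical 2-universality: Pr[g(x) = g(x')] should be about 1/|R| = 1/16.
    x1, x2 = 0x1234, 0x4321
    trials = 20_000
    collisions = sum(1 for _ in range(trials) if (g := sample_g())(x1) == g(x2))
    print(collisions / trials)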

◮ Leftover Hash Lemma: let X be over {0, 1}^n with H2(X) ≥ k, and let G be a 2-universal family from {0, 1}^n to {0, 1}^m. Then, for g ← G: SD((g, g(X)), (g, U_m)) ≤ (1/2) · 2^{(m−k)/2}.
◮ Lemma: let p be a distribution over a finite set U with CP(p) ≤ (1 + δ)/|U|; then SD(p, U) ≤ √δ/2 (identifying U with the uniform distribution q(u) = 1/|U| over it).
◮ Proof: ‖p − q‖_2^2 = ∑_{u∈U}(p(u) − q(u))^2 = ‖p‖_2^2 + ‖q‖_2^2 − 2⟨p, q⟩ = CP(p) − 1/|U| ≤ δ/|U|
◮ Chebyshev sum inequality: (∑_{i=1}^n ai)^2 ≤ n · ∑_{i=1}^n ai^2
◮ Hence, ‖p − q‖_1^2 ≤ |U| · ‖p − q‖_2^2 ≤ δ
◮ Thus, SD(p, q) = (1/2) · ‖p − q‖_1 ≤ √δ/2.
◮ For the Leftover Hash Lemma: CP(g, g(X)) = (1/|G|) · Pr[g(x) = g(x′)] ≤ (1/|G|) · (2^{−k} + 2^{−m}) = (1 + 2^{m−k})/|G × {0, 1}^m|, so the lemma applies with δ = 2^{m−k}.
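
A numeric sanity check of the lemma used in the proof (a sketch; the random distribution p is arbitrary):

    import random

    N = 64                                    # |U|
    w = [random.random() for _ in range(N)]
    p = [x / sum(w) for x in w]               # a random distribution over U

    CP = sum(x * x for x in p)                # collision probability of p
    delta = CP * N - 1                        # so that CP = (1 + delta)/|U|
    SD = sum(abs(x - 1 / N) for x in p) / 2   # statistical distance to uniform
    print(SD, delta**0.5 / 2)                 # the lemma: SD <= sqrt(delta)/2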