Information Transmission - PowerPoint PPT Presentation



SLIDE 1

OVE EDFORS ELECTRICAL AND INFORMATION TECHNOLOGY

Information Transmission
Chapter 5, Source coding

SLIDE 2

Learning outcomes

  • After this lecture the student should

– understand the basics of source coding,

– know what a prefix-free source code is,

– know how to calculate average codeword length,

– understand the limits on source coding,

– understand the concept of universal source coding, and

– be able to perform encoding and decoding according to the Lempel-Ziv-Welch algorithm

SLIDE 3

Where are we in the BIG PICTURE?

Source coding

Lecture relates to pages 179-189 in textbook.

SLIDE 4

What did Shannon promise?

  • We can represent a source sequence from X, of length n, uniquely by, on the average, nH(X) bits
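As a rough illustration of the nH(X) figure, the sketch below computes the uncertainty of a source and the resulting average number of bits for a sequence of length n. The three-symbol distribution and the value of n are assumptions chosen for the example, not values from the lecture.

```python
import math


def entropy(probabilities):
    """Uncertainty H(X) in bits per source symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical source with three symbols (not taken from the lecture).
probabilities = [0.5, 0.25, 0.25]
n = 1000  # assumed length of the source sequence

h = entropy(probabilities)
print(f"H(X) = {h} bits/symbol, so about {n * h} bits for n = {n} symbols")
```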

SLIDE 5

Prefix-free source code

We say that a sequence of length l is a prefix of another sequence if the first l symbols of the latter sequence are identical to the first sequence; in particular, a sequence is a prefix of itself. We then require that no codeword is a prefix of another codeword, and call such a code a prefix-free source code. The sequence 10011 has the prefixes 1, 10, 100, 1001, and 10011. The source code with codewords {00,01,1} is prefix-free, but {00,10,1} is not, since 1 is a prefix of 10.
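The definition can be checked mechanically. The following is a minimal sketch, with the helper names `prefixes` and `is_prefix_free` made up for this illustration; it compares every pair of distinct codewords.

```python
def prefixes(sequence):
    """All prefixes of a sequence, including the sequence itself."""
    return [sequence[:length] for length in range(1, len(sequence) + 1)]


def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(
        a != b and b.startswith(a) for a in codewords for b in codewords
    )


print(prefixes("10011"))
print(is_prefix_free({"00", "01", "1"}))
print(is_prefix_free({"00", "10", "1"}))
```

The pairwise comparison is quadratic in the number of codewords, which is fine for small codes like the ones on this slide.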

SLIDE 6

Example

Consider the source code [code table not recovered]. What is the average codeword length?
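The average codeword length is the sum, over all source symbols, of the symbol probability times the length of its codeword. The sketch below shows the calculation for an assumed code and assumed probabilities; they are hypothetical stand-ins, since the table of this example is not available here.

```python
def average_codeword_length(code, probabilities):
    """Sum of P(symbol) * len(codeword) over all symbols."""
    return sum(probabilities[symbol] * len(word) for symbol, word in code.items())


# Hypothetical code and probabilities (not the ones from the lecture example).
code = {"a": "1", "b": "00", "c": "01"}
probabilities = {"a": 0.5, "b": 0.25, "c": 0.25}

average = average_codeword_length(code, probabilities)
print(f"average codeword length = {average} binary digits per symbol")
```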

SLIDE 7

Huffman code construction

An optimal way to find a prefix-free variable-length code was discovered by David Huffman in 1951, while he was working on a term paper as a student.

Procedure:

1. Let the source symbols be nodes with their respective probabilities.
2. Combine the two least probable remaining nodes into a new node and calculate its probability.
3. If more than one node is left, go back to step 2.
4. Label each branch of the constructed tree either 0 or 1.
5. Traversing the tree from the root to each symbol gives the codewords.

[Figure: code tree starting at ROOT]
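The procedure above can be sketched in a few lines using a priority queue. This is one possible implementation under stated assumptions: the symbols are strings, the probabilities in the example are hypothetical, and the choice of which branch gets 0 or 1 is arbitrary, so other equally good codes exist.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Build a prefix-free code from a {symbol: probability} mapping."""
    tie_breaker = count()  # keeps heap entries comparable when probabilities tie
    # Step 1: every source symbol is a node with its probability.
    heap = [(p, next(tie_breaker), symbol) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    # Steps 2 and 3: merge the two least probable nodes until one node is left.
    while len(heap) > 1:
        p0, _, node0 = heapq.heappop(heap)
        p1, _, node1 = heapq.heappop(heap)
        heapq.heappush(heap, (p0 + p1, next(tie_breaker), (node0, node1)))
    root = heap[0][2]

    # Steps 4 and 5: label branches 0/1 and read codewords from root to leaf.
    codewords = {}

    def walk(node, path):
        if isinstance(node, tuple):  # inner node with two children
            walk(node[0], path + "0")
            walk(node[1], path + "1")
        else:  # leaf holding a source symbol
            codewords[node] = path

    walk(root, "")
    return codewords


# Hypothetical source, not the one used in the lecture.
source = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(source)
print(code)
```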

SLIDE 8

Path length lemma

In a rooted tree with probabilities, the average depth of the leaves is equal to the sum of the probabilities of the inner nodes (including the root).
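A small numerical check of the lemma, as a sketch: the tree is written as nested tuples whose leaves are probabilities, and both sides of the lemma are computed separately. The tree and its probabilities are assumed for illustration only.

```python
def average_leaf_depth(tree, depth=0):
    """Sum of (leaf probability * leaf depth) over all leaves."""
    if isinstance(tree, tuple):  # inner node: descend one level
        return sum(average_leaf_depth(child, depth + 1) for child in tree)
    return tree * depth  # leaf: its probability times its depth


def inner_node_sum(tree):
    """Return (probability of this node, sum of inner-node probabilities in its subtree)."""
    if not isinstance(tree, tuple):
        return tree, 0.0  # a leaf contributes no inner-node probability
    results = [inner_node_sum(child) for child in tree]
    probability = sum(p for p, _ in results)
    return probability, probability + sum(s for _, s in results)


# Hypothetical tree: leaves 0.5, 0.25, 0.125, 0.125 at depths 1, 2, 3, 3.
tree = (0.5, (0.25, (0.125, 0.125)))

depth_side = average_leaf_depth(tree)
root_probability, node_side = inner_node_sum(tree)
print(depth_side, node_side)
```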

SLIDE 9

Example

What is the average word length and the uncertainty of the source?

[Figure: code tree starting at ROOT]

Uncertainty/entropy and average word length [values not recovered]: the average word length is 2% away from what is possible.

SLIDE 10

Reaching the limit

If we encode consecutive source symbols pairwise, that is, use a Huffman code for the source whose symbols are pairs of the original symbols, we obtain an average codeword length per single source symbol that is closer to the uncertainty of the source, H(U) = 2.35.
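The effect of pairwise encoding can be tried out numerically. The sketch below assumes a memoryless binary source with probabilities 0.9 and 0.1, which is a hypothetical choice and not the lecture's source with H(U) = 2.35. It obtains the Huffman average codeword length by summing the probabilities of the merged nodes, which is the path length lemma applied to the Huffman tree.

```python
import heapq
import math
from itertools import product


def huffman_average_length(probabilities):
    """Average Huffman codeword length, as the sum of all merged-node probabilities."""
    heap = list(probabilities)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # each merge creates one inner node of the code tree
        heapq.heappush(heap, merged)
    return total


def entropy(probabilities):
    """Uncertainty in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical memoryless source (assumption: symbols are independent).
single = [0.9, 0.1]
pairs = [p * q for p, q in product(single, repeat=2)]

per_symbol_single = huffman_average_length(single)
per_symbol_pairs = huffman_average_length(pairs) / 2  # two source symbols per codeword
uncertainty = entropy(single)
print(per_symbol_single, per_symbol_pairs, uncertainty)
```

For this assumed source the pairwise code needs fewer binary digits per source symbol than the single-symbol code, while still staying above the uncertainty.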

SLIDE 11

A universal source coding algorithm

The LZW algorithm is due to Ziv, Lempel, and Welch and belongs to the class of so-called universal source-coding algorithms, which means that we do not need to know the source statistics. The algorithm is easy to implement, and for long sequences it approaches the uncertainty of the source; it is asymptotically optimum.

SLIDE 12

Basic procedure

1. Initialize the dictionary.
2. Find the longest string W in the dictionary that matches the current input.
3. Emit the dictionary index for W to output and remove W from the input.
4. Add W followed by the next symbol in the input to the dictionary.
5. Go to Step 2.

Suppose we want to compress the sentence:

DO_NOT_TROUBLE_TROUBLE_UNTIL_TROUBLE_TROUBLES_YOU!
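The basic procedure can be sketched as follows, together with a matching decoder. This is a minimal illustration, not the lecture's exact variant: it assumes the dictionary is initialized with the 256 single-byte characters and it emits plain integer indices rather than packed binary digits. The lecture's table is not reproduced in this transcript, so the number of emitted indices and binary digits from this sketch need not match it.

```python
def lzw_encode(text):
    """Return the list of dictionary indices for text."""
    dictionary = {chr(i): i for i in range(256)}  # step 1 (assumed initialization)
    w = ""
    indices = []
    for symbol in text:
        if w + symbol in dictionary:
            w += symbol  # step 2: keep extending the match
        else:
            indices.append(dictionary[w])  # step 3: emit index of longest match
            dictionary[w + symbol] = len(dictionary)  # step 4: add W + next symbol
            w = symbol
    if w:
        indices.append(dictionary[w])  # flush the final match
    return indices


def lzw_decode(indices):
    """Rebuild the text, reconstructing the same dictionary on the fly."""
    if not indices:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    w = dictionary[indices[0]]
    pieces = [w]
    for index in indices[1:]:
        if index in dictionary:
            entry = dictionary[index]
        elif index == len(dictionary):
            entry = w + w[0]  # index refers to the entry being created right now
        else:
            raise ValueError(f"invalid LZW index: {index}")
        pieces.append(entry)
        dictionary[len(dictionary)] = w + entry[0]
        w = entry
    return "".join(pieces)


sentence = "DO_NOT_TROUBLE_TROUBLE_UNTIL_TROUBLE_TROUBLES_YOU!"
encoded = lzw_encode(sentence)
print(len(sentence), "symbols ->", len(encoded), "indices")
print(lzw_decode(encoded) == sentence)
```

The decoder never receives the dictionary; it rebuilds it from the indices alone, which is what makes the scheme work without knowing the source statistics in advance.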

SLIDE 13

DO_NOT_TROUBLE_TROUBLE_UNTIL_TROUBLE_TROUBLES_YOU!

SLIDE 14

Evaluation

Without compression we need as many as 50*8 = 400 binary digits to represent the sentence as a string of 50 ASCII symbols. If we sum the number of binary digits needed for the 38 steps shown in the table, we get only 271 binary digits. A highly optimized version of the LZW algorithm we have described is widely used in practice to compress computer files in all major operating systems.

SLIDE 15

Summary

  • Most natural types of data can be compressed

  • The uncertainty, or entropy, of the source determines how much we can compress data without losing anything

  • Prefix-free variable-length source codes can be uniquely decoded

  • Huffman's code construction is optimal for a given set of symbols

  • Grouping symbols can reduce average codeword length, but not further down than what is given by the uncertainty/entropy

  • Universal source coding builds a coding table on the fly, without knowing the source probabilities in advance

  • Lempel-Ziv-Welch is the most well known universal source coder, applied in all modern operating systems for compressing files

SLIDE 16