[PPT] - CSE202: Design and Analysis of Algorithms Ragesh Jaiswal, CSE, UCSD PowerPoint Presentation

SLIDE 1

CSE202: Design and Analysis of Algorithms

Ragesh Jaiswal, CSE, UCSD

Ragesh Jaiswal, CSE, UCSD CSE202: Design and Analysis of Algorithms

SLIDE 2

Greedy Algorithms: One more example


SLIDE 5

Greedy Algorithms

Huffman coding

A wants to send an email to B but wants to minimize the amount of communication (number of bits communicated).

How do you encode an email into bits?

ASCII: 8 bits per character.
Is this the best way to encode the email given that the goal is to minimize the amount of communication?
Different letters occur with different frequencies in a standard English document.


SLIDE 6

Greedy Algorithms

Huffman coding

The encoding of “e” should be shorter than the encoding of “x”. In fact, Morse code was designed with this in mind.


SLIDE 7

Greedy Algorithms

Huffman coding

Suppose you receive the following Morse code from your friend:

• •−

What is the message?
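
One way to see why this question has no single answer: without separators, the signal dot-dot-dash can be split into code words in several ways. The following is a minimal sketch, assuming a five-letter subset of International Morse code and a hypothetical helper named `decodings` (neither appears on the slides).

```python
# Subset of International Morse code; '.' is a dot and '-' is a dash.
MORSE = {"E": ".", "T": "-", "A": ".-", "I": "..", "U": "..-"}

def decodings(signal):
    """Return every way to split `signal` into code words from MORSE."""
    if signal == "":
        return [""]
    results = []
    for letter, code in MORSE.items():
        if signal.startswith(code):
            # Consume one code word, then decode the rest recursively.
            for rest in decodings(signal[len(code):]):
                results.append(letter + rest)
    return results

print(sorted(decodings("..-")))  # ['EA', 'EET', 'IT', 'U']
```

The four readings arise because the code word for E is a prefix of the code words for A, I, and U.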


SLIDE 8

Greedy Algorithms

Huffman coding

Prefix-free encoding: An encoding f is called prefix-free if for any pair of distinct letters (a1, a2), f(a1) is not a prefix of f(a2).
Morse code is certainly not prefix-free.
Consider a binary tree with 26 leaves and associate each letter with a leaf in this tree.

Binary tree: A rooted tree where each non-leaf node has at most two children.

Label an edge 0 if this edge connects the parent to its left child and 1 otherwise.
f(x) = the sequence of edge labels on the path from the root to the leaf x.


SLIDE 9

Greedy Algorithms

Huffman coding

Consider a binary tree with 26 leaves and associate each letter with a leaf in this tree.

Binary tree: A rooted tree where each non-leaf node has at most two children.

Label an edge 0 if this edge connects the parent to its left child and 1 otherwise.
f(x) = the sequence of edge labels on the path from the root to the leaf x.
f(a) = 01, f(b) = 000, f(c) = 101, f(d) = 111. Is f prefix-free?
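
The prefix-free condition can be checked mechanically by comparing every pair of code words. The sketch below is one straightforward way to do it; the function name `is_prefix_free` and the two-letter Morse subset are illustrative assumptions.

```python
def is_prefix_free(code):
    """True if no code word is a prefix of a different symbol's code word."""
    words = list(code.values())
    for i, u in enumerate(words):
        for j, v in enumerate(words):
            if i != j and v.startswith(u):
                return False
    return True

# The encoding f from the slide.
f = {"a": "01", "b": "000", "c": "101", "d": "111"}
# Two Morse code words: E is a single dot, A is dot-dash.
morse_subset = {"E": ".", "A": ".-"}

print(is_prefix_free(f))             # True
print(is_prefix_free(morse_subset))  # False
```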


SLIDE 10

Greedy Algorithms

Huffman coding

Suppose you are given a prefix-free encoding g. Can you construct a binary tree with 26 leaves, associate each leaf with a letter, and label the edges as defined previously such that for every letter x, the sequence of edge labels on the path from the root to x equals g(x)?
For example: g(a) = 0, g(b) = 11, g(c) = 101, g(d) = 100.
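
The construction can be sketched by following each code word from the root and creating nodes as needed. This is a minimal sketch on the four-letter example rather than all 26 letters; the nested-dict tree representation and the helper names `build_tree` and `read_codes` are assumptions made for illustration, and code words are assumed to be non-empty.

```python
def build_tree(code):
    """Build a binary tree as nested dicts: key '0' is the left child, '1' the right.
    A leaf is stored as the symbol itself (a string)."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
            if not isinstance(node, dict):
                # We walked into an existing leaf: some code word is a prefix of this one.
                raise ValueError("code is not prefix-free")
        if word[-1] in node:
            # The leaf position is already occupied by a leaf or a subtree.
            raise ValueError("code is not prefix-free")
        node[word[-1]] = symbol
    return root

def read_codes(node, prefix=""):
    """Recover symbol -> edge-label string by walking every root-to-leaf path."""
    if isinstance(node, str):
        return {node: prefix}
    codes = {}
    for bit, child in node.items():
        codes.update(read_codes(child, prefix + bit))
    return codes

g = {"a": "0", "b": "11", "c": "101", "d": "100"}
tree = build_tree(g)
print(read_codes(tree) == g)  # True
```

Reading the labels back from the tree returns g exactly, which is the correspondence the slide asks about.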


SLIDE 12

Greedy Algorithms

Huffman coding

Problem (Huffman Coding): Given an alphabet Σ = (a1, ..., an) and the frequencies of occurrence of its letters (t(a1), ..., t(an)), find a prefix-free encoding f that minimizes:

O_f = |f(a1)| · t(a1) + |f(a2)| · t(a2) + ... + |f(an)| · t(an),

where |f(ai)| is the number of bits in f(ai).

Consider Σ = (a, b, c, d), t(a) = 0.6, t(b) = 0.2, t(c) = 0.1, t(d) = 0.1, and the prefix-free encoding given by the binary tree below. What is the value of O_f for this encoding?
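
The objective is a weighted sum of code-word lengths, so it is easy to evaluate for any candidate encoding. The tree referred to on the slide is not available in this transcript, so the sketch below assumes the encoding g from the earlier example (g(a) = 0, g(b) = 11, g(c) = 101, g(d) = 100) and compares it with a fixed-length 2-bit code; the function name `cost` is hypothetical.

```python
from fractions import Fraction

def cost(code, freq):
    """Weighted code length: sum over symbols of |f(a)| * t(a)."""
    return sum(len(code[s]) * freq[s] for s in freq)

# Frequencies from the slide, kept as exact fractions to avoid rounding noise.
t = {"a": Fraction(6, 10), "b": Fraction(2, 10),
     "c": Fraction(1, 10), "d": Fraction(1, 10)}

g = {"a": "0", "b": "11", "c": "101", "d": "100"}      # assumed example code
fixed = {"a": "00", "b": "01", "c": "10", "d": "11"}   # 2 bits per letter

print(float(cost(g, t)))      # 1.6
print(float(cost(fixed, t)))  # 2.0
```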


SLIDE 14

Greedy Algorithms

Huffman coding

Node depth: The depth of a vertex v, denoted by d(v), is the length of the path from the root to v.
Every binary tree gives a prefix-free encoding and every prefix-free encoding gives a binary tree.
We will now use these properties to rephrase the previous problem in terms of binary trees and depths of leaves.


SLIDE 15

Greedy Algorithms

Huffman coding

Problem (Huffman Coding): Given an alphabet Σ = (a1, ..., an) and the frequencies of occurrence of its letters (t(a1), ..., t(an)), find a binary tree T with n leaves (each leaf labeled with a unique letter) that minimizes:

O_f = d(a1) · t(a1) + d(a2) · t(a2) + ... + d(an) · t(an),

where d(ai) denotes the depth of the leaf labeled ai, which equals |f(ai)| for the encoding f given by T.

What are the properties of the optimal tree T*?

Claim 1: T* is a complete binary tree.

Complete binary tree: Every non-leaf node has exactly two children.


SLIDE 16

Greedy Algorithms

Huffman coding

Problem (Huffman Coding): Given an alphabet Σ = (a1, ..., an) and the frequencies of occurrence of its letters (t(a1), ..., t(an)), find a binary tree T with n leaves (each leaf labeled with a unique letter) that minimizes:

O_f = d(a1) · t(a1) + d(a2) · t(a2) + ... + d(an) · t(an),

where d(ai) denotes the depth of the leaf labeled ai, which equals |f(ai)| for the encoding f given by T.

What are the properties of an optimal tree T*?

Claim 1: Any T* is a complete binary tree.
Claim 2: Consider two letters x and y with the least frequencies. Then x and y have maximum depth in any optimal tree T*. Moreover, there is an optimal tree T* where x and y are siblings.


SLIDE 17

Greedy Algorithms

Huffman coding

Let Ω be a new symbol not present in Σ. Consider the following (smaller) problem:

Σ′ = (Σ − {x, y}) ∪ {Ω}
For all z ∈ Σ − {x, y}, t′(z) = t(z)
t′(Ω) = t(x) + t(y)
Find an optimal binary tree for the new alphabet Σ′ and new frequencies t′.

Let T′ be any optimal binary tree for the above problem.
Consider the leaf v labeled with Ω in T′.
Consider the tree T which is the same as T′ except that the node v has two children labeled x and y.
Claim 3: T is an optimal binary tree for the original problem.


SLIDE 18

Greedy Algorithms

Huffman coding

Algorithm Huffman-Tree

Let v1, ..., vn be the nodes, each denoting a letter
S ← {v1, ..., vn}
While (|S| > 1):
    Pick two nodes x, y in S with the least values of t(x) and t(y)
    Create a new node z and set t(z) ← t(x) + t(y)
    Set x as the left child of z and y as the right child of z
    S ← (S − {x, y}) ∪ {z}
Return the only node in S as the root node of the binary tree

What is the running time of the above algorithm?
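
The Huffman-Tree loop can be sketched in Python with a binary heap playing the role of S. This is one possible implementation, not necessarily the one intended in the lecture: the tuple-based tree representation, the tie-breaking counter, and the helper names `huffman_tree` and `codes` are assumptions. With a heap, each of the n − 1 iterations performs a constant number of O(log n) operations, so this version runs in O(n log n) time.

```python
import heapq
from itertools import count

def huffman_tree(freq):
    """Return the root of a Huffman tree.
    Leaves are symbols; internal nodes are (left, right) tuples.
    Symbols themselves must not be tuples."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(t, next(tiebreak), sym) for sym, t in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Pick the two nodes with the least frequencies.
        tx, _, x = heapq.heappop(heap)
        ty, _, y = heapq.heappop(heap)
        # Create their parent z with t(z) = t(x) + t(y) and put it back in S.
        heapq.heappush(heap, (tx + ty, next(tiebreak), (x, y)))
    return heap[0][2]

def codes(node, prefix=""):
    """Read code words off the tree: '0' for a left edge, '1' for a right edge."""
    if not isinstance(node, tuple):
        return {node: prefix or "0"}  # a one-symbol alphabet still needs one bit
    left, right = node
    return {**codes(left, prefix + "0"), **codes(right, prefix + "1")}

freq = {"a": 0.6, "b": 0.2, "c": 0.1, "d": 0.1}
print(codes(huffman_tree(freq)))
```

For these frequencies the resulting code word lengths are 1, 2, 3, and 3 for a, b, c, and d; the exact bit patterns depend on how ties are broken.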


SLIDE 19

Greedy Algorithms

Huffman coding

An example:

A DNA sequence has four characters A, C, T, G and these characters appear with frequency 30%, 20%, 10%, and 40% respectively. We have to encode a sequence of length 1 million in bits. If we use two bits for each character, the encoding will use 2 million bits. How many bits will be required if we do Huffman encoding?
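
The question can be checked numerically. The sketch below assumes the sequence contains exactly 30%, 20%, 10%, and 40% of A, C, T, and G, and uses the fact that the total encoded length of a Huffman tree equals the sum of the weights of all merged (internal) nodes, since each leaf's weight is counted once per ancestor, that is, once per unit of depth. The function name `huffman_cost` is hypothetical.

```python
import heapq

def huffman_cost(weights):
    """Total weighted code length of a Huffman tree: the sum of all merged weights."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

percent = {"A": 30, "C": 20, "T": 10, "G": 40}
length = 1_000_000
counts = [p * length // 100 for p in percent.values()]

print(huffman_cost(counts))  # 1900000
```

Under these assumptions the Huffman encoding uses 1.9 million bits, compared with 2 million bits for the fixed 2-bit encoding.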


SLIDE 20

Course Overview

Basic graph algorithms
Algorithm Design Techniques:
    Greedy Algorithms
    Divide and Conquer
    Dynamic Programming
    Network Flows
Computational Intractability


SLIDE 21

Divide and Conquer


SLIDE 22

Divide and Conquer

Introduction

You may have already seen multiple examples of Divide and Conquer algorithms:

Binary Search
Merge Sort
Quick Sort
Multiplying two n-bit numbers in O(n^{log_2 3}) time.


SLIDE 23

Divide and Conquer

Main Idea

Main Idea: Divide the input into smaller parts. Solve the smaller parts and combine their solutions.


SLIDE 24

Divide and Conquer

Merge Sort

Problem: Given an array of unsorted integers, output a sorted array.

Algorithm MergeSort(A)

If (|A| = 1) return(A)
Divide A into two equal parts AL and AR
BL ← MergeSort(AL)
BR ← MergeSort(AR)
B ← Merge(BL, BR)
return(B)
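
The pseudocode above leaves the Merge routine unspecified. The following is a minimal runnable sketch in Python; the two-pointer `merge` shown here is a standard choice and an assumption about what the slides intend, and the base case is widened to lists of length 0 or 1 so that empty input is also handled.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller front element; <= keeps equal elements in order.
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def merge_sort(a):
    """Sort a list by splitting it in half, sorting each half, and merging."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```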

SLIDE 35

Divide and Conquer

Merge Sort

Algorithm MergeSort(A)

If (|A| = 1) return(A)
Divide A into two equal parts AL and AR
BL ← MergeSort(AL)
BR ← MergeSort(AR)
B ← Merge(BL, BR)
return(B)

How do we argue correctness?
Proofs of correctness of Divide and Conquer algorithms are usually by induction.

Base case: This corresponds to the base cases of the algorithm. For MergeSort, the base case is that the algorithm correctly sorts arrays of size 1.
Inductive step: In general, this corresponds to correctly combining the solutions of smaller subproblems. For MergeSort, assuming the recursive calls sort AL and AR correctly, this is just proving that the Merge routine works correctly. This may again be done using induction and is left as an exercise.

SLIDE 38

Divide and Conquer

Merge Sort

Algorithm MergeSort(A)

If (|A| = 1) return(A)
Divide A into two equal parts AL and AR
BL ← MergeSort(AL)
BR ← MergeSort(AR)
B ← Merge(BL, BR)
return(B)

Let n be a power of 2 (e.g., n = 256).
Let T(n) denote the worst-case running time of the algorithm.
Claim 1: T(1) ≤ c for some constant c.
Claim 2: T(n) ≤ 2 · T(n/2) + cn for all n ≥ 2.
Together, T(n) ≤ 2 · T(n/2) + cn for n ≥ 2 and T(1) ≤ c are called a recurrence relation for the running time T(n).
How do we solve such a recurrence relation to obtain the value of T(n) as a function of n?


SLIDE 40

Divide and Conquer

Merge Sort

Recurrence relation for Merge Sort: T(n) ≤ 2 · T(n/2) + cn for n ≥ 2 and T(1) ≤ c.
How do we solve such a recurrence relation to obtain the value of T(n) as a function of n?

Unrolling the recursion: Rewrite T(n/2) in terms of T(n/4) and so on until a pattern for the running time with respect to all levels of the recursion is observed. Then, combine these and get the value of T(n).
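
The unrolling can be written out as follows. This is a sketch of the standard argument for n a power of 2, using only the recurrence T(n) ≤ 2 · T(n/2) + cn and T(1) ≤ c; the level index k is introduced here for illustration.

```latex
\begin{align*}
T(n) &\le 2\,T(n/2) + cn \\
     &\le 2\bigl(2\,T(n/4) + c\,n/2\bigr) + cn = 4\,T(n/4) + 2cn \\
     &\le 4\bigl(2\,T(n/8) + c\,n/4\bigr) + 2cn = 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &\le 2^{k}\,T\bigl(n/2^{k}\bigr) + k\,cn .
\end{align*}
% Stop at k = \log_2 n, where the subproblem size n / 2^k equals 1:
\[
T(n) \le n\,T(1) + cn\log_2 n \le cn\log_2 n + cn .
\]
```

Each of the log_2 n levels of the recursion contributes at most cn, and the n base cases contribute at most c each.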

SLIDE 41

Divide and Conquer

Merge Sort

Recurrence relation for Merge Sort: T(n) ≤ 2 · T(n/2) + cn for n ≥ 2 and T(1) ≤ c.
How do we solve such a recurrence relation to obtain the value of T(n) as a function of n?
So, the running time is T(n) ≤ cn · log2 n + cn = O(n log n).
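
The closed form can be sanity-checked numerically. The sketch below takes the recurrence with equality (the worst case it allows) and assumes c = 1 and n a power of 2; under those assumptions the solution is exactly cn · log2 n + cn, which is O(n log n). The function name `T` mirrors the slide's notation.

```python
from math import log2

def T(n, c=1):
    """The recurrence taken with equality; n must be a power of 2."""
    if n == 1:
        return c
    return 2 * T(n // 2, c) + c * n

# Compare the recurrence with the closed form n*log2(n) + n (for c = 1).
for n in [1, 2, 8, 256, 1024]:
    assert T(n) == n * log2(n) + n

print(T(256))  # 2304
```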


SLIDE 42

End
