
CPSC 320: Intermediate Algorithm Design and Analysis

July 28, 2014


Course Outline

  • Introduction and basic concepts
  • Asymptotic notation
  • Greedy algorithms
  • Graph theory
  • Amortized analysis
  • Recursion
  • Divide-and-conquer algorithms
  • Randomized algorithms
  • Dynamic programming algorithms
  • NP-completeness

Dynamic Programming


Dynamic Programming Components

  • Analyse the structure of an optimal solution
  • Separate one choice (usually the last) from a subproblem
  • Phrase the value of a choice as a function of the choice and the subproblem
  • Phrase an optimal solution as the value of the best choice
  • Usually a max/min result
  • Implement the calculation of the optimal value
  • Memoization: save optimal values as we compute them
  • Bottom-up: evaluate smaller problems and use them for bigger problems
  • Top-down: evaluate the big problem by calling smaller problems recursively and saving the results
  • Keep a record of the choice made at each level
  • Rebuild the optimal solution from the optimal value result
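The bottom-up and top-down bookkeeping above can be contrasted on a toy subproblem. The sketch below uses Fibonacci numbers purely as an illustration (not a problem from these slides): the bottom-up version fills in smaller answers first, while the top-down version recurses and memoizes.

```python
from functools import lru_cache

def fib_bottom_up(n):
    # Bottom-up: evaluate smaller problems first and reuse them.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

@lru_cache(maxsize=None)
def fib_top_down(n):
    # Top-down: recurse on smaller problems; lru_cache saves each
    # result as it is computed (memoization).
    return n if n < 2 else fib_top_down(n - 1) + fib_top_down(n - 2)
```

Both compute the same values; the difference is only in which subproblem is evaluated first and where the saved results live.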

Knapsack Problem

Algorithm Knapsack(x, q, N) -- x is the array of weights, q is the array of values, N is the weight limit
  For n = 0, 1, 2, …, N Do s[0, n] ← 0, m[0, n] ← false
  For j ← 1 To |x| Do
    For n ← 1 To N Do
      If x[j] > n Or s[j - 1, n] > s[j - 1, n - x[j]] + q[j] Then
        s[j, n] ← s[j - 1, n], m[j, n] ← m[j - 1, n]
      Else
        s[j, n] ← s[j - 1, n - x[j]] + q[j], m[j, n] ← j
  t ← ∅, y ← |x|
  While N > 0 And m[y, N] is not false Do
    y ← m[y, N], t ← t ∪ {y}, N ← N - x[y], y ← y - 1
  Return t
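A runnable sketch of the same table-filling idea in Python (function and variable names are mine, not from the slides): `s[j][w]` stores the best value using the first `j` items within weight `w`, and the chosen items are rebuilt by walking the table backwards.

```python
def knapsack(weights, values, limit):
    """0/1 knapsack by dynamic programming.
    Returns (best_value, chosen) where chosen lists item indices."""
    n = len(weights)
    # s[j][w] = best value using items 0..j-1 within capacity w
    s = [[0] * (limit + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for w in range(limit + 1):
            s[j][w] = s[j - 1][w]  # default: skip item j-1
            if weights[j - 1] <= w:
                take = s[j - 1][w - weights[j - 1]] + values[j - 1]
                if take > s[j][w]:
                    s[j][w] = take
    # Rebuild the chosen set by walking the table backwards
    chosen, w = [], limit
    for j in range(n, 0, -1):
        if s[j][w] != s[j - 1][w]:  # item j-1 was taken
            chosen.append(j - 1)
            w -= weights[j - 1]
    return s[n][limit], chosen
```

Note the two phases mirror the pseudocode: fill the value table, then trace the recorded choices to recover the solution itself.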


Knapsack Algorithm - Complexity

  • What is the time complexity of the knapsack algorithm?
  • O(n · N) (number of items times the weight limit)
  • This algorithm is called pseudo-polynomial
  • Time complexity depends on the value of the input (the limit N), not just its size
  • There is no known polynomial-time algorithm that solves the knapsack problem

Algorithm Strategies - Review

  • Dynamic programming algorithms:
  • Choice is made based on evaluation of all possible results
  • Time and space complexity are usually higher
  • Greedy algorithms:
  • Choice is made based on locally optimal solution
  • Usually faster, but may not result in globally optimal solution
  • Divide and conquer algorithms:
  • Choice of input division is made based on the assumption that merging the results of subproblems is optimal


Global Sequence Alignment Problem

  • Problem: given two sequences, analyse how similar they are
  • Allow both gaps and mismatches
  • Application:
  • Finding suggestions for misspelled words (comparing strings)
  • Comparing files (diff)
  • Analyse if two pieces of DNA match
  • Example: β€œocurrance” vs β€œoccurrence”
  • There is a letter β€œc” missing (gap)
  • An β€œa” was used instead of an β€œe” (mismatch)
  • Mismatches may be seen as gaps on both sides (at different positions)
  • β€œoc-urra-nce” vs β€œoccurr-ence”

Formal Definition

  • We represent a gap with a hyphen "-"
  • A sequence alignment of (Y, Z) is a pair (Y', Z') of sequences, such that:
  • Y' minus the gaps is Y; Z' minus the gaps is Z
  • |Y'| = |Z'| (the size is the same for both sides)
  • If Y'_j = "-", then Z'_j ≠ "-" (you can't have gaps on both sides at the same position)
  • A parameter ε > 0 defines the gap penalty (penalty if one side has a gap)
  • A parameter β_qr defines the mismatch penalty of matching q and r (β_qq = 0)
  • The cost of an alignment (Y', Z') is Σ_{j=1..|Y'|} cost(y'_j, z'_j), where cost is ε if either side is a gap, and β_{y'_j z'_j} otherwise


Finding the Best Alignment

  • What is the choice to be made?
  • The last character could be a gap on either side, or a potential mismatch
  • Assume G(j, k) is the penalty for the best alignment of y_1..y_j and z_1..z_k:

  G(j, k) = k · ε                                                           if j = 0
  G(j, k) = j · ε                                                           if k = 0
  G(j, k) = min{ G(j-1, k-1) + β_{y_j z_k}, G(j-1, k) + ε, G(j, k-1) + ε }  otherwise

Algorithm (Smith-Waterman)

Algorithm SmithWaterman(Y, Z, ε, β)
  For j ← 0 To |Y| Do G[j, 0] ← j · ε, I[j, 0] ← "gap in Z"
  For k ← 1 To |Z| Do G[0, k] ← k · ε, I[0, k] ← "gap in Y"
  For j ← 1 To |Y| Do
    For k ← 1 To |Z| Do
      n ← G[j - 1, k - 1] + β[Y[j], Z[k]]          -- matching cost
      hy ← G[j, k - 1] + ε, hz ← G[j - 1, k] + ε   -- gap penalty in Y, Z
      If n ≤ hy And n ≤ hz Then
        G[j, k] ← n, I[j, k] ← "match"
      Else If hy ≤ hz Then
        G[j, k] ← hy, I[j, k] ← "gap in Y"
      Else
        G[j, k] ← hz, I[j, k] ← "gap in Z"


Algorithm (cont.)

…
Y' ← "", Z' ← ""
j ← |Y|, k ← |Z|
While j > 0 Or k > 0 Do
  If I[j, k] = "match" Then
    Y' ← Y[j] . Y', Z' ← Z[k] . Z'
    j ← j - 1, k ← k - 1
  Else If I[j, k] = "gap in Y" Then
    Y' ← "-" . Y', Z' ← Z[k] . Z'
    k ← k - 1
  Else -- gap in Z
    Y' ← Y[j] . Y', Z' ← "-" . Z'
    j ← j - 1
Return Y', Z', G[|Y|, |Z|]
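The two slides above can be combined into one runnable sketch (Python, with my naming; `beta` is assumed to be a function with `beta(a, a) == 0`, and ties prefer a match, as in the pseudocode):

```python
def align(Y, Z, eps, beta):
    """Global sequence alignment with traceback; returns (Y', Z', cost)."""
    nY, nZ = len(Y), len(Z)
    G = [[0.0] * (nZ + 1) for _ in range(nY + 1)]   # G[j][k]: best penalty
    I = [[None] * (nZ + 1) for _ in range(nY + 1)]  # I[j][k]: choice made
    for j in range(1, nY + 1):
        G[j][0], I[j][0] = j * eps, "gap in Z"
    for k in range(1, nZ + 1):
        G[0][k], I[0][k] = k * eps, "gap in Y"
    for j in range(1, nY + 1):
        for k in range(1, nZ + 1):
            m = G[j - 1][k - 1] + beta(Y[j - 1], Z[k - 1])  # match/mismatch
            gy = G[j][k - 1] + eps                          # gap in Y
            gz = G[j - 1][k] + eps                          # gap in Z
            if m <= gy and m <= gz:
                G[j][k], I[j][k] = m, "match"
            elif gy <= gz:
                G[j][k], I[j][k] = gy, "gap in Y"
            else:
                G[j][k], I[j][k] = gz, "gap in Z"
    # Rebuild the aligned strings from the recorded choices
    Yp, Zp, j, k = "", "", nY, nZ
    while j > 0 or k > 0:
        if I[j][k] == "match":
            Yp, Zp, j, k = Y[j - 1] + Yp, Z[k - 1] + Zp, j - 1, k - 1
        elif I[j][k] == "gap in Y":
            Yp, Zp, k = "-" + Yp, Z[k - 1] + Zp, k - 1
        else:  # gap in Z
            Yp, Zp, j = Y[j - 1] + Yp, "-" + Zp, j - 1
    return Yp, Zp, G[nY][nZ]
```

Filling the table is Θ(|Y| · |Z|); the traceback adds only Θ(|Y| + |Z|).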


Longest Common Subsequence

  • Subsequence: any sequence of items that is contained in the original sequence in the same order (but not necessarily consecutively)
  • Example: ⟨C, D, E, C⟩ is a subsequence of ⟨B, C, D, C, E, B, C⟩
  • Problem: Given two sequences Y and Z, find the longest common subsequence of Y and Z

  • Application:
  • Find common DNA sequences in different organisms
  • Video compression (inter-frame comparison)

Characterizing the LCS

  • Define Y_j as the sequence Y limited to its first j elements
  • Given two sequences Y = ⟨y_1, …, y_n⟩ and Z = ⟨z_1, …, z_m⟩, let A = ⟨a_1, …, a_l⟩ be the longest common subsequence (LCS) of Y and Z
  • If y_n = z_m, then a_l = y_n = z_m, and A_{l-1} is an LCS of Y_{n-1} and Z_{m-1}
  • If y_n ≠ z_m, then A is either an LCS of Y_n and Z_{m-1}, or an LCS of Y_{n-1} and Z_m
  • Define c(j, k), the length of the LCS of Y_j and Z_k, as:

  c(j, k) = 0                               if j = 0 or k = 0
  c(j, k) = c(j-1, k-1) + 1                 if j, k > 0 and y_j = z_k
  c(j, k) = max{ c(j, k-1), c(j-1, k) }     otherwise

Algorithm

Algorithm LongestCommonSubsequence(Y, Z)
  For j ← 0 To |Y| Do c[j, 0] ← 0
  For k ← 1 To |Z| Do c[0, k] ← 0
  For j ← 1 To |Y| Do
    For k ← 1 To |Z| Do
      If Y[j] = Z[k] Then
        c[j, k] ← c[j - 1, k - 1] + 1, h[j, k] ← "+"
      Else If c[j - 1, k] > c[j, k - 1] Then
        c[j, k] ← c[j - 1, k], h[j, k] ← "X"
      Else
        c[j, k] ← c[j, k - 1], h[j, k] ← "Y"
  PrintLCS(h, Y, |Y|, |Z|)
  Return c[|Y|, |Z|]


Algorithm (cont.)

Algorithm PrintLCS(h, Y, j, k)
  If j = 0 Or k = 0 Then Return
  If h[j, k] = "+" Then
    PrintLCS(h, Y, j - 1, k - 1)
    Print Y[j]
  Else If h[j, k] = "X" Then
    PrintLCS(h, Y, j - 1, k)
  Else
    PrintLCS(h, Y, j, k - 1)
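A compact runnable sketch of the same c-table in Python (my naming), which rebuilds the subsequence directly from `c` instead of keeping the separate `h` table:

```python
def lcs(Y, Z):
    """Longest common subsequence by dynamic programming;
    returns the subsequence itself as a string."""
    n, m = len(Y), len(Z)
    # c[j][k] = length of the LCS of Y[:j] and Z[:k]
    c = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for k in range(1, m + 1):
            if Y[j - 1] == Z[k - 1]:
                c[j][k] = c[j - 1][k - 1] + 1
            else:
                c[j][k] = max(c[j - 1][k], c[j][k - 1])
    # Rebuild the subsequence by walking the table backwards
    out, j, k = [], n, m
    while j > 0 and k > 0:
        if Y[j - 1] == Z[k - 1]:
            out.append(Y[j - 1])
            j, k = j - 1, k - 1
        elif c[j - 1][k] >= c[j][k - 1]:
            j -= 1
        else:
            k -= 1
    return "".join(reversed(out))
```

On the earlier example, `lcs("BCDCEBC", "CDEC")` recovers `"CDEC"`, since the second string is itself a subsequence of the first.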


NP Complexity


Time Complexity for Decision Problems

  • From this point on we analyse time complexity for problems, not algorithms
  • We want to know the best possible complexity for the problem
  • Our focus now is on decision problems, not optimization problems
  • Decision problems: Yes/No answer
  • Optimization: β€œfind best”, β€œfind maximum”, β€œfind minimum”
  • We also need to distinguish β€œfinding” and β€œchecking” a solution

Time Complexity - Classes

  • A problem is solvable in polynomial time if there is an algorithm that solves it and runs in O(n^k), where k ∈ Θ(1) and n is the size of the input representation
  • Examples: sorting (O(n log n) ⊂ O(n²)), selection (O(n)), longest common subsequence (O(n²)), matrix multiplication (O(n³) or better)
  • P: set of all decision problems that are solvable in polynomial time
  • NP (non-deterministic P): set of all decision problems for which a given certificate can be checked in polynomial time


Example: Hamiltonian Path

  • Problem: given a graph, is there a path that goes through every node exactly once?
  • Decision problem: answer is yes or no
  • Optimization problem: find a path with minimum cost, etc.; not required
  • Is this problem in NP?
  • Given a path, can we verify that the path is correct in polynomial time?
  • Is this problem in P?
  • Can we solve it in polynomial time?
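The "checking in polynomial time" question can be made concrete with a small certificate verifier (a Python sketch; the adjacency-set graph format is my assumption, not from the slides):

```python
def is_hamiltonian_path(graph, path):
    """Check a Hamiltonian-path certificate in polynomial time:
    the path must visit every node exactly once, and consecutive
    nodes must be adjacent.
    graph: dict mapping each node to the set of its neighbours."""
    if len(path) != len(graph) or set(path) != set(graph):
        return False  # must visit every node exactly once
    return all(b in graph[a] for a, b in zip(path, path[1:]))
```

Verifying a given path is easy; no comparably fast way to *find* one is known.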

Example: Satisfiability

  • Given a set of boolean variables y_1, …, y_n and a set of clauses, where each clause is a disjunction of variables or their complements, such as: (y_1 ∨ y_2 ∨ y_4), (y_1 ∨ y_3 ∨ y_4), (y_2 ∨ y_3 ∨ y_5 ∨ y_7)
  • Problem: can we assign True/False values to each variable so that every clause is satisfied?
  • Problem simplification: every clause has at most 3 literals (3-SAT)
  • Is this problem in NP?
  • Is this problem in P?
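Checking a satisfiability certificate is likewise easy. A sketch (Python; the encoding of a literal as a `(variable, negated)` pair is my assumption, not the slides' notation):

```python
def check_sat(clauses, assignment):
    """Verify a truth assignment in polynomial time: every clause must
    contain at least one literal that evaluates to True.
    clauses: list of clauses; a clause is a list of (var, negated) pairs.
    assignment: dict mapping var -> bool."""
    return all(
        any(assignment[var] != negated for var, negated in clause)
        for clause in clauses
    )
```

This runs in time linear in the total number of literals, so SAT is in NP; whether it is in P is exactly the open question on this slide.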

NP Complete Problem

  • Turns out nobody knows!
  • There is no known algorithm that runs in polynomial time
  • There is no proof that such an algorithm doesn’t exist
  • NP complete: set of all decision problems that:
  • Are in NP (solution can be verified in polynomial time)
  • Are at least as hard as any problem in NP
  • NP hard: set of all problems for which the second rule applies
  • Includes non-decision problems (e.g., optimization)

Problem Reduction

  • Sometimes the solution to a problem is to reduce it to another
  • Reduction: given a problem A, reduce it to B by:
  • providing a way to transform the input (instance) of A into a valid input of B in polynomial time
  • providing a translation from the result of B into the result of A
  • proving that this transformation results in a valid solution for A
  • Example: stable matching problem, co-op setting, multiple jobs in a company
  • transform a company with l jobs into l companies with one job each, all with the same preferences
  • each student lists the l companies instead of the one company, in any order
  • in the result, translate matches for these l companies into matches for the original company


NP Complete

  • How do we prove that a problem q is NP complete?
  • Reduce any known NP complete problem q′ into q in polynomial time
  • If q′ is NP complete, and we can solve q′ by reducing it to q, then q is at least as hard as q′


Graph Coloring

  • Graph coloring: can we color all nodes in a graph using at most k colors, such that adjacent nodes (i.e., nodes linked by an edge) have different colors?
  • Application: exam scheduling, map coloring
  • This is in NP, since, given a coloring, we can verify it's valid in O(|E|) time (check every edge once)
  • Is this NP complete?
  • Let’s reduce the satisfiability problem into it
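Membership in NP can again be shown with a certificate checker. A sketch (Python, my naming), which verifies a proposed coloring by scanning every edge once:

```python
def is_valid_coloring(edges, color, k):
    """Verify a k-coloring certificate in O(|E|) time (plus a scan of
    the colors): every node uses one of k colors, and no edge joins
    two nodes of the same color.
    edges: list of (u, v) pairs; color: dict mapping node -> 0..k-1."""
    if any(not (0 <= c < k) for c in color.values()):
        return False
    return all(color[u] != color[v] for u, v in edges)
```

Verifying is fast; deciding whether *any* valid k-coloring exists is the hard direction, which the reduction from satisfiability is meant to establish.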