SLIDE 1
CPSC 320: Intermediate Algorithm Design and Analysis
July 28, 2014
SLIDE 2
Course Outline
- Introduction and basic concepts
- Asymptotic notation
- Greedy algorithms
- Graph theory
- Amortized analysis
- Recursion
- Divide-and-conquer algorithms
- Randomized algorithms
- Dynamic programming algorithms
- NP-completeness
SLIDE 3
Dynamic Programming
SLIDE 4
Dynamic Programming Components
- Analyse the structure of an optimal solution
- Separate one choice (usually the last) from a subproblem
- Phrase the value of a choice as a function of the choice and the subproblem
- Phrase an optimal solution as the value of the best choice
- Usually a max/min result
- Implement the calculation of the optimal value
- Memoization: save optimal values as we compute them
- Bottom-up: evaluate smaller problems and use them for bigger problems
- Top-down: evaluate big problem by calling smaller problems recursively and
saving result
- Keep a record of the choice made at each level
- Rebuild the optimal solution from the optimal value result
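As a minimal sketch of the bottom-up and top-down (memoized) styles, using the Fibonacci recurrence as a toy subproblem structure rather than any problem from this lecture:

```python
from functools import lru_cache

def fib_bottom_up(n):
    # Bottom-up: evaluate smaller subproblems first and reuse them.
    if n < 2:
        return n
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr

@lru_cache(maxsize=None)
def fib_top_down(n):
    # Top-down: recurse on smaller subproblems; lru_cache saves each result
    # the first time it is computed (memoization).
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)
```

Both styles fill the same table of subproblem values; they differ only in the order of evaluation.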
SLIDE 5
Knapsack Problem
Algorithm Knapsack(w, v, W)    -- w is the array of weights, v is the array of values, W is the weight limit
    For j ← 0 To W Do
        V[0, j] ← 0, K[0, j] ← false
    For i ← 1 To n Do
        For j ← 1 To W Do
            If w[i] > j Or V[i-1, j] > V[i-1, j - w[i]] + v[i] Then
                V[i, j] ← V[i-1, j], K[i, j] ← K[i-1, j]
            Else
                V[i, j] ← V[i-1, j - w[i]] + v[i], K[i, j] ← i
    t ← ∅, i ← n, j ← W
    While j > 0 And K[i, j] is not false Do
        i ← K[i, j], t ← t ∪ {i}, j ← j - w[i], i ← i - 1
    Return t
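The table-filling and rebuilding steps can be sketched in Python as follows; this is an illustrative 0-based translation of the pseudocode, using a boolean keep table instead of the item-index table K:

```python
def knapsack(w, v, W):
    """0/1 knapsack: w[i] is the weight and v[i] the value of item i,
    W is the weight limit. Returns (best value, set of chosen item indices)."""
    n = len(w)
    # V[i][j] = best value using only items 0..i-1 under weight limit j
    V = [[0] * (W + 1) for _ in range(n + 1)]
    keep = [[False] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(W + 1):
            V[i][j] = V[i - 1][j]  # default: skip item i-1
            if w[i - 1] <= j and V[i - 1][j - w[i - 1]] + v[i - 1] > V[i][j]:
                V[i][j] = V[i - 1][j - w[i - 1]] + v[i - 1]
                keep[i][j] = True
    # Rebuild the optimal solution from the recorded choices.
    chosen, i, j = set(), n, W
    while i > 0 and j > 0:
        if keep[i][j]:
            chosen.add(i - 1)
            j -= w[i - 1]
        i -= 1
    return V[n][W], chosen
```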
SLIDE 6
Knapsack Algorithm - Complexity
- What is the time complexity of the knapsack algorithm?
- O(nW) (number of items times the weight limit)
- This algorithm is called pseudo-polynomial
- Time complexity is based on the value of the input, not just the size
- There is no known polynomial algorithm to solve the knapsack problem
SLIDE 7
Algorithm Strategies - Review
- Dynamic programming algorithms:
- Choice is made based on evaluation of all possible results
- Time and space complexity are usually higher
- Greedy algorithms:
- Choice is made based on locally optimal solution
- Usually faster, but may not result in globally optimal solution
- Divide and conquer algorithms:
- Choice of input division is made based on the assumption that merging the results of the subproblems yields an optimal solution
SLIDE 8
Global Sequence Alignment Problem
- Problem: given two sequences, analyse how similar they are
- Allow both gaps and mismatches
- Application:
- Finding suggestions for misspelled words (comparing strings)
- Comparing files (diff)
- Analyse if two pieces of DNA match
- Example: "ocurrance" vs "occurrence"
- There is a letter "c" missing (gap)
- An "a" was used instead of an "e" (mismatch)
- Mismatches may be seen as gaps on both sides
- "oc-urra-nce" vs "occurr-ence"
SLIDE 9
Formal Definition
- We represent a gap with a hyphen "-"
- A sequence alignment of (X, Y) is a pair (X', Y') of sequences, such that:
- X' minus the gaps is X, and Y' minus the gaps is Y
- |X'| = |Y'| (the size is the same for both sides)
- If x'_i = "-", then y'_i ≠ "-" (you can't have gaps on both sides)
- A parameter δ > 0 defines the gap penalty (penalty if one side has a gap)
- A parameter α_pq defines the mismatch penalty of matching p and q (α_pp = 0)
- The cost of an alignment (X', Y') is the sum of cost(x'_i, y'_i) over i = 1, .., |X'|
SLIDE 10
Finding the Best Alignment
- What is the choice to be made?
- Last character could be a gap on either side, or a potential mismatch
- Assume A(i, j) is the penalty for the best alignment of x_1 .. x_i and y_1 .. y_j

A(i, j) = i * δ, if j = 0
          j * δ, if i = 0
          min{ A(i-1, j-1) + α_{x_i y_j}, A(i-1, j) + δ, A(i, j-1) + δ }, otherwise
SLIDE 11
Algorithm (Smith-Waterman)
Algorithm SmithWaterman(X, Y, δ, α)
    For i ← 0 To |X| Do
        A[i, 0] ← i * δ, I[i, 0] ← "gap in Y"
    For j ← 1 To |Y| Do
        A[0, j] ← j * δ, I[0, j] ← "gap in X"
    For i ← 1 To |X| Do
        For j ← 1 To |Y| Do
            m ← A[i-1, j-1] + α[X[i], Y[j]]    -- match or mismatch
            gx ← A[i, j-1] + δ                  -- gap penalty in X
            gy ← A[i-1, j] + δ                  -- gap penalty in Y
            If m ≤ gx And m ≤ gy Then
                A[i, j] ← m, I[i, j] ← "match"
            Else If gx ≤ gy Then
                A[i, j] ← gx, I[i, j] ← "gap in X"
            Else
                A[i, j] ← gy, I[i, j] ← "gap in Y"
SLIDE 12
Algorithm (cont.)
…
    X' ← "", Y' ← ""
    i ← |X|, j ← |Y|
    While i > 0 Or j > 0 Do
        If I[i, j] = "match" Then
            X' ← X[i] . X', Y' ← Y[j] . Y'
            i ← i - 1, j ← j - 1
        Else If I[i, j] = "gap in X" Then
            X' ← "-" . X', Y' ← Y[j] . Y'
            j ← j - 1
        Else
            X' ← X[i] . X', Y' ← "-" . Y'
            i ← i - 1
    Return X', Y', A[|X|, |Y|]
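The recurrence and reconstruction above can be condensed into a runnable Python sketch; the interface (delta as a number, alpha as a two-argument function) is an assumption, not the slides' exact one:

```python
def align(X, Y, delta, alpha):
    """Global alignment of X and Y: returns (X', Y', penalty).
    delta is the gap penalty; alpha(p, q) the mismatch penalty, alpha(p, p) = 0."""
    m, n = len(X), len(Y)
    A = [[0] * (n + 1) for _ in range(m + 1)]
    I = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        A[i][0], I[i][0] = i * delta, "gap in Y"
    for j in range(1, n + 1):
        A[0][j], I[0][j] = j * delta, "gap in X"
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Try the three choices; prefer a match on ties, as in the slides.
            best = A[i - 1][j - 1] + alpha(X[i - 1], Y[j - 1])
            move = "match"
            if A[i][j - 1] + delta < best:
                best, move = A[i][j - 1] + delta, "gap in X"
            if A[i - 1][j] + delta < best:
                best, move = A[i - 1][j] + delta, "gap in Y"
            A[i][j], I[i][j] = best, move
    # Rebuild the alignment by following the recorded choices backwards.
    Xp, Yp, i, j = "", "", m, n
    while i > 0 or j > 0:
        if I[i][j] == "match":
            Xp, Yp, i, j = X[i - 1] + Xp, Y[j - 1] + Yp, i - 1, j - 1
        elif I[i][j] == "gap in X":
            Xp, Yp, j = "-" + Xp, Y[j - 1] + Yp, j - 1
        else:
            Xp, Yp, i = X[i - 1] + Xp, "-" + Yp, i - 1
    return Xp, Yp, A[m][n]
```

With unit penalties, the "ocurrance" vs "occurrence" example aligns with one gap and one mismatch, for a total penalty of 2.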
SLIDE 13
Longest Common Subsequence
- Subsequence: any sequence of items that is contained in the original sequence in
the same order (but not necessarily consecutively)
- Example: B, C, D, B is a subsequence of A, B, C, B, D, A, B
- Problem: Given two sequences π and π, find the longest common subsequence of
π and π
- Application:
- Find common DNA sequences in different organisms
- Video compression (inter-frame comparison)
SLIDE 14
Characterizing the LCS
- Define X_i as the sequence X limited to the first i elements
- Given two sequences X = x_1, .., x_m and Y = y_1, .., y_n, let Z = z_1, .., z_k be the longest common subsequence (LCS) of X and Y
- If x_m = y_n, then z_k = x_m = y_n, and Z_{k-1} is an LCS of X_{m-1} and Y_{n-1}
- If x_m ≠ y_n, then Z is either an LCS of X_m and Y_{n-1}, or an LCS of X_{m-1} and Y_n
- Define the length of the LCS of X_i and Y_j as:

c[i, j] = 0, if i = 0 or j = 0
          c[i-1, j-1] + 1, if i, j > 0 and x_i = y_j
          max{ c[i, j-1], c[i-1, j] }, otherwise
SLIDE 15
Algorithm
Algorithm LongestCommonSubsequence(X, Y)
    For i ← 0 To |X| Do
        c[i, 0] ← 0
    For j ← 1 To |Y| Do
        c[0, j] ← 0
    For i ← 1 To |X| Do
        For j ← 1 To |Y| Do
            If X[i] = Y[j] Then
                c[i, j] ← c[i-1, j-1] + 1, b[i, j] ← "+"
            Else If c[i-1, j] > c[i, j-1] Then
                c[i, j] ← c[i-1, j], b[i, j] ← "X"
            Else
                c[i, j] ← c[i, j-1], b[i, j] ← "Y"
    PrintLCS(b, X, |X|, |Y|)
    Return c[|X|, |Y|]
SLIDE 16
Algorithm (cont.)
Algorithm PrintLCS(b, X, i, j)
    If i = 0 Or j = 0 Then
        Return
    If b[i, j] = "+" Then
        PrintLCS(b, X, i-1, j-1)
        Print X[i]
    Else If b[i, j] = "X" Then
        PrintLCS(b, X, i-1, j)
    Else
        PrintLCS(b, X, i, j-1)
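The LCS table and its traversal can be condensed into a short Python sketch; it rebuilds one LCS directly from the c table instead of keeping the separate b table of moves:

```python
def lcs(X, Y):
    """Length and one longest common subsequence of two strings X and Y."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Rebuild one LCS by walking the table backwards.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return c[m][n], "".join(reversed(out))
```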
SLIDE 17
NP Complexity
SLIDE 18
Time Complexity for Decision Problems
- From this point on we analyse time complexity for problems, not algorithms
- We want to know the best possible complexity for the problem
- Our focus now is on decision problems, not optimization problems
- Decision problems: Yes/No answer
- Optimization: βfind bestβ, βfind maximumβ, βfind minimumβ
- We also need to distinguish βfindingβ and βcheckingβ a solution
SLIDE 19
Time Complexity - Classes
- A problem is solvable in polynomial time if there is an algorithm that solves it and runs in O(n^k), where k is a constant and n is the size of the input representation
- Example: sort (O(n log n) to O(n^2)), select (O(n)), longest common subsequence (O(n^2)), matrix multiplication (O(n^3) or better)
- P: set of all decision problems that are solvable in polynomial time
- NP (non-deterministic P): set of all decision problems for which a given certificate
can be checked in polynomial time
SLIDE 20
Example: Hamiltonian Path
- Problem: given a graph, is there a path that goes through every node exactly once?
- Decision problem: answer is yes or no
- Optimization problem: find a path with minimum cost, etc.; not required
- Is this problem in NP?
- Given a path, can we verify that the path is correct in polynomial time?
- Is this problem in P?
- Can we solve it in polynomial time?
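Membership in NP only needs the "checking" direction: given a certificate (here, the path itself), verify it in polynomial time. A hedged sketch, assuming the graph is given as a dict mapping each node to the set of its neighbours:

```python
def is_hamiltonian_path(adj, path):
    """Polynomial-time certificate check for the Hamiltonian path problem.
    adj maps each node to the set of its neighbours; path is the claimed path."""
    nodes = set(adj)
    # The path must visit every node exactly once...
    if len(path) != len(nodes) or set(path) != nodes:
        return False
    # ...and consecutive nodes must be joined by an edge.
    return all(b in adj[a] for a, b in zip(path, path[1:]))
```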
SLIDE 21
Example: Satisfiability
- Given a set of boolean variables x_1, .., x_n and a set of clauses, where each clause is a disjunction of variables or their complements, such as: (x_1 ∨ x_2 ∨ x_4), (x_1 ∨ x_3 ∨ x_4), (x_2 ∨ x_3 ∨ x_5 ∨ x_7)
- Problem: can we assign True/False values to each variable so that each clause is satisfied?
- Problem simplification: every clause has at most 3 literals (3-SAT)
- Is this problem in NP?
- Is this problem in P?
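For NP membership, a claimed assignment is the certificate, checkable in time linear in the total size of the clauses. A sketch, assuming clauses encode a positive literal x_i as the integer i and its complement as -i:

```python
def satisfies(clauses, assignment):
    """Polynomial-time certificate check for SAT.
    clauses is a list of lists of literals (int i for x_i, -i for its
    complement); assignment maps each variable index to True or False."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```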
SLIDE 22
NP Complete Problem
- Turns out nobody knows!
- There is no known algorithm that runs in polynomial time
- There is no proof that such an algorithm doesnβt exist
- NP complete: set of all decision problems that:
- Are in NP (solution can be verified in polynomial time)
- Are at least as hard as any problem in NP
- NP hard: set of all problems for which the second rule applies
- Includes non-decision problems (e.g., optimization)
SLIDE 23
Problem Reduction
- Sometimes the solution to a problem is to reduce it to another one
- Reduction: given a problem A, reduce it to B by:
- providing a way to transform the input (instance) of A into a valid input of B in
polynomial time
- providing a translation from the result of B into the result of A
- proving that this transformation results in a valid solution for A
- Example: stable matching problem, co-op setting, multiple jobs in a company
- transform a company with π jobs into π companies with one job each, all with
same preferences
- each student lists the π companies instead of the one company in any order
- in the result, translate matches for these π companies into matches for original
company
SLIDE 24
NP Complete
- How do we prove a problem P is NP complete?
- Reduce any known NP complete problem P' into P in polynomial time
- If P' is NP complete, and we can solve P' by reducing it into P, then P must be at least as hard as P'
SLIDE 25
Graph Coloring
- Graph coloring: can we color all nodes in a graph using at most π colors, such that
adjacent nodes (i.e., nodes linked by an edge) have different colors?
- Application: exam scheduling, map coloring
- This is in NP, since, given a coloring, we can verify it's valid in O(E)
- Is this NP complete?
- Let's reduce the satisfiability problem into it
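The coloring verification can be sketched as follows, assuming colors are numbered 1..k and the graph is given as a list of edges (checking the color range adds an O(V) pass):

```python
def is_valid_coloring(edges, color, k):
    """Certificate check for k-coloring in O(V + E): color maps each node to
    a color in 1..k; no edge may join two nodes of the same color."""
    return (
        all(1 <= color[u] <= k for u in color)
        and all(color[u] != color[v] for u, v in edges)
    )
```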