CSC373 Week 3: Dynamic Programming
Recap: greedy algorithms
➢ Interval scheduling
➢ Interval partitioning
➢ Minimizing lateness
➢ Huffman encoding
➢ …
The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word 'research'. I'm not using the term lightly; I'm using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term 'research' in his presence. You can imagine how he felt, then, about the term 'mathematical'. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose?
— Richard Bellman, on the origin of his term 'dynamic programming' (1984)
➢ Breaking the problem down into simpler subproblems, solving each subproblem just once, and storing their solutions
➢ The next time the same subproblem occurs, instead of recomputing its solution, simply look up the stored solution
➢ Hopefully, we save a lot of computation at the expense of a modest increase in storage space
➢ Also called "memoization"
➢ Job j starts at time s_j and finishes at time f_j
➢ Each job j has a weight w_j
➢ Two jobs are compatible if they don't overlap
➢ Goal: find a set S of mutually compatible jobs with the highest total weight Σ_{j ∈ S} w_j
➢ Note: if every w_j = 1, then this is simply the interval scheduling problem
➢ The greedy algorithm based on earliest-finish-time ordering was optimal for that problem
➢ The earliest-finish-time greedy fails spectacularly on the weighted problem!
Other greedy orderings also fail:
➢ By weight: choose jobs with the highest w_j first
➢ Maximum weight per unit time: choose jobs with the highest w_j / (f_j − s_j) first
➢ …
➢ They're all arbitrarily worse than the optimal solution
➢ In fact, under a certain formalization, "no greedy algorithm" can solve this problem optimally
➢ Jobs are sorted by finish time: f_1 ≤ f_2 ≤ ⋯ ≤ f_n
➢ p(j) = largest index i < j such that job i is compatible with job j (i.e., f_i < s_j)
➢ Among the jobs before job j, the ones compatible with it are precisely 1, …, p(j)
➢ E.g., p[8] = 1, p[7] = 3, p[2] = 0
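As the running-time analysis later notes, all p(j) values can be computed with n binary searches after sorting by finish time. A minimal sketch (the function name and the (start, finish, weight) tuple format are assumptions, not from the slides):

```python
from bisect import bisect_left

def compute_p(jobs):
    # jobs: list of (start, finish, weight) tuples sorted by finish time,
    # with start < finish for every job; jobs are 1-indexed below.
    finishes = [f for (_, f, _) in jobs]
    p = [0] * (len(jobs) + 1)  # p[0] is unused
    for j, (s, _, _) in enumerate(jobs, start=1):
        # bisect_left counts the jobs whose finish time is strictly below s_j,
        # which is exactly the largest index i with f_i < s_j (or 0 if none).
        p[j] = bisect_left(finishes, s)
    return p
```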
➢ Let OPT be an optimal solution
➢ Two cases regarding job n:
➢ If job n is not in OPT, then OPT must be an optimal solution over jobs {1, …, n−1}
➢ If job n is in OPT, then the rest of OPT must be an optimal solution over jobs {1, …, p(n)}
➢ OPT is the best of both cases
➢ Note: In both cases, knowing how to solve any prefix of the job sequence is enough
➢ OPT(j) = maximum total weight of mutually compatible jobs in {1, …, j}
➢ Base case: OPT(0) = 0
➢ Two cases regarding job j: either job j is not selected (giving OPT(j−1)), or it is selected (giving w_j + OPT(p(j)))
➢ OPT(j) is the best of both worlds
➢ Bellman equation:
   OPT(j) = 0                                   if j = 0
   OPT(j) = max{ OPT(j−1), w_j + OPT(p(j)) }    if j > 0
➢ It is possible that p(j) = j − 1 for each j
➢ Then, we would call COMPUTE-OPT(j − 1) twice inside COMPUTE-OPT(j)
➢ So this might take 2^n steps
➢ But we can just check whether job j is compatible with job j − 1, and if so, make only one call to COMPUTE-OPT(j − 1)
➢ Now the worst case is where p(j) = j − 2 for each j
➢ Running time: T(n) = T(n−1) + T(n−2) + O(1), the Fibonacci recurrence, which is still exponential
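For concreteness, here is a sketch of the plain recursive COMPUTE-OPT discussed above (assuming w and p are 1-indexed sequences as defined earlier):

```python
def compute_opt(j, w, p):
    # Plain recursion with no memoization: overlapping subproblems are
    # recomputed over and over, giving exponential time in the worst case.
    if j == 0:
        return 0
    return max(compute_opt(j - 1, w, p),        # case: job j is not selected
               w[j] + compute_opt(p[j], w, p))  # case: job j is selected
```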
➢ Some solutions are being computed many, many times — e.g., COMPUTE-OPT(5) may call COMPUTE-OPT(4) and COMPUTE-OPT(3), and COMPUTE-OPT(4) then calls COMPUTE-OPT(3) again
➢ Simply remember what you've already computed, and re-use it the next time you need it
➢ Sorting the jobs by finish time takes O(n log n)
➢ It also takes O(n log n) to do the n binary searches needed to compute p(j) for every j
➢ M-COMPUTE-OPT(j) is called at most once for each j
➢ Each such call takes O(1) time, not counting the time taken by the recursive calls inside it
➢ So M-COMPUTE-OPT(n) takes only O(n) time
➢ Overall time is O(n log n)
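A sketch of the memoized M-COMPUTE-OPT (same assumed inputs as the plain version above):

```python
def m_compute_opt(n, w, p):
    memo = {0: 0}  # base case: OPT(0) = 0

    def opt(j):
        # Each OPT(j) is computed at most once; later calls are lookups.
        if j not in memo:
            memo[j] = max(opt(j - 1), w[j] + opt(p[j]))
        return memo[j]

    return opt(n)
```

(A bottom-up loop for j = 1, …, n computes exactly the same table while avoiding deep recursion.)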
Top-down (memoization) may be preferred…
➢ …when not all sub-solutions need to be computed on some inputs
➢ …because one does not need to think of the "right order" in which to solve the subproblems
Bottom-up may be preferred…
➢ …when all sub-solutions will anyway need to be computed
➢ …because it is sometimes faster, as it prevents recursive call overheads
➢ Typically, this is done by maintaining both the optimal value and an optimal solution for each subproblem
➢ So, we compute two quantities:

   OPT(j) = 0                                   if j = 0
   OPT(j) = max{ OPT(j−1), w_j + OPT(p(j)) }    if j > 0

   S(j) = ∅                  if j = 0
   S(j) = S(j−1)             if j > 0 and OPT(j−1) ≥ w_j + OPT(p(j))
   S(j) = {j} ∪ S(p(j))      if j > 0 and OPT(j−1) < w_j + OPT(p(j))
This works with both top-down (memoization) and bottom-up approaches. In this problem, we can do something simpler: just compute OPT first, and later compute S using only the OPT values.
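A sketch of that simpler approach: after filling an array opt with opt[j] = OPT(j) for j = 0, …, n, trace back through it to recover an optimal set (again assuming 1-indexed w and p):

```python
def find_solution(n, opt, w, p):
    # Re-check at each step which case of the Bellman equation achieved
    # the maximum, and follow that case backwards.
    chosen = []
    j = n
    while j > 0:
        if w[j] + opt[p[j]] >= opt[j - 1]:
            chosen.append(j)  # selecting job j is (weakly) better
            j = p[j]
        else:
            j -= 1            # some optimal solution omits job j
    return chosen[::-1]
```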
➢ Optimal solution to a problem contains (or can be easily computed from) optimal solutions to its subproblems
➢ You can think of divide-and-conquer as a special case of dynamic programming in which the subproblems do not repeat
➢ So there's no need for memoization
➢ In dynamic programming, one subproblem may in fact be needed in solving many different larger subproblems — which is why memoization pays off
➢ n items: item i provides value v_i > 0 and has weight w_i > 0
➢ The knapsack has weight capacity W
➢ Assumption: W and each v_i and w_i are integers
➢ Goal: pack the knapsack with a collection of items of the highest total value, subject to their total weight being at most W
➢ First attempt: OPT(w) = maximum total value packable with capacity w
➢ Goal: Compute OPT(W)
➢ Claim: a nonempty optimal packing for capacity w must use at least one item i with w_i ≤ w
➢ Let w* = min_i w_i be the smallest item weight
➢ Recurrence:
   OPT(w) = 0                                     if w < w*
   OPT(w) = max{ v_i + OPT(w − w_i) : w_i ≤ w }   if w ≥ w*
➢ Q: What's wrong with this recurrence?
➢ It might use an item more than once!
➢ Second attempt: OPT(i, w) = maximum total value using only items 1, …, i under capacity w
➢ Goal: Compute OPT(n, W)
➢ If w_i > w, then we can't choose item i; just use OPT(i−1, w)
➢ If w_i ≤ w, there are two cases: skip item i (getting OPT(i−1, w)), or pack it (getting v_i + OPT(i−1, w − w_i))
➢ Bellman equation:
   OPT(i, w) = 0                                             if i = 0
   OPT(i, w) = OPT(i−1, w)                                   if i > 0 and w_i > w
   OPT(i, w) = max{ OPT(i−1, w), v_i + OPT(i−1, w − w_i) }   if i > 0 and w_i ≤ w
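A bottom-up sketch of this recurrence (the function name and input format are assumptions; values and weights are 0-indexed lists, while the comments use the slides' 1-indexed items):

```python
def knapsack(values, weights, W):
    n = len(values)
    # opt[i][w] = maximum value using items 1..i with capacity w
    opt = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v_i, w_i = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            if w_i > w:
                opt[i][w] = opt[i - 1][w]                   # item i doesn't fit
            else:
                opt[i][w] = max(opt[i - 1][w],              # skip item i
                                v_i + opt[i - 1][w - w_i])  # pack item i
    return opt[n][W]
```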
➢ i ranges over {1, …, n}; w ranges over {1, …, W} (recall that the weights and the capacity are integers)
➢ There are O(n ⋅ W) possible evaluations of OPT
➢ Each is evaluated at most once (memoization)
➢ Each takes O(1) time to evaluate
➢ So the total running time is O(n ⋅ W)
➢ A: No! But it’s pseudo-polynomial.
➢ What if the weights (and the capacity W) are huge, but the values are small?
➢ Then we can use a different dynamic programming formulation
➢ New subproblem: OPT(i, v) = minimum capacity needed to achieve total value at least v using only items 1, …, i
➢ Goal: Compute max{ v ∈ {1, …, V} : OPT(n, v) ≤ W }, where V = Σ_i v_i
➢ If we choose item i, we need capacity w_i + OPT(i−1, v − v_i)
➢ If we don't choose item i, we need capacity OPT(i−1, v)
➢ Bellman equation:
   OPT(i, v) = 0                                             if v ≤ 0
   OPT(i, v) = ∞                                             if v > 0 and i = 0
   OPT(i, v) = min{ w_i + OPT(i−1, v − v_i), OPT(i−1, v) }   if v > 0 and i > 0
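A sketch of this value-indexed DP, compressed to a one-dimensional array (assumes positive integer values; iterating v downward preserves the "each item used at most once" semantics):

```python
import math

def knapsack_by_value(values, weights, W):
    V = sum(values)
    # opt[v] = minimum capacity needed to achieve total value >= v
    opt = [0] + [math.inf] * V
    for v_i, w_i in zip(values, weights):
        for v in range(V, 0, -1):  # downward: opt[v - v_i] is still pre-item-i
            opt[v] = min(opt[v], w_i + opt[max(v - v_i, 0)])
    # largest achievable value whose required capacity fits within W
    return max(v for v in range(V + 1) if opt[v] <= W)
```

This runs in O(n ⋅ V) time, where V = Σ_i v_i.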
➢ Running time: O(n ⋅ V) — again pseudo-polynomial, now in the total value V
➢ Q: Can knapsack be solved in time polynomial in the input size?
➢ A: Not likely. The knapsack problem is NP-complete (we'll see what this means later in the course).
➢ For any ε > 0, we can get a total value within a factor of 1 + ε of the optimum, in time O(poly(n, log W, log V, 1/ε))
➢ Such algorithms are known as fully polynomial-time approximation schemes (FPTAS)
➢ Core idea behind the FPTAS for knapsack: round the item values down to multiples of a suitable granularity (depending on ε), and run the value-based dynamic program above on the rounded instance
➢ Input: A directed graph G = (V, E) with edge lengths ℓ_vw, which may be negative
➢ Goal: Compute the length of the shortest path from a source s to every other node t
➢ Dijkstra's algorithm can be used for this purpose
➢ But it fails when some edge lengths are negative
➢ What do we do in this case?
➢ Problem: negative cycles
➢ If a negative cycle is reachable along the way, you can traverse the cycle arbitrarily many times to get an arbitrarily short (very negative) path from s
➢ Shortest paths are well-defined even when some of the edge lengths are negative, as long as there is no negative cycle
➢ Claim: in that case, there is always a shortest path that is simple (and hence has at most n − 1 edges)
➢ Proof idea: consider a shortest s ⇝ t path with the fewest edges
➢ If it has a cycle, removing the cycle creates a path with no greater length (the cycle's length is non-negative) and strictly fewer edges — a contradiction
➢ Consider a shortest s ⇝ t path P
➢ It could be just a single edge
➢ But if P has more than one edge, consider the node u that appears just before t on P
➢ If P is a shortest s ⇝ t path, then its s ⇝ u prefix must be a shortest s ⇝ u path as well, and it uses one fewer edge
➢ Subproblem: OPT(t, j) = length of the shortest s ⇝ t path using at most j edges
➢ Either this path uses at most j − 1 edges ⇒ OPT(t, j − 1)
➢ Or it uses exactly j edges ⇒ min over edges (u, t) of OPT(u, j − 1) + ℓ_ut
➢ Bellman equation:
   OPT(t, j) = 0                                                          if t = s
   OPT(t, j) = ∞                                                          if j = 0 and t ≠ s
   OPT(t, j) = min{ OPT(t, j−1), min_{(u,t)∈E} [ OPT(u, j−1) + ℓ_ut ] }   otherwise
➢ The answer is OPT(t, n − 1) for each node t
➢ Running time: O(n²) subproblems, each taking O(n) time ⇒ O(n³)
➢ Q: What do you need to store to also recover the actual shortest paths?
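A sketch of this DP, i.e., the standard Bellman–Ford formulation (the edge-list input format and 0-indexed nodes are assumptions; no negative cycles). Iterating over the edge list makes each pass O(m), so this version runs in O(n ⋅ m), which is O(n³) on dense graphs as in the slides:

```python
import math

def bellman_ford(n, edges, s):
    # opt[j][t] = length of the shortest s -> t path using at most j edges;
    # edges is a list of (u, v, length) triples, nodes are 0..n-1.
    opt = [[math.inf] * n for _ in range(n)]
    opt[0][s] = 0
    for j in range(1, n):
        for t in range(n):
            opt[j][t] = opt[j - 1][t]  # case: use at most j-1 edges
        for u, v, length in edges:
            # case: the path ends with the edge (u, v)
            opt[j][v] = min(opt[j][v], opt[j - 1][u] + length)
    return opt[n - 1]  # simple paths have at most n-1 edges
```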
➢ Improvement over the basic DP: maintain a single distance value per node, and update all nodes in each of the n − 1 passes
➢ The running time stays O(n³) in the worst case
➢ But the space usage drops from O(n²) to O(n)
➢ Our DP doesn’t work because its path from 𝑡 to 𝑢 might
➢ But path from 𝑡 to 𝑣 might in turn go through 𝑢 ➢ The path may no longer remain simple
➢ Hamiltonian path problem (i.e. is there a path of length
➢ Input: A directed graph G = (V, E) with edge lengths ℓ_vw and no negative cycles
➢ Goal: Compute the length of the shortest path from every node to every other node
➢ Simple idea: run single-source shortest paths (Bellman–Ford) from each source s
➢ Running time is O(n⁴)
➢ Actually, we can do this in O(n³) as well, as sketched below
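The classic O(n³) all-pairs dynamic program is Floyd–Warshall, sketched here under the same edge-list input assumptions as before:

```python
import math

def floyd_warshall(n, edges):
    # dist[u][v] = length of the shortest u -> v path; no negative cycles
    dist = [[math.inf] * n for _ in range(n)]
    for u in range(n):
        dist[u][u] = 0
    for u, v, length in edges:
        dist[u][v] = min(dist[u][v], length)
    # DP over the set of allowed intermediate nodes {0, ..., k}
    for k in range(n):
        for u in range(n):
            for v in range(n):
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return dist
```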
➢ Input: Matrices M_1, …, M_n, where the dimension of M_i is d_{i−1} × d_i
➢ Goal: Compute M_1 ⋅ M_2 ⋅ ⋯ ⋅ M_n using as few scalar operations as possible
➢ Matrix multiplication is associative: (A ⋅ B) ⋅ C = A ⋅ (B ⋅ C)
➢ So the product is the same in every order, and every order performs exactly n − 1 matrix multiplications — does the order even matter?
➢ Insight: yes — the time it takes to multiply two matrices depends on their dimensions
➢ We use the brute-force approach for multiplying two matrices
➢ So multiplying a p × q matrix with a q × r matrix requires p ⋅ q ⋅ r scalar operations
➢ Example: M_1 is 5 × 10, M_2 is 10 × 100, and M_3 is 100 × 50
➢ (M_1 ⋅ M_2) ⋅ M_3 requires 5 ⋅ 10 ⋅ 100 + 5 ⋅ 100 ⋅ 50 = 30,000 operations
➢ M_1 ⋅ (M_2 ⋅ M_3) requires 10 ⋅ 100 ⋅ 50 + 5 ⋅ 10 ⋅ 50 = 52,500 operations
➢ Note: our input is simply the sequence of dimensions d_0, d_1, …, d_n, and not the actual matrices
➢ Optimal substructure property:
➢ Think of the final multiplication performed, say A ⋅ B
➢ A is the product of some prefix M_1 ⋯ M_k, and B is the product of the remaining suffix M_{k+1} ⋯ M_n
➢ For the overall computation to be optimal, each of A and B must itself be computed optimally
➢ Subproblem: OPT(i, j) = minimum number of scalar operations needed to compute M_i ⋅ ⋯ ⋅ M_j
➢ Here, 1 ≤ i ≤ j ≤ n
➢ Q: Why do we not just care about prefixes and suffixes?
➢ A: Subproblems in the middle arise too — e.g., M_1 ⋅ (M_2 ⋅ M_3 ⋅ M_4) ⋅ M_5 ⇒ we need to know the optimal solution for M_2 ⋅ M_3 ⋅ M_4
➢ Bellman equation:
   OPT(i, j) = 0                                                                 if i = j
   OPT(i, j) = min{ OPT(i, k) + OPT(k+1, j) + d_{i−1} ⋅ d_k ⋅ d_j : i ≤ k < j }  if i < j
➢ Running time: O(n²) subproblems, O(n) time per subproblem ⇒ O(n³)
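A memoized sketch of this recurrence (the input is just the dimension sequence, as noted above):

```python
from functools import lru_cache

def matrix_chain(d):
    # d = [d_0, d_1, ..., d_n]; matrix M_i has dimensions d[i-1] x d[i]
    n = len(d) - 1

    @lru_cache(maxsize=None)
    def opt(i, j):
        if i == j:
            return 0  # a single matrix requires no multiplication
        # try every possible last multiplication (M_i .. M_k) * (M_{k+1} .. M_j)
        return min(opt(i, k) + opt(k + 1, j) + d[i - 1] * d[k] * d[j]
                   for k in range(i, j))

    return opt(1, n)
```

E.g., matrix_chain([5, 10, 100, 50]) returns 30000, matching the example above.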
➢ Q: Can we do better than O(n³)?
➢ Surprisingly, yes. But not by a DP algorithm (that I know of)
➢ Hu & Shing (1981) developed an O(n log n) time algorithm by reducing the problem to that of optimally triangulating a regular polygon
[Figure: example polygon triangulation; source: Wikipedia]
(This slide is not in the scope of the course.)