
SLIDE 1

Amortized Analysis

Laozi (《老子》), Chapter 58: "Misfortune is what fortune leans on; fortune is where misfortune lies hidden."

Shijing (《詩經》): "Fortune is what misfortune leans on; misfortune is where fortune lies hidden."

Amortized analysis evaluates the cost of a sequence of operations on a particular data structure as a whole, which gives a more accurate bound than charging every operation its individual worst case.

Let $OP_1, OP_2, \ldots, OP_m$ be $m$ operations. The worst-case time complexity of these $m$ operations taken together is not necessarily equal to the sum of their individual worst-case time complexities; it can be considerably smaller.

SLIDE 2

Potential Functions: an Example

Assume that each $OP_i$ ($1 \le i \le m$) consists of zero or more pops followed by one push on a stack.

$t_i$: the number of pops and pushes performed by $OP_i$.
$t_{ave}$: the average number of pops and pushes performed by each $OP_i$.

For example, let $m = 8$ with
$OP_1, OP_2$: 1 push each;
$OP_3$: 2 pops and 1 push;
$OP_4, OP_5, OP_6$: 1 push each;
$OP_7$: 2 pops and 1 push;
$OP_8$: 1 pop and 1 push.

Then $t_1 = t_2 = t_4 = t_5 = t_6 = 1$, $t_3 = t_7 = 3$, $t_8 = 2$, and $t_{ave} = 13/8 = 1.625$.
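The example can be replayed with a short Python sketch. The function name `run_ops` and the encoding of each operation by its number of pops are assumptions made for this illustration, and the stack is taken to be initially empty.

```python
def run_ops(pop_counts):
    """Simulate OP_1..OP_m on an initially empty stack.

    pop_counts[i] is the number of pops OP_{i+1} performs before its single
    push. Returns the list of costs t_i (pops plus one push).
    """
    stack = []
    costs = []
    for i, pops in enumerate(pop_counts, start=1):
        if pops > len(stack):
            raise ValueError(f"OP_{i} cannot pop {pops} of {len(stack)} items")
        for _ in range(pops):
            stack.pop()
        stack.append(i)            # the pushed value is irrelevant here
        costs.append(pops + 1)
    return costs


pop_counts = [0, 0, 2, 0, 0, 0, 2, 1]   # the m = 8 example
t = run_ops(pop_counts)
t_ave = sum(t) / len(t)
print(t, t_ave)   # [1, 1, 3, 1, 1, 1, 3, 2] 1.625
```

An operation that asks for more pops than the stack holds is rejected, which is the constraint the potential function captures later.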

SLIDE 3

Clearly, $t_{ave} < 2$, because the $m$ operations perform exactly $m$ pushes and at most $m - 1$ pops. The upper bound of 2 can also be derived with a potential function, as shown below.

$\phi_i$: the number of elements in the stack after $OP_i$, with $\phi_0 = 0$ for the initially empty stack.

($\phi_i$ denotes the potential for $OP_{i+1}$ to perform pops, i.e., the number of pops performed by $OP_{i+1}$ is bounded by $\phi_i$.)

SLIDE 4

$\phi_i = \phi_{i-1} + 1 - (t_i - 1) = \phi_{i-1} + 2 - t_i$

$t_i = 2 - (\phi_i - \phi_{i-1})$

$\sum_{i=1}^{m} t_i = 2m - \sum_{i=1}^{m} (\phi_i - \phi_{i-1}) = 2m - (\phi_m - \phi_0) = 2m - \phi_m \le 2m - 1$

$t_{ave} \le (2m - 1)/m < 2$
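The telescoping argument can be checked numerically. The sketch below is only an illustration: the function `simulate`, the random choice of pop counts, and the fixed seed are assumptions, not part of the slides.

```python
import random


def simulate(m, rng):
    """Run m random operations (some pops, then one push) on an empty stack.

    Returns (t, phi): t[i-1] is the cost of OP_i and phi[i] is the stack
    size after OP_i, with phi[0] = 0.
    """
    size = 0
    t, phi = [], [0]
    for _ in range(m):
        pops = rng.randint(0, size)    # any feasible number of pops
        size = size - pops + 1         # the pops are followed by one push
        t.append(pops + 1)
        phi.append(size)
    return t, phi


rng = random.Random(0)                 # arbitrary fixed seed
for m in range(1, 50):
    t, phi = simulate(m, rng)
    # Per-operation identity: t_i = 2 - (phi_i - phi_{i-1}).
    assert all(t[i] == 2 - (phi[i + 1] - phi[i]) for i in range(m))
    # Telescoped total: sum of t_i = 2m - phi_m <= 2m - 1.
    assert sum(t) == 2 * m - phi[m] <= 2 * m - 1
```

The second assertion holds because every operation ends with a push, so the stack is never empty after an operation.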

SLIDE 5

Amortized Analysis of Heaps

Suppose that $H$ is a heap of $n$ nodes. Let $OP_1, OP_2, \ldots, OP_m$ be $m$ operations on $H$, where each $OP_i$ ($1 \le i \le m$) is one of the following two operations:
(1) insert an element $x$ into $H$;
(2) delete the minimum from $H$.

We show below that the average time consumed by each $OP_i$ is $O(\log_2 n)$. Each $OP_i$ can be implemented by a melding operation on two heaps $H_{i,1}$ and $H_{i,2}$.

SLIDE 6

Melding $H_{i,1}$ and $H_{i,2}$:

[Figure: the heaps $H_{i,1}$ (right path 1, 10, 20) and $H_{i,2}$ (right path 5, 12, 14).]

Step 1. Merge the right paths of $H_{i,1}$ and $H_{i,2}$ (e.g., merge (1, 10, 20) and (5, 12, 14) into (1, 5, 10, 12, 14, 20)).

Step 2. Attach all left subtrees of $H_{i,1}$ and $H_{i,2}$ to the merged path.

[Figure: the heap after Step 2, with right path 1, 5, 10, 12, 14, 20.]

SLIDE 7

Step 3. Swap the left subtree and the right subtree of each node (except the lowest one) in the merged path (e.g., swap the left subtrees and the right subtrees of nodes 1, 5, 10, 12 and 14).

[Figure: the heap after Step 3; the nodes 1, 5, 10, 12, 14, 20 of the merged path are now connected by left links.]

SLIDE 8

The execution time of a melding operation is proportional to the total length of the two right paths of $H_{i,1}$ and $H_{i,2}$ (e.g., (1, 10, 20) and (5, 12, 14)). The purpose of Step 3 is to shorten the right path of the resulting heap, and hence to reduce the execution time of the next melding operation.

Insertion of $x$ into $H$ can be realized by letting $H_{i,1} = x$ (a one-node heap) and $H_{i,2} = H$.

Deletion of the minimum from $H$ can be realized by removing the root node of $H$ and letting $H_{i,1}$ ($H_{i,2}$) be the left (right) subtree of the root.
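Steps 1 to 3 and the two reductions can be sketched in Python as follows. This is a minimal illustration and not code from the slides: the `Node` class and all function names are hypothetical, and the two example heaps use one tree shape that is consistent with the right paths (1, 10, 20) and (5, 12, 14). The placement of the remaining keys is an assumption, because the figures are not available. The procedure appears to correspond to the structure usually called a skew heap, a name the slides do not use.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def right_path(h):
    """Nodes on the right path of heap h, from the root downwards."""
    path = []
    while h is not None:
        path.append(h)
        h = h.right
    return path


def meld(h1, h2):
    """Meld two heaps; the work is proportional to the two right-path lengths."""
    p1, p2 = right_path(h1), right_path(h2)
    # Step 1: merge the two right paths by key.
    merged, i, j = [], 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i].key <= p2[j].key:
            merged.append(p1[i])
            i += 1
        else:
            merged.append(p2[j])
            j += 1
    merged += p1[i:] + p2[j:]
    if not merged:
        return None
    # Step 2: relink the merged path; each node keeps its own left subtree.
    for parent, child in zip(merged, merged[1:]):
        parent.right = child
    # Step 3: swap the children of every path node except the lowest one.
    for node in merged[:-1]:
        node.left, node.right = node.right, node.left
    return merged[0]


def insert(h, x):
    return meld(Node(x), h)


def delete_min(h):
    """Return (minimum key, remaining heap); h must be non-empty."""
    return h.key, meld(h.left, h.right)


# Example heaps (shape partly assumed, see the note above).
h1 = Node(1, Node(50), Node(10, Node(13, Node(16)), Node(20, Node(25))))
h2 = Node(5, Node(19, Node(40)), Node(12, Node(30), Node(14)))
h = meld(h1, h2)

left_path = []
node = h
while node is not None:
    left_path.append(node.key)
    node = node.left
print(left_path)   # [1, 5, 10, 12, 14, 20, 25]
```

After the meld, the merged path (1, 5, 10, 12, 14, 20) runs along left links, so the right path of the result is short again.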

SLIDE 9

Heavy nodes and light nodes of $H$:

$x$: a node of $H$.
$w(x)$: the number of nodes in the subtree of $H$ that is rooted at $x$.
$p(x)$: the parent node of $x$ in $H$.

A non-root node $x$ of $H$ is heavy if $w(x) > w(p(x))/2$, and light if $w(x) \le w(p(x))/2$.

A heavy node is further referred to as a right (left) heavy node if it is the right (left) child of its parent node.

SLIDE 10

For example,

[Figure: the heap obtained after Step 2 of the melding example, with right path 1, 5, 10, 12, 14, 20.]

x     w(x)   type
1     13     (root)
5     11     right heavy
10    8      right heavy
12    5      right heavy
13    2      light
14    3      right heavy
16    1      light
19    2      light
20    2      right heavy
25    1      light
30    1      light
40    1      light
50    1      light
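The classification can be reproduced with a small Python sketch. The tree below is an assumption: it is one shape that agrees with every $w(x)$ value in the table, but the original figure is not available, so the exact positions of 16, 30, 40 and 50 are guesses. The `Node` class and the function names are likewise hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def weights(node, w):
    """Fill w[key] with the subtree size of every node; return the size."""
    if node is None:
        return 0
    w[node.key] = 1 + weights(node.left, w) + weights(node.right, w)
    return w[node.key]


def classify(root):
    """Return (w, kind): subtree sizes and 'root'/'light'/'left heavy'/'right heavy'."""
    w = {}
    weights(root, w)
    kind = {root.key: "root"}
    stack = [root]
    while stack:
        p = stack.pop()
        for side, child in (("left", p.left), ("right", p.right)):
            if child is None:
                continue
            heavy = 2 * w[child.key] > w[p.key]      # w(x) > w(p(x)) / 2
            kind[child.key] = f"{side} heavy" if heavy else "light"
            stack.append(child)
    return w, kind


# Assumed shape of the heap after Step 2 (keys are unique).
heap = Node(1, Node(50),
            Node(5, Node(19, Node(40)),
                 Node(10, Node(13, Node(16)),
                      Node(12, Node(30),
                           Node(14, None, Node(20, Node(25)))))))

w, kind = classify(heap)
for key in sorted(w):
    print(key, w[key], kind[key])
phi = sum(1 for k in kind.values() if k == "right heavy")
print(phi)   # 5
```

The five right heavy nodes are exactly the non-root nodes of the merged right path, which is what Step 3 is designed to undo.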

SLIDE 11

Let $H_i$ denote the heap obtained after $OP_i$, where $1 \le i \le m$:

$H \xrightarrow{OP_1} H_1 \xrightarrow{OP_2} H_2 \xrightarrow{OP_3} \cdots \xrightarrow{OP_m} H_m$

Assume that $H_i$ has $n_i$ nodes. Consider $H = H_0$ and $n = n_0$. Then $n_i = n_{i-1} \pm 1$ for $1 \le i \le m$.

Suppose that $H_{i-1,1}$ and $H_{i-1,2}$ are the two heaps operated on by $OP_i$.

[Figure: $OP_i$ melds $H_{i-1,1}$ and $H_{i-1,2}$ into $H_i$.]

SLIDE 12

$t_i$: the total length of the two right paths of $H_{i-1,1}$ and $H_{i-1,2}$, which is proportional to the time spent for $OP_i$.

$t_{ave}$: the average time consumed by each $OP_i$, i.e., $t_{ave} = (\sum_{i=1}^{m} t_i)/m$.

$\phi_i$: the number of right heavy nodes in $H_i$ (e.g., $\phi_i = 0$ for the heap on slide 7). $\phi_i$ is used as the potential function to evaluate $t_{ave}$.

Let $A_i = t_i + (\phi_i - \phi_{i-1})$. Then

$\sum_{i=1}^{m} A_i = \sum_{i=1}^{m} t_i + (\phi_m - \phi_0)$.

Assume that $m$ is a constant and $\phi_m - \phi_0 \ge 0$. Then

$(m \times t_{ave} =) \sum_{i=1}^{m} t_i \le \sum_{i=1}^{m} A_i$.

$\sum_{i=1}^{m} A_i$ is evaluated below.

SLIDE 13

Fact 1. Each node has at most one heavy child node.

$k_{i,1}$: the number of light nodes in the left path of $H_i$.
$k_{i,2}$: the number of right heavy nodes attached to the left path of $H_i$.

[Figure: the left path of $H_i$, with $k_{i,1}$ light nodes on the path and $k_{i,2}$ right heavy nodes attached to it.]

$k_{i,2} \le k_{i,1} + 1$

SLIDE 14

Fact 2. There are at most $\lfloor \log_2 n_i \rfloor$ light nodes in the left (right) path of $H_i$.

If $y_2, y_3, y_4, \ldots$ are light nodes, then $w(y_2) \le n_i/2$, $w(y_3) \le w(y_2)/2 \le n_i/4$, $w(y_4) \le w(y_3)/2 \le n_i/8$, ...

There are at most $\lfloor \log_2 n_i \rfloor + 1$ right heavy nodes attached to the left path of $H_i$.

SLIDE 15

Let
$n_{i-1,1}$ ($n_{i-1,2}$): the number of nodes in $H_{i-1,1}$ ($H_{i-1,2}$);
$h_{i-1,1}$ ($h_{i-1,2}$): the number of heavy nodes in the right path of $H_{i-1,1}$ ($H_{i-1,2}$);
$r_{i-1,1}$ ($r_{i-1,2}$): the number of light nodes and heavy nodes in the right path of $H_{i-1,1}$ ($H_{i-1,2}$).

$t_i = (1 + r_{i-1,1}) + (1 + r_{i-1,2})$

$r_{i-1,1} \le h_{i-1,1} + \lfloor \log_2 n_{i-1,1} \rfloor$

$r_{i-1,2} \le h_{i-1,2} + \lfloor \log_2 n_{i-1,2} \rfloor$

$t_i \le 2 + h_{i-1,1} + h_{i-1,2} + \lfloor \log_2 n_{i-1,1} \rfloor + \lfloor \log_2 n_{i-1,2} \rfloor$

$t_i < 2 + h_{i-1,1} + h_{i-1,2} + 2 \times \lfloor \log_2 n_i \rfloor$

SLIDE 16

Compared with $H_{i-1,1}$ and $H_{i-1,2}$,

(1) $H_i$ loses $h_{i-1,1} + h_{i-1,2}$ right heavy nodes (those in the right paths of $H_{i-1,1}$ and $H_{i-1,2}$), as a consequence of Step 3;

(2) $H_i$ gains at most $\lfloor \log_2 n_i \rfloor + 1$ right heavy nodes, which are attached to the left path of $H_i$, as a consequence of Step 3.

$\phi_i - \phi_{i-1} \le (\lfloor \log_2 n_i \rfloor + 1) - (h_{i-1,1} + h_{i-1,2})$

$A_i = t_i + (\phi_i - \phi_{i-1})$
$< (2 + h_{i-1,1} + h_{i-1,2} + 2 \times \lfloor \log_2 n_i \rfloor) + (\lfloor \log_2 n_i \rfloor + 1 - h_{i-1,1} - h_{i-1,2})$
$= 3 + 3 \times \lfloor \log_2 n_i \rfloor$

($n_i \le n + i \le n + m$, and $m$ is a constant)

$A_i = O(\log_2 n)$
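The bound on $A_i$ can be checked empirically. The following Python sketch is an illustration under stated assumptions: the `Node` class, the function names, the random mix of insertions and deletions, and the seed are all hypothetical, $t_i$ is measured as the number of nodes on the two right paths, and the heap is kept non-empty so that $\log_2 n_i$ is defined.

```python
import math
import random


class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def right_path(h):
    path = []
    while h is not None:
        path.append(h)
        h = h.right
    return path


def meld(h1, h2):
    """Steps 1-3; returns (melded heap, t) with t the total right-path length."""
    p1, p2 = right_path(h1), right_path(h2)
    # Step 1: both paths are sorted by key, so a stable sort merges them.
    merged = sorted(p1 + p2, key=lambda v: v.key)
    # Step 2: relink the merged path; left subtrees stay with their nodes.
    for parent, child in zip(merged, merged[1:]):
        parent.right = child
    # Step 3: swap the children of every path node except the lowest one.
    for v in merged[:-1]:
        v.left, v.right = v.right, v.left
    return (merged[0] if merged else None), len(p1) + len(p2)


def size_and_phi(h):
    """Return (number of nodes, number of right heavy nodes) of heap h."""
    if h is None:
        return 0, 0
    wl, phi_l = size_and_phi(h.left)
    wr, phi_r = size_and_phi(h.right)
    w = 1 + wl + wr
    right_heavy = 1 if 2 * wr > w else 0   # right child x with w(x) > w(p(x))/2
    return w, phi_l + phi_r + right_heavy


def amortized_costs(num_ops, seed):
    """Run random insertions and deletions; return a list of (A_i, n_i)."""
    rng = random.Random(seed)
    heap, n, phi_prev, out = None, 0, 0, []
    for _ in range(num_ops):
        if n <= 1 or rng.random() < 0.6:
            heap, t = meld(Node(rng.randrange(10**6)), heap)   # insert
        else:
            heap, t = meld(heap.left, heap.right)              # delete the minimum
        n, phi = size_and_phi(heap)
        out.append((t + phi - phi_prev, n))
        phi_prev = phi
    return out


for a_i, n_i in amortized_costs(300, seed=1):
    assert a_i <= 3 + 3 * math.floor(math.log2(n_i))
```

Recomputing the potential from scratch after every operation costs linear time; that is acceptable for a check like this one, but it is not how a real heap would be used.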

SLIDE 17

The average time $t_{ave} = (\sum_{i=1}^{m} t_i)/m$ consumed by each $OP_i$ has an upper bound of $(\sum_{i=1}^{m} A_i)/m = O(\log_2 n)$.

Exercise 15. Read Sec. 10-3 (Amortized Analysis of AVL-Trees) of the textbook and give a summary.

SLIDE 18

Amortized Analysis of a Self-Organizing Sequential Search Method

Given an initially empty list and a sequence of $m$ search queries, the move-to-the-front sequential search method moves the queried item to the front of the list whenever a search is completed.

query sequence: (B, D, A, D, D, C, A)

query   list (after the query)   # of comparisons
B       B                        0
D       D B                      1
A       A D B                    2
D       D A B                    2
D       D A B                    1
C       C D A B                  3
A       A C D B                  3
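The table can be regenerated with a few lines of Python. The sketch assumes the counting convention implied by the table: a query for an item that is not yet in the list is compared with every item present and is then placed at the front. The function name `move_to_front` is hypothetical.

```python
def move_to_front(queries):
    """Simulate move-to-the-front search on an initially empty list.

    Returns a list of (query, list after the query, comparisons) triples.
    """
    items = []
    trace = []
    for q in queries:
        if q in items:
            comparisons = items.index(q) + 1   # failed comparisons plus the hit
            items.remove(q)
        else:
            comparisons = len(items)           # every comparison fails
        items.insert(0, q)
        trace.append((q, tuple(items), comparisons))
    return trace


trace = move_to_front("BDADDCA")
for q, items, c in trace:
    print(q, " ".join(items), c)
total = sum(c for _, _, c in trace)
print(total)   # 12
```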

SLIDE 19

$C(S)$: the total number of comparisons incurred by a query sequence $S$ under the move-to-the-front method.
$C^*(S)$: the minimum number of comparisons required if all distinct items of $S$ are arranged in the list initially and the list is never reordered.

When $S$ = (B, D, A, D, D, C, A), we have $C(S) = 12$ and $C^*(S) = 14$, where the list for $C^*(S)$ is (D, A, B, C) or (D, A, C, B).

When $S$ = (A, B, C, A, B, C, A, B, C), we have $C(S) = 21$ and $C^*(S) = 18$, where the list for $C^*(S)$ can be any permutation of A, B, C.

When $S = (A, B, C)^k$, i.e., $k$ repetitions of A, B, C, we have $C(S) = 9k - 6$ and $C^*(S) = 6k$, where the list for $C^*(S)$ can be any permutation of A, B, C.

We show below that $C(S) < 2 \times C^*(S)$ for any query sequence $S$.
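The three examples can be verified by brute force. In the sketch below, `mtf_cost` computes the move-to-the-front cost from an empty list and `static_cost` tries every fixed arrangement of the distinct items; both names are hypothetical, and treating the optimal list as fixed (never reordered) is an assumption that matches the stated values.

```python
from itertools import permutations


def mtf_cost(queries):
    """Comparisons made by move-to-the-front, starting from an empty list."""
    items, total = [], 0
    for q in queries:
        if q in items:
            total += items.index(q) + 1
            items.remove(q)
        else:
            total += len(items)
        items.insert(0, q)
    return total


def static_cost(queries):
    """Minimum cost over all fixed lists holding the distinct items of queries."""
    distinct = sorted(set(queries))
    best = None
    for order in permutations(distinct):
        position = {item: i + 1 for i, item in enumerate(order)}
        cost = sum(position[q] for q in queries)
        if best is None or cost < best:
            best = cost
    return best


for s in ("BDADDCA", "ABCABCABC", "ABC" * 5):
    c, c_star = mtf_cost(s), static_cost(s)
    print(s, c, c_star)
    assert c < 2 * c_star
```

The exhaustive search grows factorially with the number of distinct items, so it is only meant for small examples like these.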

SLIDE 20

A comparison between $X$ and $Y$ is unsuccessful (successful) if $X \ne Y$ ($X = Y$).

Fact 1. (Pairwise Independent Property) Suppose that $X$ and $Y$ are two distinct items of $S$. The total number of comparisons between $X$ and $Y$ depends only on the $(X, Y)$-subsequence of $S$ and is independent of the other items. (The $(X, Y)$-subsequence of $S$ contains X's and Y's only.)

SLIDE 21

For example, let $S$ = (C, A, C, B, C, A), $X$ = A, and $Y$ = B.

query   list (after the query)   # of comparisons between A and B
C       C                        0
A       A C                      0
C       C A                      0
B       B C A                    1
C       C B A                    0
A       A C B                    1

The (A, B)-subsequence of $S$ is (A, B, A).

query   list (after the query)   # of comparisons between A and B
A       A                        0
B       B A                      1
A       A B                      1

SLIDE 22

$n_{X,Y}$: the total number of comparisons between $X$ and $Y$ ($X \ne Y$) with respect to the $(X, Y)$-subsequence of $S$.
$N^{(-)}(S)$: the total number of unsuccessful comparisons with respect to $S$.

According to Fact 1, we have

$N^{(-)}(S) = \sum_{X, Y \in S,\, X \ne Y} n_{X,Y}$,

where each unordered pair of distinct items is counted once.

For example, for $S$ = (C, A, C, B, C, A), we have $N^{(-)}(S) = n_{A,B} + n_{A,C} + n_{B,C} = 2 + 3 + 2 = 7$.
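Fact 1 and the example can be checked with a short Python sketch. The helper names `pair_counts` and `subsequence` are hypothetical. The sketch counts the unsuccessful comparisons of each unordered pair twice, once on the full sequence and once on the pair's own subsequence, and confirms that the two counts agree.

```python
from itertools import combinations


def pair_counts(queries):
    """Unsuccessful comparisons per unordered pair under move-to-the-front."""
    items, counts = [], {}
    for q in queries:
        # Items scanned before q is found (the whole list if q is new).
        scanned = items[:items.index(q)] if q in items else list(items)
        for other in scanned:
            pair = frozenset((q, other))
            counts[pair] = counts.get(pair, 0) + 1
        if q in items:
            items.remove(q)
        items.insert(0, q)
    return counts


def subsequence(queries, x, y):
    """The (x, y)-subsequence: keep only the occurrences of x and y."""
    return [q for q in queries if q in (x, y)]


S = "CACBCA"
full = pair_counts(S)
for x, y in combinations(sorted(set(S)), 2):
    pair = frozenset((x, y))
    alone = pair_counts(subsequence(S, x, y)).get(pair, 0)
    print(x, y, full.get(pair, 0), alone)
    assert full.get(pair, 0) == alone      # Fact 1
print(sum(full.values()))   # 7
```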

SLIDE 23

$N^{*(-)}(S)$: the total number of unsuccessful comparisons in $C^*(S)$.

Fact 2. $N^{(-)}(S) \le 2 \times N^{*(-)}(S)$.

Proof. Suppose that $X \ne Y$ are any two distinct items of $S$. Assume that there are $a$ X's and $b$ Y's in $S$ and $a \le b$.

When the $(X, Y)$-subsequence of $S$ is $((Y, X)^a, Y^{b-a})$, we have $n_{X,Y} = 2a$ if $a < b$, and $n_{X,Y} = 2a - 1$ if $a = b$, which is the maximum possible value of $n_{X,Y}$.

On the other hand, let $n^*_{X,Y}$ be the minimum number of comparisons between $X$ and $Y$ incurred over the two possible orders of $X$ and $Y$ in a fixed list. Then $n^*_{X,Y} = a$, which is attained when Y's precede X's in the list.

Hence $n_{X,Y} \le 2 \times n^*_{X,Y}$.
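The pairwise inequality can be checked exhaustively for short two-item sequences. This Python sketch is an illustration only: the function names are hypothetical, and the fixed-list count `min(a, b)` encodes the observation that with Y before X in a fixed list, every query for X costs exactly one failed comparison with Y (and symmetrically).

```python
from itertools import product


def mtf_unsuccessful(seq):
    """Failed comparisons made by move-to-the-front on a two-item sequence."""
    items, count = [], 0
    for q in seq:
        if q in items:
            count += items.index(q)     # items scanned before the hit
            items.remove(q)
        else:
            count += len(items)         # new item: every comparison fails
        items.insert(0, q)
    return count


def static_unsuccessful(seq):
    """Failed comparisons for the better of the two fixed orders of X and Y."""
    return min(seq.count("X"), seq.count("Y"))


for length in range(1, 13):
    for seq in product("XY", repeat=length):
        a, b = sorted((seq.count("X"), seq.count("Y")))
        n, n_star = mtf_unsuccessful(seq), static_unsuccessful(seq)
        assert n <= 2 * n_star                        # n_{X,Y} <= 2 n*_{X,Y}
        assert n <= (2 * a if a < b else 2 * a - 1)   # the stated maximum

print(mtf_unsuccessful("YXYXYXYY"))   # 6, the worst case for a = 3, b = 5
```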

SLIDE 24

$N^{(-)}(S) = \sum_{X, Y \in S,\, X \ne Y} n_{X,Y} \le 2 \times \sum_{X, Y \in S,\, X \ne Y} n^*_{X,Y} = 2 \times N^{*(-)}(S)$.

$N^{(+)}(S)$: the total number of successful comparisons with respect to $S$.
$N^{*(+)}(S)$: the total number of successful comparisons with respect to $C^*(S)$.

$N^{(+)}(S) \le |S| - 1$; $N^{*(+)}(S) = |S|$.

$C(S) = N^{(-)}(S) + N^{(+)}(S)$; $C^*(S) = N^{*(-)}(S) + N^{*(+)}(S)$.

Therefore $C(S) \le 2 \times N^{*(-)}(S) + |S| - 1 < 2 \times (N^{*(-)}(S) + |S|) = 2 \times C^*(S)$, i.e., $C(S) < 2 \times C^*(S)$. As explained below, the constant 2 is tight.

SLIDE 25

Consider the query sequence $S = (A, B, C, D)^m$, where $m \ge 1$.

query   list (after the query)   # of comparisons
A       A                        0
B       B A                      1
C       C B A                    2
D       D C B A                  3
A       A D C B                  4
B       B A D C                  4
C       C B A D                  4
D       D C B A                  4

$C(S) = (1 + 2 + 3) + 4 \times 4 \times (m - 1) = 16m - 10$.

When the items are arranged as A, B, C, D in the list, $C^*(S) = (1 + 2 + 3 + 4) + (1 + 2 + 3 + 4) + \cdots = 10m$.
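Both closed forms can be confirmed by simulation. The Python sketch below uses hypothetical helper names and the same counting convention as the table (a new item is compared with every item already in the list).

```python
def mtf_cost(queries):
    """Comparisons made by move-to-the-front, starting from an empty list."""
    items, total = [], 0
    for q in queries:
        if q in items:
            total += items.index(q) + 1
            items.remove(q)
        else:
            total += len(items)
        items.insert(0, q)
    return total


def static_cost(queries, order):
    """Comparisons made on a fixed list arranged as `order`."""
    position = {item: i + 1 for i, item in enumerate(order)}
    return sum(position[q] for q in queries)


for m in (1, 2, 5, 50, 500):
    s = "ABCD" * m
    c, c_star = mtf_cost(s), static_cost(s, "ABCD")
    assert c == 16 * m - 10
    assert c_star == 10 * m
    print(m, c, c_star, c / c_star)
```

For four items the printed ratio $(16m - 10)/(10m)$ approaches 1.6 as $m$ grows, so larger alphabets are needed to approach the constant 2.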

SLIDE 26

In general, if $S = (X_1, X_2, \ldots, X_k)^m$, where $k \ge 4$, then

$C(S) = (1 + 2 + \cdots + (k - 1)) + k \times k \times (m - 1) = \frac{k(k-1)}{2} + k^2 \times (m - 1)$

and

$C^*(S) = (1 + 2 + \cdots + k) \times m = m \times \frac{k(k+1)}{2}$.

$\frac{C(S)}{C^*(S)} \to$