CS293S: Static Single-Assignment (SSA)
Yufei Ding
Summary
Analysis   Domain        Direction   Uses
AVAIL      Expressions   Forward     GCSE
LIVEOUT    Variables     Backward    Register alloc., Detect uninit., Construct SSA, Useless-store elim.
VERYBUSY   Expressions   Backward    Hoisting
CONSTANT   Pairs <v,c>   Forward     Constant folding
Reaching Definitions
A definition d of some variable v reaches operation i if there is at least one path leading from that definition to operation i along which v is not redefined.
- REACHES(n): the set of definitions that reach the start of node n.
- DEDEF(n): the set of downward-exposed definitions in n, i.e., definitions whose variables are not redefined before leaving n.
- DEFKILL(n): all definitions killed by a definition in n.
$\mathrm{REACHES}(n_0) = \emptyset$
$\mathrm{REACHES}(n) = \bigcup_{m \in preds(n)} \left( \mathrm{DEDEF}(m) \cup \left( \mathrm{REACHES}(m) \cap \overline{\mathrm{DEFKILL}(m)} \right) \right)$
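The equations above can be solved by simple iteration to a fixed point. The following is a minimal Python sketch, not the course's implementation; the four-node diamond CFG and the definition labels d1, d2, d3 are hypothetical and chosen only for illustration.

```python
def reaching_definitions(preds, dedef, defkill):
    """Iterate the REACHES equations until nothing changes.

    preds:   node -> list of predecessor nodes
    dedef:   node -> set of downward-exposed definitions
    defkill: node -> set of definitions killed in the node
    """
    reaches = {n: set() for n in preds}          # REACHES(n) starts empty
    changed = True
    while changed:
        changed = False
        for n in preds:
            new = set()
            for m in preds[n]:
                # DEDEF(m) plus whatever reaches m and survives m
                new |= dedef[m] | (reaches[m] - defkill[m])
            if new != reaches[n]:
                reaches[n] = new
                changed = True
    return reaches


# Hypothetical CFG: n0 -> n1, n0 -> n2, n1 -> n3, n2 -> n3.
# d1 defines x in n0, d2 redefines x in n1, d3 defines y in n2.
PREDS = {"n0": [], "n1": ["n0"], "n2": ["n0"], "n3": ["n1", "n2"]}
DEDEF = {"n0": {"d1"}, "n1": {"d2"}, "n2": {"d3"}, "n3": set()}
DEFKILL = {"n0": {"d2"}, "n1": {"d1"}, "n2": set(), "n3": set()}
REACHES = reaching_definitions(PREDS, DEDEF, DEFKILL)
```

In this sketch d1 still reaches n3 through n2, even though n1 kills it, because a single surviving path is enough.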
Dominance
Recall that n dominates m iff n is on every path from n0 to m
- Every node dominates itself
- n's immediate dominator is its closest dominator except itself, IDOM(n)†
$DOM(n_0) = \{ n_0 \}$
$DOM(n) = \{ n \} \cup \left( \bigcap_{p \in preds(n)} DOM(p) \right)$
A rapid data-flow framework
Initially, DOM(n) = the set of all nodes, $\forall n \neq n_0$
†IDOM(n) ≠ n, unless n is n0, by convention.
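A minimal Python sketch of the iterative DOM solver follows. The predecessor lists for B0..B7 are an assumption: the slides show the CFG only as a figure, and this edge list was chosen to be consistent with the dominance frontiers quoted later in the deck. The helper names are hypothetical.

```python
def dominators(preds, entry):
    """Solve DOM(n) = {n} | (intersection of DOM(p) over preds of n)."""
    nodes = set(preds)
    dom = {n: set(nodes) for n in nodes}     # initially, DOM(n) = all nodes
    dom[entry] = {entry}                     # DOM(n0) = {n0}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes - {entry}):
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def immediate_dominators(dom, entry):
    """IDOM(n) is the strict dominator of n with the largest DOM set."""
    return {n: max(ds - {n}, key=lambda d: len(dom[d]))
            for n, ds in dom.items() if n != entry}


# Assumed predecessor lists for blocks B0..B7 (not given in the text).
PREDS = {0: [], 1: [0, 7], 2: [1], 3: [1], 4: [3], 5: [3], 6: [4, 5], 7: [2, 6]}
DOM = dominators(PREDS, 0)
IDOM = immediate_dominators(DOM, 0)
```

The sketch assumes every node other than the entry is reachable and has at least one predecessor. Picking the strict dominator with the largest DOM set works because the strict dominators of a node form a chain in the dominator tree.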
Example
[Figure: control flow graph for blocks B0-B7, with tables showing the progress and the results of the iterative solution for DOM]
$DOM(n) = \{ n \} \cup \left( \bigcap_{p \in preds(n)} DOM(p) \right)$
Example
[Figure: dominance tree for blocks B0-B7, with the same progress and results tables for the iterative solution for DOM]
SSA Construction
- An important technique for many data-flow and control-flow analyses
- A great example for showing the uses of data-flow analysis
- Many different data-flow problems
- SSA is a variant form of the program that encodes both data flow and control flow directly into the IR.
- Can serve as the basis for a large set of transformations
Outline
- Review of the SSA concept
- Simple way to construct a (maximal) SSA
- Better way to construct (semi-pruned) SSA
- SSA construction algorithm
Review
SSA-form
- Each name is defined exactly once
- Each use refers to exactly one name
What's hard
- Joins in the CFG are hard
Building SSA Form
- Insert φ-functions at join points
- Rename all values for uniqueness
[Figure: CFG whose blocks contain x ← 17 - 4, x ← a + b, x ← y - z, x ← 13, z ← x * q, s ← w - x]
φ-function
A φ-function is a special kind of copy that selects one of its parameters.
- We assume that all φ-functions in a block execute at the same time: order doesn't matter.
- Real machines do not implement a φ-function directly in hardware.
y1 ← ...
y2 ← ...
y3 ← φ(y1,y2)
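To make the "execute at the same time" assumption concrete, here is a small Python sketch. It follows the usual reading that the parameter selected is the one matching the CFG edge along which control entered the block; the function name, the environment dictionary, and the example values are all hypothetical.

```python
def run_phis(phis, env, edge):
    """Apply all phi-functions of one block as a single parallel copy.

    phis: list of (target, [one argument per incoming edge])
    env:  current values of names
    edge: index of the incoming CFG edge that was taken
    """
    selected = {dst: env[args[edge]] for dst, args in phis}  # read everything first
    env.update(selected)                                     # then write everything
    return env


# Hypothetical block entered along its second incoming edge (edge 1).
# The two phi-functions swap a and b, so evaluating them one after the
# other would give the wrong answer.
ENV = run_phis([("a", ["x", "b"]), ("b", ["y", "a"])],
               {"a": 1, "b": 2, "x": 10, "y": 20}, edge=1)
```

Reading all selected arguments before writing any target is what makes the order of the φ-functions irrelevant.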
SSA Construction Algorithm (High-level sketch)
1. Insert φ-functions
2. Rename values
… that's all ...
… of course, there is some bookkeeping to be done ...
SSA Construction Algorithm (Lower-level sketch)
1. Insert φ-functions at every join for every name appearing in the CFG
2. Solve reaching definitions
3. Rename each use to the def that reaches it (will be unique)
Produces "maximal" SSA
[Figure: CFG whose blocks contain x ← 17 - 4, x ← a + b, x ← y - z, x ← 13, z ← x * q, s ← w - x]
Problem
Too many φ-functions
- Precision
- Space
- Time
How to eliminate the useless φ-functions?
- Algorithms based on dominance
[Figure: CFG whose blocks contain y ← 17 - 4 (alongside x ← 17 - 4), x ← a + b, x ← y - z, x ← 13, z ← x * q, s ← w - x]
For a definition of x in a block n, it is enough to insert φ-functions for that definition in the blocks that lie just outside the region dominated by n.
[Figure: three CFG fragments over blocks A, B, C, and D, with x ← ... defined in A; p(D) denotes a predecessor of D]
- $A \in DOM(D)$: no insertion in D
- $A \notin DOM(D)$ and $A \notin DOM(p(D))$: insertion in D because of B but not A
- $A \notin DOM(D)$ and $A \in DOM(p(D))$, i.e., $D \in DF(A)$: thus, insert φ-function in D
Dominance Frontiers
- DF(n) is the fringe just beyond the region n dominates
- $m \in DF(n)$ iff $n \notin (DOM(m) - \{m\})$ but $n \in DOM(p)$ for some $p \in preds(m)$
  i.e., n doesn't strictly dominate m, but n dominates p
[Figure: control flow graph for blocks B0-B7]
Computing Dominance Frontiers
- Only join points are in DF(n) for some n
- Leads to a simple, intuitive algorithm for computing dominance frontiers
  For each join point x (i.e., |preds(x)| > 1)
    For each CFG predecessor p of x
      Walk from p up the dominator tree toward IDOM(x), adding x to DF(y) for each node y on the walk, excluding IDOM(x)
[Figure: dominance tree and control flow graph for blocks B0-B7]
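The walk described above can be sketched in a few lines of Python. The predecessor lists and the IDOM table below are assumptions (the slides give them only as figures); they were chosen so that the result agrees with the DF sets quoted in the deck's example, such as DF(4) = {6} and DF(7) = {1}.

```python
def dominance_frontiers(preds, idom):
    """For each join point x, walk from each predecessor up to IDOM(x)."""
    df = {n: set() for n in preds}
    for x, ps in preds.items():
        if len(ps) > 1:                      # only join points appear in any DF
            for p in ps:
                runner = p
                while runner != idom[x]:     # stop before IDOM(x)
                    df[runner].add(x)        # x is in DF(runner)
                    runner = idom[runner]
    return df


# Assumed CFG predecessors and immediate dominators for B0..B7.
PREDS = {0: [], 1: [0, 7], 2: [1], 3: [1], 4: [3], 5: [3], 6: [4, 5], 7: [2, 6]}
IDOM = {1: 0, 2: 1, 3: 1, 4: 3, 5: 3, 6: 3, 7: 1}
DF = dominance_frontiers(PREDS, IDOM)
```

The entry block needs no IDOM entry here, because every walk reaches IDOM(x) before it would step past the entry.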
Example
[Figure: control flow graph for blocks B0-B7, with x ← ... in B4 and x ← φ(...) inserted in B6, B7, and B1]
- DF(4) is {6}, so ← in 4 forces φ-function in 6
- ← in 6 forces φ-function in DF(6) = {7}
- ← in 7 forces φ-function in DF(7) = {1}
- ← in 1 forces φ-function in DF(1) = {1} (halt)
SSA Construction Algorithm
1. Insert φ-functions
   a.) calculate dominance frontiers
   b.) find global names (names appearing in more than one block);
       for each name, build a list of blocks that define it
   c.) insert φ-functions
       for each global name n
         for each block b in which n is assigned
           for each block d in b's dominance frontier
             insert a φ-function for n in d
             add d to n's list of defining blocks
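Step c.) is a worklist computation, which the following Python sketch illustrates. The DF table is the one implied by the deck's example, and the defining blocks per name are read off the example figure; the function name and data layout are hypothetical.

```python
def place_phis(df, def_blocks):
    """Return, for each global name, the set of blocks that need a phi-function."""
    phis = {}
    for name, blocks in def_blocks.items():
        placed = set()            # blocks that already hold a phi for this name
        seen = set(blocks)        # blocks already on n's list of defining blocks
        work = list(blocks)
        while work:
            b = work.pop()
            for d in df[b]:
                placed.add(d)     # insert a phi-function for name in d
                if d not in seen: # the phi is itself a new definition in d
                    seen.add(d)
                    work.append(d)
        phis[name] = placed
    return phis


DF = {0: set(), 1: {1}, 2: {7}, 3: {7}, 4: {6}, 5: {6}, 6: {7}, 7: {1}}
# Blocks that assign each global name, read off the example figure.
DEF_BLOCKS = {"a": {1, 3}, "b": {2, 6}, "c": {1, 2, 5}, "d": {2, 3, 4}, "i": {0, 7}}
PHIS = place_phis(DF, DEF_BLOCKS)
```

The result matches the example figure: B1 gets φ-functions for a, b, c, d, and i; B6 for c and d; B7 for a, b, c, and d.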
Example
With all the φ-functions
- Lots of new ops
- Renaming is next
Excluding local names avoids φ's for y & z
Assume a, b, c, & d defined before B0
B0: i ← •••
B1: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); i ← φ(i,i); a ← •••; c ← •••
B2: b ← •••; c ← •••; d ← •••
B3: a ← •••; d ← •••
B4: d ← •••
B5: c ← •••
B6: d ← φ(d,d); c ← φ(c,c); b ← •••
B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge label: i > 100]
SSA Construction Algorithm (Details)
2. Rename variables in a pre-order walk over the dominator tree
   (use an array of stacks, one stack per global name, and one counter per name for subscripts)
   Starting with the root block, b
   a.) generate unique names for each φ-function and push them on the appropriate stacks
   b.) rewrite each operation in the block
       i. Rewrite uses of global names with the current version (from the stack)
       ii. Rewrite definition by inventing & pushing new name
   c.) fill in φ-function parameters of successor blocks
   d.) recurse on b's children in the dominance tree
   e.) <on exit from b> pop names generated in b from stacks (reset the state)
SSA Construction Algorithm (Details)
Adding all the details ...

for each global name i
    counter[i] ← 0
    stack[i] ← ∅
call Rename(n0)

NewName(n)
    i ← counter[n]
    counter[n] ← counter[n] + 1
    push n_i onto stack[n]
    return n_i

Rename(b)
    for each φ-function in b, x ← φ(…)
        rename x as NewName(x)
    for each operation "x ← y op z" in b
        rewrite y as top(stack[y])
        rewrite z as top(stack[z])
        rewrite x as NewName(x)
    for each successor of b in the CFG
        rewrite appropriate φ parameters
    for each successor s of b in dom. tree
        Rename(s)
    for each operation "x ← y op z" and each φ-function "x ← φ(…)" in b
        pop(stack[x])
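The renaming pass can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the dictionary-based instruction format and the helpers `op` and `phi` are hypothetical, the CFG edges are assumed (the slides show them only as a figure), and the initial counters and stacks mirror the deck's example, where a, b, c, and d are defined before B0.

```python
def op(var, *uses):
    return {"var": var, "dst": var, "uses": list(uses), "phi": False}

def phi(var):
    return {"var": var, "dst": var, "uses": [var, var], "phi": True}

def rename(blocks, succs, preds, dom_children, entry, counter, stack):
    """Rename in place; counter and stack are keyed by global name."""
    def new_name(n):
        i = counter[n]
        counter[n] += 1
        stack[n].append(f"{n}{i}")
        return stack[n][-1]

    def visit(b):
        pushed = []                              # base names defined in b
        for ins in blocks[b]:
            if not ins["phi"]:                   # phi arguments come from predecessors
                ins["uses"] = [stack[u][-1] if u in stack else u
                               for u in ins["uses"]]
            if ins["var"] in stack:              # only global names get versions
                ins["dst"] = new_name(ins["var"])
                pushed.append(ins["var"])
        for s in succs[b]:
            slot = preds[s].index(b)             # phi argument for edge b -> s
            for ins in blocks[s]:
                if ins["phi"]:
                    ins["uses"][slot] = stack[ins["var"]][-1]
        for child in dom_children[b]:
            visit(child)
        for n in pushed:                         # reset the state on exit from b
            stack[n].pop()

    visit(entry)

BLOCKS = {
    0: [op("i")],
    1: [phi("a"), phi("b"), phi("c"), phi("d"), phi("i"), op("a"), op("c")],
    2: [op("b"), op("c"), op("d")],
    3: [op("a"), op("d")],
    4: [op("d")],
    5: [op("c")],
    6: [phi("d"), phi("c"), op("b")],
    7: [phi("a"), phi("b"), phi("c"), phi("d"),
        op("y", "a", "b"), op("z", "c", "d"), op("i", "i")],
}
# Assumed CFG edges and dominator-tree children for B0..B7.
SUCCS = {0: [1], 1: [2, 3], 2: [7], 3: [4, 5], 4: [6], 5: [6], 6: [7], 7: [1]}
PREDS = {0: [], 1: [0, 7], 2: [1], 3: [1], 4: [3], 5: [3], 6: [4, 5], 7: [2, 6]}
DOM_CHILDREN = {0: [1], 1: [2, 3, 7], 2: [], 3: [4, 5, 6], 4: [], 5: [], 6: [], 7: []}
COUNTER = {"a": 1, "b": 1, "c": 1, "d": 1, "i": 0}
STACK = {"a": ["a0"], "b": ["b0"], "c": ["c0"], "d": ["d0"], "i": []}
rename(BLOCKS, SUCCS, PREDS, DOM_CHILDREN, 0, COUNTER, STACK)
```

Under these assumptions the sketch reproduces the names in the deck's walkthrough, for example a4 ← φ(a2,a3) in B7 and i1 ← φ(i0,i2) in B1. A use of a global name with an empty stack (a use before any definition) would raise an error in this sketch; the example never triggers that case.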
Example
Before processing B0
Assume a, b, c, & d defined before B0; i has not been defined
B0: i ← •••
B1: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); i ← φ(i,i); a ← •••; c ← •••
B2: b ← •••; c ← •••; d ← •••
B3: a ← •••; d ← •••
B4: d ← •••
B5: c ← •••
B6: d ← φ(d,d); c ← φ(c,c); b ← •••
B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge label: i > 100]
Counters: a = 1, b = 1, c = 1, d = 1, i = 0
Stacks (bottom to top): a: a0; b: b0; c: c0; d: d0; i: empty
Example
End of B0
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a ← φ(a0,a); b ← φ(b0,b); c ← φ(c0,c); d ← φ(d0,d); i ← φ(i0,i); a ← •••; c ← •••
B2: b ← •••; c ← •••; d ← •••
B3: a ← •••; d ← •••
B4: d ← •••
B5: c ← •••
B6: d ← φ(d,d); c ← φ(c,c); b ← •••
B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge labels: i > 100, i ≤ 100]
Counters: a = 1, b = 1, c = 1, d = 1, i = 1
Stacks (bottom to top): a: a0; b: b0; c: c0; d: d0; i: i0
Example
End of B1
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b ← •••; c ← •••; d ← •••
B3: a ← •••; d ← •••
B4: d ← •••
B5: c ← •••
B6: d ← φ(d,d); c ← φ(c,c); b ← •••
B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge labels: i > 100, i ≤ 100]
Counters: a = 3, b = 2, c = 3, d = 2, i = 2
Stacks (bottom to top): a: a0 a1 a2; b: b0 b1; c: c0 c1 c2; d: d0 d1; i: i0 i1
Example
End of B2
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a ← •••; d ← •••
B4: d ← •••
B5: c ← •••
B6: d ← φ(d,d); c ← φ(c,c); b ← •••
B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge labels: i > 100, i ≤ 100]
Counters: a = 3, b = 3, c = 4, d = 3, i = 2
Stacks (bottom to top): a: a0 a1 a2; b: b0 b1 b2; c: c0 c1 c2 c3; d: d0 d1 d2; i: i0 i1
Example
Before starting B3
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a ← •••; d ← •••
B4: d ← •••
B5: c ← •••
B6: d ← φ(d,d); c ← φ(c,c); b ← •••
B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge labels: i > 100, i ≤ 100]
Counters: a = 3, b = 3, c = 4, d = 3, i = 2
Stacks (bottom to top): a: a0 a1 a2; b: b0 b1; c: c0 c1 c2; d: d0 d1; i: i0 i1
Example
End of B3
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a3 ← •••; d3 ← •••
B4: d ← •••
B5: c ← •••
B6: d ← φ(d,d); c ← φ(c,c); b ← •••
B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge label: i > 100]
Counters: a = 4, b = 3, c = 4, d = 4, i = 2
Stacks (bottom to top): a: a0 a1 a2 a3; b: b0 b1; c: c0 c1 c2; d: d0 d1 d3; i: i0 i1
Example
End of B4
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a3 ← •••; d3 ← •••
B4: d4 ← •••
B5: c ← •••
B6: d ← φ(d4,d); c ← φ(c2,c); b ← •••
B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge label: i > 100]
Counters: a = 4, b = 3, c = 4, d = 5, i = 2
Stacks (bottom to top): a: a0 a1 a2 a3; b: b0 b1; c: c0 c1 c2; d: d0 d1 d3 d4; i: i0 i1
Example
End of B5
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a3 ← •••; d3 ← •••
B4: d4 ← •••
B5: c4 ← •••
B6: d ← φ(d4,d3); c ← φ(c2,c4); b ← •••
B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1
[CFG edge label: i > 100]
Counters: a = 4, b = 3, c = 5, d = 5, i = 2
Stacks (bottom to top): a: a0 a1 a2 a3; b: b0 b1; c: c0 c1 c2 c4; d: d0 d1 d3; i: i0 i1
Example
End of B6
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a3 ← •••; d3 ← •••
B4: d4 ← •••
B5: c4 ← •••
B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← •••
B7: a ← φ(a2,a3); b ← φ(b2,b3); c ← φ(c3,c5); d ← φ(d2,d5); y ← a+b; z ← c+d; i ← i+1
[CFG edge label: i > 100]
Counters: a = 4, b = 4, c = 6, d = 6, i = 2
Stacks (bottom to top): a: a0 a1 a2 a3; b: b0 b1 b3; c: c0 c1 c2 c5; d: d0 d1 d3 d5; i: i0 i1
Example
Before B7
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a3 ← •••; d3 ← •••
B4: d4 ← •••
B5: c4 ← •••
B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← •••
B7: a ← φ(a2,a3); b ← φ(b2,b3); c ← φ(c3,c5); d ← φ(d2,d5); y ← a+b; z ← c+d; i ← i+1
[CFG edge label: i > 100]
Counters: a = 4, b = 4, c = 6, d = 6, i = 2
Stacks (bottom to top): a: a0 a1 a2; b: b0 b1; c: c0 c1 c2; d: d0 d1; i: i0 i1
Example
End of B7
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a4); b1 ← φ(b0,b4); c1 ← φ(c0,c6); d1 ← φ(d0,d6); i1 ← φ(i0,i2); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a3 ← •••; d3 ← •••
B4: d4 ← •••
B5: c4 ← •••
B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← •••
B7: a4 ← φ(a2,a3); b4 ← φ(b2,b3); c6 ← φ(c3,c5); d6 ← φ(d2,d5); y ← a4+b4; z ← c6+d6; i2 ← i1+1
[CFG edge label: i > 100]
Counters: a = 5, b = 5, c = 7, d = 7, i = 3
Stacks (bottom to top): a: a0 a1 a2 a4; b: b0 b1 b4; c: c0 c1 c2 c6; d: d0 d1 d6; i: i0 i1 i2
Example
After renaming
- Semi-pruned SSA form
- We're done …
Semi-pruned ⇒ only names appearing in 2 or more blocks are "global names".
Assume a, b, c, & d defined before B0
B0: i0 ← •••
B1: a1 ← φ(a0,a4); b1 ← φ(b0,b4); c1 ← φ(c0,c6); d1 ← φ(d0,d6); i1 ← φ(i0,i2); a2 ← •••; c2 ← •••
B2: b2 ← •••; c3 ← •••; d2 ← •••
B3: a3 ← •••; d3 ← •••
B4: d4 ← •••
B5: c4 ← •••
B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← •••
B7: a4 ← φ(a2,a3); b4 ← φ(b2,b3); c6 ← φ(c3,c5); d6 ← φ(d2,d5); y ← a4+b4; z ← c6+d6; i2 ← i1+1
[CFG edge label: i > 100]
SSA Construction Algorithm (Pruned SSA)
Semi-pruned SSA: discard names used in only one block
- Significant reduction in total number of φ-functions
- Needs only local Live (appearance) information (cheap to compute)
Pruned SSA: only insert φ-functions where their value is live
- Inserts even fewer φ-functions, but costs more to do
- Requires global Live variable analysis (more expensive)
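The semi-pruned filter described above needs only a per-block list of names. A minimal Python sketch, using the deck's rule that a global name is one appearing in two or more blocks; the function name is hypothetical and the per-block name sets are read off the example figure.

```python
def global_names(names_in_block):
    """Return the names that appear in more than one block."""
    where = {}
    for block, names in names_in_block.items():
        for n in names:
            where.setdefault(n, set()).add(block)
    return {n for n, blocks in where.items() if len(blocks) > 1}


# Names appearing in each block of the B0..B7 example.
NAMES = {
    0: {"i"},
    1: {"a", "c"},
    2: {"b", "c", "d"},
    3: {"a", "d"},
    4: {"d"},
    5: {"c"},
    6: {"b"},
    7: {"a", "b", "c", "d", "y", "z", "i"},
}
GLOBALS = global_names(NAMES)
```

Here y and z appear only in B7, so they are local and receive no φ-functions, as in the example.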
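SSA Deconstruction
At some point, we need executable code
- No machines implement φ operations
- Need to fix up the flow of values
Basic idea
- Insert copies in φ-function predecessors
- Adds lots of copies
- Most of them coalesce away
[Figure: before, a join block containing x17 ← φ(x10,x11) followed by uses ... ← x17; after, one predecessor ends with x17 ← x10, the other with x17 ← x11, and the join block keeps the uses ... ← x17]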
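The basic idea can be sketched in Python as follows. This is a deliberately naive sketch: the tuple-based instruction format is hypothetical, copies are simply appended to each predecessor (a real compiler would place them before the terminating branch), and the known complications of copy insertion, such as critical edges and the lost-copy and swap problems, are ignored.

```python
def remove_phis(blocks, preds):
    """Replace each phi-function with one copy per predecessor block.

    blocks: label -> list of (dst, opcode, args)
    preds:  label -> list of predecessor labels, in phi-argument order
    """
    for b in list(blocks):
        phis = [ins for ins in blocks[b] if ins[1] == "phi"]
        for dst, _, args in phis:
            for p, arg in zip(preds[b], args):
                blocks[p].append((dst, "copy", [arg]))   # dst <- arg in predecessor
        blocks[b] = [ins for ins in blocks[b] if ins[1] != "phi"]
    return blocks


# The slide's example: x17 <- phi(x10, x11) in a join block J with
# hypothetical predecessor labels P1 and P2.
BLOCKS = {
    "P1": [],
    "P2": [],
    "J": [("x17", "phi", ["x10", "x11"]), ("t", "use", ["x17"])],
}
PREDS = {"P1": [], "P2": [], "J": ["P1", "P2"]}
remove_phis(BLOCKS, PREDS)
```

After the call, P1 holds x17 ← x10, P2 holds x17 ← x11, and J keeps only the use of x17, which matches the figure on the slide.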