GEOMETRIC STEINER TREE PACKING WITH DENSITY CONSTRAINTS
Nicolai Hähnle, Pietro Saccardi
Aussois Combinatorial Optimization Workshop, January 12, 2017
Research Institute for Discrete Mathematics, Bonn
OVERVIEW
- Motivation: global routing in chip design
- Traditional Steiner tree packing in grid graphs
- Geometric Steiner tree packing on rhomboids
- Shortest paths and Steiner trees on rhomboids
- Experimental results
MOTIVATION: GLOBAL ROUTING IN CHIP DESIGN
HOW IS A CHIP BUILT?
“Zoom Into a Microchip” by NISENet, available under a CC BY license.
WIRE DENSITY IS EXTREMELY HIGH
Tightly packed wires on a chip routed by BonnRoute; zoom is 1000×. Only signal wires are shown (power grid is omitted). Wires are colored by routing layer.
FACTS AND FIGURES
Chip structure
- A chip has a layered 3D structure.
- Transistors are placed on the lowest layers.
- Up to 16 routing layers for interconnects.
Chip size
A chip with an area of a few mm² packs:
- several billion transistors
- several meters of wire (a few nm thick)
- several million connections
ROUTING IN CHIP DESIGN
Assumptions
- The chip functionality is decomposed into elementary Boolean operations.
- Logic gates (Boolean operations) are implemented at a transistor level.
- All the logic gates are already placed on the chip.
- All the connections (nets) between logic gates are known.
Goal
Compute connections for each net in such a way that:
- different nets are disjoint;
- design constraints are satisfied;
- timing closure is achieved;
- various objectives (wire length, power consumption, yield, …) are optimized.
MODELING ROUTING
Constraints due to fabrication process: huge restriction on wiring.
- Axis-parallel wires.
- Monodirectional layers.
- Possible wire locations are discretized (tracks).
Constraints due to problem hardness: Steiner tree problem in 3D grid graphs with 10^9 vertices!
- The problem is split into global and detailed routing:
  - Global routing ignores disjointness and design rule compliance. It globally optimizes objectives such as packing density, length and yield.
  - Detailed routing explores a subset of the routing graph. It handles disjointness and design rule compliance.
TRADITIONAL STEINER TREE PACKING IN GRID GRAPHS
TRADITIONAL APPROACH TO GLOBAL ROUTING
Coarsening the routing graph → global routing graph.
Problem (Simplified Global Routing)
Input: Graph $G'$, capacities $u : E(G') \to \mathbb{Z}_{\ge 0}$, lengths $l : E(G') \to \mathbb{R}_{\ge 0}$, nets $N$, where each $n \in N$ satisfies $\emptyset \ne n \subseteq V(G')$, and wire widths $w : N \times E(G') \to \mathbb{R}_{\ge 0}$.
Task: Find Steiner trees $Y_n$ ($n \in N$) minimizing
$$\sum_{n \in N} \sum_{e \in E(Y_n)} l(e)$$
and meeting the capacity constraints
$$\sum_{n \in N :\, e \in E(Y_n)} w(n, e) \le u(e) \qquad \forall e \in E(G').$$
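The objective and the capacity constraints above can be checked for a given set of trees as in the following sketch. The data layout (dictionaries keyed by net and edge names) and the function names `total_length` and `overloaded_edges` are illustrative assumptions, not part of the talk.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Hashable
Net = Hashable


def total_length(trees: Dict[Net, List[Edge]], length: Dict[Edge, float]) -> float:
    """Objective: sum over all nets n and all edges e of Y_n of l(e)."""
    return sum(length[e] for edges in trees.values() for e in edges)


def overloaded_edges(
    trees: Dict[Net, List[Edge]],
    width: Dict[Tuple[Net, Edge], float],
    capacity: Dict[Edge, float],
) -> Dict[Edge, float]:
    """Return every edge whose load sum_n w(n, e) exceeds u(e), with its load."""
    load: Dict[Edge, float] = {}
    for net, edges in trees.items():
        for e in edges:
            load[e] = load.get(e, 0.0) + width[(net, e)]
    return {e: used for e, used in load.items() if used > capacity[e]}


# Toy instance (hypothetical): two nets sharing edge "ab".
trees = {"n1": ["ab", "bc"], "n2": ["ab"]}
length = {"ab": 1.0, "bc": 2.0}
width = {("n1", "ab"): 1.0, ("n1", "bc"): 1.0, ("n2", "ab"): 1.0}
capacity = {"ab": 1, "bc": 2}
print(total_length(trees, length), overloaded_edges(trees, width, capacity))
```

In this toy instance edge "ab" carries two unit-width wires but has capacity 1, so the pair of trees is infeasible even though each tree alone is fine.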
MIN-MAX RESOURCE SHARING
Problem (Min-Max Resource Sharing)
Input:
- $R$: finite set of resources (edges) of finite capacity
- $C$: finite set of customers (nets)
- $B_c$: set of solutions for $c \in C$ (Steiner trees)
- Resource consumption functions $\mathrm{usg}_c : B_c \to \mathbb{R}^R_{\ge 0}$
- $\sigma$-approximate oracle functions $f_c : \mathbb{R}^R_{\ge 0} \to B_c$
Task: Find $b_c \in B_c$ ($c \in C$) attaining
$$\lambda^* := \inf \Big\{ \max_{r \in R} \sum_{c \in C} (\mathrm{usg}_c(b_c))_r \;\Big|\; b_c \in B_c,\ c \in C \Big\}.$$
ALGORITHM OUTLINE: MIN-MAX RESOURCE SHARING
- The algorithm maintains resource prices (initially set to 1).
- It proceeds in phases. In each phase:
  - For every net, find an (approximately) cheapest route w.r.t. the current resource prices.
  - Update the prices of the used resources; prices grow multiplicatively with load.
- Return a convex combination of Steiner trees for each net.
Theorem (D. Müller, K. Radke, J. Vygen, 2011)
Let $\omega > 0$. A $\sigma(1 + \omega)$-approximate fractional solution can be computed in time $O(\theta \log|R| \, (|C| + |R|)(\log\log|R| + \omega^{-2}))$, where $\theta$ is the time for an oracle call.
IP techniques [T. H. Wu, A. Davoodi, and J. T. Linderoth, 2011] could be used but are impractical for our instances.
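The price-update loop outlined above might look roughly like the following sketch. It assumes each customer has a small explicit list of candidate solutions (so the oracle is an exact minimum, i.e. σ = 1) and uses an exponential price update with a hypothetical step size `eps`. This is a simplified multiplicative-weights scheme for illustration, not the algorithm analyzed in the theorem.

```python
import math
from typing import Dict, Hashable, List

Usage = Dict[Hashable, float]  # resource -> consumption of one solution


def resource_sharing(
    solutions: Dict[Hashable, List[Usage]],
    resources: List[Hashable],
    phases: int = 20,
    eps: float = 0.5,  # assumed step size of the price update
) -> Dict[Hashable, List[float]]:
    """Return, per customer, convex weights over its candidate solutions."""
    prices = {r: 1.0 for r in resources}
    counts = {c: [0] * len(sols) for c, sols in solutions.items()}
    for _ in range(phases):
        for c, sols in solutions.items():
            # Oracle: cheapest candidate w.r.t. the current prices.
            costs = [sum(prices[r] * u for r, u in b.items()) for b in sols]
            i = costs.index(min(costs))
            counts[c][i] += 1
            # Prices of the used resources grow multiplicatively with load.
            for r, u in sols[i].items():
                prices[r] *= math.exp(eps * u)
    return {c: [k / phases for k in ks] for c, ks in counts.items()}


def max_congestion(solutions, resources, weights) -> float:
    """lambda = max over resources of the total (fractional) consumption."""
    load = {r: 0.0 for r in resources}
    for c, sols in solutions.items():
        for x, b in zip(weights[c], sols):
            for r, u in b.items():
                load[r] += x * u
    return max(load.values())


# Toy instance (hypothetical): two nets, each may use resource r1 or r2.
resources = ["r1", "r2"]
solutions = {"A": [{"r1": 1.0}, {"r2": 1.0}], "B": [{"r1": 1.0}, {"r2": 1.0}]}
weights = resource_sharing(solutions, resources)
print(weights, max_congestion(solutions, resources, weights))
```

In the toy instance the rising price of `r1` pushes the second net onto `r2`, so the two nets never share a resource and the maximum congestion stays at 1.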
POTENTIAL DRAWBACKS
- Poor topologies.
- Does not connect to pin shapes (important for signal delay estimation).
- Depends heavily on the choice of the grid.
- Does not support input wires.
GEOMETRIC STEINER TREE PACKING ON RHOMBOIDS
RHOMBOIDAL TILES
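Definition
Tile set $\mathcal{T}$, layers $\mathcal{L}$, chip area $\square$. Each $T_i \in \mathcal{T}$ is an $\ell_1$-ball of unit radius, with $\mathring{T}_i \cap \mathring{T}_j = \emptyset$ for $i \ne j$ and $\square \subseteq \bigcup_{T \in \mathcal{T}} T$.
Given a tile price function $c : \mathcal{T} \to \mathbb{R}_{\ge 0}$, we define the cost of a segment $s$ as
$$c(s) := \sum_{T \in \mathcal{T}} c(T)\, \ell(s \cap T).$$
The definition can be extended to rectilinear paths and graphs embedded in the plane.
[Figure: two adjacent tiles $T_i$ and $T_j$.]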
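As a rough illustration of the segment cost, the sketch below evaluates c(s) for a horizontal segment. It assumes the tiles are centred at the integer points with even coordinate sum (one possible tiling by unit ℓ1-balls; the talk does not fix the offsets), and the names `tile_of` and `horizontal_segment_cost` are hypothetical.

```python
import math
from typing import Callable, Tuple

Tile = Tuple[int, int]  # tile centre (cx, cy); assumed layout: cx + cy even


def tile_of(x: float, y: float) -> Tile:
    """Centre of the unit l1-ball that contains the point (x, y)."""
    # In rotated coordinates u = x + y, v = x - y the tiles become the
    # axis-parallel squares [uc - 1, uc + 1] x [vc - 1, vc + 1], uc, vc even.
    uc = 2 * math.floor((x + y + 1) / 2)
    vc = 2 * math.floor((x - y + 1) / 2)
    return ((uc + vc) // 2, (uc - vc) // 2)


def horizontal_segment_cost(
    x1: float, x2: float, y: float, price: Callable[[Tile], float]
) -> float:
    """c(s) = sum over tiles T of price(T) * length(s intersected with T)."""
    if x1 > x2:
        x1, x2 = x2, x1
    cuts = {x1, x2}
    # The segment changes tile where x + y or x - y crosses an odd integer.
    for offset in (-y, y):
        k = math.ceil((x1 - offset - 1) / 2)
        while 2 * k + 1 + offset < x2:
            cuts.add(max(x1, 2 * k + 1 + offset))
            k += 1
    xs = sorted(cuts)
    # Between consecutive cuts the tile is constant: sample it at the midpoint.
    return sum(price(tile_of((a + b) / 2, y)) * (b - a) for a, b in zip(xs, xs[1:]))


# Toy prices (hypothetical): tile (0, 0) is twice as expensive as tile (2, 0).
prices = {(0, 0): 2.0, (2, 0): 1.0}
print(horizontal_segment_cost(0.0, 3.0, 0.0, lambda t: prices.get(t, 0.0)))
```

The segment from (0, 0) to (3, 0) spends length 1 in tile (0, 0) and length 2 in tile (2, 0), so its cost is 2·1 + 1·2 = 4. Vertical segments can be handled symmetrically by swapping the roles of x and y.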
WHY RHOMBOIDAL TILES?
Rhomboids (ℓ1-balls) are simple objects!
Let c(s) be the cost of a segment s. Then c(s) is (w.r.t. the coordinates of s):
- 1. continuous and piecewise linear;
- 2. smooth, unless s intersects some tile’s vertex or its endpoints lie on some tile’s boundary.
⇓
If we drag a segment or its endpoints, the change in cost is described by a linear function, as long as the condition in 2 is respected.
RHOMBOIDAL TILES AS RESOURCES
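$R := \mathcal{T} \times \mathcal{L}$ (rhomboidal tiles × layers). Let $Y_n$ denote a route for $n \in N$.
Definition
Let $T \in \mathcal{T} \times \mathcal{L}$ be a tile on layer $l_T$, and let $w(n, l)$ be the wire width of net $n$, in tracks, on layer $l$.
Capacity: $u(T)$ := total length of free tracks.
Consumption: $(\mathrm{usg}_n(Y_n))_T := \sum_{s \in E(Y_n)} \frac{w(n, l_T)\, \ell(s \cap T)}{u(T)}$.
Congestion: $\mathrm{cong}(T) := \sum_{n \in N} (\mathrm{usg}_n(Y_n))_T$.
Given a resource price vector $y \in \mathbb{R}^R_{\ge 0}$, if we set $c(T) := \frac{w(n, l_T)\, y_T}{u(T)}$, then
$$y^\top \mathrm{usg}_n(Y_n) = c(E(Y_n)).$$
Can we compute a route of (approximately) minimal cost $c(\cdot)$?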
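The identity between priced consumption and route cost follows by exchanging the two sums. Spelled out with the definitions of consumption and of the segment cost $c(s) = \sum_{T} c(T)\,\ell(s \cap T)$, a short derivation reads:

```latex
\begin{align*}
y^\top \mathrm{usg}_n(Y_n)
  &= \sum_{T \in R} y_T \sum_{s \in E(Y_n)} \frac{w(n, l_T)\,\ell(s \cap T)}{u(T)} \\
  &= \sum_{s \in E(Y_n)} \sum_{T \in R} \frac{w(n, l_T)\, y_T}{u(T)}\,\ell(s \cap T) \\
  &= \sum_{s \in E(Y_n)} \sum_{T \in R} c(T)\,\ell(s \cap T)
   = \sum_{s \in E(Y_n)} c(s)
   = c(E(Y_n)).
\end{align*}
```

So the oracle of the resource sharing framework only has to minimize a geometric cost over the priced tiles.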
SHORTEST PATHS AND STEINER TREES ON RHOMBOIDS
GRID GRAPHS FROM TWO ENDPOINTS
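- $G_{\mathcal{T}}^{(0)}$: grid graph induced by the tiles’ centres.
- $G_{\mathcal{T}}^{(1)}$: adds grid lines through the two endpoints $s$ and $t$.
- $G_{\mathcal{T},s,t}^{(2)}$: recursively adds grid lines at the intersections with the tiles’ borders.
The number of edges is $4(|\Lambda| - 1)^2$ (graph size grows fast!).
Definition ($G_{\mathcal{T},\Lambda}^{(2)}$)
Given a finite set $\Lambda \subset [0, 1]$ with $0, 1 \in \Lambda$, let
$$X := 2\mathbb{Z} + \{\lambda,\ 2 - \lambda : \lambda \in \Lambda\}, \qquad Y := 2\mathbb{Z} + \{1 - \lambda,\ \lambda - 1 : \lambda \in \Lambda\}.$$
Define $V(G_{\mathcal{T}}^{(2)}) := \square \cap (X \times Y)$ and add to $E(G_{\mathcal{T}}^{(2)})$ an edge for every grid segment connecting two vertices.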
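The coordinate sets X and Y are periodic with period 2, so it is enough to look at their residues modulo 2. The sketch below computes these residues and counts edges per tile. Reading the count 4(|Λ| − 1)² as "edges per tile" is an assumption on my part; the function names are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, List, Tuple


def residues(lambdas: Iterable) -> Tuple[List[Fraction], List[Fraction]]:
    """Distinct offsets of X and Y modulo 2 for a given set Lambda."""
    lams = [Fraction(lam) for lam in lambdas]
    xs = {v % 2 for lam in lams for v in (lam, 2 - lam)}
    ys = {v % 2 for lam in lams for v in (1 - lam, lam - 1)}
    return sorted(xs), sorted(ys)


def edges_per_tile(lambdas: Iterable) -> int:
    xs, ys = residues(lambdas)
    # One 2 x 2 period of the plane holds len(xs) * len(ys) grid vertices;
    # on the infinite grid each vertex owns one edge to the right and one upwards.
    edges_per_period = 2 * len(xs) * len(ys)
    # A unit l1-ball has area 2, so one 2 x 2 period covers two tiles.
    return edges_per_period // 2


for lam_set in ([0, 1], [0, Fraction(1, 3), 1], [0, Fraction(1, 4), Fraction(1, 2), 1]):
    print(len(lam_set), edges_per_tile(lam_set), 4 * (len(lam_set) - 1) ** 2)
```

Under this reading the printed counts agree with 4(|Λ| − 1)² for each of the three example sets, which illustrates how quickly the graph grows as Λ is refined.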
SHORTEST PATHS
Lemma
There exists a shortest rectilinear s-t-path with all its corners in $V(G_{\mathcal{T}}^{(2)})$. Such a path has no more corners than any other shortest rectilinear s-t-path.
Proof. Consider a shortest s-t-path.
- Induction on the vertices, starting from $s \in V(G_{\mathcal{T}}^{(2)})$.
- Let $v_0$ be the first vertex $\notin V(G_{\mathcal{T}}^{(2)})$; let $v_{-1} \in V(G_{\mathcal{T}}^{(2)})$ be its predecessor, and denote the subsequent vertices by $v_1, v_2, \dots$
- If $v_0 = t$, we’re done.
- Else, drag the segment after $v_0$, i.e. $v_0 v_1$.
SHORTEST PATHS: PROOF
Continued proof.
- The path cost changes linearly ⇒ it is non-increasing in one direction. Drag until:
  - 1. $v_0$ snaps to the grid, or
  - 2. $v_1$ hits some tile’s boundary. Snap $v_1$ to the grid and drag $v_0 v_1$ and $v_1 v_2$ simultaneously.
    - 2.1 Drag until all the vertices snap to the grid, or
    - 2.2 if $v_2$ hits some tile’s boundary, snap it to the boundary, and repeat step 2.
■
GRID GRAPHS ON MULTIPLE LAYERS
Chip volume: $\square \times \mathcal{L}$, where the layers are $\mathcal{L} := \{1, \dots, L\}$.
- Create a copy of the tiles on each layer, $\mathcal{T} \times \mathcal{L}$.
- Create a copy of $(V, E)$ on each layer.
- Connect adjacent copies of the same vertex.
- Prune edges that are not in the preferred direction.
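The four construction steps can be sketched as follows for a rectangular grid given by coordinate lists. The convention that odd layers are horizontal and even layers are vertical is an assumption made for the example, as is the function name `layered_grid`; the restriction to the chip area is omitted.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, int]  # (x, y, layer)
Edge = Tuple[Vertex, Vertex]


def layered_grid(
    xs: Sequence[float], ys: Sequence[float], num_layers: int
) -> Tuple[List[Vertex], List[Edge]]:
    """Copy the planar grid xs x ys onto every layer and link the copies."""
    xs, ys = sorted(xs), sorted(ys)
    vertices = [(x, y, l) for l in range(1, num_layers + 1) for x in xs for y in ys]
    edges: List[Edge] = []
    for l in range(1, num_layers + 1):
        if l % 2 == 1:
            # Assumed preferred direction on odd layers: horizontal wires only.
            for y in ys:
                edges += [((a, y, l), (b, y, l)) for a, b in zip(xs, xs[1:])]
        else:
            # Assumed preferred direction on even layers: vertical wires only.
            for x in xs:
                edges += [((x, a, l), (x, b, l)) for a, b in zip(ys, ys[1:])]
        if l < num_layers:
            # Vias connect adjacent copies of the same planar vertex.
            edges += [((x, y, l), (x, y, l + 1)) for x in xs for y in ys]
    return vertices, edges


vertices, edges = layered_grid([0, 1, 2], [0, 1], num_layers=2)
print(len(vertices), len(edges))
```

For the 3 × 2 grid on two layers this yields 12 vertices and 13 edges: 4 horizontal edges on layer 1, 3 vertical edges on layer 2, and 6 vias.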
SHORTEST PATHS ON MULTIPLE LAYERS
Does the Lemma hold with multiple layers?
- Segments never change layer; need to drag single vias.
✓ Multiple-layer $G_{\mathcal{T}}^{(2)}$ contains a shortest path.
Can we use a “smaller” graph?
✓ $G_{\mathcal{T}}^{(1)}$ contains a shortest path with ≤ 4 more corners in the planar case.
✗ Multiple-layer $G_{\mathcal{T}}^{(1)}$ does not always contain a shortest path.
✗ If we remove edges from $G_{\mathcal{T}}^{(1)}$ ⇒ suboptimal paths even in the planar case.
FROM GRAPHS TO GLOBAL WIRES
- Precompute one $G_{\mathcal{T},\Lambda}^{(2)}$ for each net.
- Recursively compute shortest paths between connected components of the net (2-approximation).
- Resource sharing algorithm.
- Randomized rounding, resample and reroute.
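One common way to connect a net by repeated shortest-path computations is the path-based heuristic sketched below: grow a tree from one terminal and repeatedly attach the nearest unconnected terminal via a shortest path. This is offered as an illustration of the idea only; the exact recursion used in the talk is not specified, and the names `connect_net` and `grid_adj` are hypothetical.

```python
import heapq
from typing import Dict, Hashable, List, Tuple

Adj = Dict[Hashable, List[Tuple[Hashable, float]]]  # v -> [(neighbour, cost)]


def connect_net(adj: Adj, terminals: List[Hashable]) -> List[Tuple[Hashable, Hashable]]:
    """Return tree edges connecting all terminals (path-based heuristic)."""
    tree = {terminals[0]}
    tree_edges = []
    remaining = set(terminals) - tree
    while remaining:
        # Multi-source Dijkstra from the current tree to the nearest terminal.
        dist = {v: 0.0 for v in tree}
        pred = {}
        heap = [(0.0, v) for v in tree]
        heapq.heapify(heap)
        target = None
        while heap:
            d, v = heapq.heappop(heap)
            if d > dist[v]:
                continue  # stale heap entry
            if v in remaining:
                target = v
                break
            for w, c in adj[v]:
                if d + c < dist.get(w, float("inf")):
                    dist[w], pred[w] = d + c, v
                    heapq.heappush(heap, (d + c, w))
        if target is None:
            raise ValueError("terminals are not connected in the graph")
        # Walk back along the shortest path and add it to the tree.
        v = target
        while v not in tree:
            tree_edges.append((pred[v], v))
            tree.add(v)
            v = pred[v]
        remaining.discard(target)
    return tree_edges


def grid_adj(n: int) -> Adj:
    """Unit-cost n x n grid graph, used here only as a toy routing graph."""
    adj: Adj = {(x, y): [] for x in range(n) for y in range(n)}
    for (x, y) in list(adj):
        for nb in ((x + 1, y), (x, y + 1)):
            if nb in adj:
                adj[(x, y)].append((nb, 1.0))
                adj[nb].append(((x, y), 1.0))
    return adj


tree_edges = connect_net(grid_adj(3), [(0, 0), (2, 0), (1, 2)])
print(len(tree_edges), tree_edges)
```

On the 3 × 3 toy grid the heuristic first joins (0, 0) and (2, 0) through (1, 0) and then attaches (1, 2) through (1, 1), giving a tree with 4 unit edges.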
IMPROVEMENTS
Exploit group symmetry to reduce the size of $G_{\mathcal{T}}^{(2)}$
⇒ hard to approximate below $\frac{1}{2} \log n$ (Set-Cover in disguise).
Shortest path algorithms
- Currently multi-directional Dijkstra is used to reduce runtime.
- Use a goal-directed variant of Dijkstra’s algorithm (A∗)? It is possible to use landmarks to direct the path search as in [A. V. Goldberg and C. Harrelson, 2005].
LANDMARKS FOR $G_{\mathcal{T}}^{(2)}$
Want: estimate the distance from a given landmark and use it in subsequent path searches to direct the search.
Problem: $G_{\mathcal{T}}^{(2)}$ changes for every net.
Solution: compute distances on $G_{\mathcal{T}}^{(0)}$ and interpolate linearly to several $G_{\mathcal{T}}^{(2)}$.
[Figure: distance from a landmark (scale 10 to 50) shown over the tile prices (scale 1.0 to 11.0).]
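The landmark idea of Goldberg and Harrelson uses the triangle inequality: for an undirected graph and a landmark L, |d(L, t) − d(L, v)| is a lower bound on d(v, t) and can guide an A∗ search. The sketch below shows only this bound on a single fixed graph; the interpolation from $G_{\mathcal{T}}^{(0)}$ to the per-net graphs is not modelled, and all names are hypothetical.

```python
import heapq
from typing import Dict, Hashable, List, Tuple

Adj = Dict[Hashable, List[Tuple[Hashable, float]]]  # v -> [(neighbour, cost)]
INF = float("inf")


def dijkstra(adj: Adj, source: Hashable) -> Dict[Hashable, float]:
    """Distances from source to every reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for w, c in adj[v]:
            if d + c < dist.get(w, INF):
                dist[w] = d + c
                heapq.heappush(heap, (d + c, w))
    return dist


def landmark_search(adj: Adj, s, t, landmark_dist: Dict[Hashable, float]):
    """A* from s to t; returns (distance, number of settled vertices)."""
    def bound(v):  # triangle-inequality lower bound on d(v, t)
        return abs(landmark_dist[t] - landmark_dist[v])

    best = {s: 0.0}
    heap = [(bound(s), s)]
    closed = set()
    while heap:
        _, v = heapq.heappop(heap)
        if v in closed:
            continue
        closed.add(v)
        if v == t:
            return best[v], len(closed)
        for w, c in adj[v]:
            if best[v] + c < best.get(w, INF):
                best[w] = best[v] + c
                heapq.heappush(heap, (best[w] + bound(w), w))
    return INF, len(closed)


def grid_adj(n: int) -> Adj:
    """Unit-cost n x n grid graph, used here only as a toy example."""
    adj: Adj = {(x, y): [] for x in range(n) for y in range(n)}
    for (x, y) in list(adj):
        for nb in ((x + 1, y), (x, y + 1)):
            if nb in adj:
                adj[(x, y)].append((nb, 1.0))
                adj[nb].append(((x, y), 1.0))
    return adj


adj = grid_adj(6)
from_landmark = dijkstra(adj, (0, 0))  # precomputed once per landmark
print(landmark_search(adj, (5, 5), (1, 0), from_landmark))
```

Because the bound is consistent on undirected graphs, the search still returns exact distances while typically settling fewer vertices than plain Dijkstra.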
EXPERIMENTAL RESULTS
CONGESTION ACROSS SEVERAL RESOURCE SHARING PHASES
COMPARISON WITH A TRADITIONAL GLOBAL ROUTER
Testbed setup
- Global router and detailed router run in sequence
- 21 IBM microprocessor units (parts of a chip), 14 nm technology
- 2× Intel Xeon E5-2690 CPU, 16 threads
- 125 resource sharing phases
- Reusable graph structure (plug and play Λ)
The detailed router is tuned for the traditional global router!
COMPARISON WITH A TRADITIONAL GLOBAL ROUTER
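| Unit | $\lvert N \rvert$ | Flow | $\lvert \Lambda_N \rvert$ (avg) | Time (mm:ss) | Op | Wl (mm) | Via (×10³) |
|---|---|---|---|---|---|---|---|
| A | 187 | edges | - | 00:39 | | 7.14 | 1.3 |
| | | tiles | 5.6 | 01:18 | | 7.17 | 1.5 |
| B | 2210 | edges | - | 05:29 | 18 | 41.16 | 13.4 |
| | | tiles | 5.7 | 04:27 ⋆ | 20 | 40.96 ⋆ | 12.5 ⋆ |
| C | 2427 | edges | - | 01:11 | | 16.32 | 16.0 |
| | | tiles | 6.5 | 00:52 ⋆ | | 16.23 ⋆ | 15.8 ⋆ |
| D | 3063 | edges | - | 00:25 | | 28.33 | 22.4 |
| | | tiles | 6.6 | 01:01 | | 28.12 ⋆ | 22.0 ⋆ |
| E | 3237 | edges | - | 00:42 | 1 | 34.11 | 21.9 |
| | | tiles | 6.6 | 01:05 | 1 | 33.96 ⋆ | 21.2 ⋆ |
| F | 3975 | edges | - | 02:01 | 1 | 41.01 | 23.0 |
| | | tiles | 6.0 | 02:30 | 1 | 40.90 ⋆ | 22.4 ⋆ |
| G | 4459 | edges | - | 01:03 | | 32.71 | 28.7 |
| | | tiles | 6.4 | 01:40 | | 32.50 ⋆ | 27.9 ⋆ |
| H | 5946 | edges | - | 01:34 | 13 | 68.76 | 43.6 |
| | | tiles | 6.3 | 03:09 | 14 | 68.74 ⋆ | 43.2 ⋆ |
| I | 10799 | edges | - | 02:57 | | 202.95 | 97.8 |
| | | tiles | 6.3 | 27:54 | 23 | 202.16 ⋆ | 103.5 |
| J | 10974 | edges | - | 01:27 | 14 | 135.24 | 80.8 |
| | | tiles | 6.3 | 05:15 | 19 | 135.21 ⋆ | 81.0 |
| K | 12701 | edges | - | 01:20 | 6 | 115.93 | 84.2 |
| | | tiles | 6.2 | 04:33 | 6 | 115.58 ⋆ | 83.3 ⋆ |
| L | 13451 | edges | - | 02:09 | 7 | 175.06 | 86.5 |
| | | tiles | 6.2 | 04:24 | 5 ⋆ | 174.62 ⋆ | 83.4 ⋆ |
| M | 14712 | edges | - | 07:34 | 7 | 170.89 | 111.8 |
| | | tiles | 6.3 | 10:40 | 2 ⋆ | 171.97 | 113.6 |
| N | 16403 | edges | - | 02:35 | | 200.85 | 110.4 |
| | | tiles | 6.1 | 06:55 | 1 | 199.88 ⋆ | 106.3 ⋆ |
| O | 16952 | edges | - | 02:07 | | 178.36 | 113.9 |
| | | tiles | 6.2 | 31:21 | 1 | 178.44 | 114.0 |
| P | 42530 | edges | - | 06:56 | 3 | 609.65 | 329.3 |
| | | tiles | 6.3 | 38:13 | 10 | 616.72 | 339.0 |
| Q | 42625 | edges | - | 07:21 | | 552.49 | 327.2 |
| | | tiles | 6.3 | 25:09 | | 559.83 | 339.2 |
| R | 49859 | edges | - | 06:47 | 13 | 614.32 | 382.4 |
| | | tiles | 6.3 | 22:43 | 7 ⋆ | 615.55 | 385.4 |
| S | 50716 | edges | - | 06:46 | 2 | 472.78 | 321.8 |
| | | tiles | 6.2 | 17:40 | 1 ⋆ | 473.70 | 318.3 ⋆ |
| T | 82693 | edges | - | 12:46 | 9 | 1036.21 | 583.5 |
| | | tiles | 6.3 | 27:19 | 4 ⋆ | 1037.49 | 582.3 ⋆ |
| U | 107142 | edges | - | 11:34 | 1 | 1202.08 | 713.8 |
| | | tiles | 6.3 | 26:40 | 2 | 1196.23 ⋆ | 706.5 ⋆ |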