SLIDE 1

GEOMETRIC STEINER TREE PACKING WITH DENSITY CONSTRAINTS

Nicolai Hähnle, Pietro Saccardi
Aussois Combinatorial Optimization Workshop
January 12, 2017

Research Institute for Discrete Mathematics, Bonn

1/30

SLIDE 2

OVERVIEW

Motivation: global routing in chip design
Traditional Steiner tree packing in grid graphs
Geometric Steiner tree packing on rhomboids
Shortest paths and Steiner trees on rhomboids
Experimental results

2/30

SLIDE 3

MOTIVATION: GLOBAL ROUTING IN CHIP DESIGN

SLIDE 4

HOW IS A CHIP BUILT?

“Zoom Into a Microchip” by NISENet, available under a CC BY license.

4/30

SLIDE 15

WIRE DENSITY IS EXTREMELY HIGH

Tightly packed wires on a chip routed by BonnRoute; zoom is 1000×. Only signal wires are shown (power grid is omitted). Wires are colored by routing layer.

5/30

SLIDE 16

FACTS AND FIGURES

Chip structure

A chip has a layered 3D structure.
Transistors are placed on the lowest layers.
Up to 16 routing layers for interconnects.

Chip size

A chip with an area of a few mm² packs:

several billion transistors
several meters of wires (a few nm thick)
several million connections

6/30

SLIDE 17

ROUTING IN CHIP DESIGN

Assumptions

The chip functionality is decomposed into elementary Boolean operations.
Logic gates (Boolean operations) are implemented at a transistor level.
All the logic gates are already placed on the chip.
All the connections (nets) between logic gates are known.

Goal

Compute connections for each net in such a way that:

different nets are disjoint;
design constraints are satisfied;
timing closure is achieved;
various objectives (wire length, power consumption, yield, …) are optimized.

7/30

SLIDE 19

MODELING ROUTING

Constraints due to fabrication process
Constraints due to problem hardness
Huge restriction on wiring
Steiner tree problem in 3D grid graphs: 10⁹ vertices!

Axis-parallel wires.
Monodirectional layers.
Possible wire locations are discretized (tracks).
The problem is split into global and detailed routing:
Global routing ignores disjointness and design rule compliance. It globally optimizes objectives such as packing density, length and yield.
Detailed routing explores a subset of the routing graph. It handles disjointness and design rule compliance.

8/30

SLIDE 22

TRADITIONAL STEINER TREE PACKING IN GRID GRAPHS

SLIDE 25

TRADITIONAL APPROACH TO GLOBAL ROUTING

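Coarsening the routing graph → global routing graph.

Problem (Simplified Global Routing)

Input: Graph $G'$, capacities $u : E(G') \to \mathbb{Z}_{\ge 0}$, lengths $l : E(G') \to \mathbb{R}_{\ge 0}$, nets $\mathcal{N}$, where each $n \in \mathcal{N}$ is a set $\emptyset \ne n \subset V(G')$, and wire widths $w : \mathcal{N} \times E(G') \to \mathbb{R}_{\ge 0}$.

Task: Find Steiner trees $Y_n$, one for each net $n \in \mathcal{N}$, minimizing
\[ \sum_{n \in \mathcal{N}} \sum_{e \in E(Y_n)} l(e) \]
and meeting the capacity constraints
\[ \sum_{n \in \mathcal{N} : e \in E(Y_n)} w(n, e) \le u(e) \quad \forall e \in E(G'). \]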

10/30

SLIDE 26

MIN-MAX RESOURCE SHARING

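Problem (Min-Max Resource Sharing)

Input:

$\mathcal{R}$ finite set of resources (edges) of finite capacity
$\mathcal{C}$ finite set of customers (nets)
$\mathcal{B}_c$ set of solutions for $c \in \mathcal{C}$ (Steiner trees)
Resource consumption function $\mathrm{usg}_c : \mathcal{B}_c \to \mathbb{R}^{\mathcal{R}}_{\ge 0}$
$\sigma$-approximate oracle function $f_c : \mathbb{R}^{\mathcal{R}}_{\ge 0} \to \mathcal{B}_c$

Task: Find $b_c \in \mathcal{B}_c$ ($c \in \mathcal{C}$) attaining
\[ \lambda^* := \inf \Bigl\{ \max_{r \in \mathcal{R}} \sum_{c \in \mathcal{C}} (\mathrm{usg}_c(b_c))_r \;:\; b_c \in \mathcal{B}_c,\ c \in \mathcal{C} \Bigr\}. \]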

11/30

SLIDE 27

ALGORITHM OUTLINE: MIN-MAX RESOURCE SHARING

It maintains resource prices (initially set to 1).
The algorithm proceeds in phases. In each phase:
For every net, find an (approx.) cheapest route w.r.t. the current resource prices.
Update the prices of the used resources; prices grow multiplicatively with load.
Return a convex combination of Steiner trees for each net.

Theorem (D. Müller, K. Radke, J. Vygen, 2011)
Let ω > 0. A σ(1 + ω)-approximate fractional solution can be computed in time O(θ log |R| (|C| + |R|)(log log |R| + ω⁻²)), where θ is the time for an oracle call.

IP techniques [T. H. Wu, A. Davoodi, and J. T. Linderoth, 2011] could be used but are impractical for our instances.
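The outline above can be illustrated with a toy sketch. This is not the algorithm analysed in the theorem: it is a simplified multiplicative-price loop under stated assumptions. Each customer has a small explicit list of candidate solutions (given as resource usage dictionaries), the oracle is an exact minimum over that list, and the update factor `exp(eps * usage)` as well as the names `resource_sharing`, `max_congestion`, `phases` and `eps` are hypothetical choices made for this illustration.

```python
import math
from collections import Counter


def resource_sharing(candidates, num_resources, phases=20, eps=0.5):
    """Toy price-update loop (illustrative assumption, not the published algorithm).

    candidates[c] is a list of solutions for customer c; each solution is a
    dict mapping a resource index to its usage of that resource.
    Returns, per customer, a convex combination {solution index: weight}.
    """
    prices = [1.0] * num_resources          # resource prices start at 1
    picks = [Counter() for _ in candidates]
    for _ in range(phases):
        for c, options in enumerate(candidates):
            # Oracle: cheapest solution w.r.t. the current prices.
            best = min(
                range(len(options)),
                key=lambda i: sum(prices[r] * u for r, u in options[i].items()),
            )
            picks[c][best] += 1
            # Prices of the used resources grow multiplicatively with load.
            for r, u in options[best].items():
                prices[r] *= math.exp(eps * u)
    # Average over phases: a convex combination of the chosen solutions.
    return [{i: k / phases for i, k in cnt.items()} for cnt in picks]


def max_congestion(candidates, weights, num_resources):
    """Largest total (fractional) usage over all resources."""
    load = [0.0] * num_resources
    for options, wts in zip(candidates, weights):
        for i, wt in wts.items():
            for r, u in options[i].items():
                load[r] += wt * u
    return max(load)


# Two nets, two resources; each net may use either resource with usage 1.
candidates = [[{0: 1.0}, {1: 1.0}], [{0: 1.0}, {1: 1.0}]]
weights = resource_sharing(candidates, num_resources=2)
print(weights, max_congestion(candidates, weights, 2))
```

In this tiny instance the price increase caused by the first net steers the second net onto the other resource, so the two nets end up on different resources in every phase.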

12/30

SLIDE 28

POTENTIAL DRAWBACKS

Poor topologies.
Does not connect to pin shapes (important for signal delay estimation).
Depends heavily on the choice of the grid.
Does not support input wires.

13/30

SLIDE 29

GEOMETRIC STEINER TREE PACKING ON RHOMBOIDS

SLIDE 30

RHOMBOIDAL TILES

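Definition

Tile set $\mathcal{T}$, layers $\mathcal{L}$, chip area $\square$. Each $T_i \in \mathcal{T}$ is an $\ell_1$-ball of unit radius, with $\mathring{T}_i \cap \mathring{T}_j = \emptyset$ for $i \ne j$ and $\square \subseteq \bigcup_{T \in \mathcal{T}} T$.

Given a tile price function $c : \mathcal{T} \to \mathbb{R}_{\ge 0}$, we define the cost of a segment $s$ as
\[ c(s) := \sum_{T \in \mathcal{T}} c(T)\, \ell(s \cap T). \]
The definition can be extended to rectilinear paths and graphs embedded in the plane.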
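The segment cost can be evaluated by cutting an axis-parallel segment at the tile boundaries it crosses. The sketch below assumes a concrete tiling that the slides do not fix: tile centres are the integer points with even coordinate sum, so that in the rotated coordinates (u, v) = (x + y, x − y) every tile is a square of side 2. The function names `tile_of` and `segment_cost` and the price callback are hypothetical.

```python
import math


def tile_of(x, y):
    """Index (a, b) of the tile containing (x, y).

    Assumption: tile centres are the points (a + b, a - b), i.e. integer
    points with even coordinate sum. In (u, v) = (x + y, x - y) the l1 unit
    balls become squares [2a - 1, 2a + 1] x [2b - 1, 2b + 1].
    """
    u, v = x + y, x - y
    return (math.floor((u + 1) / 2), math.floor((v + 1) / 2))


def segment_cost(p, q, price):
    """c(s) = sum over tiles T of price(T) * length(s intersected with T)."""
    (x0, y0), (x1, y1) = p, q
    if x0 != x1 and y0 != y1:
        raise ValueError("segment must be axis-parallel")
    length = abs(x1 - x0) + abs(y1 - y0)
    if length == 0:
        return 0.0
    dx, dy = (x1 - x0) / length, (y1 - y0) / length
    # Tile boundaries are the lines u = odd integer and v = odd integer.
    cuts = {0.0, float(length)}
    for start, rate in ((x0 + y0, dx + dy), (x0 - y0, dx - dy)):
        end = start + rate * length
        lo, hi = min(start, end), max(start, end)
        k = math.ceil((lo - 1) / 2)          # smallest k with 2k + 1 >= lo
        while 2 * k + 1 <= hi:
            cuts.add((2 * k + 1 - start) / rate)
            k += 1
    ts = sorted(cuts)
    total = 0.0
    for t0, t1 in zip(ts, ts[1:]):
        tm = (t0 + t1) / 2                   # midpoint identifies the tile
        total += price(tile_of(x0 + tm * dx, y0 + tm * dy)) * (t1 - t0)
    return total


# Tile (0, 0) costs 1 per unit length, every other tile costs 2.
price = lambda tile: 1.0 if tile == (0, 0) else 2.0
print(segment_cost((0, 0), (3, 0), price))
```

In the usage example the segment spends one unit of length in the tile centred at the origin and two units in the tile centred at (2, 0).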

15/30

SLIDE 31

WHY RHOMBOIDAL TILES?

Rhomboids (ℓ1 balls) are simple objects!
Let c(s) be the cost of a segment s. Then c(s) is (w.r.t. the coordinates of s):

1. continuous and piecewise linear;
2. smooth, unless it intersects some tile’s vertex or the endpoints lie on some tile’s boundary.

⇓

If we drag a segment or its endpoints, the change in cost is described by a smooth linear function, as long as condition 2 is respected.

16/30

SLIDE 35

RHOMBOIDAL TILES AS RESOURCES

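$\mathcal{R} := \mathcal{T} \times \mathcal{L}$ (rhomboidal tiles × layers). Let $Y_n$ denote a route for $n \in \mathcal{N}$.

Definition

$T \in \mathcal{T} \times \mathcal{L}$ on layer $l_T$; $w(n, l)$ wire width in tracks on layer $l$.
Capacity: $u(T)$ := total length of free tracks.
Consumption:
\[ (\mathrm{usg}_n(Y_n))_T := \sum_{s \in E(Y_n)} \frac{w(n, l_T)\, \ell(s \cap T)}{u(T)}. \]
Congestion:
\[ \mathrm{cong}(T) := \sum_{n \in \mathcal{N}} (\mathrm{usg}_n(Y_n))_T. \]
Given a resource price vector $y \in \mathbb{R}^{\mathcal{R}}_{\ge 0}$, if we set $c(T) := \frac{w(n, l_T)\, y_T}{u(T)}$, then
\[ y^\top \mathrm{usg}_n(Y_n) = c(E(Y_n)). \]
Can we compute a route of (approx.) minimal cost $c(\cdot)$?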
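The identity relating prices and tile costs follows by exchanging the order of summation. A short derivation, using only the definitions of consumption, of c(T), and of the segment cost c(s) from the slides, with the sum over T running over all resources in the tile set times the layers:

```latex
\begin{aligned}
y^\top \mathrm{usg}_n(Y_n)
  &= \sum_{T} y_T \sum_{s \in E(Y_n)} \frac{w(n, l_T)\,\ell(s \cap T)}{u(T)} \\
  &= \sum_{s \in E(Y_n)} \sum_{T} \frac{w(n, l_T)\, y_T}{u(T)}\,\ell(s \cap T) \\
  &= \sum_{s \in E(Y_n)} \sum_{T} c(T)\,\ell(s \cap T)
   = \sum_{s \in E(Y_n)} c(s)
   = c(E(Y_n)).
\end{aligned}
```

So an oracle that minimizes the geometric cost c over routes is exactly an oracle for the priced resource consumption of net n.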

17/30

SLIDE 36

SHORTEST PATHS AND STEINER TREES ON RHOMBOIDS

SLIDE 41

GRID GRAPHS FROM TWO ENDPOINTS

GT(0): grid graph induced by tiles’ centres.
GT(1): adds grid lines through two endpoints s and t.
GT(2)_{s,t}: recursively add grid lines at the intersections with tiles’ borders.

The number of edges is 4(|Λ| − 1)² (graph size grows fast!).

Definition (GT(2)_Λ)

Given a finite set Λ ⊂ [0, 1] with 0, 1 ∈ Λ, let
X := 2ℤ + {λ, 2 − λ : λ ∈ Λ},
Y := 2ℤ + {1 − λ, λ − 1 : λ ∈ Λ}.
Define V(GT(2)_Λ) := □ ∩ (X × Y) and add to E(GT(2)_Λ) an edge for every grid segment connecting two vertices.
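The definition translates almost directly into code. The sketch below makes two assumptions that are not stated on the slide: the chip area □ is an axis-parallel rectangle, and "grid segment connecting two vertices" means a segment between consecutive vertices on the same grid line. The function names `lattice_coordinates` and `build_grid` are hypothetical; exact rational arithmetic is used so that coinciding grid lines are merged reliably.

```python
from fractions import Fraction
import math


def lattice_coordinates(offsets, lo, hi):
    """All values 2k + o (k an integer, o in offsets) inside [lo, hi], sorted."""
    residues = {Fraction(o) % 2 for o in offsets}
    values = set()
    for r in residues:
        k = math.ceil((Fraction(lo) - r) / 2)   # smallest k with 2k + r >= lo
        while 2 * k + r <= hi:
            values.add(2 * k + r)
            k += 1
    return sorted(values)


def build_grid(lams, x_lo, x_hi, y_lo, y_hi):
    """Vertices (X x Y inside the rectangle) and edges of the grid graph."""
    lams = [Fraction(l) for l in lams]
    xs = lattice_coordinates(lams + [2 - l for l in lams], x_lo, x_hi)
    ys = lattice_coordinates([1 - l for l in lams] + [l - 1 for l in lams],
                             y_lo, y_hi)
    vertices = [(x, y) for x in xs for y in ys]
    # Horizontal edges between consecutive x on each row, then vertical ones.
    edges = [((a, y), (b, y)) for y in ys for a, b in zip(xs, xs[1:])]
    edges += [((x, a), (x, b)) for x in xs for a, b in zip(ys, ys[1:])]
    return vertices, edges


# Lambda = {0, 1/2, 1} on the square [0, 2] x [0, 2].
vertices, edges = build_grid([0, Fraction(1, 2), 1], 0, 2, 0, 2)
print(len(vertices), len(edges))
```

Adding a single value to Λ adds grid lines in both directions, which is one way to see why the graph size grows quickly with |Λ|.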

19/30

SLIDE 42

SHORTEST PATHS

Lemma

There exists a shortest rectilinear s-t-path with all the corners at V(GT(2)). Such a path has no more corners than any other shortest rectilinear s-t-path.

Proof.

Consider a shortest s-t-path.

Induction on the vertices, starting from s ∈ V(GT(2)).
Let v0 be the first vertex ∉ V(GT(2)); let v−1 ∈ V(GT(2)) be its predecessor, and denote the subsequent vertices by v1, v2, …
If v0 = t, we’re done.
Else, drag the segment after v0, v0v1.

20/30

SLIDE 44

SHORTEST PATHS: PROOF

Continued proof.

Path cost changes linearly ⇒ non-increasing in one direction.

Drag until:

1. v0 snaps to the grid, or
2. v1 hits some tile’s boundary. Snap v1 to the grid, then drag v0v1 and v1v2 simultaneously.
2.1 Drag until all the vertices snap to the grid, or
2.2 if v2 hits some tile’s boundary, snap it to the boundary, and repeat step 2.

■

21/30

SLIDE 50

GRID GRAPHS ON MULTIPLE LAYERS

Chip volume: □ × L, where the layers are L := {1, …, L}.

Create a copy of the tiles in each layer, T × L.
Create a copy of (V, E) on each layer.
Connect adjacent copies of the same vertex.
Prune edges that are not in preferred direction.
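The four construction steps above can be sketched as follows. The slide does not say which direction is preferred on which layer; the sketch assumes that odd layers are horizontal and even layers vertical, which is a hypothetical convention, as is the function name `multilayer_graph`.

```python
def multilayer_graph(vertices, edges, num_layers):
    """Copy a planar grid graph onto layers 1..num_layers.

    Assumption (not from the slides): odd layers keep only horizontal edges,
    even layers keep only vertical edges.
    Returns (layered vertices, in-layer wire edges, via edges).
    """
    layered_vertices = [(v, layer)
                        for layer in range(1, num_layers + 1)
                        for v in vertices]
    wires = []
    for layer in range(1, num_layers + 1):
        keep_horizontal = layer % 2 == 1
        for p, q in edges:
            is_horizontal = p[1] == q[1]
            if is_horizontal == keep_horizontal:   # prune the other direction
                wires.append(((p, layer), (q, layer)))
    # Vias connect adjacent copies of the same planar vertex.
    vias = [((v, layer), (v, layer + 1))
            for layer in range(1, num_layers)
            for v in vertices]
    return layered_vertices, wires, vias


# A 3 x 3 planar grid copied onto two layers.
planar_vertices = [(x, y) for x in range(3) for y in range(3)]
planar_edges = [((x, y), (x + 1, y)) for x in range(2) for y in range(3)]
planar_edges += [((x, y), (x, y + 1)) for x in range(3) for y in range(2)]
layered_vertices, wires, vias = multilayer_graph(planar_vertices, planar_edges, 2)
print(len(layered_vertices), len(wires), len(vias))
```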

22/30

SLIDE 51

SHORTEST PATHS ON MULTIPLE LAYERS

Does the Lemma hold with multiple layers?

Segments never change layer; need to drag single vias.

✓ Multiple-layer GT(2) contains a shortest path.

Can we use a “smaller” graph?

✓ GT(1) contains a shortest path with ≤ 4 more corners in the planar case.
✗ Multiple-layer GT(1) does not always contain a shortest path.
✗ If we remove edges from GT(1) ⇒ suboptimal paths even in the planar case.

23/30

SLIDE 54

FROM GRAPHS TO GLOBAL WIRES

Precompute one GT(2)_Λ for each net.
Recursively compute shortest paths between connected components of the net (2-approximation).
Resource Sharing Algorithm.
Randomized rounding, resample and reroute.
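One common way to build a Steiner tree from shortest paths is the path-composition heuristic: grow a tree from one terminal and repeatedly attach the nearest remaining terminal by a shortest path. It is a known 2-approximation, but whether it matches the authors' exact procedure is an assumption. The sketch works on a generic adjacency dictionary rather than on GT(2), and the function name is hypothetical.

```python
import heapq


def steiner_tree_by_shortest_paths(adj, terminals):
    """Path-composition heuristic (illustrative; not necessarily the
    authors' procedure).

    adj maps a vertex to a list of (neighbour, cost) pairs; vertices must be
    mutually comparable (e.g. all strings) so that heap ties can be broken.
    Returns the tree as a set of frozenset({v, w}) edges.
    """
    terminals = list(terminals)
    tree_vertices = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:])
    while remaining:
        # Multi-source Dijkstra started from the whole current tree.
        dist = {v: 0.0 for v in tree_vertices}
        pred = {}
        heap = [(0.0, v) for v in tree_vertices]
        heapq.heapify(heap)
        target = None
        while heap:
            d, v = heapq.heappop(heap)
            if d > dist[v]:
                continue                      # stale heap entry
            if v in remaining:
                target = v                    # nearest unconnected terminal
                break
            for w, cost in adj[v]:
                nd = d + cost
                if nd < dist.get(w, float("inf")):
                    dist[w] = nd
                    pred[w] = v
                    heapq.heappush(heap, (nd, w))
        if target is None:
            raise ValueError("terminals are not connected")
        # Add the path from the target back to the tree.
        v = target
        while v not in tree_vertices:
            tree_edges.add(frozenset((v, pred[v])))
            tree_vertices.add(v)
            v = pred[v]
        remaining.discard(target)
    return tree_edges


# Star with centre "c" plus one expensive direct edge between "a" and "b".
adj = {
    "a": [("c", 1.0), ("b", 3.0)],
    "b": [("c", 1.0), ("a", 3.0)],
    "d": [("c", 1.0)],
    "c": [("a", 1.0), ("b", 1.0), ("d", 1.0)],
}
print(steiner_tree_by_shortest_paths(adj, ["a", "b", "d"]))
```

In the example the heuristic routes all three terminals through the non-terminal vertex "c" instead of using the direct edge between "a" and "b".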

24/30

SLIDE 55

IMPROVEMENTS

Exploit group symmetry to reduce the size of GT(2) ⇒ hard to approximate below ½ log n (Set-Cover in disguise).

Shortest path algorithms

Currently multi-directional Dijkstra is used to reduce runtime.
Use goal-directed search (A∗) for Dijkstra’s algorithm? It is possible to use landmarks to direct the path search, as in [A. V. Goldberg and C. Harrelson, 2005].
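The landmark idea can be sketched on a small undirected graph: distances from a few landmarks are precomputed, and the triangle inequality turns them into a lower bound that guides A∗. This is a generic illustration of the landmark technique, not the router's implementation; the toy grid, its edge costs and all function names are assumptions.

```python
import heapq


def dijkstra(adj, source):
    """Distances from source to every reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for w, cost in adj[v]:
            if d + cost < dist.get(w, float("inf")):
                dist[w] = d + cost
                heapq.heappush(heap, (d + cost, w))
    return dist


def alt_distance(adj, s, t, landmark_tables):
    """A* from s to t with landmark lower bounds.

    landmark_tables holds one distance table per landmark. Assumes an
    undirected, connected graph, so that |d(L, t) - d(L, v)| <= d(v, t).
    """
    def lower_bound(v):
        return max((abs(d[t] - d[v]) for d in landmark_tables), default=0.0)

    g = {s: 0.0}
    heap = [(lower_bound(s), s)]
    closed = set()
    while heap:
        _, v = heapq.heappop(heap)
        if v == t:
            return g[v]
        if v in closed:
            continue
        closed.add(v)
        for w, cost in adj[v]:
            ng = g[v] + cost
            if ng < g.get(w, float("inf")):
                g[w] = ng
                heapq.heappush(heap, (ng + lower_bound(w), w))
    return float("inf")


def toy_grid(n):
    """n x n grid with symmetric, position-dependent edge costs (made up)."""
    adj = {(i, j): [] for i in range(n) for j in range(n)}
    for (i, j) in list(adj):
        for (a, b) in ((i + 1, j), (i, j + 1)):
            if (a, b) in adj:
                cost = 1.0 + (i + j + a + b) % 3
                adj[(i, j)].append(((a, b), cost))
                adj[(a, b)].append(((i, j), cost))
    return adj


adj = toy_grid(4)
tables = [dijkstra(adj, (0, 0)), dijkstra(adj, (3, 0))]   # two landmarks
print(alt_distance(adj, (0, 3), (3, 1), tables))
```

Because the landmark bound is consistent on undirected graphs, the search still returns an exact shortest-path length while typically settling fewer vertices than plain Dijkstra.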

25/30

SLIDE 56

LANDMARKS FOR GT(2)

Want: estimate the distance from a given landmark and use it to orient subsequent path searches.
Problem: GT(2) changes for every net.
Solution: compute distances on GT(0) and interpolate linearly to several GT(2).

[Figure: distance from the landmark and tile prices]

26/30

SLIDE 57

EXPERIMENTAL RESULTS

SLIDE 58

CONGESTION ACROSS SEVERAL RESOURCE SHARING PHASES

28/30

SLIDE 64

COMPARISON WITH A TRADITIONAL GLOBAL ROUTER

Testbed setup

Global router followed by detailed router
21 IBM microprocessor units (part of a chip), 14 nm technology
2× Intel Xeon E5-2690 CPU, 16 threads
125 resource sharing phases
Reusable graph structure (plug and play Λ)

The detailed router is tuned for the traditional global router!

29/30

SLIDE 65

COMPARISON WITH A TRADITIONAL GLOBAL ROUTER

Unit  |N|     Flow   |Λ_N| (avg)  Time (mm:ss)  Op     Wl (mm)      Via (×10³)
A     187     edges  —            00:39                7.14         1.3
              tiles  5.6          01:18                7.17         1.5
B     2210    edges  —            05:29         18     41.16        13.4
              tiles  5.7          04:27 ⋆       20     40.96 ⋆      12.5 ⋆
C     2427    edges  —            01:11                16.32        16.0
              tiles  6.5          00:52 ⋆              16.23 ⋆      15.8 ⋆
D     3063    edges  —            00:25                28.33        22.4
              tiles  6.6          01:01                28.12 ⋆      22.0 ⋆
E     3237    edges  —            00:42         1      34.11        21.9
              tiles  6.6          01:05         1      33.96 ⋆      21.2 ⋆
F     3975    edges  —            02:01         1      41.01        23.0
              tiles  6.0          02:30         1      40.90 ⋆      22.4 ⋆
G     4459    edges  —            01:03                32.71        28.7
              tiles  6.4          01:40                32.50 ⋆      27.9 ⋆
H     5946    edges  —            01:34         13     68.76        43.6
              tiles  6.3          03:09         14     68.74 ⋆      43.2 ⋆
I     10799   edges  —            02:57                202.95       97.8
              tiles  6.3          27:54         23     202.16 ⋆     103.5
J     10974   edges  —            01:27         14     135.24       80.8
              tiles  6.3          05:15         19     135.21 ⋆     81.0
K     12701   edges  —            01:20         6      115.93       84.2
              tiles  6.2          04:33         6      115.58 ⋆     83.3 ⋆
L     13451   edges  —            02:09         7      175.06       86.5
              tiles  6.2          04:24         5 ⋆    174.62 ⋆     83.4 ⋆
M     14712   edges  —            07:34         7      170.89       111.8
              tiles  6.3          10:40         2 ⋆    171.97       113.6
N     16403   edges  —            02:35                200.85       110.4
              tiles  6.1          06:55         1      199.88 ⋆     106.3 ⋆
O     16952   edges  —            02:07                178.36       113.9
              tiles  6.2          31:21         1      178.44       114.0
P     42530   edges  —            06:56         3      609.65       329.3
              tiles  6.3          38:13         10     616.72       339.0
Q     42625   edges  —            07:21                552.49       327.2
              tiles  6.3          25:09                559.83       339.2
R     49859   edges  —            06:47         13     614.32       382.4
              tiles  6.3          22:43         7 ⋆    615.55       385.4
S     50716   edges  —            06:46         2      472.78       321.8
              tiles  6.2          17:40         1 ⋆    473.70       318.3 ⋆
T     82693   edges  —            12:46         9      1036.21      583.5
              tiles  6.3          27:19         4 ⋆    1037.49      582.3 ⋆
U     107142  edges  —            11:34         1      1202.08      713.8
              tiles  6.3          26:40         2      1196.23 ⋆    706.5 ⋆

29/30

SLIDE 66

EXAMPLE OF OUTPUT

30/30
