Outline Partitioning
Matrix-vector Movies Hypergraphs
Ordering
SBD
Conclusions
Sparse matrix partitioning, ordering, and visualisation by Mondriaan 3.0
Rob H. Bisseling, Albert-Jan Yzelman, Bas Fagginger Auer
Mathematical Institute, Utrecht University
◮ National supercomputer Huygens named after Christiaan
◮ Huygens, the machine, has 104 nodes
◮ Each node has 16 processors
◮ Each processor has 2 cores and a shared L3 cache
◮ Each core has a local L1 and L2 cache
◮ Hypergraph H = (V, N) ⇒ exact communication volume
◮ Columns ≡ Vertices: 0, 1, 2, 3, 4, 5, 6.
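The hypergraph model can be illustrated with a small sketch. It assumes that the columns are the vertices (as on the slide) and that each row is a net containing the columns in which it has a nonzero. The 4 × 7 sparsity pattern, the column partition, and the helper names `row_nets` and `communication_volume` are hypothetical and are not taken from the slides or from Mondriaan itself.

```python
from collections import defaultdict

def row_nets(nonzeros):
    """Map each row index (a net) to the set of column indices (vertices) it contains."""
    nets = defaultdict(set)
    for i, j in nonzeros:
        nets[i].add(j)
    return dict(nets)

def communication_volume(nets, part_of_column):
    """Sum of (lambda_i - 1) over all nets, where lambda_i is the number of
    distinct parts that own a vertex of net i."""
    volume = 0
    for cols in nets.values():
        lam = len({part_of_column[j] for j in cols})
        volume += lam - 1
    return volume

# Hypothetical 4 x 7 sparsity pattern, given as (row, column) pairs.
nonzeros = [(0, 0), (0, 1), (0, 5), (1, 1), (1, 2),
            (2, 3), (2, 4), (2, 6), (3, 0), (3, 6)]
# Hypothetical assignment of the 7 columns (vertices) to p = 2 parts.
part_of_column = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 1}

nets = row_nets(nonzeros)
volume = communication_volume(nets, part_of_column)
print(volume)
```

In this toy case rows 0 and 3 each touch both parts, so they contribute one data word each, and the other rows contribute nothing.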
◮ 138 × 138 symmetric matrix bcsstk22, nz = 696, p = 8
◮ Reordered to Bordered Block Diagonal (BBD) form
◮ Split of row i over λ_i processors causes
◮ Row split has unit cost, irrespective of λ_i
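To make the difference between the two cost measures concrete, the following sketch evaluates both on the same partitioning: the λ − 1 metric charges a net split over λ parts a cost of λ − 1, while the cut-net metric charges every split net a cost of 1. The nets, the partition, and the function names are hypothetical examples.

```python
def lambdas(nets, part_of_vertex):
    """For each net, count the number of distinct parts among its vertices."""
    return {n: len({part_of_vertex[v] for v in verts}) for n, verts in nets.items()}

def lambda_minus_one(lam):
    """Lambda-minus-one metric: a net split over lambda parts costs lambda - 1."""
    return sum(l - 1 for l in lam.values())

def cut_net(lam):
    """Cut-net metric: every split net costs 1, irrespective of lambda."""
    return sum(1 for l in lam.values() if l > 1)

# Hypothetical hypergraph with 3 nets over 5 vertices, partitioned into 4 parts.
nets = {0: {0, 1, 2, 3}, 1: {0, 1}, 2: {2, 4}}
part_of_vertex = {0: 0, 1: 1, 2: 2, 3: 3, 4: 2}

lam = lambdas(nets, part_of_vertex)
print(lambda_minus_one(lam), cut_net(lam))
```

Net 0 is spread over four parts, so it dominates the λ − 1 metric but counts only once under cut-net.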
◮ p = 4, ε = 0.2, global non-permuted view
◮ Each individual nonzero is a vertex in the hypergraph,
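A minimal sketch of the fine-grain idea follows. It assumes the usual formulation in which every nonzero is a vertex and every row and every column is a net, which goes beyond what the truncated bullet states. The volume is then the sum of λ − 1 over all row nets and all column nets. The 3 × 3 example and the name `fine_grain_volume` are hypothetical.

```python
from collections import defaultdict

def fine_grain_volume(owner):
    """owner maps each nonzero (i, j) to the processor that stores it.
    Returns the total of (lambda - 1) over all row nets and column nets."""
    row_parts = defaultdict(set)
    col_parts = defaultdict(set)
    for (i, j), p in owner.items():
        row_parts[i].add(p)   # processors contributing to output component i
        col_parts[j].add(p)   # processors needing input component j
    row_cost = sum(len(parts) - 1 for parts in row_parts.values())
    col_cost = sum(len(parts) - 1 for parts in col_parts.values())
    return row_cost + col_cost

# Hypothetical 3 x 3 matrix with 6 nonzeros, distributed over p = 2 processors.
owner = {(0, 0): 0, (0, 1): 0,
         (1, 1): 1, (1, 2): 1,
         (2, 0): 1, (2, 2): 0}

fg_volume = fine_grain_volume(owner)
print(fg_volume)
```

Because each nonzero is assigned individually, this model can express any two-dimensional distribution, at the price of a hypergraph with nz vertices.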
◮ New algorithms for vector partitioning.
◮ Much faster, by a factor of 10 compared to version 1.0.
◮ 10% better quality of the matrix partitioning.
◮ Inclusion of fine-grain partitioning method
◮ Inclusion of hybrid between original Mondriaan and
◮ Can also handle p ≠ 2^q.
◮ Ordering to SBD and BBD structure: cut rows are placed
◮ Visualisation through Matlab interface, MondriaanPlot,
◮ Metrics: λ − 1 for parallelism, and cut-net for other
◮ Library-callable, so you can link it to your own program
◮ Interface to PaToH hypergraph partitioner
◮ Compressed Row Storage (CRS, left) and
◮ Zig-zag CRS avoids unnecessary end-of-row jumps in
◮ Yzelman and Bisseling, SIAM Journal on Scientific
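The two storage schemes can be compared in a small sketch. It assumes that zig-zag CRS stores even-numbered rows in increasing column order and odd-numbered rows in decreasing column order, so that the end of one row lies close to the start of the next in the input vector. The function names `to_crs` and `spmv` and the 3 × 4 example are hypothetical; this is not Mondriaan's own data structure.

```python
def to_crs(n_rows, triples, zigzag=False):
    """Build CRS arrays (values, col_idx, row_start) from (i, j, value) triples.
    With zigzag=True, odd-numbered rows are stored from right to left."""
    rows = [[] for _ in range(n_rows)]
    for i, j, v in triples:
        rows[i].append((j, v))
    values, col_idx, row_start = [], [], [0]
    for i, row in enumerate(rows):
        row.sort(reverse=(zigzag and i % 2 == 1))
        for j, v in row:
            col_idx.append(j)
            values.append(v)
        row_start.append(len(values))
    return values, col_idx, row_start

def spmv(values, col_idx, row_start, x):
    """Compute y = A x; the same loop works for CRS and zig-zag CRS."""
    y = [0.0] * (len(row_start) - 1)
    for i in range(len(y)):
        for k in range(row_start[i], row_start[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# Hypothetical 3 x 4 matrix given as (row, column, value) triples.
triples = [(0, 0, 1.0), (0, 3, 2.0), (1, 1, 3.0),
           (1, 2, 4.0), (2, 0, 5.0), (2, 3, 6.0)]
x = [1, 2, 3, 4]

y_crs = spmv(*to_crs(3, triples), x)
zz_values, zz_cols, zz_start = to_crs(3, triples, zigzag=True)
y_zz = spmv(zz_values, zz_cols, zz_start, x)
print(zz_cols, y_crs, y_zz)
```

Only the order of the nonzeros within a row changes, so the multiplication kernel itself is unchanged and both variants give the same result.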
◮ SBD structure is obtained by recursively partitioning the
◮ Mondriaan is used in one-dimensional mode, splitting only
◮ The cut rows are sparse and serve as a gentle transition
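The recursive construction can be sketched as follows. The sketch assumes that the columns are split in two at each level, and that rows with nonzeros on both sides (the cut rows) are placed between the rows belonging purely to the left part and those belonging purely to the right part. Simply halving the column list is a naive stand-in for the hypergraph bipartitioner; it is not what Mondriaan does. The name `sbd_order` and the 5 × 4 pattern are hypothetical.

```python
def sbd_order(rows, cols, pattern, min_cols=2):
    """Return (row_order, col_order) for a separated-block-diagonal-like ordering.
    pattern maps each row index to the set of columns where it has a nonzero."""
    if len(cols) < min_cols or not rows:
        return list(rows), list(cols)
    half = len(cols) // 2
    # Naive stand-in for a real bipartitioner: halve the current column list.
    left_cols, right_cols = cols[:half], cols[half:]
    left_set, col_set = set(left_cols), set(cols)
    left_rows, cut_rows, right_rows = [], [], []
    for i in rows:
        local = pattern[i] & col_set
        in_left = any(j in left_set for j in local)
        in_right = any(j not in left_set for j in local)
        if in_left and in_right:
            cut_rows.append(i)      # row is split by this bipartitioning
        elif in_right:
            right_rows.append(i)
        else:
            left_rows.append(i)     # also catches rows that are empty here
    lr, lc = sbd_order(left_rows, left_cols, pattern, min_cols)
    rr, rc = sbd_order(right_rows, right_cols, pattern, min_cols)
    # Cut rows go in the middle, between the two recursively ordered blocks.
    return lr + cut_rows + rr, lc + rc

# Hypothetical 5 x 4 sparsity pattern.
pattern = {0: {0, 1}, 1: {0}, 2: {1, 2}, 3: {2, 3}, 4: {3}}
row_order, col_order = sbd_order([0, 1, 2, 3, 4], [0, 1, 2, 3], pattern)
print(row_order, col_order)
```

Replacing the halving step by a call to a real partitioner would keep the rest of the recursion unchanged.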
◮ The recursive, fractal-like nature makes the ordering
◮ The ordering is cache-oblivious.
◮ Ordering the matrix in SBD format makes the
◮ We also like to forget about the cores: core-oblivious. And
◮ All that is needed is a good ordering of the rows and
◮ Experiments on 1 core of the dual-core 4.7 GHz Power6+
◮ 64 kB L1 cache, 4 MB L2, 32 MB L3.
◮ Test matrices: 1. stanford; 2. stanford_berkeley;
◮ 9 × 9 chess-arrowhead matrix, nz = 49, p = 2, ε = 0.2.
◮ DSBD structure is obtained by recursively partitioning the
◮ The nonzeros must also be reordered by a Z-like ordering.
◮ Mondriaan is used in two-dimensional mode.
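One common Z-like ordering is the Morton (Z-curve) order, obtained by interleaving the bits of the row and column index. Whether the slides use exactly this curve, or apply it per block rather than per nonzero, is an assumption here. The helper `morton_key` and the list of nonzero positions are hypothetical.

```python
def morton_key(i, j, bits=16):
    """Interleave the bits of row index i and column index j.
    The row bit is the more significant one of each pair, so the curve visits
    top-left, top-right, bottom-left, bottom-right within every 2 x 2 block."""
    key = 0
    for b in range(bits):
        key |= ((i >> b) & 1) << (2 * b + 1)
        key |= ((j >> b) & 1) << (2 * b)
    return key

# Hypothetical nonzero positions (row, column), in arbitrary order.
positions = [(3, 3), (2, 0), (0, 2), (1, 1), (1, 0), (0, 1), (0, 0)]
z_ordered = sorted(positions, key=lambda ij: morton_key(*ij))
print(z_ordered)
```

Nonzeros that are close in both the row and the column direction end up close together in the sorted sequence, which is the locality property a Z-like ordering is meant to provide.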
◮ Matrix rhpentium, split over 30 processors
◮ We have presented two combinatorial problems:
◮ Reordering is a promising method for oblivious computing.
◮ Mondriaan 3.0, to be released soon, provides new
◮ Visualisation can help in designing new algorithms!