ACMP: An Architecture to Handle Amdahl’s Law
M. Aater Suleman
Advisor: Yale Patt
HPS Research Group
Acknowledgements
– Eric Sprangle, Intel
– Anwar Rohillah, Intel
– Anwar Ghuloum, Intel
– Doug Carmean, Intel
Background
Single-thread
For I = 1 to N
    A[I] = (A[I-1] + A[I])/2
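The loop above has a loop-carried dependence: iteration I reads A[I-1], which iteration I-1 has just written. A minimal Python sketch of why the iterations cannot simply be run independently; the function names and the sample array are illustrative assumptions, not taken from the slides.

```python
def serial_update(a):
    """Run the loop in order; each step sees the already-updated a[i-1]."""
    out = list(a)
    for i in range(1, len(out)):
        out[i] = (out[i - 1] + out[i]) / 2
    return out


def naive_parallel_update(a):
    """Hypothetical 'all iterations at once' version.

    Every iteration reads the ORIGINAL a[i-1], which is what would happen
    if the iterations ran independently with no synchronization.
    """
    return [a[0]] + [(a[i - 1] + a[i]) / 2 for i in range(1, len(a))]


if __name__ == "__main__":
    data = [0.0, 8.0, 8.0, 8.0]  # illustrative input
    print(serial_update(data))
    print(naive_parallel_update(data))
```

The two functions disagree from the second updated element onward, which is why a loop of this shape needs more programmer effort to parallelize than a plain data-parallel loop.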
[Figure: Programmer Effort and Degree of Parallelism, for data-parallel loops, loops with early termination, and irregular code]
[Figure: Speedup vs. 1 P6-type Core as a function of Degree of Parallelism, for ACMP, Niagara, and P6-Tile]
At low parallelism, ACMP and P6-Tile outperform Niagara
At high parallelism, Niagara wins
At medium parallelism, ACMP wins
The cut-off point moves to the right in the future
– Niagara: 16 small cores
– P6-Tile: 4 large cores
– ACMP: 1 large core, 12 small cores
ring interconnect
– Master thread on the large core
– All additional threads on the small cores
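The three configurations can be compared with a simple Amdahl's-law model. The sketch below is an illustration under loudly stated assumptions, not the model used in the talk: performance is normalized to one large (P6-type) core, the value of SMALL_CORE_PERF is a guess, the serial part of the program runs on the best single core of each chip, and on ACMP the large core is assumed to keep working during the parallel part. All function and constant names are hypothetical.

```python
# Performance is measured relative to one large (P6-type) core = 1.0.
# ASSUMPTION: a small core delivers half the performance of a large core.
SMALL_CORE_PERF = 0.5


def speedup(p, serial_perf, parallel_perf):
    """Amdahl's law: p is the parallel fraction of the work (0.0 to 1.0).

    serial_perf   -- throughput available to the serial part
    parallel_perf -- combined throughput available to the parallel part
    Returns speedup relative to one large core.
    """
    time = (1.0 - p) / serial_perf + p / parallel_perf
    return 1.0 / time


def niagara(p, s=SMALL_CORE_PERF):
    # 16 small cores; the serial part runs on one small core.
    return speedup(p, s, 16 * s)


def p6_tile(p):
    # 4 large cores; the serial part runs on one large core.
    return speedup(p, 1.0, 4.0)


def acmp(p, s=SMALL_CORE_PERF):
    # 1 large + 12 small cores; the serial part runs on the large core.
    # ASSUMPTION: the large core also contributes to the parallel part.
    return speedup(p, 1.0, 1.0 + 12 * s)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        print(p, round(acmp(p), 2), round(niagara(p), 2), round(p6_tile(p), 2))
```

Under these assumed numbers the model reproduces the qualitative picture: with no parallelism ACMP matches P6-Tile and both beat Niagara, with full parallelism Niagara is ahead, and in between ACMP leads. The exact cut-off points depend entirely on the assumed small-core performance.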
[Figure: Speedup vs. Niagara, with benchmarks grouped into Low Parallelism, Medium Parallelism, and High Parallelism. Benchmarks: mcf is_nasp fft_splash cg_nasp ep_nasp art_omp mg_nasp fmm_splash cholesky page convert h.264 ed]