SLIDE 9
Evaluation: Comparison w/ Original Halide on CPU
◆ Different platforms × different backends
◆ Energy efficient & performant on both platforms and all backends
| Benchmark | Data Size & Type | VU9P (AWS F1) Energy Eff. | VU9P (AWS F1) Speedup | Stratix 10 MX Energy Eff. | Stratix 10 MX Speedup | Pattern (Backend) |
|---|---|---|---|---|---|---|
| Harris | 2448×3264, UInt8 | 29.11 | 10.31 | 12.36 | 9.89 | Stencil (SODA) |
| Blur | 648×482, UInt16 | 10.98 | 3.89 | 9.34 | 7.47 | Stencil (SODA) |
| Linear Blur | 768×1280×3, Float32 | 12.65 | 4.48 | 10.75 | 8.60 | Stencil (SODA) |
| Stencil Chain | 1536×2560, UInt16 | 4.29 | 1.52 | 3.64 | 2.91 | Stencil (SODA) |
| Dilation | 6480×4820, UInt16 | 4.69 | 1.66 | 1.99 | 1.59 | Stencil (SODA) |
| Median Blur | 6480×4820, UInt16 | 12.51 | 4.43 | 5.30 | 4.24 | Stencil (SODA) |
| GEMM | 1024³, Int16 | 9.97 | 3.53 | — | — | Systolic Array (PolySA) |
| K-Means | 320×32, k=15, Int32 | 29.00 | 10.27 | — | — | General (Merlin Compiler) |
| Geometric mean | — | 11.44 | 4.05 | 6.02 | 4.82 | — |
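The final row of the table is consistent with a per-column geometric mean over the benchmarks that have results on each platform. The slide does not state the averaging method, so treat this as an inference from the numbers. The sketch below recomputes the four values. The names `columns`, `summary_row`, and `geometric_mean` are illustrative and do not come from the slide.

```python
import math

# Per-benchmark values copied from the table.
# VU9P has results for all eight benchmarks; GEMM and K-Means have no
# Stratix 10 MX entries, so those columns hold six values.
columns = {
    "VU9P energy eff.": [29.11, 10.98, 12.65, 4.29, 4.69, 12.51, 9.97, 29.00],
    "VU9P speedup": [10.31, 3.89, 4.48, 1.52, 1.66, 4.43, 3.53, 10.27],
    "Stratix 10 MX energy eff.": [12.36, 9.34, 10.75, 3.64, 1.99, 5.30],
    "Stratix 10 MX speedup": [9.89, 7.47, 8.60, 2.91, 1.59, 4.24],
}

# Summary row as printed on the slide.
summary_row = {
    "VU9P energy eff.": 11.44,
    "VU9P speedup": 4.05,
    "Stratix 10 MX energy eff.": 6.02,
    "Stratix 10 MX speedup": 4.82,
}


def geometric_mean(values):
    """Return the n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Recompute each column's geometric mean and show it next to the slide value.
computed = {name: geometric_mean(vals) for name, vals in columns.items()}

for name, value in computed.items():
    print(f"{name:28s} computed {value:7.3f}   slide {summary_row[name]:5.2f}")
```

Each recomputed value should agree with the slide to within about 0.01, which is the rounding granularity of the table. A geometric mean is the usual choice for averaging ratios such as speedups, because it is not dominated by a single large entry the way an arithmetic mean would be.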
CPU: dual Xeon 2680v4, 14nm, 2.4GHz, 240W; VU9P on AWS F1, 16nm, 250MHz, 85W; Stratix 10 MX, 14nm, 480MHz, 192W
Note: not intended as a fair comparison between the two FPGAs
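The power figures in the footnote appear to link the two metrics in the table: each energy-efficiency entry matches the speedup multiplied by the ratio of CPU power to FPGA power. The slide does not give this formula. The sketch below assumes energy efficiency means CPU energy divided by FPGA energy, with energy taken as the quoted power times runtime. The names `energy_efficiency` and `reported` are hypothetical.

```python
# Power figures from the slide footnote, in watts.
CPU_POWER_W = 240.0
FPGA_POWER_W = {"VU9P": 85.0, "Stratix 10 MX": 192.0}

# (speedup, reported energy efficiency) pairs copied from the table.
reported = {
    "VU9P": [
        (10.31, 29.11), (3.89, 10.98), (4.48, 12.65), (1.52, 4.29),
        (1.66, 4.69), (4.43, 12.51), (3.53, 9.97), (10.27, 29.00),
    ],
    "Stratix 10 MX": [
        (9.89, 12.36), (7.47, 9.34), (8.60, 10.75),
        (2.91, 3.64), (1.59, 1.99), (4.24, 5.30),
    ],
}


def energy_efficiency(speedup, cpu_power_w, fpga_power_w):
    """CPU-to-FPGA energy ratio, assuming energy = power * runtime.

    E_cpu / E_fpga = (P_cpu * t_cpu) / (P_fpga * t_fpga)
                   = speedup * P_cpu / P_fpga
    """
    return speedup * cpu_power_w / fpga_power_w


# Compare every derived value with the figure listed on the slide.
max_error = 0.0
for platform, rows in reported.items():
    for speedup, listed in rows:
        derived = energy_efficiency(speedup, CPU_POWER_W, FPGA_POWER_W[platform])
        max_error = max(max_error, abs(derived - listed))
        print(f"{platform:14s} speedup {speedup:5.2f} -> derived {derived:6.2f}, listed {listed:5.2f}")

print(f"largest gap between derived and listed values: {max_error:.4f}")
```

Under this assumption the VU9P gains a factor of 240/85 (about 2.82) over its speedup, while the Stratix 10 MX gains only 240/192 = 1.25. That explains why the VU9P shows higher energy efficiency on every stencil benchmark even where its speedup is lower, and it is one reason the two FPGA columns should not be read as a direct comparison.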