Computation Reuse in DNNs by Exploiting Input Similarity



SLIDE 1

Marc Riera, Jose Maria Arnau, Antonio González

Computation Reuse in DNNs by Exploiting Input Similarity

SLIDE 2

Sequence Processing Applications

4/06/2018

Speech Audio Signal

ISCA 2018

SLIDE 3

Sequence Processing Applications

SLIDE 4

Sequence Processing Applications

SLIDE 5

Sequence Processing Applications

SLIDE 6

Sequence Processing Applications

Speech Recognition: DNN executions classify a sequence of audio frames into phonemes

SLIDE 7

Benchmarks

DNN Name   DNN Type  DNN Application       #Parameters  Accuracy
Kaldi      MLP       Acoustic Scoring      4.7M         89.04%
EESEN      RNN       Speech Recognition    11M          68.85%
C3D        CNN       Video Classification  78M          93.48%
AutoPilot  CNN       Self-Driving Cars     1.6M         99.63%

SLIDE 8

Input Similarity

Input Similarity (%): Kaldi 45%, C3D 69%, Autopilot 77%, EESEN 52%, Average 61%

SLIDE 9

Exploiting Temporal Similarity Example

Baseline: neuron N with weights $w_0, w_1, w_2$ and bias $b$

Frame i, inputs $I_0^{i}, I_1^{i}, I_2^{i}$:

$$O^{i} = I_0^{i} w_0 + I_1^{i} w_1 + I_2^{i} w_2 + b$$

Frame i+1, inputs $I_0^{i+1}, I_1^{i+1}, I_2^{i+1}$:

$$O^{i+1} = I_0^{i+1} w_0 + I_1^{i+1} w_1 + I_2^{i+1} w_2 + b$$

SLIDE 10

Exploiting Temporal Similarity Example

Proposal: neuron N with weights $w_0, w_1, w_2$ and bias $b$, where only input $I_2$ changes between frame i and frame i+1

Frame i, inputs $I_0^{i}, I_1^{i}, I_2^{i}$:

$$O^{i} = I_0^{i} w_0 + I_1^{i} w_1 + I_2^{i} w_2 + b$$

Frame i+1, inputs $I_0^{i+1}, I_1^{i+1}, I_2^{i+1}$:

$$O^{i+1} = O^{i} + (I_2^{i+1} - I_2^{i}) w_2$$

Number of computations before = 6
Number of computations after = 2

Note: The subtraction of the inputs is almost negligible since it is performed once per input.
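The single-neuron example on slides 9 and 10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `full_neuron` and `reuse_neuron` are hypothetical, inputs are compared with exact equality, and one multiplication plus one addition is counted per processed input, which is one way to arrive at the 6 and 2 computations quoted on the slide.

```python
def full_neuron(inputs, weights, bias):
    """Baseline: recompute the whole weighted sum for every frame."""
    output = bias
    ops = 0
    for value, weight in zip(inputs, weights):
        output += value * weight  # one multiplication and one addition
        ops += 2
    return output, ops


def reuse_neuron(prev_output, prev_inputs, new_inputs, weights):
    """Proposal: correct the previous output using only the changed inputs."""
    output = prev_output
    ops = 0
    for old, new, weight in zip(prev_inputs, new_inputs, weights):
        if new == old:
            continue  # unmodified input: nothing to compute
        output += (new - old) * weight  # correction term for this input
        ops += 2  # assumption: the input subtraction is not counted
    return output, ops


# Hypothetical values: only the third input changes between the two frames.
weights = [0.5, -1.0, 2.0]
bias = 0.25
frame_i = [1.0, 2.0, 3.0]
frame_i1 = [1.0, 2.0, 4.0]

out_i, ops_full = full_neuron(frame_i, weights, bias)
out_i1_full, _ = full_neuron(frame_i1, weights, bias)
out_i1_reuse, ops_reuse = reuse_neuron(out_i, frame_i, frame_i1, weights)
```

With these values both paths should produce the same output for frame i+1, while the reuse path touches only the one input that changed.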

SLIDE 11

Computation Reuse

Computation Reuse (%): Kaldi 53%, C3D 74%, Autopilot 79%, EESEN 55%, Average 66%

SLIDE 12

DNN Processing Unit

Tile

SLIDE 13

FC Execution in the Reuse Accelerator (1)

SLIDE 14

FC Execution in the Reuse Accelerator (2)

SLIDE 15

FC Execution in the Reuse Accelerator (3)

SLIDE 16

Other Supported Layers

Recurrent Neural Network (RNN)
Convolutional Neural Network (CNN)

SLIDE 17

Evaluation Methodology

  • Simulator to evaluate the performance and energy of the accelerator
  • Design Compiler to obtain power and delay of logic modules
  • 28/32nm library from Synopsys and the DesignWare logic modules
  • CACTI used for SRAM and eDRAM memories
  • MICRON LPDDR4 for main memory
  • Accelerator Configuration:

SLIDE 18

Memory Footprint Overheads

[Chart: Memory Increase (%) for the On-Chip IO Buffer and the Off-Chip Main Memory]

SLIDE 19

Results: Speedup

SLIDE 20

Results: Energy Savings

SLIDE 21

Conclusions

  • More than 60% of the inputs remain unmodified with respect to the previous execution
  • Our proposed scheme checks which inputs have changed:
    • Unmodified inputs are ignored, avoiding computations and memory accesses
    • Modified inputs are used to correct the previous output of each neuron
  • On average, 63% energy savings and 3.5x speedup
  • Small area overhead of less than 1%, mainly for additional storage
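
The scheme summarized above can also be sketched at the level of a whole fully-connected layer. The sketch below is a software illustration only, not the accelerator design: the names `fc_layer_full` and `fc_layer_reuse`, the `weights[k][n]` layout (input k to neuron n), and the exact-equality check are all assumptions made for the example.

```python
def fc_layer_full(inputs, weights, biases):
    """First execution: compute every neuron output from scratch."""
    outputs = list(biases)
    for k, value in enumerate(inputs):
        for n, weight in enumerate(weights[k]):
            outputs[n] += value * weight
    return outputs


def fc_layer_reuse(prev_inputs, new_inputs, prev_outputs, weights):
    """Later executions: correct the previous outputs with changed inputs only."""
    outputs = list(prev_outputs)
    changed = 0
    for k, (old, new) in enumerate(zip(prev_inputs, new_inputs)):
        if new == old:
            continue  # unmodified input: its weights are never read
        delta = new - old  # computed once per input, shared by all neurons
        changed += 1
        for n, weight in enumerate(weights[k]):
            outputs[n] += delta * weight
    similarity = 1.0 - changed / len(new_inputs)  # fraction of unmodified inputs
    return outputs, similarity


# Hypothetical layer with 4 inputs and 2 neurons; one input changes.
weights = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0], [-1.0, 1.0]]
biases = [0.0, 1.0]
frame_a = [1.0, 2.0, 3.0, 4.0]
frame_b = [1.0, 2.0, 5.0, 4.0]

out_a = fc_layer_full(frame_a, weights, biases)
out_b, similarity = fc_layer_reuse(frame_a, frame_b, out_a, weights)
```

Looping over inputs in the outer loop means an unmodified input skips both its multiplications and the reads of its row of weights, which mirrors the "avoiding computations and memory accesses" point in the list above.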
SLIDE 22

Marc Riera, Jose Maria Arnau, Antonio González

Computation Reuse in DNNs by Exploiting Input Similarity