SLIDE 1

High-Performance Embedded Systems-on-a-Chip
Sanjay Rajopadhye
Computer Science, Colorado State University
Lecture 3: Systolic Arrays & Systolic Synthesis

SLIDE 2

Systolic Synthesis

Starting from a mathematical specification (a SARE, i.e. a system of affine recurrence equations, plus reductions), perform the following steps (not necessarily in this order):

  1. Serialize reductions and align inputs and outputs
  2. Localize dependences
  3. Schedule the SARE
  4. Allocate the computation to processors
  5. Transform the SARE
  6. Generate the HDL

SLIDE 3

Example: Convolution

Initial specification:

$$y_i = \sum_{j=0}^{n-1} w_j * x_{i-j}$$
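As a reference point for the later steps, the specification can be evaluated directly. This is a minimal sketch, not part of the lecture: the function name `convolve_direct` is hypothetical, and it assumes that x is zero outside its given range, a boundary condition the slide does not state.

```python
def convolve_direct(w, x):
    """Evaluate y[i] = sum_{j=0}^{n-1} w[j] * x[i-j] for i = 0 .. len(x)-1.

    Assumption (not on the slide): x[k] is taken as 0 when k is out of range.
    """
    n = len(w)

    def x_at(k):
        return x[k] if 0 <= k < len(x) else 0

    # Each output is one n-input sum: this is the unbounded fan-in
    # that serialization later breaks into binary additions.
    return [sum(w[j] * x_at(i - j) for j in range(n)) for i in range(len(x))]


if __name__ == "__main__":
    print(convolve_direct([1, 2, 3], [1, 0, 0, 1]))
```

Every output reads n inputs at once and every input is read by up to n outputs, which is what the serialization and localization steps address.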

SLIDE 4

Serialization & Alignment

Replace the summation (unbounded fan-in) with a sequence of binary additions. Align the input and output variables.

$$y_i = Y[i, n-1]$$

$$Y[i, j] = \begin{cases} w_j * x_{i-j} & j = 0 \\ Y[i, j-1] + w_j * x_{i-j} & j > 0 \end{cases}$$
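The serialized recurrence can be checked with a short sketch. The name `convolve_serialized` is hypothetical, and x is again assumed to be zero outside its given range (an assumption, not stated on the slide).

```python
from functools import lru_cache


def convolve_serialized(w, x):
    """Evaluate y[i] = Y[i, n-1], where Y accumulates one product per step."""
    n = len(w)

    def x_at(k):
        # Assumption: zero padding outside the given samples.
        return x[k] if 0 <= k < len(x) else 0

    @lru_cache(maxsize=None)
    def Y(i, j):
        term = w[j] * x_at(i - j)
        # j = 0 starts the accumulation; j > 0 adds one term to Y[i, j-1].
        return term if j == 0 else Y(i, j - 1) + term

    return [Y(i, n - 1) for i in range(len(x))]


if __name__ == "__main__":
    print(convolve_serialized([1, 2, 3], [1, 0, 0, 1]))
```

Each Y[i, j] now has a single binary addition, but it still reads w_j and x_{i-j} directly, so each input value is still broadcast to many points.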

SLIDE 5

Localization/Uniformization

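Remove unbounded fan-out (i.e., “long”) dependences by pipelining each shared value through neighboring points: X carries x_{i-j} along the diagonal, and W carries w_j along the i direction.

$$y_i = Y[i, n-1]$$

$$Y[i, j] = \begin{cases} W[i, j] * X[i, j] & j = 0 \\ Y[i, j-1] + W[i, j] * X[i, j] & j > 0 \end{cases}$$

$$X[i, j] = \begin{cases} x_i & j = 0 \\ X[i-1, j-1] & j > 0 \end{cases}$$

$$W[i, j] = \begin{cases} w_j & i = 0 \\ W[i-1, j] & i > 0 \end{cases}$$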
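A sketch of the localized system follows, in which every variable reads only from a neighboring index point. The name `convolve_localized` is hypothetical. The slide does not say what X[i − 1, j − 1] means when i − 1 < 0, so the sketch assumes a value of 0 there, which matches zero padding of x.

```python
from functools import lru_cache


def convolve_localized(w, x):
    """Evaluate the localized recurrences for Y, X and W."""
    n, m = len(w), len(x)

    @lru_cache(maxsize=None)
    def X(i, j):
        if i < 0:
            return 0  # assumption: outside the domain, x is zero
        # Input enters at j = 0 and is then passed along the diagonal.
        return x[i] if j == 0 else X(i - 1, j - 1)

    @lru_cache(maxsize=None)
    def W(i, j):
        # Weight enters at i = 0 and is then passed along the i direction.
        return w[j] if i == 0 else W(i - 1, j)

    @lru_cache(maxsize=None)
    def Y(i, j):
        term = W(i, j) * X(i, j)
        return term if j == 0 else Y(i, j - 1) + term

    return [Y(i, n - 1) for i in range(m)]


if __name__ == "__main__":
    print(convolve_localized([1, 2, 3], [1, 0, 0, 1]))
```

All three dependences are now uniform: (0, 1) for Y, (1, 1) for X, and (1, 0) for W, measured as consumer index minus producer index.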


SLIDE 8

Scheduling & Allocation

  • A date (schedule): t(i, j) = i + j
  • A place (allocation): a(i, j) = j

Computation (i, j) therefore executes at time i + j on processor j.
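Whether this date and place are legal can be checked mechanically. The sketch below is an illustration, not the lecture's method: all function names are hypothetical, and the dependence vectors are read off the localized recurrences for Y, X and W as consumer index minus producer index.

```python
# Dependence vectors of the localized recurrences (consumer minus producer).
DEPENDENCES = {"Y": (0, 1), "X": (1, 1), "W": (1, 0)}


def t(i, j):
    """Schedule: the date at which computation (i, j) executes."""
    return i + j


def a(i, j):
    """Allocation: the processor on which computation (i, j) executes."""
    return j


def delays():
    # t is linear, so the delay along a dependence d is simply t(d).
    return {var: t(*d) for var, d in DEPENDENCES.items()}


def hops():
    # Number of processors a value moves along each dependence.
    return {var: a(*d) for var, d in DEPENDENCES.items()}


def is_causal():
    # Every value must be produced at least one time step before its use.
    return all(delay >= 1 for delay in delays().values())


def is_conflict_free(num_outputs, n):
    # No two computations may share the same (time, processor) slot.
    slots = {(t(i, j), a(i, j)) for i in range(num_outputs) for j in range(n)}
    return len(slots) == num_outputs * n


if __name__ == "__main__":
    print(delays(), hops(), is_causal(), is_conflict_free(8, 4))
```

Under these assumptions Y moves to the next processor in one time step, X moves to the next processor in two time steps, and W stays on its processor.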

SLIDE 9

Geometric Transformation

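T = (i, j → i + j, j)

Applying T rewrites the system in (time, processor) coordinates: in the equations below, i is the date t(i, j) and j is the processor a(i, j). The output y_i is read at the image of (i, n − 1) under T, i.e. at time i + n − 1 on processor n − 1.

$$y_i = Y[i + n - 1,\ n - 1]$$

$$Y[i, j] = \begin{cases} W[i, j] * X[i, j] & j = 0 \\ Y[i-1, j-1] + W[i, j] * X[i, j] & j > 0 \end{cases}$$

$$X[i, j] = \begin{cases} x_i & j = 0 \\ X[i-2, j-1] & j > 0 \end{cases}$$

$$W[i, j] = \begin{cases} w_j & i = j \\ W[i-1, j] & i > j \end{cases}$$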
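The transformed system can be evaluated in the same way, now indexed by time t and processor p. In this sketch the name `convolve_spacetime` is hypothetical, points whose pre-image has a negative first index are assumed to hold 0 (zero padding), and the output y_i is read at T(i, n − 1) = (i + n − 1, n − 1).

```python
from functools import lru_cache


def convolve_spacetime(w, x):
    """Evaluate the recurrences after the change of basis (i, j) -> (i + j, j)."""
    n, m = len(w), len(x)

    @lru_cache(maxsize=None)
    def X(t, p):
        if t - p < 0:
            return 0  # assumption: pre-image lies outside the domain
        # An x value reaches the next processor two time steps later.
        return x[t] if p == 0 else X(t - 2, p - 1)

    @lru_cache(maxsize=None)
    def W(t, p):
        # Weight p is loaded at time t = p and then held on processor p.
        return w[p] if t == p else W(t - 1, p)

    @lru_cache(maxsize=None)
    def Y(t, p):
        term = W(t, p) * X(t, p)
        # A partial sum reaches the next processor one time step later.
        return term if p == 0 else Y(t - 1, p - 1) + term

    # y[i] is available at time i + n - 1 on the last processor.
    return [Y(i + n - 1, n - 1) for i in range(m)]


if __name__ == "__main__":
    print(convolve_spacetime([1, 2, 3], [1, 0, 0, 1]))
```

The first index of every dependence is now a strictly positive delay and the second is a hop of at most one processor, which is the form a register-transfer description needs.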

SLIDE 10

Generate HDL

[Figure: a linear array of identical cells, each pairing a multiplier (*) with an adder (+)]
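The behavior such an array would have can be approximated by a cycle-by-cycle software model. This is a sketch under stated assumptions, not HDL and not the lecture's generated design: the name `simulate_array` is hypothetical, weights are assumed to be preloaded into the cells, all registers are assumed to start at 0, and each cell is assumed to hold one register for the partial sum and two for the x value, matching delays of one and two time steps.

```python
def simulate_array(w, x):
    """Cycle-level model of a linear array with one multiply-add cell per weight."""
    n, m = len(w), len(x)
    y_reg = [0] * n  # partial sum each cell produced in the previous cycle
    x_now = [0] * n  # x value each cell used in the previous cycle
    x_old = [0] * n  # x value each cell used two cycles ago
    outputs = []

    for cycle in range(m + n - 1):
        new_y = [0] * n
        new_x = [0] * n
        for p in range(n):
            if p == 0:
                # Cell 0 reads the input stream (zeros once it is exhausted).
                x_in = x[cycle] if cycle < m else 0
                y_in = 0
            else:
                x_in = x_old[p - 1]  # x arrives after two cycles
                y_in = y_reg[p - 1]  # partial sum arrives after one cycle
            new_x[p] = x_in
            new_y[p] = y_in + w[p] * x_in  # the cell's multiply and add
        # Clock edge: shift the x delay line and latch the new values.
        x_old, x_now, y_reg = x_now, new_x, new_y
        if cycle >= n - 1:
            outputs.append(y_reg[n - 1])  # y[cycle - (n - 1)] leaves the last cell

    return outputs


if __name__ == "__main__":
    print(simulate_array([1, 2, 3], [1, 0, 0, 1]))
```

The first result leaves the last cell after n − 1 cycles and one result follows per cycle thereafter, so m outputs take m + n − 1 cycles in this model.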