
SLIDE 1

Models of Architecture

Nokia Bell Labs 2018 Maxime Pelcat INSA Rennes, IETR, Institut Pascal

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 732105: CERBERO.

SLIDE 2

INSA Rennes – IETR VAADER

INSA Rennes
IETR VAADER
Institut Pascal


SLIDE 3

Abstracting computational architecture to

– Predict performance
– Support current hardware evolutions

Models of Architecture

SLIDE 4

Hardware Architectures are becoming

– More complex
– More heterogeneous
– More High Performance embedded Computing (HPeC)

Embedded deep learning, near-sensor computing, fog computing, edge computing, many-cores, etc.

Real-time constraints, stream processing applications

Motivation: architecture evolution

SLIDE 5

Let’s look at ARM-based HPeC

–Let us consider 4 heterogeneous solutions

ARM = control path + some of the data path
in red: data path

Motivation: HPeC architectures

Multi-ARM + FPGA | Multi-ARM + GPGPU | Multi-ARM + DSP | Multi-ARM

SLIDE 6

Let’s look at ARM-based HPeC

Motivation: HPeC architectures

Multi-ARM + FPGA | Multi-ARM + GPGPU | Multi-ARM + DSP | Multi-ARM

SLIDE 7

ARM big.LITTLE: Samsung Exynos 5422

Motivation: HPeC architectures

High-performance cores: 4× A15, 2 GHz, SCU, 2 MB
Low-energy cores: 4× A7, 1.4 GHz, SCU, 0.5 MB
ACE interconnect, 2 GB DDR (PoP)
Easy to program: Linux SMP, thread migration
12 Gflops, <10 W

SLIDE 8

Let’s look at ARM-based HPeC

Motivation: HPeC architectures

Multi-ARM + FPGA | Multi-ARM + GPGPU | Multi-ARM + DSP | Multi-ARM

SLIDE 9

Multi-ARM + GPGPU: Nvidia Jetson TX1 module

Motivation: HPeC architectures

Control path: 4× A57, 1.6 GHz, SCU
Data path: 256-core Maxwell GPGPU, 32 cores/warp
4 GB external DDR on module
H.264 4K 60 Hz
Less easy to program: Linux SMP + CUDA/OpenCL

SLIDE 10

Let’s look at ARM-based HPeC

Motivation: HPeC architectures

Multi-ARM + FPGA | Multi-ARM + GPGPU | Multi-ARM + DSP | Multi-ARM

SLIDE 11

Multi-ARM + DSP: Texas Instruments Keystone II TCI6638K2K

Motivation: HPeC architectures

Control path: 4× A15, 1.4 GHz, SCU, 4 MB
Data path: 8× C66, 1 MB each, 1.2 GHz; FFTC
Teranet, 6 MB MSMC
160 Gflops, <15 W
Difficult to program (well): Linux SMP + Open Event Machine

SLIDE 12

Let’s look at ARM-based HPeC

Motivation: HPeC architectures

Multi-ARM + FPGA | Multi-ARM + GPGPU | Multi-ARM + DSP | Multi-ARM

SLIDE 13

Multi-ARM + FPGA: Xilinx Zynq UltraScale+

Motivation: HPeC architectures

Control path: 4× A53, 1.5 GHz, SCU, 1 MB
Data path: FPGA, up to 4 MB, 1M FF, 0.5M LUT, 600 MHz
Also on chip: 2× R5, GPU (not GPGPU)
Switch fabric
More difficult to program (well): Linux SMP + HLS or HDL

SLIDE 14

Motivation: HPeC architectures

Current trends

– FPGAs are gaining importance: what about flops?
– Adding video/image accelerators

Video compression: H.264/AVC, H.265/HEVC, etc.
AI: for tensor applications → reach 1 Tops/W

– RISC-V as an open HW competitor to ARM

SLIDE 15

Towards more complexity

– More cores, hierarchies of clusters
– Heterogeneity, interconnect complexity

Reminiscent of the intra-core changes of the 20th century

Motivation: architecture evolution

[Figure: intra-core parallelism: clocked ALU stages (clk), SIMD (+ + + +), VLIW (Ld × + Str)]

SLIDE 16

But there are some differences between intra-core and inter-core parallelism

– At coarse grain, PEs communicate asynchronously
– There is no (or less) centralized processing decision
– There is no performance portability (nothing equivalent to C-to-VLIW compilers)

How can/should we manage this HW complexity?

– Can we predict performance at design time? How?

Motivation: architecture evolution

SLIDE 17

System Objectives

T°C, Energy, Reliability, Memory, Unit Cost ($), Security, Maintenance Cost ($), Performance, Peak Power

SLIDE 18

System Design: Y-Chart

[Figure: Y-chart. Algorithm/Application and Architecture feed Design, which yields a System Prototype; Redesign arrows lead back to each input]

SLIDE 19

Model-Based Design

[Figure: model-based design flow. The Algorithm Model conforms to a Model of Computation (MoC); the Architecture Model conforms to a Model of Architecture (MoA); both feed KPI Evaluation, which yields KPIs; Redesign arrows lead back to each model]

SLIDE 20

On MoC Side: Many Results

#EdwardALee, #ProgrammingParadigms
Discrete Event MoCs
Finite State Machines → imperative languages
Functional MoCs
Petri Nets
Dataflow MoCs: SDF, CSDF, IDF, IBSDF, PSDF, SPDF, PiSDF, etc.

PREESM

SLIDE 21

And they are not all here…

Dataflow MoCs Case

Feature (per MoC: SDF, ADF, IBSDF, DSSF, PSDF, PiSDF, SADF, SPDF, DPN, KPN)

Expressivity: Low, Med., Turing complete
Hierarchical: X X X X
Compositional: X X X
Reconfigurable: X X X X X X
Statically schedulable: X X X X
Decidable: X X X X (X) (X) X (X)
Variable rates: X X X X X X X
Non-determinism: X X X

SDF: Synchronous Dataflow
ADF: Affine Dataflow
IBSDF: Interface-Based Dataflow
DSSF: Deterministic SDF with Shared Fifos
PSDF: Parameterized SDF
PiSDF: Parameterized and Interfaced SDF
SADF: Scenario-Aware Dataflow
SPDF: Schedulable Parametric Dataflow
DPN: Dataflow Process Network
KPN: Kahn Process Network

SLIDE 22

But Still a Lot to Do

Especially on real-time (RT) multicore systems

Usually, RT application specification =

– Multiple tasks sharing resources
– Activation periods or triggering events

Objective = keeping resources busy

[Figure: tasks T1, T2, T3]

SLIDE 23

MoCs are not sufficient

[Figure: the Algorithm is described by an Algorithm Model that conforms to a Model of Computation (MoC); Energy Evaluation is expected to yield Energy from this model alone]

SLIDE 24

Models of Architecture

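[Figure: model-based design flow. The Algorithm Model conforms to a Model of Computation (MoC); the Architecture Model conforms to a Model of Architecture (MoA); both feed KPI Evaluation, which yields KPIs; Redesign arrows lead back to each model]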

SLIDE 25

Problem: Predict System Quality

How can we predict a system's "quality"?

– Efficiently (simple procedure)
– Early (from abstract models)
– Accurately (with good fidelity)
– With reproducibility (same models = same prediction)

SLIDE 26

Model of Architecture

Definition

– Model of a system's Non-Functional Property (NFP)
– Application-independent
– Abstract
– Reproducible

Pelcat, M; Mercat, A; Desnos, K; Maggiani, L; Liu, Y; Heulot, J; Nezan, J-F; Hamidouche, W; Ménard, D; Bhattacharyya, S (2017) "Reproducible Evaluation of System Efficiency with a Model of Architecture: From Theory to Practice", IEEE TCAD.

SLIDE 27

Model of Architecture

Models compared on three criteria (reproducible, application-independent, abstract):

AADL
MCA SHIM
UML MARTE
AAA
CHARMED
S-LAM
MAPS
LSLA

SLIDE 28

NFP = MoA(activity(MoC(application)))

MoA depends on MoC

Model of Architecture

One and always the same quality evaluation
Model H conforms to MoA
Model G conforms to MoC

NFPs: Performance, Power, Energy, Memory, T°C, Reliability, Security, Cost

SLIDE 29

Model of Architecture

KPI MoA MoC Act

SLIDE 30

LSLA: First MoA

LSLA = Linear System-Level Architecture Model

Motivated by the additive nature of energy consumption

SLIDE 31

System Objectives

T°C, Energy, Reliability, Memory, Unit Cost ($), Security, Maintenance Cost ($), Performance, Peak Power

SLIDE 32

Energy/Power Define Architecture

[Figure: power scale (2 W, 7 W, 20 W, 20 kW, 20 MW) with the thresholds "need a dissipator" and "need a fan", spanning embedded systems, dedicated or conventional systems, and HPC; HPeC influence]

SLIDE 33

LSLA Model of Architecture

[Figure: SDF graph (Task1 to Task5 and two signals, unit rates) mapped onto an LSLA graph with PE1, PE2 and CN]

Cost functions: 10x+1, 2x+0, 3x+0
Total cost: 16+12+22=50
Activity units: token, quantum
Compositional

SLIDE 34

LSLA Model of Architecture

[Figure: SDF graph (Task1 to Task5 and two signals, unit rates) mapped onto an LSLA graph with PE1, PE2 and CN]

Cost functions: 10x+1, 2x+0, 3x+0
Total cost: 16+12+22=50

SDF: Model of Computation
Activity
LSLA: Model of Architecture
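The additive cost on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the slide gives the three linear functions and the three terms 16, 12 and 22, but not which function belongs to which component or how many tokens and quanta each component processes. The assignment (PE1: 10x+1, CN: 2x+0, PE2: 3x+0), the unit sizes and the counts (2, 8, 4) are hypothetical, chosen because they reproduce the slide's terms. The names `LinearCost` and `lsla_cost` are also illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearCost:
    """Cost of one token or quantum of a given size: slope * size + intercept."""
    slope: float
    intercept: float

    def cost(self, size: float) -> float:
        return self.slope * size + self.intercept


# ASSUMPTION: mapping of the slide's three functions to components.
PLATFORM = {
    "PE1": LinearCost(10, 1),  # 10x+1
    "CN": LinearCost(2, 0),    # 2x+0
    "PE2": LinearCost(3, 0),   # 3x+0
}

# ASSUMPTION: hypothetical activity, one (component, size) entry per token
# or quantum, all of unit size.
ACTIVITY = [("PE1", 1)] * 2 + [("CN", 1)] * 8 + [("PE2", 1)] * 4


def lsla_cost(platform, activity):
    """Sum the linear costs per component, then over the whole platform."""
    per_component = {name: 0.0 for name in platform}
    for name, size in activity:
        per_component[name] += platform[name].cost(size)
    return per_component, sum(per_component.values())


per_component, total = lsla_cost(PLATFORM, ACTIVITY)
print(per_component, total)
```

Because the total is a plain sum, the cost of a platform is the sum of the costs of its parts, which is what makes the model compositional.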

SLIDE 35

LSLA MoA for Energy Prediction

86% fidelity on an octo-core ARM
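The slide does not define fidelity. One common reading, assumed here, is the fraction of design-point pairs that the model ranks in the same order as the measurements. The sketch below implements that pairwise definition; the function name `fidelity` and the sample values are synthetic and are not taken from the talk.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def fidelity(measured, predicted):
    """Fraction of pairs (i, j) whose ordering agrees in both sequences."""
    if len(measured) != len(predicted):
        raise ValueError("sequences must have the same length")
    pairs = list(combinations(range(len(measured)), 2))
    if not pairs:
        raise ValueError("need at least two design points")
    agree = sum(
        sign(measured[i] - measured[j]) == sign(predicted[i] - predicted[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Synthetic example values (illustration only, not measured data).
measured = [5.0, 7.5, 6.1, 9.0]
predicted = [4.0, 8.0, 8.5, 10.0]
print(fidelity(measured, predicted))
```

Under this definition a model can be far off in absolute terms and still have high fidelity, as long as it ranks design alternatives correctly.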


SLIDE 36

LSLA MoA for Energy Prediction

The model is learnt from energy measurements

[Figure: LSLA graph of the platform with eight PEs and three CNs]

SLIDE 37

LSLA MoA for Energy Prediction

The model is learnt from energy measurements

[Figure: LSLA graph of the platform with eight PEs and three CNs labeled α, β and γ; four PEs are annotated 1.5 W and four are annotated 0.3 W]
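The slide does not say how the parameters are learnt. A minimal sketch, assuming ordinary least squares on a single linear cost function (energy = slope × size + intercept), is given below. The function name `fit_linear_cost` and the sample measurements are hypothetical; learning a full LSLA model would fit all components jointly.

```python
def fit_linear_cost(sizes, energies):
    """Ordinary least-squares fit of energy = slope * size + intercept."""
    n = len(sizes)
    if n < 2 or n != len(energies):
        raise ValueError("need at least two (size, energy) pairs")
    mean_x = sum(sizes) / n
    mean_y = sum(energies) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, energies))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic measurements (illustration only): token sizes and measured energy.
sizes = [1, 2, 3, 4]
energies = [11.2, 20.9, 31.1, 40.8]
slope, intercept = fit_linear_cost(sizes, energies)
print(slope, intercept)
```

Fitting from measurements rather than from hardware datasheets is what lets the learnt parameters absorb the effect of the OS, scheduler and communication libraries.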

SLIDE 38

LSLA: MoA, not MoHW

LSLA models HW + communication libraries + scheduler + OSs + …

LSLA models the service the platform offers to the applications

Top-down approach

– Learning parameters from experiments

SLIDE 39

System Objectives

T°C, Energy, Reliability, Memory, Unit Cost ($), Security, Maintenance Cost ($), Latency, Peak Power

SLIDE 40

MoAs: Limits of LSLA

Energy → linear model OK

Latency → latency does not have an additive nature

[Figure: two SDF graphs of Task1 and Task2 with unit rates; in one, latency = sum; in the other, latency = max]

SLIDE 41

Activity & MoA for Latency

[Figure: SDF graph (Task1 to Task5 and two signals, unit rates) with its activity split into parts a), b) and c)]

SLIDE 42

Activity & MoA for Latency

[Figure: LSLA graph with PE1, PE2 and CN]

Cost functions: 10x+1, 2x+0, 3x+0
a) Σ: 12+12+11=35
b) Σ: 8+6+11=25
MaxPlus: max(35,25)=35
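The arithmetic on this slide can be sketched as a sum within each part of the activity followed by a max across parts. The per-part costs are the slide's numbers; reading a) and b) as concurrent parts whose latencies combine by max is an assumption, and the name `maxplus_latency` is illustrative.

```python
def maxplus_latency(parts):
    """Sum the costs inside each part, then take the max across parts."""
    sums = {name: sum(costs) for name, costs in parts.items()}
    return sums, max(sums.values())


# Per-part costs as listed on the slide for a) and b).
PARTS = {
    "a": [12, 12, 11],
    "b": [8, 6, 11],
}

sums, latency = maxplus_latency(PARTS)
print(sums, latency)
```

Replacing the outer max with a sum gives back the additive behaviour that suits energy, which shows that the two properties differ only in how partial costs are combined.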

SLIDE 43

Activity & MoA for Latency

[Figure: LSLA graph with PE1, PE2 and CN; cost functions 10x+1, 2x+0, 3x+0]

c) Σ: 24

SLIDE 44

Accuracy? No, Fidelity!

[Figure: Y-chart. Algorithm/Application and Architecture feed Design, which yields a System Prototype; Redesign arrows lead back to each input]

SLIDE 45

Current Activities


SLIDE 46

Cross-layer Design of Reconfigurable Cyber-Physical Systems

H2020 CERBERO

SLIDE 47

CERBERO System Adaptation


SLIDE 48

H2020 Cerbero Toolchain

[Figure: CERBERO toolchain. Tools: VT, AOW, DynAA, PAPI, SPIDER, JADE, ARTICO3, MDC, PREESM, MECA. Labels: End-user interaction, System Model, Application / Architecture, Intermediate Representation, C++, Runtime Support, Low-Level Implementation (Hardware Abstraction)]

SLIDE 49

GdR SOC2

Groupement de recherche (research network) SOC2

– Systems on a Chip, Connected Systems
– Industrial partner club

SLIDE 50

GdR SOC2


SLIDE 51

SAMOS XIX

19th edition of SAMOS Conference
July 7-11, submission in March


SLIDE 52

Takeaway Message

MoAs can predict performance/quality early

–Especially for HPeC systems

MoAs are not HW Models

–They model HW + protocols + OS + …

MoAs are built/learnt top-down

–They can and should be simple

The need for MoAs may rise

–Due to Fog/Edge Computing complexity


KPI MoA MoC Act

SLIDE 53

Questions?


www.cerbero-h2020.eu http://preesm.org