Open Source Virtual Platforms for SW Prototyping on FPGA (Mark Burton)



SLIDE 1

Enabling System Level Design

Open Source Virtual Platforms for SW Prototyping on FPGA

Mark Burton

F1

SLIDE 2

Deep Learning Accelerator

  • Nvidia has a Deep Learning Accelerator (called NVDLA)
  • Nvidia also has a ‘C’ model of the DLA architecture (could be used as a SystemC/TLM model)

The NVIDIA Deep Learning Accelerator (NVDLA) is a free and open architecture that promotes a standard way to design deep learning inference accelerators. With its modular architecture, NVDLA is scalable, highly configurable, and designed to simplify integration and portability. The hardware supports a wide range of IoT devices. Delivered as an open source project under the NVIDIA Open NVDLA License, all of the software, hardware, and documentation will be available on GitHub. Contributions are welcome.

SLIDE 3

Turing Lecture 2017: Hennessy and Patterson

https://www.youtube.com/watch?v=3LVeEjsn8Ts

SLIDE 4

Goals

  • Bring HW and SW together
  • Minimize time to re-spin (change in HW / change in SW)
  • Enable simulation to be used by anybody
  • Make it easy and quick to use
  • Make the simulation FAST
  • Enable S/W development
SLIDE 5

Virtualization

[Figure: a spectrum running from Emulation through Virtual Platform, Virtualization and (Para-)Virtualization to Hardware, covering algorithm execution or full system virtualization.]

  • Application and O/S on a virtual platform (model): full ‘real binary’ execution on a virtual platform (model)
  • Application and O/S on an FPGA: full binary execution on a REAL platform (FPGA)
  • Application and O/S on hardware: full binary execution on the final hardware

SLIDE 7

Open Source SystemC Standard

Virtual Platform Standard is SystemC TLM-2.0 IEEE 1666

  • Open Source Simulator available for download from Accellera.org

Corporate members 2016

  • GreenSocs technology is at the heart of the TLM-2.0 standard.
  • All GreenSocs interfaces use TLM-2.0.
  • GreenSocs is helping Accellera forge a new model-to-tool standard.
  • Preview available in GreenConfig.
  • Our solutions are tool independent and work with all vendors.
SLIDE 8

Qemu: Our Preferred source of CPU models

  • Qemu is the de facto standard virtualizer.
  • Free and open source.
  • It is over 10 years old.
  • GreenSocs is a key contributor: reverse execution and the multi-core TCG kernel.
  • Regular committers from many organizations.

  • Architectures: 18
  • CPUs: 1100
  • Commits: 43000
  • Contributors: 1000
  • Lines of code: 989,863

SLIDE 9

Existing model database overview

CPU family coverage: x86, ARM, MIPS, Alpha, PowerPC, SPARC, MicroBlaze, ColdFire, CRIS, SH4, Xtensa

[Table: per-family availability of the fast SW development model (LT, 10 ✔ marks) and the cycle-accurate HW development model (AT, 4 ✔ marks); the column alignment of the marks is not recoverable.]

Full list (of several hundred) available on GreenSocs.com

SLIDE 10

QBox

  • Wraps up Qemu in a TLM-2.0 API such that it can be used in standard SystemC
  • QEMU is a generic and open source virtualizer – it covers almost all CPU architectures and achieves extremely high performance.

[Diagram: QBox (Qemu) connected to SystemC through TLM]
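The idea of wrapping a CPU model behind a transaction-level interface can be sketched in a few lines. The Python below is only an illustration of the pattern, not the TLM-2.0 or QBox API: `Transaction`, `Bus`, `Memory`, `CpuWrapper`, the `b_transport` method name (borrowed from TLM-2.0 terminology) and the 10 ns latency are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """A memory-mapped access, loosely inspired by a TLM-2.0 generic payload."""
    command: str      # "read" or "write"
    address: int
    data: int = 0
    ok: bool = False  # set by the target when the access succeeds


class Memory:
    """Target model: a sparse memory addressed from offset 0."""

    def __init__(self, size):
        self.size = size
        self.words = {}

    def b_transport(self, txn, delay_ns):
        if not 0 <= txn.address < self.size:
            return delay_ns                 # out of range: txn.ok stays False
        if txn.command == "write":
            self.words[txn.address] = txn.data
        else:
            txn.data = self.words.get(txn.address, 0)
        txn.ok = True
        return delay_ns + 10                # hypothetical access latency


class Bus:
    """Routes a transaction to the target mapped at its address."""

    def __init__(self):
        self.targets = []                   # list of (base, size, target)

    def map(self, base, size, target):
        self.targets.append((base, size, target))

    def b_transport(self, txn, delay_ns):
        for base, size, target in self.targets:
            if base <= txn.address < base + size:
                local = Transaction(txn.command, txn.address - base, txn.data)
                delay_ns = target.b_transport(local, delay_ns)
                txn.data, txn.ok = local.data, local.ok
                return delay_ns
        return delay_ns                     # unmapped address: txn.ok stays False


class CpuWrapper:
    """Stands in for the CPU side: turns guest loads/stores into transactions."""

    def __init__(self, bus):
        self.bus = bus
        self.local_time_ns = 0

    def store(self, address, value):
        txn = Transaction("write", address, value)
        self.local_time_ns = self.bus.b_transport(txn, self.local_time_ns)
        return txn.ok

    def load(self, address):
        txn = Transaction("read", address)
        self.local_time_ns = self.bus.b_transport(txn, self.local_time_ns)
        return txn.data if txn.ok else None
```

Because every device sits behind the same transport call, the CPU model does not need to know whether a target is a memory, a C model or a hardware wrapper.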

SLIDE 11

Qbox Synchronisation options

  • Real Time
      • Each simulator runs as close to real time as possible.
      • Can be simple “run as fast as you can”, no sync.
  • Windowed
      • Each simulator is allowed to run within a window, but if it reaches the end, it must stop and wait.
      • The window will automatically extend as simulators run.
      • (Windowed ‘behind’ to keep SystemC behind and the TLM delta time positive)
  • Deterministic/single threaded
      • Each simulator runs in turn.
      • Pseudo random ordering to ‘catch’ S/W bugs.
      • (The advantage of a model…)
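A minimal single-threaded sketch of the windowed rule follows, with the seeded pseudo-random ordering of the deterministic option added so that runs are repeatable. It is not QBox code: `Simulator`, `run_windowed`, the step sizes and the window length are hypothetical, and a real windowed scheme would let the simulators run in parallel threads rather than in rounds.

```python
import random


class Simulator:
    """A stand-in for one simulator (for example a CPU model or the SystemC kernel)."""

    def __init__(self, name, step_ns):
        self.name = name
        self.step_ns = step_ns  # simulated time covered by one step
        self.time_ns = 0        # local simulated time

    def step(self):
        self.time_ns += self.step_ns


def run_windowed(simulators, window_ns, end_ns, seed=0):
    """Run every simulator to end_ns; return the largest lead seen over the slowest."""
    if any(s.step_ns > window_ns for s in simulators):
        raise ValueError("window must be at least as large as every step")
    rng = random.Random(seed)   # seeded, so the ordering is repeatable
    max_lead_ns = 0
    while any(s.time_ns < end_ns for s in simulators):
        # The window starts at the slowest simulator, so it moves forward
        # automatically as the simulators make progress.
        window_end = min(s.time_ns for s in simulators) + window_ns
        order = list(simulators)
        rng.shuffle(order)      # pseudo-random ordering within each round
        for sim in order:
            if sim.time_ns < end_ns and sim.time_ns + sim.step_ns <= window_end:
                sim.step()
            # otherwise the simulator has reached the end of the window and waits
        times = [s.time_ns for s in simulators]
        max_lead_ns = max(max_lead_ns, max(times) - min(times))
    return max_lead_ns
```

For example, `run_windowed([Simulator("cpu", 10), Simulator("accel", 40)], window_ns=100, end_ns=400)` lets the coarser simulator run ahead, but never by more than the 100 ns window.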
SLIDE 12

Extending Qemu for Zynq

Clock framework

  • Enable the correct timing for events across the full Zynq device.

Large packet DMA framework

  • Significantly increase the speed of DMA activity in the simulated device.

Fault Injection

  • Model fault injection in a convenient and scriptable way, to enable safety and test features to be validated.

Safety and Test Library extensions to devices

  • Model the suite of devices in the Zynq that can be self tested.

GreenSocs is the partner upstreaming their device models
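To make "scriptable fault injection" concrete, here is a small sketch under stated assumptions. The script format (`<time_ns> <kind> <register> <bit>`), the fault kinds and the `Register`, `parse_script` and `FaultInjector` names are invented for illustration; they are not the interface of the Qemu extensions described above.

```python
class Register:
    """A device register whose reads can be corrupted by injected faults."""

    def __init__(self, name, value=0):
        self.name = name
        self.value = value
        self.stuck_mask = 0  # bits overridden by stuck-at faults
        self.stuck_bits = 0  # the values those bits are stuck at

    def read(self):
        return (self.value & ~self.stuck_mask) | (self.stuck_bits & self.stuck_mask)


def parse_script(text):
    """Parse lines of the hypothetical form '<time_ns> <kind> <register> <bit>'."""
    faults = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        time_ns, kind, reg_name, bit = line.split()
        faults.append((int(time_ns), kind, reg_name, int(bit)))
    return sorted(faults)


class FaultInjector:
    """Applies scheduled faults as simulated time advances."""

    def __init__(self, registers, faults):
        self.registers = {r.name: r for r in registers}
        self.pending = sorted(faults)

    def advance_to(self, now_ns):
        while self.pending and self.pending[0][0] <= now_ns:
            _, kind, reg_name, bit = self.pending.pop(0)
            self._apply(self.registers[reg_name], kind, 1 << bit)

    @staticmethod
    def _apply(reg, kind, mask):
        if kind == "bit_flip":       # one-off corruption of the stored value
            reg.value ^= mask
        elif kind == "stuck_at_1":   # the bit reads as 1 from now on
            reg.stuck_mask |= mask
            reg.stuck_bits |= mask
        elif kind == "stuck_at_0":   # the bit reads as 0 from now on
            reg.stuck_mask |= mask
            reg.stuck_bits &= ~mask
        else:
            raise ValueError(f"unknown fault kind: {kind}")
```

Keeping the faults in a plain text script means a test campaign can be changed without rebuilding the model.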

SLIDE 13

Extending Qemu Speed: Multi-Thread Qemu

  • A massive speed improvement for Qemu to take advantage of multi-core hosts

[Chart: Upstream vs MTTCG with 1, 2 and 4 VCPUs; axis ticks 1, 10, 20, 30, 40, 50]

SLIDE 14

Advanced features

  • NON-Deterministic Reverse Execution
      • Ability to debug from an error backwards, irrespective of input stimulus
      • Supporting
      • No H/W required, No ‘JTAG collector’ limit.
  • Cache modeling
      • Cache Coherency performance estimation
      • Cache flushing S/W checking
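As an illustration of what cache modeling can check, the sketch below counts hits and misses in a direct-mapped cache and flags a device (DMA) read of a line that software has written but not flushed. It is an assumption-laden toy, not the cache model used in the platform: `DirectMappedCache`, its methods and the geometry are hypothetical, and an evicted line is simply assumed to be written back.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks hits, misses and dirty lines."""

    def __init__(self, num_lines, line_bytes):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines
        self.dirty = [False] * num_lines
        self.hits = 0
        self.misses = 0

    def _locate(self, address):
        line_addr = address // self.line_bytes
        return line_addr % self.num_lines, line_addr // self.num_lines

    def access(self, address, write=False):
        """Model a CPU access; return True on a hit."""
        index, tag = self._locate(address)
        hit = self.tags[index] == tag
        if hit:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[index] = tag
            self.dirty[index] = False  # evicted line is assumed written back
        if write:
            self.dirty[index] = True
        return hit

    def flush_line(self, address):
        """Model software cleaning the line that holds this address."""
        index, tag = self._locate(address)
        if self.tags[index] == tag:
            self.dirty[index] = False

    def dma_read_is_safe(self, address):
        """A device read is unsafe while the line is still dirty in the cache."""
        index, tag = self._locate(address)
        return not (self.tags[index] == tag and self.dirty[index])
```

A driver test could call `dma_read_is_safe` just before the modelled device fetches a buffer, which catches a missing cache flush without any hardware.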
SLIDE 15

What’s OpenVP

  • User Application and user level device code
  • Kernel and kernel modules
  • Virtual Platform model
      • Based on QEMU and SystemC
      • ‘C’ model for NVDLA device itself

[Diagram: layers Guest User Space, Guest Kernel Space, Host User Space; blocks CPU Cluster Model, NVDLA FPGA Wrapper Model, NVDLA Cmodel, Mem Model, KMD, Applications, UMD, HW Tests, FPGA Parser, AWS Driver, NVDLA device, QEMU with TLM2C]

SLIDE 16

Problem

  • Simulation speed: the NVDLA accelerator is modelled on the host, so it will not ‘accelerate’.
  • Changes to the core NVDLA architecture require changes to the model.

SLIDE 17

Adding FPGA

  • User Application and user level device code
  • Kernel and kernel modules
  • Virtual Platform model, with FPGA wrapper
  • AWS framework
  • NVDLA FPGA hardware module
  • Runs at full speed!

[Diagram: layers Guest User Space, Guest Kernel Space, Host User Space, Host Kernel, AWS HW and FPGA; blocks CPU Cluster Model, NVDLA FPGA Wrapper Model, NVDLA Cmodel, Mem Model, KMD, Applications, UMD, AWS Kernel Driver, AWS Shell, NVDLA FPGA, Transactor, FPGA DRAM, HW Tests, FPGA Parser, AWS Driver, NVDLA device, QEMU with TLM2C]
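The point of the FPGA wrapper is that the rest of the virtual platform sees the same device whether it is backed by the C model or by hardware. A sketch of that substitution is shown here; the class names and the `transport` callable are hypothetical, and a real FPGA backend would go through the AWS driver stack rather than a Python function.

```python
from abc import ABC, abstractmethod


class AcceleratorBackend(ABC):
    """What the wrapper needs from whatever implements the accelerator."""

    @abstractmethod
    def write_reg(self, offset, value):
        ...

    @abstractmethod
    def read_reg(self, offset):
        ...


class CModelBackend(AcceleratorBackend):
    """Host-side functional model: registers live in a dictionary."""

    def __init__(self):
        self.regs = {}

    def write_reg(self, offset, value):
        self.regs[offset] = value

    def read_reg(self, offset):
        return self.regs.get(offset, 0)


class FpgaBackend(AcceleratorBackend):
    """Forwards accesses to hardware through a caller-supplied transport."""

    def __init__(self, transport):
        # transport(command, offset, value) is an assumed callable that
        # performs the access on the device and returns read data.
        self.transport = transport

    def write_reg(self, offset, value):
        self.transport("write", offset, value)

    def read_reg(self, offset):
        return self.transport("read", offset, None)


class WrapperModel:
    """The device as seen by the virtual platform; the backend is swappable."""

    def __init__(self, backend):
        self.backend = backend

    def access(self, command, offset, value=None):
        if command == "write":
            self.backend.write_reg(offset, value)
            return None
        if command == "read":
            return self.backend.read_reg(offset)
        raise ValueError(f"unknown command: {command}")
```

With this shape, switching from `WrapperModel(CModelBackend())` to `WrapperModel(FpgaBackend(transport))` leaves the guest software and the CPU model untouched.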

SLIDE 18

SPEED

  • SW on NVDLA C-Model
      • Anybody can download the packaged Docker release
      • Configurable – build time ½ hour.
      • FAST TO SET UP.
  • SW on FPGA with NVDLA RTL
      • Anybody can run the AWS env with pre-packaged AMI and AFI
      • With the AWS setup, it is easy to alter both FPGA images and associated drivers (e.g. in less than a day).
      • FAST TO RUN.

Both available from nvdla.org

SLIDE 19

All the components…

QEMU, TLM2C, Mem Model, DLA Cmodel

SLIDE 20

HW Test on FPGA

[Diagram: HW Description, FPGA Parser, FPGA Driver, AWS SDK, AWS FPGA]

Why we need HW tests on FPGA

  • To guarantee the quality of the FPGA release
  • To identify corner cases and issues in RTL

SLIDE 21

Full S/W stack

  • Based on SW on Cmodel
  • Replace all Cmodels (NVDLA, Mem model) with FPGA wrapper
  • Full user code executable on combined QEMU + FPGA model

[Diagram: UMD, KMD, QEMU, FPGA Wrapper, AWS SDK, AWS HDK, AWS FPGA]

SLIDE 22

Generalisation

  • Making this ‘generally’ applicable requires more work ☹
  • Enable any architecture to be modeled in a ‘cloud’ (public/private), off-loading onto FPGA when required/appropriate.
  • Enable ‘Virtualization’ when host/guest match.

SLIDE 23

Future Possibilities

  • NVDLA Performance Model integration for performance evaluation
  • More AWS FPGA image releases for different NVDLA configurations
  • Enable RISC-V in Virtual Platform
  • ARM Project Trillium
  • SiFive
SLIDE 24

More information:

www.greensocs.com
mark@greensocs.com
NVDLA page: http://nvdla.org/
OpenVP Doc: http://nvdla.org/contents.html
OpenVP Github page: https://github.com/nvdla/vp
https://github.com/nvdla/vp_awsfpga