Christopher Lavin, Marc Padilla, Jaren Lamprecht, Philip Lundrigan - - PowerPoint PPT Presentation




slide-1
SLIDE 1

RapidSmith: Do-It-Yourself CAD Tools for Xilinx FPGAs

Christopher Lavin, Marc Padilla, Jaren Lamprecht, Philip Lundrigan, Brent Nelson, and Brad Hutchings

FPL, September 5-7, 2011

slide-2
SLIDE 2

Why Build Your Own Tools Anyway?

  • Proof of concept in their own right

– Hypothetical architectures may not account for all real-world factors

  • Targeting real chips important
  • The field needs wild and crazy ideas

– The vendors don’t have all the answers!

  • That requires custom CAD tools


slide-3
SLIDE 3

The Challenge

  • Building custom physical CAD tools for commercial FPGAs == difficult

– Closed, proprietary device databases
– Unsupported interfaces

  • Architectural nuances complicate things…

slide-4
SLIDE 4

Motivation #1: Rapid Prototyping

[Figure: plot with axes "tool runtime" (hours, minutes, seconds) and "quality of result (QOR)"]

slide-5
SLIDE 5

Motivation #1: Rapid Prototyping

[Figure: plot with axes "tool runtime" (hours, minutes, seconds) and "quality of result (QOR)"]

Commercial tools focus here…

slide-6
SLIDE 6

Motivation #1: Rapid Prototyping

[Figure: plot with axes "tool runtime" (hours, minutes, seconds) and "quality of result (QOR)"]

Commercial tools focus here…

For rapid prototyping and implementation we would like tools which focus here…

slide-7
SLIDE 7

Motivation #2: Reliability

  • SEU mitigation using TMR

– Selective duplication tools
– Single-bit TMR failures in routing

  • Half-latch detection

– Weak keeper tie-offs susceptible to SEUs

  • Need a way to do post-PAR analysis
  • Need a way to do post-PAR modifications


slide-8
SLIDE 8

XDL: A Physical Database for Xilinx

  • A textual design database representation

– For Xilinx designs

  • Available for many years

[Figure: tool flow. The Xilinx steps map, par -r (place only), and par -p (route only) each produce an .NCD file; Xilinx xdl converts each .NCD to an .XDL file for the BYU RapidSmith tools and custom CAD tools; Xilinx bitgen produces the .BIT file.]

slide-9
SLIDE 9

#1: XDL as a Design Representation

  • xdl -ncd2xdl design

– Converts NCD to XDL

  • xdl -xdl2ncd design

– Converts XDL back to NCD

  • Can inject own CAD tools at any point in the flow or bypass it entirely
  • Must convert back to NCD for bitgen
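The two conversions above could be wrapped in a small helper when scripting a custom flow. The following is a minimal sketch: the class `XdlCommands` and its method names are hypothetical, and only the `-ncd2xdl` and `-xdl2ncd` flags come from the slide.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds the two xdl command lines named on the slide. */
final class XdlCommands {
    private XdlCommands() {}

    /** Command line for NCD -> XDL: xdl -ncd2xdl <design> */
    static List<String> ncdToXdl(String design) {
        return Arrays.asList("xdl", "-ncd2xdl", design);
    }

    /** Command line for XDL -> NCD: xdl -xdl2ncd <design> */
    static List<String> xdlToNcd(String design) {
        return Arrays.asList("xdl", "-xdl2ncd", design);
    }

    public static void main(String[] args) {
        String design = args.length > 0 ? args[0] : "design";
        // Only prints the commands. To run one, hand the list to
        // new ProcessBuilder(cmd).inheritIO().start() on a machine with the Xilinx tools.
        System.out.println(String.join(" ", ncdToXdl(design)));
        System.out.println(String.join(" ", xdlToNcd(design)));
    }
}
```

A custom tool would sit between the two calls: convert to XDL, modify the XDL, then convert back to NCD before running bitgen.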

[Figure: the tool-flow diagram again. Xilinx map, par -r (place only), and par -p (route only) produce .NCD files; Xilinx xdl converts them to .XDL for the BYU RapidSmith tools and custom CAD tools; Xilinx bitgen produces the .BIT file.]

slide-10
SLIDE 10

#2: XDLRC as a Device Description

  • xdl -report -pips -all_conns partName

– Dumps textual description of specific device as a .xdlrc file
– Details everything you need to write placers and routers (except timing data)

slide-11
SLIDE 11

Challenges of XDLRC Device Descriptions

  • They are massive!

– Up to 73 GB of text for one device!
– Difficult for tools to directly operate on XDLRC

  • They are missing some information

– Primitive sites that support more than 1 type
– Pin name mappings missing for some sites
– Result: placement/routing inefficiencies occur

  • RapidSmith solves these problems

slide-12
SLIDE 12

SOME TERMINOLOGY


slide-13
SLIDE 13

A Familiar View of the Fabric…


slide-14
SLIDE 14

A Familiar View of the Fabric…


slide-15
SLIDE 15

A Familiar View of the Fabric…

[Figure: the fabric drawn as a grid of tiles labeled L_TERM, INT, INT_SO, CLB, and IOIS]

slide-16
SLIDE 16

XDLRC Abstraction – 2D Tile Array


slide-17
SLIDE 17

XDLRC Abstraction - Tiles

Example tile names: HCLK_X1Y39, INT_X2Y37, CLB_X2Y37, DSP_X10Y32, BRAM_X5Y32
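The tile names on this slide all appear to follow a `<TYPE>_X<n>Y<m>` pattern. As a rough sketch, assuming that pattern holds (it is inferred from these five examples, not taken from Xilinx documentation), a name can be split into its type and coordinates. The class `TileName` is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits a tile name such as "CLB_X2Y37" into type, X, and Y. */
final class TileName {
    // Assumed pattern: <TYPE>_X<digits>Y<digits>; the type itself may contain underscores.
    private static final Pattern NAME = Pattern.compile("^(.+)_X(\\d+)Y(\\d+)$");

    final String type;
    final int x;
    final int y;

    private TileName(String type, int x, int y) {
        this.type = type;
        this.x = x;
        this.y = y;
    }

    /** Parses a tile name, or throws if it does not match the assumed pattern. */
    static TileName parse(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a tile name: " + name);
        }
        return new TileName(m.group(1), Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        String[] names = {"HCLK_X1Y39", "INT_X2Y37", "CLB_X2Y37", "DSP_X10Y32", "BRAM_X5Y32"};
        for (String n : names) {
            TileName t = parse(n);
            System.out.println(t.type + " X=" + t.x + " Y=" + t.y);
        }
    }
}
```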

slide-18
SLIDE 18

XDLRC Abstraction – Primitive Sites

INT_X2Y37 contains: TIEOFF_X2Y37

CLB_X2Y37 contains: SLICE_X3Y75, SLICE_X3Y74, SLICE_X2Y75, SLICE_X2Y74

BRAM_X5Y32 contains: RAMB16_X0Y8, FIFO16_X0Y8

DSP_X10Y32 contains: DSP48_X0Y17, DSP48_X0Y16

slide-19
SLIDE 19

XDL EXAMPLES


slide-20
SLIDE 20

XDL Example

inst "inst23" "SLICEL",placed CLB_X13Y45 SLICE_X18Y91 ,
  cfg " BXINV::BX BYINV::#OFF ...
        F:inst23lut0:#LUT:D=((~A1*A3)+(A1*A2))
        G:inst23lut1:#LUT:D=((~A1*A3)+(A1*A4))
        ... YUSED::#OFF ";
...
net "shiftResult4" , cfg " ",
  inpin "inst4" G3 ,
  outpin "inst5" YQ ,
  ;

slide-21
SLIDE 21

XDL Example

inst "inst23" "SLICEL",placed CLB_X13Y45 SLICE_X18Y91 ,
  cfg " BXINV::BX BYINV::#OFF ...
        F:inst23lut0:#LUT:D=((~A1*A3)+(A1*A2))
        G:inst23lut1:#LUT:D=((~A1*A3)+(A1*A4))
        ... YUSED::#OFF ";
...
net "shiftResult4" , cfg " ",
  inpin "inst4" G3 ,
  outpin "inst5" YQ ,
  pip CLB_X31Y53 IMUX_B18_INT -> G3_PINWIRE2 ,
  pip CLB_X31Y54 YQ_PINWIRE2 -> SECONDARY_LOGIC_OUTS6_INT ,
  pip INT_X31Y53 OMUX_S3 -> IMUX_B18 ,
  pip INT_X31Y54 SECONDARY_LOGIC_OUTS6 -> OMUX3 ,
  ;
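Each `pip` entry in the net above names a tile, a start wire, and an end wire. The sketch below pulls those three fields out of one line. It is an illustration only, not RapidSmith's XDL parser: the class `PipLine` is hypothetical, and it handles only the `->` connection form shown on the slide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for a single XDL pip entry: pip <tile> <startWire> -> <endWire> , */
final class PipLine {
    private static final Pattern PIP =
        Pattern.compile("^\\s*pip\\s+(\\S+)\\s+(\\S+)\\s+->\\s+([^\\s,]+)\\s*,?\\s*$");

    final String tile;
    final String startWire;
    final String endWire;

    private PipLine(String tile, String startWire, String endWire) {
        this.tile = tile;
        this.startWire = startWire;
        this.endWire = endWire;
    }

    /** Parses one pip entry, or throws if the line has a different shape. */
    static PipLine parse(String line) {
        Matcher m = PIP.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a pip entry: " + line);
        }
        return new PipLine(m.group(1), m.group(2), m.group(3));
    }

    public static void main(String[] args) {
        PipLine p = parse("pip INT_X31Y53 OMUX_S3 -> IMUX_B18 ,");
        System.out.println(p.tile + ": " + p.startWire + " drives " + p.endWire);
    }
}
```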

slide-22
SLIDE 22

XDL Module Example

module "mux" "inst23" , cfg " _SYSTEM_MACRO::FALSE ";
  port "mux5i_0_inport" "inst31" "F4";
  port "mux5i_1_inport" "inst33" "F2";
  ...
  inst "inst23" "SLICEL",placed CLB_X13Y45 SLICE_X18Y91 ,
    cfg " BXINV::BX BYINV::#OFF ... YUSED::#OFF ";
  ...
  net "shiftResult4" , cfg " ",
    inpin "inst4" G3 ,
    outpin "inst5" YQ ,
    pip CLB_X31Y53 IMUX_B18_INT -> G3_PINWIRE2 ,
    pip CLB_X31Y54 YQ_PINWIRE2 -> SECONDARY_LOGIC_OUTS6_INT ,
    pip INT_X31Y53 OMUX_S3 -> IMUX_B18 ,
    pip INT_X31Y54 SECONDARY_LOGIC_OUTS6 -> OMUX3 ,
    ;
endmodule "mux";

slide-23
SLIDE 23

THE RAPIDSMITH TOOL SUITE


slide-24
SLIDE 24

RapidSmith

[Figure: RapidSmith reads an XDL file and writes an XDL file]

slide-25
SLIDE 25

RapidSmith

[Figure: RapidSmith reads an XDL file and writes an XDL file; inside RapidSmith, an internal graph representation is exposed through a Java API]

slide-26
SLIDE 26

RapidSmith

[Figure: RapidSmith reads an XDL file and writes an XDL file; inside RapidSmith, an internal graph representation is exposed through a Java API, on which custom CAD tools (create, place, route, modify circuits) are built]

slide-27
SLIDE 27

RapidSmith Abstractions

Design (XDL)
  Instance: PrimitiveType, Attribute (List), PrimitiveSite
  Net: NetType, Pin (List), PIP (List)
  Module: Port (List), Instance (List), Net (List)
  ModuleInstance: Instance (List), Net (List)

Device (XDLRC)
  Tile (2D Array): TileType, PrimitiveSite (Array)
  PrimitiveSite: PrimitiveType, Tile
  Wire
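To make the device side of these abstractions concrete, here is a toy model in which tiles own primitive sites and each site points back to its tile. The classes `ToyDevice`, `Tile`, and `Site` are hypothetical stand-ins written for this note, not RapidSmith's own classes. The data is the tile and site list from the Primitive Sites slide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy device model; class names are hypothetical, not RapidSmith's API. */
final class ToyDevice {
    /** A tile has a name and contains zero or more primitive sites. */
    static final class Tile {
        final String name;
        final List<Site> sites = new ArrayList<>();
        Tile(String name) { this.name = name; }
    }

    /** A primitive site records its name and the tile that contains it. */
    static final class Site {
        final String name;
        final Tile tile;
        Site(String name, Tile tile) { this.name = name; this.tile = tile; }
    }

    private final Map<String, Tile> tiles = new HashMap<>();
    private final Map<String, Site> sites = new HashMap<>();

    /** Adds a tile together with the primitive sites it contains. */
    void addTile(String tileName, String... siteNames) {
        Tile tile = new Tile(tileName);
        for (String siteName : siteNames) {
            Site site = new Site(siteName, tile);
            tile.sites.add(site);
            sites.put(siteName, site);
        }
        tiles.put(tileName, tile);
    }

    Tile getTile(String name) { return tiles.get(name); }
    Site getSite(String name) { return sites.get(name); }

    /** Builds the four tiles listed on the Primitive Sites slide. */
    static ToyDevice example() {
        ToyDevice dev = new ToyDevice();
        dev.addTile("INT_X2Y37", "TIEOFF_X2Y37");
        dev.addTile("CLB_X2Y37", "SLICE_X3Y75", "SLICE_X3Y74", "SLICE_X2Y75", "SLICE_X2Y74");
        dev.addTile("BRAM_X5Y32", "RAMB16_X0Y8", "FIFO16_X0Y8");
        dev.addTile("DSP_X10Y32", "DSP48_X0Y17", "DSP48_X0Y16");
        return dev;
    }

    public static void main(String[] args) {
        ToyDevice dev = example();
        System.out.println(dev.getSite("SLICE_X3Y75").tile.name); // prints CLB_X2Y37
        System.out.println(dev.getTile("BRAM_X5Y32").sites.size()); // prints 2
    }
}
```

The back-pointer from site to tile is what lets a placer go from a chosen site to the tile it sits in without searching.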

slide-28
SLIDE 28

XDLRC Device File Creation

  • Three major strategies to reduce XDLRC information size:

– Aggressive wire and object reuse
– Careful pruning of unnecessary wires
– Customized serialization and compression

  • XDLRC size compression of >10,000X
  • Device files load in just a few seconds or less
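One plausible reading of "object reuse" is that identical connection lists, which repeat across the many tiles of the same type, are stored once and shared. The sketch below shows that general idea with a simple interning pool. The slides do not describe how RapidSmith actually implements reuse, so the class `ConnectionPool` and the use of integer wire IDs are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pool that shares identical wire-connection arrays between tiles. */
final class ConnectionPool {
    private final Map<List<Integer>, int[]> pool = new HashMap<>();

    /**
     * Returns one canonical array with the same contents as the argument.
     * Callers must treat the returned array as read-only because it is shared.
     */
    int[] intern(int[] connections) {
        List<Integer> key = new ArrayList<>(connections.length);
        for (int c : connections) {
            key.add(c);
        }
        return pool.computeIfAbsent(key, k -> connections.clone());
    }

    /** Number of distinct connection arrays stored. */
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool();
        int[] tileA = pool.intern(new int[] {1, 2, 3});
        int[] tileB = pool.intern(new int[] {1, 2, 3}); // same contents as tileA
        int[] tileC = pool.intern(new int[] {4});
        System.out.println(tileA == tileB); // prints true: one shared array
        System.out.println(tileA == tileC); // prints false
        System.out.println(pool.size());    // prints 2
    }
}
```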

slide-29
SLIDE 29

RapidSmith Device Files Performance

Xilinx Part Name    XDLRC Report Size   RapidSmith File Size   Memory Footprint   Load Time From Disk
Virtex 4 LX200      10.0 GB             1.01 MB                61 MB              602 ms
Virtex 5 LX330      12.5 GB             1.07 MB                69 MB              622 ms
Virtex 6 CX240T     8.5 GB              0.94 MB                35 MB              460 ms
Virtex 6 LX760      22.8 GB             1.76 MB                77 MB              1.07 s
Virtex 7 855T       32.0 GB             2.63 MB                115 MB             1.41 s
Virtex 7 2000T      73.6 GB             5.96 MB                301 MB             3.34 s
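The size reduction implied by each row can be checked with a few lines of arithmetic. This sketch assumes 1 GB = 1024 MB, which the slide does not specify. Under that assumption the Virtex 7 2000T row works out to roughly 12,600X. The class `CompressionRatio` is hypothetical.

```java
/** Computes XDLRC-to-device-file size ratios from the table (assumes 1 GB = 1024 MB). */
final class CompressionRatio {
    private CompressionRatio() {}

    /** Ratio of XDLRC report size (GB) to RapidSmith device file size (MB). */
    static double ratio(double xdlrcGb, double deviceFileMb) {
        return xdlrcGb * 1024.0 / deviceFileMb;
    }

    public static void main(String[] args) {
        String[] parts = {"Virtex 4 LX200", "Virtex 5 LX330", "Virtex 6 CX240T",
                          "Virtex 6 LX760", "Virtex 7 855T", "Virtex 7 2000T"};
        double[] xdlrcGb = {10.0, 12.5, 8.5, 22.8, 32.0, 73.6};
        double[] fileMb = {1.01, 1.07, 0.94, 1.76, 2.63, 5.96};
        for (int i = 0; i < parts.length; i++) {
            System.out.printf("%-16s %.0fX%n", parts[i], ratio(xdlrcGb[i], fileMb[i]));
        }
    }
}
```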

slide-30
SLIDE 30

7 EXAMPLES OF RAPIDSMITH USE AND CAPABILITIES


slide-31
SLIDE 31

RapidSmith Example #1: Random Placer

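import java.util.Random;
// RapidSmith imports (package names assumed from the RapidSmith distribution)
import edu.byu.ece.rapidSmith.design.Design;
import edu.byu.ece.rapidSmith.design.Instance;
import edu.byu.ece.rapidSmith.device.PrimitiveSite;

public class RandomPlacer{
    public static void main(String[] args){
        // Create and load a design
        Design design = new Design(args[0]);
        Random rng = new Random(0); // Create random number generator (fixed seed)
        // Place all unplaced instances
        for(Instance i : design.getInstances()){
            if(i.isPlaced()) continue;
            PrimitiveSite[] sites = design.getDevice().getAllCompatibleSites(i.getType());
            int idx = rng.nextInt(sites.length);
            int watchDog = 0;
            // Find a free primitive site, wrapping around at the end of the array
            while(design.isPrimitiveSiteUsed(sites[idx])){
                if(++idx >= sites.length) idx = 0;
                if(++watchDog > sites.length){
                    // Every compatible site is taken: stop instead of looping forever
                    System.out.println("Placement failed.");
                    return;
                }
            }
            i.place(sites[idx]);
        }
        // Save the placed design
        design.saveXDLFile(args[1]);
    }
}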
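The site search in this placer is a linear probe from a random starting index. The sketch below isolates that loop so it can run without RapidSmith or a device file. The class `RandomSiteProbe` is hypothetical, and a plain boolean array stands in for the device's site occupancy.

```java
import java.util.Random;

/** Stand-alone version of the placer's site search: linear probing from a random start. */
final class RandomSiteProbe {
    private RandomSiteProbe() {}

    /** Returns the index of a free site, or -1 if every site is already used. */
    static int findFreeSite(boolean[] used, Random rng) {
        if (used.length == 0) {
            return -1;
        }
        int idx = rng.nextInt(used.length);
        // Visit each site at most once, wrapping around at the end of the array.
        for (int tried = 0; tried < used.length; tried++) {
            if (!used[idx]) {
                return idx;
            }
            idx = (idx + 1) % used.length;
        }
        return -1;
    }

    public static void main(String[] args) {
        boolean[] used = new boolean[8]; // eight compatible sites, all free at first
        Random rng = new Random(0);
        for (int inst = 0; inst < 10; inst++) { // ten instances, so the last two cannot fit
            int site = findFreeSite(used, rng);
            if (site < 0) {
                System.out.println("instance " + inst + ": placement failed");
            } else {
                used[site] = true;
                System.out.println("instance " + inst + " -> site " + site);
            }
        }
    }
}
```

Bounding the loop by the array length serves the same purpose as the `watchDog` counter: the search ends after one full pass even when no site is free.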

slide-32
SLIDE 32

RapidSmith Example #2: Placing a Module

// Load XDL file (parses XDL, populates design object)
Design design = new Design("moduleContainingDesign.xdl");
// Get the 1024-FFT module definition by name
Module fft = design.getModule("fft1024");
// Create an instance of the FFT module called "f0"
ModuleInstance mi = design.createModuleInstance("f0", fft);
// Find all sites compatible with the anchor
PrimitiveType type = mi.getAnchor().getType();
PrimitiveSite[] sites = design.getDevice().getAllCompatibleSites(type);
int i = 0;
// Try each compatible site in turn until the module fits
while(!mi.place(sites[i++], design.getDevice())){
    if(i >= sites.length)
        error(mi.getName() + " has no valid placement!"); // error() is assumed to abort
}

slide-33
SLIDE 33

RapidSmith Example #3: VCC/GND Handling

  • GND/VCC supplied in two ways:

– LUTs configured to drive ‘1’ or ‘0’
– TIEOFF primitives in every switch box

  • Supplied GND / VCC posts
  • Router must partition nets into neighborhoods to use local static sources

– RapidSmith includes a StaticSourceHandler class with a variety of methods to provide this functionality
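As a rough illustration of the partitioning idea, the sketch below groups the sink pins of a static net by the switch-box tile they enter through, so each group could be driven by that tile's TIEOFF. This is one simple notion of "neighborhood" chosen for the example. It is not the StaticSourceHandler algorithm, which the slides do not describe, and the class `StaticSourcePartition` and the sink names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: split a static (GND/VCC) net into per-switch-box groups of sinks. */
final class StaticSourcePartition {
    private StaticSourcePartition() {}

    /**
     * Groups sink pins by switch-box tile.
     * Each input row is {sinkPinName, switchBoxTileName}.
     */
    static Map<String, List<String>> partition(String[][] sinks) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] sink : sinks) {
            groups.computeIfAbsent(sink[1], k -> new ArrayList<>()).add(sink[0]);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up sinks of one GND net and the INT tile each one is reached from.
        String[][] gndSinks = {
            {"inst4.G3", "INT_X31Y53"},
            {"inst4.F1", "INT_X31Y53"},
            {"inst9.BX", "INT_X2Y37"},
        };
        for (Map.Entry<String, List<String>> e : partition(gndSinks).entrySet()) {
            System.out.println(e.getKey() + " drives " + e.getValue());
        }
    }
}
```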

slide-34
SLIDE 34

RapidSmith Example #4: HMFlow

[Figure: HMFlow block diagram. Input designs (.mdl) and hard macro sources (HM Cache, Generic HMG) feed a Design Parser & Mapper, Design Stitcher, XDL Hard Macro Placer, and XDL Router, producing placed & routed XDL (.xdl).]

  • Rapid compilation approach using hard macros
  • Built on top of RapidSmith
  • Part of CHREC research project
  • Demonstrated > 50X reduction in tool flow time

slide-35
SLIDE 35

RapidSmith Example #5: Device Browser


slide-36
SLIDE 36

RapidSmith Example #6: Design Explorer


slide-37
SLIDE 37

RapidSmith Example #7: Custom Hard Macro Placer


slide-38
SLIDE 38

RapidSmith Example #7: Timing Visualizer

slide-39
SLIDE 39

Conclusion

  • RapidSmith

– Provides XDL-based infrastructure
– Designed to aid in the construction of custom CAD tools

  • Flexible

– Custom CAD flow
    • HMFlow for Hard Macro-Based Design
– Custom individual steps in the flow
    • Placer or router
– Post Xilinx flow circuit modifications
    • Reliability modifications
– Post Xilinx flow circuit analysis
    • Timing visualization

  • Available open source:

– http://rapidsmith.sourceforge.net