Why additional timing analysis for multi-voltage paths?

Jignesh Shah, Programmable Solutions Group, Intel. ACM International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems, March 15-16, 2018.


SLIDE 1

Jignesh Shah
Programmable Solutions Group, Intel
ACM International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems
March 15-16, 2018

SLIDE 2

Why additional timing analysis for multi-voltage paths?

  • In advanced FPGA products, the core and periphery can use different power rails to meet higher-performance, better-security, and low-power requirements.
  • Vlow = Vnominal - Regulator Noise - IR drop of a distributed power rail
  • Vhigh = Vnominal + Regulator Noise + Overshoot of a distributed power rail

Figure labels: Vsub, Vtop, Hard IP, I/O, Analog, Programmable DSP, Configurable Memory, FPGA LUT, Glue Logic

SLIDE 3

Limitation…

  • The timing corners of a single-voltage design are the combinations of the extreme points listed in Table 1.
  • For static timing analysis (STA) of a multi-voltage design, the number of timing corners increases considerably, which imposes a large runtime and memory penalty on the FPGA software.

Table 1: Total corners of a single-voltage design = 3 * 5 * 2 * 2 = 60

  Device:       SS, TT, FF
  Interconnect: RCWorst, Cworst, RCBest, Cbest, Typical
  Voltage:      Low, High
  Temperature:  Hot, Cold

Table 2: Total corners of a multi-voltage design can be around 120

  Device:              SS, TT, FF
  Interconnect:        RCWorst, Cworst, RCBest, Cbest, Typical
  Voltage Combination: Low-Low, High-Low, Low-High, High-High
  Temperature:         Hot, Cold
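The corner counts in Tables 1 and 2 follow from a Cartesian product of the table axes. A minimal Python sketch, assuming only the axis values listed in the two tables (the variable names are illustrative and do not come from any FPGA tool):

```python
from itertools import product

device = ["SS", "TT", "FF"]
interconnect = ["RCWorst", "Cworst", "RCBest", "Cbest", "Typical"]
temperature = ["Hot", "Cold"]

# Single-voltage design: one rail, so two voltage extremes.
single_voltage = ["Low", "High"]
# Multi-voltage design: combinations of the two rails, as listed in Table 2.
multi_voltage = ["Low-Low", "High-Low", "Low-High", "High-High"]

single_corners = list(product(device, interconnect, single_voltage, temperature))
multi_corners = list(product(device, interconnect, multi_voltage, temperature))

print(len(single_corners))  # 60
print(len(multi_corners))   # 120
```

Doubling the voltage axis doubles the corner count, which is the overhead the margin-based approach tries to avoid.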

SLIDE 4

Can we reduce total corners through additional margin?

  • During static timing analysis (STA), the FPGA software uses timing models of the core and subsystem IPs either both at low voltage or both at high voltage.
  • A timing margin needs to be computed for the Low-High and High-Low voltage crossings and applied to the timing model of the subsystem, so that the worst-case timing is covered by the High-High or Low-Low voltage crossing. Timing analysis of the Low-High and High-Low voltage crossings is then not required.

Figure 1: Timing path between periphery & core

Table 1: Scenarios for timing analysis in FPGA software

  Voltage combination        Timing models used by FPGA software
  Vcore high : Vsub high     Yes
  Vcore high : Vsub low      No
  Vcore low  : Vsub low      Yes
  Vcore low  : Vsub high     No

SLIDE 5

How to compute timing margin for voltage domain crossing (VDC) signals?

  • T_race @ Vlow = minimum difference between the slowest datapath and the fastest clock path at low voltage
  • T_race @ Vhigh = minimum difference between the fastest datapath and the slowest clock path at high voltage
  • If T_race is larger at Vhigh than at Vlow, the delay change of the clock path is larger than the delay change of the datapath when the subsystem voltage differs from the one used in the timing model.
  • If (T_race @ Vhigh > T_race @ Vlow) then (Timing_margin = T_race @ Vhigh - T_race @ Vlow)
    else (Timing_margin = 0)

Figure 1: VDC timing path through input port
Figure 2: VDC timing path through output port
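The margin rule on this slide can be written as a small function. This is a sketch only: the function name and the sample values (in ns) are assumptions for illustration, not taken from the slides.

```python
def timing_margin(t_race_vlow: float, t_race_vhigh: float) -> float:
    """Excess of T_race at Vhigh over T_race at Vlow, or 0 if there is none."""
    if t_race_vhigh > t_race_vlow:
        return t_race_vhigh - t_race_vlow
    return 0.0

# Illustrative numbers in ns.
print(timing_margin(0.25, 0.75))  # 0.5
print(timing_margin(0.75, 0.25))  # 0.0
```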

SLIDE 6

Corner for timing margin computation

  • Generating timing margins of VDC signals for all corners can be expensive. Instead, the margin should be computed for the dominant corner and reused in the timing models of all other corners.
  • The dominant corner for transistor devices can be found by studying the voltage sensitivity of a few representative circuits across all device process and temperature corners.
  • Interconnect delay (i.e. Elmore delay) is less sensitive to changes in supply voltage, and wires around the periphery boundary are usually longer and use higher metal layers. So the RCworst or Cworst corner can be the dominant interconnect corner.
  • With this approach, less than 5% pessimism could be added to the timing results of VDC paths.
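As background for the interconnect bullet, the first-order Elmore delay of a wire is built only from its resistance and capacitance, with no supply-voltage term. The sketch below assumes a simple RC ladder; the function name and values are illustrative, and real extraction flows are considerably more detailed.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay to the far end of an RC ladder.

    resistances[i] is the series resistance of segment i and
    capacitances[i] is the capacitance at the node after that segment.
    """
    total = 0.0
    for i, r in enumerate(resistances):
        # Each resistor charges all of the capacitance downstream of it.
        total += r * sum(capacitances[i:])
    return total

# Illustrative values: ohms and farads give seconds.
print(elmore_delay([1.0, 1.0], [1.0, 1.0]))  # 3.0
```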
SLIDE 7

Automated flow

Flowchart steps:

  • Input: a list of VDC ports of the subsystem/IP
  • STA @ Vlow in the dominant corner, producing the T_race report @ Vlow
  • STA @ Vhigh in the dominant corner, producing the T_race report @ Vhigh
  • Generate the VDC margin of each port (VDC margin of all delay arcs of each VDC port)
  • Update the timing model with the margin of each VDC port: the PreVDC timing model of all corners becomes the timing model with VDC margin
  • Integrate the timing models into the FPGA software
  • QA of the timing model
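The margin-generation and model-update steps of this flow can be sketched as follows. This is a hedged illustration: the report format (a dict from port name to T_race), the port names, and both function names are hypothetical, and the update simply adds the margin to every arc, whereas the backup slides subtract it for min-delay arcs.

```python
def generate_vdc_margins(t_race_vlow, t_race_vhigh):
    """Return the per-port VDC margin from two T_race reports (port -> T_race)."""
    margins = {}
    for port in t_race_vlow:
        diff = t_race_vhigh[port] - t_race_vlow[port]
        margins[port] = diff if diff > 0 else 0.0
    return margins

def update_timing_model(pre_vdc_model, margins):
    """Add each port's margin to every delay arc of that port.

    pre_vdc_model maps port -> list of arc delays; a new dict is returned.
    """
    return {
        port: [delay + margins.get(port, 0.0) for delay in arcs]
        for port, arcs in pre_vdc_model.items()
    }

# Hypothetical T_race reports from STA in the dominant corner (ns).
report_vlow = {"din[0]": 0.25, "dout[0]": 0.5}
report_vhigh = {"din[0]": 0.5, "dout[0]": 0.25}

margins = generate_vdc_margins(report_vlow, report_vhigh)
model = update_timing_model({"din[0]": [1.0, 1.5], "dout[0]": [2.0]}, margins)
print(margins)  # {'din[0]': 0.25, 'dout[0]': 0.0}
print(model)    # {'din[0]': [1.25, 1.75], 'dout[0]': [2.0]}
```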

SLIDE 8

Summary

  • For multi-voltage paths, the FPGA software cannot enable timing analysis for all voltage combinations because of the additional PVT corners and the overhead associated with them.
  • By adding extra margin to the delay arcs of VDC signals in the model of the subsystem/IP, timing can be ensured for multi-voltage paths without running STA for any new PVT corner.

SLIDE 9

Thank You

SLIDE 10

Backup

SLIDE 11

VDC margin for INPUT port of subsystem

In the Liberty model, the timing arc is defined between the data input pin and the clock pin (i.e. a setup/hold type sequential arc).

  Td_in   = delay from the input port of the subsystem to the data pin of the internal flop
  Tck_int = delay from the subsystem clock source to the clock pin of the internal flop
  Tck_out = delay from the subsystem clock source to the clock output port
  T_race  = (Tck_out + Td_in) - Tck_int (race between data & clock)
  T_race @ Vlow  = T_race when Vsub is low
  T_race @ Vhigh = T_race when Vsub is high

  • Setup check can be worst for Vtop = low and Vsub = high (i.e. faster capture clock), so a VDC margin is required.
  • Hold check can be worst for Vtop = high and Vsub = low (i.e. slower capture clock), so a VDC margin is required.

Figure labels: Vtop Domain, Vsub Domain, clock source in Vsub Domain, Td_in, Tck_int, Tck_out

SLIDE 12

Continued: VDC margin for INPUT port of subsystem

  • VDC margin for setup and hold checks:

    if (T_race @ Vhigh > T_race @ Vlow) then {
        VDC_margin_setup_check = (T_race @ Vhigh - T_race @ Vlow)
        VDC_margin_hold_check  = (T_race @ Vhigh - T_race @ Vlow)
    } else {
        VDC_margin_setup_check = 0
        VDC_margin_hold_check  = 0
    }

  • Add “VDC_margin_setup_check” to the lookup table of the setup timing arc of the timing model @ Vlow
  • Add “VDC_margin_hold_check” to the lookup table of the hold timing arc of the timing model @ Vhigh
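A worked example of the input-port case may help. The sketch below assumes made-up delays (in ns) and small 2x2 lookup tables; the function name and all numbers are illustrative, not values from the presentation or from a real Liberty file.

```python
def t_race_input(tck_out: float, td_in: float, tck_int: float) -> float:
    """T_race = (Tck_out + Td_in) - Tck_int for an input VDC port."""
    return (tck_out + td_in) - tck_int

# Illustrative subsystem delays with Vsub low and Vsub high.
t_race_vlow = t_race_input(tck_out=1.0, td_in=0.5, tck_int=1.25)      # 0.25
t_race_vhigh = t_race_input(tck_out=0.75, td_in=0.375, tck_int=0.75)  # 0.375

# Same rule as the if/else above: positive difference, otherwise zero.
margin = max(t_race_vhigh - t_race_vlow, 0.0)

# Hypothetical constraint lookup tables (rows x columns of values in ns).
setup_lut_vlow = [[0.5, 0.75], [0.75, 1.0]]
hold_lut_vhigh = [[0.25, 0.25], [0.5, 0.5]]

# The margin is added to the setup table @ Vlow and the hold table @ Vhigh.
setup_with_margin = [[v + margin for v in row] for row in setup_lut_vlow]
hold_with_margin = [[v + margin for v in row] for row in hold_lut_vhigh]

print(margin)             # 0.125
print(setup_with_margin)  # [[0.625, 0.875], [0.875, 1.125]]
print(hold_with_margin)   # [[0.375, 0.375], [0.625, 0.625]]
```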
SLIDE 13

VDC margin for OUTPUT port of subsystem

In the Liberty model, the timing arc is defined between the clock pin and the data output pin (i.e. a CK-Q type sequential arc).

  Td_out  = delay from the internal flop to the output port of the subsystem
  Tck_int = delay from the subsystem clock source to the clock pin of the internal flop
  Tck_out = delay from the subsystem clock source to the clock output port of the subsystem
  T_race  = (Td_out + Tck_int) - Tck_out (race between data & clock)
  T_race @ Vlow  = T_race when Vsub is low
  T_race @ Vhigh = T_race when Vsub is high

  • Setup check can be worst for Vtop = low and Vsub = high (i.e. faster capture clock), so a VDC margin is required.
  • Hold check can be worst for Vtop = high and Vsub = low (i.e. slower capture clock), so a VDC margin is required.

Figure labels: Vtop Domain, Vsub Domain, clock source in Vsub Domain, Td_out, Tck_out, Tck_int

SLIDE 14

Continued: VDC margin for OUTPUT port of subsystem

  • VDC margin for setup and hold checks:

    if (T_race @ Vhigh > T_race @ Vlow) then {
        VDC_margin_setup_check = (T_race @ Vhigh - T_race @ Vlow)
        VDC_margin_hold_check  = (T_race @ Vhigh - T_race @ Vlow)
    } else {
        VDC_margin_setup_check = 0
        VDC_margin_hold_check  = 0
    }

  • Add “VDC_margin_setup_check” to the lookup table of the “CK-Q” type arc of the timing model @ Vlow
  • Subtract “VDC_margin_hold_check” from the lookup table of the “CK-Q” type arc with “min_delay_flag: true” of the timing model @ Vhigh
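For the output port the margin moves the two delay tables in opposite directions: the max-delay CK-Q table @ Vlow gets slower and the min-delay CK-Q table @ Vhigh gets faster. A minimal sketch, assuming hypothetical one-row delay tables and illustrative delays in ns (`shift_lut` and `t_race_output` are names invented here):

```python
def t_race_output(td_out: float, tck_int: float, tck_out: float) -> float:
    """T_race = (Td_out + Tck_int) - Tck_out for an output VDC port."""
    return (td_out + tck_int) - tck_out

def shift_lut(lut, delta):
    """Return a copy of a lookup table with delta added to every entry."""
    return [[value + delta for value in row] for row in lut]

# Illustrative subsystem delays with Vsub low and Vsub high.
t_race_vlow = t_race_output(td_out=1.0, tck_int=1.0, tck_out=1.5)     # 0.5
t_race_vhigh = t_race_output(td_out=0.75, tck_int=0.75, tck_out=0.75)  # 0.75
margin = max(t_race_vhigh - t_race_vlow, 0.0)

# Hypothetical CK-Q delay tables.
ck_q_max_vlow = [[2.0, 2.5]]    # max-delay arc, timing model @ Vlow
ck_q_min_vhigh = [[1.0, 1.25]]  # min-delay arc, timing model @ Vhigh

print(shift_lut(ck_q_max_vlow, +margin))   # [[2.25, 2.75]]
print(shift_lut(ck_q_min_vhigh, -margin))  # [[0.75, 1.0]]
```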
SLIDE 15

VDC margin for feedthrough data and clock

Both the data and the clock signals feed through the subsystem. A combinational arc is defined between the input and output ports.

  Td_ft  = data delay from the input port to the output port of the subsystem
  Tck_ft = clock delay from the input port to the output port of the subsystem
  T_race = Td_ft - Tck_ft
  T_race @ Vlow  = T_race when Vsub is at low voltage
  T_race @ Vhigh = T_race when Vsub is at high voltage

  • The VDC margin depends on which side of the timing path (i.e. launch or capture) the clock passes through the subsystem. The margin can be computed from the data and clock delay race in the subsystem at Vlow and Vhigh.

Figure labels: Vtop Domain, Vsub Domain, Vsub Domain, Vtop Domain, Td_ft, Tck_ft, Td_ft, Tck_ft

SLIDE 16

Continued: VDC margin for feedthrough data & clock

  • VDC margin for setup and hold checks:

    if (T_race @ Vhigh > T_race @ Vlow) then {
        VDC_margin_setup_check = (T_race @ Vhigh - T_race @ Vlow)
        VDC_margin_hold_check  = (T_race @ Vhigh - T_race @ Vlow)
    } else {
        VDC_margin_setup_check = 0
        VDC_margin_hold_check  = 0
    }

  • Add “VDC_margin_setup_check” to the lookup table of the “combinational” type arc between the input and output DATA signals of the timing model @ Vlow
  • Subtract “VDC_margin_hold_check” from the lookup table of the “combinational” type arc between the input and output DATA signals with “min_delay_flag: true” of the timing model @ Vhigh

SLIDE 17

Only data is going out and clock is internal to subsystem

Data leaves the subsystem from an internal flop. The internal clock of the subsystem does not leave the subsystem. In the Liberty model, the timing arc is defined between the data pin and the internal clock pin.

  • No VDC margin is needed for such a data signal, because the setup check is worst for Vsub = low and the hold check is worst for Vsub = high.

Figure labels: Vtop Domain, Vsub Domain, clock in Vsub Domain, clock in Vtop Domain

SLIDE 18

Data is going out and clock is coming in to subsystem

Data leaves the subsystem from an internal flop. The internal clock of the subsystem does not leave the subsystem. In the Liberty model, the timing arc is defined between the data pin and the internal clock pin.

  • No VDC margin is needed for such a data signal, because the setup check is worst for Vsub = low and the hold check is worst for Vsub = high.

Figure labels: Vtop Domain, Vsub Domain, clock in Vtop Domain, clock in Vtop Domain

SLIDE 19

Data & internal clock are going out from subsystem

Data leaves the subsystem from an internal flop. The internal clock of the subsystem also leaves the subsystem. In the Liberty model, the timing arc is defined between the data pin and the internal clock pin.

  Td_out  = delay from the internal flop to the output port of the subsystem
  Tck_int = delay from the subsystem clock source to the clock pin of the internal flop
  Tck_out = delay from the subsystem clock source to the clock output port of the subsystem
  T_race  = (Td_out + Tck_int) - Tck_out (race between data & clock)
  T_race @ Vlow  = T_race when Vsub is low
  T_race @ Vhigh = T_race when Vsub is high

  • Setup check can be worst for Vtop = low and Vsub = high (i.e. faster capture clock), so a VDC margin is required.
  • Hold check can be worst for Vtop = high and Vsub = low (i.e. slower capture clock), so a VDC margin is required.

Figure labels: Vtop Domain, Vsub Domain, clock in Vsub Domain, Td_out, Tck_out, Tck_int

SLIDE 20

Feedthrough data only

Data feeds through the subsystem. In the Liberty model, there is no timing arc from a clock pin to a feedthrough data pin.

  • No VDC margin is needed for such a data signal, because the setup check is worst for Vsub = low and the hold check is worst for Vsub = high.

Figure labels: Vtop Domain, Vsub Domain, clock in Vtop Domain, clock in Vtop Domain

SLIDE 21

Feedthrough clock only

Only the clock feeds through the subsystem.

  • The VDC margin depends on which side of the timing path (i.e. launch or capture) the clock passes through the subsystem. The margin can be computed from the clock delay through the subsystem at Vlow and Vhigh.

Figure labels: Vtop Domain, Vsub Domain, Vsub Domain, Vtop Domain

SLIDE 22

Feedthrough data and clock

Both the data and the clock signals feed through the subsystem.

  Td_ft  = data delay from the input port to the output port of the subsystem
  Tck_ft = clock delay from the input port to the output port of the subsystem
  T_race = Td_ft - Tck_ft
  T_race @ Vlow  = T_race when Vsub is at low voltage
  T_race @ Vhigh = T_race when Vsub is at high voltage

  • The VDC margin depends on which side of the timing path (i.e. launch or capture) the clock passes through the subsystem. The margin can be computed from the data and clock delay race in the subsystem at Vlow and Vhigh.

Figure labels: Vtop Domain, Vsub Domain, Vsub Domain, Vtop Domain, Td_ft, Tck_ft, Td_ft, Tck_ft

slide-23
SLIDE 23