Synchronizers and Arbiters, David Kinniment, University of Newcastle

SLIDE 1

Tutorial 7 April 2008


Synchronizers and Arbiters

David Kinniment University of Newcastle

SLIDE 2

Outline

What’s the problem?
Why does it matter?
Synchronizer and arbiter circuits
Noise, and its effects
Latency, and how to overcome it
Metastability measurement 1
– Simple measurements
Arbitration
Metastability measurement 2
– Second order effects (Which may matter)

Sessions: 9:00 – 10:30 and 11:00 – 12:00

SLIDE 3

What’s the problem: The digital world and the real world

Your system

The synchronizer

Everything else, or Reality


SLIDE 4

Synchronizers and arbiters

Synchronizer (Input to Your system)
– Decides which clock cycle to use for input

Asynchronous arbiter (Input 1 and Input 2 to Your system)
– Decides which input to take first

SLIDE 5

Time Comparison Hardware

Digital comparison hardware (which compares integers) is easy
– Fast
– Bounded time

Analog comparison hardware (which compares reals like time) is hard
– Normally fast, but takes longer as the difference becomes smaller
– Can take forever

Synchronization and arbitration involve comparison of time

SLIDE 6

History & Philosophy

Abu Hamid Ibn Muhammad Ibn Muhammad al-Tusi al-Shafi’i al-Ghazali ~1100
– “Suppose two similar dates in front of a man who has a strong desire for them, but who is unable to take them both. Surely he will take one of them through a quality in him, the nature of which is to differentiate between two similar things”
– He felt that this demonstrated free will

Jehan Buridan, Rector of Paris University ~1340
– Buridan’s Ass (A dog with two bowls?)
– “Should two courses be judged equal, then the will cannot break the deadlock, all it can do is to suspend judgment until the circumstances change, and the right course of action is clear”
– He’s not so sure

SLIDE 7

Digital Computers

Voltages have a finite number of values in a computer, 1 and 0
Time has a discrete number of instants in a synchronous system
BUT
Computers have to talk to other computers and to people who are not synchronous
Known to early computer designers:
– Lubkin 1952, Catt 1966
– Chaney and Littlefield 1966/72

SLIDE 8

State of the art in 1980

SLIDE 9

Your options

Synchronizing a clocked system
– You have a limited time to synchronize
– Synchronizer circuits may fail to work in that time
– System sometimes fails
– You fly into a mountain

Arbitrating requests for an asynchronous system
– Can take forever (with decreasing probability)
– You fly into a mountain

SLIDE 10

Why does it matter?

Systems are Globally Asynchronous
– 4 x increase in global asynchronous signalling by 2012
– 8 x by 2020 [ITRS 2005]
– Communication time is an increasing part of system performance

And Locally Synchronous
– Many different clocks
– Many synchronizers
– Need to know the reliability of the synchronizer
– Synchronisation adds latency to communication time

SLIDE 11

A Network on Chip (Sparsø 2005)

[Figure: network on chip, annotated: Synchronization required; Multiple Clocks; Asynchronous Arbitration required]

SLIDE 12

Outline

What’s the problem?
Why does it matter?
Synchronizer and arbiter circuits
Noise, and its effects
Latency, and how to overcome it
Metastability measurement 1
– Simple measurements
Arbitration
Metastability measurement 2
– Second order effects (Which may matter)

SLIDE 13

Metastability is...

Not being able to decide…

[Figure: flip-flop with inputs D and Clock and outputs Q and Qbar; a Request is sampled by the Processor Clock. As the D to Clock separation ∆tin -> 0, the set-up time is violated]

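SLIDE 14

Metastability in a Latch

[Figure: cross-coupled inverters I1 and I2 with node voltages V1 and V2; on the V1 against V2 transfer curves the two outer crossings are the stable points and the middle crossing is the metastable point]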

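SLIDE 15

Linear Model

Simple linear model leads to two exponentials
τa is convergent, τb is divergent

$V_1 = K_a e^{-t/\tau_a} + K_b e^{t/\tau_b}$

$0 = \tau_1 \tau_2 \frac{d^2 V_1}{dt^2} + \frac{\tau_1 + \tau_2}{A} \frac{dV_1}{dt} + \left(\frac{1}{A^2} - 1\right) V_1$

$\tau_1 = \frac{C_1 R_1}{A}, \quad \tau_2 = \frac{C_2 R_2}{A}, \quad g_m = \frac{A}{R}$

[Figure: small-signal model of the cross-coupled pair Q1, Q2; each stage is an inverting source of gain A driving its R and C, with node voltages V1 and V2]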

SLIDE 16

How often does it fail?

The output trajectory is an exponential that depends on the starting condition K; K depends on ∆tin

$V = K e^{t/\tau}$

Suppose the clock frequency is fc, the data rate fd, and Ka = 0
In M seconds we have M.fc clocks. The probability of a data change within ∆tin of any clock is ∆tin.fd, so there will be one within M seconds if

$\Delta t_{in} = \frac{1}{M f_c f_d}$

The time taken to resolve this event is t (Tw is the metastability window)

$\frac{V}{K} = \frac{T_w}{\Delta t_{in}} = e^{t/\tau}$

SLIDE 17

Synchronizer

t is time between clock a and clock b
τ and Tw depend on circuit

$MTBF = \frac{1}{\Delta t_{in} f_c f_d}$

OR

$MTBF = \frac{e^{t/\tau}}{T_w f_c f_d}$

[Figure: two flip-flops in series, #1 clocked by CLK a and #2 clocked by CLK b, producing VALID]
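
The second MTBF expression, MTBF = e^(t/τ) / (Tw.fc.fd), can be evaluated numerically. The following is a minimal sketch, not taken from the tutorial: the function name `mtbf_seconds` and all parameter values (τ = 50 ps, Tw = 100 ps, fc = 100 MHz, fd = 10 MHz) are assumptions chosen only for illustration.

```python
import math


def mtbf_seconds(t, tau, t_w, f_c, f_d):
    """MTBF = exp(t / tau) / (T_w * f_c * f_d), with all times in seconds."""
    return math.exp(t / tau) / (t_w * f_c * f_d)


# Assumed example values (not from the slides).
TAU = 50e-12   # resolution time constant
T_W = 100e-12  # metastability window
F_C = 100e6    # clock frequency
F_D = 10e6     # data rate

for resolution_time in (1e-9, 2e-9, 3e-9):
    print(resolution_time, mtbf_seconds(resolution_time, TAU, T_W, F_C, F_D))
```

Each extra τ of resolution time multiplies the MTBF by e, which is why a modest increase in the time allowed between clock a and clock b changes the result by orders of magnitude.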

SLIDE 18

Edge Triggered FF

Synchronizer fails if time too long
Failures proportional to
– Clock frequency
– Data change frequency

[Figure: master and slave latches (D, C, Q) in series with waveforms for Data, Clock, Master Out and Slave Out; annotations: master metastable, slave transparent, slave metastable]

SLIDE 19

Synchronizer responses

  • Data and Clock are asynchronous
  • Trigger from Q and observe time between clock and Q

[Figure: Osc 1 and Osc 2 drive Data and Clock of flip-flop #1 and the Q output triggers the scope; waveforms shown for data low and data changing]

SLIDE 20

Typical responses

All starting points are equally possible
Most are a long way from the “balance point”
A few are very close and take a long time to resolve

[Figure: Q output traces against Clock]

SLIDE 21

Event Histogram

[Figure: log of events against propagation delay. The probability of an event depends on ∆ time; the slope is -1/τ, the intercept is ~Tw, and the normal delay is marked]

SLIDE 22

State of the art

You require about 35 τ in order to get the MTBF out to about 1 century. (That’s for 1 synchronizer)
Each typical static gate delay is equivalent to about 5 τ in a properly designed synchronizing flop. (You can increase Vdd on the flop to get it faster)
You should assume a ‘malicious’ input to the synchronizer. This adds about 5 – 10 τ to the delay (depending on how cautious you are).
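
The "about 35 τ for a century" figure can be checked by solving MTBF = e^(t/τ) / (Tw.fc.fd) for t/τ. This is a rough sketch under assumed operating conditions (Tw = 50 ps, fc = 1 GHz, fd = 100 MHz); none of these values come from the slides, and `taus_required` is a hypothetical helper name.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def taus_required(mtbf_s, t_w, f_c, f_d):
    """Solve MTBF = exp(t / tau) / (T_w * f_c * f_d) for t / tau."""
    return math.log(mtbf_s * t_w * f_c * f_d)


# Assumed conditions: 50 ps window, 1 GHz clock, 100 MHz data rate.
n_tau = taus_required(100 * SECONDS_PER_YEAR, t_w=50e-12, f_c=1e9, f_d=100e6)
print(f"about {n_tau:.1f} tau for a 100 year MTBF")
```

Because the dependence is logarithmic, changing any of the assumed values by a factor of ten moves the answer by only about 2.3 τ, so the result stays in the same region as the slide's figure.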

SLIDE 23

Jamb latch synchronizer

Fast and simple

[Figure: jamb latch with Data, Clock and Reset inputs, internal Node A and Node B, and outputs Out A and Out B]

SLIDE 24

The arbiter (MUTEX)

  • Asynchronous arbitration
  • No time bound

[Figure: MUTEX circuit with Request 1, Request 2, Grant 1, Grant 2 and Gnd]

SLIDE 25

Metastability filters

Half levels due to metastability need to be removed
– Low (or high) threshold inverters
– Measure divergence
Filters define the time to reach a stable state

[Figure: filter circuits with nodes at Vdd/2 and an inverter threshold Vt = Vdd/4]

SLIDE 26

Arbitration time

Unlike a synchronizer, an arbiter may take for ever.
It usually doesn’t; long responses are rare.
On average the time is only τ longer than the normal response.
Outputs are always monotonic

[Figure: waveforms for Request 1, Request 2, Grant 1 and Grant 2, with metastability time tm]
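
The statement that the average response is only τ longer follows from the relation Tw/∆tin = e^(t/τ) if the input separation ∆tin is assumed to be uniformly distributed inside the window Tw, because the mean of ln(1/u) for uniform u is 1. The sketch below checks this by sampling; the uniform-input assumption, the parameter values and the function names are illustrative and not taken from the tutorial.

```python
import math
import random


def extra_resolution_time(tau, t_w, delta_t_in):
    """Extra time to resolve, from T_w / delta_t_in = exp(t / tau)."""
    return tau * math.log(t_w / delta_t_in)


def mean_extra_time(tau, t_w, samples=200_000, seed=1):
    """Average extra time for input separations uniform in (0, T_w]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # 1 - random() lies in (0, 1], which avoids dividing by zero.
        delta_t_in = t_w * (1.0 - rng.random())
        total += extra_resolution_time(tau, t_w, delta_t_in)
    return total / samples


TAU = 30e-12   # assumed value
T_W = 100e-12  # assumed value
print(mean_extra_time(TAU, T_W) / TAU)  # close to 1, i.e. one tau on average
```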

SLIDE 27

Future synchronizers

Synchronizers don’t work well in nanometre technologies
Worse than gates! Why?
Gate delays depend on large signal issues:
– C.VT/Ids determines how long it takes to discharge C to VT before the next gate changes state
– Ids large when transistor is hard on

[Figure: transistor current Ids discharging capacitance C to VT]

SLIDE 28

No gain at Vdd/2

As Vdd decreases with process shrink
– Gate threshold does not decrease, in order to minimise leakage
A gate input is either HIGH
– Output pulled down
Or LOW
– Output pulled up
A metastable gate is neither
– Both transistors can be off
– gm very low
Synchronization time constant τ = C/gm

[Figure: inverter between Vdd and Ground showing Ids for a HIGH input, a LOW input, and an input at Vdd/2]

SLIDE 29

Low Vdd, low temperature

Both transistors off, gm → 0, τ → ∞ at Vdd < 0.6V
Low temperature gives higher threshold so even worse
Does not track logic

[Figure: Tau vs Vdd, Tau in ps (100 to 700) against Vdd in V (0.5 to 1.5); series: Tau at 27 C, Tau at -25 C, FO4 inverter at 27 C]

SLIDE 30

Vdd tolerant circuit

Turn on p-types when latch is metastable
– Extra current gives high gm in n-types
– Normally low power
gm depends mainly on n-types
– fast

[Figure: latch with weak p keepers, wide n for good gm, and extra current when metastable]

SLIDE 31

Effect of extra current

Tau at 0.6V down from >700ps to <100ps
Tracks logic, so does not limit performance

[Figure: New Circuit, Tau in ps (50 to 300) against Vdd in V (0.5 to 2); series: Tau at 27 C, Tau at -25 C, FO4 inverter at 27 C]

SLIDE 32

Outline

What’s the problem?
Why does it matter?
Synchronizer and arbiter circuits
Noise, and its effects
Latency, and how to overcome it
Metastability measurement 1
– Simple measurements
Arbitration
Metastability measurement 2
– Second order effects (Which may matter)

SLIDE 33

Does noise affect τ ?

Probability of escape from metastability does not change with gaussian noise (Couranz and Wann 1975)

[Figure: Trajectories, Volts against Time]


SLIDE 35

Noise can change the input time

Or maybe not...

[Figure: flip-flop with inputs D and Clock and outputs Q and Qbar; D to Clock separation ∆tin, with ∆tin -> 0]

SLIDE 36

The normal case

[Figure: probability against time. P0(v): probability of initial difference due to input clock data overlap, width T. P1(v): probability of initial difference due to noise component, width tn. T >> tn. P(v): result of convolution]

SLIDE 37

The malicious input

[Figure: probability against time. P0(v): probability of initial difference with zero input clock data overlap, width T. P1(v): probability of initial difference due to noise component, width tn. T << tn. P(v): result of convolution]

SLIDE 38

Noise measurement

[Figure: probability of an output 1 as a function of input voltage difference; axes: Input mV, Probability (0 to 1)]

A measurement of approximately 1.7mV RMS at the input corresponds to about 0.6mV total between latch nodes

mV C kT 7 . 4 ≈

This is equivalent to about 0.1ps
Typically this leads to a synchronization time of about 11τ longer than the simple case for a malicious input.

SLIDE 39

Outline

What’s the problem?
Why does it matter?
Synchronizer and arbiter circuits
Noise, and its effects
Latency, and how to overcome it
Metastability measurement 1
– Simple measurements
Arbitration
Metastability measurement 2
– Second order effects (Which may matter)

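SLIDE 40

Request and Acknowledge

[Figure: DATA transfer with a REQ/ACK handshake. Two flip-flops clocked by the read clocks synchronize the request (REQ, Data Available); two flip-flops clocked by the write clocks synchronize the acknowledge (Read done, ACK)]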

SLIDE 41

Latency

It takes one to two receive clocks to synchronise the request
Then one to two write clocks to acknowledge it
Significant latency (1 – 3 clocks)
Poor data rate (2 – 6 clocks)

SLIDE 42

FIFO

Can improve data rate by using a FIFO
But not latency (which gets worse)
FIFO is asynchronous (usually RAM + read and write pointers)

[Figure: FIFO between the WRITE and READ sides carrying DATA. Not Empty passes through two flip-flops (Read Clock 1, Read Clock 2) to give Data Available; Full passes through two flip-flops (Write clock 1, Write clock 2) to give Free to write; Write Data and Read done control the FIFO]

SLIDE 43

Timing regions can have predictable relationships

Locked
– Two clocks are not the same but phase linked. The relationship is known as mesochronous.
– Two clocks from same source
– Linked by PLL
– One produced by dividing the other
– Some asynchronous systems
– Some GALS

Not locked together
– Phase difference can drift in an unbounded manner. This relationship is called plesiochronous
– Two clocks same frequency, but different oscillators.
– As above, same frequency ratio

SLIDE 44

Don’t synchronise when you don’t need to

If the two clocks are locked together, you don’t need a synchroniser, just an asynchronous FIFO big enough to accommodate any jitter/skew
FIFO must never overflow, so there is latency

[Figure: FIFO carrying DATA, with REQ IN and ACK IN (Write Data) on one side and REQ OUT and ACK OUT (Data Available, Read done) on the other]

SLIDE 45

Mesochronous data exchange

Intermediate X register used to retime data
Need to find a place where write data is stable, and read register available
– Greenstreet 2004

[Figure: DATA In passes through registers W, X and R to DATA Out; Write Clock and Read Clock drive W and R, and a Controller clocks X]

SLIDE 46

Finding the place to clock X

Provided that tc > 2(th + ts), at least one place is always available for data transfer, but we lose one cycle.
– Write before read, or
– Read before write

[Figure: Write Clock and Read Clock waveforms with th and ts windows around the edges; the free regions are marked OK RW and OK WR]
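
The condition tc > 2(th + ts) can be read as a statement about two forbidden windows on the clock cycle, each th + ts wide: one around the write clock edge and one around the read clock edge. The sketch below is one simplified reading of that idea, not the controller from the tutorial. It assumes both transfers use the same ts and th, ignores clock-to-output delays, and places the write edge at phase 0; `is_safe` and `find_x_phase` are hypothetical names.

```python
def is_safe(x, tc, ts, th, read_phase):
    """True if clocking X at phase x violates neither transfer."""
    d = x % tc                 # time since the write clock edge (phase 0)
    e = (x - read_phase) % tc  # time since the read clock edge
    w_to_x_ok = ts < d < tc - th  # write data is stable around the X edge
    x_to_r_ok = th < e < tc - ts  # X output is stable around the read edge
    return w_to_x_ok and x_to_r_ok


def find_x_phase(tc, ts, th, read_phase):
    """Return the middle of the widest free arc of the clock cycle."""
    if tc <= 2 * (ts + th):
        raise ValueError("need tc > 2 * (th + ts)")
    # Boundaries of the two forbidden windows, taken modulo tc.
    edges = sorted({(-th) % tc, ts % tc,
                    (read_phase - ts) % tc, (read_phase + th) % tc})
    best = None
    for i, start in enumerate(edges):
        end = edges[(i + 1) % len(edges)]
        length = (end - start) % tc or tc
        mid = (start + length / 2) % tc
        if is_safe(mid, tc, ts, th, read_phase):
            if best is None or length > best[0]:
                best = (length, mid)
    return best[1]


# Assumed numbers: 10 ns cycle, 1 ns set-up, 1 ns hold, read edge 0.5 ns late.
print(find_x_phase(10.0, 1.0, 1.0, 0.5))
```

Picking the middle of the widest free arc leaves the most margin for jitter on both sides, which is a design choice of this sketch rather than something the slide specifies.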

SLIDE 47

Pre synchronizing

If the phase can vary with time (Plesiochronous), synchronization still need not cause large latencies
Synchronization problem known in advance

[Figure: Read Clock and Write Clock waveforms with a potential conflict zone of width d on each side; annotations: detect conflict (metastability issue), predicted conflict, delay read clock]

SLIDE 48

Conflict prediction

Predict when clocks are going to conflict and delay synchronization
Dike’s conflict detector

[Figure: WCLK and RCLK, with delays d, feed two MUTEX elements (inputs R1, R2; outputs G1, G2) whose outputs form the conflict signal; the timing diagram marks the conflict region around RCLK]

SLIDE 49

Clock delay synchronizer (Ginosar 2004)

[Figure: DATA, REG, RCLK, WCLK, conflict detector with delays d, SYNC, a selector for RCLK delayed by tKO, and the conflict region]

SLIDE 50

Pre synchronizer latency

Nominally 0 – 1 clock cycle
Relies on accurately predicting conflicts
Clocks must remain stable over synchronisation time.
Always lose tKO of next computation stage
Alternative: shift all conflicts to next read cycle
– On average this loses 2d
– 2d must be big enough to cover any clock drift/jitter over synchronization time

SLIDE 51

Speculation

Mostly, the synchronizer does not need 30τ to settle
Only e^-13 (0.00023%) need more than 13τ
Why not go ahead anyway, and try again if more time was needed?

SLIDE 52

Low latency synchronization

Data Available, or Free to write are produced early.
If they prove to be in error, synchronization failed.
Read Fail or Write Fail flag is then raised and the action can be repeated.

[Figure: FIFO between the WRITE and READ sides carrying DATA. A speculative synchronizer on the Read Clock turns Not Empty into Data Available and Read Fail; a speculative synchronizer on the Write clock turns Full into Free to write and Write Fail; Write Data and Read done control the FIFO]

SLIDE 53

Q Flop

  • With CLK low, both outputs are low
  • With CLK high, Q becomes equal to D only after metastability
  • Q and Qbar are both low until metastability resolved
  • We can detect events that take longer than a half cycle

[Figure: Q flop circuit with inputs D and CLK, outputs Q and Qbar, and Gnd]
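
The behaviour listed for the Q flop can be summarised as a small truth function. This is a purely behavioural sketch: the `resolved` flag is a hypothetical stand-in for the analog question of whether the internal latch has left metastability, and the function name is not from the tutorial.

```python
def q_flop_outputs(clk, d, resolved):
    """Return (Q, Qbar) for the Q flop behaviour described above."""
    if not clk or not resolved:
        # CLK low, or still metastable: both outputs stay low.
        return (0, 0)
    # Resolved with CLK high: Q follows D and Qbar is its complement.
    return (1, 0) if d else (0, 1)


def decision_made(q, qbar):
    """A decision is visible as soon as exactly one output goes high."""
    return bool(q) != bool(qbar)


print(q_flop_outputs(clk=1, d=1, resolved=False))  # (0, 0)
print(q_flop_outputs(clk=1, d=1, resolved=True))   # (1, 0)
```

Because neither output rises until the decision is made, sampling `decision_made` at the half cycle point tells the surrounding logic whether this event took longer than half a cycle.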

SLIDE 54

Was it OK?

  • FF#1 is set after a half cycle minus 2τ, FF#2 after a half cycle, FF#3 at a full cycle
  • Latency is normally half a cycle = 15τ, but synchroniser fails often
  • By the time we look at the Read Fail signal (a full cycle = 30τ) all signals are stable

[Figure: Not Empty feeds a Q Flop (D, CLK, Q, Qbar) clocked by the Read Clock; flip-flops #1 to #4 take the Early Synch, Speculative Synch and Final Synch samples that produce Data Available and Read Fail, with FF#1 clocked 2τ early; DATA passes alongside]

SLIDE 55

When to recover

Signals and when they are sampled:
– Early: FF1, half cycle minus 2τ (13τ)
– Speculative: FF2, half cycle (15τ)
– Fail: FF3, 4, end of cycle (30τ)

Cases (flag values as extracted from the table, followed by the comment):
– ? ? metastable?: Unrecoverable error, probability low.
– No data was available
– 1 1: Stable at the end of the cycle, but the speculative output may have been metastable. Return to original state
– 1 1: Normal data transfer

SLIDE 56

Speculative Synchronisation latency

Recovery means restoring any corrupted registers, and may take some time, BUT
Probability of recovery operation is e^-13, so little time lost on average.
Can reduce average synchronization latency from one cycle to a half cycle
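
The claim that little time is lost on average can be illustrated with an expected-value calculation. In this sketch the recovery cost of 100 cycles is an arbitrary assumption, deliberately large, and the function name is hypothetical; only the probability e^-13 and the half cycle latency come from the slides.

```python
import math


def average_latency_cycles(base_cycles, p_fail, recovery_cycles):
    """Expected latency = normal latency + P(fail) * cost of a recovery."""
    return base_cycles + p_fail * recovery_cycles


P_FAIL = math.exp(-13)   # fraction of events needing more than 13 tau
RECOVERY_CYCLES = 100.0  # assumed cost of restoring corrupted registers

print(f"{P_FAIL * 100:.5f} %")  # about 0.00023 %
print(average_latency_cycles(0.5, P_FAIL, RECOVERY_CYCLES))
```

Even with a recovery cost this large, the expected latency stays within a fraction of a percent of the half cycle.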

SLIDE 57

Comments

Synchronization/arbitration requires special circuit elements
They’re not digital!
If there’s a real choice, and bounded time, you will have failures.
The MTBF can be made longer than the life of the universe
Design gets more difficult with small dimensions
Latency is a problem, but not insuperable.
Synchronizers are not deterministic.

SLIDE 58

Outline

What’s the problem?
Why does it matter?
Synchronizer and arbiter circuits
Noise, and its effects
Latency, and how to overcome it
Metastability measurement 1
– Simple measurements
Arbitration
Metastability measurement 2
– Second order effects (Which may matter)

SLIDE 59

Testing synchronizers

  • Data and Clock are asynchronous
  • Q only changes if Data and clock edges are within 100ps (1 in 1000)

[Figure: Osc 1 (Data, 10.01 MHz) and Osc 2 (Clock, 10MHz) drive flip-flop #1 and the Q output triggers the scope; a 100ps window is marked]

SLIDE 60

Event histogram

Trigger from Q going high
Observe clock, so scale is negative
Log scale of events because

$Events = \frac{T_{Elapsed}}{MTBF} = \frac{T_{Elapsed} T_w f_c f_d}{e^{t/\tau}}$

t = Clock to Q time

[Figure: flip-flop #1 with two histograms against Q to clock delay: number of events, and Log(number of events)]
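
Since the event count falls as e^(-t/τ), a straight-line fit to the logarithm of the histogram gives τ as minus the reciprocal of the slope. The following sketch applies an ordinary least-squares fit to synthetic data with a known τ of 120 ps; the data are made up for the example and are not measurements, and `fit_tau` is a hypothetical helper.

```python
import math


def fit_tau(times, counts):
    """Least-squares line through (t, ln N); returns (tau, intercept)."""
    pts = [(t, math.log(n)) for t, n in zip(times, counts) if n > 0]
    k = len(pts)
    mean_t = sum(t for t, _ in pts) / k
    mean_y = sum(y for _, y in pts) / k
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    return -1.0 / slope, mean_y - slope * mean_t


# Synthetic histogram: 20 bins, 50 ps apart, decaying with tau = 120 ps.
true_tau = 120e-12
times = [i * 50e-12 for i in range(1, 21)]
counts = [1e6 * math.exp(-t / true_tau) for t in times]

tau_est, _ = fit_tau(times, counts)
print(tau_est)
```

With real measurements the fit would normally be restricted to the metastable region of the histogram, since the deterministic region does not follow the exponential.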

SLIDE 61

Experimental measurement set-up

  • Two asynchronous oscillators are used to drive the data and the clock inputs of a D-type edge triggered Flip-Flop.
  • With the slight difference in the frequency of the two oscillators, the clock rising edge may or may not produce a change in the Q output.
  • Oscillators should produce constant probability of Data – Clock change with time (But may not: Cantoni 2007)

SLIDE 62

Altera FPGA measurements

An Altera FLEX10K70 was used here, manufactured in a 0.45µm CMOS process.
The events were collected over a period of 4 hours.
To calculate the value of τ (resolution time constant), the histogram of the trace density can be plotted in semi-log scale.

SLIDE 63

Altera FPGA plot

[Figure: semi-log histogram of events (1 to 1,000,000) against time (0 to 1.6 ns), showing the metastable region and the deterministic (synchronous) region]

The X-axis represents time from a triggering Q output back to the clock edge. Therefore increasing metastability time is shown from right to left.

  • Here τ = 120ps


SLIDE 64

Points to consider

This type of measurement depends on
– Uniform distribution of clock data overlaps
– Often not true because the oscillators affect each other (Cantoni 2007)
Uses an expensive oscilloscope to do the histograms
– You don’t HAVE to use one. Counters and delays will do
The theory only applies to simple FFs
– FFs need to be predesigned, or laid out in a small area

SLIDE 65

Measurements in a bistable element

A D-type edge triggered Flip-Flop constructed using NAND gates on the Altera FPGA.
The master and the slave were placed very close to each other.

[Figure: NAND gate flip-flop with inputs D and CLK and output Q]

SLIDE 66

Components are close, but not in same cell

Routing delays play a significant role in this experiment.
Long metastability times are due to the feedback loop delays.

SLIDE 67

Measurements in a bistable element (cont.)

[Figure: histogram of events (log scale, 1 to 1,000,000) against time (24.5 ns to 31.5 ns)]

From the histogram, a damped oscillation in the deterministic region can be observed.
The value of τ is in the order of 5 nanoseconds, making this particular design unsuitable for any application.
Circuits with feedback loops passing through LUTs can exhibit oscillation.


SLIDE 68

Too much delay

It may not be easy to place elements close to each other
Extra delay can cause the loop to become unstable

[Figure: flip-flop with inputs D and CLK and output Q]

SLIDE 69

Complex response

We put an extra gate in the feedback loop of the master FF here
So the output oscillates, and causes ripples in the histogram
Time between cycles is about 3ns, so you get lots of outputs at more than 20ns
Demo by Nikolaos Minas

SLIDE 70

Outline

What’s the problem?
Why does it matter?
Synchronizer and arbiter circuits
Noise, and its effects
Latency, and how to overcome it
Metastability measurement 1
– Simple measurements
Arbitration
Metastability measurement 2
– Second order effects (Which may matter)

11:00 – 12:00

SLIDE 71

Arbitration

Complex systems may require that some requests overtake others
Here three input channels require access to a single output port
Each request may have a different priority
Priority can be topologically fixed, or determined by a function

[Figure: dynamic priority arbiter with priority inputs P0 to P2, requests r0 to r2 and grants g0 to g2; line 0, line 1 and line 2 controls feed a data switch to the output line under data control]

SLIDE 72

Types of arbiter

Topologically fixed
– priorities determined by structure, e.g. daisy-chain

[Figure: daisy chain from Start, showing the order of polling of requests through stages r1, g1, d1; r2, g2, d2; ... rn, gn, dn]

Static or dynamic priority
– determined by fixed hardware, or priority data supplied

SLIDE 73

Static or dynamic priority

[Figure: requests enter a request lock register and the priority logic issues grants; a control and interface block and the priority busses are also shown]

SLIDE 74

Metastability and priority

Lock the request pattern
– incoming requests cause Lock to go high
– following MUTEX ensures that request wins or loses
Evaluate priorities with a fixed request pattern

[Figure: MUTEX with signals Lock, r, s, l and w]
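
The two steps, locking the request pattern and then evaluating priorities on the fixed pattern, can be sketched in software. This is only a logical illustration: it does not model the MUTEX or its metastability, the function names are hypothetical, and the convention that a larger number means higher priority is an assumption.

```python
def lock_requests(requests):
    """Take a snapshot so priorities are evaluated on a fixed pattern."""
    return tuple(bool(r) for r in requests)


def grant(locked, priorities):
    """Index of the highest-priority locked request, or None if idle.

    Larger numbers mean higher priority (an assumption of this sketch).
    """
    candidates = [i for i, r in enumerate(locked) if r]
    if not candidates:
        return None
    return max(candidates, key=lambda i: priorities[i])


requests = [1, 0, 1]
locked = lock_requests(requests)
requests[1] = 1                   # a late request does not alter the snapshot
print(grant(locked, [1, 5, 3]))   # 2
```

A static priority arbiter corresponds to a fixed `priorities` list, and a dynamic one to a list supplied with each arbitration.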

SLIDE 75

Static priority arbiter

[Figure: static priority arbiter with requests R1 to R3, a Lock Register built from MUTEX and C elements (signals s, s*, q, r, r*), the Lock signal, a Priority Module (r1 to r3, s1 to s3) and grants G1 to G3]

SLIDE 76

More than one request

Priority needed if requests are competing
Shared resource free
– resolution required only if second request arrives before the lock signal due to first request
Shared resource busy
– Further requests may accumulate, and one may be higher priority

SLIDE 77

Two more requests

[Figure: static priority arbiter with requests R1 to R3, a Lock Register built from MUTEX and C elements (signals s, s*, q, r, r*), the Lock signal, a Priority Module (r1 to r3, s1 to s3) and grants G1 to G3]

SLIDE 78

Outline

What’s the problem?
Why does it matter?
Synchronizer and arbiter circuits
Noise, and its effects
Latency, and how to overcome it
Metastability measurement 1
– Simple measurements
Arbitration
Metastability measurement 2
– Second order effects (Which may matter)

SLIDE 79

Measured Histogram

[Figure: measured histogram against time, with a marker at 0.6mV / 0.1ps]

0.6mV is about the level of thermal noise on a node in 0.18µ

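
The order of magnitude of the thermal noise on a node can be estimated from the standard kT/C expression, sqrt(kT/C). The sketch below assumes a node capacitance of 10 fF and a temperature of 300 K; both values are illustrative assumptions, not figures from the tutorial.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_volts(capacitance_f, temperature_k=300.0):
    """RMS thermal noise voltage on a capacitor, sqrt(k * T / C)."""
    return math.sqrt(BOLTZMANN * temperature_k / capacitance_f)


# Assumed node capacitance of 10 fF at 300 K.
noise = ktc_noise_volts(10e-15)
print(f"{noise * 1e3:.2f} mV")
```

For these assumed values the result is a little over 0.6 mV, which is consistent with the level quoted above.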

SLIDE 80

Why isn’t it straight?

[Figure: Vout in volts (1.2 to 2) against time in ps (100 to 500); labels: High Start, 1.75V, Low Start]

  • The starting point makes a difference
  • Early events are more affected than late ones


SLIDE 81

Histogram of events

[Figure: Model Response, events (log scale, 0.00001 to 0.1) against output time in ns; series: Low Start, High Start, Slope]

Probability of an event occurring within 10ps of a particular output time

SLIDE 82

Metastability filters

Affect response
Inverters usually have a threshold close to the metastability level

[Figure: filter circuits with nodes at Vdd/2 and an inverter threshold Vt = Vdd/4]

SLIDE 83

MUTEX with low threshold output

[Figure: MUTEX (R1, R2) followed by a low threshold inverter; events (log scale) against time in ns]

Starts high, needs to go low to give output
Threshold about 100 mV low

SLIDE 84

MUTEX with filter

[Figure: MUTEX (R1, R2) followed by a filter; events (log scale) against time in ns]

Needs more than 1V difference to give output
Slower, but slope more constant

SLIDE 85

What we know

Things we know
– Synchronizers are unreliable; the more there are, the more unreliable the system
– How to measure reliability up to a few hours

Things we know we don’t know
– What reliability is at 3 years
– How to measure it
– Complex circuits give complex results, the simple MTBF formula may not apply

Things we don’t know we don’t know
– What happens on the back edge of the clock

SLIDE 86

74F5074 Histogram

Slope, τ, is about 120ps (in fast region)
Typical delay time (most events) is 4ns
99.9% of clock cycles do not cause useful events
To get 1 event at 7ns requires hours

[Figure: histogram with markers at 4ns and 7ns]


SLIDE 87

Increasing the number of events

Test FF is driven to metastability
Every clock produces a metastable response
Integrator ensures half outputs high, half low

[Figure: 10 MHz source, Variable Delay, Test FF (D, Q), Slave FF (D, Q) and Integrator forming a feedback loop, with a fast 100ps variation]

SLIDE 88

What you get

[Figure: Clock to D (input) histogram, labelled 200ps, and Q to Clock (output) histogram, labelled 3ns]

SLIDE 89

Interpreting results

Input time distribution is not flat
Mapping output times to input times

[Figure: events (0 to 40000) against D to Clock time in ps]
[Figure: proportion of total inputs causing events vs input time; total input events normalized (0 to 1) against input time in ps, with the balance point between 0 and 1]
[Figure: proportion of total output events vs output time; total output events normalized (0 to 1) against output time in ns]

SLIDE 90

100ps variation

∆t is the time from the “balance point” of ~200ps
Similar to original graph BUT ∆t not events
Much quicker to gather data
Reliability results days not minutes
∆t does not depend on fc and fd or measurement time. Events do

$MTBF = \frac{1}{\Delta t f_c f_d}$

[Figure: Delta t (log scale, 1.00E-17 to 1.00E-09) against Q time in ns]

SLIDE 91

Deep metastability

  • Minimum deviation is 7.6ps
  • 100/7.6 = 13 times as many events with small input times (weeks not days)
  • They occur every 100ns, too fast for the scope
  • Only 1 in 1000 captured
  • Most events still produce early output times
  • Filter them out so that the event rate is much slower
  • Results years not weeks

[Figure: the Test FF output goes to the scope input; an Early FF and a Late FF sample it at t1 (early) and t2 (late) to produce the scope trigger]

SLIDE 92

Results of all methods

[Figures: Delta t (log scale) against output (Q to Clock) time in ns for the 74F5074 Schottky bipolar and 74ACT74 CMOS devices; series: 100ps input variation, 7.6ps noise, deep metastability]

Reliability measurements to 10^-20 seconds (MTBF ~ 11 days)
Done in 3 minutes

SLIDE 93

Results

$MTBF = \frac{1}{\Delta t f_c f_d} = \frac{1}{10^{-20} \cdot 10^{7} \cdot 10^{7}} = 11\ \text{days}$

We can measure reliabilities of weeks not hours in a few minutes
To get to 3 years reliability (10^-22 seconds input overlap?) the experiment is run for 5 hours
– picoseconds 10^-12, femtoseconds 10^-15, attoseconds 10^-18, zeptoseconds 10^-21, yoctoseconds 10^-24
More than two slopes on one sample, 350ps, 120ps and 140ps
We can see output events at up to 10 ns
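
The conversion from an input overlap ∆t to an MTBF, MTBF = 1 / (∆t.fc.fd), is simple arithmetic and can be checked directly. The sketch uses the 10 MHz clock and data rates of the test set-up; the helper name `mtbf_from_overlap` is hypothetical.

```python
SECONDS_PER_DAY = 86_400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def mtbf_from_overlap(delta_t, f_c, f_d):
    """MTBF in seconds for an input overlap delta_t, 1 / (delta_t * f_c * f_d)."""
    return 1.0 / (delta_t * f_c * f_d)


days = mtbf_from_overlap(1e-20, 1e7, 1e7) / SECONDS_PER_DAY
years = mtbf_from_overlap(1e-22, 1e7, 1e7) / SECONDS_PER_YEAR
print(f"{days:.1f} days")    # about 11.6 days
print(f"{years:.1f} years")  # about 3.2 years
```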

SLIDE 94

When the clock goes low

Clock goes high, master goes metastable
Back edge of clock causes increased delay
Master output arrives at slave
– Before slave clock high: transparent gate delay td
– As slave clock goes high: metastable, slightly longer delay

[Figure: master latch (D, Q, M) clocked by Clock and slave latch (D, Q, S) clocked by Inverse Clock]
[Figure: Delta t (log scale, 1E-18 to 1E-13) against time in ns (5 to 10); series: No Back Edge, 4.5 Back Edge, 5.5 Back Edge]

SLIDE 95

Effect of clock low on 74F5074

1 – 3 ns additional delay

[Figure: input time (log scale, 1.00E-21 to 1.00E-11) against output time (5 ns to 12 ns); series: 5ns pulse, 4ns pulse, No back edge; curves labelled 6 ns pulse and 4 ns pulse]

SLIDE 96

On-chip metastability measurement

Analog delay replaced by digital delay (VDL)
Analog integrator replaced by counter

[Figure: 100 MHz source, Test FF (D, Q) and Slave FF (D, Q); the variable delay is implemented as a VDL and the integrator as an Up/Down Counter]

SLIDE 97

Variable delay stage

Pair of current starved inverters
Source current i variable in steps
Delay changes can be as low as 0.1ps

[Figure: current starved inverter pair between Vdd and Gnd, with In, Out and source current i]

SLIDE 98

On-chip Implementation

Controlling circuit using standard cell based design
Devices under test using full custom design

[Figure: layout of on-chip measurement circuit]

SLIDE 99

Devices Under Test

Jamb Latch

SLIDE 100

Devices Under Test

Robust Synchronizer

SLIDE 101

Input

Deviation of clock
– 0ps at trigger
– Around 8-9ps elsewhere
Data deviation around 9.2ps

SLIDE 102

Output

τ around 30ps
Does not rely on oscillators being independent

SLIDE 103

Results

τ (metastability time constant) vs Vdd, measurement results in ps

Vdd (V)   Jamb Latch B              Robust Synchronizer
          >10^-14 s   <10^-14 s     >10^-14 s   <10^-14 s
1.8       19.44       35.55         15.27       34.92
1.7       21.75       37.29         16.53       35.76
1.6       25.64       40.93         19.38       38.25
1.5       28.77       52.36         20.29       43.07
1.4       36.22       66.17         23.75       50.36
1.35      45.43       75.35         28.51       58.19

SLIDE 104

Measurement results

Reliability measurements extended from
– 10^-15 s or MTBF = 16 min at 10MHz, to
– 10^-22 s or MTBF = 3 years
We can see variations in τ not previously seen
Measurement is statistical, not affected by noise
Not affected by oscillator linking
Back edge of clock pulse is seen to be an important effect, can be 0 – 15τ
Demo by Jun Zhou

SLIDE 105

To learn more

Bibliography: http://www.iangclark.net/metastability.html
Book: Synchronization and Arbitration in Digital Systems, David J Kinniment, Wiley 2007


SLIDE 107

Synchronizers and arbiters are part analog

Synchronizers depend on small signal parameters
Synchronization time constant τ
– 1/gain bandwidth product = τ = C/gm
– dV2/dt = dV1*gm/C

$V_1 \sim K e^{t/\tau}$

[Figure: cross-coupled stages with nodes at Vdd/2 - dV1 and Vdd/2 + dV2, each a current source gm.dV driving C]

SLIDE 108

FIFO control

  • A write advances the write pointer
  • A read advances the read pointer
  • The write pointer is kept n accesses ahead of the read pointer, otherwise empty is indicated
  • There must be m locations free behind the read pointer, otherwise full is indicated

[Figure: data buffer with Data in, Data out, Write and Read controls, write pointer, read pointer, and Full and Empty flags]
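
The pointer rules above can be sketched as a small ring buffer. This is one possible software reading of the slide, not the hardware design: the class name, the use of free-running access counters instead of wrapped pointers, and the defaults n = 1 and m = 1 are all assumptions made for illustration.

```python
class PointerFifo:
    """Ring buffer with write and read pointers and margin-based flags."""

    def __init__(self, size, n=1, m=1):
        self.buf = [None] * size
        self.size = size
        self.n = n       # write pointer must be at least n ahead, else empty
        self.m = m       # at least m locations must be free, else full
        self.writes = 0  # total writes; write pointer is writes % size
        self.reads = 0   # total reads; read pointer is reads % size

    def occupancy(self):
        return self.writes - self.reads

    def empty(self):
        return self.occupancy() < self.n

    def full(self):
        return self.size - self.occupancy() < self.m

    def write(self, item):
        if self.full():
            raise OverflowError("FIFO full")
        self.buf[self.writes % self.size] = item
        self.writes += 1  # a write advances the write pointer

    def read(self):
        if self.empty():
            raise IndexError("FIFO empty")
        item = self.buf[self.reads % self.size]
        self.reads += 1   # a read advances the read pointer
        return item


fifo = PointerFifo(4)
for value in range(4):
    fifo.write(value)
print(fifo.full(), fifo.read(), fifo.full())
```

Raising n or m above 1 makes the flags change earlier, which gives a margin for the delay of synchronizing them into the other clock domain.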

SLIDE 109

Overlapping two synchronizers

Full cycle needed before fail status is known
BUT
Synchronizers can be overlapped to maintain throughput

[Figure: Not Empty feeds two speculative synchronizers, one on the odd receive clock (Data Available Odd, Fail Odd) and one on the even receive clock (Data Available Even, Fail Even); multiplexers (MPX) switched Odd/Even and Even/Odd select Data Available and Fail]

SLIDE 110

Measuring τ by Simulation

Short FF nodes together with small offset voltage, then open switch and measure time constant
Fairly accurate for long term τ
Not practical on some library devices, FPGAs
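
The procedure can be tried on a linear model of two cross-coupled stages, each obeying τ.dV/dt = -V/A - V_other with voltages measured from the metastable level. For identical stages the difference V1 - V2 then grows with time constant τ.A/(A - 1). The sketch below starts from a small offset, integrates with a simple Euler step, and reads the time constant from the growth of the difference. The gain A = 10, τ = 30 ps, the initial voltages and the function names are assumptions for illustration; a real measurement would use a circuit simulator on the actual flip-flop.

```python
import math


def simulate(A, tau, v1, v2, t_end, dt):
    """Euler integration of tau*dV1/dt = -V1/A - V2 and its mirror image."""
    trace = []
    steps = int(round(t_end / dt))
    for k in range(steps + 1):
        trace.append((k * dt, v1, v2))
        dv1 = (-v1 / A - v2) / tau
        dv2 = (-v2 / A - v1) / tau
        v1, v2 = v1 + dv1 * dt, v2 + dv2 * dt
    return trace


def measure_tau(trace, t_start, t_stop):
    """Time constant from the growth of |V1 - V2| between two sample times."""
    def diff_at(t_target):
        t, v1, v2 = min(trace, key=lambda s: abs(s[0] - t_target))
        return t, abs(v1 - v2)

    (t1, d1), (t2, d2) = diff_at(t_start), diff_at(t_stop)
    return (t2 - t1) / math.log(d2 / d1)


A, TAU = 10.0, 30e-12          # assumed gain and per-stage time constant
expected = TAU * A / (A - 1)   # divergent time constant of this model
trace = simulate(A, TAU, v1=1.5e-3, v2=-0.5e-3,
                 t_end=5 * expected, dt=TAU / 2000)
tau_measured = measure_tau(trace, 2 * expected, 4 * expected)
print(tau_measured, expected)
```

Measuring between two later sample times, rather than from the start, lets the convergent common-mode term die away first, which matches the slide's remark that the method is accurate for the long term τ.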

SLIDE 111

Quasi speed independent

Assumptions
s+ must occur before Lock+
– The physics of the MUTEX are such that if r+ is before Lock+, s+ must be asserted
The three inputs to the Lock bistable are implemented as a single complex gate set.
– A faster non speed independent implementation in which the gate is separate is possible

SLIDE 112

Dynamic priority

[Figure: dynamic priority arbiter with requests R0-7, a Lock Register built from MUTEX and C elements (signals s, s*, q, r, r*), the Lock signal, r0-7 and s0-7, a Priority Module fed with priority data P0<0..3>, P1<0..3> ... P7<0..3> (Valid, Invalid), grants G0-7, and a reset completion detector (res_done, done)]

SLIDE 113

On-chip Implementation

Controlling Counters

SLIDE 114

Results

Input Time vs Output Time