DCCA 97: Systematic Formal Verification for Fault-Tolerant Time-Triggered Algorithms

John Rushby, Computer Science Laboratory, SRI International, Menlo Park, CA, USA


slide-1
SLIDE 1 DCCA 97
slide-2
SLIDE 2 Systematic Formal Verification for Fault-Tolerant Time-Triggered Algorithms
John Rushby
Computer Science Laboratory
SRI International
Menlo Park CA USA
Formal Verification of Time-Triggered Algorithms 1 of 24
slide-3
SLIDE 3 Overview
  • Many fault-tolerant algorithms are relatively easy to understand and to verify in an abstract, untimed formulation
  • But verifications of implementations, with all their timing parameters, are quite complex
  • So split the problem into two parts
      • Verify abstract algorithm for an untimed synchronous system model
          • Must be done for each algorithm
          • Relatively easy, and can itself be split into two parts
      • Verify time-triggered implementation of the untimed model
          • Can be done once-and-for-all
          • Is the main topic of this paper
  • Provides simple path from verified design to implementation
slide-4
SLIDE 4 Synchronous Systems
  • Known upper bounds on
      • Time required for nonfaulty processors to perform operations
      • Message delays in the absence of faults
  • Assumptions are valid for embedded real-time control systems
  • The classical problems of fault-tolerant distributed systems can be solved under these assumptions
      • Consensus (Byzantine Agreement)
      • Group Membership
      • Etc.
    Whereas they cannot be solved in asynchronous systems
  • Focus here is exclusively on synchronous systems
slide-5
SLIDE 5 Formal Synchronous System Model
  • Algorithms execute in a series of rounds, numbered 0, 1, ...
  • Each round has two phases
    Communication Phase: each processor sends messages to (some or all) other processors
      • Messages sent, and where to, depend on current state
      • $\mathit{msg}_p(s, q)$ is the message sent by p to q when p's state is s
    Computation Phase: each processor updates its state
      • New state depends on previous state and on messages received during communication phase
      • $\mathit{trans}_p(s, i)$ is p's new state, when its current state is s and the set of messages received is i
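The round structure on the slide above can be sketched as a small executor. This is a minimal illustration, not the paper's formalization: the names `run_round` and `run` are hypothetical, the per-processor `msg` and `trans` functions are passed in as dictionaries, `None` is assumed to mean "no message sent", and the received messages are represented as a dictionary keyed by sender rather than as a set.

```python
def run_round(states, msg, trans):
    """Execute one round of the untimed synchronous model.

    states: dict processor -> current state
    msg:    dict processor p -> function (s, q) giving p's message to q
    trans:  dict processor p -> function (s, inbox) giving p's new state
    """
    procs = list(states)

    # Communication phase: every message depends only on the sender's
    # state at the start of the round (None is assumed to mean "no message").
    inbox = {q: {} for q in procs}
    for p in procs:
        for q in procs:
            m = msg[p](states[p], q)
            if m is not None:
                inbox[q][p] = m

    # Computation phase: each processor updates its state from its
    # previous state and the messages it received.
    return {p: trans[p](states[p], inbox[p]) for p in procs}


def run(states, msg, trans, rounds):
    """Run the given number of rounds in lockstep and return the final states."""
    for _ in range(rounds):
        states = run_round(states, msg, trans)
    return states
```

For instance, if every processor sends its state to everyone and takes the maximum of what it holds and receives, all processors hold the same maximum after one round. Faults in the sense of this model would be introduced by swapping in different `msg` or `trans` functions for the faulty processors.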
slide-6
SLIDE 6 Synchronous System Model: Operation
  • Processors operate in lockstep
      • All perform the communication phase of the current round
      • Then the computation phase
      • Then move on to the next round, and so on
  • Computation and message transmission happen instantaneously and atomically
  • Processors are perfectly synchronized and perform their actions simultaneously
  • No sense of real time (hence untimed system model)
slide-7
SLIDE 7 Example: Oral Messages Algorithm for Consensus, OM(1)
Transmitter processor has a value to be communicated reliably to three or more receivers in the presence of one arbitrary fault
Round 0:
  Communication Phase: The transmitter sends its value to the receivers; receivers send no messages
  Computation Phase: Each receiver stores the value received from the transmitter in its state
Round 1:
  Communication Phase: Each receiver sends value stored in its state to all other receivers; transmitter sends nothing
  Computation Phase: Each receiver decides on the majority value among those received from the other receivers and that (stored in its state) received from the transmitter
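The two rounds of OM(1) described above can be simulated directly. The sketch below is illustrative only: `om1`, `majority`, and the `send` hook (which returns the value actually delivered, so that a test can inject an arbitrary fault) are hypothetical names, and returning `None` when no strict majority exists is an assumption made here, since the slide does not say what is decided in that case.

```python
from collections import Counter


def majority(values, default=None):
    """Return the strict-majority value of a list, or default if none exists."""
    if not values:
        return default
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else default


def om1(value, receivers, send):
    """Simulate OM(1) for a transmitter named "T".

    send(rnd, v, src, dst) returns the value delivered to dst when src
    sends v in round rnd; a faulty sender is modeled by a send hook
    that returns something other than v.
    """
    # Round 0: the transmitter sends its value; each receiver stores it.
    stored = {q: send(0, value, "T", q) for q in receivers}

    # Round 1: each receiver relays its stored value to the other
    # receivers, then votes on the relayed values plus its own stored one.
    decisions = {}
    for p in receivers:
        relayed = [send(1, stored[q], q, p) for q in receivers if q != p]
        decisions[p] = majority(relayed + [stored[p]])
    return decisions
```

With three receivers, a transmitter that sends 1 to one receiver and 0 to the other two still leaves all receivers deciding on the same value, and a single receiver that relays a wrong value cannot change the decision of the other two.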
slide-8
SLIDE 8 Implementing Algorithms for Synchronous Systems
Have to deal with the reality that events are not instantaneous, atomic, and simultaneous
  • Communications and computations take time
  • Timeouts needed to detect failed communications
  • Processors are not perfectly synchronized
  • And run at different rates
Two approaches
Event triggered: processors react to incoming messages; set timeouts on outgoing messages
Time triggered: processors perform actions according to a common schedule, driven by their own internal clocks
  • Preferred for critical applications: SAFEbus, TTP, Shinkansen
slide-9
SLIDE 9 Time-Triggered System Model

[Figure: timeline of round r from sched(r) to sched(r+1), showing communication and computation phases; labels: D(r), P(r), dur(r)]
slide-10
SLIDE 10 Issues in Verifying the Time-Triggered Implementation
  • Processor clocks are not perfectly synchronized
      • One processor may send message before or after another one expects it; may not even be on the same round
      • Therefore require a bound on synchronization skew
      • Can be ensured by clock synchronization algorithms
  • Processor clocks do not run at the same rate
      • Durations of the phases may differ on different processors
      • Therefore require that good processors' clocks run at rates within some bound of each other
  • Unpredictable delays in message transmission
      • Message may arrive after communications phase has ended
      • Therefore require upper bound on nonfaulty message delays
  • Need to arrange pacing and timeouts so that it all works
slide-11
SLIDE 11 Clocks
  • Each processor has a clock, that reads clocktime
      • Clocktimes denoted by upper-case letters (T, etc.)
  • There is an abstract, universal, time called realtime
      • Realtimes denoted by lower-case letters (t, etc.)
  • $C_p(t)$ is the clocktime on p's clock at realtime t
slide-12
SLIDE 12 Clock Assumptions
Monotonicity: Nonfaulty clocks are monotonic increasing functions:
    $t_1 < t_2 \Rightarrow C_p(t_1) < C_p(t_2)$
Clock Drift Rate: Nonfaulty clocks drift from realtime at a rate bounded by a small positive quantity $\rho$ (typically $\rho < 10^{-6}$):
    $(1 - \rho)(t_1 - t_2) \le C_p(t_1) - C_p(t_2) \le (1 + \rho)(t_1 - t_2)$
Clock Synchronization: The clocks of nonfaulty processors are synchronized within some small clocktime bound $\Sigma$:
    $|C_p(t) - C_q(t)| \le \Sigma$
  • Achieving these requires care in implementation, since some clock synchronization algorithms violate monotonicity. However, monotonicity can always be achieved, with no loss of precision
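To make the three clock assumptions concrete, the sketch below checks them for idealized linear clocks of the form C(t) = offset + rate * t. The linear clock model, the class `LinearClock`, and the function names are assumptions made for illustration; the symbols `rho` and `sigma` stand for the drift-rate bound and the synchronization bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearClock:
    """Idealized clock: clocktime = offset + rate * realtime (an assumption)."""
    offset: float
    rate: float

    def read(self, t: float) -> float:
        return self.offset + self.rate * t


def monotonic(clock: LinearClock) -> bool:
    # A linear clock is strictly increasing exactly when its rate is positive.
    return clock.rate > 0


def drift_ok(clock: LinearClock, rho: float) -> bool:
    # For a linear clock the drift-rate assumption reduces to |rate - 1| <= rho.
    return abs(clock.rate - 1.0) <= rho


def skew_ok(clocks, sigma: float, horizon: float) -> bool:
    # The difference of two linear clocks is linear in t, so over the
    # interval [0, horizon] it is largest at one of the two endpoints.
    return all(
        abs(p.read(t) - q.read(t)) <= sigma
        for t in (0.0, horizon)
        for p in clocks
        for q in clocks
    )
```

Under this model two clocks that drift in opposite directions eventually exceed any fixed skew bound, which is one way to see why a clock synchronization algorithm is needed to maintain the third assumption.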
slide-13
SLIDE 13 Time-Triggered System Model
Each processor
  • Starts round r at clocktime sched(r) by its local clock
  • Sends its messages D(r) clocktime units into the round
  • Starts computation phase P(r) clocktime units into the round
      • So duration of r'th communication phase is P(r)
  • Finishes the round after dur(r) clocktime units
      • $\mathit{dur}(r) = \mathit{sched}(r + 1) - \mathit{sched}(r)$
      • So duration of r'th computation phase is $\mathit{dur}(r) - P(r)$
Additional Assumption
Maximum Delay: messages are received within $\delta$ realtime units
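The schedule parameters above determine, on each processor's local clock, when a round starts, when messages are sent, and when the computation phase begins. A minimal sketch follows, assuming the schedule is given by a starting clocktime `sched0` and by functions `dur`, `D`, and `P` of the round number; the function name `round_timeline` and the dictionary keys are hypothetical.

```python
def round_timeline(sched0, dur, D, P, r):
    """Local clocktimes of the key events of round r.

    sched0 is sched(0); dur, D and P map a round number to clocktime units.
    """
    # sched(r) follows from dur(k) = sched(k + 1) - sched(k).
    start = sched0 + sum(dur(k) for k in range(r))
    return {
        "start": start,             # sched(r): the round begins
        "send": start + D(r),       # messages are sent
        "compute": start + P(r),    # the computation phase begins
        "end": start + dur(r),      # sched(r + 1): the round ends
    }


# Example with the same parameters in every round (an assumption).
timeline = round_timeline(0, lambda r: 10, lambda r: 2, lambda r: 6, 3)
communication_phase = timeline["compute"] - timeline["start"]   # P(r)
computation_phase = timeline["end"] - timeline["compute"]       # dur(r) - P(r)
```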
slide-14
SLIDE 14 Constraints
1. $\mathit{dur}(r) > P(r) > D(r) > \ldots$
  • The communication phase is of positive duration
  • The computation phase starts after the messages are sent and is of positive duration
2. $D(r) \ge \Sigma$
  • The delay before messages are sent is greater than the clock skew (so messages do not arrive while the receiving processor is still in the previous round)
3. $P(r) > D(r) + \Sigma + (1 + \rho)\delta$
  • The communication phase must last long enough that all messages have time to reach their destination processor while it is still in its communication phase
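A schedule can be checked against these constraints mechanically. The sketch below assumes the symbols lost from the slide are the skew bound `sigma`, the drift rate `rho`, and the maximum message delay `delta`, that constraint 2 reads D(r) >= sigma, and that constraint 3 reads P(r) > D(r) + sigma + (1 + rho) * delta. For constraint 1 only the ordering dur(r) > P(r) > D(r) is checked, because its final term is not recoverable here. The function name and result keys are hypothetical.

```python
def check_constraints(dur, P, D, sigma, rho, delta):
    """Check one round's parameters against the three constraints.

    dur, P, D are clocktime durations for the round; sigma is the clock
    skew bound, rho the drift-rate bound, delta the maximum message delay.
    """
    return {
        # Constraint 1 (recoverable part): phases are ordered and non-empty.
        "phases_ordered": dur > P > D,
        # Constraint 2 (assumed reading): messages are not sent before
        # every nonfaulty receiver has entered the round.
        "send_after_skew": D >= sigma,
        # Constraint 3 (assumed reading): messages arrive before any
        # receiver leaves its communication phase.
        "comm_long_enough": P > D + sigma + (1 + rho) * delta,
    }
```

For example, with dur = 10, P = 6, D = 2, sigma = 1, rho = 1e-6, and delta = 2 all three checks pass, while raising delta to 3 makes the communication phase too short.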
slide-15
SLIDE 15 Fault Model
  • Faults are modeled as changes in the $\mathit{msg}_p$ and $\mathit{trans}_p$ functions
  • Will prove that untimed model and time-triggered implementation have same behavior, given same $\mathit{msg}_p$ and $\mathit{trans}_p$ functions, for any such functions
  • Thus, if an algorithm is proved fault tolerant in the untimed model with respect to a fault model that can be expressed as perturbations to the $\mathit{msg}_p$ and $\mathit{trans}_p$ functions, then implementation inherits those fault-tolerance properties
  • However, implementation admits new faults
      • Loss of clock synchronization
      • Shared buses (babbling idiot fault mode)
    Must take care to minimize these and to ensure that those not masked are transformed into simplest of the modeled faults
slide-16
SLIDE 16 Correspondence between Rounds
  • Want to ensure that untimed synchronous model and its time-triggered implementation produce same behavior
      • i.e., prove that state of the system at the start of each round is the same in both model and implementation
  • But when does a round start in the implementation?
  • Define the global start for round r to be the realtime gs(r) when the processor with the slowest clock begins round r
  • Then gs(r) satisfies the constraints:
      $\forall q : C_q(\mathit{gs}(r)) \ge \mathit{sched}(r)$, and $\exists p : C_p(\mathit{gs}(r)) = \mathit{sched}(r)$
    (intuitively, p is the processor with the slowest clock)
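The definition of the global start can be illustrated for linear clocks. In the sketch below each clock is an assumed pair (offset, rate) with C(t) = offset + rate * t, and exact rational arithmetic is used so that the equality in the second constraint can be tested without rounding error. The names `global_start` and `read` are hypothetical.

```python
from fractions import Fraction


def read(clock, t):
    """Clocktime shown at realtime t by a linear clock (offset, rate)."""
    offset, rate = clock
    return offset + rate * t


def global_start(clocks, sched_r):
    """Return (gs, slowest): the realtime at which the last processor's
    clock reaches sched_r, and the name of that processor.

    clocks maps a processor name to (offset, rate) with rate > 0.
    """
    # Realtime at which each processor's own clock reads sched_r.
    starts = {
        p: (Fraction(sched_r) - offset) / rate
        for p, (offset, rate) in clocks.items()
    }
    slowest = max(starts, key=starts.get)
    return starts[slowest], slowest
```

At the returned realtime every clock reads at least sched(r), and the clock of the slowest processor reads exactly sched(r), matching the two constraints on the slide.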
slide-17
SLIDE 17 Correctness
Theorem: Given the same initial states and same $\mathit{msg}_p$ and $\mathit{trans}_p$ functions, the state of each processor in the untimed synchronous system at the start of the r'th round is the same as its state at time gs(r) in the time-triggered implementation
Proof: By induction; see paper for details
Formal Verification: Has been formally specified and mechanically verified using SRI's verification system, PVS
  • Formal verification took about a day
  • Allowed easy generalization from fixed offsets D and P to round-specific D(r) and P(r)
  • See long version of paper, available on the Web at http://www.csl.sri.com/dcca97.html
  • PVS specification and proof files available there also
slide-18
SLIDE 18 Synchronous Algorithms as Functional Programs
  • Theorem establishes correctness of time-triggered implementations for synchronous algorithms
  • But formal verification of a synchronous algorithm can still be quite difficult
      • Rounds and phases have an operational character that is awkward to represent in formal logic
      • Functional programs are much easier
  • So establish a systematic transformation between synchronous systems and functional programs
  • Describe by example: OM(1)
slide-19
SLIDE 19 Specification of OM(1) as a Functional Program
First step is to model sending of messages
  • Function send(r, v, p, q) represents sending of a message with value v from processor p to processor q in round r
  • Value of the function is the message received by q
  • If p and q are nonfaulty, this value is v:
      $\mathit{nonfaulty}(p) \land \mathit{nonfaulty}(q) \Rightarrow \mathit{send}(r, v, p, q) = v$
  • Otherwise it depends on the fault modes considered
  • Here it is left entirely unconstrained (Byzantine fault model)
slide-20
SLIDE 20 Specification of OM(1) as a Functional Program (ctd. 1)
T is the transmitter, v its value, and q an arbitrary receiver
Round 0, communication phase: T sends v to each q:
    $\mathit{send}(0, v, T, q)$
Round 0, computation phase: do nothing (instead of storing value received, q sends it to itself in next phase)
Round 1, communication phase: Each q sends the value received in the first round to each receiver p (including itself):
    $\mathit{send}(1, \mathit{send}(0, v, T, q), q, p)$
slide-21
SLIDE 21 Specification of OM(1) as a Functional Program (ctd. 2)
Round 1, computation phase: p gathers all the messages just received and votes them
  • "Gathers" represented by $\lambda$-abstraction:
      $\lambda q : \mathit{send}(1, \mathit{send}(0, v, T, q), q, p)$
    (i.e., a function that, when applied to q, returns the value that p received from q)
  • maj(caucus, votes) takes a function votes from processors to values, and returns the majority value if one exists, among the processors in caucus; otherwise some functionally determined value
  • Then p's decision is given by
      $\mathit{maj}(\mathit{rcvrs}, \lambda q : \mathit{send}(1, \mathit{send}(0, v, T, q), q, p))$
    where rcvrs is the set of all receiver processors.
slide-22
SLIDE 22 Specification of OM(1) as a Functional Program (ctd. 3)
Represented as the (higher-order) function OM1:
    $\mathit{OM1}(T, v)(p) = \mathit{maj}(\mathit{rcvrs}, \lambda q : \mathit{send}(1, \mathit{send}(0, v, T, q), q, p))$
OM1(T, v)(p) is the decision reached by each receiver p when the (possibly faulty) transmitter T sends the value v
Properties required
Agreement: nonfaulty receivers agree, even if faulty transmitter sends different values
    $\mathit{nonfaulty}(p) \land \mathit{nonfaulty}(q) \Rightarrow \mathit{OM1}(T, v)(p) = \mathit{OM1}(T, v)(q)$
Validity: when the transmitter is nonfaulty, all nonfaulty receivers get the correct value
    $\mathit{nonfaulty}(T) \land \mathit{nonfaulty}(p) \Rightarrow \mathit{OM1}(T, v)(p) = v$
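The functional formulation of OM(1) translates almost literally into a language with first-class functions. The sketch below is an illustration in Python rather than the PVS specification: `make_send` and its `lie` parameter are hypothetical devices for choosing what faulty processors deliver, the receiver set is passed to `OM1` explicitly, and `maj` returns a caller-supplied default when no strict majority exists, which is one possible "functionally determined value".

```python
from collections import Counter


def maj(caucus, votes, default=None):
    """Majority of votes(q) over q in caucus (non-empty), or default if none."""
    value, count = Counter(votes(q) for q in caucus).most_common(1)[0]
    return value if 2 * count > len(caucus) else default


def make_send(faulty, lie):
    """Build a send function for a given set of faulty processors.

    If sender and receiver are both nonfaulty the value arrives intact;
    otherwise the result is whatever lie(r, v, p, q) returns, standing in
    for the unconstrained (Byzantine) case.
    """
    def send(r, v, p, q):
        if p not in faulty and q not in faulty:
            return v
        return lie(r, v, p, q)
    return send


def OM1(T, v, rcvrs, send):
    """Return the function mapping each receiver p to its decision."""
    return lambda p: maj(rcvrs, lambda q: send(1, send(0, v, T, q), q, p))
```

Checking a few fault scenarios this way is only testing, not proof: the Agreement and Validity properties quantify over every possible behavior of the faulty processor, which is what the theorem prover establishes.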
slide-23
SLIDE 23 Verification of OM(1) as a Functional Program
  • The great advantage of this representation is that it is expressed in regular (higher-order) logic and highly automated theorem proving can be used to verify algorithm properties
  • For example: OM(1)
    Agreement: PVS can prove the n = 4 instance automatically, requires just eight commands to prove the general case
    Validity: PVS can prove the general case automatically
  • It would be much harder to do mechanized formal verification for original representation of OM(1) as a synchronous algorithm
  • And verification of an event-driven formulation of OM(1) by Lamport and Merz required significant effort
slide-24
SLIDE 24 Summary

[Figure: diagram relating Synchronous System, Functional Program, Required Properties, and Time-Triggered Implementation; edge labels: systematic transformation, formal verification, one-time verification]
slide-25
SLIDE 25 Conclusions and Future Work
  • This approach reduces and systematizes the effort required to verify (some) time-triggered algorithms
  • Simplicity of the formulation and proof of the theorem suggests that time-triggered systems are the natural realization of the synchronous system model
  • Future work:
      • Formalize and verify the transformation between synchronous systems and functional programs
      • Apply to more difficult algorithms
          • Currently working on a group membership algorithm similar to that in TTP