João M. Lourenço, José C. Cunha and Vitor Duarte CITI / Universidade Nova de Lisboa joao.lourenco@fct.unl.pt
Debugging Highly-Parallel Programs
Why do programs have errors?
Problem → Devise a computational solution → Write a computer program → Problem solved!
Classes of errors: sequential, ordering, synchronization, interleaving, performance and Byzantine errors
Violations of precedence or mutual exclusion relations
Ordering failures and deadlocks
Unwanted side effects caused by non-reentrant code and shared data
Yields a correct result, although it takes longer than acceptable
Non fail-stop errors
Multicore system
running
Events of process P_i: e_i^0, e_i^1, …, e_i^f
Event e_i^k produces the local state s^k
[Figure: space-time diagram of three processes, P1 (events e_1^1 to e_1^4), P2 (events e_2^1 to e_2^5) and P3 (events e_3^1 to e_3^3)]
[Figure: space-time diagram of processes P1, P2 and P3 with two cuts, marked F1 and F2, contrasting an inconsistent cut (inconsistent global state) with a consistent cut (consistent global state)]
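The difference between a consistent and an inconsistent cut can be checked mechanically. The sketch below is a minimal illustration, assuming a cut is described by how many events of each process it includes; the message pattern in `MESSAGES` and the function name `is_consistent` are hypothetical, since the messages drawn in the slide's diagram are not recoverable from the text.

```python
# Hypothetical message pattern for three processes (not taken from the slides).
# Each entry is (send event, receive event); an event is (process, index),
# where index counts the events of that process starting at 1.
MESSAGES = [
    (("P1", 1), ("P2", 2)),  # P1's 1st event sends, P2's 2nd event receives
    (("P2", 3), ("P3", 2)),  # P2's 3rd event sends, P3's 2nd event receives
]


def is_consistent(cut, messages=MESSAGES):
    """Return True when the cut contains no receive without its send.

    `cut` maps each process to the number of its events inside the cut,
    so the cut is always a prefix of every local history.
    """
    for (send_proc, send_idx), (recv_proc, recv_idx) in messages:
        received = recv_idx <= cut[recv_proc]
        sent = send_idx <= cut[send_proc]
        if received and not sent:
            return False
    return True


print(is_consistent({"P1": 1, "P2": 2, "P3": 1}))  # True
print(is_consistent({"P1": 0, "P2": 2, "P3": 1}))  # False
```

The second cut is inconsistent because it includes the receive on P2 but not the matching send on P1, which is a global state that no actual execution could have passed through.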
[Figure: space-time diagram of a two-process computation, P1 (events e_1^1 to e_1^6) and P2 (events e_2^1 to e_2^5), with its lattice of global states labelled 00 to 65. The lattice contains the 30 states 00 01 02 03 04 10 11 12 13 14 21 22 23 24 31 32 33 34 35 41 42 43 44 45 53 54 55 63 64 65; the 12 states 05 15 25 20 30 40 50 51 52 60 61 62 are shown apart from it. Two 12-state subsets are highlighted: 00 01 02 03 04 14 24 34 35 45 55 65 and 00 10 11 21 31 41 42 43 53 63 64 65. Figure labels: 7 states, 6 states, 30 states]
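The counts on this slide (7 states, 6 states, 30 states) can be reproduced with a short enumeration. The sketch below assumes that a label such as 42 means "P1 has executed 4 events and P2 has executed 2", and it assumes a message pattern inferred from which labels are missing from the lattice; neither assumption is stated in the source, and the names `EVENTS`, `MESSAGES` and `consistent` are illustrative.

```python
from itertools import product

# Number of events per process, as suggested by the labels 00 .. 65.
EVENTS = {1: 6, 2: 5}

# Assumed messages as (send, receive) pairs, each event being (process, index).
# Inferred from the excluded labels, not given in the slides.
MESSAGES = [
    ((2, 1), (1, 2)),  # sent by P2's 1st event, received by P1's 2nd event
    ((2, 3), (1, 5)),  # sent by P2's 3rd event, received by P1's 5th event
    ((1, 3), (2, 5)),  # sent by P1's 3rd event, received by P2's 5th event
]


def consistent(state):
    """A global state (k1, k2) is consistent if no receive precedes its send."""
    done = {1: state[0], 2: state[1]}
    return all(
        not (recv_idx <= done[recv_proc] and send_idx > done[send_proc])
        for (send_proc, send_idx), (recv_proc, recv_idx) in MESSAGES
    )


# 7 local states for P1 times 6 local states for P2.
all_states = list(product(range(EVENTS[1] + 1), range(EVENTS[2] + 1)))
lattice = [s for s in all_states if consistent(s)]
excluded = sorted(f"{a}{b}" for a, b in all_states if not consistent((a, b)))

print(len(all_states), len(lattice))  # 42 30
print(excluded)
```

Under these assumptions the 12 excluded labels are exactly the ones the slide lists outside the lattice, which is why only 30 of the 7 × 6 = 42 combinations of local states need to be considered when debugging.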
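Parallel computation: Process (P1), Process (P2), …, Process (PN)
Events / states: internal and interaction events
Local histories
Global history: union of the local histories
Run: arbitrary total order of the events
Consistent run: a run that satisfies the causal precedence constraints
Cut: subset of the global history
Consistent cut: a cut that satisfies the causal precedence constraints
Frontier of a consistent cut
Observation: permutation
Consistent observation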
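To make the notion of a consistent run concrete, the following sketch checks whether a given total order of events respects both the local order of each process and the causal precedence introduced by messages. The event encoding, the example messages and the function name `is_consistent_run` are assumptions made for illustration only.

```python
def is_consistent_run(run, messages):
    """Check a total order of events against causal precedence.

    `run` is a list of events (process, index) containing each event once.
    `messages` is a list of (send event, receive event) pairs.
    """
    position = {event: i for i, event in enumerate(run)}

    # Local order: event k of a process must precede event k + 1.
    for proc, idx in run:
        successor = (proc, idx + 1)
        if successor in position and position[successor] < position[(proc, idx)]:
            return False

    # Causal order across processes: every send precedes its receive.
    return all(position[send] < position[recv] for send, recv in messages)


# Hypothetical computation: P1's 1st event sends to P2's 2nd event.
messages = [(("P1", 1), ("P2", 2))]
good = [("P1", 1), ("P2", 1), ("P2", 2), ("P1", 2)]
bad = [("P2", 1), ("P2", 2), ("P1", 1), ("P1", 2)]

print(is_consistent_run(good, messages))  # True
print(is_consistent_run(bad, messages))   # False
```

Both lists are permutations of the same global history; only the first could be reported by an observer who never sees an effect before its cause.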
Program state, program execution, developer perspective:
to obtain reproducible behavior
to analyze alternative paths
to evaluate correctness properties
Interactive debugging: state based debugging (program states)
Trace, replay and debugging: deterministic re-execution (repeatable)
Combined testing, steering and debugging: systematic state exploration (alternative)
Global predicate detection: global program properties (consistency)
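Global predicate detection can be sketched as a search over the consistent global states of a recorded execution. The example below is a minimal illustration under stated assumptions: the local state traces in `STATES`, the message in `MESSAGES` and the helper names `consistent` and `possibly` are all hypothetical, and the exhaustive search stands in for whatever detection algorithm a real tool would use.

```python
from itertools import product

# Hypothetical traces: STATES[p][k] is the local state of p after k events.
STATES = {
    "P1": [{"in_cs": False}, {"in_cs": True}, {"in_cs": False}],
    "P2": [{"in_cs": False}, {"in_cs": False}, {"in_cs": True}, {"in_cs": False}],
}

# Hypothetical message: P1's 2nd event (leaving the critical section) sends,
# P2's 1st event receives, so P2 can only enter afterwards.
MESSAGES = [(("P1", 2), ("P2", 1))]


def consistent(cut, messages):
    """True when the cut includes no receive event without its send event."""
    return all(
        not (recv_idx <= cut[recv_proc] and send_idx > cut[send_proc])
        for (send_proc, send_idx), (recv_proc, recv_idx) in messages
    )


def possibly(predicate, states, messages):
    """Return the consistent global states in which the predicate holds."""
    procs = sorted(states)
    hits = []
    for counts in product(*(range(len(states[p])) for p in procs)):
        cut = dict(zip(procs, counts))
        if consistent(cut, messages):
            snapshot = {p: states[p][cut[p]] for p in procs}
            if predicate(snapshot):
                hits.append(cut)
    return hits


def both_in_cs(snapshot):
    return snapshot["P1"]["in_cs"] and snapshot["P2"]["in_cs"]


print(possibly(both_in_cs, STATES, MESSAGES))  # []
print(possibly(both_in_cs, STATES, []))        # [{'P1': 1, 'P2': 2}]
```

With the synchronizing message the mutual exclusion violation is unreachable; without it, one consistent global state exposes the violation even if no single observed run happened to pass through it.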