affect ALL software Miscompilation Bug int a, c, d, e = 1, f; int - - PowerPoint PPT Presentation

slide-1
SLIDE 1
slide-2
SLIDE 2

affect ALL software

slide-3
SLIDE 3

Miscompilation Bug

$ gcc -O0 test.c ; ./a.out
$ gcc -O2 test.c ; ./a.out
Floating point exception (core dumped)

int a, c, d, e = 1, f;
int fn1 () {
  int h;
  for (; d < 1; d = e) {
    h = (f == 0) ? 0 : 1 % f;
    if (f < 1)
      c = 0;
    else if (h)
      break;
  }
}
int main () {
  fn1 ();
  return 0;
}

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61383

slide-4
SLIDE 4

Crashing Bug

$ clang -O0 test.c
$ clang -O1 test.c
clang: Assertion failed.
clang: error: Aborted (core dumped)

int a;
struct S0 {
  int f0;
  int f1;
  int f2;
};
void fn1 () {
  int b = -1;
  struct S0 f[1];
  if (a) {
    f[0] = f[b];
  }
}
int main () {
  fn1 ();
  return 0;
}

https://llvm.org/bugs/show_bug.cgi?id=18615

slide-5
SLIDE 5
  • Generate valid test programs
      • No undefined behavior
  • Determine the semantics of test programs
      • No reference compiler to compare against
slide-6
SLIDE 6

Equivalence Modulo Inputs (EMI)*

Generates valid, "equivalent" programs from existing programs

*: V. Le, M. Afshari, and Z. Su. Compiler validation via equivalence modulo inputs. PLDI '14

slide-7
SLIDE 7

program P input I

slide-8
SLIDE 8

[Diagram: program P runs on input I and produces output O; its code is partitioned into executed and unexecuted regions]

slide-9
SLIDE 9

[Diagram: EMI generates variants of program P; each variant, run on input I, produces output O]

slide-10
SLIDE 10

[Diagram: the EMI variants of program P, each producing output O on input I]

  • Variants are equivalent to P w.r.t. input I
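To make "equivalent w.r.t. I" concrete, here is a minimal sketch that is not taken from the slides. The function names `p` and `p_variant` are hypothetical, and the program is modeled as a function of its input. For the input I = 5 the `x < 0` branch is never executed, so a variant that deletes it must produce the same output on I, even though the two programs differ on other inputs.

```c
/* P: the original program, modeled as a function of its input. */
int p(int x) {
    int result = x * 2;
    if (x < 0) {
        /* Not executed for the input I = 5. */
        result = -result + 7;
    }
    return result;
}

/* EMI variant of P w.r.t. I = 5: the unexecuted branch has been deleted. */
int p_variant(int x) {
    int result = x * 2;
    return result;
}
```

If a compiler produces binaries for `p` and `p_variant` that disagree on the input 5, at least one of them was miscompiled; disagreement on other inputs, such as -3, says nothing.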
slide-11
SLIDE 11
Naïve EMI Instantiation (*)

  • Randomly removes unexecuted code
  • Limitations
      • Limited number of variants
      • Limited control- and data-flow diversity
      • Random generation

(*) V. Le, M. Afshari, and Z. Su. Compiler validation via equivalence modulo inputs. PLDI ‘14

slide-12
SLIDE 12
  • Better mutation: deletion + injection
      • Generates unlimited and diverse variants
  • Guided generation: MCMC sampling
      • Exposes deep compiler bugs

MCMC: Markov Chain Monte Carlo

slide-13
SLIDE 13

What to Inject?

existing code → stmt-extractor → <context, statement>

Context: conditions to apply a statement

  • Used variables, functions, types, goto labels
  • Other properties (e.g., the insertion location must be inside a loop)
slide-14
SLIDE 14

How to Inject?

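[Diagram: program P with input I and output O; a database entry <σs, s> (a <context, statement> pair) is injected at a program point whose context is σ, provided σ ⊨ σs]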
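The check σ ⊨ σs can be sketched as follows. This is a simplified, hypothetical model and not Athena's actual data structures: an entry has exactly one free variable and an optional "must be in a loop" requirement, and all type and function names (`stmt_entry`, `point_context`, `find_binding`) are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified database entry <sigma_s, s>. */
struct stmt_entry {
    const char *stmt_template;  /* e.g. "if (%s) break;" */
    const char *required_type;  /* type of the single free variable */
    int requires_loop;          /* nonzero: only valid inside a loop */
};

struct variable {
    const char *name;
    const char *type;
};

/* Hypothetical model of the context sigma at an insertion point. */
struct point_context {
    const struct variable *in_scope;
    size_t n_in_scope;
    int inside_loop;
};

/* Returns the name of an in-scope variable that the entry's free variable
 * can be renamed to, or NULL when sigma does not satisfy sigma_s. */
const char *find_binding(const struct point_context *sigma,
                         const struct stmt_entry *entry) {
    if (entry->requires_loop && !sigma->inside_loop)
        return NULL;
    for (size_t i = 0; i < sigma->n_in_scope; i++) {
        if (strcmp(sigma->in_scope[i].type, entry->required_type) == 0)
            return sigma->in_scope[i].name;
    }
    return NULL;
}
```

With an entry such as `if (i) break;` that needs an `int` and a loop, an insertion point inside a loop with an `int h` in scope yields the binding `h`; the same point outside a loop is rejected.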

slide-15
SLIDE 15
output O

input I

? ?

slide-16
SLIDE 16

Goal: generate more diverse variants

  • Optimization problem
slide-17
SLIDE 17

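Program Distance

$$\Delta(P, Q) = \alpha \cdot d(P_{\mathit{Nodes}}, Q_{\mathit{Nodes}}) + \beta \cdot d(P_{\mathit{Edges}}, Q_{\mathit{Edges}}) - \gamma \cdot |P - Q|$$

where $d(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|}$ is the Jaccard distance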
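The distance can be computed along the following lines. This is a sketch under assumptions the slide does not state: node and edge sets are represented as strictly increasing arrays of integer IDs, the term |P − Q| is read as an absolute size difference supplied by the caller, and the function names are hypothetical.

```c
#include <stddef.h>

/* Jaccard distance d(A, B) = 1 - |A n B| / |A u B| for two sets given as
 * strictly increasing arrays of IDs. Two empty sets are treated as
 * identical (distance 0), which is a convention chosen here. */
double jaccard_distance(const int *a, size_t na, const int *b, size_t nb) {
    size_t i = 0, j = 0, common = 0;
    while (i < na && j < nb) {
        if (a[i] == b[j]) { common++; i++; j++; }
        else if (a[i] < b[j]) i++;
        else j++;
    }
    size_t uni = na + nb - common;
    return uni == 0 ? 0.0 : 1.0 - (double)common / (double)uni;
}

/* Delta(P, Q): w_nodes, w_edges and w_size play the roles of alpha, beta
 * and gamma; size_diff stands for |P - Q| (assumed: size difference). */
double program_distance(double w_nodes, double d_nodes,
                        double w_edges, double d_edges,
                        double w_size, double size_diff) {
    return w_nodes * d_nodes + w_edges * d_edges - w_size * size_diff;
}
```

For example, the sets {1, 2, 3} and {2, 3, 4} share two of four distinct elements, so their Jaccard distance is 0.5.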

slide-18
SLIDE 18

Sampling High-value EMI Variants
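The accept/reject step at the heart of MCMC sampling can be sketched with the standard Metropolis-Hastings rule for a symmetric proposal. The slides do not give the target density, so the scores below are hypothetical: any positive value that grows with the desirability of a variant (for instance, one derived from its program distance to the seed) would fit. The function names are also invented for illustration.

```c
/* Metropolis-Hastings acceptance probability with a symmetric proposal:
 * min(1, f(candidate) / f(current)). Both scores must be positive. */
double acceptance_probability(double score_current, double score_candidate) {
    double ratio = score_candidate / score_current;
    return ratio < 1.0 ? ratio : 1.0;
}

/* One accept/reject decision; u is a uniform random draw from [0, 1).
 * Returns 1 to move to the candidate variant, 0 to keep the current one. */
int accept_candidate(double score_current, double score_candidate, double u) {
    return u < acceptance_probability(score_current, score_candidate);
}
```

A better-scoring candidate is always accepted, while a worse one is still accepted occasionally, which lets the search leave local optima instead of climbing greedily.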

slide-24
SLIDE 24

$ gcc -O0 test.c ; ./a.out
$ gcc -O2 test.c ; ./a.out
Floating point exception (core dumped)

int a, c, d, e = 1, f;
int fn1 () {
  int h;
  for (; d < 1; d = e) {
    h = (f == 0) ? 0 : 1 % f;
    if (f < 1)
      c = 0;
    else if (h)
      break;
  }
}
int main () {
  fn1 ();
  return 0;
}

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61383

slide-25
SLIDE 25

Seed program:

int a, c, d, e = 1, f;
int fn1 () {
  int h;
  for (; d < 1; d = e) {
    h = (f == 0) ? 0 : 1 % f;
    if (f < 1)
      c = 0;
    else
      c = 1;
  }
}
int main () {
  fn1 ();
  return 0;
}

==DB Entry==
requires_loop
i: int
if (i) break;

Variant produced by Athena:

int a, c, d, e = 1, f;
int fn1 () {
  int h;
  for (; d < 1; d = e) {
    h = (f == 0) ? 0 : 1 % f;
    if (f < 1)
      c = 0;
    else if (h)
      break;
  }
}
int main () {
  fn1 ();
  return 0;
}

slide-26
SLIDE 26

int a, c, d, e = 1, f;
int fn1 () {
  int h;
  for (; d < 1; d = e) {
    h = (f == 0) ? 0 : 1 % f;
    if (f < 1)
      c = 0;
    else if (h)
      break;
  }
}
int main () {
  fn1 ();
  return 0;
}

PRE: Partial Redundancy Elimination
PRE: loop invariant

slide-27
SLIDE 27

int a, c, d, e = 1, f;
int fn1 () {
  int h;
  int g = 1 % f;
  for (; d < 1; d = e) {
    h = (f == 0) ? 0 : g;
    if (f < 1)
      c = 0;
    else if (h)
      break;
  }
}
int main () {
  fn1 ();
  return 0;
}

LIM: Loop Invariant Motion
LIM: hoist (1 % f)

$ gcc -O0 test.c ; ./a.out
$ gcc -O2 test.c ; ./a.out
Floating point exception (core dumped)
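The hoist is wrong because it moves `1 % f` out from under the `f == 0` guard, so the division executes even when `f` is zero. As a contrast, here is a hand-written sketch (not GCC's actual fix) of a hoist that stays correct by moving the whole guarded expression; the function names and the explicit `iterations` parameter are hypothetical simplifications of the test case.

```c
/* The original loop: the modulo is evaluated only when f != 0. */
int loop_original(int f, int iterations) {
    int h = 0;
    for (int i = 0; i < iterations; i++)
        h = (f == 0) ? 0 : 1 % f;
    return h;
}

/* A legal hoist: the loop-invariant expression is moved out of the loop
 * together with its guard, so 1 % f is still never evaluated for f == 0. */
int loop_safe_hoist(int f, int iterations) {
    int g = (f == 0) ? 0 : 1 % f;
    int h = 0;
    for (int i = 0; i < iterations; i++)
        h = g;
    return h;
}
```

Both functions return the same value for every `f`, including 0, whereas hoisting the bare `1 % f` introduces a division by zero that the source program never performs.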

slide-28
SLIDE 28

$ clang -O0 test.c
$ clang -O1 test.c
clang: Assertion failed.
clang: error: Aborted (core dumped)

int a;
struct S0 {
  int f0;
  int f1;
  int f2;
};
void fn1 () {
  int b = -1;
  struct S0 f[1];
  if (a) {
    f[0] = f[b];
  }
}
int main () {
  fn1 ();
  return 0;
}

https://llvm.org/bugs/show_bug.cgi?id=18615

slide-29
SLIDE 29

Seed program:

int a;
struct S0 {
  int f0;
  int f1;
  int f2;
};
void fn1 () {
  int b = -1;
  struct S0 f[1];
  if (a) {
    f[0].f0 = b;
  }
}
int main () {
  fn1 ();
  return 0;
}

=======DB Entry======
g: struct (int x int x int) [1]
c: int
g[0] = g[c];

Variant produced by Athena:

int a;
struct S0 {
  int f0;
  int f1;
  int f2;
};
void fn1 () {
  int b = -1;
  struct S0 f[1];
  if (a) {
    f[0] = f[b];
  }
}
int main () {
  fn1 ();
  return 0;
}

slide-30
SLIDE 30

int a;
struct S0 {
  int f0;
  int f1;
  int f2;
};
void fn1 () {
  int b = -1;
  struct S0 f[1];
  if (a) {
    f[0] = f[b];
  }
}
int main () {
  fn1 ();
  return 0;
}

Assertion Violation: negative index

https://llvm.org/bugs/show_bug.cgi?id=18615

slide-31
SLIDE 31
  • Two machines running for 19 months
  • Seed programs: Csmith [1]
      • Hard to reduce real-world projects
  • Statement database: seed program
      • Real-world code cannot be inserted into Csmith seeds effectively

[1] X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. PLDI ‘11

slide-32
SLIDE 32

19 months

TOTAL BUGS: Fixed 69, Confirmed 3
COMPILERS: GCC 40, LLVM 32
BUG TYPES: Wrong 27, Crash 32, Perf 5

slide-33
SLIDE 33
  • Developers fixed our bugs (69/72)
  • 17/40 GCC bugs are P1 (highest priority)
  • 3 GCC bugs linked to real-world projects
      • GCC
      • QtWebKit
      • glibc
slide-34
SLIDE 34

Run Athena and Orion in parallel on 15 bugs in 1 week

| Bug ID | Affected Versions | Affected Opt Levels | Seed SLOC | Variant SLOC | Database Rows | Recovered Bugs | Generated Variants |
|---|---|---|---|---|---|---|---|
| gcc-59903 | 4.8, 4.9 | -O3 | 4,694 | 6,238 | 1,723 | 14 | 23,479 |
| gcc-60116 | 4.8, 4.9 | -Os | 11,596 | 11,843 | 3,092 | 367 | 20,082 |
| gcc-60382 | 4.8, 4.9 | -O3 | 6,151 | 21,903 | 1,989 | 19 | 21,267 |
| gcc-61383 | 4.8, 4.9, 4.10 | -O2, -O3 | 3,298 | 3,567 | 1,272 | 106 | 32,981 |
| gcc-61452 | 4.8, 4.9, 4.10, 5.0 | -O1, -Os | 3,308 | 3,474 | 885 | | 49,158 |
| gcc-61917 | 4.9, 4.10, 5.0 | -O3 | 11,820 | 11,226 | 3,066 | 2 | 32,562 |
| gcc-64495 | 4.8, 4.9, 4.10, 5.0 | -O3 | 2,767 | 1,951 | 517 | 4 | 45,896 |
| gcc-64663 | 4.6, 4.7, 4.8, 4.9, 4.10, 5.0 | -O1, -Os, -O2, -O3 | 11,118 | 12,160 | 2,875 | | 26,626 |
| llvm-20494 | 3.2, 3.3, 3.4, 3.5 | -O2, -O3 | 8,080 | 11,009 | 1,683 | 2,660 | 24,588 |
| llvm-20680 | 3.5, 3.6 | -O3 | 6,250 | 7,584 | 1,753 | 22 | 23,438 |
| llvm-21512 | 3.5, 3.6 | -O1, -Os, -O2, -O3 | 8,455 | 5,087 | 3,081 | 988 | 21,882 |
| llvm-22086 | 3.5, 3.6 | -Os, -O2, -O3 | 5,220 | 8,495 | 1,711 | | 29,279 |
| llvm-22338 | 3.5, 3.6, 3.7 | -O2, -O3 | 2,923 | 7,197 | 1,302 | 13 | 19,469 |
| llvm-22382 | 3.2, 3.3, 3.4, 3.5, 3.6, 3.7 | -Os, -O2, -O3 | 4,813 | 2,147 | 1,432 | | 29,805 |
| llvm-22704 | 3.6, 3.7 | -O1, -Os, -O2, -O3 | 3,684 | 23,250 | 981 | 12 | 28,740 |

slide-35
SLIDE 35

Baseline: coverage of 100 seeds (GCC 34.9%, LLVM 23.5%)

[Bar chart: coverage improvements (%) for GCC and LLVM, by Orion and Athena configuration (10, 25, 50, and 100 variants); y-axis from 0 to 1.6]

slide-36
SLIDE 36

seed

Orion’s space Athena’s space

slide-37
SLIDE 37

Questions?

slide-38
SLIDE 38

| | GCC | LLVM | TOTAL |
|---|---|---|---|
| Fixed | 39 | 30 | 69 |
| Not-Yet-Fixed | 1 | 2 | 3 |
| WorksForMe | | 3 | 3 |
| Duplicate | 3 | 4 | 7 |
| Invalid | 1 | | 1 |
| TOTAL | 44 | 39 | 83 |