Specification-less Semantic Bug Detection
Zhendong Su
ETH Zurich
Specification-less Semantic Bug Detection Zhendong Su ETH Zurich - - PowerPoint PPT Presentation
Specification-less Semantic Bug Detection Zhendong Su ETH Zurich What is the key mission of Computer Science? To help people turn creative ideas into working systems Software research is central to this mission A lot of progress to celebrate
Zhendong Su
ETH Zurich
practical & technical
qValidating compilers qValidating database engines qValidating object detection systems
qValidating compilers qValidating database engines qValidating object detection systems
All critical infrastructures & specification-less validation
4 15 19 2 4 6 8 10 12 14 16 18 20
LLVM GCC Linux
LoC (million)
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out Aborted (core dumped)
http://llvm.org/bugs/show_bug.cgi?id=14972
qGeneration
u How to generate different, yet equivalent tests?
qValidation
u How to check that tests are indeed equivalent?
qBoth are long-standing hard issues
qRelax equiv. wrt a given input i
uMust: P(i) = Pk(i) on input i uOkay: P(j) ≠ Pk(j) on all input j ≠ i
qExploit close interplay between
uDynamic program execution on some input uStatic compilation for all input
program P
input I
executed unexecuted
…..
O I
O I O I O I
…..
O I
O I O I O I
equivalent modulo I
…..
O I
O’ ¹ O
I
qGeneration (easy)
u How to generate different, yet equivalent tests?
qValidation (easy)
u How to check that tests are indeed equivalent?
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out Aborted (core dumped)
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out
unexecuted
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out Aborted (core dumped)
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out Aborted (core dumped)
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out Aborted (core dumped)
GVN: load struct using 32-bit load
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out Aborted (core dumped)
GVN: load struct using 32-bit load SRoA: read past the struct’s end
è
undefined behavior
$ clang –m32 –O0 test.c ; ./a.out $ clang –m32 –O1 test.c ; ./a.out Aborted (core dumped)
GVN: load struct using 32-bit load
remove
SRoA: read past the struct’s end
è
undefined behavior
$ gcc –O0 test.c ; ./a.out $ gcc –O3 test.c ; ./a.out ^C
$ gcc –O0 test.c ; ./a.out $ gcc –O3 test.c ; ./a.out ^C PRE: loop invariant
LIM
$ gcc –O0 test.c ; ./a.out $ gcc –O3 test.c ; ./a.out ^C
integer overflow
$ gcc –O0 test.c ; ./a.out $ gcc –O3 test.c ; ./a.out ^C
$ gcc –O0 test.c ; ./a.out $ gcc –O3 test.c ; ./a.out
no longer a loop invariant
Athena (OOPSLA’15)
Prune & inject dead code
Hermes (OOPSLA’16)
Mutate live code
Orion (PLDI’14)
Prune dead code
GCC LLVM TOTAL Reported 841 781 1,622 Fixed 612 419 1,031
“… thanks to Zhendong Su and his team whose fuzz testing prevented many bugs going into the release …”
“Zhendong Su … for reporting numerous bugs” “Chengnian Sun … for reporting numerous bugs” “Qirun Zhang … for reporting numerous bugs”
https://gcc.gnu.org/onlinedocs//gcc/Contributors.html
PostgreSQL
Who has heard about/used these Database Management Systems?
“SQLite is the most used database engine in the world. SQLite is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.”
https://www.sqlite.org
animal description picture Cat A cute toast cat Dog Cute dog pic Cat Cat plants (cute!)
animal_pictures
Relational Data Model
animal description picture Cat A cute toast cat Dog Cute dog pic Cat Cat plants (cute!)
animal_pictures
A database schema describes the tables (relations) in the database
Relational Data Model
animal_pictures
Structured Query Language (SQL) is a declarative DSL to query and manipulate data
SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'
Relational Data Model
Database Database Management System (DBMS)
SELECT * FROM <table> WHERE <cond>
Client Application
row1 <cond> row2 <cond> row3
¬<cond>
row1 <cond> row2 <cond>
Aim: Detect logic bugs in DBMS
Database Database Management System (DBMS)
SELECT * FROM <table> WHERE <cond>
Client Application
row1 <cond> row2 <cond> row3
¬<cond>
row1 <cond> row2 <cond>
Database Database Management System (DBMS) Client Application
row1 <cond> row2 <cond> row3
¬<cond>
SELECT * FROM <table> WHERE <cond>
row1 <cond> row3 ¬<cond>
CREATE TABLE t1(c1, c2, c3, c4, PRIMARY KEY (c4, c3)); INSERT INTO t1(c3) VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (NULL), (1), (0); UPDATE t1 SET c2 = 0; INSERT INTO t1(c1) VALUES (0), (0), (NULL), (0), (0); ANALYZE t1; UPDATE t1 SET c3 = 1; SELECT DISTINCT * FROM t1 WHERE t1.c3 = 1;
ANALYZE gathers statistics about tables, which are then used for query planning
CREATE TABLE t1(c1, c2, c3, c4, PRIMARY KEY (c4, c3)); INSERT INTO t1(c3) VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (NULL), (1), (0); UPDATE t1 SET c2 = 0; INSERT INTO t1(c1) VALUES (0), (0), (NULL), (0), (0); ANALYZE t1; UPDATE t1 SET c3 = 1; SELECT DISTINCT * FROM t1 WHERE t1.c3 = 1;
A bug in the skip-scan
this logic bug
c1 c2 c3 c4 NULL 1 NULL NULL 1 NULL NULL NULL 1 NULL
Expected result set
c1 c2 c3 c4 NULL 1 NULL
Actual result set
SQLite (~150,000 LOC) has 662 times as much test code as source code SQLite is extensively fuzzed (e.g., by Google’s OS-Fuzz Project) SQLite’s test cases achieve 100% branch test coverage SQLite’s performs anomaly testing (out-
https://www.sqlite.org/testing.html
Pivoted Query Synthesis (PQS)
>100 bugs in widely used DBMS
Column0 Column1 Column2 … … …
Valuei,0 Valuei,1 Valuei,2
… … …
Pivot Row
Randomly Generate Database
To explore “all possible database states” we randomly create databases
Randomly Generate Database Select Pivot Row
Randomly Generate Database Select Pivot Row Generate Query for the Pivot Row animal description picture Cat Cat plants (cute!)
SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'
Randomly Generate Database Select Pivot Row Generate Query for the Pivot Row Verify that the Pivot Row is contained DBMS
SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'
result set pivot row
pivot row ∈ result set
Randomly Generate Database Select Pivot Row Generate Query for the Pivot Row Verify that the Pivot Row is contained DBMS
SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'
result set pivot row
pivot row ∉ result set
Randomly Generate Database Select Pivot Row Generate Query for the Pivot Row Verify that the Pivot Row is contained
Randomly Generate Database Select Pivot Row Generate Query for the Pivot Row Verify that the Pivot Row is contained
Randomly Generate Database Select Pivot Row Generate Query for the Pivot Row Verify that the Pivot Row is contained
Randomly Generate Database Select Pivot Row Generate Query for the Pivot Row Verify that the Pivot Row is contained
How do we generate this query?
Generate an expression that yields TRUE for the pivot row SELECT picture, description FROM animal_pictures WHERE
Randomly Generate Expression Evaluate Expression
Modify expression to yield TRUE Use in WHERE clause
animal description picture
https://www.sqlite.org/syntax/expr.html
animal_pictures
We first generate a random expression
animal = 'Cat' AND description LIKE '%cute%'
AND = LIKE animal 'Cat' descri ption '%cute%'
animal = 'Cat' AND description LIKE '%cute%'
Evaluate the tree based
AND = LIKE animal 'Cat' descri ption '%cute%'
AND = LIKE animal 'Cat' descri ption 'Cat' '%cute%'
Constant nodes return their assigned literal values
'%cute%'
AND = LIKE animal 'Cat' descri ption 'Cat plants (cute!)' 'Cat' 'Cat' '%cute%'
Column references return the values from the pivot row
'%cute%'
AND = LIKE animal 'Cat' descri ption 'Cat plants (cute!)' 'Cat' TRUE TRUE 'Cat' '%cute%'
Compound nodes compute their result based on their children
'%cute%'
AND = LIKE animal 'Cat' descri ption 'Cat plants (cute!)' 'Cat' TRUE TRUE TRUE 'Cat' '%cute%' '%cute%'
SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'
AND = LIKE animal 'Cat' descri ption 'Cat plants (cute!)' 'Cat' TRUE TRUE TRUE 'Cat' '%cute%'
What about when the expression does not evaluate to TRUE?
'%cute%'
= animal 'Dog' 'Cat' FALSE 'Dog'
What about when the expression does not evaluate to TRUE?
animal = 'Dog'
switch (result) { case TRUE: result = randexpr; case FALSE: result = NOT randexpr; case NULL: result = randexpr ISNULL; }
switch (result) { case TRUE: result = randexpr; case FALSE: result = NOT randexpr; case NULL: result = randexpr ISNULL; } animal = 'Dog'
FALSE
switch (result) { case TRUE: result = randexpr; case FALSE: result = NOT randexpr; case NULL: result = randexpr ISNULL; } NOT(animal = 'Dog')
TRUE
SELECT picture, description FROM animal_pictures WHERE NOT(animal = 'Dog')
PostgreSQL
We tested these (and other DBMS) in a period of 3-4 months
DBMS Fixed Verified SQLite 65 MySQL 15 10 PostgreSQL 5 4 Sum 85 14
99 real bugs: code fixes or verified as bugs
Real Bugs
The SQLite developers quickly responded to all
DBMS Fixed Verified SQLite 65 MySQL 15 10 PostgreSQL 5 4 Sum 85 14 Real Bugs
DBMS Fixed Verified SQLite 65 MySQL 15 10 PostgreSQL 5 4 Sum 85 14
All MySQL bug reports were verified quickly
Real Bugs
DBMS Fixed Verified SQLite 65 MySQL 15 10 PostgreSQL 5 4 Sum 85 14
MySQL’s trunk is unavailable, and it has a long release cycle
Real Bugs
DBMS Fixed Verified SQLite 65 MySQL 15 10 PostgreSQL 5 4 Sum 85 14
We found the fewest bugs in PostgreSQL and not all could be easily addressed
Real Bugs
DBMS Containment Error SEGFAULT SQLite 46 17 2 MySQL 14 10 1 PostgreSQL 1 7 1 Sum 61 34 4
Real Bugs
DBMS Containment Error SEGFAULT SQLite 46 17 2 MySQL 14 10 1 PostgreSQL 1 7 1 Sum 61 34 4
Containment Oracle
Our Containment oracle allowed us to detect most errors
Real Bugs
CREATE TABLE t0(c1 TEXT PRIMARY KEY) WITHOUT ROWID; CREATE INDEX i0 ON t0(c1 COLLATE NOCASE); INSERT INTO t0(c1) VALUES ('A'); INSERT INTO t0(c1) VALUES ('a');
An index is an auxiliary data structure that should not affect the query’s result
Real Bugs Containment Oracle
CREATE TABLE t0(c1 TEXT PRIMARY KEY) WITHOUT ROWID; CREATE INDEX i0 ON t0(c1 COLLATE NOCASE); INSERT INTO t0(c1) VALUES ('A'); INSERT INTO t0(c1) VALUES ('a');
c1
'A' 'a'
Real Bugs Containment Oracle
CREATE TABLE t0(c1 TEXT PRIMARY KEY) WITHOUT ROWID; CREATE INDEX i0 ON t0(c1 COLLATE NOCASE); INSERT INTO t0(c1) VALUES ('A'); INSERT INTO t0(c1) VALUES ('a');
c1
'A' 'a'
SELECT * FROM t0;
c1
'A' û
SQLite failed to fetch 'a'!
Real Bugs Containment Oracle
CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT); CREATE TABLE t1(c0 INT) INHERITS (t0);
c0 c1 c0 c1
t0 t1
Real Bugs Containment Oracle
CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT); CREATE TABLE t1(c0 INT) INHERITS (t0); INSERT INTO t0(c0, c1) VALUES(0, 0);
c0 c1 c0 c1
t0 t1
Real Bugs Containment Oracle
CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT); CREATE TABLE t1(c0 INT) INHERITS (t0); INSERT INTO t0(c0, c1) VALUES(0, 0); INSERT INTO t1(c0, c1) VALUES(0, 1);
c0 c1 1 c0 c1 1
t0 t1
The inheritance relationship causes the row to be inserted both in t0 and t1
Real Bugs Containment Oracle
CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT); CREATE TABLE t1(c0 INT) INHERITS (t0); INSERT INTO t0(c0, c1) VALUES(0, 0); INSERT INTO t1(c0, c1) VALUES(0, 1);
c0 c1 1 c0 c1 1
t0 t1 SELECT c0, c1 FROM t0 GROUP BY c0, c1;
c0 c1
PostgreSQL failed to fetch the row 0 | 1
Real Bugs Containment Oracle
CREATE TABLE t0(c0 TINYINT); INSERT INTO t0(c0) VALUES(NULL);
c0 NULL
t0
Real Bugs Containment Oracle
CREATE TABLE t0(c0 TINYINT); INSERT INTO t0(c0) VALUES(NULL);
c0 NULL
t0
SELECT * FROM t0 WHERE NOT(t0.c0 <=> 2035382037);
c0
The MySQL-specific equality operator <=> malfunctioned for large numbers
FALSE Real Bugs Containment Oracle
DBMS Containment Error SEGFAULT SQLite 46 17 2 MySQL 14 10 1 PostgreSQL 1 7 1 Sum 61 34 4
We also found many bugs that trigger DB errors
Real Bugs Error Oracle
CREATE TABLE t1 (c0, c1 REAL PRIMARY KEY); INSERT INTO t1(c0, c1) VALUES (TRUE, 9223372036854775807), (TRUE, 0); UPDATE t1 SET c0 = NULL; UPDATE OR REPLACE t1 SET c1 = 1; SELECT DISTINCT * FROM t1 WHERE (t1.c0 IS NULL);
The INSERT and UPDATE statements corrupted the database
Database disk image is malformed
Real Bugs Error Oracle
DBMS Containment Error SEGFAULT SQLite 46 17 2 MySQL 14 10 1 PostgreSQL 1 7 1 Sum 61 34 4
We found only a low number of crash bugs, likely because DBMS are fuzzed extensively
Real Bugs SEGFAULTs
Half of all bugs can be reproduced with only 4 SQL statements
Real Bugs
SQLite developers assigned severity levels
Severity Level # Critical 14 Severe 8 Important 14
Auto-Driving Systems Surveillance Camera Surveillance Camera Medical Image Processing
“autonomous driving system failed to recognize a white truck against a bright sky” believed to be due to its failure to recognize the pedestrian in dark clothing
Object Detection
elephant; elephant
Instance Segmentation
elephant; elephant
Semantics Segmentation
grass; elephant; tree Multiple Objects No objects; just pixels
eleplant eleplantEntire Image elephant
Image Classification
Object detection in automated driving systems
Object Detection (localization + classification)
Object Detection
elephant; elephant
Instance Segmentation
elephant; elephant
Semantics Segmentation
grass; elephant; tree Multiple Objects No objects; just pixels
eleplant eleplantEntire Image elephant
Image Classification
Object detection in automated driving systems
Object Detection (localization + classification)
Focused on by existing DNN testing work
Object Detection
elephant; elephant
Instance Segmentation
elephant; elephant
Semantics Segmentation
grass; elephant; tree Multiple Objects No objects; just pixels
eleplant eleplantEntire Image elephant
Image Classification
Object detection in automated driving systems
Object Detection (localization + classification)
6.5 USD for this figure Focused on by existing DNN testing work
Instance Segmentation what to insert where to insert
Imagek Objects of “Bird” Cluster1 Cluster1 Cluster1 Clusterm Objects of “Person” Cluster1 Cluster1 Cluster1 Clustern Object Pool
…
+
Selection
Object Image Refinement & Selection Object Insertion
Image Dataset pick one background image Paste
Birdm Personn Object Refinement & Selection Object Pool Objects of “bird” Birdm Objects of “Person” Personn
…
qSMT solvers (Z3 & CVC4) qSMC tools (CPAchecker, CBMC & Seahorn) qAndroid mobile apps (going beyond crashes) qMachine translation systems qSmart contracts q…
Finding a/the right balance between human & machine collaboration
Thank you!
Finding a/the right balance between human & machine collaboration