Specification-less Semantic Bug Detection Zhendong Su ETH Zurich - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Specification-less Semantic Bug Detection

Zhendong Su

ETH Zurich

slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4

What is the key mission of Computer Science?

slide-5
SLIDE 5

To help people turn creative ideas into working systems

slide-6
SLIDE 6

Software research is central to this mission

slide-7
SLIDE 7

A lot of progress to celebrate!

slide-8
SLIDE 8

To help people turn creative ideas into working systems

slide-9
SLIDE 9

P ⊨ φ ?

slide-10
SLIDE 10

P ⊨ φ ?

but … where is φ ?

slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13

A dilemma

Need φ to show P ⊨ φ

slide-14
SLIDE 14

A dilemma

Need φ to show P ⊨ φ … but φ is not available

slide-15
SLIDE 15

A dilemma

Need φ to show P ⊨ φ … but φ is not available

One of the greatest challenges!

slide-16
SLIDE 16

A dilemma

Need φ to show P ⊨ φ … but φ is not available

One of the greatest challenges!

practical & technical

slide-17
SLIDE 17

3 mitigation examples

  • Validating compilers
  • Validating database engines
  • Validating object detection systems

slide-18
SLIDE 18

3 mitigation examples

  • Validating compilers
  • Validating database engines
  • Validating object detection systems

All critical infrastructures & specification-less validation

slide-19
SLIDE 19

Validate Production Compilers

slide-20
SLIDE 20

Compiler complexity

[Bar chart, LoC (million): LLVM 4, GCC 15, Linux 19]

slide-21
SLIDE 21

LLVM bug 14972

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out
Aborted (core dumped)

slide-22
SLIDE 22

Developer comment

“... very, very concerning when I got to the root cause, and very annoying to fix …”

http://llvm.org/bugs/show_bug.cgi?id=14972

slide-23
SLIDE 23

Vision

[Diagram: P ≡ P1, P2, P3, …, Pk, …, Pn]

slide-24
SLIDE 24

Key challenges

  • Generation
      • How to generate different, yet equivalent tests?
  • Validation
      • How to check that tests are indeed equivalent?
  • Both are long-standing hard issues

slide-25
SLIDE 25

Equiv. modulo inputs

  • Relax equiv. w.r.t. a given input i
      • Must: P(i) = Pk(i) on input i
      • Okay: P(j) ≠ Pk(j) on all inputs j ≠ i
  • Exploit close interplay between
      • Dynamic program execution on some input
      • Static compilation for all inputs
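
The relaxed equivalence above can be illustrated with a minimal Python sketch. The names `equivalent_modulo_inputs`, `p`, and `p_k` are hypothetical and exist only for this illustration; the talk applies the idea to C programs and production compilers, not to Python functions.

```python
from typing import Callable, Iterable


def equivalent_modulo_inputs(
    p: Callable[[int], int],
    variant: Callable[[int], int],
    inputs: Iterable[int],
) -> bool:
    """Return True if `variant` produces the same output as `p` on every given input."""
    return all(p(i) == variant(i) for i in inputs)


def p(x: int) -> int:
    if x < 0:
        return -x  # this branch is not executed when x = 5
    return x


def p_k(x: int) -> int:
    if x < 0:
        return 0  # mutated branch: behaves differently only for negative x
    return x


# Must: P(i) = Pk(i) on the chosen input i = 5.
print(equivalent_modulo_inputs(p, p_k, [5]))   # True
# Okay: the variant may differ on other inputs, e.g. j = -3.
print(equivalent_modulo_inputs(p, p_k, [-3]))  # False
```

The variant only has to agree with the seed on the profiled input, which is what makes both generation and validation cheap.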

slide-26
SLIDE 26

Profile

program P

output O

input I

executed unexecuted

slide-27
SLIDE 27

Mutate

[Diagram: variants of P, each run on input I to produce output O]

slide-28
SLIDE 28

EMI

[Diagram: variants of P, each producing output O on input I]

equivalent modulo I

slide-29
SLIDE 29

Find bugs

[Diagram: the variants should all produce O on input I; one variant produces O’ ≠ O on input I]
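
The profile, mutate, and compare steps can be sketched end to end in Python. This is only an illustration under stated assumptions: the seed program `prog`, the helpers `load`, `profile`, and `make_variant`, and the choice to replace every unexecuted statement with `pass` are all hypothetical, and the final comparison stands in for compiling the variant with the compiler under test and running the resulting binary.

```python
import ast
import sys

# Hypothetical seed program; line 5 is not executed for small inputs.
SEED = """
def prog(x):
    total = 0
    if x > 10:
        total = total - 100
    for k in range(x):
        total = total + k
    return total
"""


def load(code_or_source, name="prog"):
    """Execute a module body and return the function called `name`."""
    namespace = {}
    exec(code_or_source, namespace)
    return namespace[name]


def profile(func, arg):
    """Run func(arg) and record the line numbers executed inside func."""
    executed = set()

    def tracer(frame, event, _arg):
        if event == "line" and frame.f_code is func.__code__:
            executed.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        output = func(arg)
    finally:
        sys.settrace(None)
    return output, executed


class PruneUnexecuted(ast.NodeTransformer):
    """Replace every statement that was not executed with `pass`."""

    def __init__(self, executed):
        self.executed = executed

    def visit(self, node):
        is_body_stmt = isinstance(node, ast.stmt) and not isinstance(node, ast.FunctionDef)
        if is_body_stmt and node.lineno not in self.executed:
            return ast.copy_location(ast.Pass(), node)
        return self.generic_visit(node)


def make_variant(source, executed):
    """Build a variant that is equivalent to the seed modulo the profiled input."""
    tree = PruneUnexecuted(executed).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return load(compile(tree, "<variant>", "exec"))


seed = load(SEED)
expected, executed = profile(seed, 3)    # profile on input I = 3
variant = make_variant(SEED, executed)   # unexecuted code pruned
if variant(3) != expected:               # O' != O would signal a bug
    print("mismatch on the profiled input")
```

On inputs other than the profiled one, such as `20`, the seed and the variant are allowed to disagree.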

slide-30
SLIDE 30

Revisit challenges

  • Generation (easy)
      • How to generate different, yet equivalent tests?
  • Validation (easy)
      • How to check that tests are indeed equivalent?

slide-31
SLIDE 31

LLVM bug 14972

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out
Aborted (core dumped)

slide-32
SLIDE 32

Seed file

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out

slide-33
SLIDE 33

Seed file

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out

unexecuted

slide-34
SLIDE 34

Transformed file

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out
Aborted (core dumped)

slide-35
SLIDE 35

Reduced file

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out
Aborted (core dumped)

slide-36
SLIDE 36

LLVM bug autopsy

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out
Aborted (core dumped)

GVN: load struct using 32-bit load

slide-37
SLIDE 37

LLVM bug autopsy

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out
Aborted (core dumped)

GVN: load struct using 32-bit load
SRoA: read past the struct’s end → undefined behavior

slide-38
SLIDE 38

LLVM bug autopsy

$ clang -m32 -O0 test.c ; ./a.out
$ clang -m32 -O1 test.c ; ./a.out
Aborted (core dumped)

GVN: load struct using 32-bit load

remove

SRoA: read past the struct’s end → undefined behavior

slide-39
SLIDE 39

GCC bug 58731

$ gcc -O0 test.c ; ./a.out
$ gcc -O3 test.c ; ./a.out
^C

slide-40
SLIDE 40

GCC bug autopsy

$ gcc -O0 test.c ; ./a.out
$ gcc -O3 test.c ; ./a.out
^C

PRE: loop invariant

slide-41
SLIDE 41

GCC bug autopsy

LIM

$ gcc -O0 test.c ; ./a.out
$ gcc -O3 test.c ; ./a.out
^C

slide-42
SLIDE 42

GCC bug autopsy

integer overflow

$ gcc -O0 test.c ; ./a.out
$ gcc -O3 test.c ; ./a.out
^C

slide-43
SLIDE 43

Seed program

$ gcc -O0 test.c ; ./a.out
$ gcc -O3 test.c ; ./a.out

no longer a loop invariant

slide-44
SLIDE 44

  • Orion (PLDI’14): Prune dead code
  • Athena (OOPSLA’15): Prune & inject dead code
  • Hermes (OOPSLA’16): Mutate live code

slide-45
SLIDE 45

Bug counts

           GCC   LLVM   TOTAL
Reported   841   781    1,622
Fixed      612   419    1,031

  • ISSTA’15: Stress-testing link-time optimization
  • ICSE’16: Analyzing compilers’ diagnostic support
  • PLDI’17: Skeletal program enumeration (SPE)
slide-46
SLIDE 46

LLVM 3.9 & 4.0 Release Notes

“… thanks to Zhendong Su and his team whose fuzz testing prevented many bugs going into the release …”

slide-47
SLIDE 47

GCC’s list of contributors

“Zhendong Su … for reporting numerous bugs”
“Chengnian Sun … for reporting numerous bugs”
“Qirun Zhang … for reporting numerous bugs”

https://gcc.gnu.org/onlinedocs//gcc/Contributors.html

slide-48
SLIDE 48

Validate Database Engines

slide-49
SLIDE 49

Database engines (DBMS)

PostgreSQL

Who has heard about/used these Database Management Systems?

slide-50
SLIDE 50

DBMS widely used

“SQLite is the most used database engine in the world. SQLite is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.”

https://www.sqlite.org

slide-51
SLIDE 51

Relational Data Model

Table animal_pictures:

animal   description          picture
Cat      A cute toast cat     [image]
Dog      Cute dog pic         [image]
Cat      Cat plants (cute!)   [image]

slide-52
SLIDE 52

Relational Data Model

Table animal_pictures:

animal   description          picture
Cat      A cute toast cat     [image]
Dog      Cute dog pic         [image]
Cat      Cat plants (cute!)   [image]

A database schema describes the tables (relations) in the database

slide-53
SLIDE 53

animal_pictures

Structured Query Language (SQL) is a declarative DSL to query and manipulate data

SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'

Relational Data Model
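
As a small illustration of the declarative style, the query can be run against an in-memory SQLite database from Python. The column types and the picture file names below are placeholders invented for this sketch; the slide shows actual images in the picture column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animal_pictures(animal TEXT, description TEXT, picture TEXT)"
)
conn.executemany(
    "INSERT INTO animal_pictures VALUES (?, ?, ?)",
    [
        ("Cat", "A cute toast cat", "toast_cat.png"),     # file names are placeholders
        ("Dog", "Cute dog pic", "dog.png"),
        ("Cat", "Cat plants (cute!)", "cat_plants.png"),
    ],
)

# The query states what to fetch; the DBMS decides how to evaluate it.
rows = conn.execute(
    "SELECT picture, description FROM animal_pictures "
    "WHERE animal = 'Cat' AND description LIKE '%cute%'"
).fetchall()

for picture, description in rows:
    print(picture, "|", description)
```

Both cat rows match, because SQLite's LIKE is case-insensitive for ASCII characters by default; the dog row is excluded by the condition on animal.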

slide-54
SLIDE 54

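DBMS

[Diagram: a client application sends SELECT * FROM <table> WHERE <cond> to the database management system (DBMS), which manages the database. The table contains row1 (<cond>), row2 (<cond>), and row3 (¬<cond>); the returned result set contains row1 and row2. ✓]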

slide-55
SLIDE 55

Goal

Aim: Detect logic bugs in DBMS

slide-56
SLIDE 56

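DBMS

[Diagram: a client application sends SELECT * FROM <table> WHERE <cond> to the database management system (DBMS), which manages the database. The table contains row1 (<cond>), row2 (<cond>), and row3 (¬<cond>); the returned result set contains row1 and row2. ✓]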

slide-57
SLIDE 57

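DBMS

[Diagram: a client application sends SELECT * FROM <table> WHERE <cond> to the database management system (DBMS), which manages the database. The table contains row1 (<cond>), row2 (<cond>), and row3 (¬<cond>); the returned result set contains row1 and row3, although row3 does not satisfy <cond>. ✗]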

slide-58
SLIDE 58

Example SQLite bug

CREATE TABLE t1(c1, c2, c3, c4, PRIMARY KEY (c4, c3));
INSERT INTO t1(c3) VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (NULL), (1), (0);
UPDATE t1 SET c2 = 0;
INSERT INTO t1(c1) VALUES (0), (0), (NULL), (0), (0);
ANALYZE t1;
UPDATE t1 SET c3 = 1;
SELECT DISTINCT * FROM t1 WHERE t1.c3 = 1;

ANALYZE gathers statistics about tables, which are then used for query planning

slide-59
SLIDE 59

Example SQLite bug

CREATE TABLE t1(c1, c2, c3, c4, PRIMARY KEY (c4, c3));
INSERT INTO t1(c3) VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (NULL), (1), (0);
UPDATE t1 SET c2 = 0;
INSERT INTO t1(c1) VALUES (0), (0), (NULL), (0), (0);
ANALYZE t1;
UPDATE t1 SET c3 = 1;
SELECT DISTINCT * FROM t1 WHERE t1.c3 = 1;

A bug in the skip-scan optimization caused this logic bug

c1 c2 c3 c4 NULL 1 NULL NULL 1 NULL NULL NULL 1 NULL

Expected result set

c1 c2 c3 c4 NULL 1 NULL

Actual result set

slide-60
SLIDE 60

DBMS (very) well-tested

  • SQLite (~150,000 LOC) has 662 times as much test code as source code
  • SQLite is extensively fuzzed (e.g., by Google’s OSS-Fuzz project)
  • SQLite’s test cases achieve 100% branch test coverage
  • SQLite performs anomaly testing (out-of-memory, I/O errors, power failures)

https://www.sqlite.org/testing.html

  • Small. Fast. Reliable. Choose any three.
slide-61
SLIDE 61

Pivoted Query Synthesis (PQS)

>100 bugs in widely used DBMS

slide-62
SLIDE 62

PQS idea

Column0     Column1     Column2
…           …           …
Value_i,0   Value_i,1   Value_i,2   ← Pivot Row
…           …           …

slide-63
SLIDE 63

Database generation

Randomly Generate Database

To explore “all possible database states” we randomly create databases

slide-64
SLIDE 64

Pivot row selection

Randomly Generate Database → Select Pivot Row

slide-65
SLIDE 65

Query generation

Randomly Generate Database → Select Pivot Row → Generate Query for the Pivot Row

Pivot row: animal = Cat, description = Cat plants (cute!), picture = [image]

SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'

slide-66
SLIDE 66

Verifying the result

Randomly Generate Database → Select Pivot Row → Generate Query for the Pivot Row → Verify that the Pivot Row is contained

SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'

[Diagram: the DBMS executes the query; the result set is compared with the pivot row]

✓ pivot row ∈ result set

slide-67
SLIDE 67

Verifying the result

Randomly Generate Database → Select Pivot Row → Generate Query for the Pivot Row → Verify that the Pivot Row is contained

SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'

[Diagram: the DBMS executes the query; the result set is compared with the pivot row]

✗ pivot row ∉ result set
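
The containment check itself is small. The sketch below uses Python's sqlite3 module; the helper name `pivot_row_contained`, the two sample rows, and the picture file names are assumptions made for illustration, not part of the original tool.

```python
import sqlite3


def pivot_row_contained(conn, query, pivot_row):
    """Containment oracle: the pivot row must appear in the query's result set."""
    result_set = conn.execute(query).fetchall()
    return tuple(pivot_row) in result_set


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animal_pictures(animal, description, picture)")
conn.execute(
    "INSERT INTO animal_pictures VALUES ('Cat', 'Cat plants (cute!)', 'cat_plants.png')"
)
conn.execute("INSERT INTO animal_pictures VALUES ('Dog', 'Cute dog pic', 'dog.png')")

# Pivot row projected onto the selected columns (picture, description).
pivot = ("cat_plants.png", "Cat plants (cute!)")
query = (
    "SELECT picture, description FROM animal_pictures "
    "WHERE animal = 'Cat' AND description LIKE '%cute%'"
)

if not pivot_row_contained(conn, query, pivot):
    print("possible logic bug: pivot row missing from result set")
```

Because the query was constructed so that its condition holds for the pivot row, a missing pivot row points at the DBMS rather than at the test.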

slide-68
SLIDE 68

Approach

Randomly Generate Database → Select Pivot Row → Generate Query for the Pivot Row → Verify that the Pivot Row is contained

slide-71
SLIDE 71

Approach

Randomly Generate Database → Select Pivot Row → Generate Query for the Pivot Row → Verify that the Pivot Row is contained

How do we generate this query?

slide-72
SLIDE 72

How to generate queries?

Generate an expression that yields TRUE for the pivot row

SELECT picture, description FROM animal_pictures WHERE

slide-73
SLIDE 73

How to generate queries?

Randomly Generate Expression → Evaluate Expression on Pivot Row → Modify expression to yield TRUE → Use in WHERE clause

slide-74
SLIDE 74

Random exp. generation

animal description picture

https://www.sqlite.org/syntax/expr.html

animal_pictures

We first generate a random expression

slide-75
SLIDE 75

Random exp. generation

animal = 'Cat' AND description LIKE '%cute%'

AND
├── =
│   ├── animal
│   └── 'Cat'
└── LIKE
    ├── description
    └── '%cute%'

slide-76
SLIDE 76

Random exp. generation

animal = 'Cat' AND description LIKE '%cute%'

Evaluate the tree based on the pivot row

AND
├── =
│   ├── animal
│   └── 'Cat'
└── LIKE
    ├── description
    └── '%cute%'

slide-77
SLIDE 77

Random exp. evaluation

AND
├── =
│   ├── animal
│   └── 'Cat' → 'Cat'
└── LIKE
    ├── description
    └── '%cute%' → '%cute%'

Constant nodes return their assigned literal values

slide-78
SLIDE 78

Random exp. evaluation

AND
├── =
│   ├── animal → 'Cat'
│   └── 'Cat' → 'Cat'
└── LIKE
    ├── description → 'Cat plants (cute!)'
    └── '%cute%' → '%cute%'

Column references return the values from the pivot row

slide-79
SLIDE 79

Random exp. evaluation

AND
├── = → TRUE
│   ├── animal → 'Cat'
│   └── 'Cat' → 'Cat'
└── LIKE → TRUE
    ├── description → 'Cat plants (cute!)'
    └── '%cute%' → '%cute%'

Compound nodes compute their result based on their children

slide-80
SLIDE 80

Random exp. evaluation

AND → TRUE
├── = → TRUE
│   ├── animal → 'Cat'
│   └── 'Cat' → 'Cat'
└── LIKE → TRUE
    ├── description → 'Cat plants (cute!)'
    └── '%cute%' → '%cute%'
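
The bottom-up evaluation can be sketched as a tiny interpreter over the expression tree. The node classes `Const`, `Column`, and `BinOp`, and the helper `sql_like`, are hypothetical names for this illustration. The sketch covers only the three operators on the slide, uses Python's None for SQL NULL, and assumes SQLite's default case-insensitive LIKE.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Const:
    value: Any


@dataclass
class Column:
    name: str


@dataclass
class BinOp:
    op: str  # one of "=", "LIKE", "AND"
    left: Any
    right: Any


def sql_like(text: str, pattern: str) -> bool:
    """Approximate LIKE: % matches any run of characters, _ matches one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch) for ch in pattern
    )
    return re.fullmatch(regex, text, re.IGNORECASE | re.DOTALL) is not None


def evaluate(node: Any, row: Dict[str, Any]) -> Any:
    if isinstance(node, Const):    # constant nodes return their literal value
        return node.value
    if isinstance(node, Column):   # column references read from the pivot row
        return row[node.name]
    left, right = evaluate(node.left, row), evaluate(node.right, row)
    if node.op == "AND":           # three-valued logic: FALSE dominates NULL
        if left is False or right is False:
            return False
        if left is None or right is None:
            return None
        return True
    if left is None or right is None:
        return None                # comparisons with NULL yield NULL
    if node.op == "=":
        return left == right
    if node.op == "LIKE":
        return sql_like(left, right)
    raise ValueError(f"unsupported operator: {node.op}")


pivot_row = {"animal": "Cat", "description": "Cat plants (cute!)", "picture": "[image]"}
tree = BinOp(
    "AND",
    BinOp("=", Column("animal"), Const("Cat")),
    BinOp("LIKE", Column("description"), Const("%cute%")),
)
print(evaluate(tree, pivot_row))  # True
```

The interpreter runs in the testing tool, independently of the DBMS, which is why its verdict can serve as the expected result.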

slide-81
SLIDE 81

SELECT picture, description FROM animal_pictures WHERE animal = 'Cat' AND description LIKE '%cute%'

Query synthesis

slide-82
SLIDE 82

Random exp. evaluation

AND → TRUE
├── = → TRUE
│   ├── animal → 'Cat'
│   └── 'Cat' → 'Cat'
└── LIKE → TRUE
    ├── description → 'Cat plants (cute!)'
    └── '%cute%' → '%cute%'

What about when the expression does not evaluate to TRUE?

slide-83
SLIDE 83

Random exp. evaluation

animal = 'Dog'

= → FALSE
├── animal → 'Cat'
└── 'Dog' → 'Dog'

What about when the expression does not evaluate to TRUE?

slide-84
SLIDE 84

Random exp. rectification

switch (result) {
  case TRUE:  result = randexpr; break;
  case FALSE: result = NOT randexpr; break;
  case NULL:  result = randexpr ISNULL; break;
}

slide-85
SLIDE 85

Random exp. rectification

switch (result) {
  case TRUE:  result = randexpr; break;
  case FALSE: result = NOT randexpr; break;
  case NULL:  result = randexpr ISNULL; break;
}

animal = 'Dog' → FALSE

slide-86
SLIDE 86

Random exp. rectification

switch (result) {
  case TRUE:  result = randexpr; break;
  case FALSE: result = NOT randexpr; break;
  case NULL:  result = randexpr ISNULL; break;
}

NOT(animal = 'Dog') → TRUE

slide-87
SLIDE 87

How to generate queries?

SELECT picture, description FROM animal_pictures WHERE NOT(animal = 'Dog')
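
The rectification step and the resulting query can be sketched in Python as follows. The function name `rectify`, the single-row table, and the picture file name are assumptions for this illustration; the three cases follow the switch statement shown on the earlier slides.

```python
import sqlite3


def rectify(expr_sql, value):
    """Wrap an expression so that it yields TRUE for the pivot row.

    `value` is the result of evaluating the expression on the pivot row:
    True, False, or None (standing in for SQL NULL).
    """
    if value is True:
        return expr_sql
    if value is False:
        return f"NOT ({expr_sql})"
    if value is None:
        return f"({expr_sql}) ISNULL"
    raise ValueError("expected True, False, or None")


# animal = 'Dog' evaluates to FALSE on the pivot row, so it gets negated.
condition = rectify("animal = 'Dog'", False)
query = f"SELECT picture, description FROM animal_pictures WHERE {condition}"
print(query)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animal_pictures(animal, description, picture)")
conn.execute(
    "INSERT INTO animal_pictures VALUES ('Cat', 'Cat plants (cute!)', 'cat_plants.png')"
)
rows = conn.execute(query).fetchall()
pivot = ("cat_plants.png", "Cat plants (cute!)")
print(pivot in rows)  # the containment oracle expects this to hold
```

Rectification means any randomly generated expression can be used, since one of the three wrappers always turns it into a condition the pivot row satisfies.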

slide-88
SLIDE 88

Tested DBMS

PostgreSQL

We tested these (and other DBMS) in a period of 3-4 months

slide-89
SLIDE 89

DBMS

slide-90
SLIDE 90

DBMS

slide-91
SLIDE 91

DBMS

slide-92
SLIDE 92

Bugs overview

DBMS         Fixed   Verified
SQLite       65      0
MySQL        15      10
PostgreSQL   5       4
Sum          85      14

99 real bugs: code fixes or verified as bugs

slide-93
SLIDE 93

Bugs overview

The SQLite developers quickly responded to all our bug reports → we focused on this DBMS

DBMS         Fixed   Verified
SQLite       65      0
MySQL        15      10
PostgreSQL   5       4
Sum          85      14

slide-94
SLIDE 94

Bugs overview

DBMS         Fixed   Verified
SQLite       65      0
MySQL        15      10
PostgreSQL   5       4
Sum          85      14

All MySQL bug reports were verified quickly

slide-95
SLIDE 95

Bugs overview

DBMS         Fixed   Verified
SQLite       65      0
MySQL        15      10
PostgreSQL   5       4
Sum          85      14

MySQL’s trunk is unavailable, and it has a long release cycle

slide-96
SLIDE 96

Bugs overview

DBMS         Fixed   Verified
SQLite       65      0
MySQL        15      10
PostgreSQL   5       4
Sum          85      14

We found the fewest bugs in PostgreSQL and not all could be easily addressed

slide-97
SLIDE 97

Oracles

DBMS         Containment   Error   SEGFAULT
SQLite       46            17      2
MySQL        14            10      1
PostgreSQL   1             7       1
Sum          61            34      4

slide-98
SLIDE 98

Oracles

DBMS         Containment   Error   SEGFAULT
SQLite       46            17      2
MySQL        14            10      1
PostgreSQL   1             7       1
Sum          61            34      4

Our Containment oracle allowed us to detect most errors

slide-99
SLIDE 99

Result: bug in SQLite3

CREATE TABLE t0(c1 TEXT PRIMARY KEY) WITHOUT ROWID;
CREATE INDEX i0 ON t0(c1 COLLATE NOCASE);
INSERT INTO t0(c1) VALUES ('A');
INSERT INTO t0(c1) VALUES ('a');

An index is an auxiliary data structure that should not affect the query’s result

slide-100
SLIDE 100

Result: bug in SQLite3

CREATE TABLE t0(c1 TEXT PRIMARY KEY) WITHOUT ROWID;
CREATE INDEX i0 ON t0(c1 COLLATE NOCASE);
INSERT INTO t0(c1) VALUES ('A');
INSERT INTO t0(c1) VALUES ('a');

Table t0 contents (c1): 'A', 'a'

slide-101
SLIDE 101

Result: bug in SQLite3

CREATE TABLE t0(c1 TEXT PRIMARY KEY) WITHOUT ROWID;
CREATE INDEX i0 ON t0(c1 COLLATE NOCASE);
INSERT INTO t0(c1) VALUES ('A');
INSERT INTO t0(c1) VALUES ('a');

Table t0 contents (c1): 'A', 'a'

SELECT * FROM t0;

Result set (c1): 'A' ✗

SQLite failed to fetch 'a'!

slide-102
SLIDE 102

Result: bug in PostgreSQL

CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT);
CREATE TABLE t1(c0 INT) INHERITS (t0);

[Diagram: tables t0 and t1, each with columns c0 and c1]

slide-103
SLIDE 103

Result: bug in PostgreSQL

CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT);
CREATE TABLE t1(c0 INT) INHERITS (t0);
INSERT INTO t0(c0, c1) VALUES(0, 0);

[Diagram: contents of tables t0 and t1, each with columns c0 and c1]

slide-104
SLIDE 104

Result: bug in PostgreSQL

CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT);
CREATE TABLE t1(c0 INT) INHERITS (t0);
INSERT INTO t0(c0, c1) VALUES(0, 0);
INSERT INTO t1(c0, c1) VALUES(0, 1);

[Diagram: contents of tables t0 and t1, each with columns c0 and c1]

The inheritance relationship causes the row to be inserted both in t0 and t1

slide-105
SLIDE 105

Result: bug in PostgreSQL

CREATE TABLE t0(c0 INT PRIMARY KEY, c1 INT);
CREATE TABLE t1(c0 INT) INHERITS (t0);
INSERT INTO t0(c0, c1) VALUES(0, 0);
INSERT INTO t1(c0, c1) VALUES(0, 1);

[Diagram: contents of tables t0 and t1, each with columns c0 and c1]

SELECT c0, c1 FROM t0 GROUP BY c0, c1;

[Diagram: result set with columns c0 and c1]

✗ PostgreSQL failed to fetch the row 0 | 1

slide-106
SLIDE 106

Result: bug in MySQL

CREATE TABLE t0(c0 TINYINT);
INSERT INTO t0(c0) VALUES(NULL);

Table t0 contents (c0): NULL

slide-107
SLIDE 107

Result: bug in MySQL

CREATE TABLE t0(c0 TINYINT);
INSERT INTO t0(c0) VALUES(NULL);

Table t0 contents (c0): NULL

SELECT * FROM t0 WHERE NOT(t0.c0 <=> 2035382037);

Result set (c0): no rows ✗

The MySQL-specific equality operator <=> malfunctioned for large numbers

FALSE Real Bugs Containment Oracle
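
To see why the pivot row should have been fetched, the documented semantics of MySQL's NULL-safe equality operator can be modeled in a few lines of Python. The function name `null_safe_equal` is hypothetical, and None stands in for SQL NULL.

```python
from typing import Optional


def null_safe_equal(a: Optional[int], b: Optional[int]) -> bool:
    """Model of MySQL's <=>: like =, but it never yields NULL.

    NULL <=> NULL is TRUE, and NULL <=> x is FALSE for any non-NULL x.
    """
    if a is None or b is None:
        return a is None and b is None
    return a == b


pivot_c0 = None  # the only row in t0 holds NULL
condition = not null_safe_equal(pivot_c0, 2035382037)
print(condition)  # True: the WHERE clause holds, so the row should be fetched
```

Under these semantics NULL <=> 2035382037 is FALSE and its negation is TRUE, so a result set without the pivot row violates the containment oracle.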

slide-108
SLIDE 108

Oracles

DBMS         Containment   Error   SEGFAULT
SQLite       46            17      2
MySQL        14            10      1
PostgreSQL   1             7       1
Sum          61            34      4

We also found many bugs that trigger DB errors

slide-109
SLIDE 109

SQLite3 bug

CREATE TABLE t1 (c0, c1 REAL PRIMARY KEY);
INSERT INTO t1(c0, c1) VALUES (TRUE, 9223372036854775807), (TRUE, 0);
UPDATE t1 SET c0 = NULL;
UPDATE OR REPLACE t1 SET c1 = 1;
SELECT DISTINCT * FROM t1 WHERE (t1.c0 IS NULL);

The INSERT and UPDATE statements corrupted the database

Database disk image is malformed

Real Bugs Error Oracle

slide-110
SLIDE 110

Oracles

DBMS         Containment   Error   SEGFAULT
SQLite       46            17      2
MySQL        14            10      1
PostgreSQL   1             7       1
Sum          61            34      4

We found only a low number of crash bugs, likely because DBMS are fuzzed extensively

slide-111
SLIDE 111

Average #statements

Half of all bugs can be reproduced with only 4 SQL statements

Real Bugs

slide-112
SLIDE 112

Bug importance

SQLite developers assigned severity levels

Severity Level   #
Critical         14
Severe           8
Important        14

slide-113
SLIDE 113

Validate Object Detectors

slide-114
SLIDE 114

DL-based object detectors

  • Auto-Driving Systems
  • Surveillance Cameras
  • Medical Image Processing

slide-115
SLIDE 115

“autonomous driving system failed to recognize a white truck against a bright sky”

believed to be due to its failure to recognize the pedestrian in dark clothing

slide-116
SLIDE 116

Typical image analysis tasks

  • Image Classification (entire image): elephant
  • Object Detection (multiple objects): elephant; elephant
  • Semantic Segmentation (no objects; just pixels): grass; elephant; tree
  • Instance Segmentation (multiple objects): elephant; elephant

Object detection in automated driving systems

Object Detection (localization + classification)

slide-117
SLIDE 117

Typical image analysis tasks

  • Image Classification (entire image): elephant
  • Object Detection (multiple objects): elephant; elephant
  • Semantic Segmentation (no objects; just pixels): grass; elephant; tree
  • Instance Segmentation (multiple objects): elephant; elephant

Object detection in automated driving systems

Object Detection (localization + classification)

Focused on by existing DNN testing work

slide-118
SLIDE 118

Typical image analysis tasks

  • Image Classification (entire image): elephant
  • Object Detection (multiple objects): elephant; elephant
  • Semantic Segmentation (no objects; just pixels): grass; elephant; tree
  • Instance Segmentation (multiple objects): elephant; elephant

Object detection in automated driving systems

Object Detection (localization + classification)

6.5 USD for this figure

Focused on by existing DNN testing work

slide-119
SLIDE 119

Basic idea

slide-120
SLIDE 120

Basic idea

slide-121
SLIDE 121

Examples

slide-122
SLIDE 122

Workflow

[Diagram: instance segmentation extracts objects from the image dataset into an object pool, grouped by class into clusters (objects of “Bird”: Cluster 1 … Cluster m; objects of “Person”: Cluster 1 … Cluster n). Object refinement & selection decides what to insert (e.g., Bird m, Person n). One background image is picked from the image dataset, image refinement & selection decides where to insert, and object insertion pastes the selected object into the background image.]
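
One way to turn object insertion into a specification-less check is sketched below. It rests on an assumption that the slides do not spell out: after an object is pasted into a background image, every object detected before the insertion should still be detected. The toy image representation, `flawed_detector`, and all other names are hypothetical and exist only for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2)
Obj = Tuple[str, Box]             # (label, bounding box)
Image = List[Obj]                 # toy "image": just the objects it contains


def overlaps(a: Box, b: Box) -> bool:
    """Return True if the two boxes share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def flawed_detector(image: Image) -> List[Obj]:
    """Toy detector with a deliberate flaw: it misses any object that another object overlaps."""
    return [
        obj
        for obj in image
        if not any(other is not obj and overlaps(obj[1], other[1]) for other in image)
    ]


def insert_object(background: Image, obj: Obj) -> Image:
    """Paste an object from the object pool into the background image."""
    return background + [obj]


def consistent(detector: Callable[[Image], List[Obj]], background: Image, obj: Obj) -> bool:
    """Assumed oracle: detections on the background must survive the insertion."""
    before = detector(background)
    after = detector(insert_object(background, obj))
    return all(detection in after for detection in before)


background = [("person", (0, 0, 10, 20))]
print(consistent(flawed_detector, background, ("bird", (30, 0, 35, 5))))  # True
print(consistent(flawed_detector, background, ("bird", (5, 5, 12, 9))))   # False
```

No labeled ground truth is needed: the detector's own output on the background image serves as the reference for the synthesized image.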

slide-123
SLIDE 123

Evaluation subjects

  • Extract objects from 1000 images from the COCO dataset
  • Take 500 images as the “background” to generate test cases

slide-124
SLIDE 124

Result summary

slide-125
SLIDE 125

Other domains

  • SMT solvers (Z3 & CVC4)
  • SMC tools (CPAchecker, CBMC & Seahorn)
  • Android mobile apps (going beyond crashes)
  • Machine translation systems
  • Smart contracts
  • …

slide-126
SLIDE 126
slide-127
SLIDE 127

Finding a/the right balance between human & machine collaboration

slide-128
SLIDE 128

Thank you!

Finding a/the right balance between human & machine collaboration