Detecting the 1%: Growing the Science of Vulnerability Discovery
Laurie Williams laurie_williams@ncsu.edu
Real people – Real Projects – Real Impact
Larry the Latent David the Detected Edwin the Exploitable Adam the Attack-prone
Funded by: In cooperation:
Stage Complications Where How Exploited Future
Neutral                      (8721)   78.9%
Faulty but not vulnerable    (1967)   17.8%
Faulty and vulnerable         (294)    2.7%
Vulnerable but not faulty      (69)    0.6%
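The percentages in this breakdown can be rechecked from the raw counts. The sketch below is only an arithmetic check: the counts and category labels are taken from the slide, the function and variable names are made up for illustration, and the slide does not say what unit is being counted.

```python
# Counts as shown on the slide; the unit (files, components, ...) is not stated there.
counts = {
    "Neutral": 8721,
    "Faulty but not vulnerable": 1967,
    "Faulty and vulnerable": 294,
    "Vulnerable but not faulty": 69,
}


def shares(category_counts):
    """Return each category's share of the total, in percent, rounded to 0.1."""
    total = sum(category_counts.values())
    return {name: round(100 * n / total, 1) for name, n in category_counts.items()}


total = sum(counts.values())
pct = shares(counts)

# Everything tagged "vulnerable", whether or not it is also "faulty".
vulnerable = counts["Faulty and vulnerable"] + counts["Vulnerable but not faulty"]
vulnerable_pct = round(100 * vulnerable / total, 1)

for name, value in pct.items():
    print(f"{name:28s}{value:5.1f}%")
print(f"{'All vulnerable':28s}{vulnerable_pct:5.1f}%")
```

The two vulnerable categories together are a small slice of the total, which is the class imbalance the talk keeps returning to.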
Larry the Latent David the Detected
(all statistically significant)
Metric               Case study 1        Case study 2   Case study 3
                     (component-level)   (file-level)   (component-level)
All SA alerts        0.2                 0.2            0.2
Security SA alerts   0.2                 0.2            0.2
attackers.”
complexity ...”
coupling, comment density and others
lines of code
metrics (e.g. betweenness, closeness)
“Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone.
Many eyes make all bugs shallow.”
Eric Raymond
The number of distinct developers who changed a given source code file
Files changed by 6 or more developers were 4 times more likely to have a vulnerability (p<0.001)
(…not quite what Linus’ Law says…)
Vulnerable files had more developers than neutral files (p<0.001)
In all three case studies…
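The developer-count metric described here can be sketched from a version control log. This is a minimal illustration, not the study's tooling: the `(author, path)` input format, the function names, and the sample log are all assumptions, and only the threshold of 6 comes from the slide.

```python
from collections import defaultdict


def developers_per_file(changes):
    """Map each file path to the set of distinct developers who changed it.

    `changes` is an iterable of (author, path) pairs, one per file touched
    by a commit (a hypothetical, pre-parsed form of a version control log).
    """
    devs = defaultdict(set)
    for author, path in changes:
        devs[path].add(author)
    return dict(devs)


def many_developer_files(changes, threshold=6):
    """Return the files changed by at least `threshold` distinct developers."""
    per_file = developers_per_file(changes)
    return sorted(p for p, authors in per_file.items() if len(authors) >= threshold)


# Hypothetical log: seven developers touch one file, two touch another.
changes = [(f"dev{i}", "fs/exec.c") for i in range(7)]
changes += [("dev0", "fs/exec.c"), ("dev1", "README"), ("dev2", "README")]

print(many_developer_files(changes))
```

Repeated commits by the same developer count once, since the metric is about distinct developers rather than commit volume.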
/fs/exec.c Unfocused Contribution
Examined files changed by many developers who were working
Used contribution network centrality (CNBetweenness)
Vulnerable files had a higher CNBetweenness (p<0.001) than neutral files.
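One way to picture a contribution-network betweenness is to link developers to the files they changed and compute betweenness centrality for the file nodes. The sketch below assumes an unweighted, undirected developer-file graph and uses Brandes' algorithm with unnormalized scores. The slide does not say how CNBetweenness was defined (weighting, normalization), so this shows the general idea rather than the study's metric. All names and the sample data are hypothetical.

```python
from collections import defaultdict, deque


def contribution_network(changes):
    """Build an undirected developer-file graph from (author, path) pairs."""
    adj = defaultdict(set)
    for author, path in changes:
        adj[("dev", author)].add(("file", path))
        adj[("file", path)].add(("dev", author))
    return dict(adj)


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes, unweighted, undirected)."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                                  # nodes in BFS visit order
        preds = {v: [] for v in adj}                # shortest-path predecessors
        sigma = dict.fromkeys(adj, 0)               # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:                     # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:          # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):                   # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical data: hub.c is shared by three developers who otherwise work apart.
changes = [("a", "hub.c"), ("b", "hub.c"), ("c", "hub.c"),
           ("a", "a.c"), ("b", "b.c"), ("c", "c.c")]
cb = betweenness(contribution_network(changes))
print(cb[("file", "hub.c")], cb[("file", "a.c")])
```

In this toy network the shared file sits on every shortest path between the three developers' otherwise separate work, so it gets the highest score.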
What you look at will likely be a vulnerability … but many vulnerabilities will be missing.
Admin by default                      $power_username='admin'
Empty password                        password=>''
Hard-coded secret                     $power_password='admin'
Invalid IP address binding            $bind_host='0.0.0.0'
Suspicious comment                    #FIXME(bogdando) remove these hacks after switched to systemd service.units
Use of HTTP without TLS               $quantum_auth_url = 'http://127.0.0.1:35357/v2.0'
Use of weak cryptography algorithm    password => ht_md5($power_password)
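The seven smells above lend themselves to simple pattern matching. The sketch below is a rough line-based checker, assuming straight quotes in the scripts. The regular expressions are illustrative guesses written to fit the slide's examples, not the rules of any published tool, and the smell names are the labels from the chart legend.

```python
import re

# Hypothetical patterns, one per smell; tuned only to the slide's examples.
SMELLS = {
    "AdminByDefault": re.compile(r"\$?\w*user\w*\s*=>?\s*['\"]admin['\"]", re.I),
    "EmptyPassword": re.compile(r"\$?\w*(password|pwd)\w*\s*=>?\s*(''|\"\")", re.I),
    "HardCodedSecret": re.compile(
        r"\$?\w*(password|pwd|secret|key)\w*\s*=>?\s*['\"][^'\"]+['\"]", re.I),
    "InvalidIPAddressBinding": re.compile(r"0\.0\.0\.0"),
    "SuspiciousComments": re.compile(r"#.*\b(FIXME|TODO|HACK|bug)\b", re.I),
    "HTTPWithoutTLS": re.compile(r"http://", re.I),
    "WeakCryptoAlgorithm": re.compile(r"\b\w*(md5|sha1)\w*\s*\(", re.I),
}


def find_smells(line):
    """Return the sorted names of all smells whose pattern matches the line."""
    return sorted(name for name, pattern in SMELLS.items() if pattern.search(line))


# The example snippets from the slide, with straight quotes.
examples = [
    "$power_username='admin'",
    "password=>''",
    "$power_password='admin'",
    "$bind_host='0.0.0.0'",
    "#FIXME(bogdando) remove these hacks after switched to systemd service.units",
    "$quantum_auth_url = 'http://127.0.0.1:35357/v2.0'",
    "password => ht_md5($power_password)",
]

for line in examples:
    print(find_smells(line), line)
```

Patterns this loose will produce false positives on real scripts (for example, any `http://` URL is flagged, including localhost), so the output is a list of places to look rather than a verdict.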
[Figure: Proportion of Script (%) per security smell for GitHub, Mozilla, Openstack, and Wikimedia. Smells shown: AdminByDefault, EmptyPassword, HardCodedSecret, InvalidIPAddressBinding, SuspiciousComments, HTTPWithoutTLS, WeakCryptoAlgorithm.]
vulnerabilities.
(critical) file
files.
Vulnerabilities Per Hour
Discovery Technique                      Tolven eCHR   OpenEMR   PatientOS
Exploratory Manual Penetration Testing   0.00          0.40      0.07
Systematic Manual Penetration Testing    0.94          0.55      0.55
Automated Penetration Testing            22.00         71.00     N/A
Static Analysis                          2.78          32.40     11.15
No single technique discovered every type of vulnerability.
Very few individual vulnerabilities were discovered with multiple discovery techniques.
Design flaw: systematic manual and exploratory penetration testing
Implementation bug: automated penetration testing and static analysis
                              Code Coverage   Vulnerability Coverage
Windows (Binaries)            48.4%           94.8%
Firefox (Source Code Files)   14.8%           85.6%
Fedora (Packages)             8.9%            63.3%
Boundary Code (BC): percentage of code that appears on the boundary of a software system
Boundary Vulnerabilities (BV): percentage of vulnerabilities on Boundary Code (BC)
                     BC     BV      Ratio
Windows 8     2014   4.5%   17.2%   3.8
              2015   4.6%   18.6%   4.0
Windows 8.1   2014   4.6%   16.5%   3.6
              2015   6.9%   23.7%   3.4
Windows 10    2014   3.4%   10.5%   3.1
              2015   3.9%   25.1%   6.4
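The Ratio column appears to be BV divided by BC, that is, how over-represented vulnerabilities are on boundary code relative to its share of the code base. The sketch below recomputes it from the slide's BC and BV values. The formula is inferred from the numbers rather than stated on the slide, and the names are illustrative.

```python
# (product, year, BC %, BV %) as listed on the slide.
rows = [
    ("Windows 8", 2014, 4.5, 17.2),
    ("Windows 8", 2015, 4.6, 18.6),
    ("Windows 8.1", 2014, 4.6, 16.5),
    ("Windows 8.1", 2015, 6.9, 23.7),
    ("Windows 10", 2014, 3.4, 10.5),
    ("Windows 10", 2015, 3.9, 25.1),
]


def boundary_ratio(bc_pct, bv_pct):
    """Share of vulnerabilities on boundary code divided by share of boundary code."""
    return round(bv_pct / bc_pct, 1)


ratios = [boundary_ratio(bc, bv) for _, _, bc, bv in rows]

for (product, year, bc, bv), ratio in zip(rows, ratios):
    print(f"{product:12s}{year}  BC={bc:4.1f}%  BV={bv:4.1f}%  ratio={ratio}")
```

A ratio above 1 means boundary code holds more than its proportional share of vulnerabilities. Every row here is above 3.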
Larry the Latent David the Discovered
Over-sampling)
(but not the test data)
the vulnerable files
Vulnerabilities Per Hour
Discovery Technique                      OpenMRS   ??   ??
Exploratory Manual Penetration Testing
Systematic Manual Penetration Testing
Automated Penetration Testing
Static Analysis
David the Detected Edwin the Exploitable Adam the Attack-prone How? Where?
function SMOTE()
  while Majority > m do delete any Majority item
  while Minority < m do add something_like(any Minority item)

function something_like(X0)
  { X1, X2, … } = k nearest neighbors of X0
  Z = any of { X1, X2, … }
  Y = interpolate(X0, Z)
  return Y

function minkowski_distance(a, b, r)
  return ( ∑ abs(a.i - b.i)^r ) ^ (1/r)

Q: How to do this better?
A1: Tune the magic parameters of SMOTE <m,k,r>
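The pseudocode above can be turned into a small runnable sketch. The version below follows the same three functions and the same parameters `<m, k, r>`. Everything else is an assumption for illustration: rows are numeric tuples, neighbors are drawn only from the original minority rows, and a seed is added for repeatability. It needs at least two minority rows and is not a substitute for a tested library implementation.

```python
import random


def minkowski_distance(a, b, r=2):
    """Minkowski distance between two equal-length numeric tuples."""
    return sum(abs(x - y) ** r for x, y in zip(a, b)) ** (1 / r)


def something_like(x0, minority, k=5, r=2, rng=random):
    """Return a synthetic row between x0 and one of its k nearest neighbors."""
    others = [x for x in minority if x is not x0]
    neighbors = sorted(others, key=lambda x: minkowski_distance(x0, x, r))[:k]
    z = rng.choice(neighbors)          # raises IndexError if x0 has no neighbors
    gap = rng.random()                 # position along the segment from x0 to z
    return tuple(a + gap * (b - a) for a, b in zip(x0, z))


def smote(majority, minority, m, k=5, r=2, seed=0):
    """Undersample the majority and oversample the minority to m rows each."""
    rng = random.Random(seed)
    majority, minority = list(majority), list(minority)
    originals = list(minority)         # assumption: interpolate between real rows only
    while len(majority) > m:
        majority.pop(rng.randrange(len(majority)))
    while len(minority) < m:
        x0 = rng.choice(originals)
        minority.append(something_like(x0, originals, k, r, rng))
    return majority, minority


# Hypothetical training data: 20 majority rows, 3 minority rows.
majority = [(float(i), 0.0) for i in range(20)]
minority = [(0.0, 10.0), (1.0, 11.0), (2.0, 12.0)]
new_majority, new_minority = smote(majority, minority, m=8, k=2)
print(len(new_majority), len(new_minority))
```

The call should be made on training data only, so that synthetic rows never leak into evaluation.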
Three empirical case studies
found yet)
                                       RHEL4 kernel   PHP                 Wireshark
Number of committers                   557            84                  19
Source code files                      14,454         1,039               2,688
% files vulnerable                     3%             6%                  3%
Pre-release version control log data   16 months      2 years             2 years
Years of security data                 5 years        3 years, 5 months   3 years, 5 months
*marked by developer as false positive or intentional
[Figure: % usage. Prevention 26%, Detection 33%, Response 48%.]