Learning Rules for Anomaly Detection (LERAD) of Hostile Network Traffic
Matt Mahoney

Overview
Prior Work in Network Anomaly Detection
The 1999 DARPA Intrusion Detection Evaluation
Packet Header Anomaly Detection (PHAD)
Application Layer Anomaly Detection (ALAD)
Learning Rules for Anomaly Detection (LERAD)

Prior Work in Network Anomaly Detection
N-grams (Forrest, 1996)
State machine, neural networks (RST, Ghosh 1999)
User-programmed rules: firewalls, SNORT, Bro
Learned rules: ADAM, NIDES, SPADE

The 1999 DARPA Intrusion Detection Evaluation
Weeks 1 and 3: training, no attacks
Week 2: training, 138 labeled instances of 32 attacks
Weeks 4 and 5: test, 201 unlabeled instances (190 actual)
IDS must identify IP address of attacker or target and
Evaluated at a threshold of 100 false alarms (10 per day)
Blind (developers had no access to test data)
Evaluated by DARPA
May use both signature and anomaly detection
May use both host-based and network-based methods
May restrict attacks by category, data type, or target
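The false-alarm threshold can be illustrated with a minimal sketch. It assumes each alarm has already been labeled as a detection of some attack or as a false alarm, and it ignores the other evaluation criteria (such as identifying the IP address). The function name and the (score, attack) tuple layout are hypothetical, not part of the evaluation.

```python
def detections_at_threshold(alarms, max_false_alarms=100):
    """Count distinct attacks detected before the false-alarm budget is exceeded.

    alarms: iterable of (score, attack) pairs, where attack is an attack
    identifier for a true detection or None for a false alarm.
    """
    detected = set()
    false_alarms = 0
    # Walk the alarms from highest to lowest score.
    for score, attack in sorted(alarms, key=lambda a: a[0], reverse=True):
        if attack is None:
            false_alarms += 1
            if false_alarms > max_false_alarms:
                break  # budget exhausted; lower-scoring alarms are discarded
        else:
            detected.add(attack)  # repeated alarms for one attack count once
    return len(detected)
```

Lowering the budget amounts to raising the score threshold, so fewer attacks are counted as detected.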

Packet Header Anomaly Detection (PHAD)
Examines Ethernet, IP, TCP, UDP, ICMP protocols
34 learned rules (trained on week 3)
Score = tn/r, summed over the packet
t = time since last anomaly (values never seen in training)
n = number of training packets
r = number of allowed values
Detects 72/189 attacks (54 without TTL)
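The score tn/r can be sketched for a single header field as follows. This is an illustration only, not the PHAD implementation: the class and function names are hypothetical, times are plain floats in seconds, and it is assumed that a value first seen during training also resets the anomaly timer.

```python
from dataclasses import dataclass, field


@dataclass
class FieldModel:
    """Values of one header field observed in attack-free training traffic."""
    allowed: set = field(default_factory=set)
    n: int = 0                 # number of training packets
    last_anomaly: float = 0.0  # time of the most recent anomaly, in seconds

    def train(self, value, now):
        self.n += 1
        if value not in self.allowed:
            # Assumption: a value first seen during training also counts as an anomaly.
            self.allowed.add(value)
            self.last_anomaly = now

    def score(self, value, now):
        """Return t*n/r for a value never seen in training, otherwise 0."""
        if value in self.allowed or not self.allowed:
            return 0.0
        t = now - self.last_anomaly   # time since last anomaly
        self.last_anomaly = now
        r = len(self.allowed)         # number of allowed values
        return t * self.n / r


def packet_score(models, packet, now):
    """Sum the field scores over one packet (a dict of field name -> value)."""
    return sum(models[name].score(value, now)
               for name, value in packet.items() if name in models)
```

A second anomaly shortly after the first gets a much lower score, because t is then small.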

Application Layer Anomaly Detection (ALAD)
Models incoming server TCP connections
Conditional rules (5 forms, selected ad hoc)
Score = tn/r
Detects 59/189 attacks (70 with PHAD without TTL)

Learning Rules for Anomaly Detection (LERAD)
Like ALAD, but rule forms are derived from a sample of
Rule form: if A1 = V1 and A2 = V2 and ... then Am = V1, V2, ... or Vr
23 attributes (Ai)
Score = tn/r
Detects 112-118/190 attacks (average 114.8, or 60%)
No improvement when merged with PHAD or ALAD
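A conditional rule of this form can be sketched as a small data structure. The names below are hypothetical and the rule-learning step is omitted. It is assumed that n counts the training records satisfying the antecedent and that r is the number of allowed values, which is consistent with the n/r figures in the rule listing.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    antecedent: dict            # attribute -> required value; empty means "always applies"
    target: str                 # the constrained attribute Am
    allowed: set                # the values V1..Vr seen in training
    n: int                      # assumed: training records satisfying the antecedent
    last_anomaly: float = 0.0   # time of the last violation, in seconds

    def applies(self, record):
        return all(record.get(a) == v for a, v in self.antecedent.items())

    def score(self, record, now):
        """Return t*n/r if the rule applies and is violated, otherwise 0."""
        if not self.applies(record) or record.get(self.target) in self.allowed:
            return 0.0
        t = now - self.last_anomaly
        self.last_anomaly = now
        return t * self.n / len(self.allowed)


def record_score(rules, record, now):
    """Sum the scores of all violated rules for one record."""
    return sum(rule.score(record, now) for rule in rules)
```

For example, `Rule({"W3": ".HTTP/1.0^M^"}, "DP", {"80"}, 12854)` scores an HTTP request to any port other than 80 and ignores records whose third word is not `.HTTP/1.0^M^`.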
Date       Time     DA1 DA0 DP SA3 SA2 SA1 SA0 SP   DUR  F1 F2  F3  Len W1      W2
03/15/1999 08:00:57 112 050 25 196 037 037 158 1111 0    .S .AP .AF 857 .^@EHLO .ju
03/15/1999 08:00:57 113 050 25 196 037 037 158 1113 0    .S .AP .AF 880 .^@EHLO .ju
03/15/1999 08:01:13 114 050 80 172 016 016 100 2971 4489 .S .AP .AP 872 .^@GET  .
03/15/1999 08:01:13 114 050 80 172 016 016 100 2972 5693 .S .AP .AF 595 .^@GET  .
03/15/1999 08:01:13 114 050 80 172 016 016 100 2973 12   .S .AP .AF 318 .^@GET  ./w
03/15/1999 08:01:13 114 050 80 172 016 016 100 2974 118  .S .AP .AP 610 .^@GET  ./
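Records in this layout can be read into attribute dictionaries with a short sketch like the one below. It assumes whitespace-separated columns with no embedded spaces and covers only the 17 columns shown in the sample (the remaining word attributes are not visible here). The function name is hypothetical.

```python
FIELDS = ["Date", "Time", "DA1", "DA0", "DP", "SA3", "SA2", "SA1", "SA0",
          "SP", "DUR", "F1", "F2", "F3", "Len", "W1", "W2"]


def parse_record(line):
    """Split one whitespace-separated record into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    # Values are kept as strings: the rules compare them as nominal attributes.
    return dict(zip(FIELDS, values))
```

Keeping every value as a string preserves leading zeros such as `050`.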
1 28882/2 if F2=.AP then F1 = .S .AS
2 14236/1 if DA0=100 then DA1 = 112
3 12854/1 if W3=.HTTP/1.0^M^ then W1 = .^@GET
4 12854/1 if W3=.HTTP/1.0^M^ then DP = 80
5 35455/3 if then DA1 = 113 112 114
6 34602/3 if F3=.AF then F1 = .S .AF .AS
7 10857/1 if SA3=172 then SA2 = 016
8 10857/1 if SA2=016 then SA1 = 016
9 10857/1 if SA2=016 then SA3 = 172
10 10642/1 if F1=.S F2=.AP W1=.^@EHLO then DP = 25
11 9914/1 if W3=.HELO then W7 = .RCPT
12 9914/1 if W5=.MAIL then W3 = .HELO
13 9914/1 if W3=.HELO then W1 = .^@EHLO
14 28882/3 if F2=.AP then F3 = .AP .AF .R
15 35455/4 if then F1 = .S .AF .AS .R
16 34602/4 if F3=.AF then F2 = .S .AP . .AS
17 7656/1 if W7=. then W8 = .
18 7645/1 if W5=. then W6 = .
19 7645/1 if W4=. then W7 = .
20 7596/1 if W3=. then W4 = .
21 7566/1 if DA1=114 W3=.HTTP/1.0^M^ then DA0 = 050
22 29549/4 if F1=.S then F2 = .S .AP . .A
23 35455/5 if then F2 = .S .AP . .AS .A
24 35455/5 if then F3 = .S .AP .AF .AS .R
25 12867/2 if W1=.^@GET then W3 = .HTTP/1.0^M^ .align=
26 12854/2 if W3=.HTTP/1.0^M^ then DA0 = 050 100
27 10105/2 if W7=.RCPT then W5 = .MAIL .RCPT
28 35455/8 if then SA3 = 196 172 197 194 195 135 192 152
29 12838/3 if DP=25 then W1 = .^@EHLO . .^@HELO
30 3992/1 if W3=.HTTP/1.0^M^ W7=.text/htm then W8 = .text/pla
31 7647/2 if W6=. then W5 = . .QUIT^M^
32 7279/2 if SA0=050 then SA1 = 016 073
33 3521/1 if DA1=112 W3=.HTTP/1.0^M^ W6=.User-Age then W7 = .Mozilla/
34 6824/2 if W6=.User-Age then W4 = .Connection: .Referer:
35 6823/2 if F2=.AP W6=.User-Age then W8 = .[en] .(X11;
36 18807/6 if DA1=112 then DA0 = 050 100 194 207 149 020
37 2998/1 if SA1=037 then SA0 = 158
38 29549/10 if F1=.S then DP = 113 25 23 80 135 21 79 22 515 139
39 35455/12 if then DA0 = 105 050 204 084 168 148 169 100 194 207 149 020
40 34602/12 if F3=.AF then DP = 113 25 23 80 21 20 79 22 1022 515 1023 139
41 35455/13 if then SA2 = 037 016 182 168 169 115 027 008 227 073 007 218 013
42 35455/13 if then DP = 113 25 23 80 135 21 20 79 22 1022 515 1023 139
43 35455/13 if then SA1 = 037 016 182 168 169 115 027 008 227 073 007 218 013
44 2695/1 if SA1=007 then SA2 = 007
45 2695/1 if SA3=194 SA2=007 then SA0 = 153
46 5223/2 if SA3=194 then SA0 = 021 153
47 7656/3 if W7=. then W3 = . .PASS .6667^M^
48 6852/3 if W4=.Referer: then W5 = .http://w .http://m .http://h
49 2083/1 if SA1=013 then SA0 = 191
50 1888/1 if SA1=227 F1=.S then SA0 = 189
51 12885/7 if DP=80 then W4 = .HTTP/1.0^M^ .Connection: .Referer: . .Host:
52
53 35455/24 if then SA0 = 105 158 050 204 084 182 233 168 148 169 100 194 108
54 12854/10 if W3=.HTTP/1.0^M^ then W8 = .User-Age .[en] .text/pla .(X11; .I;
55 7109/6 if DA1=112 SA2=016 F3=.AF then DA0 = 050 100 194 207 149 020
56 12867/13 if W1=.^@GET then W6 = .User-Age .[en] .Connecti .Accept: .(X11;
57 10857/12 if SA2=016 then DA0 = 105 050 204 084 168 148 169 100 194 207 149
58 1805/2 if F1=.S W6=." then W2 = .^C .^@^@^@
59 1798/2 if DP=23 F3=.AF then W4 = .^_ .#
60 5827/9 if DP=20 W5=. then DUR = 0 1 4 6 7 2 3 5 36
61 7656/13 if W8=. then W2 = . ., .anonhmous^M^ .anonymMus^M^ .anonyxous^M^
62 7647/32 if W6=. then DUR = 0 23 1 12 108 4 30 6 9 21 24 7 14 22 2 3 11 15 27
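Each entry in the rule listing appears to follow the layout "number n/r if conditions then attribute = values". The sketch below parses one such entry under that assumed layout. The function name and the returned dictionary keys are hypothetical. Long value lists are cut off in the listing, so the number of parsed values can be smaller than r.

```python
import re

# number, n/r, optional conditions, target attribute, allowed values
RULE_RE = re.compile(r"^(\d+)\s+(\d+)/(\d+)\s+if\s+(.*?)\s*then\s+(\S+)\s*=\s*(.*)$")


def parse_rule(line):
    """Parse one rule entry, e.g. '4 12854/1 if W3=.HTTP/1.0^M^ then DP = 80'."""
    m = RULE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a rule line: {line!r}")
    number, n, r, conditions, target, values = m.groups()
    # Conditions are space-separated ATTR=VALUE pairs; the list may be empty.
    antecedent = dict(c.split("=", 1) for c in conditions.split())
    return {
        "number": int(number),
        "n": int(n),
        "r": int(r),
        "antecedent": antecedent,
        "target": target,
        "values": values.split(),
    }
```

A rule with an empty antecedent, such as rule 5, parses to an empty condition dictionary and therefore applies to every record.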
FP 003 (52.43) W1?=.GET W3=.HTTP/1.0^M^
FP 001 (40.79) F1?=.AP F2=.AP
TP portsweep 002 (99.99) DA1?=118 DA0=100
TP neptune 016 (99.62) F2?=.A F3=.AF
TP dosnuke 005 (33.1) DA1?=115
FP 014 (99.93) F2=.AP F3?=.AR
FP 001 (56.62) F1?=.AP F2=.AP
TP queso 015 (55.42) F1?=.F
TP sendmail 029 (98.03) DP=25 W1?=.^@MAIL
TP apache2 035 (50.01) F2=.AP W6=.User-Age W8?=.User-Age
FP 048 (100) W4=.Referer: W5?=.http://1
FP 001 (42.99) F1?=.AP F2=.AP
TP portsweep 024 (99.3) F3?=.F
TP apache2 034 (50.04) W4?=.User-Agent: W6=.User-Age
FP 031 (69.28) W5?=.z^X)^K W6=.
TP dosnuke 022 (98.19) F1=.S F2?=.UAP
TP tcpreset 001 (99.98) F1?=.AP F2=.AP
TP netbus 014 (100) F2=.AP F3?=.S
TP ntinfoscan 001 (55.59) F1?=.AP F2=.AP
FP 036 (99.56) DA1=112 DA0?=010
TP dosnuke 022 (56.87) F1=.S F2?=.UAP
TP ncftp 005 (67.59) DA1?=118
TP dosnuke 022 (47.8) F1=.S F2?=.UAP
FP 032 (99.6) SA1?=048 SA0=050
TP crashiis 025 (97.99) W1=.^@GET W3?=.
TP mscan 002 (99.97) DA1?=118 DA0=100
FP 032 (100) SA1?=048 SA0=050
TP satan 040 (90.05) DP?=70 F3=.AF
TP crashiis 025 (80.06) W1=.^@GET W3?=.
TP netcat_breakin 031 (79.24) W5?=.exit^ W6=.
FP 054 (91.54) W3=.HTTP/1.0^M^ W8?=.applicat
TP ps 050 (99.24) SA1=227 SA0?=125 F1=.S
FP 051 (96.38) DP=80 W4?=.^
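The alarm entries can be parsed with a sketch like the following. The layout is inferred from the listing rather than documented: a TP or FP label, an attack name on TP entries only, a three-digit rule number, a score in parentheses, and attribute=value pairs in which "?" marks the attribute that violated the rule. The function name and returned keys are hypothetical.

```python
import re

# label, optional attack name, rule number, score, attribute=value pairs
ALARM_RE = re.compile(r"^(TP|FP)\s+(?:(\S+)\s+)?(\d{3})\s+\(([\d.]+)\)\s+(.*)$")


def parse_alarm(line):
    """Parse one alarm entry, e.g. 'TP portsweep 002 (99.99) DA1?=118 DA0=100'."""
    m = ALARM_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an alarm line: {line!r}")
    label, attack, rule, score, rest = m.groups()
    fields, anomalous = {}, []
    for token in rest.split():
        name, value = token.split("=", 1)
        if name.endswith("?"):          # assumed marker for the violating attribute
            name = name[:-1]
            anomalous.append(name)
        fields[name] = value
    return {"label": label, "attack": attack, "rule": int(rule),
            "score": float(score), "anomalous": anomalous, "fields": fields}
```

Grouping the parsed entries by rule number shows, for instance, that rule 1 accounts for several of the false alarms in the listing.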
Algorithm is robust against useless data (date and time).
Nearly all attack types are detected.
U2R attacks detected by FTP uploads or anomalous IP
Only TCP is monitored.
UDP attacks (udpstorm,
Hard to detect attacks (from original evaluation) are
Intrusions are anomalous because normal behavior does
Models are nonstationary: score = tn/r.
Event frequency does not matter.
Time since last event does matter.
Space complexity is low (a few KB memory).
Only near-certain events need to be modeled.
Small number of rules.
Small number of allowed values per rule.
Apply LERAD to UDP, ICMP, packet headers.
Investigate with realistic data
Training on week 2 cuts detections to 78/190 (67% of
Mix sniffed real data with training/test data
On-line algorithm (no training/test phases)
Improve application text modeling
Language-independent parsing
Modeling binary data (DNS, etc.)
Deeper into text (mail attachments, etc.)