1
Statistical Modeling of UNIX Users and Processes With Application to Computer Intrusion Detection
Wen-Hua Ju
2
Acknowledgement
Yehuda Vardi (Rutgers), Matthias Schonlau (RAND), William DuMouchel (AT&T Labs), Alan F. Karr (NISS), Allan Wilks (AT&T Labs), Daryl Pregibon (AT&T Labs)
3
How Statisticians Got Involved …
- Adapt techniques developed by AT&T Labs Statistics Research for detecting telephone fraud to detecting intrusion into networked computer systems.
- But …
  – Multiple intruder motives
  – Hard-to-quantify losses
  – Massive data
- Something simpler: characterizing and differentiating among the users of a computer system
4
Outline
- Experiments and Data
  – UNIX users
  – UNIX processes
- Models for finite-state discrete stochastic processes
  – Hybrid High-order Markov Chain
  – Rarity of Occurrence
- Results and Discussion
5
Computer Intrusion And Intrusion Detection
- Computer Intrusion: A sequence of related actions by a malicious adversary that results in the occurrence of unauthorized security threats to a target computing or networking domain (Edward Amoroso, 1999).
6
Experiments And Data
- UNIX Users: Detecting Masquerades
  – Command sequences (AT&T Labs)
  – Collected by the UNIX acct auditing mechanism
7
Experiments And Data
- UNIX Users: Detecting Masquerades
  – 70 users, 15,000 commands each
    - 50 users: normal users (intrusion targets)
    - 20 users: masqueraders
  – Simplifying assumption
    - Blocks of 100 commands
  – Blocks are randomly chosen from the masqueraders and inserted into the normal users' data
  – Data available at http://www.schonlau.net/intrusion.html
9
Experiments And Data
- UNIX Processes:
  – System-call traces (Computer Immune System Research, University of New Mexico)
  – Normal data: synthetic and live
  – Intrusion data: real intrusions
10
High-order Markov Chain Model
- High-order vs. regular Markov model
- Problem: huge parameter space
- Mixture Transition Distribution (MTD) (Raftery 1985; Raftery and Tavaré 1994)
  – Auto-regressive
  – Only one extra parameter is added to the model for each extra lag
11
High-order Markov Chain Model: MTD Model

$$P(X_t = s_{i_0} \mid X_{t-1} = s_{i_1}, \ldots, X_{t-l} = s_{i_l}) = \sum_{j=1}^{l} \lambda_j \, r(s_{i_0} \mid s_{i_j}), \quad t = l+1, l+2, \ldots$$

where $\{r(s_i \mid s_j)\}$ and $\{\lambda_j\}$ satisfy

$$r(s_i \mid s_j) \ge 0 \ \text{ and } \ \sum_{i=1}^{K} r(s_i \mid s_j) = 1, \ \forall j = 1, \ldots, K; \qquad \lambda_j \ge 0, \quad \sum_{j=1}^{l} \lambda_j = 1$$
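As a rough illustration of the MTD transition rule, the sketch below mixes the columns of a single transition matrix, one column per lagged state, weighted by the lag weights. The function name `mtd_next_prob` and the toy numbers are assumptions made for this example; they are not from the talk.

```python
def mtd_next_prob(r, lam, history):
    """Distribution of X_t given the last l states under an MTD model.

    r[i][j]  : r(s_i | s_j); each column j sums to 1.
    lam[j-1] : weight of lag j; weights are non-negative and sum to 1.
    history  : [state at t-1, state at t-2, ..., state at t-l] as indices.
    """
    if len(history) != len(lam):
        raise ValueError("history must contain one state per lag")
    n_states = len(r)
    # P(X_t = s_i | history) = sum_j lam_j * r(s_i | state at lag j)
    return [
        sum(w * r[i][s] for w, s in zip(lam, history))
        for i in range(n_states)
    ]


# Toy example (made-up numbers): K = 2 states, l = 2 lags.
r = [[0.9, 0.2],
     [0.1, 0.8]]
lam = [0.7, 0.3]
p = mtd_next_prob(r, lam, history=[0, 1])
```

With K states and l lags, this parameterization needs one K x K matrix plus l weights, whereas a full order-l chain needs a separate distribution for each of the K^l histories.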
12
High-order Markov Chain Model: MTD Model, Parameter Estimation via MLE

$$\log L(x_1, \ldots, x_T) = \sum_{i_0=1}^{K} \cdots \sum_{i_l=1}^{K} N(s_{i_0}, \ldots, s_{i_l}) \, \log\Big( \sum_{j=1}^{l} \lambda_j \, r(s_{i_0} \mid s_{i_j}) \Big)$$

- Direct maximization: sequential quadratic programming algorithm, but …
- Alternating maximization
  – Fix r(·|·): easy
  – Fix λ: still too many parameters

Writing $k$ for a tuple $(i_0, \ldots, i_l)$, with $a_k = N(s_{i_0}, \ldots, s_{i_l})$ and $b_k = \sum_{j=1}^{l} \lambda_j \, r(s_{i_0} \mid s_{i_j})$:

$$\log L = \sum_k a_k \log b_k, \quad \text{where } \sum_k a_k = T - l \ \text{ and } \ \sum_k b_k = K^l$$
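To make the log-likelihood concrete, here is a minimal sketch that counts the (l+1)-tuples in an integer-coded sequence and sums the count-weighted log probabilities. The function name and the toy sequence are assumptions for illustration only, and the sketch evaluates the likelihood without maximizing it.

```python
import math
from collections import Counter


def mtd_log_likelihood(seq, r, lam):
    """Log-likelihood of seq under an MTD model of order l = len(lam).

    seq : list of state indices x_1, ..., x_T.
    r   : r[i][j] = r(s_i | s_j).
    lam : lag weights lambda_1, ..., lambda_l.
    """
    l = len(lam)
    # N(s_{i_0}, ..., s_{i_l}): counts of (current, lag 1, ..., lag l) tuples.
    counts = Counter(
        tuple(seq[t - j] for j in range(l + 1)) for t in range(l, len(seq))
    )
    total = 0.0
    for states, n in counts.items():
        current, lags = states[0], states[1:]
        prob = sum(w * r[current][s] for w, s in zip(lam, lags))
        total += n * math.log(prob)
    return total


# Toy example (made-up numbers): K = 2 states, l = 2 lags, T = 5.
r = [[0.9, 0.2],
     [0.1, 0.8]]
lam = [0.7, 0.3]
ll = mtd_log_likelihood([0, 1, 0, 1, 0], r, lam)
```

The counts sum to T - l, since the first l observations have no complete history.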
13
High-order Markov Chain Model: MTD Model, MLE

$$\hat{b}_k = \frac{a_k}{\sum_{k'} a_{k'}} \sum_{k'} b_{k'} = \frac{K^l}{T - l} \, a_k, \quad \forall k$$

It is equivalent to solving the following linear system for b (or λ):

$$\sum_{j=1}^{l} \lambda_j \, \hat{r}(s_{i_0} \mid s_{i_j}) = \frac{K^l}{T - l} \, N(s_{i_0}, \ldots, s_{i_l}), \quad \forall (i_0, \ldots, i_l)$$

The system can be "solved" efficiently using the EM algorithm, in the sense of minimizing the K-L distance.
14
High-order Markov Chain Model: Application to Command Data
- Exhaustive Command Space (ECS) Model:
  – Treat all commands as Markov chain states
- Partial Command Space (PCS) Model:
  – Treat frequently used commands as Markov chain states, and use "other" to represent the rest
- Modification for "other"
  – r(other | ·) are small
  – r(· | other) are equal
- Use the parameter estimates as the user profile
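The partial command space can be sketched as a simple relabeling step. The cutoff `top_k` is a hypothetical parameter introduced for this example; the slides do not say how "frequently used" is defined.

```python
from collections import Counter


def to_partial_command_space(commands, top_k):
    """Keep the top_k most frequent commands as states; map the rest to "other".

    top_k is an assumed cutoff, not a value taken from the talk.
    """
    frequent = {cmd for cmd, _ in Counter(commands).most_common(top_k)}
    return [cmd if cmd in frequent else "other" for cmd in commands]


cmds = ["ls", "cd", "ls", "vi", "ls", "cd", "gcc"]
mapped = to_partial_command_space(cmds, top_k=2)
```

Here `ls` and `cd` stay as states, while `vi` and `gcc` collapse into the single state "other", which keeps the number of Markov chain states small.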
15
High-order Markov Chain Model: Application to Command Data
- Hypothesis testing as a decision rule
  H0: Command blocks are from user u
  H1: Command blocks are NOT from user u
- Likelihood-ratio-like test

$$X_u(c_1, \ldots, c_T \mid \hat{\Lambda}_1, \ldots, \hat{\Lambda}_U, \hat{R}_1, \ldots, \hat{R}_U) = \log \frac{\max_{v \ne u} L(c_1, \ldots, c_T \mid \hat{\Lambda}_v, \hat{R}_v)}{L(c_1, \ldots, c_T \mid \hat{\Lambda}_u, \hat{R}_u)}$$

Reject H0 if $X_u > w$.
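The decision rule can be sketched as follows, assuming the log-likelihood of one command block under each user's fitted profile has already been computed. The user names, the log-likelihood values, and the threshold are all made up for illustration.

```python
def likelihood_ratio_statistic(loglik_by_user, u):
    """X_u = log(max over v != u of L_v) - log(L_u) for one command block."""
    best_other = max(ll for v, ll in loglik_by_user.items() if v != u)
    return best_other - loglik_by_user[u]


def reject_h0(loglik_by_user, u, w):
    """Reject H0 (the block is from user u) when X_u exceeds the threshold w."""
    return likelihood_ratio_statistic(loglik_by_user, u) > w


# Hypothetical log-likelihoods of a single block under three user profiles.
logliks = {"alice": -120.0, "bob": -150.0, "carol": -135.0}
x_alice = likelihood_ratio_statistic(logliks, "alice")
x_bob = likelihood_ratio_statistic(logliks, "bob")
```

A large positive statistic means some other user's profile explains the block much better than the claimed user's profile does.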
16
Hybrid High-order Markov Chain Model
In case of no or not enough training data: independence model

$$P(c_1, \ldots, c_T \mid \text{user } u) = \prod_{t=1}^{T} P(c_t \mid \text{user } u) = \prod_{t=1}^{T} q_{u c_t}$$

Estimate the q's using modified user/command counts.
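A minimal sketch of the independence model follows. The slides do not say how the counts are "modified", so the additive pseudo-count `alpha` used here is purely an assumption that keeps unseen commands from getting probability zero.

```python
import math
from collections import Counter


def estimate_q(training_commands, vocabulary, alpha=1.0):
    """Smoothed estimate of q_uc = P(c | user u).

    alpha is an assumed pseudo-count; the talk's actual modification
    of the user/command counts is not specified.
    """
    counts = Counter(training_commands)
    total = len(training_commands) + alpha * len(vocabulary)
    return {c: (counts[c] + alpha) / total for c in vocabulary}


def independence_log_likelihood(block, q):
    """log P(c_1, ..., c_T | user u) = sum over t of log q_{u c_t}."""
    return sum(math.log(q[c]) for c in block)


vocab = ["ls", "cd", "vi"]
q = estimate_q(["ls", "ls", "cd"], vocab)
block_ll = independence_log_likelihood(["ls", "vi"], q)
```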
17
Hybrid High-order Markov Chain: Application to Command Data
- Test statistics

$$X_{1u}(c_1, \ldots, c_T \mid \text{user } u) = \log \frac{\max_{v \ne u} L(c_1, \ldots, c_T \mid \hat{\Lambda}_v, \hat{R}_v)}{L(c_1, \ldots, c_T \mid \hat{\Lambda}_u, \hat{R}_u)}$$

$$X_{2u}(c_1, \ldots, c_T \mid \text{user } u) = \log \frac{\max_{v \ne u} \prod_{i=1}^{T} \hat{q}_{v c_i}}{\prod_{i=1}^{T} \hat{q}_{u c_i}}$$

$X'_{2u}$ is a three-case piecewise rescaling of $X_{2u}$ governed by the thresholds $\tau_1, \tau_2$ and $\rho_1, \rho_2$ [piecewise formula not recoverable from the slide text].
18
Hybrid High-order Markov Chain: Application to Command Data
- Hybrid test statistic

$$X_u = \begin{cases} X_{1u}, & \text{if } s/T \le \xi_1 \\[4pt] \dfrac{\xi_2 - s/T}{\xi_2 - \xi_1} X_{1u} + \dfrac{s/T - \xi_1}{\xi_2 - \xi_1} X'_{2u}, & \text{if } \xi_1 \le s/T \le \xi_2 \\[4pt] X'_{2u}, & \text{if } s/T > \xi_2 \end{cases}$$

where s: # of "other" in $\{c_1, \ldots, c_T\}$
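One reading of the hybrid statistic is a linear blend controlled by the fraction of "other" commands in the block: use the Markov statistic when that fraction is small, the independence statistic when it is large, and interpolate in between. The sketch below assumes this reading, and the threshold values in the example are made up.

```python
def hybrid_statistic(x1, x2_prime, s, T, xi1, xi2):
    """Blend the Markov statistic x1 and the rescaled independence statistic x2_prime.

    s   : number of "other" commands in the block.
    T   : block length.
    xi1 : fraction of "other" at or below which only x1 is used.
    xi2 : fraction of "other" above which only x2_prime is used.
    """
    frac = s / T
    if frac <= xi1:
        return x1
    if frac > xi2:
        return x2_prime
    # Linear interpolation between the two statistics on [xi1, xi2].
    weight = (frac - xi1) / (xi2 - xi1)
    return (1.0 - weight) * x1 + weight * x2_prime


# Hypothetical values: block of 100 commands, thresholds 0.2 and 0.6.
low = hybrid_statistic(2.0, 10.0, s=10, T=100, xi1=0.2, xi2=0.6)
mid = hybrid_statistic(2.0, 10.0, s=40, T=100, xi1=0.2, xi2=0.6)
high = hybrid_statistic(2.0, 10.0, s=80, T=100, xi1=0.2, xi2=0.6)
```

The interpolation makes the statistic continuous at both thresholds, so a small change in the number of "other" commands cannot cause a jump in the score.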
19
Rarity of Occurrence Model
- Motivation: detection should depend not only on frequency
  – Schonlau and Theus (2000)
- Rarity of command(s)
  – Popular and frequently used
  – Popular but not frequently used
  – Rare or unique
- Define the rarity index of a command based on the number of users who used this command
20
Rarity of Occurrence Model
- Rarity Index Example:
  – Total 50 users
  – A command used by only 1 user: 50/50
  – A command used by all 50 users: 1/50
  – A command used by no users: ½ (?)
  – Defined for both an individual command and a short sequence of commands
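The two worked values above are consistent with a linear index of the form (U - n + 1) / U, where U is the number of users and n is the number of users who used the command. That formula is an assumption inferred from the two examples, not something the slides state, and the value for unseen commands follows the slide's own tentative ½.

```python
def rarity_index(n_users_with_command, total_users):
    """Rarity index of a command (or short command sequence).

    Assumes a linear form (U - n + 1) / U, which reproduces the two
    examples on the slide (50/50 and 1/50 for U = 50). Commands used by
    no one get 0.5, the tentative value marked "(?)" on the slide.
    """
    if n_users_with_command == 0:
        return 0.5
    return (total_users - n_users_with_command + 1) / total_users


unique = rarity_index(1, 50)
universal = rarity_index(50, 50)
unseen = rarity_index(0, 50)
```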
21
Rarity of Occurrence Model
- Anomaly signal of user u's short command sequence $(c_{k_1}, \ldots, c_{k_l})$ defined as the weighted rarity index
  – Weight (+/-) depends on frequency
  – Case 1: User u has used Pu
  – Case 2: User u didn't use Pu, but has used all the commands
  – Case 3: User u didn't use all the commands
- Test score is defined as a weighted sum of anomaly signals
22
Rarity of Occurrence Model
- Entropy model (only tried on the system call data)
  – Motivation
  – Shannon's entropy of distribution {p_i}
  – Small entropy indicates abnormality
  – Test score is defined as the sum of weighted entropies
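For reference, Shannon's entropy of an empirical distribution can be computed as below. The sketch covers only the entropy itself; which distribution {p_i} is used and how the entropies are weighted are not specified on the slide. The base-2 logarithm and the toy system-call names are choices made for this example.

```python
import math
from collections import Counter


def shannon_entropy(events):
    """Entropy (in bits) of the empirical distribution of the observed events."""
    counts = Counter(events)
    n = len(events)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy traces: an even mix of two calls versus a single repeated call.
mixed = shannon_entropy(["open", "read", "open", "read"])
constant = shannon_entropy(["open", "open", "open", "open"])
```

A trace dominated by a single event has entropy near zero, which is the low-entropy situation the slide flags as abnormal.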
[Figure: Unix command result]
26
Discussion
- Hybrid High-order Markov Chain Model
  – Multi-layer defense scheme
  – Computational demand
  – Likelihood ratio
- Rarity of Occurrence Model
  – Good performance
  – Global information is important
- Future study
  – Utilizing more information
  – Relaxing experimental limitations
  – Other audit data formats
27