Statistical Modeling of UNIX Users and Processes With Application to Computer Intrusion Detection

SLIDE 1

Statistical Modeling of UNIX Users and Processes With Application to Computer Intrusion Detection

Wen-Hua Ju

SLIDE 2

Acknowledgement

  • Yehuda Vardi (Rutgers)
  • Matthias Schonlau (RAND)
  • William DuMouchel (AT&T Labs)
  • Alan F. Karr (NISS)
  • Allan Wilks (AT&T Labs)
  • Daryl Pregibon (AT&T Labs)

SLIDE 3

How a Statistician Got Involved …

  • Refine techniques developed by AT&T Labs Statistics Research for detecting telephone fraud, so that they detect intrusion into networked computer systems.
  • But …
    – Multiple intruder motives
    – Hard-to-quantify losses
    – Massive data
  • Something simpler: characterization of, and differentiation among, users of a computer system

SLIDE 4

Outline

  • Experiments and Data
    – UNIX users
    – UNIX processes
  • Models for finite-state discrete stochastic processes
    – Hybrid High-order Markov Chain
    – Rarity of Occurrence
  • Results and Discussion

SLIDE 5

Computer Intrusion and Intrusion Detection

  • Computer Intrusion:

A sequence of related actions by a malicious adversary that results in the occurrence of unauthorized security threats to a target computing or networking domain. (Edward Amoroso, 1999)

SLIDE 6

Experiments and Data

  • UNIX Users: Detecting Masquerades
    – Command sequences (AT&T Labs)
    – Collected by the UNIX acct auditing mechanism

SLIDE 7

Experiments and Data

  • UNIX Users: Detecting Masquerades
    – 70 users, 15,000 commands each
      • 50 users: normal users (intrusion targets)
      • 20 users: masqueraders
    – Simplifying assumption
      • Blocks of 100 commands
    – Blocks are randomly chosen from the masqueraders and inserted into the normal users' data
    – Data available at http://www.schonlau.net/intrusion.html
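The block-level setup described on this slide can be sketched as follows. This is only an illustration: the per-block probability `p_insert` and the choice to replace a block (rather than splice one in) are assumptions, since the slide states only that 100-command blocks from masqueraders are randomly chosen and inserted into normal users' data.

```python
import random

BLOCK_SIZE = 100  # block length stated on the slide


def split_into_blocks(commands, block_size=BLOCK_SIZE):
    """Cut a command stream into consecutive full blocks (a trailing partial block is dropped)."""
    n_full = len(commands) // block_size
    return [commands[i * block_size:(i + 1) * block_size] for i in range(n_full)]


def contaminate_blocks(target_blocks, masquerader_blocks, p_insert, rng):
    """Return (blocks, labels); labels[i] is True where a masquerader block was used.

    p_insert is a hypothetical per-block probability, not a value from the slides.
    Each contaminated position is filled with a randomly chosen masquerader block.
    """
    blocks, labels = [], []
    for block in target_blocks:
        if rng.random() < p_insert:
            blocks.append(rng.choice(masquerader_blocks))
            labels.append(True)
        else:
            blocks.append(block)
            labels.append(False)
    return blocks, labels
```

The returned labels give the ground truth needed to score any detector block by block.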

SLIDE 8

SLIDE 9

Experiments and Data

  • UNIX Processes:
    – System-call traces (Computer Immune System Research, University of New Mexico)
    – Normal data: synthetic and live
    – Intrusion data: real intrusions

SLIDE 10

High-order Markov Chain Model

  • High-order vs. regular Markov model
  • Problem: Huge Parameter Space
  • Mixture Transition Distribution (MTD) (Raftery 1985; Raftery and Tavaré 1994)
    – Auto-regressive
    – Only one extra parameter is added to the model for each extra lag
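To make the "huge parameter space" point concrete, the following sketch counts free parameters for a full order-l Markov chain on K states and for an MTD model with a single shared transition matrix. The function names are illustrative, and the MTD count assumes one K-by-K matrix plus l lag weights that sum to one.

```python
def full_markov_free_params(K, l):
    """Free parameters of a full order-l chain: K**l histories, each with K - 1 free probabilities."""
    return K ** l * (K - 1)


def mtd_free_params(K, l):
    """Free parameters of an MTD model: one K x K row-stochastic matrix plus l weights summing to 1."""
    return K * (K - 1) + (l - 1)
```

For example, with K = 10 states and l = 3 lags, the full chain has 9000 free parameters while the MTD model has 92, and each additional lag adds exactly one parameter to the MTD count.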

SLIDE 11

High-order Markov Chain Model: MTD Model

$$
P(X_t = s_{i_0} \mid X_{t-1} = s_{i_1}, \ldots, X_{t-l} = s_{i_l}) = \sum_{j=1}^{l} \lambda_j \, r(s_{i_0} \mid s_{i_j}), \qquad t = l+1, l+2, \ldots
$$

where $R = \{r(s_i \mid s_j)\}$ and $\Lambda = \{\lambda_j\}$ satisfy

$$
r(s_i \mid s_j) \ge 0, \qquad \sum_{i=1}^{K} r(s_i \mid s_j) = 1 \quad \forall j = 1, \ldots, K, \qquad \lambda_j \ge 0, \qquad \sum_{j=1}^{l} \lambda_j = 1.
$$
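As a small illustration of the MTD form, the sketch below evaluates the conditional probability of the next state as a weighted sum of one-step transition probabilities, one term per lag. The argument layout (`history[0]` is the most recent state, `R[prev][nxt]` stores r(nxt | prev)) is an assumption made for this example.

```python
def mtd_transition_prob(next_state, history, lambdas, R):
    """P(X_t = next_state | history) under an MTD model.

    history[j] is the state at lag j + 1 (history[0] is X_{t-1}).
    lambdas[j] is the weight of lag j + 1; the weights are assumed to sum to 1.
    R[prev][nxt] holds r(nxt | prev); each row of R is assumed to sum to 1.
    """
    if len(history) != len(lambdas):
        raise ValueError("history and lambdas must have the same length")
    return sum(lam * R[prev][next_state] for lam, prev in zip(lambdas, history))
```

Because each row of R and the lag weights both sum to one, the returned values sum to one over all possible next states.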

SLIDE 12

High-order Markov Chain Model: MTD Model, Parameter Estimation via MLE

$$
\log L(x_1, \ldots, x_T) = \sum_{i_0=1}^{K} \cdots \sum_{i_l=1}^{K} N(s_{i_0}, \ldots, s_{i_l}) \, \log\!\left( \sum_{j=1}^{l} \lambda_j \, r(s_{i_0} \mid s_{i_j}) \right)
$$

  • Direct maximization: sequential quadratic programming algorithm, but …
  • Alternating maximization
    – Fix r(.|.): easy
    – Fix λ: still too many parameters

With k indexing the tuples $(i_0, \ldots, i_l)$, $a_k = N(s_{i_0}, \ldots, s_{i_l})$ and $b_k = \sum_{j=1}^{l} \lambda_j \, r(s_{i_0} \mid s_{i_j})$, the log-likelihood is

$$
\sum_k a_k \log b_k \quad \text{where} \quad \sum_k a_k = T - l \quad \text{and} \quad \sum_k b_k = K^l
$$

SLIDE 13

High-order Markov Chain Model: MTD Model, MLE

$$
\hat{b}_k = \frac{a_k}{\sum_k a_k} \sum_k b_k = \frac{K^l}{T - l} \, a_k \quad \forall k
$$

It is equivalent to solve the following linear system for b (or λ):

$$
\sum_{j=1}^{l} \hat{\lambda}_j \, r(s_{i_0} \mid s_{i_j}) = \frac{K^l}{T - l} \, N(s_{i_0}, \ldots, s_{i_l}) \quad \forall (i_0, \ldots, i_l)
$$

This can be "solved" efficiently using the EM algorithm, in the sense of minimizing the K-L distance.
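One way to picture the "fix r(.|.), estimate λ" step is the textbook EM update for mixture weights, sketched below. It is offered as an assumption about what such a step could look like, not as the author's exact algorithm; `R[prev][nxt]` stores r(nxt | prev), and the iteration count is arbitrary.

```python
from collections import Counter


def count_tuples(seq, l):
    """Count tuples (x_t, x_{t-1}, ..., x_{t-l}); the counts sum to T - l."""
    counts = Counter()
    for t in range(l, len(seq)):
        counts[tuple(seq[t - j] for j in range(l + 1))] += 1
    return counts


def em_lambdas(counts, R, l, n_iter=200):
    """Estimate the lag weights with R held fixed, using standard mixture-weight EM."""
    lambdas = [1.0 / l] * l
    for _ in range(n_iter):
        expected = [0.0] * l
        for tup, n in counts.items():
            cur, lags = tup[0], tup[1:]
            # E-step: share of this tuple's probability contributed by each lag
            parts = [lambdas[j] * R[lags[j]][cur] for j in range(l)]
            denom = sum(parts)
            if denom == 0.0:
                continue  # tuple has zero probability under the current model
            for j in range(l):
                expected[j] += n * parts[j] / denom
        norm = sum(expected)
        if norm == 0.0:
            return lambdas
        # M-step: renormalise the expected lag counts
        lambdas = [v / norm for v in expected]
    return lambdas
```

Each update keeps the weights non-negative and summing to one, so the constraints on λ are satisfied without explicit constrained optimisation.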

SLIDE 14

High-order Markov Chain Model: Application to Command Data

  • Exhaustive Command Space (ECS) Model:
    – Treat all commands as Markov chain states
  • Partial Command Space (PCS) Model:
    – Treat frequently used commands as Markov chain states, and use "other" to represent the rest
  • Modification for "other"
    – r(other | .) are small
    – r(. | other) are equal
  • Use the parameter estimates as the user profile

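The partial command space idea can be sketched as a simple recoding step: keep the most frequent commands as states and map everything else to a single "other" state. The cutoff `n_frequent` is a hypothetical parameter; the slides do not say how "frequently used" is decided.

```python
from collections import Counter

OTHER = "other"  # catch-all state for commands outside the frequent set


def build_pcs_states(training_commands, n_frequent):
    """Return the n_frequent most common commands followed by the OTHER state."""
    counts = Counter(training_commands)
    frequent = [cmd for cmd, _ in counts.most_common(n_frequent)]
    return frequent + [OTHER]


def to_pcs(commands, states):
    """Recode a command sequence so that every non-frequent command becomes OTHER."""
    keep = set(states) - {OTHER}
    return [c if c in keep else OTHER for c in commands]
```

The recoded sequence can then be fed to any finite-state model whose state space is the returned list.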
SLIDE 15

High-order Markov Chain Model: Application to Command Data

  • Hypothesis Testing as a Decision Rule

H0: Command blocks are from user u
H1: Command blocks are NOT from user u

  • Likelihood-ratio-like test

$$
X_u = X(c_1, \ldots, c_T \mid \hat{\Lambda}_1, \ldots, \hat{\Lambda}_U, \hat{R}_1, \ldots, \hat{R}_U) = \log\!\left[ \frac{\max_{v \ne u} L(c_1, \ldots, c_T \mid \hat{\Lambda}_v, \hat{R}_v)}{L(c_1, \ldots, c_T \mid \hat{\Lambda}_u, \hat{R}_u)} \right]
$$

Reject $H_0$ if $X_u > w$.
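The decision rule can be sketched in a few lines once the log-likelihood of a command block under each user's fitted model is available. The dictionary layout and the threshold name `w` are assumptions for this example, and the competing model is taken to be the best-fitting user other than u.

```python
def likelihood_ratio_statistic(loglik_by_user, u):
    """X_u = max over v != u of log L_v, minus log L_u, for one command block.

    loglik_by_user maps a user id to the block's log-likelihood under that user's model.
    """
    rivals = [ll for v, ll in loglik_by_user.items() if v != u]
    if not rivals:
        raise ValueError("need at least one user other than u")
    return max(rivals) - loglik_by_user[u]


def reject_h0(loglik_by_user, u, w):
    """Reject 'the block is from user u' when the statistic exceeds the threshold w."""
    return likelihood_ratio_statistic(loglik_by_user, u) > w
```

A large positive statistic means some other user's profile explains the block much better than u's own profile.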

SLIDE 16

Hybrid High-order Markov Chain Model

In case of no or not enough training data: independence model

$$
P(c_1, \ldots, c_T \mid \text{user } u) = \prod_{t=1}^{T} P(c_t \mid \text{user } u) = \prod_{t=1}^{T} q_{u c_t}
$$

Estimate the q's using modified user/command counts
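A minimal sketch of the independence model follows. The slide says only that the q's come from "modified" user/command counts; the add-a-pseudo-count smoothing used here is an assumption standing in for that modification, chosen so that unseen commands do not get probability zero.

```python
import math
from collections import Counter


def estimate_q(user_commands, vocabulary, pseudo_count=1.0):
    """Smoothed per-command probabilities for one user.

    pseudo_count is a hypothetical smoothing constant, not a value from the slides.
    Every command in user_commands is assumed to appear in vocabulary.
    """
    counts = Counter(user_commands)
    total = len(user_commands) + pseudo_count * len(vocabulary)
    return {c: (counts[c] + pseudo_count) / total for c in vocabulary}


def independence_loglik(block, q):
    """Log-probability of a command block when commands are treated as independent draws."""
    return sum(math.log(q[c]) for c in block)
```

Log-likelihoods from this model for different users can be compared in the same ratio form as for the Markov model.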

SLIDE 17

Hybrid High-order Markov Chain: Application to Command Data

  • Test statistics

$$
X_{1u}(c_1, \ldots, c_T \mid \text{user } u) = \log\!\left[ \frac{\max_{v \ne u} L(c_1, \ldots, c_T \mid \hat{\Lambda}_v, \hat{R}_v)}{L(c_1, \ldots, c_T \mid \hat{\Lambda}_u, \hat{R}_u)} \right]
$$

$$
X_{2u}(c_1, \ldots, c_T \mid \text{user } u) = \log\!\left[ \frac{\max_{v \ne u} \prod_{i=1}^{T} \hat{q}_{v c_i}}{\prod_{i=1}^{T} \hat{q}_{u c_i}} \right]
$$

$X'_{2u}$: a three-case piecewise transformation of $X_{2u}$ involving constants $\rho$ and $\tau$ (the case definitions are not legible in this transcript).

SLIDE 18

Hybrid High-order Markov Chain: Application to Command Data

  • Hybrid test statistic

$$
X_u =
\begin{cases}
X_{1u}, & \text{if } s/T \le \xi_1 \\[4pt]
\dfrac{\xi_2 - s/T}{\xi_2 - \xi_1} \, X_{1u} + \dfrac{s/T - \xi_1}{\xi_2 - \xi_1} \, X'_{2u}, & \text{if } \xi_1 \le s/T \le \xi_2 \\[4pt]
X'_{2u}, & \text{if } s/T > \xi_2
\end{cases}
$$

s: # of "other" in $\{c_1, \ldots, c_T\}$

SLIDE 19

Rarity of Occurrence Model

  • Motivation: Depend not only on frequency
    – Schonlau and Theus (2000)
  • Rarity of Command(s)
    – Popular and frequently used
    – Popular but not frequently used
    – Rare or unique
  • Define the rarity index of a command based on the number of users who used this command

SLIDE 20

Rarity of Occurrence Model

  • Rarity Index Example:
    – Total 50 users
    – A command used by only 1 user: 50/50
    – A command used by all 50 users: 1/50
    – A command used by no users: ½ (?)
    – Defined for both an individual command and a short sequence of commands
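The two worked values on this slide (50/50 for a command used by one user, 1/50 for a command used by all 50) are consistent with a linear index of the form (N - n + 1) / N, where n is the number of users who used the command and N is the total number of users. The sketch below assumes that form, which the slide does not state explicitly, and treats the value for never-seen commands as a tunable default because the slide itself marks ½ with a question mark.

```python
from fractions import Fraction


def rarity_index(n_users_using, n_users_total, unseen_value=Fraction(1, 2)):
    """Rarity of a command (or short command sequence) given how many users used it.

    Assumed linear form (N - n + 1) / N; unseen_value is the tentative value
    the slide gives for a command that no user has used.
    """
    if not 0 <= n_users_using <= n_users_total:
        raise ValueError("n_users_using must be between 0 and n_users_total")
    if n_users_using == 0:
        return unseen_value
    return Fraction(n_users_total - n_users_using + 1, n_users_total)
```

Exact fractions are used so the results can be compared directly with the values quoted on the slide.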

SLIDE 21

Rarity of Occurrence Model

  • Anomaly signal of user u's short command sequence (c_k1, …, c_kl) defined as the weighted rarity index
    – Weight (+/-) depends on frequency
    – Case 1: User u has used Pu
    – Case 2: User u didn't use Pu, but has used all the commands
    – Case 3: User u didn't use all the commands
  • Test score is defined as a weighted sum of anomaly signals

SLIDE 22

Rarity of Occurrence Model

  • Entropy model (only tried on the system call data)
    – Motivation
    – Shannon's entropy of distribution {p_i}
    – Small entropy indicates abnormality
    – Test score is defined as the sum of weighted entropies
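For reference, Shannon's entropy of a distribution {p_i} is the negative sum of p_i log p_i. The sketch below computes it for an explicit distribution and for the empirical distribution of a sequence of items; the base-2 logarithm is an assumption, since the slide does not specify a base, and the weighting used in the test score is not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(probs):
    """Entropy in bits of a probability distribution; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def empirical_entropy(items):
    """Entropy in bits of the empirical distribution of the given items."""
    counts = Counter(items)
    n = len(items)
    return shannon_entropy(c / n for c in counts.values())
```

A sequence dominated by a single item has entropy near zero, while a sequence spread evenly over many items has high entropy.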

SLIDE 23

UNIX command results

SLIDE 24

SLIDE 25

SLIDE 26

Discussion

  • Hybrid High-order Markov Chain Model
    – Multi-layer defense scheme
    – Computation demand
    – Likelihood-ratio
  • Rarity of Occurrence Model
    – Good performance
    – Global information is important
  • Future study
    – Utilizing more information
    – Relaxing experiment limitations
    – Other audit data formats

SLIDE 27

Conclusion