Yihua Liao, V. Rao Vemuri; Mingxing Gong; CISC850 Cyber Analytics (PowerPoint presentation)



SLIDE 1

Use of K-Nearest Neighbor classifier for intrusion detection

Yihua Liao, V. Rao Vemuri

Mingxing Gong

CISC850 Cyber Analytics

SLIDE 2

Outline

  • Introduction
  • Methodology
  • Experiments
  • Discussion & Conclusion
SLIDE 3

Outline

  • Introduction
  • Methodology
  • Experiments
  • Discussion & Conclusion
SLIDE 4

Introduction

▪ High false alarm probability or low attack detection accuracy
▪ Two general approaches:

  • Misuse detection
  • Anomaly detection

▪ Local ordering vs. frequency of system calls

SLIDE 5

Nearest Neighbour Rule

Consider a two-class problem where each sample consists of two measurements (x, y). Compute the k nearest neighbours and assign the class by majority vote.

[Figure: nearest-neighbour examples for k = 1 and k = 3]

Reference: www.robots.ox.ac.uk/~dclaus/cameraloc/samples/nearestneighbour.ppt
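
The majority-vote rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `knn_classify`, the Euclidean distance, and the toy points are all assumptions chosen so that k = 1 and k = 3 give different answers.

```python
import math
from collections import Counter

def knn_classify(query, samples, k):
    """Classify `query` by majority vote among its k nearest samples.

    `samples` is a list of ((x, y), label) pairs; distance is Euclidean
    (an assumption, the slide does not name a metric).
    """
    by_distance = sorted(samples, key=lambda s: math.dist(query, s[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two classes in the (x, y) plane.
samples = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((3.0, 3.0), "B"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
]
query = (2.9, 2.9)

print(knn_classify(query, samples, k=1))  # nearest single point is class B
print(knn_classify(query, samples, k=3))  # two of the three nearest are class A
```

With an odd k and two classes the vote cannot tie, which is one reason odd values of k are a common choice.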

SLIDE 6

Outline

  • Introduction
  • Methodology
  • Experiments
  • Discussion & Conclusion
SLIDE 7

Methodology

  • Apply text categorization methods to intrusion detection
SLIDE 8

Methodology

  • Each document is represented by a vector of words
  • Weighting approach: tf·idf (term frequency – inverse document frequency)

  • The cosine similarity is defined as follows:
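
In its standard form, the cosine similarity between two weighted vectors X and D is cos(X, D) = (X · D) / (||X|| ||D||). The sketch below pairs it with one common tf·idf variant, weight = tf × log(N / df). The exact normalization used on the original slide is not recoverable, so treat the weighting, the function names, and the sample system-call lists as assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: f * math.log(n_docs / df[t]) for t, f in tf.items()})
    return vectors

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_u * norm_v)

# Hypothetical "documents": each process is the list of system calls it issued.
processes = [
    ["open", "read", "read", "close"],
    ["open", "read", "close", "close"],
    ["open", "execve", "chmod", "close"],
]
vecs = tfidf_vectors(processes)
print(cosine_similarity(vecs[0], vecs[1]))  # close to 1: both differ from the rest only by "read"
print(cosine_similarity(vecs[0], vecs[2]))  # 0: no informative calls in common
```

Note that calls appearing in every process ("open" and "close" here) receive weight zero under this variant, so only the less common calls contribute to the similarity.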
SLIDE 9

Outline

  • Introduction
  • Methodology
  • Experiments
  • Discussion & Conclusion
SLIDE 10

Experiments

  • DARPA data
  • Cross validation and 50 distinct system calls
SLIDE 11

KNN classifier algorithm for anomaly detection
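
The algorithm figure for this slide is not available in the text, so the following is only a plausible sketch of kNN-based anomaly detection under stated assumptions: the training set holds profiles of normal processes only, each process is a frequency vector of system calls, a process containing a never-seen call is treated as abnormal, and the decision compares the average cosine similarity of the k nearest profiles with a threshold. The names `is_normal`, `k`, and `threshold`, and their default values, are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-likes."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def is_normal(process, normal_profiles, k=3, threshold=0.8):
    """Return True if the k most similar normal profiles have an average
    cosine similarity above `threshold` (assumed decision rule)."""
    if not normal_profiles:
        raise ValueError("need at least one normal profile")
    x = Counter(process)  # frequency weighting of system calls
    known_calls = set().union(*normal_profiles)
    if not set(x) <= known_calls:
        return False  # assumption: an unseen system call marks the process abnormal
    sims = sorted((cosine(x, d) for d in normal_profiles), reverse=True)
    top_k = sims[:k]
    return sum(top_k) / len(top_k) > threshold

# Hypothetical training profiles built from normal processes.
normal_profiles = [
    Counter(["open", "read", "read", "close"]),
    Counter(["open", "read", "write", "close"]),
    Counter(["open", "read", "close"]),
]

print(is_normal(["open", "read", "close", "close"], normal_profiles))
print(is_normal(["open", "write", "write", "write", "write"], normal_profiles))
print(is_normal(["open", "execve"], normal_profiles))
```

Each test process is compared against every training profile once, so the cost of classifying one process grows linearly with the size of the training set.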

SLIDE 12

KNN classifier performance
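
The performance figure is not available in the text. As a hedged illustration of how such performance is typically measured, the sketch below computes the detection rate and false positive rate of a threshold-based detector. The function name `rates`, the scores, and the labels are made up for the example and are not results from the presentation.

```python
def rates(scores, labels, threshold):
    """Return (detection_rate, false_positive_rate) at a given threshold.

    `scores` are similarity scores (higher means more normal) and `labels`
    are True for intrusive processes. A process is flagged as intrusive
    when its score is at or below `threshold`.
    """
    flagged = [s <= threshold for s in scores]
    attacks = sum(labels)
    normals = len(labels) - attacks
    true_pos = sum(f and l for f, l in zip(flagged, labels))
    false_pos = sum(f and not l for f, l in zip(flagged, labels))
    return true_pos / attacks, false_pos / normals

# Hypothetical scores: four normal processes followed by two intrusive ones.
scores = [0.95, 0.90, 0.85, 0.60, 0.40, 0.30]
labels = [False, False, False, False, True, True]

for t in (0.35, 0.5, 0.7):
    print(t, rates(scores, labels, t))
```

Sweeping the threshold and plotting detection rate against false positive rate gives a ROC curve, which makes the trade-off between the two explicit.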

SLIDE 13

Anomaly Detection

  • The overall running time of the kNN method is O(N)
  • Integrate with signature verification

SLIDE 14

Frequency Weighting vs. tf·idf Weighting

SLIDE 15

Frequency Weighting vs. tf·idf Weighting

SLIDE 16

Outline

  • Introduction
  • Methodology
  • Experiments
  • Discussion & Conclusion
SLIDE 17

Discussion

  • kNN Classifier advantages
  • Compared tf·idf weighting with the frequency weighting
  • Classification cost can be further reduced by using only the most influential system calls

SLIDE 18

Conclusion

  • The kNN classifier can effectively detect intrusive program behavior with a low false positive rate

  • Further research is in progress to investigate the reliability and scaling properties of the kNN classifier method

SLIDE 19

References

[1] www.robots.ox.ac.uk/~dclaus/cameraloc/samples/nearestneighbour.ppt
[2] Yihua Liao, V. Rao Vemuri, ‘Use of K-Nearest Neighbor classifier for intrusion detection’, Computers & Security, Volume 21, Issue 5, 1 October 2002, Pages 439-448