SLIDE 1

Anomaly Detection Algorithms for Malware Traffic Analysis using Tamper Resistant Features

Dr. Patrick McDaniel

Berkay Celik Fall 2015

SLIDE 2

Introduction
Motivation
Related Work
Data
Approach
Experimental Results
Comparison with Previous Work
Conclusion and Discussion
References

SLIDE 3

Malware Infection

Image credit: http://www.vblaze.com/

SLIDE 4

Malware/Legitimate Communication

Do features extracted from packet headers discriminate legitimate applications from malware traffic?

How many packets should be aggregated for feature extraction?

Which feature subset should be used for detection?

SLIDE 5

Goal:

To analyze and detect the network-level behavior of malware traffic after it blends into normal traffic:

Focus on detecting malware heartbeat traffic
Features should be tamper resistant (i.e., not easy to fool, unlike port numbers or flags in packet headers)
Malware traffic is rare, which motivates the evaluation of anomaly detection algorithms

SLIDE 6

Related Work

Current state of the art

Systems based on known signatures: Well studied; a drawback of these systems is that they do not detect unknown malware traffic
Payload inspection: Vulnerable to privacy issues, payload encryption, and limitations in processing high-speed (multigigabit) networks
Feature representation: Drawbacks in selecting “tamper-proof” features, such as using port numbers, payload information, protocol-specific information, and unrealistic malware traffic features when modelling the traffic
Supervised classification algorithms: The requirement of targeted anomalous samples is a disadvantage of these approaches

SLIDE 7

Dataset

Legitimate traffic features traces of a small-scale organization network recorded at the University of Twente, with around 35 employees and over 100 students

Legitimate traffic: 7753 x 13 instances
Malware traffic: 3513 x 13 instances (16 different malware families in total)

Image credit: http://www.vblaze.com/

SLIDE 8

Feature space (13 features, all continuous):

Flow duration: Difference between last packet time and first packet time
Count of Payload (+): The count of all the packets with at least a byte of data payload
Min data size (+): Minimum payload size observed
Mean of bytes (-): Data bytes divided by the total number of packets
Initial Data Length (*): The total number of bytes sent in the initial window
RTT samples (*): Total number of RTT samples found in total packets
Median and Variance of bytes (+): Median and variance of total packet bytes
IP ratio (*): Ratio between the maximum packet size and minimum packet size
Goodput (*): Total number of frame bytes divided by flow duration
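
As a rough illustration of the definitions above, the following Python sketch computes several of these features for a single flow. The `Packet` record, its field names, and the `flow_features` helper are hypothetical and are not taken from the slides. Several details are assumptions: the minimum data size is taken over packets that carry payload, the variance is the population variance, and "packet bytes" is read as the frame size. Initial data length and RTT samples are omitted because they need TCP state that this simple record does not carry.

```python
from dataclasses import dataclass
from statistics import median, pvariance


@dataclass
class Packet:
    """Hypothetical per-packet record; field names are illustrative only."""
    timestamp: float    # capture time in seconds
    frame_bytes: int    # total bytes on the wire, headers included
    payload_bytes: int  # bytes of data payload


def flow_features(packets):
    """Compute a subset of the listed features for one flow."""
    if not packets:
        raise ValueError("a flow needs at least one packet")
    times = [p.timestamp for p in packets]
    sizes = [p.frame_bytes for p in packets]
    # Packets with at least one byte of data payload.
    payloads = [p.payload_bytes for p in packets if p.payload_bytes > 0]

    duration = max(times) - min(times)
    return {
        "flow_duration": duration,
        "count_of_payload": len(payloads),
        # Assumption: minimum taken over payload-carrying packets only.
        "min_data_size": min(payloads) if payloads else 0,
        "mean_of_bytes": sum(p.payload_bytes for p in packets) / len(packets),
        "median_of_bytes": median(sizes),
        # Assumption: population variance of the frame sizes.
        "variance_of_bytes": pvariance(sizes),
        # Assumes every frame has a positive size.
        "ip_ratio": max(sizes) / min(sizes),
        # Single-packet flows have zero duration; report 0.0 in that case.
        "goodput": sum(sizes) / duration if duration > 0 else 0.0,
    }
```

None of these quantities depend on port numbers or header flags, which is the property the slides call tamper resistance.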

SLIDE 9

Feature selection:

These papers are the guidelines for the feature selection process:

Wei Li, Marco Canini, Andrew W. Moore, and Raffaele Bolla. Efficient application identification and the temporal and spatial stability of classification schema. Computer Networks, 2009
A. Moore, D. Zuev, and M. Crogan. Discriminators for use in flow-based classification. Queen Mary and Westfield College, Department of Computer Science, 2005
Terry Nelms, Roberto Perdisci, and Mustaque Ahamad. ExecScent: Mining for new C&C domains in live networks with adaptive control protocol templates. In USENIX Security, 2013

SLIDE 10

Approach

Overview of Framework

Steps to achieve the goal

SLIDE 11

Approach

One-class support vector machine (OCSVM)
The distance to the kth nearest neighbor (k-NN)
K-means clustering by finding the distance from data to the nearest cluster centre
Least squares anomaly detection (LSAD) based on the least squares probabilistic classifier

Image from official Scikit-learn, One-class SVM
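
To make one of the listed detectors concrete, here is a minimal Python sketch of the k-NN distance score: each test flow is scored by its distance to its kth nearest legitimate training flow. The function name `knn_distance_scores`, the Euclidean metric, and the default `k` are assumptions for illustration; the slides do not state the distance metric or parameter values used.

```python
import math


def knn_distance_scores(train, queries, k=3):
    """Score each query by its Euclidean distance to its k-th nearest training point.

    Larger scores mean the query lies farther from the (legitimate)
    training data and is therefore more anomalous.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training points")
    scores = []
    for q in queries:
        # Brute-force search: sort all distances and pick the k-th smallest.
        dists = sorted(math.dist(q, t) for t in train)
        scores.append(dists[k - 1])
    return scores
```

A threshold on the score then separates flows flagged as anomalous from those accepted as legitimate. The brute-force search is quadratic and is only meant to show the idea.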

SLIDE 12

Approach

Evaluation Metrics

AUC (Area Under Curve)
ROC curve when necessary
Further experiments for analysis of malware traffic:
Confusion matrix
False positive and false negative counts
Interpretation of PCA and K-means clustering
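
The two main quantities above can be sketched in a few lines of Python. The helper names `auc` and `confusion_counts` are hypothetical, and the sketch assumes that higher scores mean "more anomalous". AUC is computed here through its pairwise interpretation: the probability that a randomly chosen malware flow scores higher than a randomly chosen legitimate flow, with ties counted as one half.

```python
def auc(malware_scores, legit_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    if not malware_scores or not legit_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for m in malware_scores:
        for g in legit_scores:
            if m > g:
                wins += 1.0
            elif m == g:
                wins += 0.5  # ties count as half a win
    return wins / (len(malware_scores) * len(legit_scores))


def confusion_counts(malware_scores, legit_scores, threshold):
    """Confusion-matrix counts when scores >= threshold are flagged as malware."""
    tp = sum(s >= threshold for s in malware_scores)
    fp = sum(s >= threshold for s in legit_scores)
    return {
        "tp": tp,
        "fn": len(malware_scores) - tp,  # malware flows missed
        "fp": fp,                        # legitimate flows falsely flagged
        "tn": len(legit_scores) - fp,
    }
```

AUC summarizes performance over all thresholds, while the confusion counts describe a single operating point.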

SLIDE 13

Experimental Setup:

Hyperparameters are set using a subset of the training set
Stratified k-fold cross validation (k is selected depending on the malware traffic size) or random sampling is applied, depending on the number of malware instances
A paired t-test with significance level 0.05 is used to report the differences between the algorithms' AUC values
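
The paired t-test compares two algorithms on the same folds by testing whether the mean of their per-fold AUC differences is zero. The sketch below computes only the test statistic and its degrees of freedom. The function name `paired_t_statistic` is hypothetical, and the AUC values in the usage note are invented purely to show the call, not results from the report.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(auc_a, auc_b):
    """Return (t, degrees_of_freedom) for paired per-fold AUC values."""
    if len(auc_a) != len(auc_b) or len(auc_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 folds")
    diffs = [a - b for a, b in zip(auc_a, auc_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

With five folds, for example `paired_t_statistic([0.90, 0.92, 0.88, 0.91, 0.89], [0.85, 0.86, 0.84, 0.88, 0.83])`, there are 4 degrees of freedom, and the difference is significant at the 0.05 level (two-sided) when the absolute value of t exceeds the critical value 2.776.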

SLIDE 14

Experimental Results

SLIDE 15

Experimental Results:

The average ROC curve plots the percentage of correctly classified malicious samples (true positive rate) against the percentage of legitimate samples falsely classified as malicious (false positive rate)

(More details of the ROC curve for each fold are given in the report)
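
The definition above translates directly into a threshold sweep. The following Python sketch returns the (false positive rate, true positive rate) points of a ROC curve for one fold. The function name `roc_points` is hypothetical, and the sketch assumes that higher scores mean "more anomalous" and that both score lists are non-empty.

```python
def roc_points(malware_scores, legit_scores):
    """Return (fpr, tpr) pairs obtained by lowering the decision threshold."""
    # Every distinct score is a candidate threshold, from strictest to loosest.
    thresholds = sorted(set(malware_scores) | set(legit_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for th in thresholds:
        tpr = sum(s >= th for s in malware_scores) / len(malware_scores)
        fpr = sum(s >= th for s in legit_scores) / len(legit_scores)
        points.append((fpr, tpr))
    return points
```

An averaged curve over folds can then be formed by interpolating each fold's points onto a common false positive rate grid; how the report averages its folds is not stated in the slides.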

SLIDE 16

Experimental Results:

ROC curves with cross validation plot the percentage of correctly classified malicious samples (true positive rate) against the percentage of legitimate samples falsely classified as malicious (false positive rate)

Kaiten vs Neris malware (More details of the ROC curve for each fold are given in the report)

SLIDE 17

Lessons Learned from initial results:

No single algorithm performs better than the others
Detection results degrade with the recent evolution of malware families, e.g., Zeus V1 to Zeus V2
Recent malware traffic has become stealthy and evades detection (disguising traffic)

SLIDE 18

Understanding the sources of false negatives and false positives:

Number of malware flows classified as legitimate HTTP(S)

Mean values; std is in the range +/- 0.53 for all families
Port numbers are used as ground truth labels
C4.5 algorithm is used for classification

Number of legitimate HTTP(S) flows classified as malware

SLIDE 19

Detailed Analysis:

Confusion Matrix after cross validation

Base Classifier (majority class) vs. C4.5 algorithm

(More details are given in the report)

SLIDE 20

Network Behavior of Malware Families:

Log-scale plot of the incoming and outgoing packet byte ratio

Most similar HTTP traffic observed between malware and legitimate traces, from constant packet ratio to varying packet ratio

SLIDE 21

Analysis of Feature Space of Malware (Code Reuse):

Feature projection to two-dimensional space using PCA and K-means clustering

Tbot and Kaiten are close to each other and form a single cluster. However, Agabot is not as close as the other malware families. Zeus V1, ZeusGameover, ZeusPonyloader, Zeus V2 and Sality fall in a similar feature range, and most of their instances are assigned to the same clusters
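
The clustering half of this analysis can be sketched with plain Lloyd's algorithm. The Python example below covers K-means only and assumes the flows have already been projected to two dimensions (for example by PCA, which is not shown). The function name `kmeans`, the seeding with the first `k` points, and the fixed iteration count are assumptions for illustration and are not taken from the report.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centres).

    The first k points seed the centres, which is an arbitrary but
    deterministic choice made for this sketch.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centres = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centres
```

Families whose projected flows receive the same label fall in a similar feature range, which is the kind of evidence the slide uses to suggest code reuse.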

SLIDE 22

Recent papers:

Looks for multiple sources of information, i.e., features extracted not only from packets but also from IP addresses, DNS features, HTTP requests, etc.

T. Nelms, R. Perdisci, and M. Ahamad. ExecScent: Mining for new C&C domains in live networks with adaptive control protocol templates. In Proc. USENIX Security Symposium, 2013

Focuses on the before-infection phase; we assume that hosts are already infected and generate traffic. More challenging...

L. Invernizzi, S.-J. Lee, S. Miskovic, M. Mellia, R. Torres, C. Kruegel, S. Saha, and G. Vigna. Nazca: Detecting malware distribution in large-scale networks. In Proc. Network and Distributed System Security Symposium (NDSS), 2014

Detection accuracy is mostly high due to the use of “tamper-proof” features: port numbers, flags and payload are used

SLIDE 23

Conclusion/Discussion

Presented a framework that evaluates the detection performance of malware heartbeat traffic after blending into legitimate applications
Our framework effectively discriminates most of the C&C heartbeat traffic from legitimate traffic by only using tamper resistant features of the transport layer protocol
We observe a substantial decrease in detection with the recent malware families
Malware traffic is disguised in HTTP traffic to conduct an evasion attack
Code reuse is common practice in malware families
Provide a discussion of the importance of using a tamper resistant feature space, and multiple sources of information to alleviate the false negatives by improving the underlying feature space

SLIDE 24

Key Papers

Feature Selection:
Wei Li, Marco Canini, Andrew W. Moore, and Raffaele Bolla. Efficient application identification and the temporal and spatial stability of classification schema. Computer Networks, 2009

Methodology and Insights:
F. Kocak, D. J. Miller, and G. Kesidis. Detecting anomalous latent classes in a batch of network traffic flows. In Proc. Information Sciences and Systems (CISS), 2014

Anomaly Detection Algorithms:
V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection: A survey. In ACM Computing Surveys, 2009

State of the art paper in this research area:
G. Gu, R. Perdisci, J. Zhang, W. Lee, et al. BotMiner: Clustering analysis of network traffic for protocol- and structure-independent botnet detection. In Proc. USENIX Security Symposium, 2008

SLIDE 25

QUESTIONS

Anomaly Detection Algorithms for Malware Traffic Analysis using Tamper Resistant Features
