Malware Analysis: Arun Lakhotia, University of Louisiana at Lafayette

SLIDE 1

7/18/2017 ISSISP 2017- (C) Lakhotia 1

Malware Analysis

Arun Lakhotia University of Louisiana at Lafayette, USA Presented at ISSISP 2017, CNRS Gif Sur Yvette

SLIDE 2

Introduction

Professor of Computer Science Founder, CEO

SLIDE 3

Geolocation

SLIDE 4

Plan of talk

  • Malware detection in practice
  • Binary Analysis
  • Challenges in Binary Analysis

SLIDE 5

Malware detection

In Practice

SLIDE 6

What is Malware?

“Software that steals your data. Software that destroys your data. Software that abuses your machine.” @Pinkflawd

SLIDE 7

Types of malware

  • Ransomware
  • Botnets
  • Password Stealers
  • Remote-Access-Trojans (RATs)
  • Click-jackers (stealing ad clicks)
  • Banking Trojans
  • SCADA disruptors

SLIDE 8

How to determine something is malware?

  • Run it
  • Observe if it
      • steals or destroys your data
      • abuses your machine

SLIDE 9

Determining malware in practice

Individually testing each program on every machine for maliciousness is not feasible.

In reality:

  • Someone observes some unexpected activity
  • Traces activity to a program
  • Passes it on to a security expert
  • Expert analyzes to confirm
  • Creates a ‘profile’ of the program
  • Uses the ‘profile’ to detect other occurrences of the malware

SLIDE 10

Malware Detection Process (Theory)

[Diagram. AV LAB: Suspect → Malicious: yes/no → Profile. IN THE WILD: File/Message → Scanner (using the Profile) → Malicious: yes/no.]

SLIDE 11

Virus (Malware) Identification

Anti-Virus

Signature

Virus Form - A

Antivirus scanners use extracted patterns, or “signatures”, to identify known malware.

SLIDE 12

Static Signature

Hex strings from virus variants

    67 33 74 20 73 38 6D 35 20 76 37 61
    67 36 74 20 73 32 6D 37 20 76 38 61
    67 39 74 20 73 37 6D 33 20 76 36 61

Hex string for detecting virus

    67 ?? 74 20 73 ?? 6D ?? 20 76 ?? 61

?? = wildcard
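
A minimal sketch of how a scanner might apply such a wildcard signature, assuming the signature is written as space-separated hex tokens with `??` for "any byte". The helper names `compile_signature` and `scan`, and the bytes placed around each variant, are illustrative assumptions and not part of the talk.

```python
import re

def compile_signature(signature):
    """Compile a hex signature with ?? wildcards into a bytes regex."""
    parts = []
    for token in signature.split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL so that "." also matches the byte 0x0A (newline)
    return re.compile(b"".join(parts), re.DOTALL)

def scan(data, pattern):
    """Return the offset of the first match in data, or -1 if there is none."""
    match = pattern.search(data)
    return match.start() if match else -1

PATTERN = compile_signature("67 ?? 74 20 73 ?? 6D ?? 20 76 ?? 61")

# The three variants listed on the slide.
VARIANTS = [
    "67 33 74 20 73 38 6D 35 20 76 37 61",
    "67 36 74 20 73 32 6D 37 20 76 38 61",
    "67 39 74 20 73 37 6D 33 20 76 36 61",
]

for variant in VARIANTS:
    # Hypothetical file contents: two filler bytes, the variant, one trailing byte.
    sample = b"\x90\x90" + bytes.fromhex(variant) + b"\x00"
    print(scan(sample, PATTERN))
```

Every variant matches at the same offset, while a sample that changes one of the fixed bytes (for example the leading `67`) is not detected, which is the weakness later slides exploit.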

SLIDE 13

Dynamic Signature

Monitor a running program to detect malicious behavior

Examples

  • Analyze audit trails
  • Look at patterns of system calls

Allows examination of only selected test cases
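
One way to picture a system-call pattern is as an ordered list of calls that must appear in a recorded trace, with other calls allowed in between. The sketch below assumes exactly that; the call names, the `SUSPICIOUS` pattern, and both traces are hypothetical and chosen only for illustration.

```python
def matches_behavior(trace, pattern):
    """True if the calls in `pattern` occur in `trace` in order (gaps allowed)."""
    remaining = iter(trace)
    # `call in remaining` advances the iterator past the first occurrence,
    # so each later call must appear after the earlier ones.
    return all(call in remaining for call in pattern)

# Hypothetical dynamic signature: read a file, then send data over the network.
SUSPICIOUS = ["open", "read", "connect", "send"]

# Hypothetical traces recorded from two monitored runs.
trace_a = ["open", "read", "close", "socket", "connect", "send", "exit"]
trace_b = ["socket", "connect", "send", "open", "read", "exit"]

print(matches_behavior(trace_a, SUSPICIOUS))
print(matches_behavior(trace_b, SUSPICIOUS))
```

The second trace contains the same calls in a different order and does not match. A run that never reaches the relevant code path produces no trace to match at all, which is the "selected test cases" limitation noted on the slide.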

SLIDE 14

Malware detection ecosystem has a lot of sharing

[Diagram: Malware Repositories, AV Vendors, End customers, VirusTotal]

SLIDE 15

Suspect files, daily volume

SLIDE 16

Multiple-Scanner Report

SLIDE 17

Malware Detection Process (Practice)

[Diagram. AV LAB: Suspect → Malicious: yes/no → Profile. IN THE WILD: File/Message → Scanner (using the Profile) → Malicious: yes/no.]

SLIDE 18

Malware Definition: In practice

X is malware:

  • if it creates a huge hue and cry
  • if P out of S AV scanners (on VirusTotal, VT) say it is malware
  • if some customer reports it as suspect and a security analyst confirms

SLIDE 19

How to perform Community Voting

Use Hash(X) instead of X.

Hash(X) is malware if:

  • P out of S AV scanners (on VT) say it is malware

Community Voting is very rigid. It cannot check for unseen malware.
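
A rough sketch of hash-based community voting, assuming SHA-256 as the hash and an in-memory dictionary as the verdict store. The function names, scanner names, and sample bytes are hypothetical; a real deployment would query a service such as VirusTotal rather than a local dictionary.

```python
import hashlib

def sha256_hex(data):
    """Hash(X): identify a file by the SHA-256 digest of its bytes."""
    return hashlib.sha256(data).hexdigest()

def community_vote(data, reports, p):
    """Return True/False if the hash is known, or None if it was never seen.

    `reports` maps a hash to {scanner_name: flagged_as_malware}.
    The file counts as malware when at least `p` scanners flag it.
    """
    verdicts = reports.get(sha256_hex(data))
    if verdicts is None:
        return None  # unseen hash: voting has nothing to say
    return sum(1 for flagged in verdicts.values() if flagged) >= p

# Hypothetical sample and verdicts from S = 3 scanners.
sample = b"MZ hypothetical sample bytes"
reports = {
    sha256_hex(sample): {"scanner_a": True, "scanner_b": True, "scanner_c": False},
}

print(community_vote(sample, reports, p=2))
print(community_vote(sample + b"\x00", reports, p=2))
```

Appending a single byte changes the hash, so the second lookup finds no votes. That is the rigidity the slide points out: the scheme only recognizes files that have already been seen byte for byte.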

SLIDE 20

Other challenges related to Malware

  • Determine the objective of a malware
  • Determine the actors/creators
  • Disrupt botnets

SLIDE 21

BINARY ANALYSIS

SLIDE 22

Learn about you

Binary Analysis:

  • Level of knowledge: Level 1-5 (low-high)
  • How much do you care? Level 1-5

SLIDE 23

Binary Analysis – Why?

  • Debugging and Patching
  • Legacy Migration
  • Software Protection
      • Protecting IP
  • Software Cracking
  • Malicious Detection
      • Binary with undesired/unknown behavior

SLIDE 24

Binary Analysis Tools

STATIC

  • Hex editor
  • PE/ELF editors
  • Disassembler
  • Decompiler
  • Data/control flow
  • Abstract interpreter
  • Specialized checkers
      • Buffer overflow
  • Theorem provers

DYNAMIC

  • Debugger
  • Emulator
  • Run-time monitors
  • Network monitors
  • Fuzzers

MIXED – CONCOLIC

  • Combination of dynamic and static

SLIDE 25

History of analysis tools

50+ years of program analysis (PA)

  • compilers, security analysis, …

25+ for reverse engineering (RE)

  • design recovery, reengineering, evolution, …

Fundamental theories, algorithms, methods

  • program decomposition, abstraction
  • disassembly, flow graphs
  • liveness, dependence, dominance, …
  • clustering, abstraction, visualization, comparison

SLIDE 26

Compiler processing

parse → create control & data flow → generate code → object

SLIDE 27

Binary analysis, adapted from source

disassemble → extract procedures → extract control & data flow → verify property → result

SLIDE 28

Decomposing binaries

High Level Program (procedures are encapsulated)

    main() {
        Max(0xA, 0xB);
        Max(0xC, 0xD);
    }

    Max(int x, int y) {
        if (x > y) return 1;
        return 0;
    }

Disassembled Binary (no syntactic boundary for procedures)

    L01: PUSH 0xA
    L02: PUSH 0xB
    L03: CALL L08
    L04: PUSH 0xC
    L05: PUSH 0xD
    L06: CALL L08
    L07: RET
    L08: MOV eax, [esp+4]
    L09: MOV ebx, [esp+8]
    L10: CMP eax, ebx
    L12: JG L15
    L13: MOV eax, 0
    L14: RET
    L15: MOV eax, 1
    L16: RET

Partition into procedures?

SLIDE 29

Analysis of Binary

Disassembled Program

    L01: PUSH 0xA
    L02: PUSH 0xB
    L03: CALL L08
    L04: PUSH 0xC
    L05: PUSH 0xD
    L06: CALL L08
    L07: RET
    L08: MOV eax, [esp+4]
    L09: MOV ebx, [esp+8]
    L10: CMP eax, ebx
    L12: JG L15
    L13: MOV eax, 0
    L14: RET
    L15: MOV eax, 1
    L16: RET

Interprocedural CFG, partitioned into Procedure 1 and Procedure 2 (basic blocks)

    MOV eax, [esp+4]
    MOV ebx, [esp+8]
    CMP eax, ebx
    JG L15

    MOV eax, 1
    RET

    MOV eax, 0
    RET

    PUSH 0xA
    PUSH 0xB
    CALL L08
    PUSH 0xC
    PUSH 0xD
    CALL L08
    RET
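
The partitioning step can be sketched as follows: treat the first instruction and every CALL target as a procedure entry, then collect the instructions reachable from each entry without following calls. This is one simple heuristic and not necessarily the method used in the talk. The sketch assumes that the JG at L12 targets the `MOV eax, 1` block at L15, and the function names `parse` and `procedures` are illustrative.

```python
LISTING = """\
L01: PUSH 0xA
L02: PUSH 0xB
L03: CALL L08
L04: PUSH 0xC
L05: PUSH 0xD
L06: CALL L08
L07: RET
L08: MOV eax, [esp+4]
L09: MOV ebx, [esp+8]
L10: CMP eax, ebx
L12: JG L15
L13: MOV eax, 0
L14: RET
L15: MOV eax, 1
L16: RET
"""

def parse(listing):
    """Return ({label: (opcode, operand)}, [labels in address order])."""
    prog, order = {}, []
    for line in listing.strip().splitlines():
        label, text = line.split(":", 1)
        op, _, arg = text.strip().partition(" ")
        prog[label] = (op, arg.strip())
        order.append(label)
    return prog, order

def procedures(prog, order):
    """Group labels by procedure, using CALL targets as procedure entries."""
    fallthrough = dict(zip(order, order[1:]))
    entries = [order[0]] + [arg for op, arg in prog.values() if op == "CALL"]
    result = {}
    for entry in dict.fromkeys(entries):  # de-duplicate, keep order
        seen, work = set(), [entry]
        while work:
            label = work.pop()
            if label in seen or label not in prog:
                continue
            seen.add(label)
            op, arg = prog[label]
            if op == "RET":
                continue  # procedure ends on this path
            if op.startswith("J"):
                work.append(arg)  # jump target stays in the same procedure
                if op == "JMP":
                    continue  # unconditional jump: no fall-through
            if label in fallthrough:
                work.append(fallthrough[label])  # CALL also falls through
        result[entry] = sorted(seen)
    return result

prog, order = parse(LISTING)
for entry, labels in procedures(prog, order).items():
    print(entry, labels)
```

The heuristic depends on calls and returns being used conventionally. Code that reaches a procedure through a computed jump, or that uses RET as a jump, would defeat it.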

SLIDE 30

Binary Analysis - Challenges

SLIDE 31

Typical analysis pipelines

disassemble → extract procedures → extract control & data flow → verify property → certify / reject

VIRUS DATABASE

SLIDE 32

Problem: Not hardened

disassemble → extract procedures → extract control & data flow → verify property → certify / reject

DATABASE

SILENT FAILURE! DISABLED! DISABLED!

SLIDE 33

Typical analysis pipeline

disassemble → extract procedures → extract control & data flow → verify property → certify / reject

DATABASE

SLIDE 34

Attack: Disassembly

decode machine instructions (byte seq)

disassemble → extract procedures → extract control & data flow → verify property

    401063: 5d                      pop    %ebp
    401064: c3                      ret
    401065: 55                      push   %ebp
    401066: 89 e5                   mov    %esp,%ebp
    401068: 83 ec 08                sub    $0x8,%esp
    40106b: eb 05                   jmp    0x401072
    40106d: e8 ee ff ff ff          call   0x401060
    401072: e8 e9 ff ff ff          call   0x401060
    401077: c7 45 fc 00 00 00 00    movl   $0x0,0xfffffffc(%ebp)
    40107e: 81 7d fc e7 03 00 00    cmpl   $0x3e7,0xfffffffc(%ebp)

ORIG BYTES ASSEMBLY

    401063: 5d                      pop    %ebp
    401064: c3                      ret
    401065: 55                      push   %ebp
    401066: 89 e5                   mov    %esp,%ebp
    401068: 83 ec 08                sub    $0x8,%esp
    40106b: eb 05                   jmp    0x401072
    40106d: c7 ee ff ff ff e8       mov    $0xe8ffffff,%esi
    401073: e9 ff ff ff c7          jmp    0xc8401077
    401078: 45                      inc    %ebp
    401079: fc                      cld

Annotations: malicious func; jump over junk; bad disassembly (no jump target)
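
To make the failure concrete, the sketch below contrasts a linear sweep with a recursive traversal over the bytes of the second listing. It assumes a toy length decoder that knows only the opcodes appearing on this slide, so it is nowhere near a real x86 disassembler. The names `decode`, `linear_sweep`, and `recursive_traversal` are illustrative.

```python
import struct

BASE = 0x401063
CODE = bytes.fromhex(
    "5d c3 55 89e5 83ec08 eb05"  # pop, ret, push, mov, sub, jmp 0x401072
    "c7eeffffff"                 # junk bytes that the jmp skips over
    "e8e9ffffff"                 # real code at 0x401072: call 0x401060
    "c745fc00000000"             # movl $0x0,-4(%ebp)
    "817dfce7030000"             # cmpl $0x3e7,-4(%ebp)
)
END = BASE + len(CODE)

def decode(addr):
    """Return (length, kind, target) for the few opcodes used on the slide."""
    off = addr - BASE
    op = CODE[off]
    if op in (0x5D, 0x55, 0x45, 0xFC):      # pop/push/inc %ebp, cld
        return 1, "other", None
    if op == 0xC3:
        return 1, "ret", None
    if op == 0x89:                          # mov reg,reg
        return 2, "other", None
    if op == 0x83:                          # op r/m32, imm8 (register form)
        return 3, "other", None
    if op == 0xEB:                          # jmp rel8
        rel = struct.unpack_from("<b", CODE, off + 1)[0]
        return 2, "jmp", (addr + 2 + rel) & 0xFFFFFFFF
    if op in (0xE8, 0xE9):                  # call rel32 / jmp rel32
        rel = struct.unpack_from("<i", CODE, off + 1)[0]
        kind = "call" if op == 0xE8 else "jmp"
        return 5, kind, (addr + 5 + rel) & 0xFFFFFFFF
    if op in (0xC7, 0x81):                  # op r/m32, imm32
        mod = CODE[off + 1] >> 6
        return (6 if mod == 3 else 7), "other", None
    raise ValueError(f"opcode {op:#x} not handled by this toy decoder")

def linear_sweep(start):
    """Decode instructions back to back, ignoring control flow."""
    starts, addr = [], start
    while addr < END:
        try:
            length, _, _ = decode(addr)
        except (ValueError, IndexError, struct.error):
            break
        starts.append(addr)
        addr += length
    return starts

def recursive_traversal(entry):
    """Decode only the addresses reachable by following control flow."""
    seen, work = set(), [entry]
    while work:
        addr = work.pop()
        while BASE <= addr < END and addr not in seen:
            length, kind, target = decode(addr)
            seen.add(addr)
            if kind == "ret":
                break
            if kind == "jmp":
                addr = target  # unconditional: continue at the target only
                continue
            if kind == "call":
                work.append(target)  # callee may lie outside this buffer
            addr += length
    return sorted(seen)

print([hex(a) for a in linear_sweep(BASE)])
print([hex(a) for a in recursive_traversal(0x401065)])
```

The linear sweep swallows the first byte of the real call into the junk instruction at 0x40106d, so 0x401072 never appears as an instruction start. The recursive traversal follows the `jmp` and recovers it. Recursive traversal has its own blind spots, such as indirect jumps, which is why neither strategy is sufficient alone.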

SLIDE 35

M/o/Vfuscator (by Chris Domas)

SLIDE 36

SLIDE 37

Attack: Defeat CFG Construction

SLIDE 38

Transform code to data

SLIDE 39

Defeat signatures: Packer, with encryption

SLIDE 40

Packer - Limitation

Original code in clear text at some point
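
A toy illustration of both points, assuming a repeating-key XOR as the "encryption" and a short byte string as the signature. Real packers use compression and stronger ciphers; the payload, key, and signature bytes here are hypothetical.

```python
from itertools import cycle

def xor(data, key):
    """Repeating-key XOR; applying it twice with the same key restores the data."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# Hypothetical static signature and a payload that contains it.
SIGNATURE = bytes.fromhex("67 33 74 20 73 38 6D 35")
PAYLOAD = b"header--" + SIGNATURE + b"--footer"

KEY = b"\x5a\xc3"            # hypothetical packing key
packed = xor(PAYLOAD, KEY)   # what is stored on disk

# Scanning the file on disk: the signature is no longer present.
print(SIGNATURE in packed)

# At run time the unpacking stub restores the original bytes in memory.
unpacked = xor(packed, KEY)
print(SIGNATURE in unpacked)
```

The packed file evades the static signature, but the unpacking step necessarily recreates the original bytes, so a scanner that inspects memory after unpacking can still match them.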

SLIDE 41

Slip a VM under the program

Protectors – Virtual Machine

SLIDE 42

Variants vs Family

    Half Year   Total Variants   Total Family
    03-I        994              141
    03-II       1702             184
    04-I        4496             164
    04-II       7360             171
    05-I        10866            170
    05-II       10992            104
    06-I        6784             101

Source: Symantec Corp 2006

SLIDE 43

Theoretical Challenge: Undecidability

Waiting for page to load

  • Do you hit ‘reload’ or do you wait?

Halting Problem

  • Write a program that answers: Will the program P halt for any input?
  • No program can correctly answer this question for all programs

Virus (malware) detection problem

  • Write a program that answers: Is program P a virus?
  • Problem is undecidable (Cohen 1984)
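
The intuition behind the undecidability result can be sketched with a self-referential program: given any candidate detector, build a program that asks the detector about itself and then does the opposite. This is only an illustration of the contradiction, not Cohen's formal proof. The strings `"infects"` and `"benign"` stand in for real behavior, and all names are hypothetical.

```python
def make_contrary(detector):
    """Build a program that does the opposite of what `detector` predicts for it."""
    def contrary():
        if detector(contrary):
            return "benign"   # flagged as a virus, so it behaves harmlessly
        return "infects"      # declared clean, so it misbehaves (stand-in)
    return contrary

def is_wrong_about(detector, program):
    """True if the detector's verdict disagrees with the program's behavior."""
    actually_virus = program() == "infects"
    return detector(program) != actually_virus

# Three hypothetical detectors; any other deterministic one fails the same way.
detectors = {
    "flag everything": lambda program: True,
    "flag nothing": lambda program: False,
    "flag by name": lambda program: program.__name__ == "contrary",
}

for name, detector in detectors.items():
    program = make_contrary(detector)
    print(name, is_wrong_about(detector, program))
```

Each detector is wrong about the program built against it, whichever verdict it returns. Practical scanners therefore accept approximate answers instead of aiming for a perfect decision procedure.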

SLIDE 44

Implications of Undecidability

  • Analysis problems are undecidable
  • Precise solutions cannot be computed
  • Solutions are approximated
  • Play ‘safe’: over-approximate or under-approximate
  • Catch: ‘Safe’ solutions leave hideouts for malware

[Diagram: ‘Safe’ solution, Precise solution, Hideout for malware]

SLIDE 45

Obfuscation also has limits

Obfuscation increases:

  • Code size
  • Runtime

Cannot be applied ad infinitum

Research challenge:

  • How to take advantage of limits of obfuscation?

SLIDE 46

Summary

  • Malware Detection Ecosystem
  • Binary Analysis – Areas and Issues
  • Binary Analysis Challenges
  • Anti-AV Techniques
      • Transform, Hide