7/18/2017 ISSISP 2017- (C) Lakhotia 1
Malware Analysis
Arun Lakhotia
University of Louisiana at Lafayette, USA
Presented at ISSISP 2017, CNRS Gif Sur Yvette
Introduction
Professor of Computer Science Founder, CEO
Geolocation
Plan of talk
Malware detection in practice
Binary Analysis
Challenges in Binary Analysis
Malware detection
In Practice
What is Malware?
“Software that steals your data. Software that destroys your data. Software that abuses your machine.” @Pinkflawd
Types of malware
Ransomware
Botnets
Password Stealers
Remote-Access-Trojans (RATs)
Click-jackers (stealing ad clicks)
Banking Trojans
SCADA disruptors
How to determine something is malware?
Run it
Observe if it
  steals or destroys your data
  abuses your machine
Determining malware in practice
Individually testing each program on every machine for maliciousness is not feasible
In reality:
  Someone observes some unexpected activity
  Traces activity to a program
  Passes it on to a security expert
  Expert analyzes to confirm
  Creates a ‘profile’ of the program
  Uses the ‘profile’ to detect other occurrences of the malware
Malware Detection Process (Theory)
[Figure: IN THE WILD: File/Message -> Scanner -> Malicious: yes/no. AV LAB: Suspect -> Malicious: yes/no -> Profile, which feeds the Scanner.]
Virus (Malware) Identification
Anti-Virus
Signature
Virus Form - A
Antivirus scanners use extracted patterns, or “signatures” to identify known malware.
Static Signature
Hex strings from virus variants
67 33 74 20 73 38 6D 35 20 76 37 61
67 36 74 20 73 32 6D 37 20 76 38 61
67 39 74 20 73 37 6D 33 20 76 36 61
Hex string for detecting virus
67 ?? 74 20 73 ?? 6D ?? 20 76 ?? 61
?? = wildcard
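A minimal sketch of how such a wildcard signature could be matched against a byte stream. The function names `parse_signature` and `find_signature` are hypothetical, and the naive scan is an assumption made for readability; production scanners use far faster multi-pattern matching.

```python
def parse_signature(sig: str):
    """Turn '67 ?? 74' into [0x67, None, 0x74]; None marks a wildcard byte."""
    return [None if tok == "??" else int(tok, 16) for tok in sig.split()]


def find_signature(data: bytes, pattern) -> int:
    """Return the first offset where the pattern matches, or -1 if none."""
    n = len(pattern)
    for start in range(len(data) - n + 1):
        if all(p is None or data[start + i] == p for i, p in enumerate(pattern)):
            return start
    return -1


# Signature and variants copied from the slide.
SIGNATURE = parse_signature("67 ?? 74 20 73 ?? 6D ?? 20 76 ?? 61")
VARIANTS = [
    bytes.fromhex("67 33 74 20 73 38 6D 35 20 76 37 61"),
    bytes.fromhex("67 36 74 20 73 32 6D 37 20 76 38 61"),
    bytes.fromhex("67 39 74 20 73 37 6D 33 20 76 36 61"),
]

if __name__ == "__main__":
    for variant in VARIANTS:
        # Prefix two unrelated bytes so the match is found at offset 2.
        print(find_signature(b"\x00\x00" + variant, SIGNATURE))
```

The single wildcard pattern covers all three variants because they differ only in the four byte positions marked `??`.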
Dynamic Signature
Monitor a running program to detect malicious behavior
Examples
  Analyze audit trails
  Look at patterns of system calls
Allows examination of only selected test cases
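One way to picture a dynamic signature is as an ordered pattern that must appear, possibly with other calls in between, inside a recorded system-call trace. The sketch below assumes traces are plain lists of call names. The pattern `SUSPICIOUS_PATTERN` is purely illustrative, not a real detection rule.

```python
def matches_in_order(trace, pattern) -> bool:
    """True if every call in `pattern` occurs in `trace` in the same order.

    Other calls may be interleaved; this is a subsequence test.
    """
    remaining = iter(trace)
    # `call in remaining` advances the iterator until it finds `call`.
    return all(call in remaining for call in pattern)


# Hypothetical pattern: read a file, then send something over the network.
SUSPICIOUS_PATTERN = ["open", "read", "connect", "send"]

if __name__ == "__main__":
    observed = ["open", "stat", "read", "close", "connect", "send"]
    print(matches_in_order(observed, SUSPICIOUS_PATTERN))
```

The check only sees the trace that was actually recorded, which is the limitation noted on the slide: behavior on untested inputs goes unobserved.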
Malware detection ecosystem has a lot of sharing
Malware Repositories
AV Vendor
End customers
AV Vendor
End customers
VirusTotal
Suspect files, daily volume
Multiple-Scanner Report
Malware Detection Process (Practice)
[Figure: IN THE WILD: File/Message -> Scanner -> Malicious: yes/no. AV LAB: Suspect -> Malicious: yes/no -> Profile, which feeds the Scanner.]
Malware Definition: In practice
X is malware:
  if it creates a huge hue and cry
  if P out of S AV scanners (on VT) say it is malware
  if some customer reports it as suspect and a security analyst confirms
How to perform Community Voting
Use Hash(X) instead of X. Hash(X) is malware if:
  P out of S AV scanners (on VT) say it is malware
Community Voting is very rigid. Cannot check for unseen malware.
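The voting rule can be sketched as a lookup keyed by file hash. Several things here are assumptions for illustration: SHA-256 as the hash, "P out of S" read as "at least P", the `verdicts_by_hash` table, and the scanner names. No real VirusTotal API is used.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hash(X): identify a file by the SHA-256 digest of its bytes."""
    return hashlib.sha256(data).hexdigest()


def is_malware_by_vote(file_hash, verdicts_by_hash, p):
    """Return True/False by vote, or None if the hash was never seen."""
    verdicts = verdicts_by_hash.get(file_hash)
    if verdicts is None:
        return None  # unseen file: the community has no opinion
    return sum(verdicts.values()) >= p  # at least P of the S scanners agree


if __name__ == "__main__":
    sample = b"example sample bytes"
    # Hypothetical scanner verdicts for one known hash (S = 3).
    table = {sha256_hex(sample): {"av_a": True, "av_b": True, "av_c": False}}
    print(is_malware_by_vote(sha256_hex(sample), table, p=2))
    # Changing a single byte yields a new hash with no recorded votes.
    print(is_malware_by_vote(sha256_hex(sample + b"!"), table, p=2))
```

The second lookup returns `None`, which illustrates the rigidity: any change to the file produces a hash the community has never voted on.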
Other challenges related to Malware
Determine the objective of a malware
Determine the actors/creators
Disrupt botnets
BINARY ANALYSIS
Learn about you
Binary Analysis:
  Level of knowledge: Level 1-5 (low-high)
  How much do you care? Level 1-5
Binary Analysis – Why?
Debugging and Patching
Legacy Migration
Software Protection
  Protecting IP
Software Cracking
Malicious Detection
  Binary with undesired/unknown behavior
Binary Analysis Tools
STATIC
  Hex editor
  PE/ELF editors
  Disassembler
  Decompiler
  Data/control flow
  Abstract interpreter
  Specialized checkers
    Buffer overflow
  Theorem provers
DYNAMIC
  Debugger
  Emulator
  Run-time monitors
  Network monitors
  Fuzzers
MIXED – CONCOLIC
  Combination of dynamic and static
History of analysis tools
50+ years of program analysis (PA)
compilers, security analysis, …
25+ years of reverse engineering (RE)
design recovery, reengineering, evolution, …
Fundamental theories, algorithms, methods
program decomposition, abstraction
disassembly, flow graphs
liveness, dependence, dominance, …
clustering, abstraction, visualization, comparison
Compiler processing
parse -> create control & data flow -> generate code -> object
Binary analysis, adapted from source
disassemble -> extract procedures -> extract control & data flow -> verify property -> result
Decomposing binaries
High Level Program (procedures are encapsulated)
  main() {
    Max(0xA, 0xB);
    Max(0xC, 0xD);
  }
  Max(int x, int y) {
    if (x > y) return 1;
    return 0;
  }
Disassembled Binary (no syntactic boundary for procedures)
  L01: PUSH 0xA
  L02: PUSH 0xB
  L03: CALL L08
  L04: PUSH 0xC
  L05: PUSH 0xD
  L06: CALL L08
  L07: RET
  L08: MOV eax, [esp+4]
  L09: MOV ebx, [esp+8]
  L10: CMP eax, ebx
  L11: JG L14
  L12: MOV eax, 0
  L13: RET
  L14: MOV eax, 1
  L15: RET
Partition into procedures?
Analysis of Binary
Disassembled Program
  L01: PUSH 0xA
  L02: PUSH 0xB
  L03: CALL L08
  L04: PUSH 0xC
  L05: PUSH 0xD
  L06: CALL L08
  L07: RET
  L08: MOV eax, [esp+4]
  L09: MOV ebx, [esp+8]
  L10: CMP eax, ebx
  L11: JG L14
  L12: MOV eax, 0
  L13: RET
  L14: MOV eax, 1
  L15: RET
Interprocedural CFG
  Procedure 1
    PUSH 0xA / PUSH 0xB / CALL L08
    PUSH 0xC / PUSH 0xD / CALL L08
    RET
  Procedure 2
    MOV eax, [esp+4] / MOV ebx, [esp+8] / CMP eax, ebx / JG L14
    MOV eax, 1 / RET
    MOV eax, 0 / RET
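The partitioning question can be sketched on the slide's listing. The listing is re-entered below with consecutive labels (an assumption: L11 through L15), so that `JG L14` lands on `MOV eax, 1`. The sketch also assumes direct calls only and contiguous procedure bodies, which obfuscated binaries routinely violate. All function names are hypothetical.

```python
PROGRAM = [
    ("L01", "PUSH", "0xA"), ("L02", "PUSH", "0xB"), ("L03", "CALL", "L08"),
    ("L04", "PUSH", "0xC"), ("L05", "PUSH", "0xD"), ("L06", "CALL", "L08"),
    ("L07", "RET", ""),
    ("L08", "MOV", "eax, [esp+4]"), ("L09", "MOV", "ebx, [esp+8]"),
    ("L10", "CMP", "eax, ebx"), ("L11", "JG", "L14"),
    ("L12", "MOV", "eax, 0"), ("L13", "RET", ""),
    ("L14", "MOV", "eax, 1"), ("L15", "RET", ""),
]
JUMPS = {"JG", "JMP"}


def procedure_entries(program):
    """Entry points: the first instruction plus every direct CALL target."""
    entries = {program[0][0]}
    entries.update(arg for _, op, arg in program if op == "CALL")
    return entries


def partition(program):
    """Assign each instruction to the nearest preceding procedure entry."""
    entries = procedure_entries(program)
    procs, current = {}, None
    for label, _, _ in program:
        if label in entries:
            current = label
            procs[current] = []
        procs[current].append(label)
    return procs


def block_leaders(program):
    """Basic-block leaders: entries, branch targets, and fall-through points."""
    labels = [label for label, _, _ in program]
    leaders = {labels[0]}
    for i, (_, op, arg) in enumerate(program):
        if op == "CALL" or op in JUMPS:
            leaders.add(arg)  # callee entry or jump target
        if (op == "CALL" or op == "RET" or op in JUMPS) and i + 1 < len(labels):
            leaders.add(labels[i + 1])  # instruction after a control transfer
    return sorted(leaders)


if __name__ == "__main__":
    print(partition(PROGRAM))
    print(block_leaders(PROGRAM))
```

Under these assumptions the listing splits into two procedures with three basic blocks each, matching the block structure drawn on the slide.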
Binary Analysis - Challenges
Typical analysis pipelines
disassemble -> extract procedures -> extract control & data flow -> verify property -> certify / reject
VIRUS DATABASE
Problem: Not hardened
disassemble -> extract procedures -> extract control & data flow -> verify property -> certify / reject
DATABASE
SILENT FAILURE! DISABLED! DISABLED!
Typical analysis pipeline
disassemble -> extract procedures -> extract control & data flow -> verify property -> certify / reject
DATABASE
Attack: Disassembly
decode machine instructions (byte seq)
disassemble -> extract procedures -> extract control & data flow -> verify property
ORIG (address: bytes, assembly)
  401063: 5d                      pop %ebp
  401064: c3                      ret
  401065: 55                      push %ebp
  401066: 89 e5                   mov %esp,%ebp
  401068: 83 ec 08                sub $0x8,%esp
  40106b: eb 05                   jmp 0x401072
  40106d: e8 ee ff ff ff          call 0x401060
  401072: e8 e9 ff ff ff          call 0x401060
  401077: c7 45 fc 00 00 00 00    movl $0x0,0xfffffffc(%ebp)
  40107e: 81 7d fc e7 03 00 00    cmpl $0x3e7,0xfffffffc(%ebp)
Bad disassembly (no jump target)
  401063: 5d                      pop %ebp
  401064: c3                      ret
  401065: 55                      push %ebp
  401066: 89 e5                   mov %esp,%ebp
  401068: 83 ec 08                sub $0x8,%esp
  40106b: eb 05                   jmp 0x401072
  40106d: c7 ee ff ff ff e8       mov $0xe8ffffff,%esi
  401073: e9 ff ff ff c7          jmp 0xc8401077
  401078: 45                      inc %ebp
  401079: fc                      cld
Callouts: malicious func; jump over junk
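The desynchronization can be reproduced with a toy length decoder. This sketch is not a real disassembler: `insn_len` only knows the handful of opcodes that occur in the slide's bytes, and recursive traversal is brought in here as one common alternative to linear sweep, not as the method the talk prescribes. The byte string is the slide's obfuscated code, with the junk byte `c7` at 0x40106d.

```python
CODE_BASE = 0x401063
CODE = bytes.fromhex(
    "5d c3 55 89 e5 83 ec 08 eb 05 "  # pop, ret, push, mov, sub, jmp +5
    "c7 ee ff ff ff "                 # junk bytes skipped by the jmp
    "e8 e9 ff ff ff "                 # call 0x401060
    "c7 45 fc 00 00 00 00"            # movl $0x0,-4(%ebp)
)


def modrm_len(modrm: int) -> int:
    """Length of ModR/M plus displacement for the few forms needed here."""
    mod, rm = modrm >> 6, modrm & 7
    if mod == 3 or (mod == 0 and rm not in (4, 5)):
        return 1
    if mod == 1 and rm != 4:
        return 2  # ModR/M + disp8
    raise ValueError("addressing form not covered by this toy decoder")


def insn_len(code: bytes, off: int) -> int:
    op = code[off]
    if op in (0x45, 0x55, 0x5D, 0xC3, 0xFC):
        return 1
    if op == 0xEB:
        return 2
    if op in (0xE8, 0xE9):
        return 5
    if op in (0x00, 0x89):
        return 1 + modrm_len(code[off + 1])
    if op == 0x83:
        return 1 + modrm_len(code[off + 1]) + 1
    if op in (0x81, 0xC7):
        return 1 + modrm_len(code[off + 1]) + 4
    raise ValueError(f"opcode {op:#x} not covered by this toy decoder")


def linear_sweep(code, base):
    """Decode back to back from the first byte, ignoring control flow."""
    addrs, off = [], 0
    while off < len(code):
        addrs.append(base + off)
        off += insn_len(code, off)
    return addrs


def recursive_traversal(code, base, entry):
    """Decode only addresses reachable by following control flow."""
    seen, work = set(), [entry]
    while work:
        addr = work.pop()
        off = addr - base
        if addr in seen or not 0 <= off < len(code):
            continue
        seen.add(addr)
        op, nxt = code[off], addr + insn_len(code, off)
        if op == 0xC3:  # ret: no successor
            continue
        if op in (0xEB, 0xE8, 0xE9):
            rel = int.from_bytes(code[off + 1:nxt - base], "little", signed=True)
            work.append(nxt + rel)  # jump or call target
            if op != 0xE8:
                continue  # unconditional jump: no fall-through
        work.append(nxt)
    return sorted(seen)


if __name__ == "__main__":
    print([hex(a) for a in linear_sweep(CODE, CODE_BASE)])
    print([hex(a) for a in recursive_traversal(CODE, CODE_BASE, 0x401065)])
```

Linear sweep decodes an instruction at 0x40106d and never produces one at 0x401072, so the call disappears from its output. Following the `jmp` reaches 0x401072 and skips the junk.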
M/o/Vfuscator (by Chris Domas)
Attack: Defeat CFG Construction
Transform code to data
Defeat signatures: Packer, with encryption
Packer - Limitation
Original code in clear text at some point
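A toy illustration of both points: a static signature misses the packed bytes, yet matches once the code has been restored. The single-byte XOR "encryption", the key, and the placeholder payload are all assumptions chosen for brevity. Real packers use stronger schemes and unpack in memory at run time.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """Toy stand-in for a packer's encryption (XOR is its own inverse)."""
    return bytes(b ^ key for b in data)


ORIGINAL = b"PAYLOAD-MARKER"   # hypothetical stand-in for the original code
SIGNATURE = b"MARKER"          # hypothetical static signature
KEY = 0x5A                     # hypothetical packer key

if __name__ == "__main__":
    packed = xor_bytes(ORIGINAL, KEY)
    print(SIGNATURE in packed)      # scan of the file on disk
    unpacked = xor_bytes(packed, KEY)
    print(SIGNATURE in unpacked)    # scan after the stub has restored the code
```

The second scan succeeds because the original bytes must exist in clear text before they can execute.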
Slip a VM under the program
Protectors – Virtual Machine
Variants vs Family
Half Year       03-I  03-II  04-I  04-II  05-I   05-II  06-I
Total Variants  994   1702   4496  7360   10866  10992  6784
Total Family    141   184    164   171    170    104    101
Source: Symantec Corp 2006
Theoretical Challenge: Undecidability
Waiting for page to load
Do you hit ‘reload’ or do you wait?
Halting Problem
Write a program that answers:
Will the program P halt for any input?
No program can correctly answer this question for all programs
Virus (malware) detection problem
Write a program that answers:
Is program P a virus?
Problem is undecidable (Cohen 1984)
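The standard diagonal argument behind the halting problem can be sketched as follows; it illustrates the halting case only and is not Cohen's construction for viruses. Given any claimed decider, build a program that does the opposite of the prediction. To keep the sketch runnable, "loops forever" is modeled by a returned label instead of an actual infinite loop. All names are hypothetical.

```python
def make_paradox(claims_to_halt):
    """Build a program that contradicts the decider's prediction about it."""
    def paradox():
        if claims_to_halt(paradox):
            return "loops forever"  # stands in for: while True: pass
        return "halts"
    return paradox


def decider_is_wrong(claims_to_halt) -> bool:
    """Check whether the decider mispredicts the program built against it."""
    paradox = make_paradox(claims_to_halt)
    predicted_halt = claims_to_halt(paradox)
    actually_halts = paradox() == "halts"
    return predicted_halt != actually_halts


if __name__ == "__main__":
    # Two trivial candidate deciders; each one is refuted.
    print(decider_is_wrong(lambda program: True))
    print(decider_is_wrong(lambda program: False))
```

Whatever deterministic answer the decider gives about `paradox`, the program behaves the other way, so no decider is correct on every program.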
Implications of Undecidability
Analysis problems are undecidable
Precise solutions cannot be computed
Solutions are approximated
Play ‘safe’: over-approximate or under-approximate
Catch: ‘Safe’ solutions leave hideouts for malware
[Figure labels: ‘Safe’ solution, Precise solution, Hideout for malware]
Obfuscation also has limits
Obfuscation increases:
  Code size
  Runtime
Cannot be applied ad infinitum
Research challenge:
  How to take advantage of limits of obfuscation?