Mylobot, Detecting the Undetected Using Deep Learning
Yael Daihes
WHO AM I
Yael Daihes
Security Data Science Team Lead @ Akamai Technologies
My things: botnets, traffic, data, algorithms, reading, painting and a bit of gaming (:
1 Mylobot - What is it?
2 What Is DGA? How Has the Defense Community Tackled the DGA Problem So Far?
3 How Did We Tackle this Issue? Overview of Our Detection System
4 Results in the Wild
5 Mylobot - As we see it
*Anti-VM techniques
*Anti-sandbox techniques
*Anti-debugging techniques
*Wrapping internal parts with an encrypted resource file
*Code injection
*Process hollowing
*CNC communication delaying mechanism: waits 14 days before accessing its command and control servers
[1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/
Command and Control
Bot to botmaster.com: "Dear Bot master, what do I do next?"
botmaster.com to bot: "Send me a screen shot"
Defense
Bot to botmaster.com: "Dear Bot master, what do I do next?"
botmaster.com to bot: "Send me a screen shot"
Day 1: CNC Channel for Sunday
Generated Domains: Asdiuouoi.top, Hakjhsdkjh.top, Whjrhkejwh.biz, Hjkwrjkhew.biz, ...
DNS Response: one generated domain resolves to 1.2.3.4, the others return NX
Day 2: CNC Channel for Monday
Generated Domains: ycrxmen.com, dtswwomss.eu, ljfsmaroqok.com, gvzoutzukdzth.ru, ...
DNS Response: one generated domain resolves to 5.6.7.8, the others return NX
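The Day 1 / Day 2 behaviour can be illustrated with a toy generator. This is a minimal sketch, not Mylobot's or any real family's algorithm: the seed string, the hash-based construction, the name length, and the TLD list are all assumptions made for illustration. It only shows why the bot and the bot master can both derive the same candidate list for a given day, and why that list changes the next day.

```python
import hashlib
from datetime import date

# Hypothetical parameters, chosen only for illustration.
TLDS = ["com", "biz", "top", "ru"]

def toy_dga(seed: str, day: date, count: int = 4, length: int = 10) -> list[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date."""
    domains = []
    for i in range(count):
        material = f"{seed}|{day.isoformat()}|{i}".encode()
        digest = hashlib.sha256(material).digest()
        # Map each of the first `length` digest bytes to a lowercase letter.
        name = "".join(chr(ord("a") + b % 26) for b in digest[:length])
        domains.append(f"{name}.{TLDS[digest[-1] % len(TLDS)]}")
    return domains

sunday = toy_dga("example-seed", date(2020, 1, 5))  # a Sunday
monday = toy_dga("example-seed", date(2020, 1, 6))  # the following Monday
```

The bot master only needs to register one of the day's names; every other query from the bot comes back NX.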
Reverse the DGA code
Example: code for creating "Simda" domain names, created by reversing the binary and implementing the logic in Python
github.com/baderj/domain_generation_algorithms
Could be used for creating the domain names and adding them to a block list (given the seed)
How to find the seed? Cool ways that involve checking what's in the traffic and brute-forcing possibilities with really strong computers
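Both ideas from the "reverse the DGA" approach, building a block list from a known seed and brute-forcing the seed from traffic, can be sketched against a stand-in generator. Everything here is hypothetical: `toy_dga` is not the Simda algorithm or any code from the linked repository, the seed is a small integer, and the "observed" domains are produced by the same function to keep the example self-contained.

```python
import hashlib
from datetime import date, timedelta

def toy_dga(seed: int, day: date, count: int = 3) -> list[str]:
    """Stand-in generator (an assumption, not a real family's algorithm)."""
    out = []
    for i in range(count):
        digest = hashlib.md5(f"{seed}-{day.isoformat()}-{i}".encode()).hexdigest()
        out.append(digest[:12] + ".com")
    return out

def build_blocklist(seed: int, start: date, days: int) -> set[str]:
    """Pre-compute every domain the generator will produce for the coming days."""
    blocked = set()
    for offset in range(days):
        blocked.update(toy_dga(seed, start + timedelta(days=offset)))
    return blocked

def brute_force_seed(observed: set[str], day: date, candidates: range) -> list[int]:
    """Return the candidate seeds whose output covers all observed domains."""
    return [s for s in candidates if observed <= set(toy_dga(s, day))]

today = date(2020, 1, 5)
seen_in_traffic = set(toy_dga(1337, today)[:2])  # pretend these were observed
recovered = brute_force_seed(seen_in_traffic, today, range(5000))
blocklist = build_blocklist(recovered[0], today, days=7)
```

A real seed space is far larger than 5000 candidates, which is where the "really strong computers" come in.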
Detect new domains of DGAs I know, but couldn't break
Detect new DGAs never seen or reported before
Bonus: detect what I can break and is known
Maybe we should check if the characters used in the domain name are basically.. gibberish?
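One simple way to put a number on "gibberish" is the Shannon entropy of the characters in the domain label. The sketch below is an assumption about how such a check could look, not the method used in the talk; the label extraction is deliberately naive.

```python
import math
from collections import Counter

def char_entropy(label: str) -> float:
    """Shannon entropy (bits per character) of the characters in `label`."""
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def second_level_label(domain: str) -> str:
    """Naive split: take the label left of the TLD (ignores multi-part suffixes)."""
    return domain.lower().rstrip(".").split(".")[-2]

benign_score = char_entropy(second_level_label("goooogle.com"))
dga_score = char_entropy(second_level_label("gvzozukdzth.ru"))
```

A label with many distinct characters scores higher than one that repeats a few characters, so a threshold on this value gives a first, crude filter.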
Attackers adapt..
OK OK OK.. hold up - we're smarter. What does the 2-character distribution look like?
Hooray!
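The 2-character distribution idea can be sketched as a bigram model: count character pairs in known benign names, then score a new name by how likely its pairs are. The tiny benign list, the add-one smoothing, and the 36-character alphabet are assumptions for illustration only; a real corpus would come from normal traffic.

```python
import math
from collections import Counter

def bigrams(text: str) -> list[str]:
    """All overlapping 2-character substrings of `text`."""
    return [text[i:i + 2] for i in range(len(text) - 1)]

def train_bigram_model(benign_labels: list[str]) -> Counter:
    """Count bigram occurrences over a list of benign labels."""
    counts = Counter()
    for label in benign_labels:
        counts.update(bigrams(label.lower()))
    return counts

def avg_log_likelihood(label: str, counts: Counter, alphabet_size: int = 36) -> float:
    """Mean add-one-smoothed log-probability of the label's bigrams."""
    total = sum(counts.values())
    vocab = alphabet_size ** 2  # number of possible 2-character combinations
    grams = bigrams(label.lower())
    if not grams:
        return 0.0
    return sum(math.log((counts[g] + 1) / (total + vocab)) for g in grams) / len(grams)

# Tiny, hypothetical "benign" corpus.
benign = ["facebook", "google", "youtube", "wikipedia",
          "amazon", "twitter", "linkedin", "netflix"]
model = train_bigram_model(benign)
natural = avg_log_likelihood("bookface", model)
random_like = avg_log_likelihood("gvzozukdzth", model)
```

Names built from common pairs score closer to zero than names built from rare pairs.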
Or not..?
WHY DEEP LEARNING?
Sequence model (char by char)
Should be able to distinguish between a sequence generated artificially (DGA) and a sequence from a natural language
Learns the patterns (by training)
Predicting Domain Generation Algorithms with Long Short-Term Memory Networks, Endgame - https://arxiv.org/abs/1611.00791
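A char-by-char sequence model needs each domain turned into a fixed-length sequence of integers before training. The sketch below covers only that input preparation step; the vocabulary, the padding scheme, and the maximum length are assumptions, and the model itself (for example an LSTM built in a deep learning framework) is not shown.

```python
import string

# Assumed vocabulary: lowercase letters, digits, hyphen and dot.
# Index 0 is reserved for padding.
VOCAB = {ch: idx + 1 for idx, ch in
         enumerate(string.ascii_lowercase + string.digits + "-.")}
MAX_LEN = 20  # hypothetical fixed sequence length

def encode_domain(domain: str, max_len: int = MAX_LEN) -> list[int]:
    """Turn a domain into a fixed-length list of character indices."""
    ids = [VOCAB[ch] for ch in domain.lower() if ch in VOCAB]
    ids = ids[-max_len:]                     # keep the rightmost characters if too long
    return [0] * (max_len - len(ids)) + ids  # left-pad with zeros

encoded = encode_domain("ycrxmen.com")
```

Each integer would then be looked up in an embedding layer and fed to the sequence model one character at a time.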
Data
1.2 MILLION DGA DOMAINS: generated by the reversed code or captured in the wild [1]
1.2 MILLION BENIGN DOMAINS: captured from normal traffic
Results
90% Accuracy
Can't classify which malware family
[1] dgarchive.caad.fkie.fraunhofer.de
Visualization of Domains as seen by the Deep Learning model
Some clusters are separable
Some clusters are too intertwined
What does the Deep Learning model think?
DNS Queries made by ... window: facebook.com, gvzozukdzth.ru, goooogle.com, ycrxmen.com, ljfsaroqok.com
Deep Learning model response ("How likely is this a DGA, between 0-1?"): 0.0, 0.999, 0.91, 0.0, 0.999, 0.98
Classify the domains detected: gvzozukdzth.ru, ycrxmen.com, ljfsaroqok.com
Attribute to the specific malware: LOCKY, DYRE, UNKNOWN #1
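The flow on this slide (score each queried domain, then keep the ones the model considers likely DGA) can be sketched as a thresholding step grouped by the querying entity. This is a sketch under stated assumptions: the 0.9 cut-off, the `score_fn` callback standing in for the trained model, the entity identifiers, and the scores in the example are all made up for illustration.

```python
from collections import defaultdict

THRESHOLD = 0.9  # hypothetical cut-off, not a value from the talk

def flag_dga_queries(queries, score_fn, threshold=THRESHOLD):
    """Group suspected DGA domains by the entity that queried them.

    `queries` is an iterable of (entity, domain) pairs and `score_fn`
    stands in for the trained model, returning a score between 0 and 1.
    """
    flagged = defaultdict(list)
    for entity, domain in queries:
        if score_fn(domain) >= threshold:
            flagged[entity].append(domain)
    return dict(flagged)

# Made-up scores used only to exercise the function.
fake_scores = {"facebook.com": 0.0, "ycrxmen.com": 0.999, "ljfsaroqok.com": 0.98}
queries = [
    ("10.0.0.5", "facebook.com"),
    ("10.0.0.5", "ycrxmen.com"),
    ("10.0.0.7", "ljfsaroqok.com"),
]
result = flag_dga_queries(queries, lambda d: fake_scores.get(d, 0.0))
```

The flagged domains per entity are what a later step would try to attribute to a specific malware family.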
2.5 million domains detected and blocked daily
70 million DNS requests blocked daily
~0% False Positives
~8 unknown (zero day?) DGAs detected
Stage 1 - Mylobot (Downloader)
Connect to C2s for grabbing and executing second stage
1 DNS query m8.zdrussle.ru to the DNS Server, which answers IP: x.x.x.x
2 Bot asks x.x.x.x: "Where do I need to go next?" Answer: "Go and get http://1.2.3.4/malware.gif"
3 GET http://1.2.3.4/malware.gif from IP: 1.2.3.4, which returns a malware file (unknown)
Stage 2
4 Run second malware: unknown malicious activity. Reported to have been using Khalesi as second stage [1][2]
[1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/
[2] blog.centurylink.com/mylobot-continues-global-infections/
The first step: "DNS query m8.zdrussle.ru"
The domain name is generated by a DGA, and the deep learning model detected it
~1,400 domains detected
m<number between 0 and 43>.<domain generated by a DGA>.com|in|biz|org|net|me|cc|ru
The first step: "DNS query m8.zdrussle.ru"
The domain name is generated by a DGA, and the deep learning model detected it
~8,000 domains detected, of what we understand to be variants. The four variants differ in the DGA pattern used:
m<number between 0 and 43>.<domain generated by a DGA>.com|in|biz|org|net|me|cc|ru
green<number between 0 and 43>.<domain generated by a DGA>.com|ru|
v1.<domain generated by a DGA>.com|ru|net|org|bz|in|biz|su|eu|cc
x<number between 0 and 43>.<domain generated by a DGA>.com|ru|net|org|bz|in|biz|su|eu|cc
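The four domain patterns listed for the Mylobot variants translate naturally into regular expressions for matching DNS logs. This is a sketch under stated assumptions: the slides do not say what characters the DGA-generated label contains, so lowercase letters are assumed, and the TLD list for the `green` pattern appears cut off in the slide, so only the visible TLDs are used. The variant names in the dictionary are labels invented here.

```python
import re

NUM_0_43 = r"(?:[0-9]|[1-3][0-9]|4[0-3])"  # integers 0..43, no leading zeros
DGA_LABEL = r"[a-z]+"  # assumption: generated label is lowercase letters only

VARIANT_PATTERNS = {
    "m": re.compile(rf"m{NUM_0_43}\.{DGA_LABEL}\.(?:com|in|biz|org|net|me|cc|ru)"),
    # TLD list is truncated in the source; only the visible entries are used.
    "green": re.compile(rf"green{NUM_0_43}\.{DGA_LABEL}\.(?:com|ru)"),
    "v1": re.compile(rf"v1\.{DGA_LABEL}\.(?:com|ru|net|org|bz|in|biz|su|eu|cc)"),
    "x": re.compile(rf"x{NUM_0_43}\.{DGA_LABEL}\.(?:com|ru|net|org|bz|in|biz|su|eu|cc)"),
}

def match_variant(domain: str):
    """Return the name of the first pattern that fully matches, else None."""
    candidate = domain.lower().rstrip(".")
    for name, pattern in VARIANT_PATTERNS.items():
        if pattern.fullmatch(candidate):
            return name
    return None

variant = match_variant("m8.zdrussle.ru")
```

Such patterns catch only variants that are already known; spotting a new prefix still relies on the model flagging the generated label.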
As researched by us and by our findings
671 more domains resolving
For comparison, DNS queries per day as seen in Akamai's traffic:
Pykspa: ~1 million
Qsnatch: ~1 million
Emotet: ~500k
Gameover Zeus: ~200k
Entities: each could be a single user or a NAT
What's DGA
A piece of code some malware has that generates domain names for forming a C&C channel
Defense System: DGA detection in traffic
Trained a Deep Learning model and used it over live traffic
Mylobot
Super active, newly seen botnet this system detected. Look out!
Mylobot - Countermeasures
Investigate new patterns and domains
Threat Intelligence - Network Perspective
Monitor DNS traffic
We will publish IOCs
DGA detection beasts are possible!
@Yael_Daihes