Rock’em Sock’em Robots: Bot Swatting Like The Pros - Aaron Bedra



slide-1
SLIDE 1

Rock’em Sock’em Robots

Bot Swatting Like The Pros

Aaron Bedra Principal Engineer, Groupon @abedra keybase.io/abedra

slide-2
SLIDE 2

"Well, there's a judge and a subject, and... the judge asks questions and, depending on the subject's answers, determines who he is talking with... what he is talking with, and, um... All you have to do is ask me a question."

- Alan Turing, The Imitation Game
slide-3
SLIDE 3

Asymmetric warfare

slide-4
SLIDE 4

The internet is powered by robots

slide-5
SLIDE 5
slide-6
SLIDE 6

We employ teams of people to help manage good robots

slide-7
SLIDE 7

But all robots are not created equal

slide-8
SLIDE 8

10.20.253.8 - - [08/Apr/2015:09:17:52 +0000] "POST /login HTTP/1.1" 200 267 "-" "curl/7.35.0" "77.77.165.233"

slide-9
SLIDE 9

10.20.253.8 - - [08/Apr/2015:10:20:21 +0000] "POST /login HTTP/1.1" 200 267 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

slide-10
SLIDE 10

Some robots are more trouble than they are worth

slide-11
SLIDE 11

How much of your traffic is bot related?

slide-12
SLIDE 12

How much of it should be?

slide-13
SLIDE 13

Who here does testing/tracking?

slide-14
SLIDE 14

How badly do these robots throw off your tests?

slide-15
SLIDE 15

What else are bots doing on your site?

slide-16
SLIDE 16

Let’s talk about common types

slide-17
SLIDE 17

Spiders

slide-18
SLIDE 18

The root of most things we will talk about

slide-19
SLIDE 19

They are often used inside of scrapers and scanners to find content

slide-20
SLIDE 20

But can be used on their own as well

slide-21
SLIDE 21

Trivial to build

slide-22
SLIDE 22

How to build a spider

  • Go to starting page
  • Gather all links on the page and put them into a queue
  • Visit link in queue (gathering links and adding to queue)
  • Repeat until queue is empty (or sentinel)
  • Keep a record of all links visited
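
The steps above can be sketched in a few lines. This is a minimal sketch, not a production crawler: `getLinks` is a hypothetical stand-in for "fetch the page and extract its links", `maxPages` plays the role of the sentinel, and the `site` map is made-up data so the example runs without a network.

```javascript
// Breadth-first spider over a link graph.
function crawl(start, getLinks, maxPages = 1000) {
  const visited = new Set(); // record of all links visited
  const queue = [start];     // links waiting to be visited

  // Repeat until the queue is empty or the sentinel (maxPages) is reached.
  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);

    // Gather the links on this page and add the unseen ones to the queue.
    for (const link of getLinks(url)) {
      if (!visited.has(link)) queue.push(link);
    }
  }
  return [...visited];
}

// Hypothetical four-page site, used only for illustration.
const site = {
  "/": ["/about", "/products"],
  "/about": ["/"],
  "/products": ["/products/1", "/about"],
  "/products/1": [],
};

const pages = crawl("/", (url) => site[url] || []);
console.log(pages);
```

A real spider would replace the lookup with an HTTP fetch plus link extraction; the queue and visited set stay the same.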
slide-23
SLIDE 23

Spiders are usually easy to detect

slide-24
SLIDE 24

They deviate from typical behavior quickly

slide-25
SLIDE 25

[Chart: distribution of requests by HTTP verb (GET, POST, HEAD, PUT, DELETE); slice values 5%, 5%, 4%, 27%, 59%]

slide-26
SLIDE 26

Simply sampling traffic and comparing for deviation can usually catch a spider
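
One way to compare a sample against normal traffic is to measure the distance between HTTP verb distributions. This is a minimal sketch, assuming you have measured a baseline yourself: the baseline values, the `THRESHOLD`, and the choice of total variation distance are illustrative assumptions, not figures from the talk.

```javascript
// Share of each HTTP verb in a list of requests.
function verbDistribution(verbs) {
  const counts = {};
  for (const v of verbs) counts[v] = (counts[v] || 0) + 1;
  const dist = {};
  for (const v of Object.keys(counts)) dist[v] = counts[v] / verbs.length;
  return dist;
}

// Total variation distance: 0 means identical, 1 means no overlap at all.
function deviation(baseline, sample) {
  const verbs = new Set([...Object.keys(baseline), ...Object.keys(sample)]);
  let sum = 0;
  for (const v of verbs) sum += Math.abs((baseline[v] || 0) - (sample[v] || 0));
  return sum / 2;
}

// Hypothetical baseline and threshold; measure your own.
const baseline = { GET: 0.8, POST: 0.2 };
const THRESHOLD = 0.3;

// A client that mostly sends HEAD requests looks nothing like the baseline.
const client = verbDistribution(["HEAD", "HEAD", "HEAD", "GET"]);
const suspicious = deviation(baseline, client) > THRESHOLD;
console.log(suspicious);
```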

slide-27
SLIDE 27

Velocity can also be an indicator
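
Velocity can be tracked with a sliding window per client. This is a sketch under stated assumptions: the `VelocityTracker` name, the 10-second window, and the limit of 3 requests are all hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```javascript
// Sliding-window request counter per client.
class VelocityTracker {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.hits = new Map(); // client id -> timestamps (ms) inside the window
  }

  // Record a request; returns true when the client exceeds the limit.
  record(clientId, nowMs) {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.hits.get(clientId) || []).filter((t) => t > cutoff);
    recent.push(nowMs);
    this.hits.set(clientId, recent);
    return recent.length > this.maxRequests;
  }
}

// Four requests in three seconds trip the limit; a later request does not.
const tracker = new VelocityTracker(10000, 3);
const flags = [0, 1000, 2000, 3000, 20000].map((t) =>
  tracker.record("77.77.165.233", t)
);
console.log(flags);
```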

slide-28
SLIDE 28

Scrapers

slide-29
SLIDE 29

They want your data

slide-30
SLIDE 30

Scenario 1: You provide an API

slide-31
SLIDE 31

Either stop them outright or refer them to the API
slide-32
SLIDE 32

Scenario 2: You don’t and they shouldn’t be doing this

slide-33
SLIDE 33

Stop them

slide-34
SLIDE 34

Scenario 3: You don’t provide an API and you should

slide-35
SLIDE 35

Stop being lazy

slide-36
SLIDE 36

APIs are for machines, Web Interfaces are for Humans

slide-37
SLIDE 37

If there’s no reason for a machine, don’t allow it*

slide-38
SLIDE 38

Most of the time scrapers are dumb

slide-39
SLIDE 39

<!-- <a href="gotcha"></a> -->
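
A browser never follows a link inside an HTML comment, but a scraper that pulls every `href` out of the raw markup will. The trap only pays off if something watches for requests to it. This is a minimal sketch of that server-side check, assuming requests arrive as plain objects with `ip` and `path` fields; the `/gotcha` path and the `inspect` helper are hypothetical names.

```javascript
// Paths that no human visitor has a reason to request (assumed to match the hidden link).
const TRAP_PATHS = new Set(["/gotcha"]);

// Clients that have touched a trap at least once.
const flagged = new Set();

// Returns true when the client behind this request has hit a trap, now or earlier.
function inspect(request) {
  if (TRAP_PATHS.has(request.path)) flagged.add(request.ip);
  return flagged.has(request.ip);
}

const first = inspect({ ip: "203.0.113.7", path: "/products" }); // ordinary page
const second = inspect({ ip: "203.0.113.7", path: "/gotcha" });  // followed the hidden link
const third = inspect({ ip: "203.0.113.7", path: "/products" }); // still flagged afterwards
console.log(first, second, third);
```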

slide-40
SLIDE 40

Start with simple

slide-41
SLIDE 41

Accept that a small portion of really intelligent scrapers will make it through

slide-42
SLIDE 42

Detection is similar to spiders

slide-43
SLIDE 43

In fact, a spider might precede a scraper

slide-44
SLIDE 44

But behavior deviation is still an acceptable detection mechanism

slide-45
SLIDE 45

Scanners

slide-46
SLIDE 46

Unlike scrapers and spiders, scanners are purely malicious

slide-47
SLIDE 47

They are looking for vulnerabilities in your application(s)

slide-48
SLIDE 48

They are also pretty easy to spot

slide-49
SLIDE 49

They deviate from normal behavior

slide-50
SLIDE 50

They submit obviously malicious data

slide-51
SLIDE 51

And they produce a lot of 404s
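
The 404 signal is easy to compute per client. This is a sketch, assuming log entries have already been reduced to `{ ip, status }` objects; the `likelyScanners` name, the minimum request count, and the 50% cutoff are hypothetical choices, and the traffic is made up.

```javascript
// Clients whose share of 404 responses exceeds maxRatio, once they have sent
// at least minRequests requests (to avoid judging on one or two hits).
function likelyScanners(entries, minRequests, maxRatio) {
  const stats = new Map(); // ip -> { total, notFound }
  for (const { ip, status } of entries) {
    const s = stats.get(ip) || { total: 0, notFound: 0 };
    s.total += 1;
    if (status === 404) s.notFound += 1;
    stats.set(ip, s);
  }
  const result = [];
  for (const [ip, s] of stats) {
    if (s.total >= minRequests && s.notFound / s.total > maxRatio) result.push(ip);
  }
  return result;
}

// Made-up traffic: one client probing for missing pages, one browsing normally.
const entries = [
  { ip: "203.0.113.7", status: 404 },
  { ip: "203.0.113.7", status: 404 },
  { ip: "203.0.113.7", status: 404 },
  { ip: "203.0.113.7", status: 200 },
  { ip: "198.51.100.2", status: 200 },
  { ip: "198.51.100.2", status: 200 },
  { ip: "198.51.100.2", status: 404 },
  { ip: "198.51.100.2", status: 200 },
];
const scanners = likelyScanners(entries, 4, 0.5);
console.log(scanners);
```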

slide-52
SLIDE 52

You want to block these*

slide-53
SLIDE 53

WAFs can help

slide-54
SLIDE 54

But prefer running a WAF in passive mode

slide-55
SLIDE 55

Other

slide-56
SLIDE 56

Fraud, (D)DoS, Espionage, etc.

slide-57
SLIDE 57

Still falls in the “malicious” category

slide-58
SLIDE 58

But behaves differently

slide-59
SLIDE 59

Usually has a focused target

slide-60
SLIDE 60

Almost obviously so

slide-61
SLIDE 61

Detection is a little harder here, but still follows the previous rules

slide-62
SLIDE 62

What to look for

slide-63
SLIDE 63

Anomalies

slide-64
SLIDE 64

Anything that lets you reject H0 (the null hypothesis)

slide-65
SLIDE 65

But first you have to define “normal”

slide-66
SLIDE 66

And what has to change to be “not normal”

slide-67
SLIDE 67

10.20.253.8 - - [08/Apr/2015:08:20:21 +0000] "POST /login HTTP/1.1" 200 267 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

slide-68
SLIDE 68

10.20.253.8 - - [08/Apr/2015:08:20:22 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

slide-69
SLIDE 69

10.20.253.8 - - [08/Apr/2015:08:20:23 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2083 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

slide-70
SLIDE 70

10.20.253.8 - - [08/Apr/2015:08:20:24 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

slide-71
SLIDE 71

What do you see?

slide-72
SLIDE 72

I see a carding attack

slide-73
SLIDE 73

!?!?

slide-74
SLIDE 74

10.20.253.8 - - [08/Apr/2015:08:20:21 +0000] "POST /login HTTP/1.1" 200 267 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

Login Request

slide-75
SLIDE 75

10.20.253.8 - - [08/Apr/2015:08:20:22 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

  • Add credit card to account #1
  • 1 sec delay

slide-76
SLIDE 76

10.20.253.8 - - [08/Apr/2015:08:20:23 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2083 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

  • 1 sec delay
  • Add credit card to account #2
  • FF 8 on Windows 7 or Bot?
slide-77
SLIDE 77

10.20.253.8 - - [08/Apr/2015:08:20:24 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

  • 1 sec delay
  • Add credit card to account #3
  • FF 8 on Windows 7 or Bot?
  • Plovdiv, Bulgaria

slide-78
SLIDE 78

10.20.253.8 - - [08/Apr/2015:08:20:24 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"

  • 1 sec delay
  • Add credit card to account #3
  • FF 8 on Windows 7 or Bot?
  • Plovdiv, Bulgaria
  • Doesn’t follow 302

slide-79
SLIDE 79

And this continues

slide-80
SLIDE 80

10,000 more times
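
The pattern in those log lines (the same client POSTing to the same path at a fixed cadence) can be surfaced mechanically. This is a sketch, assuming the log format shown on the slides with conventional spacing: combined log format plus a trailing quoted field holding the forwarded client address. The helper names and the `minRepeats` cutoff are hypothetical.

```javascript
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// "08/Apr/2015:08:20:22 +0000" -> milliseconds since the epoch (NaN if malformed).
function parseTime(stamp) {
  const m = /^(\d{2})\/(\w{3})\/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})$/.exec(stamp);
  if (!m || MONTHS.indexOf(m[2]) < 0) return NaN;
  const utc = Date.UTC(+m[3], MONTHS.indexOf(m[2]), +m[1], +m[4], +m[5], +m[6]);
  const offsetMin = (m[7] === "-" ? -1 : 1) * (+m[8] * 60 + +m[9]);
  return utc - offsetMin * 60000;
}

const LINE = /^\S+ \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "[^"]*" "([^"]*)"$/;

// Extract the fields this check needs; returns null for lines that do not match.
function parseLine(line) {
  const m = LINE.exec(line);
  if (!m) return null;
  return { time: parseTime(m[1]), method: m[2], path: m[3], status: +m[4], clientIp: m[5] };
}

// Group POSTs by client and path; report groups with at least minRepeats hits
// along with the gaps (in seconds) between consecutive requests.
function repeatedPosts(lines, minRepeats) {
  const groups = new Map();
  for (const line of lines) {
    const e = parseLine(line);
    if (!e || e.method !== "POST") continue;
    const key = `${e.clientIp} ${e.path}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(e.time);
  }
  const findings = [];
  for (const [key, times] of groups) {
    if (times.length < minRepeats) continue;
    times.sort((a, b) => a - b);
    const gaps = times.slice(1).map((t, i) => (t - times[i]) / 1000);
    findings.push({ key, count: times.length, gaps });
  }
  return findings;
}

const ua = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0";
const lines = [
  `10.20.253.8 - - [08/Apr/2015:08:20:21 +0000] "POST /login HTTP/1.1" 200 267 "-" "${ua}" "77.77.165.233"`,
  `10.20.253.8 - - [08/Apr/2015:08:20:22 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "${ua}" "77.77.165.233"`,
  `10.20.253.8 - - [08/Apr/2015:08:20:23 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2083 "-" "${ua}" "77.77.165.233"`,
  `10.20.253.8 - - [08/Apr/2015:08:20:24 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "${ua}" "77.77.165.233"`,
];
const findings = repeatedPosts(lines, 3);
console.log(findings);
```

Uniform gaps are the tell: a person adding cards by hand does not land exactly one second apart, request after request.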

slide-81
SLIDE 81

Behavior deviation

slide-82
SLIDE 82

Velocity

slide-83
SLIDE 83

Access pattern

slide-84
SLIDE 84

Time of day

slide-85
SLIDE 85

Geo Location

slide-86
SLIDE 86

HTTP verb distribution

slide-87
SLIDE 87

User Agent

slide-88
SLIDE 88

Header order
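
Header order can be checked against a per-browser profile. This is a sketch only: the `EXPECTED_ORDER` table is a hypothetical placeholder that would need to be built from traffic you have measured, and `headerOrderMatches` is an illustrative name.

```javascript
// Hypothetical expected header order per browser family; real profiles must be measured.
const EXPECTED_ORDER = {
  firefox: ["host", "user-agent", "accept", "accept-language", "accept-encoding"],
};

// true  -> the headers we know about arrived in the expected relative order
// false -> at least one known header arrived out of order
// null  -> no profile for this family, so no judgement
function headerOrderMatches(family, headerNames) {
  const expected = EXPECTED_ORDER[family];
  if (!expected) return null;
  // Keep only headers covered by the profile, preserving arrival order.
  const seen = headerNames
    .map((h) => h.toLowerCase())
    .filter((h) => expected.includes(h));
  // Their positions in the profile must be strictly increasing.
  let last = -1;
  for (const h of seen) {
    const pos = expected.indexOf(h);
    if (pos <= last) return false;
    last = pos;
  }
  return true;
}

const browserLike = headerOrderMatches("firefox", ["Host", "User-Agent", "Accept", "Cookie"]);
const scripted = headerOrderMatches("firefox", ["User-Agent", "Accept", "Host"]);
console.log(browserLike, scripted);
```

A client that claims one browser in its User-Agent but orders its headers differently is worth a closer look.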

slide-89
SLIDE 89

Success rate

slide-90
SLIDE 90
slide-91
SLIDE 91

Going deeper

slide-92
SLIDE 92

“Of course machines can't think as people do. A machine is different from a person. Hence, they think differently.”

- Alan Turing, The Imitation Game
slide-93
SLIDE 93

What’s our goal?

slide-94
SLIDE 94

Block robots as quickly as possible

slide-95
SLIDE 95

Embed detection scripts in your applications

slide-96
SLIDE 96

They should gather information and POST back to you
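
A minimal sketch of the gather-and-POST step follows. `collectSignals`, `report`, and the `/signals` endpoint are hypothetical names; the navigator and screen objects are passed in as arguments so the same code can be exercised outside a browser.

```javascript
// Gather a few properties the browser exposes to page scripts.
function collectSignals(nav, scr) {
  return {
    userAgent: nav.userAgent,
    platform: nav.platform,
    language: nav.language,
    resolution: [scr.width, scr.height],
  };
}

// Serialize the signals and hand them to a sender function.
// `send` is injected so the transport (beacon, fetch, XHR) is the caller's choice.
function report(endpoint, signals, send) {
  return send(endpoint, JSON.stringify(signals));
}

// In a browser this could be wired up as:
//   report("/signals", collectSignals(navigator, screen),
//          (url, body) => navigator.sendBeacon(url, body));
```

`navigator.sendBeacon` queues the POST without blocking the page; any other transport works as long as the server can tie the payload back to the session.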

slide-97
SLIDE 97

JS can do a lot

slide-98
SLIDE 98

developer.mozilla.org/en-US/docs/Web/API/Navigator

slide-99
SLIDE 99

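// User agent string as reported by the browser
var ua = navigator.userAgent;

// Screen size as [longer side, shorter side], so orientation does not matter
var resolution = function () {
  if (typeof screen === "undefined") { return undefined; }
  return (screen.height > screen.width)
    ? [screen.height, screen.width]
    : [screen.width, screen.height];
};

// Platform string; undefined when the browser does not report one
var platform = function () {
  if (navigator.platform) { return navigator.platform; }
};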

slide-100
SLIDE 100

You can also use Flash

slide-101
SLIDE 101

The details that you gather can make it really easy to spot a bot

slide-102
SLIDE 102

If it doesn’t execute it’s probably a bot*
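
Spotting clients whose script never ran is a set difference between sessions that loaded a page and sessions that reported back. This is a sketch, assuming both lists are available as arrays of session identifiers; `silentClients` and the identifiers are hypothetical.

```javascript
// Sessions that requested pages but never sent detection data back.
function silentClients(pageViews, beacons) {
  const reported = new Set(beacons);
  return [...new Set(pageViews)].filter((id) => !reported.has(id));
}

// Made-up session ids: "s2" loaded two pages and never executed the script.
const silent = silentClients(["s1", "s2", "s3", "s2"], ["s1", "s3"]);
console.log(silent);
```

Treat the result as a lead, not a verdict: script blockers and flaky connections produce silent sessions too.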

slide-103
SLIDE 103

But there’s a lot to examine

slide-104
SLIDE 104

User Agent

slide-105
SLIDE 105

Screen Resolution

slide-106
SLIDE 106

Cursor movement pattern

slide-107
SLIDE 107

What plugins are installed?

slide-108
SLIDE 108

Fingerprint(s)

slide-109
SLIDE 109

Store the fingerprints of known bots
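
A minimal sketch of such a store follows. The hash here is 32-bit FNV-1a over the string's character codes, used only as a small stand-in for a real fingerprinting library; `fingerprint`, `knownBots`, and the signal fields are hypothetical names.

```javascript
// 32-bit FNV-1a over the string's UTF-16 code units, returned as 8 hex digits.
// A stand-in for illustration, not a collision-resistant hash.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Canonicalize the signals (sorted keys) so key order does not change the result.
function fingerprint(signals) {
  const canonical = Object.keys(signals)
    .sort()
    .map((k) => `${k}=${JSON.stringify(signals[k])}`)
    .join("&");
  return fnv1a(canonical);
}

// Store the fingerprint of a client already judged to be a bot.
const knownBots = new Set();
knownBots.add(fingerprint({ userAgent: "curl/7.35.0", resolution: null, plugins: [] }));

// The same signals, arriving later in a different key order, are recognized.
const seenAgain = knownBots.has(
  fingerprint({ plugins: [], resolution: null, userAgent: "curl/7.35.0" })
);
console.log(seenAgain);
```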
slide-110
SLIDE 110

github.com/Valve/fingerprintjs

slide-111
SLIDE 111

Wrapping up

slide-112
SLIDE 112

We employ teams of people to manage the good robots

slide-113
SLIDE 113

Maybe it’s time to hire a team of people that manages the bad ones too

slide-114
SLIDE 114

We need to build systems that do this detection

slide-115
SLIDE 115
slide-116
SLIDE 116
slide-117
SLIDE 117

Reduce the noise

slide-118
SLIDE 118

Reduce the impact of attacks

slide-119
SLIDE 119

Improve confidence in your data

slide-120
SLIDE 120

References

  • github.com/repsheet
  • developer.mozilla.org/en-US/docs/Web/API/Navigator

  • github.com/Valve/fingerprintjs
  • github.com/Valve/fingerprintjs2
slide-121
SLIDE 121

Questions?

Please remember to evaluate via the GOTO Guide App