Adversarial Machine Learning, Daniel Lowd, University of Oregon


Sep 10, 2023


SLIDE 1

Adversarial Machine Learning

Daniel Lowd
University of Oregon

SLIDE 2

Example: Spam Filtering

1. From: spammer@example.com
   Cheap mortgage now!!!

2. Feature Weights
   cheap = 1.0
   mortgage = 1.5

3. Total score = 2.5 > 1.0 (threshold)

Spam

SLIDE 3

Example: Spammers Adapt

1. From: spammer@example.com
   Cheap mortgage now!!! Eugene Oregon

2. Feature Weights
   cheap = 1.0
   mortgage = 1.5
   Eugene = -1.0
   Oregon = -1.0

3. Total score = 0.5 < 1.0 (threshold)

OK
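The arithmetic on the two example slides can be reproduced with a minimal sketch. The weights and threshold come from the slides; the tokenization (lowercasing, stripping punctuation, counting each distinct word once) and the function names are assumptions made for illustration.

```python
# Feature weights and threshold as shown on the slides.
WEIGHTS = {"cheap": 1.0, "mortgage": 1.5, "eugene": -1.0, "oregon": -1.0}
THRESHOLD = 1.0


def score(message: str) -> float:
    """Sum the weights of the distinct known words in the message."""
    # Assumed tokenization: split on whitespace, strip punctuation, lowercase.
    words = {w.strip("!.,?").lower() for w in message.split()}
    return sum(WEIGHTS.get(w, 0.0) for w in words)


def classify(message: str) -> str:
    """Label a message Spam when its score exceeds the threshold, else OK."""
    return "Spam" if score(message) > THRESHOLD else "OK"


original = "Cheap mortgage now!!!"
adapted = "Cheap mortgage now!!! Eugene Oregon"
print(score(original), classify(original))  # 2.5 Spam
print(score(adapted), classify(adapted))    # 0.5 OK
```

Adding two words with negative weights is enough to pull the same spam content below the threshold.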

SLIDE 4

Are Linear Classifiers Vulnerable?

Adversary wants to find the best spam email that will go through the filter.
In general: lowest-cost instance classified as negative, for some cost function and some set of classifiers.
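To make "lowest-cost instance classified as negative" concrete, here is a brute-force sketch over a tiny binary feature space. The classifier weights reuse the spam example; the per-feature change costs and all function names are hypothetical, chosen so that deleting the spam words is expensive for the adversary and adding extra words is cheap.

```python
from itertools import product

# Toy linear classifier over four binary word features (weights from the spam example).
VOCAB = ["cheap", "mortgage", "eugene", "oregon"]
WEIGHTS = [1.0, 1.5, -1.0, -1.0]
THRESHOLD = 1.0
# Hypothetical cost of changing each feature away from the adversary's ideal instance.
CHANGE_COSTS = [5, 5, 1, 1]


def is_positive(x):
    """Positive (spam) when the weighted sum exceeds the threshold."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) > THRESHOLD


def cost(x, ideal):
    """Assumed cost function: total cost of the features that differ from ideal."""
    return sum(c for c, a, b in zip(CHANGE_COSTS, x, ideal) if a != b)


def lowest_cost_negative(ideal):
    """Enumerate every instance and keep the cheapest one classified as negative."""
    negatives = [x for x in product((0, 1), repeat=len(ideal)) if not is_positive(x)]
    return min(negatives, key=lambda x: cost(x, ideal))


ideal_spam = (1, 1, 0, 0)  # contains "cheap" and "mortgage" only
best = lowest_cost_negative(ideal_spam)
print(best, cost(best, ideal_spam))  # (1, 1, 1, 1) 2
```

Enumeration is exponential in the number of features and requires knowing the classifier, so it only serves to define the target that a query-based attack tries to approximate.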

[Figure: plots over features X1 and X2 showing positive (+) and negative (-) regions; in one plot the decision boundary is unknown and marked with question marks.]

SLIDE 5

Attacking Linear Classifiers

• With continuous features, find the optimal point by doing a line search in each dimension.

• With binary features, take a negative instance (non-spam) and reduce its cost until it is within a factor of 2 of the minimum cost.

[Figure: plot over X1 and X2 showing the instances xa and x-; diagram of feature weights wi, wj, wk, wl, wm and the cost c(x).]

[Lowd & Meek, 2005]
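The continuous-feature case can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the cited paper: the adversary starts from a positive instance, may only issue membership queries, pays a hypothetical per-unit cost for moving along each axis, and searches within an assumed `max_step`. All names here are invented for the example.

```python
def line_search(is_positive, x, dim, direction, max_step=100.0, tol=1e-6):
    """Smallest step along one axis that makes x negative, or None if max_step is not enough."""
    def moved(t):
        y = list(x)
        y[dim] += direction * t
        return y

    if is_positive(moved(max_step)):
        return None
    lo, hi = 0.0, max_step  # invariant: moved(lo) is positive, moved(hi) is negative
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_positive(moved(mid)):
            lo = mid
        else:
            hi = mid
    return hi


def cheapest_evasion(is_positive, x, unit_costs):
    """Try both directions in every dimension; return (cost, dim, signed step) of the cheapest."""
    best = None
    for dim in range(len(x)):
        for direction in (-1.0, 1.0):
            step = line_search(is_positive, x, dim, direction)
            if step is not None:
                candidate = (unit_costs[dim] * step, dim, direction * step)
                if best is None or candidate < best:
                    best = candidate
    return best


def hidden_filter(x):
    """Hypothetical linear classifier the adversary can query but not inspect."""
    return 2.0 * x[0] + 0.5 * x[1] > 1.0


cost, dim, delta = cheapest_evasion(hidden_filter, [1.0, 1.0], [1.0, 1.0])
print(round(cost, 3), dim, round(delta, 3))
```

Each line search is a binary search, so the number of queries grows with the number of dimensions and the logarithm of the search range divided by the tolerance.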

SLIDE 6

Experimental Results

Realistic spam filter trained from Hotmail data.
How many words do you have to change to get the median spam past the filter?
How many queries does it take?

Attack type   Naïve Bayes words (queries)   Logistic reg. words (queries)
Active        31* (23,000)                  12* (9,000)
Passive       112 (0)                       149 (0)

[Lowd & Meek, 2005]

SLIDE 7

Evading Classifiers: Ongoing Work

Which classes of non-linear classifiers can we efficiently evade, and under what assumptions?

[Figure: two plots over features X1 and X2; one shows regions labeled C1 and C2.]

SLIDE 8

Robust Machine Learning

Scenario: Adversary knows our classifier and can maliciously modify data to attack.
Goal: Select the best classifier, assuming the worst adversarial manipulation. (Zero-sum Stackelberg game.)
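One common way to write this goal is as a min-max problem. The notation below is an assumption introduced for illustration and does not appear on the slides: w denotes the classifier parameters, (x_i, y_i) the training examples, A(x_i) the set of manipulated versions of x_i available to the adversary, and ℓ a loss function.

```latex
\[
w^{*} \;=\; \arg\min_{w} \; \sum_{i} \; \max_{\tilde{x}_i \in A(x_i)} \; \ell\bigl(w;\, \tilde{x}_i,\, y_i\bigr)
\]
```

The learner commits to w first and the adversary then picks the worst manipulation for that w, which is the leader-follower structure of a Stackelberg game; it is zero-sum because the adversary maximizes the same loss the learner minimizes.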

SLIDE 9

Robust Machine Learning

Previous work: Linear classifiers
Our work: Relational domains

Examples: Web spam, eBay fraud, etc.

[Figure: example graph with nodes labeled Yes or No.]

[Brin & Page, 1998; Chakrabarti et al., 1998; Abernethy et al., 2008]

Image credit: [Chau et al., 2006]

SLIDE 10

Problem Formulation

Given: A graph with nodes, attributes, and edges (e.g., web pages, words, and links).

Assume: Adversary can add or remove up to k attributes (e.g., words).

[Figure: graph of four nodes with labels Y1 to Y4, each node i having attributes Xi,1, Xi,2, Xi,3.]

[Torkamani & Lowd, 2013]
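The "up to k attributes" adversary can be illustrated on a single node. The sketch below is a deliberate simplification: it scores one node with a plain linear model and ignores edges entirely, so it is not the relational model from the cited work. The weights and the function name are hypothetical.

```python
def worst_case_score(weights, x, k):
    """Lowest score reachable by flipping at most k binary attributes of x."""
    base = sum(w * xi for w, xi in zip(weights, x))
    # Flipping attribute i changes the score by w * (1 - 2 * xi):
    # removing a present attribute subtracts w, adding an absent one adds w.
    deltas = sorted(w * (1 - 2 * xi) for w, xi in zip(weights, x))
    # The adversary uses only flips that lower the score, most harmful first.
    reduction = sum(d for d in deltas[:k] if d < 0)
    return base + reduction


weights = [1.0, 1.5, -1.0, -1.0]  # hypothetical attribute weights
x = [1, 1, 0, 0]                  # attributes currently present on the node
for k in range(3):
    print(k, worst_case_score(weights, x, k))
```

Because the score is linear and the flips are independent here, a greedy choice of the k most damaging flips is optimal; with edges in the model the flips interact through neighboring labels, which is what makes the relational setting harder.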

SLIDE 11

Technical Approach

Start with associative Markov networks, a special case of a structural SVM.

Modify the quadratic program by "plugging in" the adversary's worst-case modification.

Result: Optimal parameters in polynomial time (for an assumed model of the adversary).

[Taskar et al., 2004] [Torkamani & Lowd, 2013]

SLIDE 12

Results: Political Blogs (Tuned for 10% adversary)

[Figure: classification error (%) versus strength of adversary (%) for SVM, SVMINV, AMN, and CACC.]

[Torkamani & Lowd, 2013]

SLIDE 13

Adversarial Relational Learning: Ongoing Work

Non-associative links (e.g., fraudsters and accomplices)

Adversaries that add and remove links (e.g., link farms on the Web)

Real-world evaluation with Web spam

SLIDE 14

Summary

Machine learning is increasingly applied to security domains where adversaries will try to defeat it.

To assess these new risks, we need a better understanding of ML vulnerabilities.

To reduce these risks, we need more robust ML methods.