SoBigData day @EUI, Florence, 11-10-2017
The rise of novel Twitter social spambots
Marinella Petrocchi, IIT-CNR, Pisa, Italy
SPAMBOTS & SOCIAL NETWORKS AN OPEN PROBLEM
Spambots: (semi-)automated accounts with (often) harmful intentions
Examples: spreading misinformation, stealing personal data, manipulating the stock market, infiltrating political discourse
THE RISE OF THE SOCIAL BOTS
They escape detection techniques by evolving. On Twitter:
- fake followers (until 2012)
- 1st evolution (2012-2014)
- current (?) wave (2015-2017)
New spambots are almost indistinguishable from genuine accounts
- E. Ferrara, O. Varol, C. Davis, F. Menczer, and A. Flammini, “The rise of social bots,” Communications of the ACM, vol. 59, no. 7, pp. 96–104, 2016
FAKE FOLLOWERS
NAIVE FAKE ACCOUNTS WERE EASY TO BUY
SOCIAL SPAMBOTS
The new wave
Indistinguishable from genuine accounts when analyzed one by one
Approach: analyze the online behavior of large groups of users, with the goal of detecting possible spambots among them
SOCIAL SPAMBOTS
MODELING THE ONLINE BEHAVIOR OF USERS
The idea
Behaviour: the sequence of actions performed by an account
Digital DNA: each type of action is associated with a character (e.g., A, B, C)
The online behaviour of an account is modeled as a sequence of characters (i.e., a string, similar to biological DNA) according to the sequence of actions performed by that account
Encoding: T = tweet, R = retweet, P = reply
[Figure: timeline of a Twitter account; action labels T R R R R P; digital DNA string …RRTRPR]
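The encoding can be sketched in a few lines of Python. This is only an illustration: the action names ("tweet", "retweet", "reply"), the `ACTION_TO_BASE` table and the `encode_timeline` helper are hypothetical, and the sample timeline is made up rather than taken from a real account.

```python
# Hypothetical mapping from action type to digital DNA character,
# following the T / R / P encoding on the slide.
ACTION_TO_BASE = {"tweet": "T", "retweet": "R", "reply": "P"}


def encode_timeline(actions):
    """Turn a chronological list of actions into a digital DNA string."""
    return "".join(ACTION_TO_BASE[action] for action in actions)


# Made-up timeline, oldest action first.
timeline = ["tweet", "retweet", "retweet", "retweet", "retweet", "reply"]
dna = encode_timeline(timeline)
print(dna)  # TRRRRP
```

A richer alphabet (for example, one character per content type) would only require a different mapping table.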
MODELING THE ONLINE BEHAVIOR OF USERS
The idea
…RRTRPRTPRRPRTPRPTPRRTRPR
…RPRTPTTRPTRPTPRRRRTPPRPP
…TTTRRRPPTPRPTPRTRPTRRRTP
…PRTRPRTPPPPRTPRRPRTPPRRT
…TRTRPRTPRRPRTPRPTPTPPRTT
…TRPPRTPPTRPPTPRRTTTPPRPR
DIGITAL DNA VS BIOLOGIC DNA
Digital DNA: T tweet, R retweet, P reply
Biological DNA: A adenine, G guanine, T thymine, C cytosine
…AGTCTCCATTTTCAGGTCGTA
…GTTTAAGATCGCCTCATCACC
…AGGCAATTCGCCTGAACTGG
…AGTCTCGATCCTTTCCTCGTT
…AAAATCGAACGCCTTGTCGG
…ATTCTCCATCGCCTAAACAAC
…TRRRPRRTRRPRTPRPTPRRTRPR
…RPRTPTTRRRPRRTPRRRRTPPRP
…TTTRRRPRRRPRRTRTRPTRRRTP
…PRTRPRTPPPPRTPRRRRRPRRTR
SIMILARITY BETWEEN DIGITAL DNA SEQUENCES
Intuition: automated accounts (spambots) have similar digital DNA sequences
LCS (longest common substring): the longest substring shared by N digital DNA sequences
RRRPRRT
(length: 7 characters)
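A minimal sketch of the LCS computation on the four sequence fragments shown on this slide. It uses a naive approach (a binary search on the substring length, with set intersections of fixed-length substrings), not the linear-time algorithm cited on the slide. The function names are hypothetical, and only the visible part of each sequence (after the "…") is used.

```python
def common_substrings(seqs, k):
    """Return the set of length-k substrings that occur in every sequence."""
    shared = None
    for s in seqs:
        grams = {s[i:i + k] for i in range(len(s) - k + 1)}
        shared = grams if shared is None else shared & grams
        if not shared:
            break  # nothing in common at this length
    return shared or set()


def longest_common_substring(seqs):
    """Naive LCS of N strings: binary search on the substring length.

    Valid because a common substring of length k implies one of length k - 1.
    """
    lo, hi = 0, min(len(s) for s in seqs)
    best = ""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        found = common_substrings(seqs, mid)
        if found:
            best = min(found)  # deterministic pick if several candidates tie
            lo = mid
        else:
            hi = mid - 1
    return best


# Visible fragments of the four digital DNA sequences on the slide.
sequences = [
    "TRRRPRRTRRPRTPRPTPRRTRPR",
    "RPRTPTTRRRPRRTPRRRRTPPRP",
    "TTTRRRPRRRPRRTRTRPTRRRTP",
    "PRTRPRTPPPPRTPRRRRRPRRTR",
]
lcs = longest_common_substring(sequences)
print(lcs, len(lcs))  # RRRPRRT 7
```

For large groups of long timelines, a suffix-array or suffix-tree method such as the one cited on the slide would be the appropriate choice; the sketch above favours readability.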
Spambots characterization
- M. Arnold and E. Ohlebusch, “Linear time algorithms for generalizations of the longest common substring problem,” Algorithmica, vol. 60, no. 4, pp. 806–818, 2011
LCS: SPAMBOTS VS HUMANS
LCS: similarity measure
LCS: SPAMBOTS + HUMANS (MIXED GROUP)
Spambots detection
1. accounts with high similarity
2. steep decrease in similarity
3. accounts with low similarity
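The three regions above can be illustrated with a small sketch. It assumes that the similarity curve gives, for each k, the length of the longest substring shared by at least k accounts, and that the steepest drop separates the high-similarity group from the rest. Both the definition and the helper names (`lcs_curve`, `steepest_drop`) are assumptions made for illustration, and the six toy sequences are invented, not taken from the evaluation datasets.

```python
from collections import Counter


def lcs_curve(seqs):
    """curve[k] = length of the longest substring shared by >= k sequences."""
    n = len(seqs)
    curve = {k: 0 for k in range(2, n + 1)}
    for length in range(1, max(len(s) for s in seqs) + 1):
        support = Counter()  # substring -> number of sequences containing it
        for s in seqs:
            for gram in {s[i:i + length] for i in range(len(s) - length + 1)}:
                support[gram] += 1
        top = max(support.values())
        if top < 2:
            break  # no substring of this length is shared any more
        for k in range(2, top + 1):
            curve[k] = length
    return curve


def steepest_drop(curve):
    """Return k such that the fall from curve[k] to curve[k + 1] is largest."""
    ks = sorted(curve)
    return max(ks[:-1], key=lambda k: curve[k] - curve[k + 1])


# Toy data: three bot-like sequences sharing "RRRPRRT", three unrelated ones.
toy = [
    "TRRRPRRTP", "PRRRPRRTT", "RRRPRRTRP",
    "TPTPTP", "PPTTPT", "TTPPRR",
]
curve = lcs_curve(toy)
print(curve)                 # {2: 7, 3: 7, 4: 3, 5: 1, 6: 1}
print(steepest_drop(curve))  # 3
```

In this toy case the curve stays at 7 for the first three accounts and then falls sharply, so the accounts sharing the long substring would be the spambot candidates.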
DETECTION TECHNIQUES
1. Unsupervised approach
2. Supervised approach
DETECTION TECHNIQUES
DATASETS
Evaluation datasets:
1. Mixed1 (1982 accounts): 50% Bot1, 50% human
2. Mixed2 (928 accounts): 50% Bot2, 50% human
EVALUATION
- C. Yang, R. Harkreader, and G. Gu, “Empirical evaluation and new design for fighting evolving Twitter spammers,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 8, pp. 1280–1293, 2013
- F. Ahmed and M. Abulaish, “A generic statistical approach for spam detection in online social networks,” Computer Communications, vol. 36, no. 10, pp. 1120–1129, 2013
- Z. Miller, B. Dickinson, W. Deitrick, W. Hu, and A. H. Wang, “Twitter spammer detection using data stream clustering,” Information Sciences, vol. 260, pp. 64–73, 2014
TAKE-HOME MESSAGES
- New evolutionary wave: social spambots
- Current techniques fail to detect them
- Detection via digital DNA analysis: effective and efficient (lightweight features, no graphs, linear-complexity algorithms)
- Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi, Maurizio Tesconi: “The Paradigm Shift of social spambots: Evidence, theories, and tools for the arms race”, WWW 2017
- Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi, Maurizio Tesconi: “Social Fingerprinting: Detection of spambot groups through DNA-inspired behavioral modeling”, IEEE Transactions on Dependable and Secure Computing, 2017
- Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi, Maurizio Tesconi: “Exploiting digital DNA for the analysis of similarities in Twitter behaviours”, IEEE Data Science and Analytics, 2017
THANK YOU!
Questions?