Evaluating the Long-term Effects of Parameters on the Characteristics of the Tranco Top Sites Ranking

Victor Le Pochat, Tom Van Goethem, Wouter Joosen

CSET 2019, 12 August 2019

SLIDE 1

Evaluating the Long-term Effects of Parameters on the Characteristics of the Tranco Top Sites Ranking

Victor Le Pochat, Tom Van Goethem, Wouter Joosen

CSET 2019, 12 August 2019

SLIDE 2

Security researchers rely on top websites rankings

“We perform a comprehensive analysis on Alexa’s Top 1 Million websites”

“We collected the benign pages from the Alexa top 20K websites”

“The list of websites we chose for our evaluation comes from the Alexa Top Sites service, the source widely used in prior research on Tor”

[Kon18, Kha18, Rim18]

SLIDE 3

Impact of rankings is not well-known

› Unannounced changes to methods
› Little agreement on most popular domains
› Potentially very volatile
› Easily manipulated
› Unknown effects in composition

[LeP19, Sch18, Rwe19]

Rankings can have a large impact on research

SLIDE 4

We proposed Tranco as a research-oriented ranking

› Transparent methods
› Reproducible rankings
› Improved properties

Daily updated default ranking + custom rankings

https://tranco-list.eu/

[Le Pochat et al. Tranco: a research-oriented top sites ranking hardened against manipulation. NDSS 2019]

SLIDE 5

We now evaluate Tranco's properties and parameters

› Comparison with existing rankings
› Anomalies
› Researcher assumptions
› Stability

SLIDE 6

We evaluate Tranco's properties and parameters

› Comparison with existing rankings
› Researcher assumptions
› Anomalies
› Stability

SLIDE 7

Tranco has some similarity with each component
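
One simple way to quantify how similar two rankings are is the share of domains that their top-k entries have in common. The following is a minimal sketch under that assumption, not necessarily the measurement used in the talk; the function name `top_k_overlap` and the example lists are hypothetical, and both lists are assumed to hold at least k entries.

```python
def top_k_overlap(list_a, list_b, k):
    """Fraction of the top-k domains of list_a that also appear in the top-k of list_b."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(list_a[:k])
    top_b = set(list_b[:k])
    return len(top_a & top_b) / k


# Made-up example rankings, most popular domain first.
combined = ["a.example", "b.example", "c.example", "d.example"]
component = ["b.example", "a.example", "e.example", "c.example"]
overlap = top_k_overlap(combined, component, 3)
```

Because the comparison is set-based, it ignores the order of domains within the top k; a rank-aware measure would be needed to capture reordering.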

SLIDE 8

Tranco contains domains popular in Chrome

SLIDE 9

We evaluate Tranco's properties and parameters

› Comparison with existing rankings
› Researcher assumptions
› Anomalies
› Stability

SLIDE 10

Responsive domains guarantee a sufficient sample

SLIDE 11

Some malicious domains are present, but can be filtered out using Google Safe Browsing
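
Filtering of this kind can be sketched as removing flagged domains from a ranking while preserving the order of the remaining ones. The example below assumes the set of flagged domains has already been obtained separately (for instance from Google Safe Browsing); the lookup itself is not shown, and `filter_ranking` and the example domains are hypothetical.

```python
def filter_ranking(ranked_domains, flagged):
    """Drop flagged domains while keeping the original order of the rest."""
    # Compare case-insensitively, since domain names are not case-sensitive.
    flagged_lower = {d.lower() for d in flagged}
    return [d for d in ranked_domains if d.lower() not in flagged_lower]


# Made-up ranking and made-up set of flagged domains.
ranking = ["a.example", "bad.example", "b.example"]
clean = filter_ranking(ranking, {"BAD.example"})
```

Note that the filtered list is shorter than the input, so a study that needs a fixed number of domains would have to start from a longer list.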

SLIDE 12

We evaluate Tranco's properties and parameters

› Comparison with existing rankings
› Researcher assumptions
› Anomalies
› Stability

SLIDE 13

Tranco is very stable compared to its components
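
Stability can be expressed as the share of domains in one day's list that were absent from the previous day's list. The sketch below illustrates that idea only; the exact metric behind the slide is not given here, and the name `daily_change` and the example data are hypothetical.

```python
def daily_change(previous, current):
    """Share of domains in `current` that were absent from `previous`."""
    if not current:
        return 0.0
    seen_before = set(previous)
    new_domains = [d for d in current if d not in seen_before]
    return len(new_domains) / len(current)


# Made-up lists for two consecutive days.
yesterday = ["a.example", "b.example", "c.example", "d.example"]
today = ["a.example", "b.example", "c.example", "e.example"]
change = daily_change(yesterday, today)
```

A lower value means a more stable list; applying the same function to lists a year apart would give a long-term rather than a day-to-day figure.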

SLIDE 14

Aggregating over 30 days leads to balanced stability
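
Aggregation over a window of daily lists can be sketched as summing a per-day score for each domain and re-sorting. The example assumes a Dowdall-style rule, in which a domain at rank r receives 1/r points per day; that scoring rule is an assumption for illustration and is not stated on the slide, and `aggregate_dowdall` is a hypothetical name.

```python
from collections import defaultdict


def aggregate_dowdall(daily_rankings):
    """Combine several daily rankings into one using 1/rank scoring."""
    scores = defaultdict(float)
    for ranking in daily_rankings:
        for position, domain in enumerate(ranking, start=1):
            scores[domain] += 1.0 / position
    # Highest total score first; ties are broken alphabetically.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Three made-up daily lists; a real window would hold 30 of them.
days = [
    ["a.example", "b.example", "c.example"],
    ["b.example", "a.example", "c.example"],
    ["a.example", "c.example", "b.example"],
]
aggregated = aggregate_dowdall(days)
```

A longer window smooths out single-day fluctuations but reacts more slowly to genuine changes in popularity, which is the trade-off the window length has to balance.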

SLIDE 15

Smaller subsets see higher stability over one year

SLIDE 16

We evaluate Tranco's properties and parameters

› Comparison with existing rankings
› Researcher assumptions
› Anomalies
› Stability

SLIDE 17

Component rankings experience anomalies

SLIDE 18

Tranco is somewhat affected, but impact is reduced

SLIDE 19

We evaluate Tranco's properties and parameters

› Comparison with existing rankings
› Anomalies
› Researcher assumptions
› Stability

SLIDE 20

We evaluate Tranco's properties and parameters

› Similar to component and external lists
› Anomalies
› Stability
› Researcher assumptions

SLIDE 21

We evaluate Tranco's properties and parameters

› Similar to component and external lists
› Mostly responsive and benign
› Anomalies
› Stability

SLIDE 22

We evaluate Tranco's properties and parameters

› Similar to component and external lists
› Mostly responsive and benign
› Aggregation improves stability
› Anomalies

SLIDE 23

We evaluate Tranco's properties and parameters

› Similar to component and external lists
› Impact of anomalies is reduced
› Mostly responsive and benign
› Aggregation improves stability

SLIDE 24

We make researchers aware of Tranco's properties

› 30-day aggregation yields good stability trade-off
› Apply filters where appropriate
› Use full list to retain at least 1M domains
› Properties improve slightly for smaller subsets
› Properly reference the specific list used

Default parameters → representative set of domains
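
Working with a smaller subset of a downloaded list can be sketched as reading the file and stopping at a rank cut-off. The example assumes a headerless CSV with one `rank,domain` row per line, sorted by rank; that layout, the function name `read_ranking`, and the sample data are assumptions for illustration.

```python
import csv
import io


def read_ranking(csv_file, limit=None):
    """Read domains from rank,domain rows, optionally keeping only the top `limit`."""
    domains = []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        rank, domain = int(row[0]), row[1]
        if limit is not None and rank > limit:
            break  # rows are assumed to be sorted by rank
        domains.append(domain)
    return domains


# In-memory stand-in for a downloaded list file.
sample = io.StringIO("1,a.example\n2,b.example\n3,c.example\n")
top_two = read_ranking(sample, limit=2)
```

Recording which specific list the file came from alongside the results keeps the subset reproducible.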

SLIDE 25

Download the Tranco ranking: https://tranco-list.eu/

SLIDE 26

Thank you!

Victor.LePochat@cs.kuleuven.be @VictorLePochat

SLIDE 27

References

1. [Kon18] Konoth, R.K., Vineti, E., Moonsamy, V., Lindorfer, M., Kruegel, C., Bos, H., and Vigna, G., “MineSweeper: An In-depth Look into Drive-by Cryptocurrency Mining and Its Defense,” in Proc. CCS, 2018, pp. 1714-1730. DOI: 10.1145/3243734.3243858
2. [Kha18] Kharraz, A., Robertson, W., and Kirda, E., “Surveylance: Automatically Detecting Online Survey Scams,” in Proc. SP, 2018, pp. 70-86. DOI: 10.1109/SP.2018.00044
3. [Rim18] Rimmer, V., Preuveneers, D., Juarez, M., Van Goethem, T., and Joosen, W., “Automated website fingerprinting through deep learning,” in Proc. NDSS, 2018. DOI: 10.14722/ndss.2018.23105
4. [LeP19] Le Pochat, V., Van Goethem, T., Tajalizadehkhoob, S., Korczyński, M., and Joosen, W., “Tranco: a research-oriented top sites ranking hardened against manipulation,” in 26th Annual Network and Distributed System Security Symposium, February 2019. DOI: 10.14722/ndss.2019.23386
5. [Sch18] Scheitle, Q., Hohlfeld, O., Gamba, J., Jelten, J., Zimmermann, T., Strowes, S.D., and Vallina-Rodriguez, N., “A long way to the top: Significance, structure, and stability of Internet top lists,” in Internet Measurement Conference, 2018, pp. 478-493.
6. [Rwe19] Rweyemamu, W., Lauinger, T., Wilson, C., Robertson, W., and Kirda, E., “Clustering and the weekend effect: Recommendations for the use of top domain lists in security research,” in 20th International Conference on Passive and Active Measurement, 2019, pp. 161-177.