Do we Read what we Share? Analyzing the Click Dynamic of News Articles Shared on Twitter


slide-1
SLIDE 1

Do we Read what we Share? Analyzing the Click Dynamic of News Articles Shared on Twitter

Jesper Holmstrom, Daniel Jonsson, Filip Polbratt, Olav Nilsson, Linnea Lundstrom, Sebastian Ragnarsson, Anton Forsberg, Karl Andersson, Niklas Carlsson

  • Proc. IEEE/ACM ASONAM, Vancouver, Canada, Aug. 2019.

slide-2
SLIDE 2

Motivation

  • News and information spread over social media can have a big impact on thoughts, beliefs, and opinions
  • Important to understand the sharing dynamics on these forums …
  • Most studies trying to capture these dynamics rely only on Twitter’s open APIs to measure how frequently articles are shared/retweeted
  • They do not capture how many users actually read the articles linked in these tweets …

… here, we instead focus on the clicks leading to linked articles … and measure + analyze these over time.

slide-4
SLIDE 4

Contributions at a glance

  • Two main contributions
      • A novel longitudinal measurement framework
      • The first analysis of how the number of clicks changes over time
  • Example observations from temporal analysis
      • Noticeable differences in the relative number of clicks vs. retweets occurring at different parts of the news cycle
      • Retweet data often underestimates biases towards clicking popular links/articles
      • Significant differences in the clicks-per-tweets ratio, including (alarmingly) many links with more retweets than clicks
      • Significant age biases, including relatively high initial click rates for articles younger than a week and much more stable click rates for older and long-term popular articles
      • Insights into how age-dependent popularity skews and age-dependent churn impact the clicks observed by different classes of links
  • We validate our findings (and identify invariants) using both data from May 2017 and a per-website-based analysis

slide-19
SLIDE 19

Methodology

slide-20
SLIDE 20

Methodology

  • Collection of Bitly links to 7 pre-selected news websites
      • 20-minute blocks (with latest tweets) collected over 7 days
  • Longitudinal click statistics
      • Each block scheduled for collection every 2 hours for 5 days
  • Careful sample frequency selection
      • To stay within rate limits, not all links sampled each time
      • Three sets per block: “top”, “random”, and “rest”
      • Sets sampled at different frequencies
      • Sets and their sizes designed to stay within rate limits (details/equations in paper)
  • Complementing tweet statistics
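
The collection schedule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' framework. The block length, collection period, sampling interval, and tracking period come from the slide. The start date, the function names, the `popularity` score used to pick the "top" set, and the set sizes are hypothetical; the actual set sizes and sampling frequencies are derived in the paper.

```python
import random
from datetime import datetime, timedelta

# Values stated on the slide.
BLOCK_LENGTH = timedelta(minutes=20)
COLLECTION_PERIOD = timedelta(days=7)
SAMPLE_INTERVAL = timedelta(hours=2)
TRACKING_PERIOD = timedelta(days=5)


def block_starts(first_block: datetime) -> list[datetime]:
    """Start times of all 20-minute blocks within the 7-day collection period."""
    n_blocks = COLLECTION_PERIOD // BLOCK_LENGTH
    return [first_block + i * BLOCK_LENGTH for i in range(n_blocks)]


def sample_times(block_start: datetime) -> list[datetime]:
    """Times at which the links of one block are scheduled for click collection."""
    n_intervals = TRACKING_PERIOD // SAMPLE_INTERVAL
    # Assumption: the first sample is taken when the block is created,
    # so there is one more sample point than there are 2-hour intervals.
    return [block_start + i * SAMPLE_INTERVAL for i in range(n_intervals + 1)]


def split_block(links, popularity, top_size, random_size, seed=0):
    """Split one block into "top", "random", and "rest" sets.

    Assumption: "top" means highest `popularity` score (hypothetical input);
    `top_size` and `random_size` are placeholders for the sizes in the paper.
    """
    ranked = sorted(links, key=lambda link: popularity[link], reverse=True)
    top, remaining = ranked[:top_size], ranked[top_size:]
    rng = random.Random(seed)
    chosen = set(rng.sample(remaining, min(random_size, len(remaining))))
    return {
        "top": top,
        "random": [link for link in remaining if link in chosen],
        "rest": [link for link in remaining if link not in chosen],
    }
```

With these numbers, one week of 20-minute blocks gives 504 blocks, and each block is visited at 61 sample points if the first sample is taken immediately. Sampling the three sets at different frequencies is what keeps the request volume below the rate limit.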

slide-29
SLIDE 29

Clicks over time

slide-30
SLIDE 30

Clicks over time

  • 80% of all observed clicks occur within the first 24 hours
  • Large variations across links
  • Tail of less popular links sees a more even spread of clicks
  • Suggests that studies focusing only on popular articles (or tweets) may underestimate the duration of the news cycle and the time that many news articles are read after they are first published
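
As a rough illustration of the first-24-hours statistic, the sketch below computes the share of a link's observed clicks that arrive within a given horizon. It assumes each link is described by a time-sorted list of `(hours_since_first_sample, cumulative_clicks)` pairs; that representation, the function name, and the sample numbers are hypothetical and not taken from the paper.

```python
def fraction_within(samples, horizon_hours=24.0):
    """Share of the finally observed clicks that occurred within the horizon.

    `samples` is a time-sorted list of (hours, cumulative_clicks) pairs.
    """
    total = samples[-1][1]
    if total == 0:
        return 0.0
    # Cumulative counts never decrease, so the largest count seen at or
    # before the horizon is the number of clicks accumulated by then.
    early = max(
        (clicks for hours, clicks in samples if hours <= horizon_hours),
        default=0,
    )
    return early / total


# Made-up examples: a popular link that peaks early and a tail link
# whose clicks are spread more evenly over the 5-day tracking period.
popular_link = [(0, 0), (2, 500), (24, 800), (120, 1000)]
tail_link = [(0, 0), (24, 10), (72, 25), (120, 40)]
print(fraction_within(popular_link), fraction_within(tail_link))
```

Computing this per link, rather than only over the aggregate, is what exposes the large variation across links noted above.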

slide-34
SLIDE 34

Comparison with tweet data

  • Correlations between retweets and clicks, but significant differences
  • Generally larger click volumes
      • Expected: Only a subset of readers retweet what they read …
      • We may also miss earlier tweets (resulting in clicks)
  • Significant set of tweets with more retweets than clicks
      • Some people (or bots) retweet the links without actually clicking them
      • This is clearly not good, as human sanity checking is an important tool to reduce the spreading of fake news
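
The comparison above boils down to a per-link ratio plus a filter for links that were retweeted more often than they were clicked. The following is a minimal sketch with made-up numbers; the `LinkStats` record and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkStats:
    name: str       # placeholder label for a shortened link
    clicks: int     # clicks observed for the link
    retweets: int   # retweets observed for tweets carrying the link


def clicks_per_retweet(link: LinkStats) -> Optional[float]:
    """Clicks divided by retweets, or None when the link was never retweeted."""
    if link.retweets == 0:
        return None
    return link.clicks / link.retweets


def more_retweets_than_clicks(links: list[LinkStats]) -> list[LinkStats]:
    """Links that were shared more often than they were opened."""
    return [link for link in links if link.retweets > link.clicks]


# Made-up sample: only link-b is retweeted more often than it is clicked.
sample = [
    LinkStats("link-a", clicks=1200, retweets=300),
    LinkStats("link-b", clicks=40, retweets=95),
    LinkStats("link-c", clicks=10, retweets=0),
]
flagged = more_retweets_than_clicks(sample)
```

A ratio below 1 only shows that sharing outpaced clicking on the measured link. As the slide notes, earlier tweets may have been missed, so the ratio should be read as an indicator rather than an exact count.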

slide-39
SLIDE 39

Comparison with tweet data

  • Clicks typically progress somewhat faster (at start) than retweets
  • Also some subtle differences (see paper) that indicate that retweet data underestimates the bias towards popular links/articles
  • These links are often highly shared in the beginning, but also accumulate reads/clicks (at a much slower rate) later …
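
One simple way to compare how fast clicks and retweets progress is to ask when each cumulative series first reaches a given share of its final value. The sketch below is an assumption-laden illustration, not the analysis used in the paper; the series format (time-sorted `(hours, cumulative_count)` pairs), the function name, and the numbers are hypothetical.

```python
def time_to_fraction(samples, fraction=0.5):
    """First sample time at which the cumulative count reaches the given
    fraction of its final value, or None if it never does."""
    final = samples[-1][1]
    if final == 0:
        return None
    for hours, count in samples:
        if count >= fraction * final:
            return hours
    return None


# Made-up series for one link: clicks reach half of their final value
# at an earlier sample point than retweets do.
clicks = [(0, 0), (2, 60), (4, 80), (24, 100)]
retweets = [(0, 0), (2, 20), (4, 30), (24, 50)]
clicks_half = time_to_fraction(clicks)
retweets_half = time_to_fraction(retweets)
```

Because both series are normalized by their own final value, the comparison reflects the shape of the curves rather than the absolute volumes.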

slide-43
SLIDE 43

Impact of age

  • “Older” articles accumulate clicks at a much more uniform rate over the measurement duration
  • For “older” articles, most of the clicks are associated with articles that do not appear to fade (as quickly) in popularity
  • The opposite is true for the “overall” set and for “younger” articles
  • Again, the “older” set contains more long-term popular articles

Figure annotations: “flatter”, “earlier peak”

slide-47
SLIDE 47

Validation: year-old data (from 2017)

  • Identify invariants
      • The early peaks, the skew towards a subset of highly popular links, and the differences between links to articles of different age appear invariant

Figure panels: May 2018, May 2017 (overall and per-age group)

slide-50
SLIDE 50

Age and long-term churn

slide-51
SLIDE 51

Age-dependent popularity skew

  • While some differences within the two age-based sub-classes, the

most substantial differences are between the categories themselves

  • For the two “oldest” classes, the CCDFs shows relatively straight-line

behavior, suggesting a power-law-like popularity skew

  • For the “younger” articles, there is relatively higher popularity churn
slide-52
SLIDE 52

Age-dependent popularity skew

  • While some differences within the two age-based sub-classes, the

most substantial differences are between the categories themselves

  • For the two “oldest” classes, the CCDFs shows relatively straight-line

behavior, suggesting a power-law-like popularity skew

  • For the “younger” articles, there is relatively higher popularity churn

“young” (≤ 1 week)

slide-53
SLIDE 53

Age-dependent popularity skew

“old” (> 1 week)

slide-56
SLIDE 56

Age-dependent churn

  • Increasing churn among both the “younger” and “older” articles
  • In contrast, for YouTube videos, long-term popularity has been found to reduce the churn over time [Borghol et al., 2011]
  • A short initial interval has been found to be a good predictor of the clicks over the remainder of the time period

slide-59
SLIDE 59

Age-dependent churn

Even the clicks observed over a very short interval (e.g., 2 hrs) provide better insight into the actual information reach over a longer time period (e.g., 120 hrs) than the retweets do (even when the retweets are counted over the same 120 hrs)
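
One way to make this kind of comparison concrete is sketched below. It assumes each link has three counts (clicks in the first 2 hrs, retweets over 120 hrs, clicks over 120 hrs) and uses the Pearson correlation as the measure of predictive value. That measure is an assumption made for illustration, since the slides do not state which statistic was used. The records are synthetic and were constructed so that early clicks track total clicks closely; they only show the mechanics and are not evidence for the claim.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-link records, for illustration only:
# (clicks in first 2 hrs, retweets over 120 hrs, clicks over 120 hrs)
links = [
    (10, 5, 100),
    (20, 1, 210),
    (30, 9, 290),
    (40, 2, 405),
    (50, 7, 500),
]

early_clicks = [rec[0] for rec in links]
retweets = [rec[1] for rec in links]
total_clicks = [rec[2] for rec in links]

r_early = pearson(early_clicks, total_clicks)
r_retweets = pearson(retweets, total_clicks)
print(r_early, r_retweets)
```

On heavily skewed click data, a rank-based correlation or a correlation of log-transformed counts may be a more robust choice than the plain Pearson coefficient.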

slide-61
SLIDE 61

Age-dependent churn

… compare with retweets vs clicks …

slide-62
SLIDE 62

Life-time clicks

  • “Younger” links gain on the lifetime clicks observed for the “older” links, closing the gap
  • However, the gap is still substantial at the end of the 120-hour period
  • This highlights that the “older” category includes many links to long-term popular articles

slide-63
SLIDE 63

Life-time clicks

“Younger” catching up

slide-67
SLIDE 67

Per-site-based analysis

slide-68
SLIDE 68

Invariants despite significant differences

  • Age-based invariants (across websites) despite significant differences observed between the different websites (e.g., age, the speed at which clicks are obtained, and click distribution)
  • Our age-based conclusions are consistent for each of the news websites individually, further validating our previous claims

slide-71
SLIDE 71

Conclusions

slide-72
SLIDE 72

Conclusions and summary

  • Two main contributions
  • A novel longitudinal measurement framework
  • The first analysis of how the number of clicks changes over time
  • Example observations from temporal analysis
  • Noticeable differences in the relative number of clicks vs. retweets occurring at different parts of the news cycle
  • Retweet data often underestimates biases towards clicking popular links/articles
  • Significant differences in the clicks-per-tweets ratio, including (alarmingly) many links with more retweets than clicks
  • Significant age biases, including relatively high initial click rates for articles younger than a week and much more stable click rates for older and long-term popular articles
  • Insights into how age-dependent popularity skews and age-dependent churn impact the clicks observed by different classes of links
  • We validate our findings (and identify invariants) using both data from May 2017 and a per-website-based analysis

slide-73
SLIDE 73

Niklas Carlsson (niklas.carlsson@liu.se)

Do we Read what we Share? Analyzing the Click Dynamic of News Articles Shared on Twitter

Jesper Holmstrom, Daniel Jonsson, Filip Polbratt, Olav Nilsson, Linnea Lundstrom, Sebastian Ragnarsson, Anton Forsberg, Karl Andersson, and Niklas Carlsson