SLIDE 1

Google+ or Google-? Dissecting the evolution of the New OSN in its first year

Roberto González & Rubén Cuevas, UC3M Reza Motamedi & Reza Rejaie, Univ. Oregon Angel Cuevas, Institut Telecom Sud Paris (now UC3M)

Rubén Cuevas rcuevas@it.uc3m.es Universidad Carlos III de Madrid

Google+ or Google-? Dissecting the evolution of the New OSN in its first year March 2012

SLIDE 2

Motivation

  • The social media market has grown rapidly and reached maturity
    – Facebook and Twitter hold a dominant position
    – Savvy users
  • In this scenario: can a new OSN capture a significant share of the OSN market?

SLIDE 3

Motivation

  • Google+ (G+) is an interesting candidate for addressing the previous question
  • Some specifics of our case study:
    – G+ mixes features from both Twitter and Facebook in order to attract users from both OSNs
    – It is backed by a major Internet player (Google)

SLIDE 4

Our starting point

G+ = “Ghost Town”?

or

G+ = “A story of amazing success”?

SLIDE 5

Our goal

  • Let’s attempt an objective analysis
  • i.e., analyze…
    – the evolution of the size of the different components of the network
    – the evolution of the activity in the OSN
    – the evolution of the connectivity properties
  • … over a sufficiently long and representative period of time

SLIDE 6

Outline

  • 1. Google+ background
  • 2. Measurement Methodology & Datasets
  • 3. Macro-level structure & its evolution
  • 4. Public Activity & its evolution
  • 5. Connectivity Properties & its evolution
  • 6. Conclusion

SLIDE 7

Google+ Background

  • Unidirectional relationships (like TW)
  • Control over the visibility of a post (like FB)
    – Post = text + attachments (photo, video)
  • Reactions to a post:
    – Comment, Reshare or Plusone (+1)
  • Each user has a profile with 17 fields
    – Each field can be public, private or empty
  • User id space:
    – User-id: 21-digit integer
    – No clear assignment strategy; sparsely populated

SLIDE 8

Measurement Methodology & Datasets

SLIDE 9

Measurement Methodology: Capturing the LCC

  • Largest Connected Component (LCC)
  • BFS-based crawler
  • Collected per user: list of friends, list of followers, profile
  • 21 instances of our crawler + 1 coordinator
    – Each instance is responsible for a region of the id-space
    – The coordinator assigns the learnt user-ids to the corresponding crawler instance
  • ~800K users/hour -> whole LCC in 7-10 days
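
The crawling scheme above can be sketched in a few lines. This is a single-process illustration under stated assumptions, not the authors' crawler: the toy graph, the equal-width partitioning in `owner_of`, and all function names are hypothetical. The slides note that the id-space is sparsely populated, so a real deployment may cut regions differently.

```python
from collections import deque

NUM_CRAWLERS = 21          # number of crawler instances, as stated on the slide
ID_SPACE = 10 ** 21        # 21-digit user ids (assumed range, for illustration)


def owner_of(user_id: int) -> int:
    """Map a user id to the crawler responsible for its region of the id-space.

    Equal-width regions are an assumption; the slides do not say how regions are cut.
    """
    return min(user_id * NUM_CRAWLERS // ID_SPACE, NUM_CRAWLERS - 1)


def bfs_crawl(seed: int, neighbours):
    """Breadth-first crawl starting at `seed`.

    `neighbours(uid)` is a hypothetical callback returning the friends and
    followers of `uid`. Returns the visited ids and, per crawler, the ids
    that the coordinator routed to it.
    """
    queues = {i: [] for i in range(NUM_CRAWLERS)}
    visited = {seed}
    frontier = deque([seed])
    queues[owner_of(seed)].append(seed)
    while frontier:
        uid = frontier.popleft()
        for nxt in neighbours(uid):
            if nxt not in visited:          # coordinator de-duplicates learnt ids
                visited.add(nxt)
                queues[owner_of(nxt)].append(nxt)
                frontier.append(nxt)
    return visited, queues


# Toy graph: ids 1..3 are connected, id 4 is unreachable from the seed.
toy = {1: [2], 2: [1, 3], 3: [2], 4: []}
visited, queues = bfs_crawl(1, lambda u: toy[u])
```

Only users reachable from the seed are discovered, which is why a BFS crawl captures the connected component of the seed and misses islands and singletons.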

SLIDE 10

Measurement Methodology: Random sample of users

  • We leverage the G+ search API
    – Receives a keyword (e.g. a surname) as input
    – Returns up to 1000 users whose name/surname includes that keyword
  • For popular names (> 1000 registered users)
    – Selective answer containing well-connected and active users
  • For mid-popular/unpopular names (< 1000 registered users)
    – Returns all the users
  • We use the US census to provide mid/low-popularity surnames as input, and only consider as valid those surnames for which the API returns fewer than 1K users
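
The filtering rule can be illustrated with a short sketch. The `search` callback is a hypothetical stand-in for the G+ search API (its real signature is not given in the slides), and the surnames and user ids are made up.

```python
MAX_RESULTS = 1000  # the search API returns at most 1000 users per keyword


def sample_users(surnames, search):
    """Collect users only from surnames whose result list is not truncated.

    `search(keyword)` is a hypothetical stand-in for the G+ search API call;
    it returns a list of user ids (at most MAX_RESULTS of them).
    """
    sample = set()
    for surname in surnames:
        users = search(surname)
        if len(users) < MAX_RESULTS:    # fewer than 1K hits: the API returned everyone
            sample.update(users)
        # otherwise the answer is a selective subset, so the surname is discarded
    return sample


# Fake search backend, for illustration only.
fake_index = {
    "smith": list(range(1000)),         # popular surname, truncated at 1000
    "zelenko": [5001, 5002, 5003],      # unpopular surname, complete answer
}
sample = sample_users(["smith", "zelenko"], lambda k: fake_index.get(k, []))
```

Discarding truncated answers keeps the sample free of the bias towards well-connected users that the popular-name results show.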

[Figure: CDF of Num. Followers and CDF of Num. Friends (log-scale x-axes) for three groups: Search API unpopular (<1000), Search API popular (>1000), LCC (Reference)]

SLIDE 11

Measurement Methodology: Capturing Users’ Public Activity

  • User’s activity
    – User’s posts
    – Num. attracted reactions per post
  • We use the G+ API
    – For all users in LCC Sep 2012
    – User’s activity between G+ release (Jun 28th 2011) and our measurement starting date (Sep 7th 2012) -> 437 days
    – 68 days
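
The 437-day observation window follows directly from the two dates on the slide; a quick check with the standard library (variable names are illustrative):

```python
from datetime import date

release = date(2011, 6, 28)             # G+ release
measurement_start = date(2012, 9, 7)    # start of the activity measurement
window_days = (measurement_start - release).days
```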

SLIDE 12

LCC Datasets

Name      #nodes  #edges  Start Date   Duration (days)
LCC-Dec*  35.1M   575M    11-Nov-2011  46
LCC-Apr   51.8M   1.1B    15-Mar-2012  29
LCC-Aug   79.2M   1.6B    20-Aug-2012  4
LCC-Sep   85.3M   1.7B    17-Sep-2012  5
LCC-Oct   89.8M   1.8B    15-Oct-2012  5
LCC-Nov   93.1M   1.9B    28-Oct-2012  6

SLIDE 13

Random Samples & Users’ Activity Datasets

Random Samples

Name      #nodes  #edges  Start Date   Duration (days)
Rand-Apr  2.2M    145M    08-Apr-2012  23
Rand-Oct  5.7M    263M    15-Oct-2012  10
Rand-Nov  3.5M    157M    28-Oct-2012  13

Users’ Activities

Users  Posts  Attachments  Plusones  Comments  Reshares
13.6M  218M   299M         352M      202M      64M

SLIDE 14

Other datasets (comparison)

Name     OSN       Date      Info
Tw-Pro   Twitter   Jul 2011  Profile (80K rand. users)
Tw-Con*  Twitter   Aug 2009  Connectivity (55M users)
Tw-Act*  Twitter   Jun 2010  Activity (895K rand. users)
FB-Pro   Facebook  Jun 2012  Profile (480K rand. users)
FB-Con   Facebook  Jun 2012  Connectivity (75K rand. users)
FB-Act   Facebook  Sep 2012  Activity (16K rand. users)

SLIDE 15

Macro-level structure & its evolution

SLIDE 16

Macro-level structure & its evolution

  • Every OSN is formed by
    – Largest Connected Component (LCC)
    – Partitions (or islands)
        • Connected components smaller than the LCC
    – Singletons
        • Isolated nodes without connections to others
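
The three element types can be made concrete with a small sketch that splits a graph into its LCC, partitions and singletons. Treating edges as undirected is an assumption made here (the slides do not say how direction is handled when components are computed), and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def classify_components(nodes, edges):
    """Split nodes into LCC, partitions (islands) and singletons.

    Edges are treated as undirected, which is an assumption of this sketch.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:                        # iterative DFS over one component
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(comp)

    singletons = [c for c in components if len(c) == 1]
    connected = sorted((c for c in components if len(c) > 1), key=len, reverse=True)
    lcc = connected[0] if connected else set()
    partitions = connected[1:]
    return lcc, partitions, singletons


nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (4, 5)]
lcc, partitions, singletons = classify_components(nodes, edges)
```

In the toy graph, nodes 1 to 3 form the LCC, nodes 4 and 5 form one partition, and node 6 is a singleton.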

SLIDE 17

Evolution of LCC size

[Figure: per LCC snapshot (LCC-DEC, LCC-APR, LCC-AUG, LCC-SEP, LCC-OCT, LCC-NOV), log-scale Num. Users: number of users, avg. number of arriving users (users/day), avg. number of departing users (users/day)]

  • Avg. daily number of new LCC users
    – 150K (Dec 2011-Apr 2012)
    – 207K (Apr 2012-Nov 2012)
  • Impressive…
  • but significantly lower than the 0.85M-1.8M new registered users reported by Google in the same period
  • Why??
  • 9.6K LCC users leave the system (on avg.) every day
    – They show connectivity similar to other LCC users, but they do not have any activity
    – Lack of interest in actively participating in the system
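
As a rough cross-check, the net daily growth can be recomputed from the node counts, start dates and durations in the LCC datasets table. Using each crawl's end date (start date + duration) is an assumption of this sketch; the slides do not state which dates the authors used. The result is also a net figure that ignores departing users, so it only approximates the 150K and 207K values above.

```python
from datetime import date, timedelta

# (nodes, crawl start, crawl duration in days), copied from the LCC datasets table
snapshots = {
    "LCC-Dec": (35.1e6, date(2011, 11, 11), 46),
    "LCC-Apr": (51.8e6, date(2012, 3, 15), 29),
    "LCC-Nov": (93.1e6, date(2012, 10, 28), 6),
}


def daily_growth(first: str, second: str) -> float:
    """Average net new LCC users per day between the end dates of two crawls."""
    n1, start1, dur1 = snapshots[first]
    n2, start2, dur2 = snapshots[second]
    days = ((start2 + timedelta(days=dur2)) - (start1 + timedelta(days=dur1))).days
    return (n2 - n1) / days


dec_to_apr = daily_growth("LCC-Dec", "LCC-Apr")   # roughly 155K users/day
apr_to_nov = daily_growth("LCC-Apr", "LCC-Nov")   # roughly 202K users/day
```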

SLIDE 18

Evolution of the main components

% users per element

Element     Rand-Apr  Rand-Oct  Rand-Nov
LCC         43.5      32.3      32.2
Partitions  1.4       1.7       1.5
Singletons  55.1      66.0      66.3
All         100       100       100

– % singletons (↑), % LCC (↓), % islands (~)
– LCC in other OSNs → FB (99.91%), TW (94.18%)
– This is a side effect of the integrated registration process imposed by Google
– e.g., a new Gmail (YouTube) account automatically generates a G+ account
– Singletons may be unaware they are in G+

SLIDE 19

Public Activity & Its evolution

SLIDE 20

Public activity & its evolution

  • Public activity is important
    – It is the activity that provides the most visibility
    – Can be indexed by search engines (including Google)
    – Available to others (other than Google) for marketing and mining purposes
  • An early study using ground-truth data concludes that 30% of posts in G+ are public
  • Collecting private posts
    – not representative
    – unethical

SLIDE 21

Temporal Characteristics of Public Activity (1)

  • Steadily increasing rate in # daily posts after the initial phase
  • Peaks correlated with major events
  • Saw-tooth shape due to weekends
  • Most posts have attachments but…
  • The # posts triggering reactions is significantly smaller

[Figure: # daily posts (Num. Posts, ×10^5) by month, Jul 2011 to Sep 2012; series: Total, With Attachments, With +1’s, With Comments, With Reshares]

SLIDE 22
Temporal Characteristics of Public Activity (2)

  • The number of daily reactions is also steadily increasing after the initial phase
  • +1 is the preferred reaction and is growing rapidly

[Figure: # daily reactions/attachments (Num. Reactions, ×10^5) by month, Jul 2011 to Sep 2012; series: Num. Attachments, Num. +1’s, Num. Comments, Num. Reshares]

SLIDE 23
Temporal Characteristics of Public Activity (3)

  • Growth rate -> 3K users/day
  • ~60 times less than the # new daily LCC registrations
  • Comparing this figure with the previous one suggests a clear skewness in the users’ contribution

[Figure: # daily users making a post (Num. Active Users, ×10^5) by month, Jul 2011 to Sep 2012; series: Total, With Attachments, With +1’s, With Comments, With Reshares]

SLIDE 24

Skewness in the user’s contribution of posts and attracted reactions

  • Top 10% users generate 80% of public posts
  • Top 1% users attract:
    – 80% comments
    – 90% +1s and reshares

[Figure: cumulative % of Posts, Attachments, +1’s, Comments and Reshares (y-axis, 0-100) vs. % of Users (x-axis, log scale, 10^-4 to 10^2)]
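
A statement such as "the top 10% of users generate 80% of public posts" corresponds to a simple computation over per-user counts. The sketch below uses made-up counts and a hypothetical helper name; it illustrates the metric, not the authors' code or data.

```python
def top_share(counts, top_fraction):
    """Fraction of the total contributed by the top `top_fraction` of users."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))   # number of users in the top group
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Made-up posts-per-user counts: one heavy poster among ten users.
posts_per_user = [80, 5, 4, 3, 2, 2, 1, 1, 1, 1]
share = top_share(posts_per_user, 0.10)
```

With these toy counts, the single most active user out of ten accounts for 80 of the 100 posts.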

SLIDE 25

Correlation posting vs reactions

  • Defined groups (posts/day):
    – Casual (<1/7)
    – Regular (1/7-1)
    – Active (>1)
  • The most active users attract a larger number of reactions
  • The public activity (posts + reactions) in G+ happens around a small fraction of active users

[Figure: Reactions/day (log scale, 10^-2 to 10^2) for each posting-rate group (Posts/day: <1/7, 1/7−1, >1)]
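
The grouping by posting rate can be expressed as a small classifier. How the boundary values (exactly 1/7 or exactly 1 post/day) are assigned, and the use of the 437-day window as the denominator, are assumptions of this sketch; the slides only give the three ranges.

```python
def activity_group(num_posts: int, days: int) -> str:
    """Label a user by average public posts per day over `days` days."""
    rate = num_posts / days
    if rate > 1:
        return "active"      # more than one post per day
    if rate >= 1 / 7:
        return "regular"     # between one post per week and one per day
    return "casual"          # less than one post per week


WINDOW = 437  # days between G+ release and the measurement start (from the slides)
groups = [activity_group(n, WINDOW) for n in (10, 100, 1000)]
```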

SLIDE 26

Comparison with other OSNs: Activity Rate

  • We use our G+, TW and FB activity datasets
  • Fraction of active users:
    – FB (73%)
    – TW (35%)
    – G+ (17%)
  • Activity rate for active users
    – FB & G+ more homogeneous
    – Median values
        • FB (0.19) vs. G+ (0.08)

[Figure: CDF of Posts/Day (log scale, 10^-4 to 10^4) for LCC, Twitter and Facebook]

SLIDE 27
Comparison with other OSNs: User’s public attributes

  • Num. profile attributes: G+ (17), FB (21), TW (3+3)
  • Stability of results across LCC snapshots
  • In median, FB users make 6 attributes public vs. <10% in G+
  • In Twitter, 69% of users do not make any non-mandatory attribute public and 13% make 1 public
  • Level of information sharing:
    – FB > G+ > TW

[Figure: CDF of Num. Public Attributes (0-25) for LCC-APR, LCC-AUG, LCC-SEP, LCC-OCT, LCC-NOV and Facebook]

SLIDE 28

Connectivity Properties & its evolution

SLIDE 29

Connectivity & Its Evolution: Degree Distribution (# followers)

  • Stable since Apr 2012
  • Power-law (α = 1.26)
  • Similar to other OSNs (except FB)
  • Distribution very similar to Twitter!!

[Figure: CCDF of Num. Followers (log-log axes) for Facebook, Twitter, LCC-NOV, LCC-OCT, LCC-SEP, LCC-AUG, LCC-APR and LCC-DEC]
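
The degree distributions are reported as CCDFs. A minimal sketch of how an empirical CCDF can be computed from a list of follower counts is shown below; it uses the P[X ≥ x] convention, which is an assumption (some authors use P[X > x]), and the data are made up.

```python
from collections import Counter


def ccdf(degrees):
    """Return sorted (x, P[X >= x]) pairs for a list of degrees."""
    n = len(degrees)
    counts = Counter(degrees)
    points, remaining = [], n
    for x in sorted(counts):
        points.append((x, remaining / n))   # share of users with degree >= x
        remaining -= counts[x]
    return points


followers = [1, 1, 2, 3, 10]    # made-up follower counts
points = ccdf(followers)
```

On log-log axes, a power-law distribution appears as an approximately straight line in such a plot.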

SLIDE 30

Connectivity & Its Evolution: Degree Distribution (# friends)

  • Similar results for #friends
  • Power-law (α = 1.39)
  • Distribution very similar to Twitter, but…
  • FB & G+ #friends limits
    – 5k

[Figure: CCDF of Num. Friends (log-log axes) for Facebook, Twitter, LCC-NOV, LCC-OCT, LCC-SEP, LCC-AUG, LCC-APR and LCC-DEC]

SLIDE 31

Connectivity & Its Evolution: Reciprocation

  • Aggregate % of bidirectional relationships
    – Dec 2011 (32%) vs Nov 2012 (21.3%)
    – TW 2009 (22%)
  • Again, very similar to TW!!
  • Only low-popularity users (< 1k followers) reciprocate a significant portion of connections (> 30%)
  • G+ is used as a broadcast network (similar to TW)

[Figure: % bidirectional relations (0-1) by Number of Followers (bins: 0−10, 10−100, 100−1K, 1K−10K, 10K−100K, 100K−1M, >1M)]
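
One common way to quantify reciprocation is the fraction of directed edges whose reverse edge also exists. The sketch below uses that definition as an assumption, since the slides do not spell out the exact formula, and the follow relations are made up.

```python
def reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse edge (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


# Made-up follow relations: 1<->2 is mutual, 3->1 and 4->1 are one-way.
follows = [(1, 2), (2, 1), (3, 1), (4, 1)]
r = reciprocity(follows)
```

A low value indicates a broadcast-style network in which most follow relations are not returned.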

SLIDE 32

Conclusion

SLIDE 33

Conclusion “Take Aways”

  • 1. G+ is growing rapidly:
    – 200k new LCC registered users per day (they show interest)
    – However, this rate is 1 order of magnitude smaller than the one reported by Google
    – Reason: integrated registration process
  • 2. The number of active LCC users grows steadily (3k per day)
    – But… 60 times less than new LCC registered users per day
  • 3. G+ activity (posts & reactions) is concentrated around a small fraction of active users

SLIDE 34
Conclusion “Take Aways”

  • 4. Despite the impressive growth of the LCC, the main connectivity properties have become rather stable. This indicates that the network has reached a mature state
  • 5. Most key connectivity attributes bear a striking similarity to TW and are very different from FB. These attributes suggest that G+ is used for message propagation, similar to TW, rather than pairwise user interaction like FB.

SLIDE 35

Conclusion: Answer to the initial question

“In a mature OSN marketplace where a few players (FB, TW) hold a dominant position, a new OSN (backed by a major Internet player) is able to attract an impressive number of initially interested users (LCC users) but has serious difficulty getting those users actively engaged in the system”
