Discussion Topics and Ego Networks on Twitter, Emma S. Spiro

SLIDE 1

Discussion Topics and Ego Networks on Twitter

Emma S. Spiro

University of California, Irvine Department of Sociology Presented at MURI AHM May 25, 2010 This material is based on research supported by the Office of Naval Research under award N00014-08-1-1015.

NCASD

NETWORKS, COMPUTATION, and SOCIAL DYNAMICS

Scalable Methods for the Analysis of Network-Based Data

SLIDE 2

Talk Outline

◮ Project Motivation
◮ Introduction to Twitter
◮ Data Collection
◮ Communication Dynamics
◮ Structural Characteristics of Personal Networks

SLIDE 3

Project Motivation

◮ Informal communication channels are often the primary means by which time-sensitive hazard information first reaches members of the public.
◮ Social media technologies, e.g. micro-blogging, provide a means for gathering, sorting and disseminating information: a venue for collective problem solving.
◮ Relatively little is known about the dynamics of informal information communication in emergencies or hazards.

SLIDE 6

Why Twitter?

◮ Twitter represents an extremely large social network: 100 million users.
◮ Tie formation and destruction are rapid and widespread.
◮ Combination of text and interpersonal networks.
◮ Extreme heterogeneity in terms of network properties as well as communication behavior.
◮ Scalable methods and models.

SLIDE 7

Modeling Discussion Topics on Twitter

Consider the population of individuals talking about a given topic. Can we make predictions about

◮ the dynamics of this communication?
◮ the network properties of this discussion group?
◮ For now, sampling-based approaches.

SLIDE 8

Project Activities

Using automated data collection methods we collect information

◮ on the dynamics of communication content.
◮ on the properties of communicants’ online interpersonal networks.

SLIDE 9

Twitter Data Collection, Part I - Topic Dynamics

◮ Public, global content is searchable by keyword.
◮ Begin with a list of topics, each characterized by a set of keywords.
◮ We include a control topic in which words are chosen from Ogden’s Basic English word list.
◮ Automated data collection designed to capture all public tweets containing the given keyword.
◮ Potential missing data.
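The keyword-based topic definition above can be sketched in a few lines of Python. This is only an illustration, not the collection code used in the project: the topic names, keyword sets, and the `matching_topics` helper are hypothetical, and the sketch assumes tweets are already available as plain strings.

```python
import re

# Hypothetical topics and keywords, chosen only for illustration.
TOPICS = {
    "oil spill": ["oil spill", "bp"],
    "lightning": ["lightning"],
}

def matching_topics(tweet_text, topics=TOPICS):
    """Return the sorted names of topics whose keywords appear in the tweet."""
    text = tweet_text.lower()
    hits = []
    for topic, keywords in topics.items():
        for kw in keywords:
            # Match whole words or phrases, ignoring case.
            if re.search(r"\b" + re.escape(kw.lower()) + r"\b", text):
                hits.append(topic)
                break
    return sorted(hits)
```

Whole-word matching is a design choice made here to avoid counting substrings (for example "bp" inside a longer word); the slides do not say how matches were defined.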

SLIDE 10

Twitter Data Collection, Part II - Personal Networks

◮ Each user on Twitter has a personal network consisting of friends (out-ties) and followers (in-ties).
◮ For each keyword we sample 20 recently active users each day and keep them in the sample for 7 days.
◮ For each user we obtain a list of alters, as well as various covariates if available.
◮ Potential covariates: location, privacy settings, timezone, account creation date, activity level, language.
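The rolling sample described above (20 new users per keyword per day, each followed for 7 days) might be maintained along these lines. This is a sketch under assumptions: the `update_panel` function, the pool of 1000 made-up user ids, and the rule that users already in the sample are not redrawn are all hypothetical and not taken from the slides.

```python
import random

USERS_PER_DAY = 20   # new users sampled per keyword per day (from the slide)
DAYS_IN_SAMPLE = 7   # days each user stays in the sample (from the slide)

def update_panel(panel, day, active_users, rng):
    """Drop users whose 7-day window has ended, then add today's draw.

    panel maps user id -> day the user entered the sample.
    """
    # Keep only users who entered fewer than DAYS_IN_SAMPLE days ago.
    kept = {u: d for u, d in panel.items() if day - d < DAYS_IN_SAMPLE}
    # Assumption: draw only from active users not already in the sample.
    candidates = sorted(set(active_users) - set(kept))
    for user in rng.sample(candidates, min(USERS_PER_DAY, len(candidates))):
        kept[user] = day
    return kept

rng = random.Random(0)
panel = {}
for day in range(10):
    # Hypothetical pool of recently active users for one keyword.
    pool = [f"user{i}" for i in range(1000)]
    panel = update_panel(panel, day, pool, rng)
```

Once the panel has been running for at least 7 days, it holds 7 daily cohorts of 20 users per keyword under these assumptions.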

SLIDE 14

Research Questions

◮ What seasonality exists within a discussion?
◮ How do exogenous events affect communication dynamics?
◮ What are the structural characteristics of the interpersonal networks of the discussant group?
◮ Individual-level prediction?

SLIDE 15

Oil spill: Seasonality and Exogenous Events

[Figure: tweet count over time for the oil spill topic, 03-May to 24-May (y-axis: Tweet Count, ticks 500 to 1500). Annotated events: "Oil Reaches Coastline in LA" and "BP Launches Live Webcam of Leak". Weekends are marked.]
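A series like the one plotted here could be built by counting tweets per calendar day and flagging weekends, which is one simple way to look for weekly seasonality. The sketch below is illustrative only: the `daily_counts` helper and the timestamps are made up, and it assumes each tweet carries a Python `datetime`.

```python
from collections import Counter
from datetime import datetime

def daily_counts(timestamps):
    """Count tweets per calendar day and flag weekend days."""
    counts = Counter(ts.date() for ts in timestamps)
    # weekday() returns 5 for Saturday and 6 for Sunday.
    return [(day, n, day.weekday() >= 5) for day, n in sorted(counts.items())]

# Made-up timestamps, for illustration only.
tweets = [
    datetime(2010, 5, 7, 9, 30),   # a Friday
    datetime(2010, 5, 7, 17, 5),   # the same Friday
    datetime(2010, 5, 8, 12, 0),   # a Saturday
]
rows = daily_counts(tweets)
```

Each row is a `(date, count, is_weekend)` tuple, so weekday and weekend counts can be compared directly.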

SLIDE 16

Structural Characteristics of Personal Networks

◮ Mean degree of topic participants.
◮ Is an increase in overall mean degree due to those already present in the discussion gaining alters, or is it due to high-degree individuals entering the discussion?

SLIDE 18

Degree Distribution Dynamics

◮ Consider the mean degree in the population, i.e. topic discussion sample, over time.
◮ The mean degree is affected by different population processes: those entering the sample (immigrants), those who stay in the sample (non-migrants), and those who leave the sample (emigrants).
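The three groups can be identified with set operations on the sample membership at two consecutive times. This is a minimal sketch, assuming each sample is available as a collection of user ids; the `classify` and `mean_degree` helpers are hypothetical names.

```python
def classify(sample_t, sample_t1):
    """Split users into immigrants, stayers, and emigrants between two times."""
    prev, curr = set(sample_t), set(sample_t1)
    return {
        "immigrants": curr - prev,   # in the sample at t+1 but not at t
        "stayers": curr & prev,      # in the sample at both times
        "emigrants": prev - curr,    # in the sample at t but not at t+1
    }

def mean_degree(users, degree):
    """Average degree over a group; None if the group is empty."""
    users = list(users)
    return sum(degree[u] for u in users) / len(users) if users else None

# Made-up membership and degrees, for illustration only.
groups = classify(["a", "b", "c"], ["a", "b", "d"])
stayer_mean = mean_degree(groups["stayers"], {"a": 10, "b": 20})
```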

SLIDE 19

Lightning: Mean Degree Dynamics

[Figure: mean degree by date for the lightning topic, Mar-10 to Apr-10 (y-axis: Degree, ticks from 10^2 to 10^4). One series per type; legend labels: e, i, s, t.]

SLIDE 20

Degree Dynamics Decomposition

[Diagram: the population and the sample at time t and at time t+1, partitioned into groups labeled I, E′, S, E and I′, E′′, S′, E′.]

SLIDE 21

Degree Dynamics Decomposition

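\[
\begin{aligned}
\bar{d}_{t+1} - \bar{d}_t
&= \frac{\bar{d}_{t+1}(I')\,N^{I'} + \bar{d}_{t+1}(S')\,N^{S'}}{N_{t+1}} - \frac{\bar{d}_t(S')\,N^{S'} + \bar{d}_t(E')\,N^{E'}}{N_t} \\
&= \frac{\bar{d}_{t+1}(I')\,N^{I'}}{N_{t+1}} + \frac{\bar{d}_{t+1}(S')\,N^{S'}}{N_{t+1}} - \frac{\bar{d}_t(S')\,N^{S'}}{N_t} - \frac{\bar{d}_t(E')\,N^{E'}}{N_t}
\end{aligned}
\]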
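The identity can be checked numerically on a toy example. The degrees below are made up for illustration, and the `weighted` helper is a hypothetical name; each of the four terms is a group's degree sum divided by the sample size at the relevant time, which equals the group mean times its share of the sample.

```python
def mean(values):
    """Arithmetic mean of a collection of numbers."""
    values = list(values)
    return sum(values) / len(values)

# Made-up degrees, for illustration only.
deg_t = {"a": 10, "b": 20, "c": 300}      # sample at time t
deg_t1 = {"a": 12, "b": 20, "d": 1000}    # sample at time t+1

stayers = deg_t.keys() & deg_t1.keys()
immigrants = deg_t1.keys() - deg_t.keys()
emigrants = deg_t.keys() - deg_t1.keys()
n_t, n_t1 = len(deg_t), len(deg_t1)

def weighted(group, deg, n):
    """Group mean degree times group size over sample size, i.e. sum / n."""
    return sum(deg[u] for u in group) / n

# Left side: change in the overall mean degree.
lhs = mean(deg_t1.values()) - mean(deg_t.values())
# Right side: immigrant and stayer terms at t+1 minus stayer and emigrant terms at t.
rhs = (weighted(immigrants, deg_t1, n_t1) + weighted(stayers, deg_t1, n_t1)
       - weighted(stayers, deg_t, n_t) - weighted(emigrants, deg_t, n_t))
```

In this toy case most of the change comes from a single high-degree immigrant rather than from stayers gaining alters.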

SLIDE 22

Degree Dynamics Decomposition

◮ Let $\bar{d}_{t+1}(I') = \alpha\,\bar{d}_t(S')$, $\bar{d}_t(E') = \epsilon\,\bar{d}_t(S')$, and $\bar{d}_{t+1}(S') = \gamma\,\bar{d}_t(S')$.
◮ Intuitively, we are expressing the respective degrees of the immigrants, emigrants, and (t+1) stayers in terms of what the stayers’ degrees were at time t.
◮ Likewise, let $w^{I'}_{t+1} = N^{I'}/N_{t+1}$, $w^{E'}_t = N^{E'}/N_t$, and $w^{S'}_{t+1} = N^{S'}/N_{t+1}$ be the relative population weights for the three groups.

\[
\bar{d}_{t+1} - \bar{d}_t = \alpha\,\bar{d}_t(S')\,w^{I'}_{t+1} + \gamma\,\bar{d}_t(S')\,w^{S'}_{t+1}\left[1 - \frac{N_{t+1}}{\gamma N_t}\right] - \epsilon\,\bar{d}_t(S')\,w^{E'}_t
\]
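As a sanity check, the reparameterized expression can be evaluated on made-up group summaries and compared with the direct difference of sample means. All numbers and variable names below are hypothetical. The bracketed factor arises because the stayer term at time t, $\bar{d}_t(S')\,N^{S'}/N_t$, is rewritten in terms of $w^{S'}_{t+1}$ by multiplying and dividing by $\gamma N_{t+1}$.

```python
# Made-up group summaries, for illustration only.
d_S_t, d_S_t1 = 15.0, 16.0   # stayers' mean degree at t and t+1
d_I_t1 = 1000.0              # immigrants' mean degree at t+1
d_E_t = 300.0                # emigrants' mean degree at t
n_S, n_I, n_E = 2, 1, 1      # group sizes

n_t, n_t1 = n_S + n_E, n_S + n_I

# Ratios relative to the stayers' mean degree at time t.
alpha = d_I_t1 / d_S_t
epsilon = d_E_t / d_S_t
gamma = d_S_t1 / d_S_t

# Relative population weights.
w_I, w_S, w_E = n_I / n_t1, n_S / n_t1, n_E / n_t

# Change in mean degree from the reparameterized formula.
change = (alpha * d_S_t * w_I
          + gamma * d_S_t * w_S * (1 - n_t1 / (gamma * n_t))
          - epsilon * d_S_t * w_E)

# Direct difference of the two sample means, for comparison.
direct = ((d_I_t1 * n_I + d_S_t1 * n_S) / n_t1
          - (d_S_t * n_S + d_E_t * n_E) / n_t)
```

Here α is far above 1 while γ is close to 1, which would indicate that immigrants, not stayers, drive the increase in mean degree.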

SLIDE 23

Lightning: Mean Degree Dynamics

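[Figure: decomposition parameters by date for the lightning topic, Mar-10 to Apr-10 (y-axis: Parameter, ticks 5 to 30). One series per type; legend labels: a, e, g.]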

SLIDE 24

Summary

◮ Modeling large-scale dynamic networks with a text component.
◮ Scalability.
◮ Activity sampling and egocentric properties.

SLIDE 25

Future Work

◮ Time-series analysis of the topic data.
◮ Complete sampling of discussant groups.
◮ Decomposition of the change in average number of shared partners or other statistics.
◮ Statistical models of topics on dynamic networks.
◮ Questions?
