Similarity Ranking in Large-Scale Bipartite Graphs
Alessandro Epasto, Brown University - 20th March 2014
Joint work with J. Feldman, S. Lattanzi, S. Leonardi, V. Mirrokni [WWW, 2014]

Our Goal: Tackling AdWords
Large advertisers (e.g., Amazon, Ask.com, etc.) compete in several market segments against very different sets of competitors.
Query Information:
"Nike store New York": Market Segment: Retailer; Geo: NY (USA); Stats: 10 clicks.
"Soccer shoes": Market Segment: Apparel; Geo: London (UK); Stats: 4 clicks.
"Soccer ball": Market Segment: Equipment; Geo: San Francisco (USA); Stats: 5 clicks.
... millions of other queries ...
Scale: millions of advertisers, billions of queries, hundreds of labels.
The same problem arises in other contexts:
Users and movies: find similar users and suggest movies.
Authors and papers: find related authors and suggest papers to read.
In each case one side of the graph is much smaller than the other: we want algorithms with complexity depending on the smaller side.
Goal: Find the nodes most “similar” to A.
Standard similarity measures apply: Jaccard Coefficient, Adamic-Adar.
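As a concrete reference point, here is a minimal sketch of the two measures for A-side nodes of a bipartite graph. The toy adjacency data and names are illustrative assumptions, not from the talk.

```python
import math

# Toy bipartite adjacency: A-side node -> set of B-side neighbors.
neighbors = {
    "amazon":  {"shoes", "books", "toys"},
    "zappos":  {"shoes", "bags"},
    "alibris": {"books"},
}

# Degree of each B-side node (how many A-side nodes link to it).
b_degree = {}
for nbrs in neighbors.values():
    for b in nbrs:
        b_degree[b] = b_degree.get(b, 0) + 1

def jaccard(u, v):
    """|N(u) & N(v)| / |N(u) | N(v)|."""
    nu, nv = neighbors[u], neighbors[v]
    return len(nu & nv) / len(nu | nv)

def adamic_adar(u, v):
    """Sum over common neighbors z of 1 / log(deg(z)): rarer shared
    neighbors count more. Degree-1 neighbors are skipped."""
    return sum(1.0 / math.log(b_degree[z])
               for z in neighbors[u] & neighbors[v]
               if b_degree[z] > 1)

print(jaccard("amazon", "zappos"))      # 1 shared of 4 total -> 0.25
print(adamic_adar("amazon", "zappos"))  # only "shoes" is shared
```

Both measures look only at the one-hop neighborhoods of the two nodes; the walk-based measures below also use longer paths.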
Our contribution: a general framework to induce real-time similarity rankings in multi-categorical bipartite graphs, which we apply to several similarity measures, together with scalable MapReduce algorithms.
Personalized PageRank (PPR): a random walk with restart from node v. The stationary distribution assigns a similarity score to each node in the graph w.r.t. node v.
Good accuracy in our evaluation compared to other similarities (Jaccard, Intersection, etc.).
Challenge: computing PPR on large graphs (hundreds of millions of nodes) in large-scale systems.
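For intuition, a minimal PPR computation by power iteration might look like the following sketch; the toy graph, α, and iteration count are illustrative assumptions, not from the talk.

```python
# Minimal Personalized PageRank via power iteration, assuming an
# unweighted graph given as adjacency lists with no dangling nodes.

def personalized_pagerank(graph, source, alpha=0.15, iters=100):
    """Random walk with restart: at each step, with probability alpha
    jump back to `source`, otherwise follow a uniform out-edge."""
    pr = {v: 0.0 for v in graph}
    pr[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in graph}
        nxt[source] += alpha            # restart mass
        for v, out in graph.items():
            share = (1 - alpha) * pr[v] / len(out)
            for u in out:
                nxt[u] += share         # walk mass
        pr = nxt
    return pr

# Undirected toy graph stored as symmetric adjacency lists.
g = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
scores = personalized_pagerank(g, "a")
print(sorted(scores, key=scores.get, reverse=True))
```

The source node itself ranks highest, followed by nodes reachable by many short paths from it; this is the per-node ranking the talk wants to serve in real time.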
Our approach supports rankings restricted to any subset of labels:
Reduce: given the bipartite graph and a category, construct a graph with only A-side nodes that preserves the ranking on the entire graph.
Aggregate: given the reduced graphs of the subset of categories of interest, determine the ranking for v.
Offline: MapReduce pre-computation of the individual per-category reduced graphs.
Online: real-time aggregation algorithm.
The Reduce step builds on classic Markov chain aggregation results (Simon and Ando, '61; Meyer, '89, etc.): we remove the B-side nodes while correctly preserving the PPR distribution on the entire graph.
Stochastic complementation: partition the states of the chain into k subsets and write the transition matrix in the following block form:

    P11 ... P1i ... P1k
    ...     ...     ...
    Pi1 ... Pii ... Pik
    ...     ...     ...
    Pk1 ... Pki ... Pkk

The stochastic complement of block i is

    Si = Pii + Pi* (I - Pi)^(-1) P*i

where Pi* (resp. P*i) collects the transitions from block i to the rest of the chain (resp. from the rest of the chain to block i), and Pi is the matrix obtained by deleting the i-th row and column of blocks.
Theorem [Meyer '89]: For every irreducible aperiodic Markov chain, the stationary distribution of the stochastic complement Si is the stationary distribution of the entire chain restricted to the states of block i, up to normalization.
Computing the complement is unfeasible in general for large matrices (it requires a matrix inversion). In our bipartite case, however, we can invert the matrix analytically.
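Meyer's theorem is easy to check numerically on a small chain. The sketch below builds the stochastic complement of one block of a toy 4-state chain and compares its stationary distribution with the restriction of the full chain's; the matrix and the pure-Python helpers are illustrative assumptions, not code from the talk.

```python
# Numerical check of stochastic complementation (Meyer '89) on a toy
# 4-state chain split into two 2-state blocks.

def mat_vec(pi, P):
    """Row vector times matrix: pi P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary(P, iters=500):
    """Stationary distribution by power iteration."""
    pi = [1.0 / len(P)] * len(P)
    for _ in range(iters):
        pi = mat_vec(pi, P)
    return pi

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Transition matrix of an irreducible, aperiodic 4-state chain.
P = [[0.5, 0.2, 0.2, 0.1],
     [0.1, 0.6, 0.2, 0.1],
     [0.3, 0.1, 0.4, 0.2],
     [0.2, 0.2, 0.1, 0.5]]

# Blocks: states {0, 1} and {2, 3}.
P11 = [row[:2] for row in P[:2]]
P12 = [row[2:] for row in P[:2]]
P21 = [row[:2] for row in P[2:]]
P22 = [row[2:] for row in P[2:]]

# Stochastic complement of block 1: S1 = P11 + P12 (I - P22)^(-1) P21.
inv = inv2([[1 - P22[0][0], -P22[0][1]], [-P22[1][0], 1 - P22[1][1]]])
corr = matmul(matmul(P12, inv), P21)
S1 = [[P11[i][j] + corr[i][j] for j in range(2)] for i in range(2)]

pi = stationary(P)    # stationary distribution of the full chain
phi = stationary(S1)  # stationary distribution of the complement

# Meyer's theorem: phi equals pi restricted to {0, 1}, renormalized.
mass = pi[0] + pi[1]
print([round(x, 6) for x in phi])
print([round(pi[0] / mass, 6), round(pi[1] / mass, 6)])
```

The (I - P22)^(-1) inversion is exactly the expensive step the talk avoids: for bipartite graphs the inverse has a closed form, which is what the next slides exploit.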
Reduce for PPR: collapse the bipartite graph onto its A side. For A-side nodes x and y, the reduced graph has edge weight

    w'(x, y) = sum over z in N(x) ∩ N(y) of  w(x, z) * w(y, z) / (sum over h in N(z) of w(z, h))

One step in the reduced graph is equivalent to two steps in the bipartite graph.
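The reduced-graph construction can be sketched in a few lines. The dict-of-edges representation and the toy weights are assumptions for illustration.

```python
from collections import defaultdict

# Sketch of the A-side reduced graph: the weight of edge (x, y) aggregates
# the two-step walks x -> z -> y through shared B-side neighbors z.

def reduce_to_side_a(edges):
    """edges: {(a, b): weight} for a weighted bipartite graph."""
    b_nbrs = defaultdict(dict)  # B-side node z -> {A-side neighbor: w}
    for (a, b), w in edges.items():
        b_nbrs[b][a] = w
    reduced = defaultdict(float)
    for z, nbrs in b_nbrs.items():
        total = sum(nbrs.values())  # sum over h in N(z) of w(z, h)
        for x, wx in nbrs.items():
            for y, wy in nbrs.items():
                reduced[(x, y)] += wx * wy / total  # two steps: x -> z -> y
    return dict(reduced)

# x and y share neighbor u; y also has a private neighbor v.
edges = {("x", "u"): 1.0, ("y", "u"): 1.0, ("y", "v"): 2.0}
reduced = reduce_to_side_a(edges)
print(reduced)
```

Normalizing each row of `reduced` gives exactly the two-step transition probabilities of the bipartite walk (self-loops included), which is the sense in which the ranking is preserved.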
Lemma 1: PPR(G, α, a)[A] = 1/(2 − α) * PPR(Ĝ_A, 2α − α², a)

Proof sketch: two steps of the α-restart walk on G correspond to one step on Ĝ_A with restart probability 1 − (1 − α)² = 2α − α²; checking stationarity does not depend on the graph.

Lemma 2: PPR(G, α, a)[B] = (1 − α)/(2 − α) * Σ_{b ∈ N(a)} w(a, b) * PPR(Ĝ_B, 2α − α², b)

Similarly, we can reduce the process to a graph with B-side nodes only. Finally, the stationary distribution of either side uniquely determines that of the other side.
Koury et al. Aggregation-Disaggregation Algorithm:
Step 1: Partition the Markov chain into disjoint subsets.
Step 2: Approximate the stationary distribution on each subset independently.
Step 3: Compute the k x k approximated transition matrix T between the subsets.
Step 4: Compute the stationary distribution of T.
Step 5: Based on the stationary distribution of T, improve the per-subset estimations. Repeat until convergence.
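The five steps above can be sketched on a toy chain as follows; the matrix, partition, and iteration counts are illustrative assumptions, not values from the talk.

```python
# A toy sketch of the Koury et al. iterative aggregation/disaggregation
# scheme on a small 4-state chain.

def iad(P, blocks, outer_iters=60):
    """Steps 1-5: aggregate into len(blocks) super-states, solve the
    small coupling chain, disaggregate, and repeat."""
    n = len(P)
    k = len(blocks)
    # Step 2: initial per-block guesses (uniform within each block).
    phi = [[1.0 / len(B)] * len(B) for B in blocks]
    for _ in range(outer_iters):
        # Step 3: k x k coupling matrix between the blocks.
        T = [[sum(phi[i][a] * P[blocks[i][a]][v]
                  for a in range(len(blocks[i])) for v in blocks[j])
              for j in range(k)] for i in range(k)]
        # Step 4: stationary distribution of T by power iteration.
        xi = [1.0 / k] * k
        for _ in range(100):
            xi = [sum(xi[i] * T[i][j] for i in range(k)) for j in range(k)]
        # Disaggregate: combine block masses with per-block distributions.
        pi = [0.0] * n
        for i, B in enumerate(blocks):
            for a, u in enumerate(B):
                pi[u] = xi[i] * phi[i][a]
        # Step 5: one power step on the full chain, then refresh the
        # per-block estimates from the improved global vector.
        pi = [sum(pi[u] * P[u][v] for u in range(n)) for v in range(n)]
        for i, B in enumerate(blocks):
            mass = sum(pi[u] for u in B)
            phi[i] = [pi[u] / mass for u in B]
    return pi

# An irreducible, aperiodic 4-state chain partitioned into two blocks.
P = [[0.5, 0.2, 0.2, 0.1],
     [0.1, 0.6, 0.2, 0.1],
     [0.3, 0.1, 0.4, 0.2],
     [0.2, 0.2, 0.1, 0.5]]
pi = iad(P, [[0, 1], [2, 3]])
print([round(x, 4) for x in pi])
```

The expensive work happens on the small k x k coupling chain; only one cheap power step per round touches the full chain, which is what makes the scheme attractive at scale.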
Our Aggregation algorithm alternates between the stationary distributions of the two sides of the bipartite graph: we precompute the stationary distributions of the individual categories, then combine them working only with Advertiser-side nodes. The algorithm converges to the correct distribution.
Experimental evaluation on public and proprietary datasets:
DBLP graphs.
Patent (invention) graphs.
Google AdWords graph: billions of nodes, > 5 billion edges.
[Figure: Precision vs Recall, comparing Intersection, Jaccard, Adamic-Adar, Katz, and PPR.]
[Figure: Approximation error (1 − cosine similarity) vs number of iterations, on the DBLP and Patent graphs.]