Local clustering with graph diffusions and spectral solution paths - PowerPoint PPT Presentation
Local clustering with graph diffusions and spectral solution paths
Joint work with Kyle Kloster and David F. Gleich (Purdue University), supported by NSF CAREER 1149756-CCF
Local Clustering
Given seed(s) S in G, find a good cluster near S.
"Near"? -> local, a small set containing S
"Good"? -> low conductance
Low-conductance sets are clusters
conductance(T) = (# edges leaving T) / (# edge endpoints in T)
(for small sets T, i.e. vol(T) < vol(G)/2)
= "chance a random edge that touches T exits T"
For a global cluster we could use the Fiedler vector... but we want a local cluster.
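The definition above translates directly into code. Here is an illustrative Python sketch (not from the talk), with a made-up example graph:

```python
def conductance(adj, T):
    """conductance(T) = (# edges leaving T) / (# edge endpoints in T).
    Uses min(vol(T), vol(G) - vol(T)) so large sets are handled too;
    for small T (vol(T) < vol(G)/2) this matches the slide's formula."""
    T = set(T)
    cut = sum(1 for u in T for v in adj[u] if v not in T)
    vol_T = sum(len(adj[u]) for u in T)
    vol_G = sum(len(adj[u]) for u in adj)
    denom = min(vol_T, vol_G - vol_T)
    return cut / denom if denom > 0 else float("inf")

# Two triangles joined by a single edge: {0, 1, 2} cuts 1 edge and has
# volume 2 + 2 + 3 = 7, so its conductance is 1/7.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(conductance(adj, {0, 1, 2}))  # 0.14285714285714285
```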
Fiedler
Compute the Fiedler vector v: L v = λ_2 D v
"Sweep" over v:
1. sort: v(1) ≥ v(2) ≥ ...
2. for each prefix set S_k = {1, ..., k}, compute the conductance φ(S_k)
3. output the best S_k
Cheeger inequality: the Fiedler sweep finds a cluster "not too much worse" than the global optimum. But we want local...
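The sweep step can be sketched in Python (illustrative only; the incremental cut update is the standard trick, and the graph `adj` and score vector `v` below are made-up examples):

```python
def sweep(adj, v):
    """Sort vertices by v (descending), scan prefix sets S_k, and return
    the prefix with the smallest conductance."""
    order = sorted(v, key=lambda u: v[u], reverse=True)
    vol_G = sum(len(adj[u]) for u in adj)
    S, cut, vol = set(), 0, 0
    best_phi, best_S = float("inf"), set()
    for u in order[:-1]:  # the full vertex set has no cut, so skip it
        # edges from u into S become internal; the rest join the cut
        cut += sum(-1 if w in S else 1 for w in adj[u])
        vol += len(adj[u])
        S.add(u)
        phi = cut / min(vol, vol_G - vol)
        if phi < best_phi:
            best_phi, best_S = phi, set(S)
    return best_S, best_phi

# Two triangles joined by one edge; v ranks the left triangle highest,
# so the sweep recovers it as the best (phi = 1/7) prefix.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
v = {0: 0.5, 1: 0.4, 2: 0.3, 3: 0.1, 4: 0.05, 5: 0.01}
best_S, best_phi = sweep(adj, v)
```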
Local Fiedler and diffusions
[Mahoney, Orecchia, Vishnoi 12], "A local spectral method..."
Fiedler: L v = D v λ
MOV, Fiedler with local bias: L v = D v λ + s (normalized seed vector s)
THM: MOV is a scaling of personalized PageRank!
Local Fiedler and diffusions
Intuition: why MOV ~ PageRank.
Fiedler: L v = D v λ
with local bias: L v = D v λ + s
(I − D^{−1/2} A D^{−1/2}) v̂ = v̂ λ + s
A D^{−1} v̂ = v̂ (1 − λ) + s
(I − α P) v̂ = s: a PageRank vector, a diffusion
PageRank and other diffusions
"Personalized" PageRank (PPR) [Andersen, Chung, Lang 06]: a local Cheeger inequality and a fast algorithm, the "push" procedure.
Diffusion perspective (standard setting):
x = Σ_{k=0}^∞ α^k P^k ŝ, i.e. (I − α P) x = ŝ
Heat kernel diffusion (HK): f = Σ_{k=0}^∞ (t^k / k!) P^k ŝ (many more!)
[Figure: weight vs. walk length for PPR (α = 0.85, 0.99) and HK (t = 1, 5, 15). Various diffusions weight walk lengths differently, so they explore different aspects of graphs.]
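Both diffusions are weighted sums of walk vectors P^k ŝ, so a dense truncated-series sketch makes the picture concrete (illustrative only; it assumes a small graph, column-stochastic P = A D^{−1}, and normalizes the heat-kernel weights by e^{−t} so the vector sums to 1):

```python
import math
import numpy as np

def diffusion(A, s, coeffs):
    """Truncated f = sum_k c_k P^k s with random-walk matrix P = A D^{-1}."""
    P = A / A.sum(axis=0)              # divide column j by deg(j)
    f, pk = np.zeros_like(s), s.copy()
    for c in coeffs:
        f += c * pk
        pk = P @ pk                    # advance the walk: p_{k+1} = P p_k
    return f

# Two triangles joined by one edge, seeded at vertex 0.
A = np.array([[0, 1, 1, 0, 0, 0], [1, 0, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1], [0, 0, 0, 1, 0, 1], [0, 0, 0, 1, 1, 0]],
             dtype=float)
s = np.zeros(6); s[0] = 1.0

alpha, t = 0.85, 5.0
ppr = diffusion(A, s, [alpha**k for k in range(200)])    # solves (I - aP)x = s
hk = diffusion(A, s, [math.exp(-t) * t**k / math.factorial(k)
                      for k in range(60)])               # normalized heat kernel
```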
Diffusions, theory & practice

              good conductance                                       fast algorithm
PR            local Cheeger inequality [Andersen, Chung, Lang 06]    "PPR-push" is O(1/(ε(1−α)))
HK            local Cheeger inequality [Chung 07]                    "HK-push" is O(e^t C/ε) [K., Gleich 2014]
TDPR [Avron, Horesh 2015]   open question                            this talk
Gen. diff.    open question                                          this talk

David Gleich and I are working with Olivia Simpson (a student of Fan Chung's).
General diffusions: intuition
A diffusion propagates "rank" from a seed across a graph.
[Figure: diffusion values are high near the seed and low far away; the high-value region is a local cluster / low-conductance set.]
General diffusions
A diffusion propagates "rank" from a seed across a graph.
General diffusion vector: f = Σ_{k=0}^∞ c_k P^k ŝ
f = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + ... (where p_k = P^k ŝ)
Sweep over f!
General algorithm
1. Approximate f by f̂ so that ||D^{−1}(f − f̂)||_∞ ≤ ε
2. Scale: D^{−1} f̂
3. Then sweep!
How to do this efficiently?
Algorithm intuition
From the parameters c_k, ε, and seed s... starting from all mass at the seed, how do we end up at f = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + ...?
Algorithm intuition
Begin with the mass at the seed(s) in a "residual" staging area, r_0.
The residuals r_0, r_1, r_2, ... hold mass that is unprocessed; it's like error.
Idea: "push" any entry with r_k(j)/d_j > (some threshold).
Push operation
push: (1) remove an entry from r_k, (2) put it in f (weighted by c_k), (3) then scale it and spread it to the neighbors in the next residual r_{k+1}. (Repeat.)
Thresholds
The error equals a weighted sum of the entries left below threshold in the residuals r_k, so set the threshold so the "leftovers" sum to < ε.
Threshold for stage r_k: ε / (Σ_{j=k+1}^∞ c_j)
Then ||D^{−1}(f − f̂)||_∞ ≤ ε.
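A simplified generalized-push sketch (my own illustration, not the paper's code): it uses the uniform per-stage threshold d(j)·ε/(2N) from the proof sketch later in the talk rather than the tail-weighted thresholds above, and the example graph and PageRank-style coefficients are made up:

```python
def general_push(adj, seed_mass, coeffs, eps):
    """Approximate f = sum_{k=0}^{N} c_k P^k s by settling only residual
    entries with r_k(j) >= d(j) * eps / (2N); the rest is left as error."""
    N = max(len(coeffs) - 1, 1)
    f, r = {}, dict(seed_mass)             # r is the stage-k residual
    for k, c in enumerate(coeffs):
        r_next = {}
        for j, mass in r.items():
            dj = len(adj[j])
            if mass < dj * eps / (2 * N):
                continue                   # sub-threshold mass stays as error
            f[j] = f.get(j, 0.0) + c * mass       # settle c_k-weighted mass
            if k < len(coeffs) - 1:               # spread to the next stage
                for i in adj[j]:
                    r_next[i] = r_next.get(i, 0.0) + mass / dj
        r = r_next
    return f

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
alpha, N, eps = 0.85, 20, 1e-3
coeffs = [(1 - alpha) * alpha**k for k in range(N + 1)]   # PageRank-style c_k
f_hat = general_push(adj, {0: 1.0}, coeffs, eps)
```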
Another perspective
Recall the chain from before:
Fiedler: L v = D v λ
with local bias: L v = D v λ + s
(I − D^{−1/2} A D^{−1/2}) v̂ = v̂ λ + s
A D^{−1} v̂ = v̂ (1 − λ) + s
(I − α P) v̂ = s: a PageRank vector, a diffusion
Another perspective
Fiedler, for k vectors: L V_k = D V_k Λ_k
with local bias: L V_k = D V_k Λ_k + S
(I − D^{−1/2} A D^{−1/2}) V̂_k = V̂_k Λ_k + Ŝ
A D^{−1} V̂_k = V̂_k (I − Λ_k) + Ŝ
V̂_k − P V̂_k Γ = S̃
By the mixed-product property of the Kronecker product:
(I − Γ^T ⊗ P) vec(V̂_k) = vec(S̃)
Another perspective
(I − Γ^T ⊗ P) vec(V̂_k) = vec(S̃) generalizes PageRank, (I − α P) v̂ = s̃, to a "matrix teleportation parameter".
Standard spectral approach: Γ = (I − Λ_k)^{−1}
Our framework is equivalent to: Γ = diag(c̃_0, ..., c̃_N)
(Details in [K., Gleich KDD 14].)
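The Kronecker step rests on the standard identity vec(P V Γ) = (Γ^T ⊗ P) vec(V), where vec stacks columns, so the matrix equation V̂_k − P V̂_k Γ = S̃ becomes the linear system above. A quick numerical sanity check with made-up random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3
P = rng.random((n, n))
Gamma = np.diag(rng.random(k))       # diagonal "matrix teleportation" parameter
V = rng.random((n, k))

vec = lambda M: M.flatten(order="F")     # column-stacking vectorization
lhs = vec(V - P @ V @ Gamma)             # matrix form of the local system
rhs = (np.eye(n * k) - np.kron(Gamma.T, P)) @ vec(V)
print(np.allclose(lhs, rhs))  # True
```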
General diffusions: conclusion
THM: For diffusion coefficients c_k ≥ 0 satisfying
Σ_{k=0}^∞ c_k = 1 and Σ_{k=N+1}^∞ c_k ≤ ε/2 (the "rate of decay"),
"generalized push" approximates the diffusion f on a symmetric graph so that ||D^{−1}(f − f̂)||_∞ ≤ ε, in work bounded by O(2N²/ε).
Constant for any inputs! (If the diffusion decays fast.)
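To see how N behaves for the standard diffusions, this small sketch (my own illustration) finds the number of terms needed so the tail Σ_{k>N} c_k drops below ε/2, for PageRank weights c_k = (1−α)α^k and heat-kernel weights c_k = e^{−t} t^k / k!:

```python
import math

def terms_needed(coeff, eps):
    """Smallest N with sum_{k > N} c_k <= eps / 2, assuming sum_k c_k = 1."""
    total, N = 0.0, 0
    while 1.0 - total > eps / 2:
        total += coeff(N)
        N += 1
    return N - 1

alpha, t, eps = 0.85, 5.0, 1e-4
N_pr = terms_needed(lambda k: (1 - alpha) * alpha**k, eps)
N_hk = terms_needed(lambda k: math.exp(-t) * t**k / math.factorial(k), eps)
print(N_pr, N_hk)  # the heat kernel's factorial decay needs far fewer terms
```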
Proof sketch
1. Stop pushing after N terms: Σ_{k=N+1}^∞ c_k ≤ ε/2.
2. In the first N terms, push a residual entry only if r_k(j) ≥ d(j) ε/(2N).
3. Total work is the cost of all pushes: Σ_{k=0}^{N−1} Σ_{t=1}^{m_k} d(j_t).
Push recap
Each push costs d(j) work: (1) remove an entry from r_k, (2) put it in f, (3) then scale it and spread it to the d(j) neighbors in the next residual.
Proof sketch
1. Stop pushing after N terms: Σ_{k=N+1}^∞ c_k ≤ ε/2.
2. In the first N terms, push a residual entry only if r_k(j) ≥ d(j) ε/(2N).
3. Total work is the cost of all pushes:
Σ_{k=0}^{N−1} Σ_{t=1}^{m_k} d(j_t) ≤ Σ_{k=0}^{N−1} Σ_{t=1}^{m_k} r_k(j_t) (2N)/ε
4. Each r_k sums to ≤ 1, since each push is added to f, which sums to 1: Σ_{t=1}^{m_k} r_k(j_t) ≤ 1.
So the total work is O(2N²/ε).
Solution Paths
What is the benefit of these "push" diffusions? A direct decomposition is a black box: feed in input, get output. In contrast, the iterative nature of "push" means running the algorithm is essentially "watching" the diffusion process occur.