node2vec: Scalable Feature Learning for Networks
Aditya Grover and Jure Leskovec. KDD 2016.
Presented by Haoxiang Wang. Feb 26, 2020.
Node Embeddings
[Figure: encoder maps nodes A and B of the input graph to points in the output embedding space]
- Intuition: Find embeddings of nodes in a d-dimensional space so that "similar" nodes in the graph have embeddings that are close together.
Setup
- Assume we have a graph G:
  - V is the vertex set (i.e., node set).
  - A is the adjacency matrix (assume binary).
Embedding Nodes
- Goal: encode nodes so that similarity in the embedding space (e.g., dot product) approximates similarity in the original network:
  $\text{similarity}(u, v) \approx z_v^\top z_u$
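To make the goal concrete, here is a minimal sketch (not from the slides; the embedding table `Z` and node names are illustrative) of scoring node similarity as a dot product in the embedding space:

```python
import numpy as np

# Toy embedding table: one d-dimensional vector per node (d = 4 here).
rng = np.random.default_rng(0)
Z = {node: rng.normal(size=4) for node in ["u", "v", "w"]}

def similarity(u, v):
    """Embedding-space similarity: the dot product z_u^T z_v."""
    return float(Z[u] @ Z[v])

print(similarity("u", "v"))  # after training, this should track graph similarity
```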
Random Walk Embeddings: Basic Idea
$z_u^\top z_v \approx$ probability that u and v co-occur on a random walk over the network
1. Estimate the probability of visiting node v on a random walk starting from node u, using some random walk strategy R.
2. Optimize embeddings to encode these random walk statistics.
Algorithm/Optimization of Random Walk Embeddings
1. Run short random walks starting from each node on the graph using some strategy R.
2. For each node u, collect N_R(u), the multiset* of nodes visited on random walks starting from u.
   (*N_R(u) can have repeat elements, since nodes can be visited multiple times on random walks.)
3. Optimize embeddings according to:
   $L = -\sum_{u \in V} \sum_{v \in N_R(u)} \log P(v \mid z_u)$, where
   $P(v \mid z_u) = \frac{\exp(z_u^\top z_v)}{\sum_{n \in V} \exp(z_u^\top z_n)}$
   In practice, the softmax denominator is approximated by random sampling based on some distribution over nodes (negative sampling).
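A hedged sketch of the pipeline above, using uniform (unbiased) walks for simplicity where node2vec would plug in its biased strategy R; all names (`adj`, `random_walk`, `neighborhood_multisets`, `full_softmax_loss`) are illustrative, and the full softmax shown here is what negative sampling approximates in practice:

```python
import random
import numpy as np

def random_walk(adj, start, length):
    """One short walk under a simple uniform strategy R."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(adj[walk[-1]]))
    return walk

def neighborhood_multisets(adj, num_walks=5, walk_length=10):
    """N_R(u): multiset of nodes visited on walks from u (repeats kept)."""
    return {u: [v for _ in range(num_walks)
                for v in random_walk(adj, u, walk_length)[1:]]
            for u in adj}

def full_softmax_loss(Z, N_R):
    """L = -sum_u sum_{v in N_R(u)} log P(v | z_u), with a full softmax
    over all nodes; in practice the denominator is subsampled."""
    nodes = list(Z)
    loss = 0.0
    for u, neighborhood in N_R.items():
        scores = np.array([Z[u] @ Z[n] for n in nodes])
        log_denom = np.log(np.exp(scores).sum())
        for v in neighborhood:
            loss -= (Z[u] @ Z[v]) - log_denom
    return loss

# Tiny usage example on a 4-node cycle graph.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
rng = np.random.default_rng(0)
Z = {u: rng.normal(size=4) for u in adj}
print(full_softmax_loss(Z, neighborhood_multisets(adj)))
```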
Node2vec: Biased Random Walks
- Idea: use flexible, biased random walks that can trade off between local and global views of the network (Grover and Leskovec, 2016).
- BFS (Breadth-First Search) and DFS (Depth-First Search): two classic strategies to define a neighborhood N_R(u) of a given node u:
  - N_BFS(u) = {s_1, s_2, s_3}: local, microscopic view
  - N_DFS(u) = {s_4, s_5, s_6}: global, macroscopic view
[Figure: node u with BFS neighbors s_1, s_2, s_3 nearby and a DFS path reaching s_4, s_5, s_6 farther out]
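A small sketch contrasting the two neighborhood definitions (the graph representation and function names are illustrative, not from the paper):

```python
from collections import deque

def bfs_neighborhood(graph, u, k=3):
    """N_BFS(u): the first k nodes reached breadth-first (local view)."""
    seen, order, queue = {u}, [], deque([u])
    while queue and len(order) < k:
        for x in graph[queue.popleft()]:
            if x not in seen and len(order) < k:
                seen.add(x)
                order.append(x)
                queue.append(x)
    return order

def dfs_neighborhood(graph, u, k=3):
    """N_DFS(u): the first k nodes reached depth-first (global view)."""
    seen, order = {u}, []
    def visit(x):
        for y in graph[x]:
            if len(order) < k and y not in seen:
                seen.add(y)
                order.append(y)
                visit(y)
    visit(u)
    return order

graph = {"u": ["s1", "s4"], "s1": ["u", "s2"], "s2": ["s1", "s3"],
         "s3": ["s2"], "s4": ["u", "s5"], "s5": ["s4", "s6"], "s6": ["s5"]}
print(bfs_neighborhood(graph, "u"))  # ['s1', 's4', 's2'] -- stays close to u
print(dfs_neighborhood(graph, "u"))  # ['s1', 's2', 's3'] -- follows one branch deep
```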
Combine BFS + DFS by a Ratio
Biased random walk R that, given a node u, generates the neighborhood N_R(u).
[Figure: walker arrived at w from s_1; where to go next? Unnormalized transition probabilities: back to s_1 with 1/p, to s_2 (distance 1 from s_1) with 1, to s_3 (distance 2 from s_1) with 1/q]
- Two parameters:
  - Return parameter p: probability of returning back to the previous node. BFS-like walk: low value of p.
  - Walk-away (in-out) parameter q: moving outwards (DFS) vs. inwards (BFS). DFS-like walk: low value of q.
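A sketch of the second-order transition rule above (the p/q semantics follow the paper; `graph`, `transition_weights`, and `step` are illustrative names):

```python
import random

def transition_weights(graph, prev, curr, p=1.0, q=1.0):
    """Unnormalized probabilities for the next step from `curr`,
    given the walk arrived from `prev`. graph: node -> set of neighbors."""
    weights = {}
    for x in graph[curr]:
        if x == prev:               # return to the previous node
            weights[x] = 1.0 / p
        elif x in graph[prev]:      # distance 1 from prev: BFS-like move
            weights[x] = 1.0
        else:                       # distance 2 from prev: DFS-like move
            weights[x] = 1.0 / q
    return weights

def step(graph, prev, curr, p, q):
    """Sample the walker's next node from the normalized weights."""
    nodes, w = zip(*transition_weights(graph, prev, curr, p, q).items())
    return random.choices(nodes, weights=w)[0]

# Example matching the slide: walker at w, arrived from s1.
graph = {"s1": {"w", "s2", "u"}, "w": {"s1", "s2", "s3"},
         "s2": {"s1", "w"}, "s3": {"w"}, "u": {"s1"}}
print(transition_weights(graph, prev="s1", curr="w", p=2.0, q=0.5))
# {'s1': 0.5, 's2': 1.0, 's3': 2.0}  (key order may vary)
```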
Benchmarks: Node Classification & Link Prediction
[Figure: node classification predicts unknown node labels (?) from labeled nodes via machine learning; link prediction predicts missing edges (?) via machine learning]
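For link prediction, the paper turns a pair of node embeddings into a single edge feature using a binary operator (Table 1 of Grover & Leskovec, 2016, lists average, Hadamard, and weighted L1/L2), then trains an off-the-shelf classifier on those features. A sketch with illustrative names:

```python
import numpy as np

def edge_features(z_u, z_v, op="hadamard"):
    """Compose two node embeddings into an edge feature vector."""
    if op == "average":
        return (z_u + z_v) / 2.0
    if op == "hadamard":
        return z_u * z_v            # element-wise product
    if op == "weighted_l1":
        return np.abs(z_u - z_v)
    if op == "weighted_l2":
        return (z_u - z_v) ** 2
    raise ValueError(f"unknown operator: {op}")
```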
Empirical Results
[Figures: node classification and link prediction benchmark results]
Advantages of Node2Vec
- node2vec performs better on node classification compared with other node embedding methods.
- Random walk approaches are generally more efficient (i.e., O(|E|) vs. O(|V|^2)).
- (Note: in general, one must choose a definition of node similarity that matches the application.)
Other Random Walk Node Embedding Works
- Different kinds of biased random walks:
  - Based on node attributes (Dong et al., 2017).
  - Based on learned weights (Abu-El-Haija et al., 2017).
- Alternative optimization schemes:
  - Directly optimize based on 1-hop and 2-hop random walk probabilities (as in LINE from Tang et al., 2015).
- Network preprocessing techniques:
  - Run random walks on modified versions of the original network (e.g., Ribeiro et al. 2017's struc2vec, Chen et al. 2016's HARP).