Belief Propagation - Probabilistic Graphical Models, Sharif University - PowerPoint PPT Presentation



SLIDE 1

Sum-Product: Message Passing (Belief Propagation)

Probabilistic Graphical Models, Sharif University of Technology, Spring 2017, Soleymani

SLIDE 2

All single-node marginals

 If we need the full set of marginals, repeating the elimination algorithm for each individual variable is wasteful:
 it does not share intermediate terms.
 Message-passing algorithms on graphs (the messages are the shared intermediate terms):
 sum-product and junction tree.
 Upon convergence of these algorithms, we obtain marginal probabilities for all cliques of the original graph.

SLIDE 3

Tree

 Sum-product works only on trees (and we will see it also works on tree-like graphs).
 Directed tree: all nodes have exactly one parent, except the root.
 Undirected tree: there is a unique path between any pair of nodes.

SLIDE 4

Parameterization

 Consider a tree $\mathcal{U}(\mathcal{W}, \mathcal{E})$
 Potential functions: $\varrho(y_j)$, $\varrho(y_j, y_k)$

$$Q(\mathbf{y}) = \frac{1}{a} \prod_{j \in \mathcal{W}} \varrho(y_j) \prod_{(j,k) \in \mathcal{E}} \varrho(y_j, y_k)$$

 In directed graphs (with root $s$):
 $\varrho(y_s) = Q(y_s)$, and $\varrho(y_j) = 1$ for all $j \neq s$
 $\varrho(y_j, y_k) = Q(y_k \mid y_j)$ ($y_j$ is the parent of $y_k$)
 $a = 1$

 When we have evidence $y_j = \bar{y}_j$ on variable $y_j$, we replace $y_j$ by $\bar{y}_j$ in all factors in which it appears.

$$Q(\mathbf{y}) = Q(y_s) \prod_{(j,k) \in \mathcal{E}} Q(y_k \mid y_j)$$
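The directed-tree parameterization can be checked numerically: with $\varrho(y_s) = Q(y_s)$, $\varrho(y_j, y_k) = Q(y_k \mid y_j)$, and $a = 1$, the product of potentials is already a normalized joint. A minimal sketch for an assumed binary chain $Y_1 \to Y_2 \to Y_3$ with illustrative CPT numbers:

```python
import numpy as np
from itertools import product

# Directed chain Y1 -> Y2 -> Y3 (binary). Per the slide: varrho(y_s) = Q(y_s),
# varrho(y_j, y_k) = Q(y_k | y_j), a = 1. CPT numbers below are illustrative.
p1 = np.array([0.7, 0.3])                  # Q(y_1)
p2g1 = np.array([[0.9, 0.1], [0.4, 0.6]])  # Q(y_2 | y_1), rows indexed by y_1
p3g2 = np.array([[0.5, 0.5], [0.2, 0.8]])  # Q(y_3 | y_2), rows indexed by y_2

# Sum of the product of potentials over all configurations.
total = sum(p1[a] * p2g1[a, b] * p3g2[b, c]
            for a, b, c in product(range(2), repeat=3))
assert abs(total - 1.0) < 1e-12            # a = 1: already normalized
```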

SLIDE 5

Sum-product: elimination view

 Query node $s$
 Elimination order: inverse of a topological order
 Starts from the leaves and generates elimination cliques of size at most two
 Elimination of each node can be considered as message passing (or Belief Propagation):
 Elimination on trees is equivalent to message passing along tree branches
 Instead of eliminating a node, we preserve the node and compute a message from it to its parent
 This message is equivalent to the factor resulting from the elimination of that node and all of the nodes in its subtree

SLIDE 6

Messages

Message that node $k$ sends to node $j$, directed toward the root

SLIDE 7

Messages on a tree

 Messages can be reused to find probabilities of different query variables.
 Messages on the tree provide a data structure for caching computations.

[Figure: a five-node tree over $Y_1, \ldots, Y_5$.] We need $n_{32}(y_2)$ to find both $Q(Y_1)$ and $Q(Y_2)$.

SLIDE 8

Messages and marginal distribution

Message that node $k$ sends to node $j$:

$$n_{kj}(y_j) = \sum_{y_k} \varrho(y_k)\, \varrho(y_j, y_k) \prod_{l \in \mathcal{O}(k) \setminus \{j\}} n_{lk}(y_k)$$

This message is a function of only $y_j$.

Marginal at the query node $s$:

$$q(y_s) \propto \varrho(y_s) \prod_{l \in \mathcal{O}(s)} n_{ls}(y_s)$$

SLIDE 9

Messages and marginal: Example

$$q(y_2) \propto \varrho(y_2)\, n_{12}(y_2)\, n_{32}(y_2)\, n_{42}(y_2)$$

$$n_{12}(y_2) = \sum_{y_1} \varrho(y_1)\, \varrho(y_1, y_2)$$
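As a quick numerical check of the leaf message $n_{12}(y_2)$ above, assuming illustrative binary potentials for $Y_1$ and the edge $(Y_1, Y_2)$:

```python
import numpy as np

# Leaf-to-parent message on the edge Y1 - Y2 (binary variables).
# Potential values are illustrative placeholders.
phi1 = np.array([0.6, 0.4])         # varrho(y_1)
psi12 = np.array([[0.9, 0.1],
                  [0.2, 0.8]])      # varrho(y_1, y_2), rows indexed by y_1

# n_{12}(y_2) = sum_{y_1} varrho(y_1) varrho(y_1, y_2)
n12 = psi12.T @ phi1                # n12 == [0.62, 0.38]
```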

SLIDE 10

Computing all node marginals

 We can cover all possible elimination orders (those generating only elimination cliques of size 2) by computing all possible messages, $2|\mathcal{E}|$ in total
 To allow every node to be the root, we only need to compute $2|\mathcal{E}|$ messages
 Messages can be reused
 Instead of running the elimination algorithm once per query node
 Dynamic programming approach
 2-pass algorithm that saves and reuses messages
 A pair of messages (one for each direction) is computed for each edge

SLIDE 11

Messages required to compute all node marginals

SLIDE 12

A two-pass message-passing schedule

 Arbitrarily pick a node as the root
 First pass: starts at the leaves and proceeds inward
 Each node passes a message to its parent.
 Continues until the root has obtained messages from all of its adjoining nodes.
 Second pass: starts at the root and passes the messages back out
 Messages are passed in the reverse direction.
 Continues until all leaves have received their messages.
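The two passes can be sketched on a small undirected tree. The 4-node shape and the potentials below are illustrative assumptions; the resulting marginals are verified against brute-force enumeration of the joint:

```python
import numpy as np
from itertools import product

# Toy 4-node tree: edges 0-1, 1-2, 1-3 (all variables binary).
# Potentials are arbitrary illustrative values.
edges = [(0, 1), (1, 2), (1, 3)]
nbrs = {i: [] for i in range(4)}
for a, b in edges:
    nbrs[a].append(b); nbrs[b].append(a)

rng = np.random.default_rng(0)
phi = {i: rng.random(2) + 0.1 for i in range(4)}       # varrho(y_i)
psi = {e: rng.random((2, 2)) + 0.1 for e in edges}     # varrho(y_j, y_k)

def edge_pot(j, k):
    """varrho(y_j, y_k) oriented so rows index y_j."""
    return psi[(j, k)] if (j, k) in psi else psi[(k, j)].T

def message(k, j, msgs):
    """n_{kj}(y_j) = sum_{y_k} varrho(y_k) varrho(y_j,y_k) prod_{l != j} n_{lk}(y_k)."""
    prod = phi[k].copy()
    for l in nbrs[k]:
        if l != j:
            prod = prod * msgs[(l, k)]
    return edge_pot(j, k) @ prod

# Two-pass schedule with root 0: inward (leaves -> root), then outward.
msgs = {}
def inward(k, parent):
    for l in nbrs[k]:
        if l != parent:
            inward(l, k)
    if parent is not None:
        msgs[(k, parent)] = message(k, parent, msgs)

def outward(k, parent):
    for l in nbrs[k]:
        if l != parent:
            msgs[(k, l)] = message(k, l, msgs)
            outward(l, k)

inward(0, None); outward(0, None)

def marginal(s):
    q = phi[s].copy()
    for l in nbrs[s]:
        q = q * msgs[(l, s)]
    return q / q.sum()

# Brute-force check against direct enumeration of the joint.
def brute(s):
    q = np.zeros(2)
    for y in product(range(2), repeat=4):
        p = np.prod([phi[i][y[i]] for i in range(4)])
        p *= np.prod([psi[(a, b)][y[a], y[b]] for a, b in edges])
        q[y[s]] += p
    return q / q.sum()

for s in range(4):
    assert np.allclose(marginal(s), brute(s))
```

Note that after the two passes every directed edge carries a message, so all four marginals come out of the same $2|\mathcal{E}| = 6$ messages.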

SLIDE 13

Asynchronous two-pass message-passing

First pass: upward. Second pass: downward.

SLIDE 14

Sum-product algorithm: example

[Figure: message $n_{21}(y_1)$ in the example tree.]

SLIDE 15

Sum-product algorithm: example

[Figure: message $n_{21}(y_1)$.]

SLIDE 16

Parallel message-passing

 Message-passing protocol: a node can send a message to a neighboring node when, and only when, it has received messages from all of its other neighbors
 Correctness of parallel message-passing on trees:
 The synchronous implementation is "non-blocking"
 Theorem: the message-passing protocol guarantees obtaining all marginals in the tree
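The protocol above can be sketched as a synchronous loop in which a directed message is sent as soon as its prerequisite messages are available; on a tree the loop never blocks and terminates after a few sweeps. The tree shape and potentials below are illustrative assumptions:

```python
import numpy as np

# Tree: edges 0-1, 1-2, 1-3 (binary variables); potentials are placeholders.
edges = [(0, 1), (1, 2), (1, 3)]
nbrs = {i: [] for i in range(4)}
for a, b in edges:
    nbrs[a].append(b); nbrs[b].append(a)
rng = np.random.default_rng(0)
phi = {i: rng.random(2) + 0.1 for i in range(4)}
psi = {e: rng.random((2, 2)) + 0.1 for e in edges}
pot = lambda j, k: psi[(j, k)] if (j, k) in psi else psi[(k, j)].T

# Protocol: edge message n_{k->j} may be sent once n_{l->k} has arrived
# from every other neighbor l of k.
msgs, pending = {}, [(k, j) for j in range(4) for k in nbrs[j]]
rounds = 0
while pending:                         # never blocks on a tree
    ready = [(k, j) for (k, j) in pending
             if all((l, k) in msgs for l in nbrs[k] if l != j)]
    for k, j in ready:
        v = phi[k].copy()
        for l in nbrs[k]:
            if l != j:
                v = v * msgs[(l, k)]
        msgs[(k, j)] = pot(j, k) @ v
    pending = [e for e in pending if e not in ready]
    rounds += 1

assert len(msgs) == 2 * len(edges)     # all node marginals now available
```

In the first sweep only the leaves are ready; in the second sweep the interior node sends in every direction, so this tree finishes in two rounds.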

SLIDE 17

Parallel message passing: Example

SLIDE 18

Tree-like graphs

 The sum-product message-passing idea can also be extended to work on tree-like graphs (e.g., polytrees).
 Although the undirected (moralized) graphs resulting from polytrees are not trees, the corresponding factor graph is a tree.

[Figure: a polytree (nodes can have multiple parents), its moralized graph, and its factor graph.]

SLIDE 19

Recall: Factor graph

Two factor-graph parameterizations of the same potential:

$$\varrho(y_1, y_2, y_3) = g_b(y_1, y_2)\, g_c(y_1, y_3)\, g_d(y_2, y_3)$$

$$\varrho(y_1, y_2, y_3) = g(y_1, y_2, y_3)$$

SLIDE 20

Sum-product on factor trees

 Factor tree: a factor graph with no loops
 Two types of messages:
 Message that flows from variable node $j$ to factor node $t$:

$$w_{jt}(y_j) = \prod_{u \in \mathcal{O}(j) \setminus \{t\}} \nu_{uj}(y_j)$$

 Message that flows from factor node $t$ to variable node $j$:

$$\nu_{tj}(y_j) = \sum_{\mathbf{y}_{\mathcal{O}(t) \setminus \{j\}}} g_t(\mathbf{y}_{\mathcal{O}(t)}) \prod_{k \in \mathcal{O}(t) \setminus \{j\}} w_{kt}(y_k)$$
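A minimal sketch of the two message types on an assumed three-variable factor tree $Y_1 - g_a - Y_2 - g_b - Y_3$ (factor tables are arbitrary placeholders); the marginal $Q(y_2)$ obtained from the $\nu$ messages is checked against enumeration:

```python
import numpy as np
from itertools import product

# Factor tree: Y1 - g_a - Y2 - g_b - Y3 (binary variables).
# Factor tables are illustrative placeholders.
ga = np.random.default_rng(1).random((2, 2)) + 0.1   # g_a(y1, y2)
gb = np.random.default_rng(2).random((2, 2)) + 0.1   # g_b(y2, y3)

# Leaf variable nodes send the all-ones message w (empty product).
w1a = np.ones(2)        # w_{1,a}(y_1)
w3b = np.ones(2)        # w_{3,b}(y_3)

# Factor-to-variable messages: sum out the other variables of the factor.
nu_a2 = ga.T @ w1a      # nu_{a,2}(y_2) = sum_{y1} g_a(y1,y2) w_{1,a}(y1)
nu_b2 = gb @ w3b        # nu_{b,2}(y_2) = sum_{y3} g_b(y2,y3) w_{3,b}(y3)

# Q(y_2) is proportional to the product of incoming nu messages.
q2 = nu_a2 * nu_b2
q2 /= q2.sum()

# Brute-force check against enumeration of the joint.
brute = np.zeros(2)
for y1, y2, y3 in product(range(2), repeat=3):
    brute[y2] += ga[y1, y2] * gb[y2, y3]
brute /= brute.sum()
assert np.allclose(q2, brute)
```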

SLIDE 21

Sum-product on factor trees

 The message-passing schedule introduced for trees can also be used on factor trees
 When the messages from all the neighbors of a node are received, the marginal probability is:

$$Q(y_j) \propto \prod_{t \in \mathcal{O}(j)} \nu_{tj}(y_j)$$

Equivalently, for any single neighboring factor $t \in \mathcal{O}(j)$:

$$Q(y_j) \propto w_{jt}(y_j)\, \nu_{tj}(y_j)$$

Here $t$ denotes a factor node that is a neighbor of $Y_j$.

SLIDE 22

The relation between sum-product on factor trees and sum-product on undirected trees

 Relation between the $n$ messages of the sum-product algorithm for undirected trees and the $\nu$ messages of the sum-product algorithm for factor trees, for a pairwise factor $t$ connecting variables $j$ and $k$:

$$\nu_{tj}(y_j) = \sum_{\mathbf{y}_{\mathcal{O}(t) \setminus \{j\}}} g_t(\mathbf{y}_{\mathcal{O}(t)}) \prod_{k \in \mathcal{O}(t) \setminus \{j\}} w_{kt}(y_k) = \sum_{y_k} \varrho(y_j, y_k)\, w_{kt}(y_k)$$

$$= \sum_{y_k} \varrho(y_j, y_k) \prod_{u \in \mathcal{O}(k) \setminus \{t\}} \nu_{uk}(y_k) = \sum_{y_k} \varrho(y_k)\, \varrho(y_j, y_k) \prod_{u \in \mathcal{O}'(k) \setminus \{t\}} \nu_{uk}(y_k)$$

where $\mathcal{O}'(k) = \mathcal{O}(k) \setminus \{\text{factor corresponding to } \varrho(y_k)\}$; absorbing the singleton factor's message this way recovers the $n_{kj}$ update of the undirected-tree algorithm.

SLIDE 23

Example

SLIDE 24

References

 D. Koller and N. Friedman, "Probabilistic Graphical Models: Principles and Techniques", MIT Press, 2009, Chapter 10.
 M. I. Jordan, "An Introduction to Probabilistic Graphical Models", Chapter 4.