Sum-Product: Message Passing
Belief Propagation
Probabilistic Graphical Models
Sharif University of Technology, Spring 2017
Soleymani

All single-node marginals
If we need the full set of marginals, repeating the elimination algorithm for each individual variable is wasteful: it does not share intermediate terms.
Message-passing algorithms on graphs reuse them (the messages are the shared intermediate terms):
Sum-product and junction tree: upon convergence of the algorithms, we obtain marginal probabilities for all cliques of the original graph.
Tree
Sum-product works only on trees (and, as we will see, it also works on tree-like graphs).
Directed tree: every node has exactly one parent, except the root.
Undirected tree: there is a unique path between any pair of nodes.
Parameterization
Consider a tree 𝒰(𝒲, ℰ) with potential functions 𝜚(y_j) and 𝜚(y_j, y_k):

  Q(𝒚) = (1/a) ∏_{j∈𝒲} 𝜚(y_j) ∏_{(j,k)∈ℰ} 𝜚(y_j, y_k)

In directed graphs (with root s):
  𝜚(y_s) = Q(y_s), and ∀j ≠ s, 𝜚(y_j) = 1
  𝜚(y_j, y_k) = Q(y_k | y_j)   (y_j is the parent of y_k)
  a = 1
so that
  Q(𝒚) = Q(y_s) ∏_{(j,k)∈ℰ} Q(y_k | y_j)

When we have evidence y_j = ȳ_j on a variable y_j, we replace y_j by ȳ_j in all factors in which it appears.
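The claim that a = 1 under the directed parameterization can be checked numerically. A minimal sketch (all numbers hypothetical) for a directed tree y1 → y2, y1 → y3 over binary variables:

```python
import itertools

# A small directed tree on binary variables: y1 -> y2, y1 -> y3.
# (Hypothetical numbers, for illustration only.)
Q_y1 = {0: 0.6, 1: 0.4}                      # root marginal Q(y1)
Q_y2_given_y1 = {0: {0: 0.7, 1: 0.3},        # Q(y2 | y1)
                 1: {0: 0.2, 1: 0.8}}
Q_y3_given_y1 = {0: {0: 0.5, 1: 0.5},        # Q(y3 | y1)
                 1: {0: 0.9, 1: 0.1}}

# Potentials as on the slide: the root keeps its marginal, all other
# node potentials are 1, and each edge potential is the child's CPD.
def node_pot(j, y):
    return Q_y1[y] if j == 1 else 1.0

def edge_pot_12(y1, y2): return Q_y2_given_y1[y1][y2]
def edge_pot_13(y1, y3): return Q_y3_given_y1[y1][y3]

def Q(y1, y2, y3):
    return (node_pot(1, y1) * node_pot(2, y2) * node_pot(3, y3)
            * edge_pot_12(y1, y2) * edge_pot_13(y1, y3))

# With this parameterization the normalizer a equals 1:
total = sum(Q(*y) for y in itertools.product([0, 1], repeat=3))
```

Each edge potential is the child's conditional probability table, so summing the product of all potentials over every assignment recovers 1 without any normalization.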
Sum-product: elimination view
Query node: s. Elimination order: the inverse of a topological order.
The algorithm starts from the leaves and generates elimination cliques of size at most two.
Elimination of each node can be viewed as message passing (belief propagation): elimination on trees is equivalent to message passing along the tree branches.
Instead of eliminating a node, we preserve the node and compute a message from it to its parent.
This message is equal to the factor resulting from the elimination of that node and all of the nodes in its subtree.
Messages
(Figure: the message that node k sends to node j, flowing toward the root.)
Messages on a tree
Messages can be reused to find the probabilities of different query variables.
Messages on the tree provide a data structure for caching computations.
(Figure: a tree on Y_1, …, Y_5. We need n_32(y_2) to find both Q(Y_1) and Q(Y_2).)
Messages and marginal distribution
Message that Y_k sends to Y_j:

  n_kj(y_j) = Σ_{y_k} 𝜚(y_k) 𝜚(y_j, y_k) ∏_{l∈𝒪(k)\{j}} n_lk(y_k)

The message is a function of y_j only. The marginal at the query node s is

  q(y_s) ∝ 𝜚(y_s) ∏_{l∈𝒪(s)} n_ls(y_s)

where 𝒪(k) denotes the set of neighbors of node k.
Messages and marginal: Example
  q(y_2) ∝ 𝜚(y_2) n_12(y_2) n_32(y_2) n_42(y_2)

  n_12(y_2) = Σ_{y_1} 𝜚(y_1) 𝜚(y_1, y_2)
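This example can be verified directly. A small sketch (hypothetical potentials) for a star with y_2 in the center and leaves y_1, y_3, y_4, checking the message-based marginal against brute-force enumeration:

```python
import itertools

# Hypothetical potentials on a star: y2 in the center, leaves y1, y3, y4.
pot = {1: [0.5, 0.5], 2: [0.3, 0.7], 3: [0.6, 0.4], 4: [0.8, 0.2]}
edge = {(1, 2): [[1.0, 0.5], [0.5, 1.0]],
        (3, 2): [[0.9, 0.1], [0.2, 0.8]],
        (4, 2): [[0.4, 0.6], [0.7, 0.3]]}

def message(k, j):
    # n_kj(y_j) = sum_{y_k} pot(y_k) * pot(y_k, y_j)   (k is a leaf here)
    return [sum(pot[k][yk] * edge[(k, j)][yk][yj] for yk in (0, 1))
            for yj in (0, 1)]

n12, n32, n42 = message(1, 2), message(3, 2), message(4, 2)
q2 = [pot[2][y2] * n12[y2] * n32[y2] * n42[y2] for y2 in (0, 1)]
Z = sum(q2)
q2 = [v / Z for v in q2]

# Brute-force check against the normalized joint distribution.
def joint(y1, y2, y3, y4):
    return (pot[1][y1] * pot[2][y2] * pot[3][y3] * pot[4][y4]
            * edge[(1, 2)][y1][y2] * edge[(3, 2)][y3][y2]
            * edge[(4, 2)][y4][y2])

marg = [sum(joint(y1, y2, y3, y4)
            for y1, y3, y4 in itertools.product((0, 1), repeat=3))
        for y2 in (0, 1)]
Zb = sum(marg)
assert all(abs(q2[y] - marg[y] / Zb) < 1e-12 for y in (0, 1))
```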
Computing all node marginals
We can cover all possible elimination orders (those generating only elimination cliques of size at most 2) by computing all possible messages: 2|ℰ| of them.
To allow every node to act as the root, we only need to compute these 2|ℰ| messages.
Messages can be reused, instead of running the elimination algorithm once per query node.
Dynamic programming approach: a 2-pass algorithm that saves and reuses messages.
After the two passes, a pair of messages (one for each direction) has been computed for each edge.
Messages required to compute all node marginals
A two-pass message-passing schedule
Arbitrarily pick a node as the root.
First pass: starts at the leaves and proceeds inward; each node passes a message to its parent. This continues until the root has obtained messages from all of its adjoining nodes.
Second pass: starts at the root; messages are passed back out, in the reverse direction. This continues until all leaves have received their messages.
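A sketch of the two-pass idea under the slides' assumptions (binary variables, node and edge potentials on a small tree; all numbers hypothetical). The upward and downward passes are realized implicitly here by memoized recursion: querying every node computes exactly one message per direction of each edge, and the results are checked by brute-force enumeration.

```python
import math
from itertools import product

# Hypothetical tree with edges 1-2, 2-3, 2-4 over binary variables.
nodes = [1, 2, 3, 4]
adj = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}
pot = {1: [0.4, 0.6], 2: [0.5, 0.5], 3: [0.7, 0.3], 4: [0.1, 0.9]}
edges = {(1, 2): [[2.0, 1.0], [1.0, 3.0]],
         (2, 3): [[1.0, 0.5], [0.5, 1.0]],
         (2, 4): [[0.3, 0.7], [0.7, 0.3]]}

def edge_pot(k, j, yk, yj):
    # Look up the pairwise potential regardless of message direction.
    if (k, j) in edges:
        return edges[(k, j)][yk][yj]
    return edges[(j, k)][yj][yk]

msgs = {}  # cache: each directed message is computed at most once

def msg(k, j):
    # n_kj(y_j) = sum_{y_k} pot_k(y_k) pot(y_k, y_j)
    #             * prod over other neighbors l of k of n_lk(y_k)
    if (k, j) not in msgs:
        msgs[(k, j)] = [
            sum(pot[k][yk] * edge_pot(k, j, yk, yj)
                * math.prod(msg(l, k)[yk] for l in adj[k] if l != j)
                for yk in (0, 1))
            for yj in (0, 1)]
    return msgs[(k, j)]

def marginal(s):
    q = [pot[s][ys] * math.prod(msg(l, s)[ys] for l in adj[s])
         for ys in (0, 1)]
    z = sum(q)
    return [v / z for v in q]

# Check every node's marginal against brute-force enumeration.
def joint(ys):
    p = math.prod(pot[n][ys[n - 1]] for n in nodes)
    for (a, b), tab in edges.items():
        p *= tab[ys[a - 1]][ys[b - 1]]
    return p

for s in nodes:
    bf = [0.0, 0.0]
    for ys in product((0, 1), repeat=4):
        bf[ys[s - 1]] += joint(ys)
    z = sum(bf)
    assert all(abs(marginal(s)[y] - bf[y] / z) < 1e-12 for y in (0, 1))
```

After all four marginals have been queried, the cache holds exactly 2|ℰ| = 6 messages, matching the dynamic-programming count on the previous slide.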
Asynchronous two-pass message-passing
First pass: upward. Second pass: downward.
Sum-product algorithm: example
(Figures: the messages, e.g. n_21(y_1), computed step by step on an example tree.)
Parallel message-passing
Message-passing protocol: a node can send a message to a neighboring node when, and only when, it has received messages from all of its other neighbors.
Correctness of parallel message-passing on trees:
The synchronous implementation is “non-blocking.”
Theorem: the message-passing protocol guarantees obtaining all marginals in the tree.
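The protocol can be simulated directly. A minimal sketch (hypothetical potentials, path graph 1-2-3): repeatedly fire every directed message whose protocol condition is satisfied; on a tree some message can always be sent, so the loop never blocks.

```python
import math
from itertools import product

# Hypothetical path graph 1-2-3 over binary variables.
adj = {1: [2], 2: [1, 3], 3: [2]}
pot = {1: [0.2, 0.8], 2: [0.5, 0.5], 3: [0.9, 0.1]}
edges = {(1, 2): [[1.0, 2.0], [2.0, 1.0]],
         (2, 3): [[0.6, 0.4], [0.4, 0.6]]}

def edge_pot(k, j, yk, yj):
    if (k, j) in edges:
        return edges[(k, j)][yk][yj]
    return edges[(j, k)][yj][yk]

msgs = {}
pending = [(k, j) for k in adj for j in adj[k]]  # all 2|E| directed edges
while pending:
    sent = []
    for (k, j) in pending:
        # Protocol check: messages from all other neighbors of k received?
        if all((l, k) in msgs for l in adj[k] if l != j):
            msgs[(k, j)] = [
                sum(pot[k][yk] * edge_pot(k, j, yk, yj)
                    * math.prod(msgs[(l, k)][yk] for l in adj[k] if l != j)
                    for yk in (0, 1))
                for yj in (0, 1)]
            sent.append((k, j))
    assert sent, "on a tree the protocol never blocks"
    pending = [e for e in pending if e not in sent]

# All 2|E| = 4 messages were produced, so every marginal is available, e.g.:
q2 = [pot[2][y] * msgs[(1, 2)][y] * msgs[(3, 2)][y] for y in (0, 1)]
z = sum(q2)
q2 = [v / z for v in q2]

# Brute-force check of Q(y2).
bf = [0.0, 0.0]
for y1, y2, y3 in product((0, 1), repeat=3):
    bf[y2] += (pot[1][y1] * pot[2][y2] * pot[3][y3]
               * edges[(1, 2)][y1][y2] * edges[(2, 3)][y2][y3])
zb = sum(bf)
assert all(abs(q2[y] - bf[y] / zb) < 1e-12 for y in (0, 1))
```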
Parallel message passing: Example
Tree-like graphs
The sum-product message-passing idea can also be extended to tree-like graphs (e.g., polytrees).
Although the moralized undirected graph obtained from a polytree is not a tree, the corresponding factor graph is a tree.
(Figure: a polytree, in which nodes can have multiple parents; its moralized graph; and its factor graph.)
Recall: Factor graph
  𝜚(y_1, y_2, y_3) = g_b(y_1, y_2) g_c(y_1, y_3) g_d(y_2, y_3)

  𝜚(y_1, y_2, y_3) = g(y_1, y_2, y_3)
Sum-product on factor trees
Factor tree: a factor graph with no loops.
Two types of messages:

Message that flows from variable node j to factor node t:

  w_jt(y_j) = ∏_{u∈𝒪(j)\{t}} ν_uj(y_j)

Message that flows from factor node t to variable node j:

  ν_tj(y_j) = Σ_{𝒚_{𝒪(t)\{j}}} g_t(𝒚_{𝒪(t)}) ∏_{k∈𝒪(t)\{j}} w_kt(y_k)
Sum-product on factor trees
The message-passing schedule introduced for trees can also be used on factor trees.
When the messages from all the neighbors of a node have been received, the marginal probability is

  Q(y_j) ∝ ∏_{t∈𝒪(j)} ν_tj(y_j)

or equivalently, for any single neighboring factor t,

  Q(y_j) ∝ w_jt(y_j) ν_tj(y_j)

where t ∈ 𝒪(j) ranges over the factor nodes that are neighbors of Y_j.
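The two message types can be exercised on a tiny factor tree (all factors hypothetical): a unary factor fa on y_1 and pairwise factors fb(y_1, y_2), fc(y_2, y_3), with the marginal Q(y_2) checked by brute force.

```python
from itertools import product

# Hypothetical factor tree: variables 1, 2, 3 and factors
#   fa(y1)        unary
#   fb(y1, y2)    pairwise
#   fc(y2, y3)    pairwise
fa = [0.4, 0.6]
fb = [[2.0, 1.0], [1.0, 3.0]]
fc = [[1.0, 0.5], [0.5, 1.0]]

# nu messages (factor -> variable) and w messages (variable -> factor),
# computed leaf-to-target toward variable node 2.
nu_fa_1 = [fa[y1] for y1 in (0, 1)]            # leaf factor: no w inputs
w_1_fb = [nu_fa_1[y1] for y1 in (0, 1)]        # only other neighbor is fa
nu_fb_2 = [sum(fb[y1][y2] * w_1_fb[y1] for y1 in (0, 1)) for y2 in (0, 1)]

w_3_fc = [1.0, 1.0]                            # variable 3 is a leaf
nu_fc_2 = [sum(fc[y2][y3] * w_3_fc[y3] for y3 in (0, 1)) for y2 in (0, 1)]

# Q(y2) is proportional to the product of incoming nu messages.
q2 = [nu_fb_2[y2] * nu_fc_2[y2] for y2 in (0, 1)]
Z = sum(q2)
q2 = [v / Z for v in q2]

# Brute-force check: Q(y) proportional to fa(y1) fb(y1,y2) fc(y2,y3).
bf = [0.0, 0.0]
for y1, y2, y3 in product((0, 1), repeat=3):
    bf[y2] += fa[y1] * fb[y1][y2] * fc[y2][y3]
Zb = sum(bf)
assert all(abs(q2[y] - bf[y] / Zb) < 1e-12 for y in (0, 1))
```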
The relation between sum-product on factor trees and sum-product on undirected trees
Relation between the n messages of the sum-product algorithm for undirected trees and the ν messages of the sum-product algorithm for factor trees (here t is the factor corresponding to the edge potential 𝜚(y_j, y_k)):

  ν_tj(y_j) = Σ_{𝒚_{𝒪(t)\{j}}} g_t(𝒚_{𝒪(t)}) ∏_{k∈𝒪(t)\{j}} w_kt(y_k)
            = Σ_{y_k} 𝜚(y_j, y_k) w_kt(y_k)
            = Σ_{y_k} 𝜚(y_j, y_k) ∏_{u∈𝒪(k)\{t}} ν_uk(y_k)
            = Σ_{y_k} 𝜚(y_k) 𝜚(y_j, y_k) ∏_{u∈𝒪′(k)\{t}} ν_uk(y_k)

where 𝒪′(k) = 𝒪(k) − {factor corresponding to 𝜚(y_k)}.

Hence ν_tj(y_j) plays the same role as the message n_kj(y_j) of the undirected-tree algorithm.
Example

References