Message Passing in the Presence of Erasures Nicholas Ruozzi - PowerPoint PPT Presentation
Message Passing in the Presence of Erasures Nicholas Ruozzi
Motivation • Real networks are dynamic and constrained • Messages are lost • Nodes join and leave • Nodes may be power constrained • Empirical studies suggest that belief propagation and its relatives continue to perform well over real networks • [Anker, Dolev, and Hod, 2008] • [Anker, Bickson, Dolev, and Hod, 2008] • Few theoretical guarantees
Convergent Message Passing • New classes of reweighted message passing algorithms guarantee convergence and a notion of correctness • e.g., MPLP, tree-reweighted max-product, norm-product, etc. • Need special updating schedules or central control • No guarantees if messages are lost or updated in the wrong order
Factorizations • A function f factorizes with respect to a graph G = (V, E) if
f(x_1, \dots, x_n) = \prod_{i \in V} \phi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j)
• Goal is to maximize the function f • Max-product attempts to solve this problem by passing messages over the graph G
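To make the factorization concrete, here is a minimal sketch in Python on a hypothetical 3-node chain with binary variables; the potentials `phi` and `psi` are illustrative toy values, not from the talk, and brute-force enumeration stands in for what max-product approximates.

```python
import itertools

# Hypothetical toy instance: a 3-node chain with binary variables.
# phi[i][x_i] are node potentials, psi[(i, j)][x_i][x_j] are edge potentials.
phi = {0: [1.0, 2.0], 1: [0.5, 1.5], 2: [2.0, 1.0]}
psi = {(0, 1): [[2.0, 1.0], [1.0, 2.0]],
       (1, 2): [[1.0, 3.0], [3.0, 1.0]]}

def f(x):
    """Evaluate the factorized objective at assignment x (a tuple)."""
    val = 1.0
    for i, p in phi.items():
        val *= p[x[i]]
    for (i, j), p in psi.items():
        val *= p[x[i]][x[j]]
    return val

# Brute-force maximization over all assignments (what max-product approximates
# by message passing, without enumerating the exponentially many assignments).
best = max(itertools.product([0, 1], repeat=3), key=f)
print(best, f(best))  # -> (1, 1, 0) 36.0
```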
Reweighted Message Passing • Update rule:
m^t_{ij}(x_j) := \max_{x_i} \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \phi_i(x_i) \, \frac{\prod_{k \in N(i)} m^{t-1}_{ki}(x_i)^{c_{ki}}}{m^{t-1}_{ji}(x_i)}
• Messages passed from a node only depend on the messages received by that node at the previous time step • Generalization of max-product
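The update rule above can be sketched as follows on a hypothetical 2-node binary model; the potentials and the reweighting parameters `c` are illustrative assumptions, and the loop performs synchronous sweeps in which every new message depends only on the previous round, as stated on the slide.

```python
# Minimal sketch of the reweighted max-product update; phi, psi, and the
# reweighting parameters c are illustrative toy values.
phi = {0: [1.0, 2.0], 1: [0.5, 1.5]}
psi = {(0, 1): [[2.0, 1.0], [1.0, 2.0]]}
c = {(0, 1): 0.5, (1, 0): 0.5}  # c_ij = c_ji on an undirected edge
neighbors = {0: [1], 1: [0]}

def psi_val(i, j, xi, xj):
    # Edge potentials are stored once per undirected edge.
    return psi[(i, j)][xi][xj] if (i, j) in psi else psi[(j, i)][xj][xi]

def update(m_prev, i, j):
    # m^t_ij(x_j) = max_{x_i} psi_ij(x_i, x_j)^(1/c_ij) * phi_i(x_i)
    #               * prod_{k in N(i)} m^{t-1}_ki(x_i)^c_ki / m^{t-1}_ji(x_i)
    out = []
    for xj in range(2):
        best = 0.0
        for xi in range(2):
            v = psi_val(i, j, xi, xj) ** (1.0 / c[(i, j)]) * phi[i][xi]
            for k in neighbors[i]:
                v *= m_prev[(k, i)][xi] ** c[(k, i)]
            v /= m_prev[(j, i)][xi]
            best = max(best, v)
        out.append(best)
    return out

# Synchronous sweeps: each round is computed entirely from the previous one.
m = {(0, 1): [1.0, 1.0], (1, 0): [1.0, 1.0]}
for _ in range(10):
    m = {(i, j): update(m, i, j) for (i, j) in m}
```

On this tiny example the messages settle to a fixed point within a few sweeps; in general, convergence of the synchronous schedule is exactly what the later slides analyze.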
Reweighted Message Passing • Beliefs:
b^t_i(x_i) = \phi_i(x_i) \prod_{k \in N(i)} m^t_{ki}(x_i)^{c_{ki}}
b^t_{ij}(x_i, x_j) = \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \frac{b^t_i(x_i) \, b^t_j(x_j)}{m^t_{ji}(x_i) \, m^t_{ij}(x_j)}
• These "beliefs" provide an alternative factorization of the objective function:
f(x) = \prod_{i \in V} b_i(x_i)^{1 - \sum_{k \in \partial i} c_{ik}} \prod_{(i,j) \in E} b_{ij}(x_i, x_j)^{c_{ij}}
Reweighted Message Passing • Beliefs:
b^t_i(x_i) = \phi_i(x_i) \prod_{k \in N(i)} m^t_{ki}(x_i)^{c_{ki}}
b^t_{ij}(x_i, x_j) = \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \frac{b^t_i(x_i) \, b^t_j(x_j)}{m^t_{ji}(x_i) \, m^t_{ij}(x_j)}
• Certain choices of the reweighting parameters produce natural convex upper bounds on the objective function:
\max_x f(x) \le \prod_{i \in V} \max_{x_i} b_i(x_i)^{1 - \sum_{k \in \partial i} c_{ik}} \cdot \prod_{(i,j) \in E} \max_{x_i, x_j} b_{ij}(x_i, x_j)^{c_{ij}}
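The belief factorization and the resulting upper bound can be checked numerically. The sketch below uses a hypothetical 3-node chain with illustrative potentials, c_ij = 1/2, and arbitrary positive messages; the reweighted product of beliefs reproduces f(x) at every assignment, and maximizing each belief term separately bounds max f from above.

```python
import itertools

# Illustrative toy model (all values are assumptions, not from the talk).
phi = {0: [1.0, 2.0], 1: [0.5, 1.5], 2: [2.0, 1.0]}
psi = {(0, 1): [[2.0, 1.0], [1.0, 2.0]], (1, 2): [[1.0, 3.0], [3.0, 1.0]]}
c = {(0, 1): 0.5, (1, 2): 0.5}
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# Arbitrary positive messages: the factorization identity holds for any choice.
m = {(0, 1): [1.0, 3.0], (1, 0): [2.0, 1.0],
     (1, 2): [1.0, 1.0], (2, 1): [0.5, 4.0]}

def c_edge(i, j):
    return c[(i, j)] if (i, j) in c else c[(j, i)]

def b_node(i, xi):
    v = phi[i][xi]
    for k in neighbors[i]:
        v *= m[(k, i)][xi] ** c_edge(k, i)
    return v

def b_edge(i, j, xi, xj):
    return (psi[(i, j)][xi][xj] ** (1.0 / c[(i, j)])
            * b_node(i, xi) * b_node(j, xj)
            / (m[(j, i)][xi] * m[(i, j)][xj]))

def f(x):
    v = 1.0
    for i in phi:
        v *= phi[i][x[i]]
    for (i, j) in psi:
        v *= psi[(i, j)][x[i]][x[j]]
    return v

def reweighted_product(x):
    # prod_i b_i^(1 - sum_k c_ik) * prod_ij b_ij^c_ij; should equal f(x).
    v = 1.0
    for i in phi:
        v *= b_node(i, x[i]) ** (1.0 - sum(c_edge(i, k) for k in neighbors[i]))
    for (i, j) in psi:
        v *= b_edge(i, j, x[i], x[j]) ** c[(i, j)]
    return v

# Upper bound: maximize each (nonnegatively exponentiated) term independently.
bound = 1.0
for i in phi:
    exp = 1.0 - sum(c_edge(i, k) for k in neighbors[i])
    bound *= max(b_node(i, xi) for xi in (0, 1)) ** exp
for (i, j) in psi:
    bound *= max(b_edge(i, j, xi, xj)
                 for xi in (0, 1) for xj in (0, 1)) ** c[(i, j)]

max_f = max(f(x) for x in itertools.product((0, 1), repeat=3))
```

Note that the bound argument needs each node exponent 1 - Σ_k c_ik to be nonnegative, which the choice c_ij = 1/2 satisfies here.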
Reweighted Message Passing • Update rule:
m^t_{ij}(x_j) := \max_{x_i} \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \phi_i(x_i) \, \frac{\prod_{k \in N(i)} m^{t-1}_{ki}(x_i)^{c_{ki}}}{m^{t-1}_{ji}(x_i)}
• If each c_{ij} < 1 / (max degree), then there is a simple, "asynchronous" coordinate descent scheme
Reweighted Message Passing • Convergence is guaranteed by performing coordinate descent on a convex upper bound • Can we extend our convergence guarantees to networks in which messages can be lost? • Delivered too slowly • Adversarially lost • Intentionally not sent • Lost independently with some fixed probability
Results • For pairwise MRFs: • Can modify the graph locally in order to guarantee convergence when there are message erasures • Yields a completely local message passing algorithm as a side effect • If no messages are lost, the convergence of the asynchronous algorithm implies convergence of the synchronous one
Extending Convergence • With a linear amount of additional state at each node of the network we can, again, guarantee convergence with erasures • Construct a new graphical model such that message passing on the new model can be simulated over the network • Update messages “internal” to each node in such a way as to guarantee convergence
Extending Convergence • Construct a new graphical model from the network: • Create a copy of node i for each one of i’s neighbors • Attach each copy to exactly one copy of each neighbor • Enforce equality among the copies of each node with equality constraints • Messages can only be lost between copies of different nodes (all other messages are internal to a node of the network)
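The construction above can be sketched from an adjacency list. In this sketch (all names are illustrative), copy (i, j) of node i handles the single network edge to neighbor j; cross-node edges carry messages over the real network and may be erased, while equality edges chain a node's copies together and are internal to the physical node, so those messages are never lost.

```python
# Hypothetical 4-node network matching the slide's figures.
adjacency = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}

# One copy of node i per neighbor of i.
copies = [(i, j) for i in adjacency for j in adjacency[i]]

# Cross-node edges: copy (i, j) is attached to exactly one copy of each
# neighbor, namely copy (j, i). These messages cross the real network.
cross_edges = {frozenset(((i, j), (j, i))) for (i, j) in copies}

# Equality edges: chain the copies of each node together; these enforce
# that all copies of node i agree, and never leave the physical node.
equality_edges = set()
for i, nbrs in adjacency.items():
    for a, b in zip(nbrs, nbrs[1:]):
        equality_edges.add(frozenset(((i, a), (i, b))))

print(len(copies), len(cross_edges), len(equality_edges))  # -> 8 4 4
```

Chaining the copies is one of several ways to enforce equality; the "Other Extensions" slide mentions alternatives such as a complete graph or a single cycle over the copies.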
Extending Convergence
[Figure: the original network on nodes 1–4 (left) and the new graphical model built from it (right); each network node, drawn as a dashed circle, contains one copy per neighbor, with "=" equality constraints joining the copies of the same node]
Extending Convergence
[Figure: as above, with the node potentials split evenly among the copies, e.g. each of the three copies of node 1 receives \phi_1(x_1)^{1/3}, so the product over the copies recovers \phi_1(x_1)]
Extending Convergence • Convergence on the new network follows from the convergence of the asynchronous message passing algorithm • Works even in the presence of erasures • Requires no global knowledge of the network • Can convert any network into an equivalent 3-regular network
Other Extensions • Many different updating strategies can be used to guarantee convergence: • Solve the “internal” problem exactly • Complete graph versus single cycle • Don’t divide the potentials evenly • Other graph modifications?
Performance • The additional overhead may result in slower rates of convergence • In practice, there exist sequences of erasures for which either algorithm outperforms the other • However, the reweighted max-product algorithm always seems to converge in practice for appropriate choices of the parameters
Networks Without Erasures • Synchronous algorithm is an asynchronous algorithm on the bipartite 2-cover of the network
[Figure: the original network on nodes 1–4 (right) and its bipartite 2-cover on nodes 1, 1', 2, 2', 3, 3', 4, 4' (left)]
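The 2-cover behind this observation is easy to construct: duplicate each node (i and i') and connect i to j' and i' to j for every original edge (i, j). A synchronous round on the original graph then corresponds to alternately updating the two sides of the cover, i.e., an asynchronous schedule. A small sketch (node and edge names are illustrative):

```python
def bipartite_2cover(edges):
    """Return the edge list of the bipartite 2-cover; (v, 0) and (v, 1)
    are the two copies of original node v (i.e., v and v')."""
    cover = []
    for (i, j) in edges:
        cover.append(((i, 0), (j, 1)))  # i -- j'
        cover.append(((i, 1), (j, 0)))  # i' -- j
    return cover

# Hypothetical 4-node network matching the slide's figure.
edges = [(1, 2), (1, 3), (2, 4), (3, 4)]
cover = bipartite_2cover(edges)

# Every cover edge joins the two sides of the bipartition, and each
# original edge lifts to exactly two cover edges.
print(len(cover))  # -> 8
```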
Conclusions • Understanding the convergence behavior of BP-like algorithms on a network with errors is a challenging problem • Can engineer around the problem to achieve a purely local algorithm • May incur a performance penalty • What is the exact relationship between these algorithms? • Empirically, the reweighted algorithm on the original network appears to always converge • Prove it?