Accelerating Bayesian Inference on Structured Graphs Using Parallel Gibbs Sampling



SLIDE 1

Accelerating Bayesian Inference on Structured Graphs Using Parallel Gibbs Sampling

Glenn G. Ko gko@seas.harvard.edu Harvard University September 10, 2019

SLIDE 2

Supervised vs. Unsupervised Machine Learning

[https://mapr.com/blog/demystifying-ai-ml-dl]

Figure panels: Supervised, Unsupervised

SLIDE 3

Why Bayesian Machine Learning

[https://github.com/stan-dev/stancon_talks/blob/master/2017/Contributed-Talks/08_trangucci/hierarchical_GPs_in_stan.pdf]

Figure legend: Republican, Democratic

  • Predict a probability distribution, not a point estimate
  • Quantify uncertainty
SLIDE 4

Deep Learning vs. Bayesian ML

Aspect | Deep Learning | Bayesian Inference
Data Type / Size | Needs large labeled data | Scarce or no labeled data
Interpretability | Black-box | Interpretable models
Prior Knowledge | No | Prior + new observations
Scalability | Parallelizable | Limited parallelism
Generalizability | Generalizable | Hand-crafted models
Supervised vs. unsupervised | Good at supervised | Good at unsupervised
... | ... | ...

Combining the two: variational autoencoders, generative adversarial networks, Bayesian neural networks, etc.

SLIDE 5

Bayesian Models and Inference

  • Unsupervised learning
  • Scarce or no labeled data for training
  • Ability to represent and manipulate uncertainty
  • Generative models

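X: hidden parameters
Y: observed data

Bayes’ rule:

$$P(X \mid Y) = \frac{P(Y \mid X)\, P(X)}{P(Y)}$$

where $P(Y \mid X)$ is the likelihood, $P(X)$ is the prior, and $P(Y)$ is the evidence.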
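
As a small worked illustration of Bayes' rule for a discrete hidden parameter X and a single observation Y, the sketch below multiplies likelihood by prior and normalizes by the evidence. The coin example, the function name `posterior`, and all numbers are assumptions made up for illustration; they are not from the talk.

```python
def posterior(prior, likelihood):
    """Return P(X | Y=y) for discrete X, given P(X) and P(Y=y | X)."""
    # Joint probability P(Y=y, X=x) = likelihood * prior for each hypothesis x.
    joint = {x: likelihood[x] * prior[x] for x in prior}
    # Evidence P(Y=y) is the sum of the joint over all hypotheses.
    evidence = sum(joint.values())
    return {x: p / evidence for x, p in joint.items()}


# Hypothetical example: is a coin fair or biased, after observing one head?
prior = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5, "biased": 0.9}  # P(heads | X), assumed values
post = posterior(prior, likelihood)
```

The output is a full distribution over the hidden parameter rather than a single guess, which is the property the earlier "Why Bayesian Machine Learning" slide emphasizes.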

SLIDE 6

Markov Random Fields and Inference

Pixel-labeling problems on MRF:

  • Stereo matching
  • Image restoration
  • Image segmentation
  • Sound source separation

Stereo matching:

  • Pixels = nodes
  • Edges to neighbors
  • Inference for best set of new labels

Likelihood (data cost), prior (smoothness cost)

y: input pixels
x: labels for each pixel
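
Under the usual pairwise MRF assumption (each pixel's data cost depends only on its own label, and the smoothness cost only on pairs of neighboring labels), the posterior over labels can be sketched as follows. The symbols $D_i$, $V$, and $\mathcal{E}$ (the set of neighbor edges) are notation introduced here for illustration and are assumptions, not the talk's own notation.

```latex
P(x \mid y) \;\propto\; P(y \mid x)\, P(x)
\;\propto\;
\underbrace{\prod_{i} \exp\bigl(-D_i(x_i;\, y_i)\bigr)}_{\text{likelihood (data cost)}}
\;
\underbrace{\prod_{(i,j) \in \mathcal{E}} \exp\bigl(-V(x_i, x_j)\bigr)}_{\text{prior (smoothness cost)}}
```

Maximizing this posterior is equivalent to minimizing the total cost $\sum_i D_i(x_i; y_i) + \sum_{(i,j) \in \mathcal{E}} V(x_i, x_j)$.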

SLIDE 7

Unsupervised Learning Tasks on MRF

Figure: image reconstruction, stereo matching, and sound source separation are each modeled as a Markov random field (MRF) and solved with approximate Bayesian inference.

SLIDE 8

Markov Chain Monte Carlo Methods

Approximating pi

[https://wiki.ubc.ca/Course:CPSC522/MCMC]
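
A minimal sketch of approximating pi by random sampling, assuming the classic setup of throwing uniform points into the unit square and counting those that land inside the quarter circle. Note that this version draws independent samples (plain Monte Carlo) rather than running a Markov chain; the function name `estimate_pi` and the sample count are illustrative choices.

```python
import random


def estimate_pi(n_samples, seed=0):
    """Estimate pi from the fraction of uniform points inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of unit square = pi / 4.
    return 4.0 * inside / n_samples


pi_hat = estimate_pi(200_000)
```

The error shrinks roughly as one over the square root of the number of samples, which is why sampling methods benefit so directly from faster hardware.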

A biased random walk that explores the target distribution P
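
One way to picture such a biased random walk is random-walk Metropolis, sketched below for a one-dimensional target. This is an illustrative stand-in, not necessarily the sampler shown on the slide; the Gaussian target, the step size, and the names `metropolis` and `log_target` are assumptions.

```python
import math
import random


def metropolis(log_p, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis: propose a Gaussian step, accept with prob min(1, p'/p)."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_p(proposal) - log_p(x)
        # Moves toward higher probability are always accepted; others only sometimes.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


def log_target(x):
    """Unnormalized log-density of a Gaussian with mean 3 and unit variance (assumed target)."""
    return -0.5 * (x - 3.0) ** 2


samples = metropolis(log_target, x0=0.0, n_steps=50_000)
burned = samples[5_000:]  # discard early samples taken before the walk settles
mean = sum(burned) / len(burned)
```

The acceptance rule is what biases the walk: the chain spends time in each region in proportion to the target probability there, so sample averages approximate expectations under P.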

SLIDE 9

Gibbs Sampling Inference

Sample & update parameter

Gibbs sampling on a Markov random field

Maximum a posteriori (MAP) inference: $\hat{x} = \arg\max_x P(x \mid y)$
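
A minimal sketch of the sample-and-update loop for a binary-label MRF on a 4-connected pixel grid, assuming an Ising-style model with labels in {-1, +1}. The parameters `beta` (smoothness weight) and `eta` (data weight), the function name, and the toy image are all assumptions for illustration; stereo matching would use more than two labels.

```python
import math
import random


def gibbs_denoise(y, beta=1.0, eta=1.5, n_sweeps=20, seed=0):
    """Sequential Gibbs sampling on a binary MRF; y holds observed pixels in {-1, +1}."""
    rng = random.Random(seed)
    h, w = len(y), len(y[0])
    x = [row[:] for row in y]  # initialize labels from the observation
    for _ in range(n_sweeps):
        for i in range(h):
            for j in range(w):
                # Sum of the current labels of the (up to four) neighbors.
                nb = 0
                if i > 0:
                    nb += x[i - 1][j]
                if i < h - 1:
                    nb += x[i + 1][j]
                if j > 0:
                    nb += x[i][j - 1]
                if j < w - 1:
                    nb += x[i][j + 1]
                # Conditional P(x_ij = +1 | neighbors, y_ij) for the Ising-style model.
                field = beta * nb + eta * y[i][j]
                p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))
                # Sample a new label and update it in place.
                x[i][j] = 1 if rng.random() < p_plus else -1
    return x


# Toy input: an all-(+1) image with 16 isolated pixels flipped to -1.
size = 16
noisy = [[-1 if (i % 4 == 2 and j % 4 == 2) else 1 for j in range(size)]
         for i in range(size)]
restored = gibbs_denoise(noisy)
errors_before = sum(v == -1 for row in noisy for v in row)
errors_after = sum(v == -1 for row in restored for v in row)
```

Each pixel update reads only its neighbors and its own observation, which is the locality that the later parallelization slides exploit.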

SLIDE 10

Stereo Matching Using Gibbs Sampling

Figure panels: Input, Ground Truth

SLIDE 11

Parallelizing Gibbs Sampling

Geman & Geman stated, “the MRF can be divided into collections of variables with each collection assigned to an independently running asynchronous processor.”

Three types of parallelism:

  • Naïve: Run multiple parallel chains independently
  • Algorithmic: Graph coloring and blocking: Blocked, Chromatic (Gonzalez), Splash (Gonzalez)
  • Empirical: Asynchronous (Hogwild!) updates of partitioned graphs: Newman et al. (AD-LDA), De Sa et al. (2016 ICML best paper)
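
The naïve form of parallelism can be sketched as several independent Gibbs chains whose estimates are pooled at the end. The example below is an illustrative assumption, not the talk's workload: it samples a two-variable Gaussian with correlation `RHO` and estimates E[xy]. A thread pool is used only to keep the sketch short; in CPython a process pool or separate hardware would be needed for a real speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor

RHO = 0.8  # assumed correlation of the toy bivariate Gaussian target


def run_chain(seed, n_steps=20_000, burn_in=2_000):
    """One independent Gibbs chain; returns its estimate of E[x * y]."""
    rng = random.Random(seed)  # each chain owns its random stream
    sd = (1.0 - RHO * RHO) ** 0.5
    x = y = 0.0
    total = 0.0
    for t in range(n_steps):
        x = rng.gauss(RHO * y, sd)  # sample x from P(x | y)
        y = rng.gauss(RHO * x, sd)  # sample y from P(y | x)
        if t >= burn_in:
            total += x * y
    return total / (n_steps - burn_in)


# Chains share no state, so they can run concurrently without coordination.
with ThreadPoolExecutor(max_workers=4) as pool:
    estimates = list(pool.map(run_chain, range(4)))
pooled = sum(estimates) / len(estimates)
```

This approach scales trivially but does not shorten the burn-in of any single chain, which is one motivation for the algorithmic and asynchronous alternatives listed above.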

SLIDE 12

Chromatic Gibbs Sampling

Conditional Independence via Local Markov Property
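
By the local Markov property, a node is conditionally independent of the rest of the graph given its neighbors, so nodes that share no edge can be sampled at the same time. A minimal sketch for a 4-connected grid, assuming a checkerboard 2-coloring and the same Ising-style binary model as a simple stand-in: all nodes of one color are sampled from a fixed snapshot of the other color, then the colors swap. The function names, `beta`, `eta`, and the toy image are assumptions; the inner loop is written serially here, but every iteration within a color is independent.

```python
import math
import random


def color_of(i, j):
    """Checkerboard 2-coloring: grid neighbors always get different colors."""
    return (i + j) % 2


def neighbors(i, j, h, w):
    """Yield the in-bounds 4-connected neighbors of pixel (i, j)."""
    for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
        if 0 <= a < h and 0 <= b < w:
            yield a, b


def chromatic_sweep(x, y, rng, beta=1.0, eta=1.5):
    """One sweep: sample all color-0 nodes, then all color-1 nodes."""
    h, w = len(x), len(x[0])
    for color in (0, 1):
        new_values = []
        for i in range(h):
            for j in range(w):
                if color_of(i, j) != color:
                    continue
                # Neighbors all have the other color, so they are fixed during this phase.
                field = beta * sum(x[a][b] for a, b in neighbors(i, j, h, w)) + eta * y[i][j]
                p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))
                new_values.append((i, j, 1 if rng.random() < p_plus else -1))
        # Commit the whole color class at once, as a parallel implementation would.
        for i, j, v in new_values:
            x[i][j] = v
    return x


size = 16
y = [[-1 if (i % 4 == 2 and j % 4 == 2) else 1 for j in range(size)] for i in range(size)]
x = [row[:] for row in y]
rng = random.Random(0)
for _ in range(20):
    chromatic_sweep(x, y, rng)
errors_before = sum(v == -1 for row in y for v in row)
errors_after = sum(v == -1 for row in x for v in row)
```

Because same-colored nodes never neighbor each other, the order of updates within a color class does not change the sampling distribution, which is what makes the scheme a natural fit for wide parallel hardware.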

SLIDE 13

Hybrid CPU-FPGA Architecture

Xilinx Zynq UltraScale+ ZCU102-ES2

SLIDE 14

Running Sound Source Separation

Figure panels: Noisy mixture, Separated source

SLIDE 15

Compute Partition

230x speedup over ARM Cortex-A53

SLIDE 16

Speedups

1048x speedup and 99.8% energy reduction vs. ARM Cortex-A53 for binary-label MRF Gibbs sampling

SLIDE 17

Number of Iterations vs. Quality of the Solution

Figure panels: Sound source separation; Stereo matching (tsukuba); Image restoration (house)

SLIDE 18

Future Work

  • Asynchronous Gibbs sampling
  • Accelerating more complex graphs
      • More complex structured graphs
      • Unstructured graphs
      • Challenges
  • Programmable inference architecture
      • Probabilistic programming languages
      • Compilers, IR

[https://bricaud.github.io/HCmails]

Hillary Clinton’s emails

SLIDE 19

THANK YOU

This work is supported by the Semiconductor Research Corporation (SRC) and DARPA.

SLIDE 20

Figure: a damaged image (input pixels) is fed to a Markov random field, which performs pixel labeling as unsupervised learning; the output labels form the reconstructed image.

SLIDE 21

Gibbs Sampler Optimization for Source Separation


Optimizations: Multipliers -> Shifters

MRF size: 513x24
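
The multiplier-to-shifter optimization can be illustrated under one common assumption: coefficients are rounded to the nearest power of two, so a fixed-point multiply becomes a bit shift. The slide does not say how the substitution was done in this design, so the rounding rule and the helper names below are hypothetical.

```python
import math


def nearest_pow2_exponent(weight):
    """Exponent k such that 2**k is the power of two closest to weight in log scale."""
    return round(math.log2(weight))  # weight must be positive


def mul_by_shift(value, exponent):
    """Multiply an integer by 2**exponent using only shifts."""
    if exponent >= 0:
        return value << exponent
    return value >> -exponent  # right shift floors toward negative infinity


k = nearest_pow2_exponent(0.24)   # 0.24 is approximated by 2**-2 = 0.25
scaled = mul_by_shift(104, k)     # 104 * 0.25 computed as 104 >> 2
```

A shifter is far cheaper than a multiplier in FPGA logic, at the price of quantizing each coefficient to a power of two.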