Functional Probabilistic Programming, CUFP 2013, Avi Pfeffer



SLIDE 1

Functional Probabilistic Programming
CUFP 2013

Avi Pfeffer
Charles River Analytics
apfeffer@cra.com

SLIDE 2

Outline

Ÿ What is probabilistic programming? Ÿ History Ÿ Our Figaro language Ÿ Examples

SLIDE 3

Ÿ Suppose you have some information

Ÿ E.g., Brian ate pizza last night

Ÿ You want to answer some questions based on this information

Ÿ Is Brian a student? Ÿ Is Brian a programmer?

Ÿ There is uncertainty in the answers

3

The Problem

SLIDE 4

Ÿ Create a joint probability distribution over the variables

Ÿ P(Pizza, programmer, student) Ÿ Either directly or by learning it from data

Ÿ Assert the evidence

Ÿ Brian ate pizza

Ÿ Use probabilistic inference to get the answer

Ÿ P(student, programmer | pizza)

4

Probabilistic Modeling

SLIDE 5

Ÿ Probabilistic models in which variables are generated in order

Ÿ Later variables can depend on earlier variables

Ÿ Large number of variants, e.g.

Ÿ Bayesian networks Ÿ Hidden Markov models Ÿ Probabilistic context free grammars Ÿ Kalman filters Ÿ Probabilistic relational models

5

Generative Models

SLIDE 6

• Developing a new model requires implementing:
  • Representation
  • Inference algorithm
  • Learning algorithm
• All three are significant challenges
  • Considered paper-worthy

Building Generative Models

Can we make this easier?

SLIDE 7

Ÿ Expressive representation language

Ÿ Capture wide variety of probabilistic models

Ÿ Built-in inference and learning algorithms

Ÿ Automatically apply to models written in the language

7

Probabilistic Programming Systems

SLIDE 8

Ÿ Ordinary functional language: an expression describes a computation that produces a value let student = true in let programmer = student in let pizza = student && programmer in (student, programmer, pizza) Ÿ Functional probabilistic programming language: an expression describes a random computation that produces a value let student = flip(0.7) in let programmer = if (student) flip(0.2) else flip(0.1) in let pizza = if (student && programmer) flip(0.9) else flip(0.3) in (student, programmer, pizza)

8

Functional Probabilistic Programming

SLIDE 9

    let student = flip(0.7) in
    let programmer = if (student) flip(0.2) else flip(0.1) in
    let pizza = if (student && programmer) flip(0.9) else flip(0.3) in
    (student, programmer, pizza)

• Imagine running this program many times
• Each run generates a sample outcome
• In each run, each outcome has some probability of being generated
• The program defines a probability distribution over outcomes

Sampling Semantics
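The sampling semantics can be mirrored directly in plain Scala (a sketch using scala.util.Random, not Figaro): each call to `sample()` is one run of the program above, and rejecting runs that contradict the evidence estimates a conditional probability such as P(student | pizza).

```scala
import scala.util.Random

// A sketch of the sampling semantics in plain Scala (not Figaro):
// each call to sample() is one run of the program on the slide.
object SamplingSemantics {
  val rng = new Random(42) // fixed seed so runs are reproducible
  def flip(p: Double): Boolean = rng.nextDouble() < p

  // One run: generates a sample outcome (student, programmer, pizza)
  def sample(): (Boolean, Boolean, Boolean) = {
    val student    = flip(0.7)
    val programmer = if (student) flip(0.2) else flip(0.1)
    val pizza      = if (student && programmer) flip(0.9) else flip(0.3)
    (student, programmer, pizza)
  }

  // Estimate P(student | pizza) by rejection sampling:
  // run many times, keep only runs where pizza came out true
  def estimate(n: Int): Double = {
    val accepted = Seq.fill(n)(sample()).filter(_._3)
    accepted.count(_._1).toDouble / accepted.size
  }
}
```

Enumerating the four (student, programmer) cases gives P(pizza) = 0.384 and P(student, pizza) = 0.294, so the estimate should settle near 0.294 / 0.384 ≈ 0.77.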

SLIDE 10

Ÿ Turing complete language + probabilistic primitives

Ÿ Naturally express wide range of probabilistic models

Ÿ A number of general purpose algorithms have been developed

Ÿ Structured variable elimination Ÿ Markov chain Monte Carlo Ÿ Importance sampling Ÿ Factor graph compilation

10

Power of Functional Probabilistic Programming

SLIDE 11

Ÿ PPLs aim to “democratize” model building

Ÿ One should not need extensive training in ML or AI to build and code a model

Ÿ This means that a PPL should (broadly) satisfy two main goals:

Ÿ Usability

Ÿ Intuitive to use Ÿ Common design patterns easily expressed Ÿ Integration into other/existing applications Ÿ Extensible language Ÿ Extensible reasoning

Ÿ Power

Ÿ Ability to represent a wide variety of models, data, etc Ÿ Powerful and practical inference techniques

Making Probabilistic Programming Practical

11

SLIDE 12

Ÿ With Daphne Koller and David McAllester, we first formulated the idea of probabilistic programming Ÿ Lisp + flip Ÿ Convoluted inference algorithm

Ÿ Later found to be buggy

12

History | KMP 97

SLIDE 13

Ÿ Representation

Ÿ First practical probabilistic programming language Ÿ OCaml like syntax Ÿ Implemented in Ocaml

Ÿ Inference

Ÿ Exact inference using structured variable elimination Ÿ Later implemented intelligent importance sampling

Ÿ Limitations

Ÿ Hard to integrate with applications and data Ÿ No continuous variables

13

History | IBAL (2000-2007)

SLIDE 14

Ÿ Representation

Ÿ Embedded DSL in Scala Ÿ Allows distributions over any data type Ÿ Highly expressive constraint system also allows it to express non- generative models

Ÿ Inference

Ÿ Extensible library of inference algorithms Ÿ Contains many of the most popular probabilistic inference algorithms, generalized to probabilistic programs

Ÿ E.g., variable elimination, Metropolis-Hastings, particle filtering

Ÿ New version to be released shortly

Ÿ Parameter learning Ÿ Decision making Ÿ Improved algorithms

14

History | Figaro (2009-Present)

SLIDE 15

Goals of the Figaro Language

Ÿ Implement a PPL in a widely-used language

Ÿ Scala is widely-used Ÿ Scala interoperability with Java also gives Figaro access to an even larger library

Ÿ Provide a language to describe models with interacting components Ÿ Object-oriented Ÿ Provide a means to expressed directed and undirected models with general constraints Ÿ Functional Ÿ Extensibility and reuse of inference algorithms Ÿ Object-oriented, traits Ÿ Using Scala helps achieve all of these goals!

15

SLIDE 16

Ÿ Element[T] is class of probabilistic models over type T Ÿ Atomic elements Constant[T], Flip, Uniform, Geometric Ÿ Compound elements built out of other elements If(Flip(0.8), Constant(0.5), Uniform(0,1))

16

Basic Figaro Concepts

SLIDE 17

Ÿ Constant[T] is the monadic unit Ÿ Chain[T,U] implements monadic bind

Ÿ Use an Element[T] to generate T Ÿ Apply a function to the T to generate an Element[U] Ÿ Generate a U from the Element[U]

Chain(Uniform(0,1), (d: Double) => Normal(d, 0.5)) Ÿ Apply[T,U] implements monadic fmap Apply(Uniform(0,1), (d: Double) => d * 2) Ÿ Most Figaro compound elements implemented using monad

Ÿ E.g., If

17

The Probability Monad
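The monad can be sketched without Figaro as exact enumeration over weighted outcomes; here `constant` plays the role of Constant (unit), `flatMap` of Chain (bind), and `map` of Apply (fmap). The names are illustrative, not Figaro's API.

```scala
// A minimal discrete probability monad over exact weighted outcomes
case class Dist[T](outcomes: List[(T, Double)]) {
  // monadic fmap, the role Apply plays in Figaro
  def map[U](f: T => U): Dist[U] =
    Dist(outcomes.map { case (t, p) => (f(t), p) })
  // monadic bind, the role Chain plays in Figaro
  def flatMap[U](f: T => Dist[U]): Dist[U] =
    Dist(for ((t, p) <- outcomes; (u, q) <- f(t).outcomes) yield (u, p * q))
  // total probability of outcomes satisfying a predicate
  def prob(pred: T => Boolean): Double =
    outcomes.collect { case (t, p) if pred(t) => p }.sum
}

object Dist {
  def constant[T](t: T): Dist[T] = Dist(List((t, 1.0))) // monadic unit
  def flip(p: Double): Dist[Boolean] = Dist(List((true, p), (false, 1.0 - p)))
}

// The student/programmer model from the earlier slides, written as a
// for-comprehension over the monad
val model: Dist[(Boolean, Boolean)] = for {
  student    <- Dist.flip(0.7)
  programmer <- if (student) Dist.flip(0.2) else Dist.flip(0.1)
} yield (student, programmer)
```

Here `model.prob(_._2)` evaluates P(programmer) = 0.7 * 0.2 + 0.3 * 0.1 = 0.17 exactly, with no sampling.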

SLIDE 18

Ÿ Any Element[T] can have conditions and constraints Ÿ Condition: function from T T to Boolean

Ÿ Specifies a property that must be satisfied for a value to have positive probability

Ÿ Constraint: function from T to Double

Ÿ Weights probability of value

Ÿ Two purposes

Ÿ Asserting evidence Ÿ Specifying new kinds of models including undirected models

18

Conditions and Constraints
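How the two mechanisms act on a distribution can be sketched in plain Scala (not Figaro's API): a condition zeroes out disallowed values, a constraint multiplies in a weight, and both renormalize.

```scala
// Sketch (plain Scala, not Figaro's API) of how conditions and
// constraints reweight a discrete distribution over values of T
object Constraints {
  type Weighted[T] = List[(T, Double)]

  // Condition: only values satisfying the predicate keep positive probability
  def condition[T](d: Weighted[T], pred: T => Boolean): Weighted[T] =
    normalize(d.map { case (t, p) => (t, if (pred(t)) p else 0.0) })

  // Constraint: each value's probability is multiplied by a nonnegative weight
  def constrain[T](d: Weighted[T], w: T => Double): Weighted[T] =
    normalize(d.map { case (t, p) => (t, p * w(t)) })

  private def normalize[T](d: Weighted[T]): Weighted[T] = {
    val z = d.map(_._2).sum
    d.map { case (t, p) => (t, p / z) }
  }
}

// A fair coin under the weight "3.0 if true, else 1.0":
// P(true) becomes (0.5 * 3) / (0.5 * 3 + 0.5 * 1) = 0.75
val coin = List((true, 0.5), (false, 0.5))
val weighted = Constraints.constrain(coin, (b: Boolean) => if (b) 3.0 else 1.0)
```

The same `constrain` shape is what the friends-and-smokers example later uses to encode an undirected compatibility between two people.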

SLIDE 19

Example 1: Probabilistic Processes on Graphs

• Google's PageRank is a model of a probabilistic process on a graph
  • Directed edge from page A to page B if A links to B
• Consider a random walk starting at any point in the graph
• What is the probability a node will be reached in n steps?

SLIDE 20

Ÿ Start by defining some data structures for a webpage graph

20

Random Walk in Figaro

class Edge(from: Int, to: Int) class Node(ID: int, edges: Set[Edge]) class Graph(nodes: Set[Nodes]) { def get(id: Int) = // return Node with ID == id } // function that randomly builds a graph given some params def graphGenProcess(params*): Element[Graph]

Ÿ Define some parameters of the random walk

val numSteps: Element[Int] = Constant(10) val inputGraph: Element[Graph] = graphGenProcess(…) val startNode: Element[Int] = Uniform(inputGraph.nodes)

SLIDE 21

Random Walk in Figaro

    // randomly move forward from a node, uniformly over outgoing edges
    def step(last: Int, g: Graph): Element[Int] =
      Uniform(g.get(last).edges.toSeq.map(e => e.to): _*)

    def rFcn(g: Graph, remain: Int, n: Int): Element[List[Int]] = {
      if (remain == 1) Apply(step(n, g), (i: Int) => List(i))
      else {
        val prev = rFcn(g, remain - 1, n)
        val curr = Chain(prev, (l: List[Int]) => step(l.head, g))
        Apply(curr, prev, (i: Int, l: List[Int]) => i :: l)
      }
    }

    val rWalk = Chain(inputGraph, numSteps, startNode, rFcn)
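Stripped of the Element machinery, the walk itself is a short simulation. A plain-Scala sketch on a small hypothetical 4-node graph (the graph and all names here are illustrative, not from the slides):

```scala
import scala.util.Random

// Direct simulation of the random walk in plain Scala. The 4-node
// graph below is a hypothetical example, not from the slides.
object RandomWalk {
  val rng = new Random(7)

  // adjacency map: node -> nodes it links to
  val graph: Map[Int, List[Int]] =
    Map(0 -> List(1, 2), 1 -> List(2), 2 -> List(0, 3), 3 -> List(0))

  // randomly move forward from a node, uniformly over outgoing edges
  def step(node: Int): Int = {
    val out = graph(node)
    out(rng.nextInt(out.size))
  }

  // Walk `steps` steps from `start`, returning every node visited
  def walk(start: Int, steps: Int): List[Int] =
    (1 to steps).scanLeft(start)((n, _) => step(n)).toList

  // Estimate the probability that `target` is reached within `steps` steps
  def reachProb(start: Int, target: Int, steps: Int, trials: Int): Double =
    Seq.fill(trials)(walk(start, steps)).count(_.contains(target)).toDouble / trials
}
```

What Figaro adds on top of this simulation is that the graph, the number of steps, and the start node can themselves be random Elements, with inference over the whole process.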

SLIDE 22

Ÿ People smoke with probability 0.6 Ÿ Friends are 3 times as likely to have the same smoking habit than different Ÿ Alice is friends with Bob, Bob is friends with Clara Ÿ Alice smokes Ÿ What is the probability that Clara smokes? Want a general solution that works for any friends network

Example 2: Network Analysis

SLIDE 23

Friends and Smokers | General Solution

    // A person smokes with probability 0.6
    class Person {
      val smokes = Flip(0.6)
    }

    // Friends are three times as likely to have the same
    // smoking habit than different
    def constraint(pair: (Boolean, Boolean)) =
      if (pair._1 == pair._2) 3.0 else 1.0

    // Apply the constraints to all pairs of friends
    def applyConstraints(friends: List[(Person, Person)]) {
      for { (p1, p2) <- friends } {
        (p1.smokes ^^ p2.smokes).addConstraint(constraint)
      }
    }

SLIDE 24

    // Setting up the situation
    val alice, bob, clara = new Person
    val friends = List((alice, bob), (bob, clara))
    applyConstraints(friends)
    alice.smokes.condition(true)

    // Running inference and querying
    val algorithm = VariableElimination(clara.smokes)
    algorithm.start()
    println(algorithm.probability(clara.smokes, true))

Friends and Smokers | Specific Situation
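The numbers on these two slides fully determine the query, so the result can be checked by brute-force enumeration in plain Scala (a sketch, not Figaro's VariableElimination):

```scala
// Exact inference for the friends-and-smokers example by brute-force
// enumeration in plain Scala (a sketch, not Figaro's VariableElimination)
object FriendsAndSmokers {
  def prior(s: Boolean): Double = if (s) 0.6 else 0.4
  // friends are 3 times as likely to share a smoking habit
  def pairFactor(a: Boolean, b: Boolean): Double = if (a == b) 3.0 else 1.0

  // P(Clara smokes | Alice smokes), with friendships Alice-Bob and Bob-Clara
  val posteriorClara: Double = {
    val alice = true // observed evidence
    val worlds = for (bob <- List(true, false); clara <- List(true, false))
      yield (clara,
        prior(alice) * prior(bob) * prior(clara) *
          pairFactor(alice, bob) * pairFactor(bob, clara))
    val z = worlds.map(_._2).sum
    worlds.collect { case (true, w) => w }.sum / z
  }
}
```

Under this encoding the four (Bob, Clara) worlds have weights 1.944, 0.432, 0.144, and 0.288, giving P(Clara smokes | Alice smokes) = 2.088 / 2.808 ≈ 0.744.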

SLIDE 25

Ÿ We observe an object (e.g. a vehicle on a road) Ÿ We want to know what type of object it is Ÿ We have some observations about it Ÿ Inheritance hierarchies are a natural fit

25

Example 3: Hierarchical Reasoning

SLIDE 26

Ÿ Every element

Ÿ Has a name Ÿ Belongs to an element collection

Ÿ These are implicit arguments

Ÿ A reference is a sequence of names

Ÿ e.g., vehicle1.size

Ÿ Starting with an element collection, you can get to the element associated with a reference

Ÿ Go through sequence of nested element collections

Ÿ There may be uncertainty in the identity of a reference

Ÿ E.g., you don’t know what vehicle1 is Ÿ Figaro always resolves the reference to the actual element in any given world

26

Referring to Elements

SLIDE 27

    abstract class Vehicle extends ElementCollection {
      val size: Element[Symbol]
      val speed: Element[Int]
    }

    class Truck extends Vehicle {
      val size = Select(0.25 -> 'medium, 0.75 -> 'big)("size", this)
      val speed = Uniform(50, 60, 70)("speed", this)
    }

    class Pickup extends Truck {
      override val speed = Uniform(70, 80)("speed", this)
      override val size = Constant('medium)("size", this)
    }

    class TwentyWheeler extends Truck …
    class Car extends Vehicle …

Defining the Class Hierarchy and Properties

SLIDE 28
    object Vehicle {
      def generate(name: String): Element[Vehicle] =
        Dist(0.6 -> Car.generate, 0.4 -> Truck.generate)(name, universe)
    }

    object Truck {
      def generate: Element[Vehicle] =
        Dist(0.1 -> TwentyWheeler.generate,
             0.3 -> Pickup.generate,
             0.6 -> Constant[Vehicle](new Truck))
    }

    object Pickup { def generate … }
    object TwentyWheeler { def generate … }
    object Car { def generate … }

Defining a Distribution Over Objects

SLIDE 29

    val myVehicle = Vehicle.generate("v1")
    universe.assertEvidence(List(NamedEvidence("v1.size", Observation('medium))))

Instantiating and Observing Evidence

SLIDE 30

    // Element representing the class name of the vehicle, e.g. Truck
    val className = shortClassName(myVehicle)
    val isPickup = Apply(myVehicle, (v: Vehicle) => v.isInstanceOf[Pickup])

    val alg = VariableElimination(isPickup, className)
    alg.start()
    println(alg.probability(isPickup, true))

    // Print a list of class names with their probabilities
    println(alg.distribution(className).toList)

Querying The Model
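The same posterior can be checked by enumeration. The class priors and the Truck and Pickup size distributions come from the slides above; the Car and TwentyWheeler size distributions were elided there, so the rows marked as assumed below are hypothetical placeholders and the resulting numbers only illustrate the mechanics.

```scala
// Sketch of the vehicle query by enumeration in plain Scala.
object VehicleQuery {
  // P(class): Dist(0.6 -> Car, 0.4 -> Truck), with Truck refined into
  // Dist(0.1 -> TwentyWheeler, 0.3 -> Pickup, 0.6 -> plain Truck)
  val classPrior: Map[String, Double] = Map(
    "Car" -> 0.6,
    "TwentyWheeler" -> 0.4 * 0.1,
    "Pickup" -> 0.4 * 0.3,
    "Truck" -> 0.4 * 0.6)

  // P(size | class); the Car and TwentyWheeler rows are assumptions,
  // since the slides elide those definitions
  val sizeGiven: Map[String, Map[String, Double]] = Map(
    "Truck"         -> Map("medium" -> 0.25, "big" -> 0.75),
    "Pickup"        -> Map("medium" -> 1.0),
    "TwentyWheeler" -> Map("big" -> 1.0),                    // assumed
    "Car"           -> Map("small" -> 0.5, "medium" -> 0.5)) // assumed

  // Posterior over the class name given an observed size, by Bayes' rule
  def posterior(size: String): Map[String, Double] = {
    val joint = classPrior.map { case (c, p) =>
      c -> p * sizeGiven(c).getOrElse(size, 0.0)
    }
    val z = joint.values.sum
    joint.map { case (c, p) => c -> p / z }
  }
}
```

Under these assumed size rows, observing size = medium gives a posterior of 0.625 for Car, 0.25 for Pickup, and 0.125 for plain Truck.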

SLIDE 31

Ÿ Free and open-source, available now at www.cra.com/figaro

Ÿ Tutorial available in release

Ÿ Version 2.0 release imminent

Ÿ Development will move to GitHub as of release https://github.com/p2t2

Ÿ Contact me apfeffer@cra.com or figaro@cra.com

31

Obtaining Figaro