Reasoning with Graphical Models Class 1 Rina Dechter Darwiche - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Reasoning with Graphical Models Class 1 Rina Dechter

class1 compsci2020

Readings: Darwiche, chapters 1 and 3; Dechter (Morgan & Claypool book), chapters 1‐2; Pearl, chapters 1‐2

slide-2
SLIDE 2

Congressional Briefing: AI at UCI

Congressional Briefing, December 2019 2

  • Rina Dechter
slide-3
SLIDE 3


The Primary AI Challenges

  • Machine learning focuses on replicating how humans learn.
  • Automated reasoning focuses on replicating how people reason.

[Figure: a neural network (left) and a graphical model (right)]

slide-4
SLIDE 4


Automated Reasoning

Lawyer Policy Maker Medical Doctor

Queries:

  • Prediction: what will happen?
  • Diagnosis: what happened?
  • Situation assessment: What is going on?
  • Planning, decision making: what to do?
slide-5
SLIDE 5

Automated Reasoning

Queries:

  • Prediction
  • Diagnosis
  • Situation assessment
  • Planning, decision making

answers

Knowledge is huge, so how do we identify what is relevant?

Graphical Models


slide-6
SLIDE 6

Graphical Models

Example: diagnosing liver disease (Onisko et al., 1999)


Automated Reasoning:

  • Develop methods to answer these questions.
  • Learning the models: from experts and data.


Queries:

  • Prediction
  • Diagnosis
  • Situation assessment
  • Planning, decision making
slide-7
SLIDE 7


slide-8
SLIDE 8

Global Seismic Monitoring Compliance for the Comprehensive Nuclear‐Test‐Ban Treaty (CTBT) (Arora, Russell, Sudderth, 2011)


278 monitoring stations (147 seismic)

CTBT: A Graphical Model Application

The IDC (International Data Centre) operates continuously and in real time, performing signal processing

slide-9
SLIDE 9


Given: continuous waveform measurements from a global network of seismometer stations

Global Seismic Monitoring for the Comprehensive Nuclear‐Test‐Ban Treaty (Arora, Russell, Sudderth, 2011)

slide-10
SLIDE 10

Input: observed detections

Output: a bulletin listing seismic events, with

  • Time
  • Location (latitude, longitude)
  • Depth
  • Magnitude

Global Seismic Monitoring for the Comprehensive Nuclear‐Test‐Ban Treaty (Arora, Russell, Sudderth, 2011)

Reasoning methods infer the most likely set of seismic events given the observed detections. Result: 60% reduction in error compared with human experts.

slide-11
SLIDE 11

Complexity of Automated Reasoning

  • Prediction
  • Diagnosis
  • Planning and scheduling
  • Probabilistic Inference
  • Explanation
  • Decision‐making

[Plot: f(n) versus n for n = 1 to 10, comparing linear, polynomial, and exponential growth]

Reasoning is computationally hard

Complexity is exponential


Approximation, anytime

Bounded error


slide-12
SLIDE 12

AI Renaissance

  • Deep learning

– Fast predictions – “Instinctive” – Tools: TensorFlow, PyTorch, …

  • Probabilistic models

– Slow reasoning – “Logical / deliberative” – Tools: graphical models, probabilistic programming, Markov Logic, …

slide-13
SLIDE 13

Text Books, Outline, Requirements


Class page

slide-14
SLIDE 14

Outline of classes

  • Part 1: Introduction and Inference
  • Part 2: Search
  • Part 3: Variational Methods and Monte‐Carlo Sampling

[Figures: a graph over variables A, B, C, D, E, F, G, H, J, K, L, M with its tree decomposition (clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM); a context minimal AND/OR search graph over variables A, B, C, D, E, F]

slide-15
SLIDE 15

Probabilistic Graphical models

  • Describe structure in large problems

– Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence


slide-16
SLIDE 16

Probabilistic Graphical models

  • Describe structure in large problems

– Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence

  • Examples & Tasks

– Maximization (MAP): compute the most probable configuration

  • Protein structure prediction: predicting the 3D structure from given sequences
  • PDB: protein design (backbone) algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC).

[Yanover & Weiss 2002] [Bruce R. Donald et al. 2016]

slide-17
SLIDE 17

Probabilistic Graphical models

  • Describe structure in large problems

– Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence

  • Examples & Tasks

– Summation & marginalization: marginals p(x_i | y) and the “partition function”

Image segmentation and classification, e.g., [Plath et al. 2009]

[Figure: observation y and marginals p(x_i | y), with region labels grass, plane, sky, cow]

slide-18
SLIDE 18

Graphical models

  • Describe structure in large problems

– Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence

  • Examples & Tasks

– Mixed inference (marginal MAP, MEU, …)

Influence diagrams & optimal decision‐making (the “oil wildcatter” problem), e.g., [Raiffa 1968; Shachter 1986]

[Figure: influence diagram with nodes Test, Drill, Oil sale policy, Test result, Seismic structure, Oil underground, Oil produced, Test cost, Drill cost, Sales cost, Oil sales, Market information]

slide-19
SLIDE 19

In more detail…

slide-20
SLIDE 20

Bayesian Networks (Pearl 1988)

An early example from medical diagnosis

Variables: Smoking (S), lung Cancer (C), Bronchitis (B), X-ray (X), Dyspnoea (D)

Local functions: P(S), P(C|S), P(B|S), P(X|C,S), P(D|C,B)

P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)

BN = (G, Θ)

CPD: P(D|C,B)

C  B  D=0  D=1
0  0  0.1  0.9
0  1  0.7  0.3
1  0  0.8  0.2
1  1  0.9  0.1

  • Posterior marginals, probability of evidence, MPE
  • $P(D=0) = \sum_{S,C,B,X} P(S) \cdot P(C|S) \cdot P(B|S) \cdot P(X|C,S) \cdot P(D=0|C,B)$
  • $\mathrm{MAP}(P) = \max_{S,C,B,X} P(S) \cdot P(C|S) \cdot P(B|S) \cdot P(X|C,S) \cdot P(D|C,B)$

Combination: product

Marginalization: sum/max
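A minimal sketch of the two marginalization operators (sum and max) applied to the product of the local functions, by brute-force enumeration. Only the P(D|C,B) table is taken from the slide; the other four tables are made-up placeholders so that the code runs, and the helper names (`joint`, `prob_d`, `mpe_given_d`) are hypothetical.

```python
from itertools import product

# P(D | C, B) is copied from the slide's CPD. Every other table below is a
# hypothetical placeholder, chosen only so the example is runnable.
P_S = {0: 0.7, 1: 0.3}                                        # hypothetical
P_C_GIVEN_S = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.9, 1: 0.1}}    # hypothetical, [s][c]
P_B_GIVEN_S = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.5, 1: 0.5}}      # hypothetical, [s][b]
P_X_GIVEN_CS = {                                              # hypothetical, [(c, s)][x]
    (0, 0): {0: 0.95, 1: 0.05},
    (0, 1): {0: 0.9, 1: 0.1},
    (1, 0): {0: 0.2, 1: 0.8},
    (1, 1): {0: 0.1, 1: 0.9},
}
P_D_GIVEN_CB = {                                              # from the slide, [(c, b)][d]
    (0, 0): {0: 0.1, 1: 0.9},
    (0, 1): {0: 0.7, 1: 0.3},
    (1, 0): {0: 0.8, 1: 0.2},
    (1, 1): {0: 0.9, 1: 0.1},
}


def joint(s, c, b, x, d):
    """Combination: the joint is the product of the five local functions."""
    return (P_S[s] * P_C_GIVEN_S[s][c] * P_B_GIVEN_S[s][b]
            * P_X_GIVEN_CS[(c, s)][x] * P_D_GIVEN_CB[(c, b)][d])


def prob_d(d):
    """Marginalization by summation over S, C, B, X."""
    return sum(joint(s, c, b, x, d)
               for s, c, b, x in product((0, 1), repeat=4))


def mpe_given_d(d):
    """Marginalization by maximization over S, C, B, X with D fixed."""
    return max(product((0, 1), repeat=4), key=lambda a: joint(*a, d))


p_d0 = prob_d(0)
p_d1 = prob_d(1)
best = mpe_given_d(0)  # (s, c, b, x) with the highest joint probability
print(p_d0, p_d1, best)
```

Enumeration touches all 2^4 assignments, which is fine for five binary variables but is exactly the exponential cost that structure-exploiting algorithms aim to avoid.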

slide-21
SLIDE 21

Alarm network

  • Bayes nets: compact representation of large joint distributions

[Figure: the “alarm” network, with nodes PCWP, CO, HRBP, HREKG, HRSAT, ERRCAUTER, HR, HISTORY, CATECHOL, SAO2, EXPCO2, ARTCO2, VENTALV, VENTLUNG, VENITUBE, DISCONNECT, MINVOLSET, VENTMACH, KINKEDTUBE, INTUBATION, PULMEMBOLUS, PAP, SHUNT, ANAPHYLAXIS, MINOVL, PVSAT, FIO2, PRESS, INSUFFANESTH, TPR, LVFAILURE, ERRBLOWOUTPUT, STROEVOLUME, LVEDVOLUME, HYPOVOLEMIA, CVP, BP]

The “alarm” network: 37 variables, 509 parameters (rather than 2^37 ≈ 10^11!) [Beinlich et al., 1989]
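To make the compactness claim concrete, here is a small sketch that counts table entries for a factored model versus a full joint table. It uses the five-variable lung-cancer structure from the Bayesian Networks slide and assumes all variables are binary; the function names are hypothetical.

```python
# Parents of each variable, read off the factorization
# P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B).
parents = {"S": [], "C": ["S"], "B": ["S"], "X": ["C", "S"], "D": ["C", "B"]}


def cpt_entries(parents, domain_size=2):
    """Total number of CPT entries, one table per variable."""
    return sum(domain_size ** (len(pa) + 1) for pa in parents.values())


def free_parameters(parents, domain_size=2):
    """Independent parameters: each CPT row sums to 1, so one entry per row is redundant."""
    return sum((domain_size - 1) * domain_size ** len(pa)
               for pa in parents.values())


def full_joint_entries(n, domain_size=2):
    """Entries in an explicit joint table over n variables."""
    return domain_size ** n


bn_entries = cpt_entries(parents)                  # 2 + 4 + 4 + 8 + 8
bn_free = free_parameters(parents)                 # 1 + 2 + 2 + 4 + 4
joint_entries = full_joint_entries(len(parents))   # 2 ** 5
alarm_joint = full_joint_entries(37)               # 2 ** 37, as on the slide
print(bn_entries, bn_free, joint_entries, alarm_joint)
```

The gap is small for five variables but grows exponentially with the number of variables, while the factored count grows with the number of variables times the largest family size.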

slide-22
SLIDE 22


slide-23
SLIDE 23

Constraint Networks

Example: map coloring

Variables - countries (A, B, C, etc.)
Values - colors (red, green, blue)
Constraints: $A \neq B$, $A \neq D$, $D \neq E$, etc.

Allowed pairs for one constraint (A, B):

A       B
red     green
red     yellow
green   red
green   yellow
yellow  green
yellow  red

[Figure: map with countries A, B, C, D, E, F, G and its constraint graph]

class1 828X‐2018
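A minimal backtracking sketch for a map-coloring constraint network. The three constraints A≠B, A≠D, D≠E come from the slide; the remaining edges are hypothetical, since the full constraint graph is only shown in the figure. The function name `color_map` is also an assumption.

```python
VARIABLES = ["A", "B", "C", "D", "E", "F", "G"]
COLORS = ["red", "green", "blue"]
DOMAINS = {v: list(COLORS) for v in VARIABLES}

EDGES = [
    ("A", "B"), ("A", "D"), ("D", "E"),               # from the slide
    ("B", "D"), ("B", "E"), ("B", "C"),               # hypothetical
    ("D", "F"), ("E", "F"), ("F", "G"),               # hypothetical
]


def color_map(variables, domains, edges):
    """Return one assignment satisfying every not-equal constraint, or None."""
    neighbors = {v: set() for v in variables}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    assignment = {}

    def backtrack(i):
        if i == len(variables):
            return True
        var = variables[i]
        for color in domains[var]:
            # A value is consistent if no already-colored neighbor uses it.
            if all(assignment.get(n) != color for n in neighbors[var]):
                assignment[var] = color
                if backtrack(i + 1):
                    return True
                del assignment[var]
        return False

    return dict(assignment) if backtrack(0) else None


solution = color_map(VARIABLES, DOMAINS, EDGES)
print(solution)
```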

slide-24
SLIDE 24

Propositional Reasoning

Example: party problem

  • If Alex goes, then Becky goes: $A \to B$
  • If Chris goes, then Alex goes: $C \to A$
  • Question: Is it possible that Chris goes to the party but Becky does not?

Is the propositional theory $\varphi = \{A \to B,\ C \to A,\ \neg B,\ C\}$ satisfiable?

[Figure: graph over A, B, C]
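The party question can be checked by enumerating all eight truth assignments. This is a brute-force sketch, not a real SAT procedure; the variable and function names are assumptions.

```python
from itertools import product


def implies(p, q):
    return (not p) or q


# The two rules from the slide, plus the query "Chris goes, Becky does not".
rules = [
    lambda a, b, c: implies(a, b),   # If Alex goes, then Becky goes
    lambda a, b, c: implies(c, a),   # If Chris goes, then Alex goes
]
query = [
    lambda a, b, c: not b,           # Becky does not go
    lambda a, b, c: c,               # Chris goes
]


def models(constraints):
    """All (A, B, C) truth assignments that satisfy every constraint."""
    return [(a, b, c)
            for a, b, c in product([False, True], repeat=3)
            if all(f(a, b, c) for f in constraints)]


rule_models = models(rules)
query_models = models(rules + query)
print(len(rule_models), query_models)
```

The rules alone have models, but adding the query leaves none: Chris going forces Alex, which forces Becky, so the answer to the question is no.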

slide-25
SLIDE 25

Probabilistic reasoning (directed)

Party example: the weather effect

  • Alex is‐likely‐to‐go in bad weather
  • Chris rarely‐goes in bad weather
  • Becky is indifferent but unpredictable

Questions:

  • Given bad weather, which group of individuals is most likely to show up at the party?
  • What is the probability that Chris goes to the party but Becky does not?

P(W,A,C,B) = P(B|W) · P(C|W) · P(A|W) · P(W)

P(A,C,B|W=bad) = 0.9 · 0.1 · 0.5

P(A|W=bad) = .9, P(C|W=bad) = .1, P(B|W=bad) = .5

W     A  P(A|W)
good  0  .01
good  1  .99
bad   0  .1
bad   1  .9

[Figure: W is the parent of A, C, and B, with tables P(W), P(A|W), P(C|W), P(B|W)]
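A short sketch answering both questions under bad weather, using only the three conditional probabilities given on the slide. Because P(W) is not given, the second question is answered conditionally on W=bad, which is an assumption about the intended reading.

```python
from itertools import product

# P(person goes | W=bad), from the slide.
p_go_bad = {"A": 0.9, "C": 0.1, "B": 0.5}


def p_config(config):
    """P(A, C, B | W=bad); given W, the three decisions are independent."""
    p = 1.0
    for person, goes in config.items():
        p *= p_go_bad[person] if goes else 1.0 - p_go_bad[person]
    return p


configs = [dict(zip("ACB", vals)) for vals in product((0, 1), repeat=3)]

# Most likely group(s): Becky is a coin flip, so two configurations tie.
best_p = max(p_config(c) for c in configs)
best = [c for c in configs if abs(p_config(c) - best_p) < 1e-12]

# P(Chris goes, Becky does not | W=bad), summing out Alex.
p_c_not_b = sum(p_config(c) for c in configs if c["C"] == 1 and c["B"] == 0)
print(best, best_p, p_c_not_b)
```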

slide-26
SLIDE 26

Mixed Probabilistic and Deterministic networks

PN (probabilistic network): P(W), P(A|W), P(B|W), P(C|W)

  • Alex is‐likely‐to‐go in bad weather
  • Chris rarely‐goes in bad weather
  • Becky is indifferent but unpredictable

CN (constraint network): A→B, C→A

Query: Is it likely that Chris goes to the party if Becky does not but the weather is bad?

$P(C, \neg B \mid w = \text{bad},\ A \to B,\ C \to A)$

[Figure: the probabilistic network over W, A, B, C combined with the constraint network over A, B, C]
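One way to read the mixed query is to treat the constraints A→B and C→A as evidence and renormalize over the assignments that satisfy them. The sketch below does this by enumeration, reusing the bad-weather probabilities from the party example; treating constraints as conditioning events is the assumption here, and the names are hypothetical.

```python
from itertools import product

# P(person goes | W=bad), from the party example.
p_go_bad = {"A": 0.9, "B": 0.5, "C": 0.1}


def weight(a, b, c):
    """P(A=a, B=b, C=c | W=bad) from the probabilistic network."""
    p = 1.0
    for person, goes in (("A", a), ("B", b), ("C", c)):
        p *= p_go_bad[person] if goes else 1.0 - p_go_bad[person]
    return p


def consistent(a, b, c):
    """The constraint network: A -> B and C -> A."""
    return ((not a) or b) and ((not c) or a)


worlds = [w for w in product((0, 1), repeat=3) if consistent(*w)]

# Probability mass of the worlds allowed by the constraints.
z = sum(weight(*w) for w in worlds)

# Mass of allowed worlds in which Chris goes and Becky does not.
numerator = sum(weight(a, b, c) for a, b, c in worlds if c == 1 and b == 0)

answer = numerator / z
print(len(worlds), z, answer)
```

No consistent world has Chris going without Becky, so under this reading the query probability is zero regardless of the numbers in the tables.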

slide-27
SLIDE 27

Example domains for graphical models

  • Natural Language processing

– Information extraction, semantic parsing, translation, topic models, …

  • Computer vision

– Object recognition, scene analysis, segmentation, tracking, …

  • Computational biology

– Pedigree analysis, protein folding and binding, sequence matching, …

  • Networks

– Webpage link analysis, social networks, communications, citations, ….

  • Robotics

– Planning & decision making


slide-28
SLIDE 28

Complexity of Reasoning Tasks

  • Constraint satisfaction
  • Counting solutions
  • Combinatorial optimization
  • Belief updating
  • Most probable explanation
  • Decision‐theoretic planning

[Plot: f(n) versus n for n = 1 to 10, comparing linear, polynomial, and exponential growth]

Reasoning is computationally hard

Complexity is measured in time and space (memory)

slide-29
SLIDE 29

Tree‐solving is easy

Belief updating (sum-prod) MPE (max-prod)

CSP – consistency (projection-join) #CSP (sum-prod)

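Tree: P(X), P(Y|X), P(Z|X), P(T|Y), P(R|Y), P(L|Z), P(M|Z)

Messages between X and its children: $m_{XY}(X)$, $m_{YX}(X)$, $m_{XZ}(X)$, $m_{ZX}(X)$

Messages between Y and its children: $m_{YT}(Y)$, $m_{TY}(Y)$, $m_{YR}(Y)$, $m_{RY}(Y)$

Messages between Z and its children: $m_{ZL}(Z)$, $m_{LZ}(Z)$, $m_{ZM}(Z)$, $m_{MZ}(Z)$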

Trees are processed in linear time and memory
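A minimal sum-product sketch on the tree above: one upward pass of messages gives the posterior at the root X given evidence on other nodes, with each message computed once. The tree structure is from the slide; all CPT numbers are hypothetical placeholders, and evidence on X itself is not handled. A brute-force check is included for comparison.

```python
from itertools import product

VALUES = (0, 1)
PARENT = {"Y": "X", "Z": "X", "T": "Y", "R": "Y", "L": "Z", "M": "Z"}
PRIOR_X = {0: 0.6, 1: 0.4}  # hypothetical
# CPT[child][parent_value][child_value]; all numbers are hypothetical.
CPT = {
    "Y": {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}},
    "Z": {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}},
    "T": {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}},
    "R": {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}},
    "L": {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}},
    "M": {0: {0: 0.95, 1: 0.05}, 1: {0: 0.35, 1: 0.65}},
}
CHILDREN = {}
for child, par in PARENT.items():
    CHILDREN.setdefault(par, []).append(child)


def up_message(child, evidence):
    """Message from child to its parent, indexed by the parent's value."""
    incoming = [up_message(g, evidence) for g in CHILDREN.get(child, [])]
    msg = {}
    for pv in VALUES:
        total = 0.0
        for cv in VALUES:
            if evidence.get(child, cv) != cv:
                continue  # value ruled out by evidence
            term = CPT[child][pv][cv]
            for m in incoming:
                term *= m[cv]
            total += term
        msg[pv] = total
    return msg


def root_posterior(evidence):
    """Return (P(X | evidence), P(evidence)) from one upward pass."""
    incoming = [up_message(c, evidence) for c in CHILDREN["X"]]
    belief = {}
    for x in VALUES:
        b = PRIOR_X[x]
        for m in incoming:
            b *= m[x]
        belief[x] = b
    z = sum(belief.values())
    return {x: b / z for x, b in belief.items()}, z


def brute_force(evidence):
    """Same quantities by enumerating all 2^7 assignments."""
    names = ["X"] + list(PARENT)
    score = {0: 0.0, 1: 0.0}
    for vals in product(VALUES, repeat=len(names)):
        a = dict(zip(names, vals))
        if any(a[k] != v for k, v in evidence.items()):
            continue
        p = PRIOR_X[a["X"]]
        for child, par in PARENT.items():
            p *= CPT[child][a[par]][a[child]]
        score[a["X"]] += p
    z = sum(score.values())
    return {x: s / z for x, s in score.items()}, z


evidence = {"T": 1, "M": 0}
posterior, p_evidence = root_posterior(evidence)
check, p_check = brute_force(evidence)
print(posterior, p_evidence)
```

Replacing the inner sum with a max would give the max-product variant used for MPE.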


slide-30
SLIDE 30

Transforming into a Tree

  • By Inference (thinking)

– Transform into a single, equivalent tree of sub‐problems

  • By Conditioning (guessing)

– Transform into many tree‐like sub‐problems.

slide-31
SLIDE 31

Inference and Treewidth

[Figure: graph over A, B, C, D, E, F, G, H, J, K, L, M and a tree decomposition with clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM]

treewidth = (maximum cluster size) - 1

treewidth = 4 - 1 = 3

Inference algorithm:

  • Time: exp(tree-width)
  • Space: exp(tree-width)
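A small sketch of the width computation for the cluster list shown on this slide. Strictly, this is the width of the given decomposition, which upper-bounds the treewidth; the function names and the binary-domain default are assumptions.

```python
# Clusters of the tree decomposition listed on the slide.
clusters = ["ABC", "BDEF", "DGF", "EFH", "FHK", "HJ", "KLM"]


def decomposition_width(clusters):
    """Width of a decomposition: (maximum cluster size) - 1."""
    return max(len(set(c)) for c in clusters) - 1


def largest_table_entries(clusters, domain_size=2):
    """Entries in the largest cluster table, which drives the exp(width) cost."""
    return domain_size ** (decomposition_width(clusters) + 1)


width = decomposition_width(clusters)
table = largest_table_entries(clusters)
variables = set("".join(clusters))
print(width, table, len(variables))
```

With 12 binary variables, the largest cluster table has 16 entries, compared with 4096 for a joint table over everything.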

slide-32
SLIDE 32

Conditioning and Cycle cutset

[Figure: a graph over A, B, C, D, E, F, G, H, J, K, L, M, N, O, P; conditioning on A, then B, then C removes them one at a time until no cycles remain]

Cycle cutset = {A,B,C}

slide-33
SLIDE 33

Search over the Cutset

Graph Coloring problem

  • Inference may require too much memory
  • Condition on some of the variables

[Figure: search tree over the cutset variables, branching on A (yellow, green) and then B (red, blue, green, yellow); each leaf is the remaining graph over C, D, E, F, G, H, J, K, L, M]
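A toy sketch of cutset conditioning for counting graph colorings. The graph is a hypothetical 4-cycle A-B-C-D-A, not the one on the slide: conditioning on the cutset {A} leaves the chain B-C-D, which is solved in linear time for each value of A. A brute-force count is included as a check.

```python
from itertools import product

COLORS = ("red", "green", "blue")
# Hypothetical toy graph: a single cycle A - B - C - D - A.
CYCLE_EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]


def count_chain(chain, allowed):
    """Count proper colorings of a path with one forward pass.

    allowed[v] is the set of colors still permitted for v after conditioning.
    """
    counts = {c: 1 for c in allowed[chain[0]]}
    for v in chain[1:]:
        counts = {c: sum(n for prev, n in counts.items() if prev != c)
                  for c in allowed[v]}
    return sum(counts.values())


def count_by_cutset():
    """Enumerate the cutset {A}; each value leaves a tree (the chain B-C-D)."""
    total = 0
    for a in COLORS:
        allowed = {
            "B": set(COLORS) - {a},   # B is adjacent to A
            "C": set(COLORS),
            "D": set(COLORS) - {a},   # D is adjacent to A
        }
        total += count_chain(["B", "C", "D"], allowed)
    return total


def count_brute_force():
    names = ["A", "B", "C", "D"]
    total = 0
    for vals in product(COLORS, repeat=4):
        col = dict(zip(names, vals))
        total += all(col[u] != col[v] for u, v in CYCLE_EDGES)
    return total


by_cutset = count_by_cutset()
by_enumeration = count_brute_force()
print(by_cutset, by_enumeration)
```

The memory used is that of a tree algorithm; the price is one tree-solve per assignment to the cutset.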

slide-34
SLIDE 34

Bird's‐eye View of Exact Algorithms

  • Inference: exp(w*) time/space
  • Search: exp(w*) time, O(w*) space
  • Search + inference: Space: exp(q), Time: exp(q+c(q)), q: user controlled

[Figures: a graph over A, B, C, D, E, F with its search tree; the tree decomposition with clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM; the cutset search tree for graph coloring]

slide-35
SLIDE 35

Bird's‐eye View of Exact Algorithms

  • Inference: exp(w*) time/space
  • Search: exp(w*) time, O(w*) space
  • Search + inference: Space: exp(q), Time: exp(q+c(q)), q: user controlled

[Figures: a graph over A, B, C, D, E, F with its search tree; the tree decomposition with clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM; the cutset search tree for graph coloring; the context minimal AND/OR search graph (18 AND nodes)]

slide-36
SLIDE 36

Bird's‐eye View of Approximate Algorithms

  • Inference: Bounded Inference
  • Search: Sampling
  • Search + inference: Sampling + bounded inference

[Figures: a graph over A, B, C, D, E, F with its search tree; the tree decomposition with clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM; the cutset search tree for graph coloring; the context minimal AND/OR search graph (18 AND nodes)]

slide-37
SLIDE 37

Outline

  • Basics of probability theory
  • DAGs, Markov(G), Bayesian networks
  • Graphoids: axioms for inferring conditional independence (CI)
  • D‐separation: Inferring CIs in graphs

slide-38
SLIDE 38

Outline

  • Basics of probability theory
  • DAGs, Markov(G), Bayesian networks
  • Graphoids: axioms for inferring conditional independence (CI)
  • Capturing CIs by graphs
  • D‐separation: Inferring CIs in graphs

slide-39
SLIDE 39

Examples: Common Sense Reasoning

  • Zebra on pajamas (7:30 pm): I told Susannah, “You have nice pajamas,” but it was just a dress. Why jump to that conclusion? 1. Because it was nighttime. 2. Certain designs look like pajamas.
  • Cars going out of a parking lot: You enter a parking lot that is quite full (UCI) and see a car coming. You think: ah, now there is a (vacated) space. Or there is no space, and this driver was looking and is leaving for another parking lot. What other clues can we have?
  • Robot gets out at the wrong level: A robot goes down in the elevator, which stops at the 2nd floor instead of the ground floor. It steps out and should immediately recognize that it is not on the right level, and go back inside.
  • Turing quotes

– If machines are not allowed to be fallible, they cannot be intelligent – (Mathematicians are wrong from time to time, so a machine should also be allowed to be)

slide-40
SLIDE 40

Why Uncertainty?

  • AI goal: to have a declarative, model‐based framework that allows computer systems to reason.
  • People reason with partial information
  • Sources of uncertainty:

– Limitation in observing the world: e.g., a physician sees symptoms, not exactly what goes on in the body, when performing a diagnosis. Observations are noisy (test results are inaccurate) – Limitation in modeling the world – Maybe the world is not deterministic.

slide-41
SLIDE 41

Why/What/How Uncertainty?

  • Why Uncertainty?

– Answer: It is abundant

  • What formalism to use?

– Answer: Probability theory

  • How to overcome exponential representation?

– Answer: Graphs, graphs, graphs… to capture irrelevance, independence, causality


slide-42
SLIDE 42

The Burglary Example

[Figure: network over Earthquake, Burglary, Radio, Alarm, Call]

slide-43
SLIDE 43

slide-44
SLIDE 44

slide-45
SLIDE 45

slide-46
SLIDE 46

slide-47
SLIDE 47

slide-48
SLIDE 48

slide-49
SLIDE 49

slide-50
SLIDE 50

Alpha and beta are events

slide-51
SLIDE 51

slide-52
SLIDE 52

Burglary is independent of Earthquake


slide-53
SLIDE 53

Earthquake is independent of burglary

slide-54
SLIDE 54

slide-55
SLIDE 55

slide-56
SLIDE 56

slide-57
SLIDE 57

slide-58
SLIDE 58

slide-59
SLIDE 59

slide-60
SLIDE 60

slide-61
SLIDE 61

slide-62
SLIDE 62

slide-63
SLIDE 63

Example

P(B,E,A,J,M)=?

slide-64
SLIDE 64

slide-65
SLIDE 65

slide-66
SLIDE 66

slide-67
SLIDE 67

slide-68
SLIDE 68

Bayesian Networks: Representation

Variables: Smoking (S), lung Cancer (C), Bronchitis (B), X-ray (X), Dyspnoea (D)

Local functions: P(S), P(C|S), P(B|S), P(X|C,S), P(D|C,B)

P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)

Conditional Independencies

Efficient Representation

BN = (G, Θ)

CPD: P(D|C,B)

C  B  D=0  D=1
0  0  0.1  0.9
0  1  0.7  0.3
1  0  0.8  0.2
1  1  0.9  0.1

slide-69
SLIDE 69


End of slides