SLIDE 1
Artificial Intelligence: Methods and Applications

Lecture 9: Probabilistic reasoning (Bayesian Networks) Juan Carlos Nieves SΓ‘nchez December 02, 2014

SLIDE 3

Bayesian Networks 3

Outline

  • Motivation
  • Syntax
  • Semantics
  • Parameterized distributions
SLIDE 4

General Inference Procedure

  • Let X be the query variable,
  • Let E be the set of evidence variables,
  • Let e be the observed values for them,
  • Let Y be the remaining unobserved variables.

Then the query P(X | e) can be evaluated as

P(X | e) = α Σ_y P(X, e, y),

where the summation is over all possible y's (i.e., all possible combinations of values of the unobserved variables Y).
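The summation above can be carried out directly over a tabulated joint distribution. A minimal sketch in Python, assuming the classic three-variable Burglary/Earthquake/Alarm domain; the probability numbers are illustrative assumptions, not taken from these slides:

```python
# Inference by summing out unobserved variables of a full joint table.
from itertools import product

# Build the full joint P(B, E, A) as a dict keyed by (b, e, a).
pB, pE = 0.001, 0.002
pA = {(True, True): 0.95, (True, False): 0.94,
      (False, True): 0.29, (False, False): 0.001}
joint = {}
for b, e, a in product([True, False], repeat=3):
    p = (pB if b else 1 - pB) * (pE if e else 1 - pE)
    p *= pA[(b, e)] if a else 1 - pA[(b, e)]
    joint[(b, e, a)] = p

def query(joint, var_index, evidence):
    """P(X | e): sum the joint over unobserved variables, then normalize."""
    dist = {True: 0.0, False: 0.0}
    for world, p in joint.items():
        if all(world[i] == v for i, v in evidence.items()):
            dist[world[var_index]] += p
    total = sum(dist.values())  # the normalization constant alpha is 1/total
    return {v: p / total for v, p in dist.items()}

# P(Burglary | Alarm = true); variable order is (B, E, A).
posterior = query(joint, 0, {2: True})
```

Note that the loop touches every entry of the joint table, which is exactly the scaling problem discussed on the next slide.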


SLIDE 5

Some observations

This approach to inference does not scale well. For a domain described by n variables, where d is the largest arity:

  • 1. Worst-case time complexity O(d^n).
  • 2. Space complexity O(d^n) to store the joint distribution.

For these reasons, the full joint distribution in tabular form is not a practical tool for building reasoning systems. How do you avoid the exponential space and time complexity of inference based on probability distributions?

Let us take advantage of independence and Bayes’ Rule

SLIDE 6

Independence


SLIDE 7

Independence

How to think about conditional independence:

If knowing C tells me everything about A, I don't gain anything by also knowing B.

Conditional independence assertions can allow probabilistic systems to scale up; moreover, they are much more commonly available than absolute independence assertions.

SLIDE 8

Bayesian Networks

Independence and conditional independence relationships among variables can greatly reduce the number of probabilities that need to be specified in order to define the full joint distribution.

  • A Bayesian network is a data structure that can represent the dependencies among variables.
  • Bayesian networks can represent essentially any full joint probability distribution, and in many cases can do so very concisely.
  • Bayesian networks have been one of the most important contributions to the field of AI.
  • They provide a way to represent knowledge in an uncertain domain and a way to reason about this knowledge.
  • Many applications: medicine, factories, etc.
SLIDE 9

Bayesian Network

A Bayesian network is made up of two parts:

  • 1. A directed acyclic graph
  • 2. A set of parameters

(Figure: network with nodes Burglary, Earthquake, and Alarm.)

SLIDE 10

A directed acyclic graph

  • The nodes are random variables (which can be discrete or continuous).
  • Arrows connect pairs of nodes (X is a parent of Y if there is an arrow from node X to node Y).
  • Intuitively, an arrow from node X to node Y means X has a direct influence on Y (we can say X has a causal effect on Y).
  • It is easy for a domain expert to determine these relationships.

(Figure: Burglary and Earthquake are parents of Alarm.)

SLIDE 11

A set of parameters

  • Each node X_i has a conditional probability distribution P(X_i | Parents(X_i)) that quantifies the effect of the parents on the node.
  • The set of parameters consists of the probabilities in these conditional probability distributions.
  • As we have discrete random variables, we have conditional probability tables (CPTs).

(Figure: Burglary and Earthquake are parents of Alarm, each node with its CPT.)
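As a concrete sketch, the two parts of the network (the DAG plus the parameters) can be encoded directly, with each CPT keyed by the tuple of parent values. The shape matches the Burglary/Earthquake/Alarm figure; the probability values are illustrative assumptions, not from the slides:

```python
# A discrete Bayesian network as plain data: per-node parent list + CPT.
network = {
    "Burglary":   {"parents": [], "cpt": {(): 0.001}},
    "Earthquake": {"parents": [], "cpt": {(): 0.002}},
    # CPT rows are keyed by the tuple of parent values; each entry is
    # P(node = True | parent values).
    "Alarm": {
        "parents": ["Burglary", "Earthquake"],
        "cpt": {(True, True): 0.95, (True, False): 0.94,
                (False, True): 0.29, (False, False): 0.001},
    },
}

def prob(node, value, assignment):
    """Look up P(node = value | parents) given a dict of parent values."""
    spec = network[node]
    key = tuple(assignment[p] for p in spec["parents"])
    p_true = spec["cpt"][key]
    return p_true if value else 1.0 - p_true
```

Since P(True | ...) and P(False | ...) must add up to 1 in every row, storing only the True entry keeps each CPT row normalized by construction, which is exactly the observation on the next slide.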

SLIDE 12

Observations of the set of parameters

  • The conditional probability distribution for Alarm stores the probability distribution for Alarm given the values of Burglary and Earthquake.
  • For a given combination of values of the parents (B and E in this example), the entries for P(A = true | B, E) and P(A = false | B, E) must add up to 1.
  • For instance,

P(A = true | B = false, E = false) + P(A = false | B = false, E = false) = 1

SLIDE 13

Independence and Bayesian Networks

What does the absence or presence of arrows in a Bayesian network mean?

(Figure: Cavity is a parent of Toothache and Catch; Weather is an isolated node.)

  • Weather is independent of the other variables.
  • Toothache and Catch are conditionally independent given Cavity (this is represented by the fact that there is no link between Toothache and Catch and by the fact that they both have Cavity as a parent).

SLIDE 14

Semantics of Bayesian Networks

Two ways to view Bayes networks:

  • 1. A representation of a joint probability distribution.
  • 2. An encoding of a collection of conditional independence statements.

SLIDE 15

Representation of the Full Joint Distribution

Write

P(x_1, …, x_n) = P(x_n | x_{n-1}, …, x_1) P(x_{n-1} | x_{n-2}, …, x_1) ⋯ P(x_1) = Π_{i=1}^{n} P(x_i | x_{i-1}, …, x_1)

Factorization (chain rule)

SLIDE 16

Global Semantics

Global semantics defines the full joint distribution as the product of the local conditional distributions. In other words, since a Bayesian network's structure implies that the value of a particular node is conditional only on the values of its parent nodes, this reduces to

P(x_1, …, x_n) = Π_{i=1}^{n} P(x_i | parents(X_i)),

in which parents(X_i) denotes the values that the parents of X_i take in x_1, …, x_n.
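The global semantics can be sketched in a few lines: multiply one CPT entry per node, selected by the values its parents take in the assignment. Network shape and numbers below are illustrative assumptions, not from the slides:

```python
# Probability of a complete assignment = product of local CPT entries.
cpts = {
    "Burglary":   ([], {(): 0.001}),
    "Earthquake": ([], {(): 0.002}),
    "Alarm": (["Burglary", "Earthquake"],
              {(True, True): 0.95, (True, False): 0.94,
               (False, True): 0.29, (False, False): 0.001}),
}

def joint_probability(assignment):
    """P(x_1, ..., x_n) = product over nodes of P(x_i | parents(X_i))."""
    result = 1.0
    for node, (parents, cpt) in cpts.items():
        p_true = cpt[tuple(assignment[p] for p in parents)]
        result *= p_true if assignment[node] else 1.0 - p_true
    return result

# P(Burglary = true, Earthquake = false, Alarm = true)
# = P(b) * P(not e) * P(a | b, not e) = 0.001 * 0.998 * 0.94
p = joint_probability({"Burglary": True, "Earthquake": False, "Alarm": True})
```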

SLIDE 17

Global Semantics


Example:

SLIDE 18

Local Semantics

We can look at the actual graph structure and determine conditional independence relationships.

Local semantics: a node X is conditionally independent of its non-descendants (Z_1j, …, Z_nj), given its parents (U_1, …, U_m).

Theorem: the local semantics hold if and only if the global semantics hold.

SLIDE 19

Conditional Independence: Markov blanket

A node X is conditionally independent of all other nodes in the network, given its parents U_1, …, U_m, its children Y_1, …, Y_n, and its children's parents Z_1j, …, Z_nj,

that is, given its Markov blanket: parents + children + children's parents.
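Given the graph as a parents mapping, the Markov blanket can be read off mechanically. A small sketch; the graph below is a hypothetical example, not one of the networks on the slides:

```python
# Compute the Markov blanket of a node from a DAG given as a parents map.
parents = {
    "A": [], "B": [], "X": ["A"], "Y": ["X", "B"], "Z": ["X"],
}

def markov_blanket(node):
    """Parents + children + children's other parents of `node`."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:          # add co-parents of each child
        blanket |= set(parents[child])
    blanket.discard(node)           # the node itself is excluded
    return blanket
```

For the example graph, the blanket of X is its parent A, its children Y and Z, and Y's other parent B.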

SLIDE 20

Pearl’s Network Construction Algorithm

Need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics.
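The construction procedure can be sketched as follows: fix a variable ordering, then give each node the minimal set of predecessors that renders it independent of the rest. The helper `is_independent` is a hypothetical oracle that the modeler answers; it stands in for the "locally testable assertions" above:

```python
# Sketch of Bayesian network construction under a chosen ordering.
from itertools import combinations

def build_network(ordering, is_independent):
    """Return a parents mapping for the given variable ordering.

    is_independent(x, predecessors, subset) should answer whether
    P(x | subset) = P(x | predecessors) holds in the domain.
    """
    parents = {}
    for i, x in enumerate(ordering):
        predecessors = ordering[:i]
        # try parent sets of increasing size, so the result is minimal
        for size in range(len(predecessors) + 1):
            found = None
            for subset in combinations(predecessors, size):
                if is_independent(x, predecessors, set(subset)):
                    found = subset
                    break
            if found is not None:
                parents[x] = list(found)
                break
    return parents
```

The full predecessor set always qualifies, so the inner search always terminates with some parent set; a good ordering (causes before effects) keeps those sets small.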

SLIDE 21

Example: Lung Cancer Diagnosis

A patient has been suffering from shortness of breath (called dyspnoea) and visits the doctor, worried that he has lung cancer. The doctor knows that other diseases, such as tuberculosis and bronchitis, are possible causes, as well as lung cancer. She also knows that other relevant information includes whether or not the patient is a smoker (increasing the chances of cancer and bronchitis) and what sort of air pollution he has been exposed to. A positive X-ray would indicate either TB or lung cancer.

SLIDE 22

Lung cancer example: nodes and values


SLIDE 23

Lung cancer example: CPTs

Are the CPTs expressing all the possible combinations of values?

SLIDE 24

Reasoning with Bayesian Networks

  • Basic task for any probabilistic inference system: compute the posterior probability distribution for a set of query variables, given new information about some evidence variables.
  • Also called conditioning, belief updating, or inference.

SLIDE 25

Most Usual Queries

Let π‘Œ = 𝐹 βˆͺ 𝑍 βˆͺ π‘Ž, where 𝐹 are the evidence variable, 𝑓 are the observed values, 𝑍 are the varaible of interest, π‘Ž are the rest of the varaibles.

Bayesian Networks 25

SLIDE 26

Types of reasoning


How do you express these reasoning types in terms of conditional probabilities?

SLIDE 27

Some tools for using Bayesian Networks in real systems

  • BayesiaLab: http://www.bayesia.com/
  • GeNIe: https://dslpitt.org/genie/
  • Hugin: http://www.hugin.com
  • Netica: http://www.norsys.com/

Why not download one of these tools and play a little bit?

SLIDE 28


Sources of this Lecture

  • S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, Third Edition.
  • K. B. Korb, A. E. Nicholson, Bayesian Artificial Intelligence, Second Edition, 2010.