Artificial Intelligence: Methods and Applications
Lecture 9: Probabilistic reasoning (Bayesian Networks) Juan Carlos Nieves Sánchez December 02, 2014
Bayesian Networks 3
Outline
- Motivation
- Syntax
- Semantics
- Parameterized distributions
General Inference Procedure
- Let X be the query variable,
- Let E be the set of evidence variables,
- Let e be the observed values for them,
- Let Y be the remaining unobserved (hidden) variables.
Then the query P(X|e) can be evaluated as
P(X | e) = alpha * P(X, e) = alpha * sum_y P(X, e, y)
where alpha is a normalization constant and the summation is over all possible y (i.e., all possible combinations of values of the unobserved variables Y).
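The enumeration above can be sketched directly on a small full joint table. This is a minimal sketch, assuming a hypothetical three-variable Boolean domain (query X, evidence E1, hidden Y1) with made-up probabilities; the variable names and numbers are illustrative only, not from the slides.

```python
# Hypothetical full joint over three Boolean variables (X, E1, Y1),
# stored as {(x, e1, y1): probability}. Values are illustrative only.
joint = {
    (True,  True,  True):  0.06, (True,  True,  False): 0.14,
    (True,  False, True):  0.10, (True,  False, False): 0.10,
    (False, True,  True):  0.04, (False, True,  False): 0.16,
    (False, False, True):  0.20, (False, False, False): 0.20,
}

def query(joint, evidence_value):
    """P(X | E1 = evidence_value): sum out the hidden variable Y1,
    then normalize (the alpha in P(X|e) = alpha * sum_y P(X, e, y))."""
    unnormalized = {}
    for x in (True, False):
        unnormalized[x] = sum(joint[(x, evidence_value, y)]
                              for y in (True, False))
    alpha = 1.0 / sum(unnormalized.values())
    return {x: alpha * p for x, p in unnormalized.items()}

dist = query(joint, True)
```

Note the cost: the table has one entry per combination of values, which is exactly the exponential blow-up discussed next.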
Some observations
This approach to inference does not scale well. For a domain described by n variables, where d is the largest arity:
- 1. Worst-case time complexity O(d^n)
- 2. Space complexity O(d^n) to store the joint distribution.
For these reasons, the full joint distribution in tabular form is not a practical tool for building reasoning systems. How do you avoid the exponential space and time complexity of inference based on probability distributions?
Let us take advantage of independence and Bayes' Rule.
Independence
How to think about conditional independence:
If knowing Z already tells me everything relevant about X, I gain nothing further by also knowing Y (X is conditionally independent of Y given Z).
Conditional independence assertions can allow probabilistic systems to scale up; moreover, they are much more commonly available than absolute independence assertions.
Bayesian Networks
Independence and conditional independence relationships among variables can greatly reduce the number of probabilities that need to be specified in order to define the full joint distribution.
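A quick sketch of this saving, assuming the full five-node version of the textbook burglary example (Burglary, Earthquake, Alarm, plus JohnCalls and MaryCalls, which the slides do not show; all variables Boolean):

```python
# Parent sets for the classic five-node burglary network (AIMA).
# Burglary and Earthquake are roots; Alarm has both as parents;
# JohnCalls and MaryCalls each depend only on Alarm.
parents = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

# With Boolean variables, each node's CPT needs 2**len(parents)
# independent numbers (one per parent configuration; the complementary
# probability is implied).
bn_params = sum(2 ** len(ps) for ps in parents.values())

# The full joint table over n Boolean variables needs 2**n - 1
# independent numbers.
full_joint_params = 2 ** len(parents) - 1

# bn_params == 10, full_joint_params == 31
```

Ten numbers instead of thirty-one; for larger, sparsely connected networks the gap grows exponentially.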
- A Bayesian network is a data structure that can represent the dependencies among variables.
- Bayesian networks can represent essentially any full joint probability distribution, and in many cases can do so very concisely.
- Bayesian networks have been one of the most important contributions to the field of AI.
- They provide a way to represent knowledge in an uncertain domain and a way to reason about this knowledge.
- Many applications: medicine, factories, etc.
Bayesian Network
A Bayesian network is made up of two parts:
- 1. A directed acyclic graph
- 2. A set of parameters
[Figure: a directed acyclic graph with nodes Burglary, Earthquake, and Alarm; Burglary and Earthquake each have an arrow into Alarm]
A directed acyclic graph
- The nodes are random variables (which can be discrete or
continuous).
- Arrows connect pairs of nodes (X is a parent of Y if there is an
arrow from node X to node Y).
- Intuitively, an arrow from node X to node Y means X has a direct influence on Y (we can say X has a causal effect on Y).
- Easy for a domain expert to determine these relationships.
A set of parameters
- Each node X_i has a conditional probability distribution P(X_i | Parents(X_i)) that quantifies the effect of the parents on the node.
- The set of parameters are the probabilities in these conditional probability distributions.
- As we have discrete random variables, we have conditional probability tables (CPTs).
Observations of the set of parameters
- The conditional probability distribution for Alarm stores the probability distribution for Alarm given the values of Burglary and Earthquake.
- For a given combination of values of the parents (B and E in this example), the entries for P(A = true | B, E) and P(A = false | B, E) must add up to 1.
- For instance,
P(A = true | B = false, E = false) + P(A = false | B = false, E = false) = 1
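A minimal sketch of such a CPT in code, assuming the standard textbook values for the Alarm node (0.95, 0.94, 0.29, 0.001), which may differ from the numbers shown on the slide:

```python
# Alarm CPT: for each (Burglary, Earthquake) combination we store
# P(Alarm = true | B, E); P(Alarm = false | B, E) is the complement.
# Values are the standard AIMA numbers (an assumption here).
alarm_cpt = {
    (True,  True):  0.95,
    (True,  False): 0.94,
    (False, True):  0.29,
    (False, False): 0.001,
}

def p_alarm(a, b, e):
    """P(Alarm = a | Burglary = b, Earthquake = e)."""
    p_true = alarm_cpt[(b, e)]
    return p_true if a else 1.0 - p_true

# For every parent combination, the two entries sum to 1.
for b in (True, False):
    for e in (True, False):
        assert abs(p_alarm(True, b, e) + p_alarm(False, b, e) - 1.0) < 1e-12
```

Storing only P(A = true | ...) and deriving the complement is exactly why a Boolean node with k Boolean parents needs just 2^k numbers.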
Independence and Bayesian Networks
What does the absence or presence of arrows in a Bayesian network mean?
[Figure: a network with nodes Cavity, Toothache, Catch, and Weather; Cavity is the parent of both Toothache and Catch; Weather has no links]
- Weather is independent of the other variables
- Toothache and Catch are conditionally independent given
Cavity (this is represented by the fact that there is no link between Toothache and Catch and by the fact that they have Cavity as a parent)
Semantics of Bayesian Networks
Two ways to view Bayes networks:
- 1. A representation of a joint
probability distribution.
- 2. An encoding of a collection of
conditional independence statements.
Representation of the Full Joint Distribution
Write the joint distribution as a product of conditionals, using the factorization known as the chain rule:
P(x_1, ..., x_n) = P(x_n | x_{n-1}, ..., x_1) * P(x_{n-1} | x_{n-2}, ..., x_1) * ... * P(x_1)
Global Semantics
Global semantics defines the full joint distribution as the product of the local conditional distributions. In other words, since a Bayesian network structure implies that the value of a particular node is conditioned only on the values of its parent nodes, the chain rule reduces to
P(x_1, ..., x_n) = prod_{i=1}^{n} P(x_i | parents(X_i))
in which parents(X_i) denotes the values of Parents(X_i) appearing in x_1, ..., x_n.
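The product of local conditionals can be sketched on the three-node Burglary/Earthquake/Alarm fragment. This assumes the standard textbook numbers (P(B)=0.001, P(E)=0.002, and the usual Alarm CPT), which may differ from the slide's figures:

```python
# Local distributions for the three-node fragment (assumed AIMA values).
P_B, P_E = 0.001, 0.002                             # P(B=true), P(E=true)
P_A = {(True, True): 0.95, (True, False): 0.94,     # P(A=true | B, E)
       (False, True): 0.29, (False, False): 0.001}

def joint(b, e, a):
    """P(B=b, E=e, A=a) as the product of the local conditionals:
    P(b) * P(e) * P(a | b, e)."""
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    return pb * pe * pa

# The eight joint entries sum to 1, as any full joint distribution must.
total = sum(joint(b, e, a) for b in (True, False)
            for e in (True, False) for a in (True, False))
```

The network never stores these eight numbers; it reconstructs any of them on demand from the six CPT entries.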
Global Semantics
Example:
Local Semantics
We can look at the actual graph structure and determine conditional independence relationships.
Local Semantics: A node X is conditionally independent of its non-descendants, given its parents.
Theorem: Local semantics hold if and only if global semantics hold.
Conditional Independence: Markov blanket
A node X is conditionally independent of all other nodes in the network, given its parents, children, and children's parents, that is, given its Markov blanket: parents + children + children's parents.
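The Markov blanket can be read straight off the graph. A sketch, assuming the five-node burglary network (the two calling nodes JohnCalls and MaryCalls are from the textbook version, not these slides):

```python
# Parent-set representation of the five-node burglary network (AIMA).
parents = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

def markov_blanket(node):
    """Markov blanket = parents + children + children's other parents."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])   # co-parents of each child
    blanket.discard(node)                # the node itself is excluded
    return blanket

# markov_blanket("Alarm") == {"Burglary", "Earthquake", "JohnCalls", "MaryCalls"}
```

Note that Burglary's blanket contains Earthquake even though there is no arrow between them: they share the child Alarm, so they become dependent once Alarm is known.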
Pearlβs Network Construction Algorithm
Need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics
Example: Lung Cancer Diagnosis
A patient has been suffering from shortness of breath (called dyspnoea) and visits the doctor, worried that he has lung cancer. The doctor knows that other diseases, such as tuberculosis and bronchitis, are possible causes, as well as lung cancer. She also knows that other relevant information includes whether or not the patient is a smoker (increasing the chances of cancer and bronchitis) and what sort of air pollution he has been exposed to. A positive X-ray would indicate either TB or lung cancer.
Lung cancer example: nodes and values
Lung cancer example: CPTs
Are the CPTs expressing all the possible combinations of values?
Reasoning with Bayesian Networks
- The basic task for any probabilistic inference system: compute the posterior probability distribution for a set of query variables, given new information about some evidence variables.
- Also called conditioning, belief updating, or inference.
Most Usual Queries
Let V = E ∪ X ∪ Y, where E are the evidence variables, e are their observed values, X are the variables of interest (the query variables), and Y are the rest of the variables (the hidden variables).
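Such a query can be answered by enumeration, as in the general inference procedure earlier, but multiplying CPT entries instead of reading a joint table. A sketch of P(Burglary | Alarm = true) on the three-node fragment, assuming the standard textbook numbers (not taken from these slides):

```python
# Local distributions for the three-node fragment (assumed AIMA values).
P_B, P_E = 0.001, 0.002                             # P(B=true), P(E=true)
P_A = {(True, True): 0.95, (True, False): 0.94,     # P(A=true | B, E)
       (False, True): 0.29, (False, False): 0.001}

def posterior_burglary(alarm=True):
    """P(B | A = alarm) = alpha * sum_e P(B) P(e) P(alarm | B, e)."""
    unnorm = {}
    for b in (True, False):
        pb = P_B if b else 1 - P_B
        total = 0.0
        for e in (True, False):          # sum out the hidden variable E
            pe = P_E if e else 1 - P_E
            pa = P_A[(b, e)] if alarm else 1 - P_A[(b, e)]
            total += pe * pa
        unnorm[b] = pb * total
    alpha = 1.0 / sum(unnorm.values())   # normalize over the query values
    return {b: alpha * p for b, p in unnorm.items()}

dist = posterior_burglary()
```

With these numbers the alarm alone does not make a burglary likely: the prior P(B) is so small that P(B = true | A = true) stays well below one half.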
Types of reasoning
How do you express these reasoning types in terms of conditional probabilities?
Some tools for using Bayesian Networks in real systems
- BayesiaLab: http://www.bayesia.com/
- GeNIe: https://dslpitt.org/genie/
- Hugin: http://www.hugin.com
- Netica: http://www.norsys.com/
Why not download one of these tools and play with it a little bit?
Sources of this Lecture
- S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, Third Edition.
- K. B. Korb, A. E. Nicholson, Bayesian Artificial Intelligence, Second Edition, 2010.