Machine Learning
Computational Learning Theory: Probably Approximately Correct (PAC) Learning
Slides based on material from Dan Roth, Avrim Blum, Tom Mitchell and others
Computational Learning Theory
The Theory of Generalization
– The class of functions containing the target function, e.g., all n-conjunctions; all n-dimensional linear functions, …
– This is the set that the learning algorithm explores (a small sketch of one such set follows below)
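To make "the set that the learning algorithm explores" concrete, here is a minimal Python sketch, not from the slides, of a monotone variant of the "all n-conjunctions" example; the function names are illustrative only. It enumerates the whole hypothesis space for small n and evaluates one hypothesis on one example.

    from itertools import combinations

    def monotone_conjunctions(n):
        """Yield every monotone conjunction over n Boolean variables.

        A hypothesis is the set of variable indices it conjoins; the empty
        set is the always-true conjunction. There are 2^n hypotheses.
        """
        for k in range(n + 1):
            for subset in combinations(range(n), k):
                yield frozenset(subset)

    def evaluate(h, x):
        """h(x) is true iff every conjoined variable is 1 in example x."""
        return all(x[i] for i in h)

    H = list(monotone_conjunctions(4))                 # the space the learner explores
    print(len(H))                                      # 16 = 2^4 hypotheses
    print(evaluate(frozenset({0, 2}), [1, 0, 1, 1]))   # x0 AND x2 -> True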
All the notation we have so far on one slide
Given a small enough number of examples, with high probability, the learner will produce a "good enough" classifier.
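One common way to make this statement precise, with notation assumed here rather than taken from these slides (ε bounds the error of the learned classifier, δ bounds the probability of failure, D is the distribution generating the m training examples, f is the target, and h_S is the learner's output on sample S):

    \Pr_{S \sim D^m}\Big[\ \Pr_{x \sim D}\big[h_S(x) \neq f(x)\big] \le \epsilon \ \Big] \;\ge\; 1 - \delta

with the required sample size m polynomial in 1/ε and 1/δ — the "small enough number of examples".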
# [f(y) ≠ h(y)] – the number of sample points y on which the hypothesis h disagrees with the target f
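As a hedged illustration (the sample S and the functions f and h below are toy placeholders), the corresponding empirical error is just this count divided by the sample size:

    def empirical_error(h, f, sample):
        """Fraction of sample points on which hypothesis h disagrees with target f."""
        return sum(1 for x in sample if h(x) != f(x)) / len(sample)

    f = lambda x: x[0] and x[1]      # toy target
    h = lambda x: x[0]               # toy hypothesis
    S = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
    print(empirical_error(h, f, S))  # 0.25: h and f disagree only on (1, 0, 1)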
– Is there enough information in the sample to distinguish a hypothesis h that approximates f?
– Is there an efficient algorithm that can process the sample and produce a good hypothesis h?
We want the answers to hold:
– for every distribution (the distribution-free assumption)
– for every target function f in the class C
(A standard sample-size bound addressing the first question is sketched below.)
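For the first question, a standard answer can be written down when the hypothesis class H is finite and the learner returns a hypothesis consistent with the sample (assumptions not stated on these slides, added here only as a hedged sketch): drawing

    m \;\ge\; \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)

examples guarantees, with probability at least 1 − δ, that every hypothesis consistent with the sample has true error at most ε, and this holds for every distribution and every target function in H.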