Carnegie Mellon University, 10-701 Machine Learning, Spring 2013
Recitation 1: Statistics Intro (2/12/13)

Bias-Variance Trade-off

Intuition:
- If the model is too simple, the solution is biased and does not fit the data.
- If the model is too complex, it is very sensitive to small changes in the data.
Bias

- If you sample a dataset D multiple times, you expect to learn a different h(x).
- The expected hypothesis is E_D[h(x)].
- Bias: the difference between the truth t(x) and what you expect to learn:

    bias^2 = \int_x ( E_D[h(x)] - t(x) )^2 p(x) dx

- Bias decreases with more complex models.
Variance

- Variance: the difference between what you learn from a particular dataset and what you expect to learn:

    variance = \int_x E_D[ ( h(x) - \bar{h}(x) )^2 ] p(x) dx,   where \bar{h}(x) = E_D[h(x)]

- Variance decreases with simpler models.
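The two quantities above can be estimated by Monte Carlo simulation: repeatedly sample a dataset D, learn h, and average over the learned hypotheses. Below is a minimal sketch in plain Python; the ground truth t(x) = x^2, the Gaussian noise level, and the deliberately too-simple constant-fit learner are all assumptions chosen for illustration, not from the slides.

```python
import random

random.seed(0)

def t(x):
    """Hypothetical ground truth t(x) (an assumption for this sketch)."""
    return x * x

def sample_dataset(n=20, noise=0.3):
    """Draw one dataset D of (x, y) pairs with Gaussian label noise."""
    xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [t(x) + random.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_constant(xs, ys):
    """A deliberately too-simple learner: predict the mean of y everywhere."""
    m = sum(ys) / len(ys)
    return lambda x: m

x0 = 0.8                    # evaluate bias/variance at a single input point
preds = []
for _ in range(2000):       # resample D many times
    xs, ys = sample_dataset()
    h = fit_constant(xs, ys)
    preds.append(h(x0))

h_bar = sum(preds) / len(preds)                 # E_D[h(x0)]
bias_sq = (h_bar - t(x0)) ** 2                  # ( E_D[h(x0)] - t(x0) )^2
variance = sum((p - h_bar) ** 2 for p in preds) / len(preds)

# The too-simple model misses t badly (high bias) but is stable (low variance).
print(bias_sq, variance)
```

Averaging these pointwise quantities over x weighted by p(x) would give the integrals on the slides; the sketch evaluates them at one point to keep the code short.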
Bias-Variance Tradeoff

- The choice of hypothesis class introduces a learning bias.
- A more complex class: less bias and more variance.
Training error

- Given a dataset, choose a loss function (for example, L2 for regression).
- Training set error (0/1 loss for classification, L2 loss for regression):

    error_train = (1 / N_train) \sum_{i=1}^{N_train} I(y_i \neq h(x_i))

    error_train = (1 / N_train) \sum_{i=1}^{N_train} (y_i - w \cdot x_i)^2
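The two error definitions above translate directly into code. A small sketch in plain Python; the toy labels and predictions are made up purely to exercise the formulas.

```python
def zero_one_error(ys, preds):
    """Average 0/1 loss: fraction of points where y_i != prediction_i."""
    return sum(1 for y, p in zip(ys, preds) if y != p) / len(ys)

def squared_error(ys, preds):
    """Average L2 loss: mean of (y_i - prediction_i)^2."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Hypothetical labels and predictions, just to exercise the formulas
y_true = [1, 0, 1, 1]
y_hat = [1, 1, 1, 0]
print(zero_one_error(y_true, y_hat))    # 2 of 4 mismatch -> 0.5
```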
Training error as a function of complexity

[Figure omitted: plot of training error vs. model complexity]
Prediction error

- Training error is not necessarily a good measure of performance.
- We care about the error over all input points:

    error_true = E_x[ I(y \neq h(x)) ]
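Because error_true is an expectation over the input distribution, it can be approximated by sampling x. A sketch under stated assumptions: the classifier h, the true labelling rule, and the Uniform(-1, 1) input distribution below are all hypothetical, chosen so the exact answer is easy to check by hand.

```python
import random

rng = random.Random(0)

def h(x):
    """Hypothetical learned classifier (an assumption for this sketch)."""
    return 1 if x > 0.1 else 0

def label(x):
    """Hypothetical true labelling rule (an assumption for this sketch)."""
    return 1 if x > 0 else 0

# Monte Carlo estimate of error_true = E_x[ I(y != h(x)) ], x ~ Uniform(-1, 1)
n = 100_000
errors = 0
for _ in range(n):
    x = rng.uniform(-1.0, 1.0)
    if label(x) != h(x):
        errors += 1

# h and the truth disagree only on (0, 0.1], which has probability 0.05 here
print(errors / n)
```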
Prediction error as a function of complexity

[Figure omitted: plot of prediction error vs. model complexity]
Prediction error (continued)

- Training error is an optimistically biased estimate of the prediction error, because the parameters were optimized with respect to the training set.
Train-test

- In practice: randomly divide the dataset into a training set and a test set.
- Use the training data to optimize the parameters.
- Test error:

    error_test = (1 / N_test) \sum_{i=1}^{N_test} I(y_i \neq h(x_i))
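The random division into train and test can be sketched in a few lines of plain Python; the 25% test fraction and the fixed seed are arbitrary choices for illustration.

```python
import random

def train_test_split(data, test_frac=0.25, seed=0):
    """Randomly partition data into disjoint train and test subsets."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    n_test = int(len(data) * test_frac)
    test = [data[i] for i in idx[:n_test]]
    train = [data[i] for i in idx[n_test:]]
    return train, test

data = list(range(100))
train, test = train_test_split(data)
print(len(train), len(test))    # 75 25
```

Parameters are then fit on the train split only, and error_test is computed on the test split, which the learner never saw.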
Test error as a function of complexity

[Figure omitted: plot of test error vs. model complexity]
Overfitting

- Overfitting happens when we obtain a model h when there exists another solution h' such that:

    [ error_train(h) < error_train(h') ] ∧ [ error_true(h) > error_true(h') ]
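This definition can be exhibited concretely: a model h that memorizes its (noisy) training labels beats a simple threshold rule h' on training error, yet loses on held-out data. The data-generating process below (a threshold rule with 20% label noise) is an assumption chosen for illustration, not from the slides.

```python
import random

rng = random.Random(1)

def sample(n):
    """Labels follow a threshold rule, with 20% of labels flipped (noise)."""
    pts = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x > 0 else 0
        if rng.random() < 0.2:
            y = 1 - y
        pts.append((x, y))
    return pts

train, test = sample(200), sample(2000)

memo = dict(train)                  # h: memorize every training point
def h(x):
    return memo.get(x, 0)           # unseen x -> arbitrary default class

def h_prime(x):
    return 1 if x > 0 else 0        # h': the simple threshold rule

def err(model, data):
    return sum(1 for x, y in data if model(x) != y) / len(data)

print(err(h, train), err(h_prime, train))   # h "wins" on the training set
print(err(h, test), err(h_prime, test))     # h' wins on unseen data
```

Here h fits the noise exactly (zero training error) while h' accepts roughly 20% training error, yet h' has far lower true error, which is exactly the conjunction in the definition above.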
Error as a function of data size for fixed complexity

[Figure omitted: plot of error vs. training set size at fixed model complexity]
Careful

- The test error is an unbiased estimate only if you never do any learning on the test set (including parameter selection!).