SLIDE 1

Applications of metric evaluation

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

Kevin Huo

Instructor

SLIDE 2

Four categories of outcomes

The first part of the category (true/false) represents whether the model was correct. The second part (positive/negative) represents the target label the model applied.

SLIDE 3

Interpretations of four categories

If the model predicts a click, there is a bid for that impression, which costs money. If no click is predicted, there is no bidding and hence no cost.

True positives (TP): money gained (impressions paid for that were clicked on).
False positives (FP): money lost (impressions that were paid for, but not clicked).
True negatives (TN): money saved (no click predicted, so no impression bought).
False negatives (FN): money missed (no click predicted, but there would have been an actual click).

SLIDE 4

Confusion matrix

from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_test, y_pred))
[[8163  166]
 [1517  154]]

# Order: tn, fp, fn, tp
print(confusion_matrix(y_test, y_pred).ravel())
[8163  166 1517  154]
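The four counts can be recovered by unpacking the raveled matrix directly. A minimal sketch with made-up labels (the `y_test`/`y_pred` values here are illustrative, not the course's data):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = click, 0 = no click
y_test = [0, 0, 1, 1, 0]
y_pred = [0, 1, 1, 0, 0]

# ravel() flattens the 2x2 matrix in tn, fp, fn, tp order
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
```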

SLIDE 5

ROI analysis

Assume some cost c per impression bought and some return r per impression clicked

total_return = tp * r
total_spent = (tp + fp) * c

# Profitable when return exceeds spend: tp * r > (tp + fp) * c
roi = total_return / total_spent
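Putting the two slides together, a minimal end-to-end sketch; the labels and the per-impression cost `c` and per-click return `r` are made-up illustrative values:

```python
from sklearn.metrics import confusion_matrix

c, r = 0.10, 2.00  # assumed cost per impression and return per click

y_test = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1]
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

total_return = tp * r        # revenue from paid impressions that were clicked
total_spent = (tp + fp) * c  # every predicted click triggers a paid bid
roi = total_return / total_spent
```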

SLIDE 6

Let's practice!

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

SLIDE 7

Model evaluation

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

Kevin Huo

Instructor

SLIDE 8

Precision and recall

Precision: the proportion of clicks relative to the total number of impressions bought, TP / (TP + FP). Higher precision means higher ROI on ad spend.

Recall: the proportion of clicks gotten out of all clicks available, TP / (TP + FN). Higher recall means better targeting of the relevant audience.

SLIDE 9

Calculating precision and recall

from sklearn.metrics import precision_score, recall_score

print(precision_score(y_test, y_pred, average='weighted'))
0.73
print(recall_score(y_test, y_pred, average='weighted'))
0.75
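With `average='weighted'` the scores are averaged over both classes; for the click class alone they reduce to the ratios from the previous slide. A small sanity check on made-up labels, using the default binary average for clarity:

```python
from sklearn.metrics import precision_score, recall_score

y_test = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1]

# For the positive (click) class: tp = 2, fp = 1, fn = 1
precision = precision_score(y_test, y_pred)  # tp / (tp + fp)
recall = recall_score(y_test, y_pred)        # tp / (tp + fn)
```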

SLIDE 10

Baseline classifiers

It is important to evaluate classifiers relative to an appropriate baseline. The baseline here, given the imbalanced nature of click data, is a classifier that always predicts no click.

import numpy as np

y_pred = np.asarray([0 for x in range(len(X_test))])
print(y_pred)
[0 0 0 ...]

SLIDE 11

Implications on ROI analysis

For the baseline classifier, tp and fp will be zero. Therefore total return and total spend will both be zero, and ROI is undefined. Use confusion_matrix() along with ravel() to get the four categories of outcomes.

total_return = tp * r
total_spent = (tp + fp) * cost

# For the baseline, total_spent is zero, so this division fails: ROI undefined
roi = total_return / total_spent
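A quick sketch of why the ROI analysis breaks down for the baseline; the labels are hypothetical, with the imbalance typical of click data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_test = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])  # mostly no-clicks
y_pred = np.zeros(len(y_test), dtype=int)           # baseline: always "no click"

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
# tp and fp are both zero, so total_spent is zero and ROI is undefined
```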

SLIDE 12

Let's practice!

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

SLIDE 13

Tuning models

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

Kevin Huo

Instructor

SLIDE 14

Regularization

Regularization: addressing overfitting by altering the magnitude of the coefficients of parameters within a model. Regularization can increase performance metrics and hence ROI on ad spend.

SLIDE 15

Examples of regularization

Logistic Regression: the C parameter is the inverse of the regularization strength. From least to most complex: C=0.05 < C=0.5 < C=1.

Decision Tree: the max_depth parameter controls how many layers deep the tree can grow. From least to most complex: max_depth=3 < max_depth=5 < max_depth=10.
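The two complexity ladders above can be sketched as scikit-learn estimators:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Smaller C = stronger regularization = less complex model
log_regs = [LogisticRegression(C=c) for c in (0.05, 0.5, 1.0)]

# Shallower tree = less complex model
trees = [DecisionTreeClassifier(max_depth=d) for d in (3, 5, 10)]
```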

SLIDE 16

Cross validation

For each of the k folds, that fold is used as the testing set (for validation) while the other k-1 folds are used for training.

This gives you k evaluations of model performance. Note that you still have the separate held-out test set for final evaluation.

slide-17
SLIDE 17

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

Examples of cross validation

from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# shuffle=True is required when passing random_state to KFold
k_fold = KFold(n_splits=4, shuffle=True, random_state=0)
for i in [3, 5, 10]:
    clf = DecisionTreeClassifier(max_depth=i)
    cv_precision = cross_val_score(clf, X_train, y_train,
                                   cv=k_fold, scoring='precision_weighted')

Scoring strings: precision_weighted, recall_weighted, roc_auc
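A self-contained, runnable version of the loop; `make_classification` stands in for the course's click dataset, which is an assumption on my part:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Imbalanced synthetic data standing in for click records
X_train, y_train = make_classification(n_samples=200, weights=[0.85],
                                       random_state=0)

k_fold = KFold(n_splits=4, shuffle=True, random_state=0)
for depth in [3, 5, 10]:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    scores = cross_val_score(clf, X_train, y_train,
                             cv=k_fold, scoring='precision_weighted')
    # scores holds one precision value per fold: k = 4 evaluations per model
```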

SLIDE 18

Let's practice!

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

SLIDE 19

Ensembles and hyperparameter tuning

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON

Kevin Huo

Instructor

SLIDE 20

Ensemble methods

Bagging: random samples are selected for different models, which are then individually trained and combined.
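A minimal bagging sketch using scikit-learn's `BaggingClassifier` on synthetic data (not the course's dataset):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

# Each tree trains on its own bootstrap sample; predictions are combined by vote
bag = BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                        n_estimators=20, random_state=0)
bag.fit(X, y)
```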

SLIDE 21

Random forests

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()
print(clf)
RandomForestClassifier(bootstrap=True, ...
                       max_depth=10, ...
                       n_estimators=100, ...)
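Fitting a random forest and reading off click probabilities might look like this; the synthetic data is an assumed stand-in for the course's click records:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Imbalanced synthetic stand-in for click data
X, y = make_classification(n_samples=300, weights=[0.9], random_state=0)

clf = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=0)
clf.fit(X, y)
probs = clf.predict_proba(X)[:, 1]  # estimated click probability per impression
```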

SLIDE 22

Hyperparameter tuning

Hyperparameter: a parameter configured before training, external to the model.

Examples of parameters but NOT hyperparameters: the slope coefficients in linear regression, the weights in logistic regression, etc. Examples of hyperparameters: max_depth, n_estimators, etc.

SLIDE 23

Grid search

from sklearn.model_selection import GridSearchCV

param_grid = {'n_estimators': n_estimators, 'max_depth': max_depth}
clf = GridSearchCV(estimator=model, param_grid=param_grid, scoring='roc_auc')
clf.fit(X_train, y_train)

print(clf.best_score_)
0.6777
print(clf.best_estimator_)
RandomForestClassifier(max_depth=100, ...)
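A runnable version of the grid search; the synthetic data and the small grid are assumptions to keep the sketch self-contained and fast:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X_train, y_train = make_classification(n_samples=200, weights=[0.85],
                                       random_state=0)

param_grid = {'n_estimators': [10, 50], 'max_depth': [3, 5]}
clf = GridSearchCV(estimator=RandomForestClassifier(random_state=0),
                   param_grid=param_grid, scoring='roc_auc', cv=3)
clf.fit(X_train, y_train)
# best_params_ and best_score_ describe the winning grid combination
```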

SLIDE 24

Let's practice!

PREDICTING CTR WITH MACHINE LEARNING IN PYTHON