How to train an image classifier using PyTorch (Rogier van der Geer)



SLIDE 1

How to train

an image classifier

using PyTorch

Rogier van der Geer -- GoDataDriven
SLIDE 2
— What is an image classifier?
— What is a neural network?
— How do you build one in PyTorch?
— What can you do with them?
SLIDE 3 Labelled training data set
SLIDE 4 Simple classifier
SLIDE 5 Unlabelled data set
SLIDE 6 Classifications based on classifier
SLIDE 7 Neural network
SLIDE 8 Neuron
SLIDE 9 Neuron
Rectified linear unit: f(x) = max(0, x)
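A minimal sketch of a single neuron with a rectified linear unit, in plain Python. The slide only names the activation; the weighted-sum-plus-bias form is the standard textbook neuron and is assumed here, and all example values are hypothetical.

```python
def relu(x: float) -> float:
    """Rectified linear unit: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by the ReLU activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(weighted_sum)


# Hypothetical example values, chosen only for illustration.
active = neuron(inputs=[1.0, 2.0], weights=[0.5, 0.25], bias=-0.5)   # weighted sum is positive
inactive = neuron(inputs=[1.0, 2.0], weights=[-1.0, 0.0], bias=0.0)  # weighted sum is negative
```

A network is many of these neurons wired together, with the weights and biases being the values that training adjusts.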
SLIDE 10 Neural network
SLIDE 11 Image classification "Dog" "Cat"
SLIDE 12

deep

convolutional

networks

SLIDE 13 deep neural network
SLIDE 14 convolutional neural network
SLIDE 15 VGG16: 16 layers, ~ 138 million weights
SLIDE 16 ImageNet
— ~ 14 million images in total
— benchmark subset (ILSVRC): ~ 1.2 million images annotated into 1000 classes
VGG16: ~ 90% top-5 accuracy on 1000 classes
SLIDE 17 Transfer learning
SLIDE 18 Transfer learning
SLIDE 19 Why PyTorch not Keras?
— Keras was there first
— PyTorch is more flexible
— Keras is faster
— PyTorch lets you play with the internals

You learn more from PyTorch
SLIDE 20 PyTorch: define a model
from torch import nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.fc1 = nn.Linear(18 * 16 * 16, 64)
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):             # Input: 3 channels, 32x32
        x = F.relu(self.conv(x))      # Converts to 18 channels, 32x32
        x = self.pool(x)              # Pooling reduces to 18 channels, 16x16
        x = x.view(-1, 18 * 16 * 16)  # Reshape to a 1D vector of size 4608
        x = F.relu(self.fc1(x))       # Apply first FC layer, output has size 64
        x = self.fc2(x)               # Apply second FC layer, output has size 10
        return x
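A quick way to check a model definition like this one is to push a random batch through it and look at the output shape. The sketch below repeats the Net class in compact form so it runs on its own; the batch size of 8 is an arbitrary choice.

```python
import torch
from torch import nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.fc1 = nn.Linear(18 * 16 * 16, 64)
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv(x)))   # 3x32x32 -> 18x32x32 -> 18x16x16
        x = x.view(-1, 18 * 16 * 16)          # flatten to 4608 values per image
        return self.fc2(F.relu(self.fc1(x)))  # 4608 -> 64 -> 10


model = Net()
batch = torch.randn(8, 3, 32, 32)  # 8 random "images", 3 channels, 32x32 pixels
outputs = model(batch)             # one row of 10 class scores per image
n_weights = sum(p.numel() for p in model.parameters())
```

Nearly all of the weights of this small network sit in the first fully connected layer (4608 x 64), which is typical for networks that flatten a feature map into a dense layer.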
SLIDE 21 PyTorch: loading a pre-trained model
from torchvision.models import squeezenet1_0  # Or VGG
model = squeezenet1_0(pretrained=True)
SLIDE 22
from torchvision.models import squeezenet1_0
print(squeezenet1_0(pretrained=True))

SqueezeNet(
  (features): Sequential(
    (0): Conv2d(3, 96, kernel_size=(7, 7), stride=(2, 2))
    (1): ReLU(inplace)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
    ...
  )
  (classifier): Sequential(
    (0): Dropout(p=0.5)
    (1): Conv2d(512, 1000, kernel_size=(1, 1), stride=(1, 1))
    (2): ReLU(inplace)
    (3): AvgPool2d(kernel_size=13, stride=1, padding=0)
  )
)
SLIDE 23 PyTorch: pre-trained model
from torch import nn
from torchvision.models import squeezenet1_0

n_classes = 4
model = squeezenet1_0(pretrained=True)
model.num_classes = n_classes
model.classifier[1] = nn.Conv2d(512, n_classes, kernel_size=(1, 1), stride=(1, 1))
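Replacing the last layer is often combined with freezing the pre-trained layers, so that only the new layer is trained at first. The slides do not show this step, so the following is a sketch of one common approach, not the speaker's method. To avoid downloading weights, it uses a hypothetical stand-in class TinyNet that mimics the features/classifier layout printed above.

```python
import torch
from torch import nn, optim

n_classes = 4


class TinyNet(nn.Module):
    """Hypothetical stand-in for a pre-trained network with a SqueezeNet-like layout."""

    def __init__(self, n_outputs=1000):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU())
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Conv2d(8, n_outputs, kernel_size=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):
        return self.classifier(self.features(x)).flatten(1)


model = TinyNet()

# Freeze every existing weight so training leaves it untouched.
for param in model.parameters():
    param.requires_grad = False

# Replace the final convolution; a freshly created layer is trainable by default.
model.classifier[1] = nn.Conv2d(8, n_classes, kernel_size=1)

# Only hand the trainable parameters (weight and bias of the new layer) to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=1e-3, momentum=0.9)

scores = model(torch.randn(2, 3, 16, 16))  # one row of n_classes scores per image
```

Whether to keep the pre-trained layers frozen or to fine-tune them later with a small learning rate depends on how much labelled data is available.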
SLIDE 24 Train your model
from torch import nn, optim

model.train()  # Set your model to training mode
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1E-3, momentum=0.9)
for inputs, labels in loader:          # Multiple images at once
    optimizer.zero_grad()              # Reset the optimizer
    outputs = model(inputs)            # Forward pass
    loss = criterion(outputs, labels)  # Compute the loss
    loss.backward()                    # Backward pass
    optimizer.step()                   # Optimize the weights

One loop through all training images is an epoch.
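To see the loop run end to end over several epochs, here is a self-contained sketch on synthetic data. The random tensors, the tiny linear model, the learning rate and the number of epochs are all assumptions made for illustration; they stand in for real images and a real network.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 64 random "images" of 3x8x8 with labels from 4 classes.
inputs = torch.randn(64, 3, 8, 8)
labels = torch.randint(0, 4, (64,))
loader = DataLoader(TensorDataset(inputs, labels), batch_size=16, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

epoch_losses = []
for epoch in range(5):                      # one pass over the loader is one epoch
    model.train()
    running_loss = 0.0
    for batch_inputs, batch_labels in loader:
        optimizer.zero_grad()               # clear gradients from the previous step
        loss = criterion(model(batch_inputs), batch_labels)
        loss.backward()                     # compute gradients
        optimizer.step()                    # update the weights
        running_loss += loss.item() * batch_inputs.size(0)
    epoch_losses.append(running_loss / len(loader.dataset))
```

Tracking the mean loss per epoch, as epoch_losses does here, is the simplest way to see whether training is making progress.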
SLIDE 25 Evaluation
from torch import max, no_grad

model.eval()  # Set model to evaluation mode: disable dropout etc
loss = 0
with no_grad():
    for inputs, labels in loader:
        outputs = model(inputs)
        _, predictions = max(outputs.data, dim=1)  # Returns (values, indices)
        loss += criterion(outputs, labels)
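The evaluation loop can be wrapped in a function that also reports accuracy. This is a sketch under two assumptions that are not on the slide: the loss is weighted by batch size so the result is a mean per sample, and accuracy is the fraction of correct top-1 predictions. The function name evaluate and the hand-made scores are hypothetical.

```python
import torch
from torch import nn


def evaluate(model, loader, criterion):
    """Return (mean loss per sample, accuracy) over all batches in the loader."""
    model.eval()
    total_loss, n_correct, n_samples = 0.0, 0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            outputs = model(inputs)
            predictions = outputs.argmax(dim=1)  # index of the highest score per row
            # CrossEntropyLoss averages over the batch by default, so weight by batch size.
            total_loss += criterion(outputs, labels).item() * labels.size(0)
            n_correct += (predictions == labels).sum().item()
            n_samples += labels.size(0)
    return total_loss / n_samples, n_correct / n_samples


# Tiny hand-made check: an identity "model" whose outputs are the inputs themselves.
scores = torch.tensor([[2.0, 0.0], [0.0, 3.0], [1.0, 0.0], [0.0, 1.0]])
targets = torch.tensor([0, 1, 1, 1])
loss, accuracy = evaluate(nn.Identity(), [(scores, targets)], nn.CrossEntropyLoss())
```

In this check three of the four rows have their highest score at the target index, so the accuracy comes out at 0.75.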
SLIDE 26 Loading data: the dataset
from torchvision import transforms
from torchvision.datasets import ImageFolder

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
test_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor()
])
train_set = ImageFolder(path_to_train_images, transform=train_transform)
test_set = ImageFolder(path_to_test_images, transform=test_transform)
SLIDE 27 Loading data: the loader
from torch.utils.data import DataLoader

train_loader = DataLoader(
    dataset=train_set,
    batch_size=32,
    num_workers=4,
    shuffle=True,
)
test_loader = DataLoader(
    dataset=test_set,
    batch_size=32,
    num_workers=4,
    shuffle=True,
)
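What a DataLoader actually yields is easiest to see on a small synthetic dataset. The sketch below swaps the image folder for a TensorDataset of blank tensors (an assumption, so it runs without image files) and uses num_workers=0 to keep everything in one process.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for an image folder: 70 blank 3x32x32 "images" with 10 classes.
images = torch.zeros(70, 3, 32, 32)
labels = torch.arange(70) % 10
dataset = TensorDataset(images, labels)

train_loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=0)
test_loader = DataLoader(dataset, batch_size=32, shuffle=False, num_workers=0)

# 70 samples in batches of 32: two full batches and one batch with the remaining 6.
batch_sizes = [batch_images.shape[0] for batch_images, _ in train_loader]

# Without shuffling, the first batch is simply the first 32 samples in order.
first_test_labels = next(iter(test_loader))[1]
```

Shuffling matters for training, where batches should differ between epochs. For evaluation the order has no effect on the metrics, so shuffle=False is used for the test loader here as a design choice.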
SLIDE 28 Learning rate
Remember our optimizer:
optimizer = SGD(model.parameters(), lr=1E-3, momentum=0.9)
Here lr is our learning rate, the rate at which we change the weights when training. What is a good value?
SLIDE 32 A learning rate sweep
import numpy as np

def set_learning_rate(optimizer, learning_rate):
    for param_group in optimizer.param_groups:
        param_group['lr'] = learning_rate

# np.logspace takes exponents: min_lr and max_lr are powers of 10
learning_rates = np.logspace(min_lr, max_lr, num=n_steps)
results = []
for learning_rate in learning_rates:
    set_learning_rate(optimizer, learning_rate)
    train_batches(...)
    results.append(evaluate_batches(...))
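A runnable version of the sweep on a toy problem might look as follows. The synthetic data, the linear model, the range of exponents and the ten training steps per learning rate are all assumptions; the inner loops stand in for the elided train_batches and evaluate_batches helpers.

```python
import numpy as np
import torch
from torch import nn, optim


def set_learning_rate(optimizer, learning_rate):
    for param_group in optimizer.param_groups:
        param_group['lr'] = learning_rate


torch.manual_seed(0)
inputs = torch.randn(256, 10)                    # toy data: 256 samples, 10 features
targets = (inputs.sum(dim=1) > 0).long()         # toy labels: sign of the feature sum

model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# np.logspace takes exponents: -5 and 0 give six rates from 1e-5 up to 1.
learning_rates = np.logspace(-5, 0, num=6)
results = []
for learning_rate in learning_rates:
    set_learning_rate(optimizer, float(learning_rate))
    for _ in range(10):                          # stand-in for train_batches(...)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    with torch.no_grad():                        # stand-in for evaluate_batches(...)
        results.append(criterion(model(inputs), targets).item())
```

Plotting results against learning_rates on a logarithmic axis shows where the loss starts to drop and where it becomes unstable. Note that the model keeps training across the sweep, so each point also benefits from the steps taken at earlier rates.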
SLIDE 33 Learning rate sweep plot
SLIDE 34 Learning rate scheduler
from torch.optim.lr_scheduler import ReduceLROnPlateau

scheduler = ReduceLROnPlateau(optimizer, factor=0.5, patience=25)

After every training epoch:
scheduler.step(test_loss)
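The effect of the scheduler can be seen without training anything by feeding it a made-up sequence of test losses. In this sketch the losses are hypothetical and patience is lowered to 2 (the slide uses 25) so that the reduction shows up within a few epochs.

```python
from torch import nn, optim
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = nn.Linear(2, 2)
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
scheduler = ReduceLROnPlateau(optimizer, factor=0.5, patience=2)

# Hypothetical test losses: improving for three epochs, then stuck at 0.7.
test_losses = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7]
history = []
for test_loss in test_losses:
    # A real loop would train for one epoch here before stepping the scheduler.
    scheduler.step(test_loss)
    history.append(optimizer.param_groups[0]['lr'])
```

The learning rate stays at its initial value while the loss improves and is multiplied by factor once the loss has failed to improve for more than patience consecutive epochs.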
SLIDE 35 Learning rate step plot
SLIDE 36

Data

SLIDE 38 Data set
Photos taken in the world's largest cities
— 72 cities
— ~ 0.5M images
— 10k photographers
— ~ 30 GB
— licensed for reuse
SLIDE 42
— Amsterdam
SLIDE 43
— Amsterdam
— Dublin
SLIDE 44 Dublin, Terminal 2, Amsterdam, Schiphol, Seoul, Incheon, Taipei, Taoyuan, Hong Kong, Airport, Citygate, Aer Lingus, KLM, Korean Air, Eva Air, Cathay Pacific, Jeju, Gimpo, Hyatt Regency, Grand Hyatt, The Sherwood Hotel, Regent Hotel, Park Hyatt, Intercontinental, COEX, Taipei 101, Elite Concepts, cars, ICC, Ritz Carlton, W Hotel Hong Kong, breakfast, lunch, dinner, room service, french toast, ice cream, birthday, Mercedes, Hyundai, Kia, BMW, Bentley, Bongeunsa, Buddhist temple, Shilla, Lotte, cocktails, Taxis, transport, traffic, landmark, watch, bed, bathroom, suite, rooms, facades, architecture, street art, candid, men, girls, people, Jungmun beach, Teddy Bear Museum, Grand Club, Regency Club, irish love...
SLIDE 47
— find median latitude and longitude
— remove all images more than ~ 5 km away
— repeat for all cities
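The filtering steps above can be sketched in plain Python. The slides do not say how distance was measured, so the haversine great-circle formula is an assumption here, as are the function names and the example coordinates.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def filter_city(photos, max_distance_km=5.0):
    """Keep only photos within max_distance_km of the median latitude/longitude."""
    centre_lat = median(lat for lat, _ in photos)
    centre_lon = median(lon for _, lon in photos)
    return [
        (lat, lon) for lat, lon in photos
        if haversine_km(lat, lon, centre_lat, centre_lon) <= max_distance_km
    ]


# Hypothetical coordinates: three photos in central Amsterdam, one mistagged photo in Dublin.
photos = [(52.370, 4.895), (52.373, 4.893), (52.368, 4.900), (53.349, -6.260)]
kept = filter_city(photos)
```

Using the median rather than the mean keeps the centre close to the bulk of the photos even when a few are tagged with the wrong city. Repeating the call per city covers the last step of the list.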
SLIDE 48 Other tags
city, street, sony, square, belgium, squareformat, architecture, london, photography, australia, brussels, 2016, art, urban, tokyo, bruxelles, japan, park, berlin, paris, night, travel, 2018, sky, ilce6500, sonyilce6500, california, sydney, streetphotography, nikon, chicago, people, building, belgique, spain, de, new, barcelona, nyc, losangeles, 2015, music, highiso, europe, museum, usa, amsterdam, concert, toronto, , england, skyline, bxl, bru, france, switzerland, , live, manhattan, canada, downtown, photoderue, sport, outdoor, china, rome, uk
SLIDE 50 Top 10 most common cities
city            # images
london              1677
new york city       1320
chicago              909
toronto              521
sydney               203
los angeles          201
tokyo                191
philadelphia         175
houston              173
shanghai             151
SLIDE 51 Top 10 most common cities
city            train images   test images
london                  1509           168
new york city           1188           132
chicago                  818            91
toronto                  469            52
sydney                   183            20
los angeles              182            19
tokyo                    172            19
philadelphia             157            18
houston                  157            16
shanghai                 136            15
SLIDE 52

wait...

or get a fast GPU
SLIDE 54

London

SLIDE 56

Sydney

SLIDE 58

Toronto

SLIDE 60

Los Angeles

SLIDE 62

Chicago

SLIDE 64

Philadelphia

SLIDE 66

Tokyo

SLIDE 68

Houston

SLIDE 70

Shanghai

SLIDE 72

Chicago

SLIDE 73

Chicago

what?

SLIDE 74 More mistagged images
Train set
Test set
SLIDE 75

Plan

SLIDE 76 Assign photographers to train/test splits
city            train images   test images   train photographers   test photographers
london                  1509           168                   161                   18
new york city           1188           132                   253                   26
chicago                  818            91                   170                   19
toronto                  469            52                    90                   11
sydney                   183            20                    54                    7
los angeles              182            19                    50                    5
tokyo                    172            19                    37                    4
philadelphia             157            18                    30                    4
houston                  157            16                    24                    3
shanghai                 136            15                    38                    4
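Splitting by photographer rather than by image keeps all photos from one person on the same side of the split, so near-identical shots cannot leak from train to test. A minimal sketch in plain Python follows; the function name, the 10% test fraction (roughly what the table suggests) and the example data are assumptions.

```python
import random
from collections import defaultdict


def split_by_photographer(photos, test_fraction=0.1, seed=0):
    """Split (photographer_id, photo_id) pairs so no photographer appears in both sets."""
    by_photographer = defaultdict(list)
    for photographer_id, photo_id in photos:
        by_photographer[photographer_id].append(photo_id)

    # Shuffle the photographers (not the photos) and reserve a fraction for testing.
    photographers = sorted(by_photographer)
    random.Random(seed).shuffle(photographers)
    n_test = max(1, round(len(photographers) * test_fraction))
    test_photographers = set(photographers[:n_test])

    train = [(p, photo) for p, photo in photos if p not in test_photographers]
    test = [(p, photo) for p, photo in photos if p in test_photographers]
    return train, test


# Hypothetical data: 20 photographers with 5 photos each.
photos = [(f"photographer_{i}", f"photo_{i}_{j}") for i in range(20) for j in range(5)]
train, test = split_by_photographer(photos)
```

Because photographers contribute different numbers of photos, the image counts per split only approximate the requested fraction. In a per-city setup the function would be applied to each city's photos separately.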
SLIDE 77

wait...

SLIDE 78 Result
— Awful performance:
  — train: ~ 90% accuracy
  — test: ~ 50% accuracy
— Very overtrained!
— Too few photographers per city
— Too many mistagged photos
SLIDE 79

Another

Plan

SLIDE 80 Other plan
1. Build a model:
   — Classes skyline and no skyline
2. Train on all data
   — Labels: has skyline tag or not
3. Make predictions for all data
4. Only use data with positive prediction
SLIDE 81

wait...

SLIDE 82 Result
            prediction: no skyline   prediction: skyline
no tag                      467452                  1070
with tag                      1181                  6204
Now: re-create train/test split
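The four counts in this table can be turned into summary metrics. The sketch below treats the skyline tag as the reference label, which is only an approximation: the talk's premise is that tags are noisy, so some "false positives" may well be untagged skylines.

```python
# Counts copied from the result table on slide 82.
true_negatives = 467452   # no tag, predicted no skyline
false_positives = 1070    # no tag, predicted skyline
false_negatives = 1181    # with tag, predicted no skyline
true_positives = 6204     # with tag, predicted skyline

total = true_negatives + false_positives + false_negatives + true_positives

# Of all images predicted as skyline, the fraction that carries the tag.
precision = true_positives / (true_positives + false_positives)
# Of all tagged images, the fraction predicted as skyline.
recall = true_positives / (true_positives + false_negatives)
accuracy = (true_positives + true_negatives) / total

# Images kept for the next step: everything with a positive prediction.
kept_images = true_positives + false_positives
```

Precision and recall both come out at roughly 0.85 and 0.84, while accuracy is above 0.99. The accuracy figure mostly reflects how rare skylines are in the full data set, which is why the per-class numbers are the more informative ones here.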
SLIDE 83

wait...

SLIDE 84 Yet more results
SLIDE 85 Chicago
SLIDE 86 Los Angeles
SLIDE 87 prediction: New York City, label: Philadelphia
SLIDE 88 prediction: London, label: Toronto
SLIDE 89 prediction: Philadelphia, label: Shanghai
SLIDE 90 Final remarks
— Training an image classifier is not that difficult
— PyTorch is fun!
— Clean data is more important than a better model
SLIDE 91

Thank you!

https://gitlab.com/rogiervandergeer/skylines

https://blog.godatadriven.com
SLIDE 92

Appendix