Ready, Set, Go! Using TensorFlow to prototype, train, and productionalize your models

SLIDE 1
SLIDE 2

Ready, Set, Go!

Using TensorFlow to prototype, train, and productionalize your models Karmel Allison

SLIDE 3

Building a model is a multi-stage process.

SLIDE 4

Setting the stage

Data: Covertype (USFS + CSU)
Task: Classify wilderness area (4 classes)
Number of examples: ~500K
Features:

  • Real: elevation, slope, etc.
  • Binned: hillshade given hour (0 - 255)
  • Categorical: soil type, tree cover type
SLIDE 5

Setting the stage

2596,51,3,258,0,510,221,232,148,6279,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
2590,56,2,212,-6,390,220,235,151,6225,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5

SLIDE 6

Setting the stage

2596,51,3,258,0,510,221,232,148,6279,1,0,0,0,0,0,0,0,0,0,0,0 ,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0 ,0,0,5

# Elevation, Aspect, Slope, Horizontal_Distance_To_Hydrology, Vertical_Distance_To_Hydrology, Horizontal_Distance_To_Roadways, Hillshade_9am, Hillshade_Noon, Hillshade_3pm, Horizontal_Distance_To_Fire_Points, Wilderness_Area (4), Soil_Type (40), Cover_Type
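The header comment implies 10 numeric columns, 4 one-hot wilderness-area columns, 40 one-hot soil-type columns, and 1 cover-type column, 55 values per row. A minimal plain-Python sketch (no TensorFlow) of how those widths become slice offsets; `LAYOUT` and `column_slices` are illustrative names, not part of the talk:

```python
# Column widths read off the header comment on the slide.
LAYOUT = [
    ("numeric", 10),         # Elevation ... Horizontal_Distance_To_Fire_Points
    ("wilderness_area", 4),  # one-hot; used as the class label in this talk
    ("soil_type", 40),       # one-hot
    ("cover_type", 1),       # integer code
]

def column_slices(layout):
    """Map each field name to the (start, stop) range it occupies in a row."""
    slices, start = {}, 0
    for name, width in layout:
        slices[name] = (start, start + width)
        start += width
    return slices

slices = column_slices(LAYOUT)
total_columns = slices["cover_type"][1]
print(slices)
print(total_columns)
```

These offsets are the same index ranges the parsing slides use later: 10:14 for the label, 14:54 for the soil type, and 54 for the cover type.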

SLIDE 7

Prototype your model

import tensorflow as tf
tf.enable_eager_execution()

SLIDE 8

Eager execution is immediate

a = tf.constant(5)
b = a * 3
print(b)
<tf.Tensor: id=3, shape=(), dtype=int32, numpy=15>

[Diagram: a = Const (val: 5) and Const (val: 3) feed a Mul op, which yields b = 15]
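To make "immediate" concrete, here is a toy plain-Python contrast between deferred (graph-style) and immediate (eager-style) evaluation. The `Deferred` class is a hypothetical stand-in for a graph node and is not how TensorFlow implements graphs:

```python
class Deferred:
    """Toy graph node: records an operation instead of running it."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def run(self):
        # Evaluate upstream nodes first, then apply this node's function.
        args = [i.run() if isinstance(i, Deferred) else i for i in self.inputs]
        return self.fn(*args)

# Graph style: building b_node computes nothing yet.
a_node = Deferred(lambda: 5)
b_node = Deferred(lambda x, y: x * y, a_node, 3)
graph_result = b_node.run()  # the value exists only after an explicit run

# Eager style: the value exists as soon as the line executes.
a = 5
b = a * 3
print(graph_result, b)
```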

SLIDE 9

Loading data

defaults = [tf.int32] * 55
dataset = tf.contrib.data.CsvDataset(
    ['covtype.csv.train'], defaults)

SLIDE 10

Loading data

defaults = [tf.int32] * 55
dataset = tf.contrib.data.CsvDataset(
    ['covtype.csv.train'], defaults)
print(list(dataset.take(1)))
[(<tf.Tensor: id=188, shape=(), dtype=int32, numpy=2596>,
  <tf.Tensor: id=189, shape=(), dtype=int32, numpy=51>,
  <tf.Tensor: id=190, shape=(), dtype=int32, numpy=3>...

SLIDE 11

Parsing data

def _parse_csv_row(*vals):
    return features, class_label

SLIDE 12

Parsing data

def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    return features, class_label

SLIDE 13

Parsing data

col_names = ['elevation', 'aspect', 'slope'...]

def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    feat_vals = vals[:10] + (soil_type_t, vals[54])
    features = dict(zip(col_names, feat_vals))
    return features, class_label

SLIDE 14

Parsing data

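col_names = ['elevation', 'aspect', 'slope'...]

def _parse_csv_row(*vals):
    # Columns 14-53 are the 40 one-hot soil-type flags.
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    # 10 numeric columns, the soil-type vector, and the cover type (column 54).
    feat_vals = vals[:10] + (soil_type_t, vals[54])
    features = dict(zip(col_names, feat_vals))
    # Columns 10-13 are the one-hot wilderness area; argmax gives the class id.
    class_label = tf.argmax(vals[10:14], axis=0)
    return features, class_label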
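The same slicing logic can be checked without TensorFlow. The sketch below is a plain-Python analogue of the parsing function, not the author's code. The full `COL_NAMES` list is an assumption (lower-cased dataset header names, since the slide elides the list), and the sample row is modeled on the first row shown earlier in the deck:

```python
# Assumed names: the slide only shows 'elevation', 'aspect', 'slope'...
COL_NAMES = [
    "elevation", "aspect", "slope",
    "horizontal_distance_to_hydrology", "vertical_distance_to_hydrology",
    "horizontal_distance_to_roadways",
    "hillshade_9am", "hillshade_noon", "hillshade_3pm",
    "horizontal_distance_to_fire_points",
    "soil_type", "cover_type",
]

def parse_row(line):
    """Split one CSV line into (features, class_label)."""
    vals = tuple(int(v) for v in line.split(","))
    soil_type = list(vals[14:54])                  # 40 one-hot soil flags
    feat_vals = vals[:10] + (soil_type, vals[54])  # numeric + soil + cover
    features = dict(zip(COL_NAMES, feat_vals))
    one_hot = vals[10:14]                          # wilderness area flags
    class_label = one_hot.index(max(one_hot))      # plain-Python argmax
    return features, class_label

# Build a 55-value row shaped like the first sample row.
soil = [0] * 40
soil[28] = 1
row = [2596, 51, 3, 258, 0, 510, 221, 232, 148, 6279] + [1, 0, 0, 0] + soil + [5]
line = ",".join(str(v) for v in row)

features, class_label = parse_row(line)
print(class_label, features["elevation"], features["cover_type"])
```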

SLIDE 15

Parsing data

dataset = dataset.map(_parse_csv_row).batch(64)

SLIDE 16

Parsing data

dataset = dataset.map(_parse_csv_row).batch(64)
print(list(dataset.take(1)))
({'aspect': <tf.Tensor: shape=(64,), dtype=int32, array([ 47, ... 77, 184, 328])>,
  ...
  'soil_type': <tf.Tensor: shape=(64, 40), array([[0, 0, 0, ..., 0, 0, 0]...)>},
 <tf.Tensor: shape=(64,), dtype=int64, array([0, 0, 3, ... 1, 0, 2])>)
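The shapes in that output, (64,) for a scalar feature and (64, 40) for the soil-type vector, follow from grouping 64 parsed rows and stacking each feature column-wise. A plain-Python sketch of that idea; `batch` and `stack_features` are hypothetical helpers and the example rows are fabricated placeholders:

```python
from itertools import islice

def batch(iterable, batch_size):
    """Yield lists of up to batch_size consecutive items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk

def stack_features(examples):
    """Turn a list of (features, label) pairs into column-wise lists."""
    names = examples[0][0].keys()
    features = {name: [ex[0][name] for ex in examples] for name in names}
    labels = [ex[1] for ex in examples]
    return features, labels

# 150 placeholder rows: one scalar feature and one 40-wide vector feature.
examples = [({"aspect": i, "soil_type": [0] * 40}, i % 4) for i in range(150)]
batches = [stack_features(b) for b in batch(examples, 64)]

first_features, first_labels = batches[0]
print(len(batches), len(first_features["aspect"]), len(first_features["soil_type"][0]))
```

The last batch here holds only the 22 leftover rows, which is why a final partial batch is common unless the pipeline drops it.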

SLIDE 17

[Pipeline diagram]
  Raw CSV: 2596,51,3,258,0,510,221,232,148,6279,1,0,...
  Dataset: ({'aspect': <tf.Tensor: id=567, shape=...

SLIDE 18

Defining features

# Cover_Type / integer / 1 to 7
cover_type = tf.keras.feature_column.categorical_column_with_identity(
    'cover_type', num_buckets=8)

SLIDE 19

Defining features

# Cover_Type (7 types) / integer / 1 to 7
cover_type = tf.keras.feature_column.categorical_column_with_identity(
    'cover_type', num_buckets=8)
cover_embedding = tf.keras.feature_column.embedding_column(
    cover_type, dimension=10)
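An identity categorical column uses the integer value itself as the bucket index, so values 1 to 7 need 8 buckets (ids 0 through 7, with 0 unused). The embedding column then maps each id to a dense vector of length 10. A plain-Python sketch of that lookup, with random numbers standing in for learned weights; `embedding_table` and `embed` are illustrative names:

```python
import random

NUM_BUCKETS = 8   # ids 0..7; Cover_Type uses 1..7, so id 0 is never hit
DIMENSION = 10

random.seed(0)  # arbitrary seed so the toy table is reproducible
# One row per bucket; in a real model these rows are trained.
embedding_table = [[random.uniform(-1, 1) for _ in range(DIMENSION)]
                   for _ in range(NUM_BUCKETS)]

def embed(cover_type_ids):
    """Look up the dense vector for each integer category id."""
    for i in cover_type_ids:
        if not 0 <= i < NUM_BUCKETS:
            raise ValueError("id {} is outside [0, {})".format(i, NUM_BUCKETS))
    return [embedding_table[i] for i in cover_type_ids]

vectors = embed([5, 5, 2])
print(len(vectors), len(vectors[0]))
```

Equal ids always map to the same vector, which is what lets the model learn one representation per cover type.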

SLIDE 20

Defining features

numeric_features = [
    tf.keras.feature_column.numeric_column(feat)
    for feat in numeric_cols]

SLIDE 21

Defining features

numeric_features = [
    tf.keras.feature_column.numeric_column(feat)
    for feat in numeric_cols]

# Soil_Type (40 binary columns)
soil_type = tf.keras.feature_column.numeric_column(
    'soil_type', shape=(40,))

SLIDE 22

Defining features

columns = numeric_features + [soil_type, cover_embedding]
feature_layer = tf.keras.feature_column.FeatureLayer(columns)

Coming soon to a release near you!
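Conceptually, a feature layer turns each example's dictionary of features into one flat dense vector that the next layer can consume. A plain-Python sketch under stated assumptions: `NUMERIC_COLS` is a shortened hypothetical list, the embedding table is all zeros as a stand-in for learned weights, and the concatenation order here is arbitrary rather than what any TensorFlow layer guarantees:

```python
NUMERIC_COLS = ["elevation", "aspect", "slope"]  # shortened, hypothetical list
EMBEDDING_DIM = 10

# Stand-in for a learned table: 8 buckets x 10 dimensions.
cover_embedding_table = [[0.0] * EMBEDDING_DIM for _ in range(8)]

def to_dense_vector(example):
    """Concatenate numeric values, the soil one-hot, and the cover embedding."""
    vec = [float(example[name]) for name in NUMERIC_COLS]
    vec += [float(v) for v in example["soil_type"]]
    vec += cover_embedding_table[example["cover_type"]]
    return vec

example = {
    "elevation": 2596, "aspect": 51, "slope": 3,
    "soil_type": [0] * 28 + [1] + [0] * 11,
    "cover_type": 5,
}
vector = to_dense_vector(example)
print(len(vector))  # 3 numeric + 40 soil flags + 10 embedding values
```

With all 10 numeric columns from the dataset the vector would be 60 wide, which would then be the input width of the first Dense layer.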

SLIDE 23

Building a model

model = tf.keras.Sequential([
    feature_layer,
    tf.keras.layers.Dense(256),
    tf.keras.layers.Dense(16),
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(4, activation=tf.nn.softmax)
])

SLIDE 24

Building a model

model.compile(
    optimizer=tf.train.AdamOptimizer(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
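The "sparse" variant of categorical cross-entropy takes integer class ids (such as the argmax labels produced during parsing) rather than one-hot vectors. A minimal plain-Python sketch of the per-example computation, using made-up logits for the 4 classes:

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, probs):
    """Negative log-probability assigned to the true integer class."""
    return -math.log(probs[label])

probs = softmax([2.0, 0.5, 0.1, -1.0])  # made-up logits, 4 classes
loss_if_class_0 = sparse_categorical_crossentropy(0, probs)
loss_if_class_3 = sparse_categorical_crossentropy(3, probs)
uniform_loss = sparse_categorical_crossentropy(1, softmax([0.0] * 4))
print(loss_if_class_0, loss_if_class_3, uniform_loss)
```

A model that guesses uniformly over 4 classes scores ln(4), about 1.386, which gives a baseline for reading the loss values reported during training.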

SLIDE 25

Building a model

model.fit(dataset, steps_per_epoch=NUM_TRAIN_EXAMPLES // 64)

Epoch 10
9000/9000 [====================] - 110s 12ms/step - loss: 0.8931 - acc: 0.7561
SLIDE 26

[Pipeline diagram]
  Raw CSV: 2596,51,3,258,0,510,221,232,148,6279,1,0,...
  Dataset: ({'aspect': <tf.Tensor: id=567, shape=...
  Data config layer / Model architecture / Optimizer and loss: Epoch 10, 9000/9000 [==============]

SLIDE 27

Validating our model

def load_data(*filenames):
    # `defaults` is the list of 55 column types defined when loading the data.
    dataset = tf.contrib.data.CsvDataset(filenames, defaults)
    dataset = dataset.map(_parse_csv_row)
    dataset = dataset.batch(64)
    return dataset

SLIDE 28

Validating our model

test_data = load_data('covtype.csv.test')
loss, accuracy = model.evaluate(test_data, steps=50)
print(loss, accuracy)
0.926471548461914, 0.7402

SLIDE 29

[Pipeline diagram]
  Raw CSV: 2596,51,3,258,0,510,221,232,148,6279,1,0,...
  Dataset: ({'aspect': <tf.Tensor: id=567, shape=...
  Data config layer / Model architecture / Optimizer and loss: Epoch 10, 9000/9000 [==============]
  Validation: loss: 0.926 acc: 0.7402

SLIDE 30

Export to SavedModel

export_dir = tf.contrib.saved_model.save_keras_model(
    model, 'keras_nn')

keras_nn/
  1536162174/
    saved_model
    variables/
    assets/
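The numeric subdirectory looks like a Unix timestamp in seconds, which would make each export a separate, time-ordered version. That reading is an assumption based on the value's magnitude, not something the slide states. A small sketch that decodes it and picks the newest export from a list of directory names; `export_time` and `latest_export` are hypothetical helpers:

```python
from datetime import datetime, timezone

def export_time(subdir_name):
    """Interpret a numeric export directory name as seconds since the epoch."""
    return datetime.fromtimestamp(int(subdir_name), tz=timezone.utc)

def latest_export(subdir_names):
    """Pick the most recent export among purely numeric directory names."""
    numeric = [name for name in subdir_names if name.isdigit()]
    return max(numeric, key=int)

exported_at = export_time("1536162174")  # directory name from the slide
# The other two names below are made up for illustration.
newest = latest_export(["1536162174", "1536100000", "tmp"])
print(exported_at.isoformat(), newest)
```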

SLIDE 31

Export to SavedModel

restored_model = tf.contrib.saved_model.load_keras_model(export_dir)

SLIDE 32

[Pipeline diagram]
  Raw CSV: 2596,51,3,258,0,510,221,232,148,6279,1,0,...
  Dataset: ({'aspect': <tf.Tensor: id=567, shape=...
  Data config layer / Model architecture / Optimizer and loss: Epoch 10, 9000/9000 [==============]
  Validation: loss: 0.926 acc: 0.7402
  Saved Model: saved_model variables/ assets/

SLIDE 33

Swapping the model

model = tf.keras.Sequential([
    feature_layer,
    tf.keras.layers.Dense(256),
    tf.keras.layers.Dense(16),
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(4, activation=tf.nn.softmax)
])

SLIDE 34

Swapping the model

model = tf.estimator.DNNLinearCombinedClassifier(

[Figure: Wide Models, Deep Models, and Wide & Deep Models. Labels: Sparse Features, Dense Embeddings, Hidden Layers, Rectified Linear Units, Output Units, Sigmoid]

SLIDE 35

Swapping the model

model = tf.estimator.DNNLinearCombinedClassifier(
    linear_feature_columns=[cover_type, soil_type],
    dnn_feature_columns=numeric_features,
    dnn_hidden_units=[256, 16, 8],
    n_classes=4)
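In the Wide & Deep design, a linear ("wide") model over sparse features and a feed-forward ("deep") network over dense features each produce logits, and the two are summed before the final activation. A toy plain-Python sketch of that combination with one hidden layer and hand-picked weights; every name and number below is illustrative and none of it is the estimator's actual implementation:

```python
import math

def wide_logits(active_ids, weights, bias):
    """Linear model over sparse features: add one weight row per active id."""
    logits = list(bias)
    for fid in active_ids:
        logits = [l + w for l, w in zip(logits, weights[fid])]
    return logits

def dense(x, kernel, bias, relu):
    """One fully connected layer; kernel holds one row of weights per input."""
    out = list(bias)
    for xi, row in zip(x, kernel):
        out = [o + xi * w for o, w in zip(out, row)]
    return [max(0.0, o) for o in out] if relu else out

def softmax(logits):
    shift = max(logits)
    exps = [math.exp(l - shift) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def wide_and_deep(active_ids, numeric, params):
    """Sum the wide and deep logits, then apply softmax."""
    wide = wide_logits(active_ids, params["wide_w"], params["wide_b"])
    hidden = dense(numeric, params["h_w"], params["h_b"], relu=True)
    deep = dense(hidden, params["out_w"], params["out_b"], relu=False)
    return softmax([w + d for w, d in zip(wide, deep)])

# 3 sparse ids, 2 numeric inputs, 2 hidden units, 4 classes.
params = {
    "wide_w": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    "wide_b": [0.0, 0.0, 0.0, 0.0],
    "h_w": [[1.0, -1.0], [0.5, 0.5]],
    "h_b": [0.0, 0.0],
    "out_w": [[0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0]],
    "out_b": [0.0, 0.0, 0.0, 0.0],
}
probs = wide_and_deep(active_ids=[1], numeric=[2.0, 0.0], params=params)
wide_only = wide_and_deep(active_ids=[1], numeric=[0.0, 0.0], params=params)
print(probs, wide_only)
```

With zero numeric input the wide part alone decides the class; with a strong numeric signal the deep part can override it, which is the point of training the two jointly.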

SLIDE 36

Swapping the model

model.train(
    input_fn=lambda: load_data('covtype.csv.train'))

SLIDE 37

Swapping the model

model.train(
    input_fn=lambda: load_data('covtype.csv.train'))
model.evaluate(
    input_fn=lambda: load_data('covtype.csv.test'))

SLIDE 38

Swapping the model

for epoch in range(10):
    model.train(...)
    print('Epoch {}:'.format(epoch + 1))
    print(model.evaluate(...))

Epoch 10:
{'average_loss': 0.3369278, 'accuracy': 0.86519998,
 'global_step': 90010, 'loss': 21.324545}
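The two loss figures in that output differ by a factor close to the batch size. One plausible reading, stated here as an assumption rather than something the slide says, is that `loss` is summed per batch while `average_loss` is averaged per example. The arithmetic, using only numbers from the slide:

```python
# Metrics copied from the slide's final evaluation output.
metrics = {"average_loss": 0.3369278, "loss": 21.324545, "global_step": 90010}
EPOCHS = 10

# If 'loss' is a per-batch sum and 'average_loss' a per-example mean,
# their ratio approximates the mean batch size seen during evaluation.
implied_batch_size = metrics["loss"] / metrics["average_loss"]

# Training steps per pass over the data, assuming equal-length epochs.
steps_per_epoch = metrics["global_step"] / EPOCHS

print(round(implied_batch_size, 2), steps_per_epoch)
```

A ratio slightly under 64 would be consistent with batches of 64 plus a smaller final batch.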

SLIDE 39

Swapping the model

input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    ...)

SLIDE 40

Swapping the model

features_sample = list(dataset.take(1))[0][0]
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    features_sample)

SLIDE 41

Swapping the model

features_sample = list(dataset.take(1))[0][0]
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    features_sample)
model.export_saved_model(
    export_dir_base='wide_deep',
    serving_input_receiver_fn=input_receiver_fn)

SLIDE 42

[Pipeline diagram]
  Raw CSV: 2596,51,3,258,0,510,221,232,148,6279,1,0,...
  Dataset: ({'aspect': <tf.Tensor: id=567, shape=...
  Data config layer / Model architecture / Optimizer and loss: Epoch 10: global_step: 90010
  Validation: loss: 0.336 acc: 0.8651
  Saved Model: saved_model variables/ assets/

SLIDE 43

Conclusions

  • Prototype with Eager.
  • Preprocess with Datasets.
  • Transform with Feature Columns.
  • Build with Keras.
  • Borrow with Canned Estimators.
  • Package with SavedModel.
SLIDE 44

Thank you!

+Karmel Allison

SLIDE 45

Resources and Links

  • Covertype dataset: https://archive.ics.uci.edu/ml/datasets/Covertype
  • Eager execution: https://www.tensorflow.org/guide/eager
  • Performance tuning datasets: https://www.tensorflow.org/performance/datasets_performance
  • Feature columns: https://www.tensorflow.org/guide/feature_columns
  • Wide & Deep publication: https://arxiv.org/abs/1606.07792
  • Me: https://github.com/karmel