Ready, Set, Go!
Using TensorFlow to prototype, train, and productionalize your models
Karmel Allison
Building a model is a multi-stage process.
Setting the stage
Data: Covertype (USFS + CSU)
Task: Classify wilderness area (4 classes)
Number of examples: ~500K
Features:
- Real: elevation, slope, etc.
- Binned: hillshade given hour (0 - 255)
- Categorical: soil type, tree cover type
Setting the stage
2596,51,3,258,0,510,221,232,148,6279,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
2590,56,2,212,-6,390,220,235,151,6225,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
Setting the stage
2596,51,3,258,0,510,221,232,148,6279,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
# Elevation, Aspect, Slope, Horizontal_Distance_To_Hydrology,
# Vertical_Distance_To_Hydrology, Horizontal_Distance_To_Roadways,
# Hillshade_9am, Hillshade_Noon, Hillshade_3pm,
# Horizontal_Distance_To_Fire_Points, Wilderness_Area (4),
# Soil_Type (40), Cover_Type
Prototype your model
import tensorflow as tf
tf.enable_eager_execution()
Eager execution is immediate
a = tf.constant(5)
b = a * 3
print(b)

<tf.Tensor: id=3, shape=(), dtype=int32, numpy=15>
[Diagram: constants 5 and 3 feed a Mul op; b evaluates immediately to 15]
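Because ops run immediately, you can inspect gradients interactively as well; a minimal sketch, assuming a TF build where tf.GradientTape ships alongside eager execution:

x = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x)  # constants must be watched explicitly
    y = x * x
print(tape.gradient(y, x))  # scalar tensor with value 6.0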
Loading data
defaults = [tf.int32] * 55
dataset = tf.contrib.data.CsvDataset(
    ['covtype.csv.train'], defaults)
Loading data
defaults = [tf.int32] * 55
dataset = tf.contrib.data.CsvDataset(
    ['covtype.csv.train'], defaults)
print(list(dataset.take(1)))

[(<tf.Tensor: id=188, shape=(), dtype=int32, numpy=2596>,
  <tf.Tensor: id=189, shape=(), dtype=int32, numpy=51>,
  <tf.Tensor: id=190, shape=(), dtype=int32, numpy=3>...
Parsing data
def _parse_csv_row(*vals):
    # (body filled in on the next slides)
    return features, class_label
Parsing data
def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    return features, class_label
Parsing data
col_names = ['elevation', 'aspect', 'slope'...]

def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    feat_vals = vals[:10] + (soil_type_t, vals[54])
    features = dict(zip(col_names, feat_vals))
    return features, class_label
Parsing data
col_names = ['elevation', 'aspect', 'slope'...]

def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    feat_vals = vals[:10] + (soil_type_t, vals[54])
    features = dict(zip(col_names, feat_vals))
    class_label = tf.argmax(vals[10:14], axis=0)
    return features, class_label
Parsing data
dataset = dataset.map(_parse_csv_row).batch(64)
Parsing data
dataset = dataset.map(_parse_csv_row).batch(64)
print(list(dataset.take(1)))

({'aspect': <tf.Tensor: shape=(64,), dtype=int32,
    array([ 47, ... 77, 184, 328])>, ...
  'soil_type': <tf.Tensor: shape=(64, 40),
    array([[0, 0, 0, ..., 0, 0, 0]...)>},
 <tf.Tensor: shape=(64,), dtype=int64,
    array([0, 0, 3, ... 1, 0, 2])>)
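Putting the pieces together, a consolidated, runnable sketch of the input pipeline; the middle column names are reconstructed from the schema comment above, so treat the exact strings as assumptions:

import tensorflow as tf
tf.enable_eager_execution()

# Ten numeric columns, the packed soil type vector, and the cover type.
col_names = [
    'elevation', 'aspect', 'slope',
    'horizontal_distance_to_hydrology', 'vertical_distance_to_hydrology',
    'horizontal_distance_to_roadways', 'hillshade_9am', 'hillshade_noon',
    'hillshade_3pm', 'horizontal_distance_to_fire_points',
    'soil_type', 'cover_type']

defaults = [tf.int32] * 55

def _parse_csv_row(*vals):
    # Columns 14-53 are the 40 binary soil type indicators; pack them
    # into one (40,) tensor.
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    feat_vals = vals[:10] + (soil_type_t, vals[54])
    features = dict(zip(col_names, feat_vals))
    # Columns 10-13 one-hot encode the wilderness area; argmax recovers
    # the class index.
    class_label = tf.argmax(vals[10:14], axis=0)
    return features, class_label

dataset = tf.contrib.data.CsvDataset(['covtype.csv.train'], defaults)
dataset = dataset.map(_parse_csv_row).batch(64)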
[Pipeline diagram: Raw CSV → Dataset]
Defining features
# Cover_Type / integer / 1 to 7
cover_type = tf.keras.feature_column.categorical_column_with_identity(
    'cover_type', num_buckets=8)
Defining features
# Cover_Type (7 types) / integer / 1 to 7
cover_type = tf.keras.feature_column.categorical_column_with_identity(
    'cover_type', num_buckets=8)
cover_embedding = tf.keras.feature_column.embedding_column(
    cover_type, dimension=10)
Defining features
numeric_features = [
    tf.keras.feature_column.numeric_column(feat)
    for feat in numeric_cols]
Defining features
numeric_features = [
    tf.keras.feature_column.numeric_column(feat)
    for feat in numeric_cols]

# Soil_Type (40 binary columns)
soil_type = tf.keras.feature_column.numeric_column(
    'soil_type', shape=(40,))
Defining features
columns = numeric_features + [
    soil_type, cover_embedding]
feature_layer = tf.keras.feature_column.FeatureLayer(columns)
Coming soon to a release near you!
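For reference, the released form of this API is tf.feature_column plus tf.keras.layers.DenseFeatures (TF 1.14+ and 2.x); a minimal equivalent sketch:

numeric_features = [tf.feature_column.numeric_column(feat)
                    for feat in numeric_cols]
soil_type = tf.feature_column.numeric_column('soil_type', shape=(40,))
cover_type = tf.feature_column.categorical_column_with_identity(
    'cover_type', num_buckets=8)
cover_embedding = tf.feature_column.embedding_column(
    cover_type, dimension=10)

columns = numeric_features + [soil_type, cover_embedding]
feature_layer = tf.keras.layers.DenseFeatures(columns)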
Building a model
model = tf.keras.Sequential([
    feature_layer,
    tf.keras.layers.Dense(256),
    tf.keras.layers.Dense(16),
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(4, activation=tf.nn.softmax)
])
Building a model
model.compile(
    optimizer=tf.train.AdamOptimizer(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
Building a model
model.fit(dataset,
          steps_per_epoch=NUM_TRAIN_EXAMPLES // 64)

Epoch 10
9000/9000 [====================] - 110s 12ms/step
- loss: 0.8931 - acc: 0.7561
[Pipeline diagram: Raw CSV → Dataset → model (data config layer, model architecture, optimizer and loss); training progress: Epoch 10, 9000/9000]
Validating our model
def load_data(*filenames):
    dataset = tf.contrib.data.CsvDataset(
        filenames, defaults)
    dataset = dataset.map(_parse_csv_row)
    dataset = dataset.batch(64)
    return dataset
Validating our model
test_data = load_data('covtype.csv.test')
loss, accuracy = model.evaluate(test_data, steps=50)
print(loss, accuracy)

0.926471548461914, 0.7402
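Spot-checking predictions works the same way; a small sketch reusing test_data (the expected shape is an assumption based on batch size 64 and the 4 classes):

# One row of softmax probabilities per example, one column per class.
probs = model.predict(test_data, steps=1)
print(probs.shape)  # expected: (64, 4)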
[Pipeline diagram: Raw CSV → Dataset → model → validation; loss: 0.926, acc: 0.7402]
Export to SavedModel
export_dir = tf.contrib.saved_model.save_keras_model(
    model, 'keras_nn')

keras_nn/
    1536162174/
        saved_model.pb
        variables/
        assets/
Export to SavedModel
restored_model = tf.contrib.saved_model.load_keras_model(
    export_dir)
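The restored object behaves like a regular Keras model, but it comes back uncompiled; a minimal sketch, reusing the optimizer and loss from above:

# Recompile before calling evaluate or fit on the restored model.
restored_model.compile(
    optimizer=tf.train.AdamOptimizer(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
print(restored_model.evaluate(test_data, steps=50))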
[Pipeline diagram: Raw CSV → Dataset → model → validation → SavedModel (saved_model.pb, variables/, assets/); loss: 0.926, acc: 0.7402]
Swapping the model
model = tf.keras.Sequential([
    feature_layer,
    tf.keras.layers.Dense(256),
    tf.keras.layers.Dense(16),
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(4, activation=tf.nn.softmax)
])
Swapping the model
model = tf.estimator.DNNLinearCombinedClassifier(...)
[Figure from the Wide & Deep paper: wide models (sparse features → output units), deep models (sparse features → dense embeddings → hidden layers of rectified linear units → output units), and the combined Wide & Deep model with a sigmoid output]
Swapping the model
model = tf.estimator.DNNLinearCombinedClassifier(
    linear_feature_columns=[cover_type, soil_type],
    dnn_feature_columns=numeric_features,
    dnn_hidden_units=[256, 16, 8],
    n_classes=4)
Swapping the model
model.train(
    input_fn=lambda: load_data('covtype.csv.train'))
Swapping the model
model.train(
    input_fn=lambda: load_data('covtype.csv.train'))
model.evaluate(
    input_fn=lambda: load_data('covtype.csv.test'))
Swapping the model
for epoch in range(10):
    model.train(...)
    print('Epoch {}:'.format(epoch + 1))
    print(model.evaluate(...))

Epoch 10: {'average_loss': 0.3369278, 'accuracy': 0.86519998,
           'global_step': 90010, 'loss': 21.324545}
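If you would rather not write that loop yourself, tf.estimator.train_and_evaluate wraps the same cycle and also handles distributed execution; a minimal sketch, where max_steps is an assumed budget (~10 epochs at 9000 steps each):

train_spec = tf.estimator.TrainSpec(
    input_fn=lambda: load_data('covtype.csv.train'),
    max_steps=90000)
eval_spec = tf.estimator.EvalSpec(
    input_fn=lambda: load_data('covtype.csv.test'))
tf.estimator.train_and_evaluate(model, train_spec, eval_spec)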
Swapping the model
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    ...)
Swapping the model
features_sample = list(dataset.take(1))[0][0]
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    features_sample)
Swapping the model
features_sample = list(dataset.take(1))[0][0]
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    features_sample)
model.export_saved_model(
    export_dir_base='wide_deep',
    serving_input_receiver_fn=input_receiver_fn)
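To sanity-check the export, the SavedModel can be loaded back behind a simple predict function; a sketch using tf.contrib.predictor, where export_path and example_features are hypothetical names (the directory returned by export_saved_model, and a dict mapping each feature name to a batch of values):

from tensorflow.contrib import predictor

# tf.contrib.predictor builds its own Session, so run this in a
# separate, graph-mode (non-eager) process.
predict_fn = predictor.from_saved_model(export_path)
print(predict_fn(example_features))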
[Pipeline diagram: Raw CSV → Dataset → Wide & Deep model → validation → SavedModel (saved_model.pb, variables/, assets/); Epoch 10: global_step 90010, loss: 0.336, acc: 0.8651]
Conclusions
- Prototype with Eager.
- Preprocess with Datasets.
- Transform with Feature Columns.
- Build with Keras.
- Borrow with Canned Estimators.
- Package with SavedModel.
Thank you!
+Karmel Allison
Resources and Links
- Covertype dataset: https://archive.ics.uci.edu/ml/datasets/Covertype
- Eager execution: https://www.tensorflow.org/guide/eager
- Performance tuning datasets: https://www.tensorflow.org/performance/datasets_performance
- Feature columns: https://www.tensorflow.org/guide/feature_columns
- WideDeep publication: https://arxiv.org/abs/1606.07792
- Me: https://github.com/karmel