Deep500 BOF 2018
Jidong Zhai
Tsinghua University

Introduction
Deep learning has been widely used in many areas.

Introduction
There are many deep learning frameworks, compute libraries, and acceleration devices.
Target: Framework; Compute Library; Compute Library; Compute Device; Compute Library; Framework; Framework; Models
Granularity: Neural Network; Basic Operation; Neural Network; Neural Network
Diversity: Only CNN; Training, Inference; 2 CNN + 1 RNN; 4 CNN
Dataset: ImageNet; Dummy Data; CIFAR10, ImageNet, SQuAD; ImageNet
Metrics: Time Per Iteration; Time; Training Time and Cost to a certain Accuracy; Total Training Time
Evaluation Target: Framework, Compute Device
Characteristics: Granularity (Neural Network), Diversity
Dataset: ImageNet, COCO, WMT, Librispeech, MovieLens, …
Evaluation Metrics: Training Time, Power Use, and Cost to a certain Accuracy
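The "training time to a certain accuracy" metric listed above can be sketched as a small measurement loop. This is a minimal illustration, not the benchmark's actual harness: `time_to_accuracy`, `train_one_epoch`, `evaluate`, and the toy model are all hypothetical names introduced here.

```python
import time
from typing import Callable, Optional, Tuple


def time_to_accuracy(
    train_one_epoch: Callable[[], None],
    evaluate: Callable[[], float],
    target_accuracy: float,
    max_epochs: int = 100,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], int]:
    """Train until evaluate() reaches target_accuracy.

    Returns (elapsed_seconds, epochs_run). elapsed_seconds is None when the
    target is not reached within max_epochs.
    """
    start = clock()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        # Evaluation time is counted too; a real harness may choose otherwise.
        if evaluate() >= target_accuracy:
            return clock() - start, epoch
    return None, max_epochs


if __name__ == "__main__":
    # Toy stand-in for a real model: accuracy rises by 0.1 per epoch.
    state = {"epochs": 0}

    def fake_train() -> None:
        state["epochs"] += 1

    def fake_eval() -> float:
        return state["epochs"] / 10

    elapsed, epochs = time_to_accuracy(fake_train, fake_eval, target_accuracy=0.75)
    print(f"target reached after {epochs} epochs in {elapsed:.6f} s")
```

Passing the clock as a parameter is only a convenience for testing; power use and cost would need separate instrumentation that this sketch does not attempt.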
Applications: Image Classification, Machine Translation, Language Model, Question Answering
Models: VGG, ResNet, Seq2seq, RNN LM, AoA Reader
Dataset: WikiText-2, CBTest, Cifar, Tatoeba
Real time, Controllable, Easy to obtain, Generative
[Figure: Time (ms) for VGG, ResNet, RNN LM, AoA Reader, and Seq2seq, broken down into Data, Forward, Backward, Loss, and Update]
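A breakdown into Data, Forward, Backward, Loss, and Update phases like the one above can be collected by wrapping each phase of a training step in a timer. The sketch below is an assumption about how such a measurement might be structured, using a toy one-parameter model; `PhaseTimer` and `train_step` are hypothetical names, not part of any framework.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = self._clock()
        try:
            yield
        finally:
            self.totals[name] += self._clock() - start

    def report_ms(self):
        """Return accumulated time per phase in milliseconds."""
        return {name: seconds * 1000.0 for name, seconds in self.totals.items()}


def train_step(timer, w, lr=0.01):
    """One SGD step for the toy model pred = w * x with squared error."""
    with timer.phase("data"):
        x, y = 2.0, 6.0                 # stand-in for loading a batch
    with timer.phase("forward"):
        pred = w * x
    with timer.phase("loss"):
        loss = (pred - y) ** 2
    with timer.phase("backward"):
        grad = 2.0 * (pred - y) * x     # d(loss)/dw
    with timer.phase("update"):
        w = w - lr * grad
    return w, loss


if __name__ == "__main__":
    timer = PhaseTimer()
    w = 0.0
    for _ in range(1000):
        w, _ = train_step(timer, w)
    for name, ms in timer.report_ms().items():
        print(f"{name:>8}: {ms:.3f} ms")
```

On a GPU, kernels are launched asynchronously, so a real measurement would need to synchronize the device before reading the clock at each phase boundary; the toy model here runs on the CPU and sidesteps that.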
[Figure: Memory Use (MB) for VGG, ResNet, RNN LM, AoA Reader, and Seq2seq, split into Weight and Intermediate Result + Temp]
[Figure: Memory Use (MB) versus Pic Area (Pixel²) for Training and Inference, with the Training/Inference ratio]
[Figure: Memory Use (MB) versus Sequence Length for Training and Inference, with the Training/Inference ratio]
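The split between weight memory and intermediate results, and the way memory grows with picture area, can be illustrated with a back-of-the-envelope estimate. The sketch below is a simplified model under loud assumptions, not the measurement behind the charts: float32 tensors, stride-1 convolutions with "same" padding, training keeps every layer output for the backward pass, and inference keeps only one layer's input and output at a time. Gradients, optimizer state, and library workspace are ignored, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

BYTES_PER_FLOAT32 = 4


@dataclass
class ConvLayer:
    in_channels: int
    out_channels: int
    kernel: int = 3  # square kernel, stride 1, "same" padding (assumed)

    def weight_bytes(self) -> int:
        params = self.kernel * self.kernel * self.in_channels * self.out_channels
        params += self.out_channels  # bias
        return params * BYTES_PER_FLOAT32


def tensor_bytes(batch: int, channels: int, height: int, width: int) -> int:
    return batch * channels * height * width * BYTES_PER_FLOAT32


def total_weight_bytes(layers: List[ConvLayer]) -> int:
    """Weight memory does not depend on the input size."""
    return sum(layer.weight_bytes() for layer in layers)


def training_activation_bytes(layers: List[ConvLayer], height: int, width: int,
                              batch: int = 1) -> int:
    """Input plus every layer output, all kept for the backward pass."""
    total = tensor_bytes(batch, layers[0].in_channels, height, width)
    for layer in layers:
        total += tensor_bytes(batch, layer.out_channels, height, width)
    return total


def inference_activation_bytes(layers: List[ConvLayer], height: int, width: int,
                               batch: int = 1) -> int:
    """Peak of (input + output) over layers; earlier tensors can be freed."""
    return max(
        tensor_bytes(batch, layer.in_channels, height, width)
        + tensor_bytes(batch, layer.out_channels, height, width)
        for layer in layers
    )


if __name__ == "__main__":
    net = [ConvLayer(3, 16), ConvLayer(16, 32)]
    print("weights:", total_weight_bytes(net), "bytes")
    for side in (8, 16, 32):
        train = training_activation_bytes(net, side, side)
        infer = inference_activation_bytes(net, side, side)
        print(f"{side}x{side}: training={train} B, inference={infer} B")
```

In this model the activation terms scale linearly with picture area while the weight term stays fixed, so activations dominate as the input grows.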
Metrics: GPU Occupancy, Warp Execution Efficiency, Warp Non-Pred Execution Efficiency, Bandwidth Utilization, TFLOPS
Normalized: 1, 0.46, 1.00, 1.00, 4.02, 5.65