SLIDE 1

Chair of Network Architectures and Services Department of Informatics Technical University of Munich

Capsule Networks - An Overview

Luca Dombetzki

July 13, 2018

Advisor: Marton Kajo

slide-2
SLIDE 2

Overview

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

  • L. Dombetzki — Capsule Networks

SLIDE 3

Introduction

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

SLIDE 4

Introduction

Motivation

Figure 1: figure from [12]

Both images are seen as "face" by a typical Convolutional Neural Network ⇒ Capsule Networks

SLIDE 5

Introduction

Where does AI come from?

Figure 2: A neuron as part of a Multi-Layer Neural Network [21]

Designed after human brain

  • Advancement in modeling with math
  • Performance gains with GPUs
  • Deep Learning - leverage both

BUT not like human brain anymore

  • Blackbox system
  • Requires huge amounts of data
  • Very probabilistic
SLIDE 6

Introduction

Who is Geoffrey E. Hinton?

“The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster.”
- Geoffrey E. Hinton (2014) [7]

  • Professor at the University of Toronto
  • Working at Google Brain
  • Major advancements in AI [13]
  • Research on Capsule Networks:
      • Based on biological research
      • Understanding human vision (1981) [9]
      • Talks explaining his motivation [8]
      • Dynamic Routing Between Capsules (2017) [19]
      • Matrix Capsules with EM-Routing (2018) [6]

Figure 3: Geoffrey E. Hinton [24]

SLIDE 7

Convolutional Neural Networks

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

SLIDE 8

Convolutional Neural Networks

What are CNNs?

Figure 4: Typical architecture of a CNN [16]

SLIDE 9

Convolutional Neural Networks

Convolution and kernels

Figure 5: Convolution operation [11]
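The convolution operation in Figure 5 can be sketched in a few lines of NumPy; this is an illustrative "valid" convolution (the image, kernel values, and helper name are made up for the example):

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """'Valid' 2D convolution (technically cross-correlation, as in most
    deep-learning frameworks): slide the kernel over the image and take
    the elementwise weighted sum at each position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.array([[1, 2, 3, 0],
                  [0, 1, 2, 3],
                  [3, 0, 1, 2],
                  [2, 3, 0, 1]], dtype=float)
edge = np.array([[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]], dtype=float)  # vertical edge-detecting kernel
out = convolve2d_valid(image, edge)
print(out.shape)  # (2, 2) — a 3x3 kernel shrinks a 4x4 input to 2x2
```

A learned CNN kernel plays exactly this role: each output position measures how strongly the local patch matches the kernel's pattern.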

SLIDE 10

Convolutional Neural Networks

Activation functions

Figure 6: Sigmoid and Rectified Linear Unit (ReLU) [20]

A single neuron computes $\sigma\left(w_0 + \sum_{i=1}^{n} w_i x_i\right)$

Figure 7: A single neuron [21]
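The neuron of Figure 7 can be written directly from the formula above; this is a minimal sketch (weights and inputs are made-up values):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectified Linear Unit: passes positives, zeroes negatives."""
    return np.maximum(0.0, z)

def neuron(x, w, w0, activation=sigmoid):
    """A single neuron: activation(w0 + sum_i w_i * x_i)."""
    return activation(w0 + np.dot(w, x))

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.25, 0.1])
print(neuron(x, w, w0=0.0))                    # sigmoid(0.3) ≈ 0.574
print(neuron(x, w, w0=0.0, activation=relu))   # 0.3
```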
SLIDE 11

Convolutional Neural Networks

Pooling as a form of routing

Routing

  • find important nodes (inputs)
  • group together
  • give to next layer

Pooling

  • reduces input data
  • next layer can “see” more than the previous
  • enables detecting full objects through locational invariance
  • static routing

Figure 8: Max pooling example [2]
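The max-pooling step of Figure 8 can be sketched in NumPy; note how each 2x2 window keeps only its strongest activation and discards where in the window it came from — exactly the locational invariance (and information loss) discussed above:

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling with stride 2: for each non-overlapping 2x2
    window, keep only the largest activation."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 1, 5, 6],
              [2, 3, 7, 8]])
print(max_pool2x2(x))
# [[4 2]
#  [3 8]]
```

This is the "static routing": the winner is forwarded regardless of what the surrounding capsule-level context is, which is precisely Hinton's objection.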

SLIDE 12

Convolutional Neural Networks

How CNNs see the world

Figure 9: Feature detections of a CNN [15]

SLIDE 13

Convolutional Neural Networks

Problems of pooling

Figure 10: Distorted face from [12]

Geoffrey E. Hinton’s arguments against pooling [8]

  • Unnatural
  • No use of the linear structure of vision
  • Static instead of dynamic routing
  • Invariance instead of Equivariance
SLIDE 14

Convolutional Neural Networks

What does a neuron represent?

Figure 11: Face detection with a CNN, from [10]

SLIDE 15

Capsule Networks

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

SLIDE 16

Capsule Networks

Hinton’s idea

Figure 12: Hierarchical modeling in Computer Graphics [5]

Build a network to perform inverse graphics

  • propagate probability and pose of features
  • dynamic routing based on pose information
  • introduce concept of an entity into the network’s architecture

⇒ The capsule

SLIDE 17

Capsule Networks

An abstract view on capsules

Figure 13: Capsule face detection, from [10]

SLIDE 18

Capsule Networks

The capsule - a group of neurons

  Before: layer of neurons — input = n values, output = a value
  After: layer of neuron groups — input = n vectors, output = a vector

  • A capsule learns parameters (skew, scale, rotation, etc.)
  • n-dimensional capsule = n-dimensional output vector ⇒ n parameters ≙ pose
  • probability = ||vector_out||
SLIDE 19

Capsule Networks

Architecture - The CapsNet

Figure 14: Capsule Network Architecture as described in [19]

  Layer              Function
  Conv1              Convolutional layer
  PrimaryCaps        Convolutional squashing capsules
  DigitCaps          Normal (digit) capsules
  Class predictions  Length of each DigitCapsule
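As a sketch, the spatial sizes of this pipeline for a 28x28 MNIST input can be checked with the standard convolution-output formula; the 9x9 kernels, strides 1 and 2, and 32 channels of 8D primary capsules are the values described in [19]:

```python
# Shapes flowing through the CapsNet of [19] for a 28x28 MNIST image:
#   Conv1:       256 9x9 kernels, stride 1          -> 20x20x256
#   PrimaryCaps: 9x9 conv, stride 2, 32 channels
#                of 8D capsules                     -> 6x6x32 = 1152 capsules
#   DigitCaps:   10 capsules of 16D, routed from all 1152 primary capsules
def conv_out(size, kernel, stride):
    """Output size of a 'valid' convolution along one spatial axis."""
    return (size - kernel) // stride + 1

s1 = conv_out(28, 9, 1)      # 20
s2 = conv_out(s1, 9, 2)      # 6
num_primary = s2 * s2 * 32   # 1152 primary capsules, each an 8D vector
print(s1, s2, num_primary)   # 20 6 1152
```

The class prediction is then simply the length of each of the 10 DigitCaps vectors.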

SLIDE 20

Capsule Networks

Routing-by-agreement - the idea

Figure 15: capsule agreement [4]

SLIDE 21

Capsule Networks

Routing by agreement

Phenomenon: “coincidence filtering”

  • high-dimensional pose-parameter space
  • similar poses by chance are very unlikely (curse of dimensionality)

Clustering the inputs based on their pose — repeat n times:

  1. find the mean vector of the cluster
  2. weight all inputs based on their distance to this mean
  3. normalize the weights

Figure 16: weighted clustering [4]
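The three clustering steps can be sketched as below. This is only an illustration of the idea: the `exp(-distance)` weighting kernel and the toy poses are assumptions for the example, not the routing rule of [19] (which weights by dot-product agreement):

```python
import numpy as np

def cluster_step(poses, weights):
    """One iteration of weighted pose clustering:
    1. mean of the cluster, 2. distance-based weights, 3. normalize."""
    mean = np.average(poses, axis=0, weights=weights)  # 1. weighted mean
    dist = np.linalg.norm(poses - mean, axis=1)
    w = np.exp(-dist)                                  # 2. closer poses weigh more
    return mean, w / w.sum()                           # 3. normalized weights

rng = np.random.default_rng(1)
agreeing = rng.normal(loc=1.0, scale=0.05, size=(5, 4))   # five similar poses
outlier = rng.normal(loc=-2.0, scale=0.05, size=(1, 4))   # one coincidence-free pose
poses = np.vstack([agreeing, outlier])

w = np.ones(6) / 6
for _ in range(3):
    mean, w = cluster_step(poses, w)
print(w.round(3))  # the outlier's weight collapses toward 0
```

Because similar poses almost never agree by chance in a high-dimensional space, surviving clusters are strong evidence for a real higher-level entity.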

SLIDE 22

Capsule Networks

How to train the network: Margin loss + Reconstruction (decoder) network

Figure 17: Capsule Network architectures [19]

  Goal                 Loss function         Learning
  Parameter learning   Reconstruction loss   Unsupervised
  Classification       Margin loss           Supervised

Reconstruction loss

  • reconstruct the digit from the active capsule (all others masked)

Margin loss

  • detection: ||v|| ≥ 0.9
  • no detection: ||v|| ≤ 0.1
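The margin loss with these thresholds can be sketched as follows; m+ = 0.9 and m- = 0.1 match the slide, and the down-weighting factor λ = 0.5 for absent classes is the value used in [19]:

```python
import numpy as np

def margin_loss(v_norms, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss over capsule lengths.
    v_norms: lengths of the output capsules, shape (num_classes,)
    targets: one-hot vector, T_k = 1 iff class k is present."""
    # present classes are pushed above m_pos, absent ones below m_neg
    present = targets * np.maximum(0.0, m_pos - v_norms) ** 2
    absent = lam * (1 - targets) * np.maximum(0.0, v_norms - m_neg) ** 2
    return float(np.sum(present + absent))

# a confident, correct prediction incurs zero loss
v = np.array([0.95, 0.05, 0.08])
t = np.array([1.0, 0.0, 0.0])
print(margin_loss(v, t))  # 0.0
```

An under-confident correct capsule (e.g. length 0.5) or an over-active wrong capsule is penalized quadratically by how far it violates its margin.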
SLIDE 23

Capsule Networks

How does it perform? - Parameter Effects

  • Scale and thickness
  • Localized part
  • Stroke thickness
  • Localized skew
  • Width and translation
  • Localized part

Figure 18: Effects of capsule parameters on reconstruction [19]

SLIDE 24

Capsule Networks

How does it perform? - MultiMNIST

  Model     Routing   Rec. loss   MNIST (%)       MultiMNIST (%)
  CNN       -         -           0.39            8.1
  CapsNet   1         no          0.34 ± 0.032    -
  CapsNet   1         yes         0.29 ± 0.011    7.5
  CapsNet   3         no          0.35 ± 0.036    -
  CapsNet   3         yes         0.25 ± 0.005    5.2

Figure 19: Capsule Network results on MultiMNIST, reconstructions R:(·, ·) against labels L:(·, ·) [19]

SLIDE 25

Capsule Networks

How does it perform? - MultiMNIST

Network was forced to reconstruct false predictions

Figure 20: Forced reconstructions, predicted pairs *R:(·, ·) against true labels L:(·, ·) [19]

SLIDE 26

Capsule Networks

Further research

  Authors            Contribution
  Hinton et al.      Pose capsules and EM-routing [6]
  Xi et al.          Hyperparameter tuning for complex data [25]
  Phaye et al.       Skip connections [17]
  Rawlinson et al.   Unsupervised training [18]
  Bahadori et al.    New routing (Eigen-decomposition) [3]
  Wang et al.        Optimized routing (KL regularization) [22]

SLIDE 27

Discussion

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

SLIDE 28

Discussion

Superior to CNNs?

Advantages

  • Viewpoint invariance
  • Less training data needed
  • Fewer parameters
  • Better generalization
  • Robustness to white-box attacks
  • Validatability

Challenges

  • Scalability
  • “Explain everything”
  • Entity-based structure
  • Loss functions
  • Crowding
  • Unoptimized implementation

SLIDE 29

Discussion

CapsNets for real world problems

Figure 21: Results from Afshar et. al [1]

  Authors              Application                    Benefit
  Afshar et al. [1]    Brain tumor classification     Less training data
  Wang et al. [23]     Sentiment analysis with RNNs   State-of-the-art performance
  LaLonde et al. [14]  Medical image segmentation     Parameter reduction by 95.4%

SLIDE 30

Conclusion

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

SLIDE 31

Conclusion

Conclusion

Big step towards human vision

  • Novel network architecture
  • Inverse graphics through pose vector capsules
  • Dynamic routing via routing-by-agreement
  • Multiple significant advantages
  • Early development phase

But not comparable to CNNs in “mainstream areas”

SLIDE 32

Questions?

Figure 22: [20]

SLIDE 33

Bibliography

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

SLIDE 34

Bibliography

[1] P. Afshar, A. Mohammadi, and K. N. Plataniotis. Brain tumor type classification via capsule networks. CoRR, abs/1802.10200, 2018.

[2] Aphex34. Convolutional neural network - max pooling. https://en.wikipedia.org/wiki/Convolutional_neural_network#Max_pooling_shape; last accessed on 2018/06/14.

[3] M. T. Bahadori. Spectral capsule networks. 2018.

[4] N. Bourdakos. Understanding capsule networks - AI's alluring new architecture. https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc; last accessed on 2018/07/05.

[5] D. J. Eck. Introduction to computer graphics: Hierarchical modeling. http://math.hws.edu/graphicsbook/c2/s4.html; last accessed on 2018/07/05.

[6] G. Hinton, S. Sabour, and N. Frosst. Matrix capsules with EM routing. 2018.

[7] G. E. Hinton. Ask Me Anything on reddit. https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton/clyj4jv/; last accessed on 2018/07/03.

SLIDE 35

Bibliography

[8] G. E. Hinton. What is wrong with convolutional neural nets? Talk recorded on YouTube, https://youtu.be/rTawFwUvnLE; last accessed on 2018/06/14.

[9] G. F. Hinton and F. Cambridge. Shape representation in parallel systems. 1981.

[10] J. Hui. Understanding dynamic routing between capsules. https://jhui.github.io/2017/11/03/Dynamic-Routing-Between-Capsules/; last accessed on 2018/07/05.

[11] H. Kazemi. Image filtering. http://machinelearninguru.com/computer_vision/basics/convolution/image_convolution_1.html; last accessed on 2018/07/05.

[12] T. Kothari. Uncovering the intuition behind capsule networks and inverse graphics. https://hackernoon.com/uncovering-the-intuition-behind-capsule-networks-and-inverse-graphics-part-i-7412d121798d; last accessed on 2018/06/14.

[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.

[14] R. LaLonde and U. Bagci. Capsules for object segmentation. ArXiv e-prints, Apr. 2018.

SLIDE 36

Bibliography

[15] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 609–616. ACM, 2009.

[16] Mathworks. Convolutional neural network. https://www.mathworks.com/solutions/deep-learning/convolutional-neural-network.html; last accessed on 2018/06/14.

[17] S. S. R. Phaye, A. Sikka, A. Dhall, and D. Bathula. Dense and diverse capsule networks: Making the capsules learn better. arXiv preprint arXiv:1805.04001, 2018.

[18] D. Rawlinson, A. Ahmed, and G. Kowadlo. Sparse unsupervised capsules generalize better. CoRR, abs/1804.06094, 2018.

[19] S. Sabour, N. Frosst, and G. E. Hinton. Dynamic routing between capsules. In Advances in Neural Information Processing Systems, pages 3859–3869, 2017.

[20] S. Sharma. Activation functions: Neural networks. https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6; last accessed on 2018/07/05.

[21] P. Veličković. TikZ figure collection. https://github.com/PetarV-/TikZ/tree/master/Multilayerperceptron; last accessed on 2018/07/05.

SLIDE 37

Bibliography

[22] D. Wang and Q. Liu. An optimization view on dynamic routing between capsules. 2018.

[23] Y. Wang, A. Sun, J. Han, Y. Liu, and X. Zhu. Sentiment analysis by capsules. In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pages 1165–1174. International World Wide Web Conferences Steering Committee, 2018.

[24] N. Wolchover. As machines get smarter, evidence they learn like us. https://www.quantamagazine.org/as-machines-get-smarter-evidence-they-learn-like-us-20130723/; last accessed on 2018/07/05.

[25] E. Xi, S. Bing, and Y. Jin. Capsule network performance on complex data. ArXiv e-prints, Dec. 2017.

SLIDE 38

Appendix

  • Introduction
  • Convolutional Neural Networks
  • Capsule Networks
  • Discussion
  • Conclusion
  • Bibliography
  • Appendix

SLIDE 39

Appendix

Improvements: EM-Routing

  • Hinton et al. [6]
  • 4x4 pose matrix capsules
  • Expectation-Maximization routing
  • Performance on smallNORB dataset: CNN 2.56%, CapsNet 1.4% error
  • Testing on unseen viewpoints
SLIDE 40

Appendix

Routing-by-agreement algorithm

1: procedure ROUTING(û_j|i, r, l)
2:   for all capsule i in layer l and capsule j in layer (l + 1): b_ij ← 0
3:   for r iterations do
4:     for all capsule i in layer l: c_i ← softmax(b_i)
5:     for all capsule j in layer (l + 1): s_j ← Σ_i c_ij û_j|i
6:     for all capsule j in layer (l + 1): v_j ← squash(s_j)
7:     for all capsule i in layer l and capsule j in layer (l + 1): b_ij ← b_ij + û_j|i · v_j
8:   return v_j
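The routing procedure can be sketched directly in NumPy. This is a toy illustration of the algorithm from [19], not an optimized implementation; the capsule counts and dimensions in the example are made up:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Squashing non-linearity: keeps direction, maps the norm into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def routing(u_hat, r=3):
    """Dynamic routing between capsules.
    u_hat: prediction vectors û_j|i, shape (num_in, num_out, dim_out)
    r:     number of routing iterations
    Returns the output capsule vectors v, shape (num_out, dim_out)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                # routing logits b_ij
    for _ in range(r):
        # c_i = softmax over output capsules j, per input capsule i
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = np.einsum('ij,ijd->jd', c, u_hat)      # s_j = Σ_i c_ij û_j|i
        v = squash(s)                              # v_j = squash(s_j)
        b = b + np.einsum('ijd,jd->ij', u_hat, v)  # b_ij += û_j|i · v_j
    return v

# toy example: 6 input capsules predicting 2 output capsules of dimension 4
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 2, 4))
v = routing(u_hat, r=3)
print(v.shape)                       # (2, 4)
print(np.linalg.norm(v, axis=-1))    # every norm < 1, usable as probability
```

Inputs whose predictions agree with the emerging output vector get their logits (and thus coupling coefficients) increased, which is the "agreement" in routing-by-agreement.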

SLIDE 41

Appendix

More on capsules

SLIDE 42

Appendix

Math

Squashing function

$v_j = \dfrac{\|s_j\|^2}{1 + \|s_j\|^2} \dfrac{s_j}{\|s_j\|}$   (1)

Full capsule connection

$s_j = \sum_i c_{ij} \hat{u}_{j|i}, \qquad \hat{u}_{j|i} = W_{ij} u_i$   (2)

Routing softmax

$c_{ij} = \dfrac{\exp(b_{ij})}{\sum_k \exp(b_{ik})}$   (3)

Margin loss

$L_k = T_k \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - T_k) \max(0, \|v_k\| - m^-)^2$   (4)
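A quick numeric check of the squashing function (1): it maps short vectors to near-zero length and long vectors to length just below 1, which is what lets a capsule's norm act as a probability (the test vectors are arbitrary):

```python
import numpy as np

def squash(s):
    """Equation (1): v = (||s||^2 / (1 + ||s||^2)) * s / ||s||."""
    n2 = np.dot(s, s)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2)

short = squash(np.array([0.1, 0.0]))
long_ = squash(np.array([100.0, 0.0]))
print(np.linalg.norm(short))  # ≈ 0.0099 (low probability)
print(np.linalg.norm(long_))  # ≈ 0.9999 (high probability)
```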

SLIDE 43

Appendix

Pooling is unnatural
