Deep Learning with Myia Olivier Breuleux Research Developer, MILA - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Deep Learning with Myia

Olivier Breuleux

Research Developer, MILA

Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain)

slide-2
SLIDE 2

The Needs

What we need from a language for deep learning

Autodiff

What it is, how it works, what the challenges are

Representation

The best representation for our needs

Type system

Flexible inference for performance and robustness


slide-3
SLIDE 3

The Needs

What we need from a language for deep learning

Autodiff

What it is, how it works, what the challenges are

Representation

The best representation for our needs

Type system

Flexible inference for performance and robustness


slide-4
SLIDE 4

Deep Learning

DL algorithms are increasingly complex

  • Feedforward (trivial)
  • Recurrent (loops)
  • Recursive (recursion)
  • …?

slide-5
SLIDE 5

Deep Learning


DL algorithms are increasingly complex

  • More and more language features needed
  • Most existing frameworks are limited
  • High level abstraction increases productivity
  • Focus on the algorithm over implementation details
  • Effortless abstractions encourage their use
slide-6
SLIDE 6

Needs

Goal: a language adapted to the needs of machine learning, past and future

  • General purpose: Capable of expressing complex control flow.
  • Differentiable: Should be able to take the nth-order derivative of any program.
  • Portable: Serializable, supports multiple hardware targets.
  • Fast: Must leverage parallelism and GPUs.
  • Debuggable: Clear errors, inspectable, instrumentable.

slide-7
SLIDE 7

Needs

Myia: a language adapted to the needs of machine learning, past and future

  • General purpose: Conditionals, loops, recursion, data structures.
  • Differentiable: Transformation at the intermediate representation level.
  • Fast & portable: Choose from various backends such as NNVM/Relay.
  • Debuggable: Type+shape inference, step debugger.

slide-8
SLIDE 8

The Needs

What we need from a language for deep learning

Autodiff

What it is, how it works, what the challenges are

Representation

The best representation for our needs

Type system

Flexible inference for performance and robustness


slide-9
SLIDE 9

Differentiability

How to train a model

  • Initialize a model’s parameters
  • Compute some quantity using the parameters
  • Compute a cost or “loss function”
  • Update parameters using the gradient of the loss
  • Rinse and repeat
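
The training steps above can be sketched in plain Python. This is a toy illustration, not Myia code: the model `f`, the squared-error `loss`, the hand-derived `grad_loss`, the learning rate and the data are all assumptions chosen for the example.

```python
# Hypothetical toy model: f(x; theta) = theta * x with a squared-error loss.
def f(x, theta):
    return theta * x

def loss(pred, y):
    return (pred - y) ** 2

def grad_loss(x, y, theta):
    # Hand-derived gradient: dL/dtheta = 2 * (theta * x - y) * x.
    # An autodiff system would produce this function automatically.
    return 2.0 * (f(x, theta) - y) * x

def train(data, theta=0.0, lr=0.01, steps=200):
    # 1. theta starts from an initial value (the default argument).
    for _ in range(steps):                # 5. rinse and repeat
        for x, y in data:
            # 2-3. the model output and loss are implicit in grad_loss
            # 4. update the parameter using the gradient of the loss
            theta = theta - lr * grad_loss(x, y, theta)
    return theta

data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # generated by y = 3 * x
theta = train(data)
```

Writing `grad_loss` by hand is the part that does not scale to real models, which is what motivates computing gradients automatically.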

Gradients

  • Can be computed exactly and automatically
  • But: no mainstream language supports this natively
  • Computational strategies: forward or reverse
  • Implementation strategies: operator overloading or source transform

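In symbols: parameters $\theta$, model output $f(x; \theta)$, loss $L(f(x; \theta), y)$, and the update rule

$$\theta \leftarrow \theta - \lambda \frac{\partial L(f(x; \theta), y)}{\partial \theta}$$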

slide-10
SLIDE 10

Forward vs Reverse

$$y_1 = f(x), \qquad y_2 = g(y_1), \qquad y_3 = h(y_2)$$

$$f : \mathbb{R}^m \to \mathbb{R}^p, \qquad g : \mathbb{R}^p \to \mathbb{R}^q, \qquad h : \mathbb{R}^q \to \mathbb{R}^n$$

$$\underbrace{J_{h \circ g \circ f}(x)}_{n \times m} = \underbrace{J_h(y_2)}_{n \times q} \, \underbrace{J_g(y_1)}_{q \times p} \, \underbrace{J_f(x)}_{p \times m}$$

The derivative of a straight composition of functions is the multiplication of their Jacobians

In what order?


slide-11
SLIDE 11

Forward vs Reverse

Forward:

$$\underbrace{J_h(y_2)}_{n \times q} \Big( \underbrace{\underbrace{J_g(y_1)}_{q \times p} \, \underbrace{J_f(x)}_{p \times m}}_{q \times m} \Big)$$

Cost: $qpm + nqm = m(qp + nq)$

Reverse:

$$\Big( \underbrace{\underbrace{J_h(y_2)}_{n \times q} \, \underbrace{J_g(y_1)}_{q \times p}}_{n \times p} \Big) \underbrace{J_f(x)}_{p \times m}$$

Cost: $nqp + npm = n(qp + pm)$
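
To make the cost comparison concrete, here is a small sketch that counts scalar multiplications for the two ways of grouping the Jacobian product. The function names and the example dimensions are assumptions made for illustration.

```python
def forward_cost(m, p, q, n):
    # J_h (J_g J_f): first (q x p)(p x m), then (n x q)(q x m).
    return q * p * m + n * q * m

def reverse_cost(m, p, q, n):
    # (J_h J_g) J_f: first (n x q)(q x p), then (n x p)(p x m).
    return n * q * p + n * p * m

# Hypothetical shapes: a million inputs (parameters), one output (a loss).
m, p, q, n = 1_000_000, 512, 512, 1
many_inputs = (forward_cost(m, p, q, n), reverse_cost(m, p, q, n))

# The mirror case: one input, a million outputs.
few_inputs = (forward_cost(1, 512, 512, 1_000_000),
              reverse_cost(1, 512, 512, 1_000_000))
```

With many inputs and one output the reverse grouping is far cheaper, and with one input and many outputs the forward grouping wins, because each cost scales with the dimension that its grouping keeps until the end (m for forward, n for reverse).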

slide-12
SLIDE 12

Forward vs Reverse

Forward mode is good when there are few inputs.

  • Easy to implement: dual numbers.

$$x \to \left( y_1, \frac{dy_1}{dx} \right) \to \left( y_2, \frac{dy_2}{dx} \right) \to \left( y_3, \frac{dy_3}{dx} \right)$$
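
Forward mode with dual numbers can be sketched as follows. The `Dual` class and the three-step function `f` are hypothetical names for this example; each value carries its derivative with respect to the input alongside it, so the derivative is ready as soon as the forward pass ends.

```python
import math

class Dual:
    """A value paired with its derivative with respect to the input x."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def tanh(x):
    t = math.tanh(x.value)
    # Chain rule: d tanh(u) = (1 - tanh(u)^2) du
    return Dual(t, (1.0 - t * t) * x.deriv)

def f(x):
    y1 = tanh(x)
    y2 = y1 * y1
    y3 = 10 * y2
    return y3

# Seed the input with derivative 1 (dx/dx = 1).
out = f(Dual(0.5, 1.0))
```

One forward pass yields the derivative with respect to a single input, which is why this approach suits functions with few inputs.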

Reverse mode is good when there are few outputs.

  • Hard to implement: execution is reversed.

$$x \to y_1 \to y_2 \to y_3 \to \frac{dy_3}{dy_2} \to \frac{dy_3}{dy_1} \to \frac{dy_3}{dx}$$
slide-13
SLIDE 13

Forward vs Reverse

Deep learning involves computing the gradient of a loss with respect to millions of parameters: many inputs, one output. We need reverse mode.

$$\theta \leftarrow \theta - \epsilon \frac{\partial L}{\partial \theta} \quad \text{where} \quad \theta = (W_1, W_2, \ldots, b_1, b_2, \ldots)$$

slide-14
SLIDE 14

OO vs SCT: Operator Overloading

Program

    def f(x):
        i = 0
        while i < 3:
            i = i + 1
            x = tanh(x)
        x = x * 10
        return x

Tape (recorded by the trace, walked backward for backprop)

    i = 0
    i = i + 1
    x = tanh(x)
    i = i + 1
    x = tanh(x)
    i = i + 1
    x = tanh(x)
    x = x * 10

  • Overload every operation to log itself on a tape.
  • At the end, we walk the tape backward.
  • “Define-by-run”, “Dynamic graph”
  • Easy to implement, but lots of overhead
  • Discourages composing small & cheap operations
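
A minimal sketch of the tape approach, run on the program from this slide. The `Var` class, the global `tape` list and `backprop` are hypothetical names. As a simplification, only operations on `x` are recorded here, since the counter `i` is a plain Python integer.

```python
import math

tape = []  # each entry: (output, [(input, local_gradient), ...])

class Var:
    """A traced value that logs every operation applied to it."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        out = Var(self.value * other.value)
        tape.append((out, [(self, other.value), (other, self.value)]))
        return out

def tanh(x):
    out = Var(math.tanh(x.value))
    tape.append((out, [(x, 1.0 - out.value ** 2)]))
    return out

def f(x):
    i = 0
    while i < 3:
        i = i + 1
        x = tanh(x)
    x = x * 10
    return x

def backprop(output):
    # Walk the tape backward, accumulating gradients with the chain rule.
    output.grad = 1.0
    for out, inputs in reversed(tape):
        for inp, local in inputs:
            inp.grad += local * out.grad

x = Var(0.5)
y = f(x)
backprop(y)  # x.grad now holds dy/dx
```

The loop has been unrolled on the tape: it holds three `tanh` entries and one multiplication, and every entry costs an allocation at run time. That is the overhead the bullets above refer to.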

slide-15
SLIDE 15

OO vs SCT: Source Code Transformation

  • Transform a function that computes a value into a new function that computes the derivative.

  • Operate on source code or intermediate representation
  • Applies the chain rule to code
  • Standard language optimizations apply: can eliminate overhead
  • Easier to apply to functional languages
  • Reverse mode AD interacts badly with mutation and side effects
  • Requires deep analysis and optimization to remove dead code

    def bprop_pow(x, y, out, dout):
        dx = dout * y * x ** (y - 1)
        dy = dout * out * log(x)
        return dx, dy

What if we don’t need dy?
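
A sketch of the question above, assuming plain Python floats and the standard `math.log`. `bprop_pow_dx_only` is a hypothetical, hand-written version of what dead code elimination could leave behind when the caller discards `dy`; it is not output produced by Myia.

```python
import math

def bprop_pow(x, y, out, dout):
    # Backpropagator for out = x ** y.
    dx = dout * y * x ** (y - 1)
    dy = dout * out * math.log(x)
    return dx, dy

def bprop_pow_dx_only(x, y, out, dout):
    # If only dx is used downstream, the log and its multiplication are dead
    # code. An optimizer working on the transformed program can remove them.
    return dout * y * x ** (y - 1)

x, y = 2.0, 3.0
out = x ** y
dx, dy = bprop_pow(x, y, out, 1.0)
dx_only = bprop_pow_dx_only(x, y, out, 1.0)
```

Because source transformation produces ordinary code, this kind of cleanup can in principle be done by standard compiler passes once the backpropagator has been inlined into its caller.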

slide-16
SLIDE 16

The Needs

What we need from a language for deep learning

Autodiff

What it is, how it works, what the challenges are

Representation

The best representation for our needs

Type system

Flexible inference for performance and robustness


slide-17
SLIDE 17

About syntax


Myia is an intermediate representation

  • High level
  • No syntax of its own
  • Multiple languages may target it

Python frontend

  • Why? Most used language in DL
  • Productive for research and prototyping
  • Translate functional subset to Myia
  • Control flow: if, while, for, def, lambda
  • Data: lists, tuples, arrays, @dataclass
  • Not supported: mutation, side effects, eval
  • One issue: translate dynamically typed code

    def fact(x):
        if x <= 1:
            return 1
        else:
            return x * fact(x - 1)

[Figure: graph representation. Legend: Output, Operation, Input, Constant, Free variable]
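
To illustrate what restricting a frontend to a functional subset can look like, here is a rough sketch built on Python's standard `ast` module. It is not Myia's parser: the rule set in `FORBIDDEN` and the function `check_functional_subset` are assumptions made for this example, and they cover only a few obvious forms of mutation.

```python
import ast
import textwrap

SOURCE = textwrap.dedent("""
    def fact(x):
        if x <= 1:
            return 1
        else:
            return x * fact(x - 1)
""")

# Hypothetical rule set: statement kinds that imply mutation of outer state.
FORBIDDEN = (ast.Global, ast.Nonlocal, ast.Delete, ast.AugAssign)

def check_functional_subset(source):
    """Return a list of (problem, line number) pairs found in the source."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, FORBIDDEN):
            problems.append((type(node).__name__, node.lineno))
        elif isinstance(node, ast.Assign):
            # Assigning to obj.attr or obj[index] mutates an existing object.
            for target in node.targets:
                if isinstance(target, (ast.Attribute, ast.Subscript)):
                    problems.append(("mutation", node.lineno))
    return problems

ok = check_functional_subset(SOURCE)
bad = check_functional_subset("def g(a):\n    a[0] = 1\n    return a\n")
```

The recursive `fact` passes the check, while the in-place list update in `g` is flagged.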

slide-18
SLIDE 18

Needs


Requirements for our representation

  • Powerful enough to represent recursion
  • Minimal
  • Easy to parallelize
  • Easy to optimize
  • Easy to extend

Solutions

  • Functional (ANF)
  • Graph-based
  • Typed
slide-19
SLIDE 19

Why functional programming?


Easier to transform

  • Referential transparency: same expression, same result

Easier to think about

  • No side effects

Easier to optimize

  • Order of operations can be changed
  • Parallelizable
  • Common subexpression elimination easy

Type system is easier

  • No side effects

Easier for automatic differentiation

slide-20
SLIDE 20

Why graphs?


Easy to parallelize

  • Only data flow relationships

Easy to optimize

  • Direct use-def pointers (no names)
  • Dead code elimination is trivial
  • Inlining is easy
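
A sketch of why dead code elimination is trivial on such a graph. The `Node` class and `live_nodes` are hypothetical and much simpler than a real IR: each node points directly at its inputs, so anything not reachable from the output is dead.

```python
class Node:
    """One graph node: an operation plus direct pointers to its inputs."""
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

def live_nodes(output):
    # Follow use-def pointers backward from the output.
    live, stack = set(), [output]
    while stack:
        node = stack.pop()
        if node not in live:
            live.add(node)
            stack.extend(node.inputs)
    return live

x = Node("input")
ten = Node("constant 10")
kept = Node("tanh", x)
dead = Node("log", x)          # computed but never used by the output
out = Node("mul", kept, ten)

live = live_nodes(out)
```

No separate analysis pass is needed. The `log` node is simply never visited, so it never makes it into the compiled program.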

[Figure: example graph. Legend: Output, Operation, Input]

slide-21
SLIDE 21

Why static typing?

Guarantees

  • Correctness of the user’s program
  • Type correctness of code transforms (autodiff)

Performance

  • No runtime type checking = better performance
  • Leverage shape information for optimization

User experience

  • Catch errors early rather than late in the process
slide-22
SLIDE 22

The Needs

What we need from a language for deep learning

Autodiff

What it is, how it works, what the challenges are

Representation

The best representation for our needs

Type system

Flexible inference for performance and robustness


slide-23
SLIDE 23

Myia’s Types

Scalars: Int/UInt/Float<8/16/32/64>, Bool

Tuples: Tuple<T1, T2, …>

  • Heterogeneously typed, static length

Lists: List<T>

  • Homogeneously typed, dynamic length, fast append

Arrays: Array<T, Shape<D1, D2, …>>

  • Homogeneously typed, shape part of the type

Functions: Function<Args<TIn1, TIn2, …>, TOut>

Struct types are reduced to tuples in pre-processing.

slide-24
SLIDE 24

Why inference?

Annotations are annoying

  • Polymorphic types are awkward to express
  • Function types are awkward to express
  • Impede rapid prototyping
  • Duck typing is more natural
  • This is why people like Python

Type/shape inference

  • Infer types starting from the input types at the entry point
  • Implicit polymorphism
  • Feels dynamic
  • Functions are re-compiled when they are given new input types
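
The last bullet can be sketched as a cache of specialized versions keyed by input types. This is only an analogy in plain Python: `specialize` and `compiled_versions` are hypothetical names, and storing the original function stands in for real compilation.

```python
compiled_versions = {}  # (function name, input type names) -> specialized version

def specialize(fn):
    def wrapper(*args):
        signature = tuple(type(a).__name__ for a in args)
        key = (fn.__name__, signature)
        if key not in compiled_versions:
            # Stand-in for type inference plus compilation for this signature.
            compiled_versions[key] = fn
        return compiled_versions[key](*args)
    return wrapper

@specialize
def double(x):
    return x + x

double(2)      # first int call: creates the int version
double(3)      # reuses the int version
double(1.5)    # new input type: creates a float version
```

From the user's side the function feels dynamic, yet each signature is processed only once.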
slide-25
SLIDE 25

Myia’s inference pipeline

  1. Transform inputs into abstract inputs
      • Represent type and shape, no concrete values
      • More types: structs, polymorphic functions
  2. Run abstract interpreter on abstract inputs
      • Bounded input signatures for each function
      • Recursive functions become fixed points
  3. Specialize functions to their possible signatures
      • If function called with int, make int version, etc.
      • Higher order uses require signature uniqueness
  4. Update or re-run inference after optimizations or AD
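
Steps 1 and 2 can be illustrated with a toy abstract interpreter that propagates only dtype and shape. Everything here is an assumption made for the example (`AbstractArray`, `InferenceError`, the two inference rules and the `infer_layer` function), and it leaves out recursion and specialization.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AbstractArray:
    """An abstract input: type and shape only, no concrete values."""
    dtype: str
    shape: tuple

class InferenceError(Exception):
    pass

def infer_matmul(a, b):
    if a.dtype != b.dtype:
        raise InferenceError(f"dtype mismatch: {a.dtype} vs {b.dtype}")
    if len(a.shape) != 2 or len(b.shape) != 2 or a.shape[1] != b.shape[0]:
        raise InferenceError(f"cannot multiply shapes {a.shape} and {b.shape}")
    return AbstractArray(a.dtype, (a.shape[0], b.shape[1]))

def infer_add(a, b):
    if a != b:
        raise InferenceError(f"cannot add {a} and {b}")
    return a

def infer_layer(x, w, b):
    # Abstract version of: x @ w + b
    return infer_add(infer_matmul(x, w), b)

x = AbstractArray("float32", (32, 784))
w = AbstractArray("float32", (784, 10))
b = AbstractArray("float32", (32, 10))
result = infer_layer(x, w, b)
```

A shape error such as multiplying (32, 784) by (10, 784) is reported here before any numerical work is done.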
slide-26
SLIDE 26

Error reporting

Abstract inferrer shows compile-time tracebacks for type/shape errors.

slide-27
SLIDE 27

Debugging

Tracking correspondence to source code

  • Through parsing
  • Through optimization
  • Through automatic differentiation
  • Through macros/code generation

Debugging tools we need

  • Custom debugger for step by step execution
  • Watching variables and gradients
  • Breakpoints that trigger during the reverse phase
  • Profiling and reporting which parts of the code are “hot”
slide-28
SLIDE 28

In Conclusion: Myia’s focus

General purpose, including recursion

Automatic differentiation

  • Code transform
  • Optimizable, higher order gradients

Type and shape inference

  • Can handle duck typed code

Good debugging facilities

  • Step debugger, profiling
  • Gradient debugging

⭐ us on GitHub: https://github.com/mila-iqia/myia