Artificial Neural Networks
Oliver Schulte - CMPT 726
Neural Networks
Neural networks are learning models originally motivated by the study of human/animal brains.
Uses of Neural Networks
Applications
There are many, many applications. Two classic ones:
TD-Gammon, a neural network that learned to play championship-level backgammon: http://en.wikipedia.org/wiki/TD-Gammon, http://en.wikipedia.org/wiki/Backgammon
No Hands Across America, neural-network-steered autonomous driving: http://www.cs.cmu.edu/afs/cs/usr/tjochem/www/nhaa/nhaa_home_page.html
Outline
Feed-forward Networks
Network Training
Error Backpropagation
Applications
Feed-forward Networks
We have seen generalized linear models of the form
y(x, w) = f( \sum_{j=1}^{M} w_j \phi_j(x) )
for fixed non-linear basis functions \phi(\cdot).
We now extend this model by adapting the basis functions, and learning their parameters.
In a feed-forward network, we let each basis function be another non-linear function of a linear combination of the inputs:
\phi_j(x) = f( ... )
Feed-forward Networks
Starting with input x = (x_1, ..., x_D), first construct M linear combinations:
a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)}
These a_j are known as activations.
Each activation is passed through a non-linear activation function h(\cdot) to give the hidden unit outputs:
z_j = h(a_j)
(Figure from Russell and Norvig, AIMA2e.)
Activation Functions
A typical choice for the hidden-unit non-linearity is a sigmoidal function such as the logistic sigmoid h(a) = 1/(1 + e^{−a}), whose outputs in (0, 1) are convenient for classification.
Another possibility is a radial basis activation, a_j = \sum_i (x_i − w_{ji})^2.
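As a concrete illustration, here is a minimal sketch (Python/NumPy, not from the slides) of the logistic sigmoid and its derivative; the derivative is what the backpropagation formulas later in these slides rely on:

    import numpy as np

    def sigmoid(a):
        # Logistic sigmoid: squashes an activation into (0, 1).
        return 1.0 / (1.0 + np.exp(-a))

    def sigmoid_prime(a):
        # Derivative g'(a) = g(a) * (1 - g(a)), used by backpropagation.
        s = sigmoid(a)
        return s * (1.0 - s)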
Feed-forward Networks
[Figure: two-layer network with inputs x_0, x_1, ..., x_D, hidden units z_0, z_1, ..., z_M, and outputs y_1, ..., y_K; first-layer weights w_{MD}^{(1)} and second-layer weights w_{KM}^{(2)}, w_{10}^{(2)}; x_0 and z_0 are bias units.]
This is called a feed-forward network because the computation graph is a DAG. Putting the two layers together, the complete network function is
y_k(x, w) = h( \sum_{j=1}^{M} w_{kj}^{(2)} h( \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)} ) + w_{k0}^{(2)} )
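To make the formula concrete, here is a hedged sketch of the two-layer forward pass in Python/NumPy. The names (forward, W1, b1, W2, b2) and the choice of sigmoid for h are illustrative assumptions, not the slides' code:

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def forward(x, W1, b1, W2, b2, h=sigmoid):
        # First-layer activations: a_j = sum_i W1[j, i] * x_i + b1[j].
        a = W1 @ x + b1
        z = h(a)                  # hidden-unit outputs z_j = h(a_j)
        return h(W2 @ z + b2)     # outputs y_k, applying h again at the output

    # Toy dimensions: D = 3 inputs, M = 4 hidden units, K = 2 outputs.
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
    W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
    print(forward(np.array([1.0, -0.5, 2.0]), W1, b1, W2, b2))

Here the biases w_{j0}^{(1)}, w_{k0}^{(2)} are kept as separate vectors rather than absorbed into bias units x_0, z_0; the two formulations are equivalent.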
A general network
[Figure: fully connected two-layer network with inputs x_1, x_2, ..., x_i, ..., x_d, hidden units y_1, y_2, ..., y_j, ..., y_{nH}, and outputs z_1, z_2, ..., z_k, ..., z_c compared against targets t_1, t_2, ..., t_k, ..., t_c; input-to-hidden weights w_{ji}, hidden-to-output weights w_{kj}.]
The XOR Problem Revisited
[Figure: the four XOR input points on the unit square, labelled z = +1 and z = −1; no single line separates the two classes.]
The XOR Problem Solved
[Figure: a two-layer network computing XOR: inputs x_1, x_2 plus a bias input, hidden units y_1, y_2, and output unit z_k; input-to-hidden weights w_{ji} and hidden-to-output weights w_{kj}, with bias values including .5 and .7 shown on the connections.]
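As an illustration of how a two-layer network can compute XOR, here is one hand-set network with hard-threshold units. The particular weights and thresholds are my own choice, not necessarily the ones in the figure:

    import numpy as np

    def step(a):
        # Hard threshold unit: fires iff activation is positive.
        return (a > 0).astype(float)

    def xor_net(x1, x2):
        x = np.array([x1, x2], dtype=float)
        # Hidden unit y1 computes OR (threshold 0.5), y2 computes AND (threshold 1.5).
        y1 = step(np.array([1.0, 1.0]) @ x - 0.5)
        y2 = step(np.array([1.0, 1.0]) @ x - 1.5)
        # Output: fires iff OR is true but AND is not, i.e. XOR.
        return step(1.0 * y1 - 2.0 * y2 - 0.5)

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", xor_net(a, b))

The key point matches the slide: a single linear unit cannot separate XOR, but one hidden layer of linear-threshold units can.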
Hidden Units Compute Basis Functions
The network function is, roughly, a weighted sum of the hidden units' activation functions; each hidden unit contributes one adaptive basis function.
Hidden Units As Feature Extractors
[Figure: sample training patterns and the learned input-to-hidden weights, visualized as images; each hidden unit has learned to act as a feature detector.]
Network Training
Given a specified network structure, how do we set its parameters (weights)?
As before: define a criterion that measures how well the network performs, then optimize against it.
For regression, we can use squared error over the training set:
E(w) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, w) − t_n \}^2
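In code the error function is a one-liner. This sketch is illustrative only; it assumes some prediction function predict(x, w), such as the forward pass sketched earlier:

    import numpy as np

    def sum_of_squares_error(predict, w, X, T):
        # E(w) = 1/2 * sum_n || y(x_n, w) - t_n ||^2
        return 0.5 * sum(np.sum((predict(x, w) - t) ** 2) for x, t in zip(X, T))

    # Toy usage with a linear "network" y(x, w) = w @ x:
    X = [np.array([1.0, 2.0]), np.array([3.0, -1.0])]
    T = [np.array([1.0]), np.array([0.0])]
    print(sum_of_squares_error(lambda x, w: np.array([w @ x]),
                               np.array([0.5, 0.5]), X, T))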
Parameter Optimization
[Figure: error surface E(w) over weights w_1, w_2, showing a local minimum w_A, the global minimum w_B, and the gradient ∇E at a point w_C.]
E(w) is a nasty (non-convex) function of the weights, so we settle for iterative, local optimization.
Descent Methods
The typical strategy for optimization problems of this sort is a descent method:
w^{(τ+1)} = w^{(τ)} + Δw^{(τ)}
Gradient descent chooses Δw^{(τ)} = −η ∇E(w^{(τ)}), moving downhill with learning rate η.
Stochastic gradient descent, which updates on one training example at a time, is particularly effective for neural networks.
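A minimal sketch of the generic descent loop, here instantiated as batch gradient descent on an arbitrary differentiable error function. The gradient function grad_E is assumed to be supplied by the caller; for networks it is what backpropagation computes below:

    import numpy as np

    def gradient_descent(grad_E, w0, eta=0.1, steps=1000):
        # w^(tau+1) = w^(tau) - eta * grad E(w^(tau))
        w = np.asarray(w0, dtype=float)
        for _ in range(steps):
            w = w - eta * grad_E(w)
        return w

    # Toy usage: minimize E(w) = (w_1 - 3)^2 + (w_2 + 1)^2, whose gradient is 2(w - c).
    print(gradient_descent(lambda w: 2 * (w - np.array([3.0, -1.0])),
                           w0=[0.0, 0.0]))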
Computing Gradients
The network function makes E(w) a complicated function of the weights; the hard part is computing derivatives of the error with respect to hidden-layer weights. Error backpropagation provides an efficient way to do this.
Error Backpropagation
Backprop is an efficient method for computing the derivatives ∂E_n/∂w_{ji} for all weights in the network. Intuition:
- Calculating the error at the output nodes is easy.
- The output errors determine the error for nodes in the previous layer.
- Repeat, propagating the error signal backwards through the network.
Error at the output nodes
Run the network forward on input x_n, storing all activations a_j and outputs z_j.
Calculating the error at the output nodes is easy. For an output node with y_k = g(a_k) = g(\sum_i w_{ki} z_i):
∂E_n/∂w_{ki} = ∂/∂w_{ki} [ (1/2)(t_n − y_k)^2 ] = −(t_n − y_k) g′(a_k) z_i
Writing δ_k = (t_n − y_k) g′(a_k), the gradient-descent update is
w_{ki} ← w_{ki} + η δ_k z_i.
Error at the hidden nodes
The error signal at a hidden node is a weighted sum of its contributions to the output errors:
δ_j = g′(a_j) \sum_k w_{kj} δ_k
The update rule is then
w_{ji} ← w_{ji} + η δ_j z_i.
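Putting the two delta rules together, here is a hedged single-example sketch (Python/NumPy, sigmoid units, biases omitted; all names are mine) that computes δ_k at the output layer, back-propagates to get δ_j, and forms the weight updates:

    import numpy as np

    g = lambda a: 1.0 / (1.0 + np.exp(-a))       # activation function
    g_prime = lambda a: g(a) * (1.0 - g(a))      # its derivative

    def deltas_and_updates(x, t, W1, W2, eta=0.5):
        # Forward pass, storing activations.
        a_hidden = W1 @ x
        z = g(a_hidden)
        a_out = W2 @ z
        y = g(a_out)
        # Output-layer error signal: delta_k = (t_k - y_k) * g'(a_k).
        delta_out = (t - y) * g_prime(a_out)
        # Hidden-layer error signal: delta_j = g'(a_j) * sum_k W2[k, j] * delta_k.
        delta_hidden = g_prime(a_hidden) * (W2.T @ delta_out)
        # Update rules w <- w + eta * delta * z, as outer products over all weights.
        dW2 = eta * np.outer(delta_out, z)
        dW1 = eta * np.outer(delta_hidden, x)
        return dW1, dW2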
Backpropagation Picture
[Figure: output units ω_1, ω_2, ω_3, ..., ω_k, ..., ω_c carry error signals δ_1, δ_2, δ_3, ..., δ_k, ..., δ_c; hidden unit j receives these signals through the weights w_{kj}.]
The error signal at a hidden unit is proportional to the error signals at the units it influences:
δ_j = g′(a_j) × \sum_k w_{kj} δ_k
The Backpropagation Algorithm
1. Forward pass: apply input x_n and compute the activation levels a_i and output levels z_i for every node.
2. Backward pass: compute δ_k at each output node, then back-propagate to obtain δ_j at each hidden node.
3. Update every weight by w_{ji} ← w_{ji} + η δ_j z_i, a gradient step on the weight vector w_{ji}.
Demo: AIspace, http://aispace.org/neural/.
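For completeness, here is a self-contained sketch that runs the whole algorithm (forward pass, backward pass, update) in a loop; training it on XOR is a standard smoke test. Everything here — network size, learning rate, epoch count, seed — is an illustrative assumption:

    import numpy as np

    g = lambda a: 1.0 / (1.0 + np.exp(-a))
    g_prime = lambda a: g(a) * (1.0 - g(a))

    rng = np.random.default_rng(1)
    W1 = rng.normal(size=(4, 3))    # 4 hidden units; 2 inputs + 1 bias
    W2 = rng.normal(size=(1, 5))    # 1 output unit; 4 hidden + 1 bias

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([0.0, 1.0, 1.0, 0.0])          # XOR targets
    eta = 0.5

    for epoch in range(20000):
        for x, t in zip(X, T):
            xb = np.append(x, 1.0)              # input with bias component
            a1 = W1 @ xb                        # hidden activations
            zb = np.append(g(a1), 1.0)          # hidden outputs with bias
            a2 = W2 @ zb
            y = g(a2)                           # network output
            delta_out = (t - y) * g_prime(a2)                    # output error signal
            delta_hid = g_prime(a1) * (W2[:, :4].T @ delta_out)  # back-propagated
            W2 += eta * np.outer(delta_out, zb)  # w <- w + eta * delta * z
            W1 += eta * np.outer(delta_hid, xb)

    for x in X:
        zb = np.append(g(W1 @ np.append(x, 1.0)), 1.0)
        print(x, "->", g(W2 @ zb)[0])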
Correctness Proof for the Backpropagation Algorithm
[Figure: node i with output z_i = g(a_i) feeding node j's activation a_j through weight w_{ji}.]
The algorithm updates with Δw_{ji} = η δ_j z_i. To see that this is gradient descent, we prove:
Theorem. For each node j, we have δ_j = −∂E_n/∂a_j.
This suffices, because a_j = \sum_i w_{ji} z_i implies
−∂E_n/∂w_{ji} = (−∂E_n/∂a_j) · (∂a_j/∂w_{ji}) = δ_j · z_i.
Multi-variate Chain Rule
[Figure: f depends on intermediate variables x and y, which each depend on u and v.]
If f is a differentiable function of x and y, and x and y are differentiable with respect to u and v:
∂f/∂u = (∂f/∂x)(∂x/∂u) + (∂f/∂y)(∂y/∂u)   and   ∂f/∂v = (∂f/∂x)(∂x/∂v) + (∂f/∂y)(∂y/∂v)
Proof of Theorem, I
We need the derivative ∂E_n/∂a_j. The error E_n depends on a_j only through the nodes after node j:
∂E_n/∂a_j = ∂/∂a_j E_n(a_{j_1}, a_{j_2}, . . . , a_{j_m}), where {j_i} are the indices of the nodes that receive input from j.
By the multi-variate chain rule,
∂E_n/∂a_j = \sum_{k=1}^{m} (∂E_n/∂a_k)(∂a_k/∂a_j)
Moreover, since z_j = g(a_j) and a_k depends on a_j only through z_j,
∂a_k/∂a_j = w_{kj} · g′(a_j).
[Figure: node j with output z_j = g(a_j) feeding node k's activation a_k through weight w_{kj}.]
Proof of Theorem, II
We prove δ_j = −∂E_n/∂a_j by backwards induction through the network.
Base case: the claim is true for output nodes. (Exercise.)
Inductive hypothesis: δ_k = −∂E_n/∂a_k for all nodes k that receive input from j. Then
−∂E_n/∂a_j = \sum_{k=1}^{m} (−∂E_n/∂a_k)(∂a_k/∂a_j) = \sum_{k=1}^{m} δ_k (∂a_k/∂a_j) = \sum_{k=1}^{m} δ_k w_{kj} g′(a_j) = δ_j,
where the first equality is the multi-variate chain rule, step 1 applies the inductive hypothesis, step 2 the result from the previous slide, and step 3 the definition of δ_j.
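The theorem can also be checked numerically: the gradient that backprop computes should agree with a central finite-difference estimate of ∂E_n/∂w. A small sketch under the same illustrative sigmoid setup as above (all names are mine):

    import numpy as np

    g = lambda a: 1.0 / (1.0 + np.exp(-a))
    g_prime = lambda a: g(a) * (1.0 - g(a))

    def error(W1, W2, x, t):
        y = g(W2 @ g(W1 @ x))
        return 0.5 * np.sum((t - y) ** 2)

    def backprop_grad(W1, W2, x, t):
        a1 = W1 @ x; z = g(a1)
        a2 = W2 @ z; y = g(a2)
        delta_out = (t - y) * g_prime(a2)
        delta_hid = g_prime(a1) * (W2.T @ delta_out)
        # By the theorem, dE/dw_ji = -delta_j * z_i.
        return -np.outer(delta_hid, x), -np.outer(delta_out, z)

    rng = np.random.default_rng(2)
    W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
    x, t = np.array([0.3, -0.7]), np.array([1.0])

    dW1, dW2 = backprop_grad(W1, W2, x, t)
    eps = 1e-6
    # Central difference on one weight, e.g. W1[0, 1] (arbitrary choice).
    Wp, Wm = W1.copy(), W1.copy()
    Wp[0, 1] += eps; Wm[0, 1] -= eps
    numeric = (error(Wp, W2, x, t) - error(Wm, W2, x, t)) / (2 * eps)
    print(dW1[0, 1], "vs", numeric)   # the two values should agree closely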
Other Learning Topics
Applications of Neural Networks
Hand-written Digit Recognition
LeNet-5
[Architecture diagram: INPUT 32×32 → C1: feature maps 6@28×28 (convolutions) → S2: feature maps 6@14×14 (subsampling) → C3: feature maps 16@10×10 (convolutions) → S4: feature maps 16@5×5 (subsampling) → C5: layer of 120 units → F6: layer of 84 units (full connections) → OUTPUT: 10 units (Gaussian connections).]
http://www.codeproject.com/KB/library/NeuralNetRecognition.aspx
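To give a feel for what "convolutions" and "subsampling" mean in this architecture, here is a toy sketch of one convolution + subsampling stage in plain NumPy. It is illustrative only: LeNet-5's kernels are learned rather than fixed, and its subsampling is a trainable weighted average rather than plain average pooling:

    import numpy as np

    def convolve2d_valid(image, kernel):
        # Slide the kernel over the image, keeping only 'valid' positions.
        H, W = image.shape; k = kernel.shape[0]
        out = np.empty((H - k + 1, W - k + 1))
        for r in range(out.shape[0]):
            for c in range(out.shape[1]):
                out[r, c] = np.sum(image[r:r+k, c:c+k] * kernel)
        return out

    def subsample(fmap):
        # 2x2 average pooling: halves each spatial dimension.
        H, W = fmap.shape
        return fmap[:H//2*2, :W//2*2].reshape(H//2, 2, W//2, 2).mean(axis=(1, 3))

    img = np.random.default_rng(3).random((32, 32))    # stand-in for a 32x32 input
    fmap = np.tanh(convolve2d_valid(img, np.ones((5, 5)) / 25))  # 28x28, as in C1
    print(subsample(fmap).shape)                       # (14, 14), as in S2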
[Figure: the test digits misclassified by LeNet-5, each labelled "true > predicted", e.g. 4>6, 3>5, 8>2, ...]
Conclusion
- Feed-forward networks are flexible non-linear models for regression and classification.
- Hidden units act as learned, adaptive basis functions.
- Networks can represent complex decision boundaries.
- Training with backpropagation is gradient descent, and converges only to a local minimum of the error.