Newton Methods for Neural Networks: Gauss Newton Matrix-vector Product
Chih-Jen Lin
National Taiwan University
Last updated: June 1, 2020
Outline
1. Backward setting
   Jacobian evaluation
   Gauss-Newton matrix-vector products
2. Forward + backward settings
   R operator
   Gauss-Newton matrix-vector product
Backward setting Jacobian evaluation
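\[
\frac{\partial \boldsymbol{z}^{L+1,i}}{\partial \mathrm{vec}(W^m)^T}
= \begin{bmatrix}
\frac{\partial z_1^{L+1,i}}{\partial \mathrm{vec}(W^m)^T} \\
\vdots \\
\frac{\partial z_{n_{L+1}}^{L+1,i}}{\partial \mathrm{vec}(W^m)^T}
\end{bmatrix}
= \begin{bmatrix}
\mathrm{vec}\left( \frac{\partial z_1^{L+1,i}}{\partial S^{m,i}} \, \phi(\mathrm{pad}(Z^{m,i}))^T \right)^T \\
\vdots \\
\mathrm{vec}\left( \frac{\partial z_{n_{L+1}}^{L+1,i}}{\partial S^{m,i}} \, \phi(\mathrm{pad}(Z^{m,i}))^T \right)^T
\end{bmatrix}
\]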
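As a rough illustration of this row structure, the Python sketch below builds each Jacobian row as vec((∂z_j/∂S) X^T)^T, where X stands in for φ(pad(Z^{m,i})). The helper names (`vec`, `jacobian_wrt_weights`) and the toy numbers are assumptions made for this example and do not come from the slides; vec(·) is assumed to stack columns.

```python
def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def vec(A):
    """Stack the columns of A into one flat list (column-major order)."""
    return [A[r][c] for c in range(len(A[0])) for r in range(len(A))]

def jacobian_wrt_weights(dz_dS, X):
    """Return the rows vec(dz_j/dS @ X^T)^T for j = 1..n_out.

    dz_dS is a list of matrices, one per output component z_j,
    each with the same shape as S = W X.
    """
    Xt = transpose(X)
    return [vec(matmul(G_j, Xt)) for G_j in dz_dS]

# Toy case: W is 1 x 2, X is 2 x 2, so S = W X is 1 x 2.
X = [[1, 2],
     [3, 4]]
# Hypothetical outputs: z_1 = S[0][0], z_2 = S[0][0] + S[0][1].
dz_dS = [[[1, 0]], [[1, 1]]]
J = jacobian_wrt_weights(dz_dS, X)
print(J)
```

Here z_1 = w_1 + 3 w_2 and z_2 = 3 w_1 + 7 w_2, so the two rows can be checked by hand against the printed result.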
Backward setting Gauss-Newton Matrix-vector products
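\[
J \boldsymbol{v} = \begin{bmatrix}
\sum_{m=1}^{L} J^{m,1} \boldsymbol{v}^m \\
\vdots \\
\sum_{m=1}^{L} J^{m,l} \boldsymbol{v}^m
\end{bmatrix}
\]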
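The stacked sum over layers can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the per-layer blocks J^{m,i} are stored explicitly as small dense matrices, and the names `J_blocks`, `v_blocks`, and `jacobian_vector_product` are hypothetical, not from the slides.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def jacobian_vector_product(J_blocks, v_blocks):
    """Return the stacked vector [sum_m J^{m,i} v^m for i = 1..l].

    J_blocks[i][m] is the block J^{m,i} (n_out x n_m) for instance i, layer m.
    v_blocks[m] is the slice v^m of v that belongs to layer m.
    """
    stacked = []
    for per_instance in J_blocks:                       # instances i = 1..l
        acc = [0.0] * len(per_instance[0])              # n_out accumulators
        for J_mi, v_m in zip(per_instance, v_blocks):   # layers m = 1..L
            acc = [a + b for a, b in zip(acc, matvec(J_mi, v_m))]
        stacked.extend(acc)
    return stacked

# Toy data: l = 2 instances, L = 2 layers, 2 outputs per instance.
J_blocks = [
    [[[1.0, 0.0], [0.0, 1.0]], [[2.0], [3.0]]],   # instance 1: J^{1,1}, J^{2,1}
    [[[0.0, 1.0], [1.0, 0.0]], [[1.0], [1.0]]],   # instance 2: J^{1,2}, J^{2,2}
]
v_blocks = [[1.0, 2.0], [10.0]]                    # v^1, v^2
Jv = jacobian_vector_product(J_blocks, v_blocks)
print(Jv)
```

Each instance contributes one block of the result, so the loop over instances is independent and could be batched.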
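With $\boldsymbol{p}^{m,i}$ denoting the $\mathrm{vec}(\cdot)$ factor for instance $i$, the layer-$m$ products stack as
\[
\begin{bmatrix}
\frac{\partial \boldsymbol{z}^{L+1,1}}{\partial \mathrm{vec}(S^{m,1})^T} \, \boldsymbol{p}^{m,1} \\
\vdots \\
\frac{\partial \boldsymbol{z}^{L+1,l}}{\partial \mathrm{vec}(S^{m,l})^T} \, \boldsymbol{p}^{m,l}
\end{bmatrix}
\]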
Forward + backward settings
Forward + backward settings R operator
\[
\mathcal{R}\{S^{m,i}\}
= \mathcal{R}\{W^m \phi(\mathrm{pad}(Z^{m,i})) + \boldsymbol{b}^m \mathbf{1}^T_{a^m_{conv} b^m_{conv}}\}
= V^m_W \, \phi(\mathrm{pad}(Z^{m,i})) + W^m \, \mathcal{R}\{\phi(\mathrm{pad}(Z^{m,i}))\} + \boldsymbol{v}^m_b \mathbf{1}^T_{a^m_{conv} b^m_{conv}},
\]
where $\mathcal{R}\{W^m\} = V^m_W$ (a matrix form) and $\mathcal{R}\{\boldsymbol{b}^m\} = \boldsymbol{v}^m_b$.
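The product rule behind this expression can be illustrated with a small Python sketch. It assumes a plain dense layer S = W X + b 1^T, with X standing in for φ(pad(Z^{m,i})); the function names and toy numbers are hypothetical, chosen only for this example.

```python
def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def layer_output(W, X, b):
    """S = W X + b 1^T: the bias b[r] is added to every column of row r."""
    WX = matmul(W, X)
    return [[WX[r][c] + b[r] for c in range(len(X[0]))] for r in range(len(W))]

def r_layer_output(W, X, V_W, RX, v_b):
    """R{S} = V_W X + W R{X} + v_b 1^T, by the product rule.

    V_W and v_b are the directions applied to W and b; RX is R{X},
    propagated from the previous layer.
    """
    A = matmul(V_W, X)
    B = matmul(W, RX)
    return [[A[r][c] + B[r][c] + v_b[r] for c in range(len(X[0]))]
            for r in range(len(W))]

W = [[1.0, 2.0], [3.0, 4.0]]
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
V_W = [[0.0, 1.0], [1.0, 0.0]]
RX = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
v_b = [1.0, 2.0]
RS = r_layer_output(W, X, V_W, RX, v_b)
print(RS)
```

Note that the bias b itself does not appear in R{S}; only its direction v_b does, because S is linear in b.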
Forward + backward settings Gauss-Newton matrix-vector product
$V^m_W \, \phi(\mathrm{pad}(Z^{m,i}))$, $W^m \, \mathcal{R}\{\phi(\mathrm{pad}(Z^{m,i}))\}$,
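To make the forward + backward idea concrete, here is a minimal Python sketch that computes a Gauss-Newton matrix-vector product J^T B (J v) without forming J: a forward (R-operator) pass gives J v, and a backward pass applies J^T. Everything specific here is an assumption for illustration and not taken from the slides: a two-layer fully connected network with tanh activation, a single instance, B = I, and all function names.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def mat_t_vec(A, u):
    """A^T u for a matrix stored as a list of rows."""
    return [sum(A[r][c] * u[r] for r in range(len(A))) for c in range(len(A[0]))]

def forward(W1, W2, x):
    """z = W2 tanh(W1 x); also returns the hidden activation."""
    a1 = [math.tanh(s) for s in matvec(W1, x)]
    return a1, matvec(W2, a1)

def jvp(W1, W2, x, V1, V2):
    """Forward (R-operator) pass: J v = R{z} for the direction v = (V1, V2)."""
    a1, _ = forward(W1, W2, x)
    Rs1 = matvec(V1, x)                                   # R{W1 x} = V1 x
    Ra1 = [(1.0 - a * a) * r for a, r in zip(a1, Rs1)]    # tanh' = 1 - tanh^2
    return [p + q for p, q in zip(matvec(V2, a1), matvec(W2, Ra1))]

def vjp(W1, W2, x, u):
    """Backward pass: J^T u, returned as a pair shaped like (W1, W2)."""
    a1, _ = forward(W1, W2, x)
    G2 = [[ui * aj for aj in a1] for ui in u]             # u a1^T
    delta = [(1.0 - a * a) * g for a, g in zip(a1, mat_t_vec(W2, u))]
    G1 = [[di * xj for xj in x] for di in delta]          # delta x^T
    return G1, G2

def gauss_newton_vector_product(W1, W2, x, V1, V2):
    """G v = J^T B (J v) with B = I (an assumption of this sketch)."""
    return vjp(W1, W2, x, jvp(W1, W2, x, V1, V2))

W1 = [[0.5, -0.3], [0.2, 0.8]]
W2 = [[1.0, -1.0], [0.5, 0.25]]
x = [1.0, 2.0]
V1 = [[0.1, 0.2], [-0.4, 0.3]]
V2 = [[0.0, 1.0], [-1.0, 0.5]]

Jv = jvp(W1, W2, x, V1, V2)
G1, G2 = gauss_newton_vector_product(W1, W2, x, V1, V2)

def flat(M):
    return [e for row in M for e in row]

# Consistency check: with B = I, v^T G v must equal ||J v||^2.
vGv = sum(a * b for a, b in zip(flat(V1) + flat(V2), flat(G1) + flat(G2)))
Jv_sq = sum(r * r for r in Jv)
print(vGv, Jv_sq)
```

The final check only confirms that the backward pass is the adjoint of the forward pass; it says nothing about the cost trade-offs between storing the Jacobian and recomputing these passes.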