Neural Networks: Computation + Gradient Descent
LING572 Advanced Statistical Methods in NLP February 27 2020
Today's Outline
Computation: the forward pass
Functional form / matrix notation
Parameters and Hyperparameters
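Weights $W$:
$$W = \begin{bmatrix} w_{00} & w_{01} & \cdots & w_{0 n_0} \\ w_{10} & w_{11} & \cdots & w_{1 n_0} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n_1 0} & w_{n_1 1} & \cdots & w_{n_1 n_0} \end{bmatrix}$$
Shape: $(n_1, n_0)$
$n_0$: number of neurons in layer 0 (input)
$n_1$: number of neurons in layer 1
Input $x$: Shape: $(n_0, 1)$
Biases $b$:
$$b = \begin{bmatrix} b_1 \\ \vdots \\ b_{n_1} \end{bmatrix}$$
Shape: $(n_1, 1)$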
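The shapes above can be checked with a small sketch of one layer applied to one input. This is a minimal illustration, not the course's own code: the names `layer_forward`, `W`, `x`, `b`, the numeric values, and the choice of the logistic sigmoid as the activation are all assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)); assumed activation for this sketch."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(W, x, b, activation=sigmoid):
    """Apply one layer to one input: activation(W x + b).

    W: list of n1 rows, each with n0 weights  (shape (n1, n0))
    x: list of n0 input values                (shape (n0, 1))
    b: list of n1 biases                      (shape (n1, 1))
    Returns a list of n1 activations          (shape (n1, 1)).
    """
    assert len(W) == len(b), "W needs one row per neuron in layer 1"
    assert all(len(row) == len(x) for row in W), "W needs one column per input"
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]


# Hypothetical values: n0 = 3 inputs, n1 = 2 neurons.
W = [[0.5, -1.0, 0.0],
     [1.0, 1.0, 1.0]]
x = [1.0, 2.0, 3.0]
b = [1.5, -6.0]

a = layer_forward(W, x, b)
print(len(a), a)
```

Each output entry is one row of the weights dotted with the input, plus that neuron's bias, which is why the weights need $n_1$ rows and $n_0$ columns.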
$\sum_{i=1}^{n}$
$$\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{e^{x}}{e^{x} + 1}$$
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\sigma(2x) - 1$$
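The two pairs of equalities above can be verified numerically. The sketch below is only a sanity check under assumed helper names (`sigmoid`, `sigmoid_alt`, `tanh_from_exp`, `tanh_from_sigmoid` are illustrative, not from the slides); it compares each form at a few sample points.

```python
import math


def sigmoid(x):
    """First form: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_alt(x):
    """Second form: e^x / (e^x + 1)."""
    ex = math.exp(x)
    return ex / (ex + 1.0)


def tanh_from_exp(x):
    """Definition via exponentials."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def tanh_from_sigmoid(x):
    """Rescaled sigmoid: 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


# Compare the forms at a handful of arbitrary sample points.
for x in [-3.0, -0.5, 0.0, 0.5, 3.0]:
    assert math.isclose(sigmoid(x), sigmoid_alt(x), rel_tol=1e-12)
    assert math.isclose(tanh_from_exp(x), tanh_from_sigmoid(x),
                        rel_tol=1e-9, abs_tol=1e-12)
    assert math.isclose(tanh_from_exp(x), math.tanh(x),
                        rel_tol=1e-9, abs_tol=1e-12)
```

The second identity says that tanh is a sigmoid stretched to the range (-1, 1) and compressed horizontally by a factor of 2.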
[Figure: plot with labeled points ( 1.25,0), (1,1), (0, 5)]
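Weights $W$:
$$W = \begin{bmatrix} w_{00} & w_{01} & \cdots & w_{0 n_1} \\ w_{10} & w_{11} & \cdots & w_{1 n_1} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n_0 0} & w_{n_0 1} & \cdots & w_{n_0 n_1} \end{bmatrix}$$
Shape: $(n_0, n_1)$
$n_0$: number of neurons in layer 0 (input)
$n_1$: number of neurons in layer 1
Inputs $X$: one example per row, each with $n_0$ entries
Shape: $(\text{batch\_size}, n_0)$
Biases $b$:
$$b = \begin{bmatrix} b_1 & \cdots & b_{n_1} \end{bmatrix}$$
Shape: $(1, n_1)$
Added to each row of $XW$
The first dimension of matrices/tensors is taken to be the batch size.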
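The batched shapes can be sketched the same way. The example below is an illustration in plain Python under assumed names and values (`matmul`, `batch_linear`, `X`, `W`, `b` are hypothetical); it computes only the affine part $XW + b$, to which an activation would then be applied elementwise.

```python
def matmul(A, B):
    """Plain-Python matrix product of A (m, k) and B (k, n), giving (m, n)."""
    assert all(len(row) == len(B) for row in A), "inner dimensions must match"
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
        for row in A
    ]


def batch_linear(X, W, b):
    """Compute XW + b, adding the bias row b to every row of XW.

    X: (batch_size, n0), one example per row
    W: (n0, n1)
    b: n1 biases, treated as a (1, n1) row
    Returns a (batch_size, n1) list of lists.
    """
    assert all(len(row) == len(b) for row in W), "W needs one column per bias"
    return [[z + bj for z, bj in zip(row, b)] for row in matmul(X, W)]


# Hypothetical values: batch_size = 2, n0 = 3, n1 = 2.
X = [[1.0, 2.0, 3.0],
     [0.0, 1.0, 0.0]]
W = [[0.5, 1.0],
     [-1.0, 1.0],
     [0.0, 1.0]]
b = [1.5, -6.0]

Z = batch_linear(X, W, b)
print(len(Z), len(Z[0]))  # batch_size rows, n1 columns
```

Putting the batch on the first axis means each row of the result corresponds to one input example, so the same bias row is simply added once per example.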
Bergstra and Bengio 2012