Deep networks
CS 446
The ERM perspective
These lectures will follow an ERM perspective on deep networks:
Pick a model/predictor class (network architecture). (We will spend most of our time on this!)
Pick a loss/risk. (We will almost […]
[…]$^{\top} \begin{bmatrix} x \\ 1 \end{bmatrix}$, […]$^{\top}_{1:d} = W_L \cdots W_1$,
[…]$^{\top} x)$,
[Figure: plot with axis ticks 2, 4, 6 and 0.2, 0.4, 0.6, 0.8, 1.]
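$(W_i)_{i=1}^L$ with $W_i \in \mathbb{R}^{d_{i-1} \times d_i}$ are the weights, and $(b_i)_{i=1}^L$ are the biases.
$(\sigma_i)_{i=1}^L$ with $\sigma_i : \mathbb{R}^{d_i} \to \mathbb{R}^{d_i}$ are called nonlinearities, or activations, or […]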
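To make the roles of the weights, biases, and activations concrete, here is a minimal sketch of a forward pass. It assumes each layer computes $\sigma_i(W_i^{\top} h + b_i)$, which matches the shape $W_i \in \mathbb{R}^{d_{i-1} \times d_i}$ but is not confirmed by the surviving slide text. It also assumes the activations act coordinate-wise. The names `layer` and `forward` and the toy sizes are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)) for a single real number."""
    return 1.0 / (1.0 + math.exp(-z))

def identity(z):
    """Trivial activation, used here on the output layer."""
    return z

def layer(h, W, b, sigma):
    """Compute sigma(W^T h + b) coordinate-wise (assumed layer form).

    W is a nested list with d_in rows and d_out columns, so W^T h
    maps a vector of length d_in to a vector of length d_out.
    """
    d_in, d_out = len(W), len(b)
    return [
        sigma(sum(W[j][k] * h[j] for j in range(d_in)) + b[k])
        for k in range(d_out)
    ]

def forward(x, weights, biases, activations):
    """Apply the L layers in order, feeding each output to the next layer."""
    h = x
    for W, b, sigma in zip(weights, biases, activations):
        h = layer(h, W, b, sigma)
    return h

# Hypothetical toy network: d = 2, d_1 = 2, d_2 = 1.
weights = [[[1.0, 0.0], [0.0, 1.0]], [[1.0], [1.0]]]
biases = [[0.0, 0.0], [0.0]]
activations = [sigmoid, identity]

out = forward([0.0, 0.0], weights, biases, activations)
print(out)  # [1.0], since sigmoid(0) = 0.5 in both hidden coordinates
```

The nested-list representation keeps the sketch dependency-free. An array library would express the same computation with matrix products.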
$\frac{1}{1+\exp(-z)}$.
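As a small illustration of this formula, the sketch below evaluates $1/(1+\exp(-z))$ for a real input. The branch on the sign of $z$ is an implementation choice added here, not something from the slides: it avoids the overflow that `math.exp` raises when the direct formula is applied to very negative inputs.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), arranged to avoid overflow."""
    if z >= 0:
        # exp(-z) <= 1 here, so the direct formula is safe.
        return 1.0 / (1.0 + math.exp(-z))
    # For z < 0, exp(-z) may overflow; use the equal form exp(z) / (1 + exp(z)).
    e = math.exp(z)
    return e / (1.0 + e)

print(sigmoid(0.0))  # 0.5
```

Both branches compute the same function, so `sigmoid(z) + sigmoid(-z)` stays equal to 1 up to rounding.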
$(W_i, b_i)_{i=1}^L$, the weights and biases, are the parameters.
[…]$_{i=1}$,
[…] $W_1 \in \mathbb{R}^{d \times d_1},\ b_1 \in \mathbb{R}^{d_1},\ \ldots,\ W_L \in \mathbb{R}^{d_{L-1} \times d_L},\ b_L \in \mathbb{R}^{d_L}$ […]
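The ERM recipe (pick a predictor class, pick a loss, minimize the average loss over the sample) can be sketched as follows. This is an illustration under stated assumptions, not the formulation from the slides: the squared loss, the one-parameter predictor standing in for a network, the toy data, and the search over a small candidate list are all hypothetical choices. In practice the minimization over weights and biases would typically use a gradient-based method.

```python
def empirical_risk(predict, loss, data):
    """Average loss (1/n) * sum_i loss(predict(x_i), y_i) over n examples."""
    n = len(data)
    return sum(loss(predict(x), y) for x, y in data) / n

def squared_loss(y_hat, y):
    """Assumed loss for this sketch; any other loss could be plugged in."""
    return (y_hat - y) ** 2

def make_predictor(w):
    """Hypothetical one-parameter predictor x -> w * x standing in for a network."""
    return lambda x: w * x

# Toy sample generated by y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# ERM over a tiny candidate set: keep the parameter with the smallest empirical risk.
candidates = [0.0, 1.0, 2.0, 3.0]
risks = {w: empirical_risk(make_predictor(w), squared_loss, data) for w in candidates}
best_w = min(risks, key=risks.get)
print(best_w, risks[best_w])  # 2.0 0.0
```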
[Figure: contour plots; both axes ticked 1.00, 0.75, 0.50, 0.25, 0.00, 0.25, 0.50, 0.75, 1.00 (signs not preserved in extraction); contour labels include 0.000, 1.500, 3.000, 4.500 and 0.000, 8.000, 16.000.]
[…]$_{x \in [0,1]^d}$ […]
[Figure: two groups of labels 1, 2, k.]