An Accelerated Variance Reducing Stochastic Method with Douglas-Rachford Splitting
Jingchang Liu November 12, 2018
University of Science and Technology of China
Table of Contents: Background; Moreau Envelope and Douglas-Rachford (DR)
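Problem formulation:
$$\min_{x \in \mathbb{R}^d} f(x) + h(x) := \frac{1}{n} \sum_{i=1}^{n} f_i(x) + h(x)$$
Proximal operator:
$$\mathrm{prox}_{\gamma f}(x) = \arg\min_{y \in \mathbb{R}^d} \left\{ f(y) + \frac{1}{2\gamma} \|y - x\|^2 \right\}$$
Gradient of the Moreau envelope:
$$\nabla f^{\gamma}(x) = \frac{1}{\gamma} \left( x - \mathrm{prox}_{\gamma f}(x) \right)$$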
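$\mu$-strong convexity: $f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{\mu}{2} \|y - x\|^2$.
$L$-smoothness: $f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2} \|y - x\|^2$.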
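As a concrete illustration of the proximal operator and of the identity that expresses the Moreau-envelope gradient as $(x - \mathrm{prox}_{\gamma f}(x))/\gamma$, here is a minimal sketch for the scalar function $f(y) = |y|$, whose prox is soft-thresholding. The function names and the test point are illustrative assumptions, not part of the slides.

```python
def prox_abs(x, gamma):
    """Prox of f(y) = |y| with parameter gamma (soft-thresholding)."""
    if x > gamma:
        return x - gamma
    if x < -gamma:
        return x + gamma
    return 0.0


def moreau_env_abs(x, gamma):
    """Moreau envelope min_y |y| + (y - x)^2 / (2 gamma), evaluated at the prox."""
    p = prox_abs(x, gamma)
    return abs(p) + (p - x) ** 2 / (2.0 * gamma)


def moreau_grad_abs(x, gamma):
    """Envelope gradient computed through the identity (x - prox(x)) / gamma."""
    return (x - prox_abs(x, gamma)) / gamma


if __name__ == "__main__":
    gamma, x, eps = 0.5, 1.3, 1e-6
    # Central finite difference of the envelope, for comparison with the identity.
    fd = (moreau_env_abs(x + eps, gamma) - moreau_env_abs(x - eps, gamma)) / (2 * eps)
    print(moreau_grad_abs(x, gamma), fd)
```

The two printed numbers should agree up to finite-difference error, which is a quick numerical check of the identity on this one example rather than a proof.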
Proximal gradient step: $x^{k+1} = \mathrm{prox}_{\gamma h}(x^k - \gamma \cdot g^k)$, where $g^k$ can be obtained from:
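Moreau envelope and its gradient:
$$f^{\gamma}(x) = \min_{y} \left\{ f(y) + \frac{1}{2\gamma} \|x - y\|^2 \right\}, \qquad \nabla f^{\gamma}(x) = \left( x - \mathrm{prox}_{\gamma f}(x) \right) / \gamma.$$
Proximal point step as a gradient step on the envelope:
$$x^{k+1} = \mathrm{prox}_{\gamma f}(x^k) = x^k - \gamma \nabla f^{\gamma}(x^k).$$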
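Point-SAGA, for the problem
$$\min_{x \in \mathbb{R}^d} f(x) := \frac{1}{n} \sum_{i=1}^{n} f_i(x):$$
$$z_j^k = x^k + \gamma \left( g_j^k - \sum_{i=1}^{n} g_i^k / n \right),$$
$$x^{k+1} = \mathrm{prox}_{\gamma f_j}(z_j^k),$$
$$g_j^{k+1} = (z_j^k - x^{k+1}) / \gamma.$$
Substituting the first and third lines into each other gives the equivalent form
$$x^{k+1} = x^k - \gamma \left( g_j^{k+1} - g_j^k + \sum_{i=1}^{n} g_i^k / n \right),$$
where $g_j^{k+1} \in \partial f_j(x^{k+1})$ by the optimality condition of the proximal step.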
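A Point-SAGA-style loop (form $z_j$ from the stored gradient table, take a proximal step on $f_j$, then refresh the stored gradient for $j$) can be sketched on a toy problem. This is a minimal sketch, assuming scalar least-squares losses $f_i(x) = \frac{1}{2}(a_i x - b_i)^2$ so that the prox has a closed form; the function names, the data, and the step size are illustrative assumptions and do not come from the slides.

```python
import random


def prox_ls(z, a, b, gamma):
    """Closed-form prox of f(y) = 0.5 * (a*y - b)^2 with parameter gamma."""
    return (z + gamma * a * b) / (1.0 + gamma * a * a)


def point_saga(a, b, gamma, iters, seed=0):
    """Run a Point-SAGA-style loop on min_x (1/n) sum_i 0.5 * (a_i x - b_i)^2."""
    rng = random.Random(seed)
    n = len(a)
    x = 0.0
    g = [0.0] * n   # stored gradient table g_i
    g_avg = 0.0     # running average of the table
    for _ in range(iters):
        j = rng.randrange(n)
        z = x + gamma * (g[j] - g_avg)        # z_j = x + gamma * (g_j - average)
        x_new = prox_ls(z, a[j], b[j], gamma)  # proximal step on f_j
        g_new = (z - x_new) / gamma            # implicit gradient of f_j at x_new
        g_avg += (g_new - g[j]) / n            # keep the average in sync
        g[j] = g_new
        x = x_new
    return x


if __name__ == "__main__":
    # With a_i = 1 the minimizer is the mean of b, here 1.0.
    print(point_saga([1.0, 1.0, 1.0], [0.0, 1.0, 2.0], gamma=1.0 / 3.0, iters=500))
```

Keeping a running average of the table avoids an O(n) sum per iteration, which is the usual bookkeeping choice for SAGA-type methods.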
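Douglas-Rachford (DR) splitting for
$$\min_{x \in \mathbb{R}^d} f(x) + h(x),$$
iterates
$$y^{k+1} = y^k - x^k + \mathrm{prox}_{\gamma f}(2x^k - y^k),$$
$$x^{k+1} = \mathrm{prox}_{\gamma h}(y^{k+1}).$$
For the DR operator
$$T(y) = y + \mathrm{prox}_{\gamma h}\left( 2\,\mathrm{prox}_{\gamma f}(y) - y \right) - \mathrm{prox}_{\gamma f}(y),$$
a fixed point $y = T(y)$ is such that $\mathrm{prox}_{\gamma f}(y)$ satisfies
$$0 \in \partial f(\mathrm{prox}_{\gamma f}(y)) + \partial h(\mathrm{prox}_{\gamma f}(y)).$$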
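A Douglas-Rachford iteration of the form $y \leftarrow y - x + \mathrm{prox}_{\gamma f}(2x - y)$, $x \leftarrow \mathrm{prox}_{\gamma h}(y)$ can be tried on a one-dimensional problem with a known answer. The following is a minimal sketch, assuming $f(x) = \frac{1}{2}(x - a)^2$ and $h(x) = \lambda |x|$, whose minimizer is the soft-thresholding of $a$ at level $\lambda$; all names and parameter values are illustrative assumptions.

```python
def soft_threshold(v, t):
    """Prox of t * |.| evaluated at v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def dr_lasso_1d(a, lam, gamma=1.0, iters=100):
    """Douglas-Rachford sketch for min_x 0.5 * (x - a)^2 + lam * |x| (scalar x)."""
    y = 0.0
    x = soft_threshold(y, gamma * lam)            # x = prox_{gamma h}(y)
    for _ in range(iters):
        v = 2.0 * x - y
        prox_f = (v + gamma * a) / (1.0 + gamma)  # prox_{gamma f}(v), closed form
        y = y - x + prox_f                        # y update
        x = soft_threshold(y, gamma * lam)        # x update
    return x


if __name__ == "__main__":
    print(dr_lasso_1d(3.0, 1.0), soft_threshold(3.0, 1.0))
```

On this example the returned iterate approaches the closed-form minimizer $a - \lambda = 2$, and for $|a| \le \lambda$ it stays at 0.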
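[Prox2-SAGA iteration; only fragments of the slide's equations survive. Recoverable terms: the stored gradients $g_j^k$ and their average $\frac{1}{n} \sum_{i} g_i^k$; $\mathrm{prox}_{\gamma h}(y^k)$; and the difference $(z_j^k + x^k - y^k) - \mathrm{prox}_{\gamma f_j}(z_j^k + x^k - y^k)$.]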
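Point-SAGA:
$$z_j^k = x^k + \gamma \left( g_j^k - \frac{1}{n} \sum_{i=1}^{n} g_i^k \right),$$
$$x^{k+1} = \mathrm{prox}_{\gamma f_j}(z_j^k),$$
$$g_j^{k+1} = \frac{1}{\gamma} (z_j^k - x^{k+1}).$$
Douglas-Rachford splitting: when $g_j^k = \sum_{i=1}^{n} g_i^k / n$ in Prox2-SAGA,
$$y^{k+1} = y^k - x^k + \mathrm{prox}_{\gamma f}(2x^k - y^k),$$
$$x^{k+1} = \mathrm{prox}_{\gamma h}(y^{k+1}).$$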
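Suppose $(y^{\infty}, \{g_i^{\infty}\}_{i=1,\dots,n})$ is the fixed point of the Prox2-SAGA iteration. Then $x^{\infty} = \mathrm{prox}_{\gamma h}(y^{\infty})$ is a minimizer of $f + h$.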
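[Proof: the equations on this slide, which involve $z_i^{\infty}$, $x^{\infty} = \mathrm{prox}_{\gamma h}(y^{\infty})$, and averages $\frac{1}{n} \sum_{i}$ over the stored gradients, are not recoverable.]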
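Define $\bar{g}_j^k = \frac{1}{k} \sum_{t=1}^{k} g_j^t$. Then for Prox2-SAGA with step size $\gamma \le 1/L$, at any iteration $k$ a bound on $\bar{g}_j^k - g_j^*$ holds in terms of the differences $g_i - g_i^*$; the exact expression is not recoverable from the slide.
With the step size
$$\gamma = \min \left\{ \frac{1}{\mu n}, \; \frac{\sqrt{9L^2 + 3\mu L} - 3L}{2\mu L} \right\},$$
a second bound is stated, again in terms of $g_i - g_i^*$; its exact expression is likewise not recoverable.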
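The step-size fragments above appear to combine $1/(\mu n)$ with an expression in $9L^2 + 3\mu L$, $3L$, and $2\mu L$. The following is a minimal sketch that evaluates one reading of that rule, $\gamma = \min\{1/(\mu n), (\sqrt{9L^2 + 3\mu L} - 3L)/(2\mu L)\}$. The square root, the `min`, and the function name are assumptions made to obtain a well-formed expression, and $\mu$, $L$, $n$ are taken to be the strong-convexity constant, the smoothness constant, and the number of loss functions.

```python
import math


def prox2_saga_step_size(mu, L, n):
    """Evaluate the ASSUMED rule min{1/(mu*n), (sqrt(9L^2 + 3*mu*L) - 3L) / (2*mu*L)}."""
    if mu <= 0 or L <= 0 or n <= 0:
        raise ValueError("mu, L and n must all be positive")
    first = 1.0 / (mu * n)
    second = (math.sqrt(9.0 * L * L + 3.0 * mu * L) - 3.0 * L) / (2.0 * mu * L)
    return min(first, second)


if __name__ == "__main__":
    # Hypothetical constants, chosen only to exercise both branches of the min.
    print(prox2_saga_step_size(mu=1.0, L=1.0, n=3))
    print(prox2_saga_step_size(mu=1.0, L=1.0, n=100))
```

Under this reading the second term never exceeds $1/(4L)$, so the rule leaves only the constants $\mu$, $L$, and $n$ to supply.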
[Figure: four panels (svmguide3, rcv1, covtype, ijcnn1) plotting convergence on a log-scale vertical axis against epoch, comparing Prox2-SAGA, Prox-SAGA, Prox-SDCA, and Prox-SGD.]