Empirical Loss Minimization Traffic sign - STOP Sample - - PowerPoint PPT Presentation



SLIDE 1
SLIDE 2
SLIDE 3

Empirical Loss Minimization

SLIDE 4

Traffic sign - STOP

SLIDE 5
SLIDE 6

Sample i.i.d. points

SLIDE 7

Stochastic Gradient Descent
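
As an illustration of the idea, the sketch below runs plain SGD on an empirical loss F(w) = (1/n) * sum_i f_i(w). It is a minimal example under stated assumptions, not the presenters' code: the per-sample loss f_i(w) = 0.5 * (w.x_i - y_i)^2, the toy data, the step size, and the names `grad_i` and `sgd` are all hypothetical choices made for this example.

```python
import random

def grad_i(w, x, y):
    """Gradient of the assumed per-sample loss f_i(w) = 0.5 * (w.x - y)^2."""
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]

def sgd(data, w0, step=0.05, epochs=200, seed=0):
    """Plain SGD: each update uses the gradient of one uniformly sampled point."""
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(epochs * len(data)):
        x, y = data[rng.randrange(len(data))]
        g = grad_i(w, x, y)
        w = [wj - step * gj for wj, gj in zip(w, g)]
    return w

# Toy noiseless data generated from w* = [2.0, -1.0] (hypothetical example).
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([1.0, -1.0], 3.0)]
w_hat = sgd(data, [0.0, 0.0])
print(w_hat)
```

Because the toy data are noiseless, every per-sample gradient vanishes at w*, so a constant step size is enough here. With noisy data a constant step would only reach a neighborhood of the minimizer, which is the motivation for the variance-reduced methods that follow.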

SLIDE 8
SLIDE 9
SLIDE 10
  • Léon Bottou, Frank E. Curtis, Jorge Nocedal: Optimization Methods for Large-Scale Machine Learning

SLIDE 11

SVRG: Stochastic Variance Reduced Gradient

SLIDE 12
  • Unbiased stochastic gradient:
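
For reference, the standard SVRG estimator is v = grad f_i(w) - grad f_i(w_snap) + grad F(w_snap), where w_snap is the snapshot point and i is sampled uniformly. Averaging v over all i gives exactly grad F(w), which is what "unbiased" means here. The sketch below checks this numerically and runs a small SVRG loop. The least-squares loss, the toy data, the step size, the use of the last inner iterate as the next snapshot, and all function names are assumptions made for illustration.

```python
import random

def grad_i(w, sample):
    """Gradient of the assumed per-sample loss 0.5 * (w.x - y)^2."""
    x, y = sample
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]

def full_grad(w, data):
    """Average of all per-sample gradients, i.e. the gradient of F."""
    grads = [grad_i(w, s) for s in data]
    return [sum(col) / len(data) for col in zip(*grads)]

def svrg_estimate(w, w_snap, mu, sample):
    """v = grad f_i(w) - grad f_i(w_snap) + mu, with mu = grad F(w_snap)."""
    g_now = grad_i(w, sample)
    g_snap = grad_i(w_snap, sample)
    return [a - b + m for a, b, m in zip(g_now, g_snap, mu)]

def svrg(data, w0, step=0.1, outer=50, inner=40, seed=0):
    rng = random.Random(seed)
    w_snap = list(w0)
    for _ in range(outer):
        mu = full_grad(w_snap, data)  # full gradient at every restart
        w = list(w_snap)
        for _ in range(inner):
            v = svrg_estimate(w, w_snap, mu, rng.choice(data))
            w = [wj - step * vj for wj, vj in zip(w, v)]
        w_snap = w  # assumption: last inner iterate becomes the snapshot
    return w_snap

# Toy noiseless data generated from w* = [2.0, -1.0] (hypothetical example).
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([1.0, -1.0], 3.0)]

# Unbiasedness check: the mean of v over all samples equals grad F(w).
w_snap, w = [0.0, 0.0], [0.5, -0.3]
mu = full_grad(w_snap, data)
estimates = [svrg_estimate(w, w_snap, mu, s) for s in data]
mean_v = [sum(col) / len(data) for col in zip(*estimates)]
exact = full_grad(w, data)

w_svrg = svrg(data, [0.0, 0.0])
print(mean_v, exact, w_svrg)
```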
SLIDE 13
SLIDE 14
SLIDE 15

SAG/SAGA
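
SAG and SAGA avoid periodic full-gradient passes by storing the most recent gradient of every sample. A minimal SAGA sketch follows, using the unbiased estimator v = grad f_j(w) - table[j] + mean(table). SAG uses the same table but steps along the updated table average, which is a biased estimate. The least-squares loss, the toy data, the step size, the table initialization at w0, and the names `grad_i` and `saga` are assumptions for this example.

```python
import random

def grad_i(w, sample):
    """Gradient of the assumed per-sample loss 0.5 * (w.x - y)^2."""
    x, y = sample
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]

def saga(data, w0, step=0.1, iters=2000, seed=0):
    rng = random.Random(seed)
    n = len(data)
    w = list(w0)
    # One stored gradient per sample; initialized at w0 (an assumption).
    table = [grad_i(w, s) for s in data]
    avg = [sum(col) / n for col in zip(*table)]
    for _ in range(iters):
        j = rng.randrange(n)
        g = grad_i(w, data[j])
        # SAGA estimator: fresh gradient, minus stored one, plus table mean.
        v = [gj - tj + aj for gj, tj, aj in zip(g, table[j], avg)]
        w = [wj - step * vj for wj, vj in zip(w, v)]
        # Keep the running mean consistent with the table before overwriting.
        avg = [aj + (gj - tj) / n for aj, gj, tj in zip(avg, g, table[j])]
        table[j] = g
    return w

# Toy noiseless data generated from w* = [2.0, -1.0] (hypothetical example).
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([1.0, -1.0], 3.0)]
w_saga = saga(data, [0.0, 0.0])
print(w_saga)
```

The price of skipping full-gradient restarts is memory: the table holds one gradient per sample.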

SLIDE 16
SLIDE 17
SLIDE 18

SARAH
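
SARAH replaces the fixed snapshot of SVRG with a recursive estimator: v_0 = grad F(w_0), and then v_t = grad f_i(w_t) - grad f_i(w_{t-1}) + v_{t-1}. A minimal sketch under stated assumptions: the least-squares loss, the toy data, the step size, and the function names are illustrative, and the outer loop restarts from the last inner iterate, which is a simplification made for this example.

```python
import random

def grad_i(w, sample):
    """Gradient of the assumed per-sample loss 0.5 * (w.x - y)^2."""
    x, y = sample
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]

def full_grad(w, data):
    """Average of all per-sample gradients, i.e. the gradient of F."""
    grads = [grad_i(w, s) for s in data]
    return [sum(col) / len(data) for col in zip(*grads)]

def sarah(data, w0, step=0.1, outer=50, inner=40, seed=0):
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(outer):
        v = full_grad(w, data)  # v_0: full gradient at every restart
        w_prev, w = w, [wj - step * vj for wj, vj in zip(w, v)]
        for _ in range(inner - 1):
            s = rng.choice(data)
            g_now, g_prev = grad_i(w, s), grad_i(w_prev, s)
            # Recursive update: the previous estimate v is reused, not a snapshot.
            v = [a - b + c for a, b, c in zip(g_now, g_prev, v)]
            w_prev, w = w, [wj - step * vj for wj, vj in zip(w, v)]
    return w

# Toy noiseless data generated from w* = [2.0, -1.0] (hypothetical example).
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([1.0, -1.0], 3.0)]
w_sarah = sarah(data, [0.0, 0.0])
print(w_sarah)
```

Unlike the SVRG estimator, v_t is not an unbiased estimate of grad F(w_t) given the current iterate; only its total expectation matches the expected full gradient.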

SLIDE 19
SLIDE 20
SLIDE 21
SLIDE 22
SLIDE 23

RCV Dataset

  • SVRG and SARAH need a full gradient after each restart
  • The variance of SARAH goes to zero
  • The variance of SVRG decreases after each restart

SLIDE 24
SLIDE 25

SARAH+ Practical Variant
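
The idea behind SARAH+ is to choose the inner-loop length automatically: the inner loop stops once the squared norm of the recursive estimate v_t has dropped below gamma times the squared norm of v_0, or a maximum length is reached. The sketch below illustrates that stopping rule. The value gamma = 0.125, the cap `max_inner`, the least-squares loss, the toy data, and all names are assumptions chosen for this example.

```python
import random

def grad_i(w, sample):
    """Gradient of the assumed per-sample loss 0.5 * (w.x - y)^2."""
    x, y = sample
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]

def full_grad(w, data):
    grads = [grad_i(w, s) for s in data]
    return [sum(col) / len(data) for col in zip(*grads)]

def sq_norm(v):
    return sum(vj * vj for vj in v)

def sarah_plus(data, w0, step=0.1, gamma=0.125, outer=100, max_inner=200, seed=0):
    rng = random.Random(seed)
    w = list(w0)
    inner_lengths = []
    for _ in range(outer):
        v = full_grad(w, data)
        v0_sq = sq_norm(v)
        w_prev, w = w, [wj - step * vj for wj, vj in zip(w, v)]
        t = 1
        # Adaptive stop: leave the inner loop once ||v_t||^2 <= gamma * ||v_0||^2.
        while t < max_inner and sq_norm(v) > gamma * v0_sq:
            s = rng.choice(data)
            g_now, g_prev = grad_i(w, s), grad_i(w_prev, s)
            v = [a - b + c for a, b, c in zip(g_now, g_prev, v)]
            w_prev, w = w, [wj - step * vj for wj, vj in zip(w, v)]
            t += 1
        inner_lengths.append(t)
    return w, inner_lengths

# Toy noiseless data generated from w* = [2.0, -1.0] (hypothetical example).
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([1.0, -1.0], 3.0)]
w_plus, inner_lengths = sarah_plus(data, [0.0, 0.0])
print(w_plus, inner_lengths[:5])
```

The returned `inner_lengths` list shows how many inner steps each outer iteration actually took, so the inner-loop length no longer has to be tuned by hand.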

SLIDE 26

good performance across many datasets

SLIDE 27

Numerical Experiments

SLIDE 28

One has to tune parameters to get good performance! Not so for SARAH+!

SLIDE 29
SLIDE 30
SLIDE 31
SLIDE 32

Summary

SLIDE 33
SLIDE 34

Convex Case

SLIDE 35
SLIDE 36

Non-Convex Case

SLIDE 37
SLIDE 38
SLIDE 39

Any Questions?