Learning Deep Kernels for Exponential Family Densities

SLIDE 1

Learning Deep Kernels for Exponential Family Densities

Li K. Wenliang, D. J. Sutherland, H. Strathmann, A. Gretton

Gatsby unit, University College London
Poster #221

SLIDE 6

Kernel exponential families

  • Classic exponential family:
  • Gaussian:
  • Fit depends only on (and )
  • Kernel exponential family:
  • Reproducing property:
  • So
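For reference, the quantities named on this slide are usually written as follows in the kernel exponential family literature. This is a sketch in assumed standard notation: the symbols $T$, $A$, $q_0$, $\mathcal{H}$ and $k$ are not necessarily the presenters' exact choices.

```latex
\begin{align*}
\text{Classic exponential family:}\quad
  & p_\theta(x) = q_0(x)\,\exp\bigl(\langle \theta, T(x) \rangle - A(\theta)\bigr) \\
\text{Gaussian:}\quad
  & T(x) = (x,\; x^2) \\
\text{Kernel exponential family:}\quad
  & p_f(x) = q_0(x)\,\exp\bigl(\langle f, k(x, \cdot) \rangle_{\mathcal{H}} - A(f)\bigr),
    \qquad f \in \mathcal{H} \\
\text{Reproducing property:}\quad
  & \langle f, k(x, \cdot) \rangle_{\mathcal{H}} = f(x) \\
\text{So:}\quad
  & p_f(x) = q_0(x)\,\exp\bigl(f(x) - A(f)\bigr)
\end{align*}
```

Here $A$ is the log-normalizer and $q_0$ a base measure. In the classic family, the maximum likelihood fit depends on the data only through the empirical mean of $T(x)$; the kernel version replaces the finite-dimensional $\theta$ with a function $f$ in the reproducing kernel Hilbert space $\mathcal{H}$.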

SLIDE 12

Why kernel exponential families

  • Any density with
  • Much richer class; e.g. with a Gaussian kernel, dense in all continuous distributions on compact domains
  • Fit with score matching
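Score matching fits an unnormalized model by minimizing the average of the Laplacian of the log-density plus half the squared gradient of the log-density, so the normalizer never has to be computed. The block below is a minimal sketch on a toy model chosen for illustration, a one-dimensional zero-mean Gaussian with unknown precision `lam`, not the kernel model from the talk. The function and variable names are hypothetical.

```python
import random


def score_matching_loss(lam, xs):
    """Empirical score matching loss for log p~(x) = -0.5 * lam * x**2."""
    total = 0.0
    for x in xs:
        grad = -lam * x   # first derivative of the log-density
        lap = -lam        # second derivative of the log-density
        total += lap + 0.5 * grad ** 2
    return total / len(xs)


random.seed(0)
xs = [random.gauss(0.0, 2.0) for _ in range(2000)]

# Grid search over the precision; no normalizing constant is needed.
grid = [0.05 + 0.001 * i for i in range(551)]
lam_grid = min(grid, key=lambda lam: score_matching_loss(lam, xs))

# For this toy model the minimizer is available in closed form.
second_moment = sum(x * x for x in xs) / len(xs)
lam_closed = 1.0 / second_moment

print(lam_grid, lam_closed)
```

The grid minimizer and the closed-form value agree up to the grid spacing, and both land near the true precision of 0.25.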

SLIDE 15

Choosing a kernel with meta-learning

  • Fit quality depends a lot on kernel choice
  • Also on the regularization weight
  • Need to fit these parameters
  • … but need to use held-out data to avoid trivially overfitting
  • Meta-learning: take the gradient of the whole fit on a minibatch
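The idea on this slide can be sketched on a toy problem: fit the model in closed form on one batch, score that fit on held-out data, and differentiate the held-out loss through the fit with respect to a hyperparameter. The example below assumes a one-dimensional Gaussian precision model with a ridge penalty of weight `alpha`, and tunes only `alpha`. It illustrates the pattern and is not the procedure from the talk; all names are hypothetical.

```python
import math
import random


def second_moment(xs):
    return sum(x * x for x in xs) / len(xs)


def fit_precision(train, alpha):
    """Closed-form minimizer of score matching loss + 0.5 * alpha * lam**2."""
    return 1.0 / (second_moment(train) + alpha)


def heldout_loss(lam, val):
    """Score matching loss of log p~(x) = -0.5 * lam * x**2 on held-out data."""
    return -lam + 0.5 * lam ** 2 * second_moment(val)


def meta_gradient(log_alpha, train, val):
    """Derivative of the held-out loss w.r.t. log(alpha), through the fit."""
    alpha = math.exp(log_alpha)
    lam = fit_precision(train, alpha)
    dloss_dlam = -1.0 + lam * second_moment(val)
    dlam_dalpha = -lam ** 2
    return dloss_dlam * dlam_dalpha * alpha  # d(alpha)/d(log_alpha) = alpha


random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
train, val = data[:100], data[100:]  # fit on one half, evaluate on the other

log_alpha = 0.0
loss_before = heldout_loss(fit_precision(train, math.exp(log_alpha)), val)
for _ in range(200):
    log_alpha -= 0.5 * meta_gradient(log_alpha, train, val)
loss_after = heldout_loss(fit_precision(train, math.exp(log_alpha)), val)

print(loss_before, loss_after)
```

Optimizing `log_alpha` rather than `alpha` keeps the regularization weight positive. Evaluating on `val` rather than `train` matters: on the training batch alone, the loss is always minimized by driving `alpha` to zero.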

SLIDE 18

Deep kernels

  • Simple kernels, e.g. Gaussian, aren't enough
  • But we can learn lots of parameters with gradient descent: a neural net feature map combined with something simple

“Combining a deep architecture with a kernel machine that takes the higher-level learned representation as input can be quite powerful.”
(Y. Bengio & Y. LeCun, “Scaling Learning Algorithms towards AI”, 2007)
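A common way to build such a kernel is to apply a simple kernel to the outputs of a neural network. The sketch below assumes a Gaussian kernel on top of a tiny two-layer network with fixed, made-up weights; the architecture, the weights, and the names `phi` and `deep_kernel` are illustrative assumptions, not the model from the talk. In practice the weights would be learned by gradient descent.

```python
import math

# Hypothetical fixed weights for a tiny 2 -> 3 -> 2 network.
W1 = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8]]
B1 = [0.1, -0.2, 0.0]
W2 = [[1.0, -0.5, 0.25], [0.3, 0.9, -1.2]]
B2 = [0.0, 0.1]


def affine(W, b, v):
    """Return W @ v + b for list-of-lists W and list vectors b, v."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def phi(x):
    """Feature map: one tanh hidden layer followed by a linear layer."""
    hidden = [math.tanh(h) for h in affine(W1, B1, x)]
    return affine(W2, B2, hidden)


def gaussian_kernel(u, v, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def deep_kernel(x, y, sigma=1.0):
    """Simple kernel applied to learned features of both inputs."""
    return gaussian_kernel(phi(x), phi(y), sigma)


x, y = [0.2, -1.0], [1.5, 0.4]
print(deep_kernel(x, y), deep_kernel(y, x), deep_kernel(x, x))
```

Because the same feature map is applied to both arguments, the result is still a valid positive semi-definite kernel: it stays symmetric and equals 1 when both inputs coincide.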

SLIDE 21

Results

  • Learns local dataset geometry: better fits
  • On real data: slightly worse likelihoods, maybe better “shapes” than deep likelihood models