Upscaling Beyond Super–Resolution Using a Novel Deep–Learning System
Pablo Navarrete Michelini
pnavarre@boe.com.cn
Hanwen Liu
lhw@boe.com.cn BOE Technology Group Co., Ltd.
For example, a simple linear interpolation can be done with

F = [ 1/4  1/2  1/4
      1/2   1   1/2
      1/4  1/2  1/4 ]
An efficient implementation avoids multiplying by zeros: break F into several filters W_i.
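As a sketch of this polyphase idea (in 1D for clarity, with a made-up signal): zero-stuffing the input and convolving once with the interpolation kernel gives the same result as applying one small sub-filter per output phase, with no multiplications by zero.

```python
import numpy as np

# Hypothetical 1D illustration of splitting an interpolation filter
# into per-phase sub-filters W_i (the efficient implementation).
x = np.array([1.0, 4.0, 2.0, 8.0])
f = np.array([0.5, 1.0, 0.5])          # 1D linear-interpolation kernel

# Naive: insert zeros between samples, then convolve with f.
up = np.zeros(2 * len(x))
up[::2] = x
naive = np.convolve(up, f, mode="same")

# Efficient: one sub-filter per output phase (zero-padded at the border).
phase0 = x                              # W_0 = [1]
phase1 = 0.5 * (x + np.append(x[1:], 0.0))   # W_1 = [1/2, 1/2]
fast = np.empty_like(naive)
fast[::2], fast[1::2] = phase0, phase1
```

Both paths produce the same upscaled signal; the second touches only nonzero samples.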
Classic Upscalers: Nearest Neighbor, Linear, Bicubic, Lanczos, . . . Advanced Upscalers: Directional filters (NEDI), wavelets, . . .
(a) Original (b) Nearest Neighbor (c) Bicubic Figure: Classic Upscalers
SRCNN
Dong C., et al., “Learning a Deep Convolutional Network for Image Super–Resolution.” Sept 2014.
BOE MuxOut
Navarrete P., et al., “Upscaling with Deep Convolutional Networks and Muxout Layers.” May 2016.
Google RAISR
Romano Y., et al., “RAISR: Rapid and Accurate Image Super Resolution.” Jun 2016.
Twitter ESPCN
Shi W., et al., “Real–Time Single Image and Video Super–Resolution Using an Efficient Sub–Pixel Convolutional Neural Network.” Sept 2016.
Twitter GAN
Ledig C., et al., “Photo-Realistic Single Image Super–Resolution Using a Generative Adversarial Network.” Sept 2016.
Twitter GAN
Sønderby C.K., et al., “Amortised MAP Inference for Image Super-resolution.” Oct 2016.
Twitter ESPCN:
“Sub–pixel convolution layer” = “MuxOut r × r”. Differences:
MuxOut considers several groups of r² features. MuxOut is designed to factorize r and be used as several layers within the network.
Figure from: Shi W., et al., “Real–Time Single Image and Video Super–Resolution Using an Efficient Sub–Pixel Convolutional Neural Network.” Sept 2016.
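The sub-pixel rearrangement itself, taking r² feature channels to an r× larger image, fits in a few lines; the function name and channel ordering below are assumptions for illustration, not the papers' code.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (r*r*C, H, W) features into (C, r*H, r*W) pixels,
    as in ESPCN's sub-pixel layer / one MuxOut r x r stage."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)      # split the channel axis into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)    # interleave: (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# 4 = 2x2 feature channels become one 2x-upscaled output channel.
x = np.arange(16.0).reshape(4, 2, 2)
y = pixel_shuffle(x, 2)
```

Each low-resolution pixel contributes an r × r block of the output, one value per feature group.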
Google RAISR:
Uses an ML approach to learn adaptive filters. Not based on convolutional networks.
Similarity:
We will show how to interpret the convolutional–network approach as an adaptive filter.
Figure from: Romano Y., “RAISR: Rapid and Accurate Image Super Resolution.” Jun 2016.
Problems of MuxOut:
Reduces the number of processing features. Works very well only with easy content (e.g. text). Why? Filter parameters W have two tasks: Downsampling: which combination of a–b–c–d works better? Filtering: which values work better for interpolation?
New Version:
Considers all (or most) possible combinations of features. Can keep the same number of processing features. Filter parameters can focus on interpolation. SGD algorithms converge quickly and stably.
A convolutional–layer block typically means:

conv(x)_cout = Σ_cin  x_cin ∗ W_cin,cout
activ(x_c) = σ(x_c + b_c)
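A minimal numerical sketch of this block, using cross-correlation rather than strict convolution, “valid” boundaries, and σ = ReLU (all simplifying assumptions; names are illustrative):

```python
import numpy as np

def conv_block(x, W, b):
    """conv(x)_cout = sum_cin x_cin * W[cin, cout], then sigma(x_c + b_c).
    Cross-correlation with 'valid' boundaries; sigma taken to be ReLU."""
    cin, h, w = x.shape
    _, cout, k, _ = W.shape
    out = np.zeros((cout, h - k + 1, w - k + 1))
    for co in range(cout):
        for ci in range(cin):
            for i in range(h - k + 1):
                for j in range(w - k + 1):
                    out[co, i, j] += np.sum(x[ci, i:i+k, j:j+k] * W[ci, co])
    return np.maximum(out + b[:, None, None], 0.0)

# Tiny check: a 1x1 "gain 2" filter with bias -1 on an all-ones image.
y = conv_block(np.ones((1, 2, 2)), np.full((1, 1, 1, 1), 2.0), np.array([-1.0]))
```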
And we use MuxOut like:
Problem: at large upscaling factors, color might become misaligned with luminance.
Idea: RGB input + RGB output. Problem: MuxOut mixes color channels. Need to process them separately.
Note: the human visual system (HVS) is less sensitive to the position and motion of color than of luminance.
Traditional approach:

Loss(X, Y) = MSE(X, Y) = (1 / (H·W)) Σ_{i,j} (X_i,j − Y_i,j)²

Problem: not well correlated with the HVS. Why not PSNR? → PSNR is unbounded.

Loss(X, Y) = SSIM(X, Y) = (2µ_X µ_Y + C1)(2σ_XY + C2) / ((µ_X² + µ_Y² + C1)(σ_X² + σ_Y² + C2))
Well correlated with HVS. Differentiable. Behaves well with SGD.
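A simplified sketch of 1 − SSIM as a loss, using global image statistics instead of the usual local Gaussian windows (an assumption made here for brevity):

```python
import numpy as np

def ssim_loss(x, y, C1=0.01**2, C2=0.03**2):
    """1 - SSIM over global statistics: a single-window simplification.
    C1, C2 are the usual stabilizing constants for images in [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    ssim = ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx**2 + my**2 + C1) * (vx + vy + C2))
    return 1.0 - ssim
```

Identical images give loss 0; because every operation is smooth, the loss is differentiable and usable with SGD, as claimed above.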
(a) Standard (PSNR 24.82 dB – SSIM 0.8463) (b) Ours (PSNR 27.31 dB – SSIM 0.8990)
Linear systems: the interpolation filter is given by the impulse response. CN: not linear, because of ReLU.
(c) Activity Recorder (d) Mask Layer
Use an input image and record activity. Replace all activations (ReLU) by a “Mask layer”. The system becomes linear! Check impulse response.
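The linearization trick can be illustrated on a toy two-layer network: record the ReLU activity for one input, freeze it as a “Mask layer”, and the resulting map is exactly linear (all weights below are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))

# Activity recorder: run one input through and store where ReLU fired.
x0 = rng.normal(size=8)
mask = (W1 @ x0 > 0).astype(float)

def masked_net(x):
    # Mask layer replaces ReLU: multiply by the recorded 0/1 pattern.
    return W2 @ (mask * (W1 @ x))

# The system is now linear: f(a*u + b*v) == a*f(u) + b*f(v),
# so its impulse response fully characterizes it.
a, b = 2.0, -3.0
u, v = rng.normal(size=8), rng.normal(size=8)
lhs = masked_net(a * u + b * v)
rhs = a * masked_net(u) + b * masked_net(v)
```

Probing this frozen network with unit impulses then reveals the adaptive filter the CN applied around x0.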
We say that x and y are aliases if Downscale(x) = Downscale(y). Many realistic images are aliased. MSE, SSIM, etc. aim for only one alias. The MSE/SSIM target removes the innovation process (e.g. linear regression). Give up the original content: we just want it to “look real”.
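A small numerical illustration of aliases under area downscaling: two different images whose 2×2 block means coincide downscale to the same image (values are made up):

```python
import numpy as np

def downscale_area(x, r=2):
    """Area ('box') downscaling by factor r: average non-overlapping blocks."""
    h, w = x.shape
    return x.reshape(h // r, r, w // r, r).mean(axis=(1, 3))

# x: a flat 4x4 image; y: x plus a zero-mean perturbation in every block.
x = np.array([[1.0, 3.0], [5.0, 7.0]]).repeat(2, 0).repeat(2, 1)
y = x + np.tile(np.array([[1.0, -1.0], [-1.0, 1.0]]), (2, 2))
```

Both downscale to [[1, 3], [5, 7]], yet only one of them is the “original”: a pixel-wise loss must pick a single alias.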
Generator (Upscaler) Discriminator
Increasing attention and significant progress in the last year. We will refer to the following important references:
WGAN:
Arjovsky M., et al., “Wasserstein GAN.” Jan 2017.
Improved WGAN:
Gulrajani I., et al., “Improved Training of Wasserstein GANs.” March 2017.
Losses:

L_D = E[D(x_fake)] − E[D(x_real)] + λ_gp · E[(‖∇_x̂ D(x̂)‖₂ − 1)²]
L_G = −E[D(x_fake)] + λ_LR · ∆(Downscale(G(x_LR)), x_LR)
We do not want to reveal the high–resolution content during the Upscaler’s training. We do not want to generate artificial images with no reference to the input. We ask the upscaler to be able to recover the low–resolution input with a standard downscaler (e.g. area):
∆(Downscale(G(x_LR)), x_LR), with ∆(x, y) = MSE(x, y) or ∆(x, y) = 1 − SSIM(x, y)
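The downscale-consistency term ∆(Downscale(G(x_LR)), x_LR) with ∆ = MSE can be sketched as follows (area downscaler, illustrative names):

```python
import numpy as np

def downscale_area(x, r):
    """Area ('box') downscaling by factor r."""
    h, w = x.shape
    return x.reshape(h // r, r, w // r, r).mean(axis=(1, 3))

def consistency_loss(g_out, x_lr, r):
    """Delta(Downscale(G(x_LR)), x_LR) with Delta = MSE: the generated
    high-res image must reproduce the low-res input when downscaled."""
    return np.mean((downscale_area(g_out, r) - x_lr) ** 2)

# A G output that is exactly consistent with its low-res input:
x_lr = np.array([[1.0, 3.0], [5.0, 7.0]])
g_out = x_lr.repeat(2, 0).repeat(2, 1)
```

Any hallucinated detail that survives area downscaling is penalized, anchoring the generator to its input.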
(e) Standard (PSNR 29.78 dB) (f) Original (PSNR ∞) (g) Ours (PSNR 25.68 dB)
Overview:
System: proposed improved MuxOut.
Analysis: novel approach to visualize a CN as an adaptive filter.
Super–Resolution: proposed SSIM loss and processing of color input/output.
Hyper–Resolution: hallucinating details using a GAN can produce results comparable to the original content.
Next Steps: larger upscaling factors; use the analysis to improve design and test on other problems; improve generalization of the GAN approach.
LinkedIn:
https://www.linkedin.com/in/pnavarre
ResearchGate:
https://www.researchgate.net/profile/Pablo_Navarrete_Michelini