
SLIDE 1

Image Super-Resolution Using Deep Convolutional Networks

Chen Change Loy (吕健勤), Chinese University of Hong Kong

www.ie.cuhk.edu.hk/~ccloy/

SLIDE 2

Waifu2x

http://waifu2x.udp.jp/


SLIDE 3

Waifu2x


SLIDE 4

Waifu2x

[Figure: original image and its 2x upscaling]

SLIDE 5

Waifu2x


SLIDE 6

Waifu2x

http://ejohn.org/blog/using-waifu2x-to-upscale-japanese-prints/

SLIDE 7

Waifu2x


SLIDE 8

Waifu2x

https://github.com/nagadomi/waifu2x


SLIDE 9

Outline

SRCNN: Image super-resolution

Image Super-Resolution Using Deep Convolutional Networks
C. Dong, C. C. Loy, K. He, and X. Tang
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015

Learning a Deep Convolutional Network for Image Super-Resolution
C. Dong, C. C. Loy, K. He, and X. Tang
In Proceedings of European Conference on Computer Vision, pp. 184-199, 2014

Boosting optical character recognition

Boosting Optical Character Recognition: A Super-Resolution Approach
C. Dong, X. Zhu, Y. Deng, C. C. Loy, and Y. Qiao
Technical report, arXiv:1506.02211, 2015

ARCNN: Compression artifacts reduction

Compression Artifacts Reduction by a Deep Convolutional Network
C. Dong, Y. Deng, C. C. Loy, and X. Tang
Technical report, arXiv:1504.06993, 2015

SLIDE 10

Image Super-Resolution Using Deep Convolutional Networks

C. Dong, C. C. Loy, K. He, X. Tang

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015
European Conference on Computer Vision, 2014

SLIDE 11

Single image super-resolution

[Figure: low-resolution (LR) input and high-resolution (HR) result, 2x upscaling]

Reconstruct a high-resolution image from a given low-resolution image

SLIDE 12

Applications

Digital high definition TV: from SDTV to HDTV
Medical imaging
Satellite imaging
CCTV surveillance (car plate or face)
Airborne surveillance

SLIDE 13

Single image super-resolution

[Figure labels: low resolution (LR), high resolution (HR), 2x upscaling, external databases, example-based SR]

SLIDE 14

Example-based methods

Exploit internal similarities of the same image

Glasner, D., Bagon, S., Irani, M.: Super-resolution from a single image. In: IEEE International Conference on Computer Vision. pp. 349–356 (2009)

SLIDE 15

Example-based methods

Learn mapping functions from external low- and high-resolution exemplar pairs

[1] J. Yang et al.: Coupled dictionary training for image super-resolution. TIP 21(8), 3467-3478 (2012)
[2] J. Yang et al.: Image super-resolution via sparse representation. TIP 19(11), 2861-2873 (2010)

1. Overlapping patches are densely cropped from the input image and pre-processed
2. Patches are encoded by a low-resolution dictionary
3. The sparse coefficients are passed into a high-resolution dictionary for reconstructing high-resolution patches
4. Constructed patches are aggregated (e.g., by weighted averaging) to produce the final output
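Steps 1 and 4 of the pipeline above (dense overlapping cropping and aggregation by averaging) can be sketched in plain Python. This is only an illustration under assumptions: the function names are hypothetical, the averaging is uniform rather than weighted, and the dictionary encoding of steps 2 and 3 is left out, so the patches pass through unchanged.

```python
def extract_patches(image, size, stride):
    """Return (row, col, patch) triples for overlapping square patches."""
    height, width = len(image), len(image[0])
    patches = []
    for r in range(0, height - size + 1, stride):
        for c in range(0, width - size + 1, stride):
            patch = [row[c:c + size] for row in image[r:r + size]]
            patches.append((r, c, patch))
    return patches


def aggregate_patches(patches, height, width):
    """Average overlapping patches back into a height x width image."""
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r, c, patch in patches:
        for i, patch_row in enumerate(patch):
            for j, value in enumerate(patch_row):
                total[r + i][c + j] += value
                count[r + i][c + j] += 1
    # Pixels that no patch covers are left at 0.0.
    return [[total[i][j] / count[i][j] if count[i][j] else 0.0
             for j in range(width)] for i in range(height)]


# Toy 4 x 4 image, 2 x 2 patches, stride 1 (dense cropping).
image = [[float(4 * i + j) for j in range(4)] for i in range(4)]
patches = extract_patches(image, size=2, stride=1)
restored = aggregate_patches(patches, 4, 4)
```

Because the patches are not modified here, averaging the overlaps returns the original image; in a real sparse-coding method each patch would first be replaced by its high-resolution reconstruction.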

SLIDE 16

Contributions

We directly learn an end-to-end mapping between low- and high-resolution images, with no extra pre/post-processing beyond the optimization.

We show that the traditional sparse-coding-based SR method can be viewed as a deep convolutional neural network.

We demonstrate that deep learning is useful in the classical computer vision problem of super-resolution, and can achieve good quality and speed.

SLIDE 17

Contributions

Source: Dong et al., Image Super-Resolution Using Deep Convolutional Networks, TPAMI 2015

SLIDE 18

Super-resolution CNN (SRCNN)

Put together operations that were traditionally treated individually
Patch extraction and representation
Non-linear mapping
Reconstruction

SLIDE 19

Super-resolution CNN (SRCNN)

Patch extraction and representation

$F_1(Y) = \max(0, W_1 * Y + B_1)$

$W_1$: filters of size $f_1 \times f_1 \times n_1$
$B_1$: $n_1$-dimensional biases
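A minimal sketch of the first-layer operation, max(0, W1 * Y + B1), for a single-channel input. The function name, the toy 3 x 3 image, and the two 2 x 2 filters are illustrative assumptions, not the learned SRCNN filters; the convolution is unpadded and implemented as cross-correlation, as is common in CNN libraries.

```python
def conv2d_relu(image, filters, biases):
    """Unpadded 2-D convolution of a single-channel image with square
    filters, followed by the rectifier max(0, x)."""
    size = len(filters[0])
    out_h = len(image) - size + 1
    out_w = len(image[0]) - size + 1
    feature_maps = []
    for kernel, bias in zip(filters, biases):
        fmap = []
        for r in range(out_h):
            row = []
            for c in range(out_w):
                acc = bias
                for i in range(size):
                    for j in range(size):
                        acc += kernel[i][j] * image[r + i][c + j]
                row.append(max(0.0, acc))
            fmap.append(row)
        feature_maps.append(fmap)
    return feature_maps


Y = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
W1 = [
    [[1.0, 0.0], [0.0, -1.0]],      # diagonal difference
    [[0.25, 0.25], [0.25, 0.25]],   # 2 x 2 mean
]
B1 = [0.0, 0.0]
F1 = conv2d_relu(Y, W1, B1)   # n1 = 2 feature maps of size 2 x 2
```

The first filter gives negative responses everywhere on this image, so the rectifier sets its whole map to zero; the second map holds the local means.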

SLIDE 20

Examine the learned filters

Laplacian/Gaussian filters
Edge detectors
Texture extractors
Dead filters, similar to those observed in Zeiler and Fergus (ECCV 2014)
Patterns may emerge given long enough training time

Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional neural networks. ECCV (2014)

SLIDE 21

Super-resolution CNN (SRCNN)

Non-linear mapping

$F_2(Y) = \max(0, W_2 * F_1(Y) + B_2)$

$W_2$: filters of size $n_1 \times 1 \times 1 \times n_2$
$B_2$: $n_2$-dimensional biases
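With 1 x 1 filters, the non-linear mapping acts on each pixel independently: the n1-dimensional feature vector at a position is mapped to an n2-dimensional one by a matrix product, a bias, and the rectifier. The sketch below illustrates this; the function name and the toy values are assumptions.

```python
def conv1x1_relu(feature_maps, weights, biases):
    """Apply max(0, W2 * F1 + B2) with 1 x 1 filters.

    feature_maps: n1 maps, each a list of rows
    weights:      n2 rows of n1 coefficients (one 1 x 1 filter per row)
    biases:       n2 values
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    outputs = []
    for w_row, bias in zip(weights, biases):
        out = [[max(0.0, bias + sum(w * fmap[r][c]
                                    for w, fmap in zip(w_row, feature_maps)))
                for c in range(width)] for r in range(height)]
        outputs.append(out)
    return outputs


F1 = [[[1.0, 2.0]], [[3.0, 4.0]]]              # n1 = 2 maps of size 1 x 2
W2 = [[1.0, 1.0], [1.0, -1.0], [0.5, 0.0]]     # n2 = 3 filters
B2 = [0.0, 0.0, 1.0]
F2 = conv1x1_relu(F1, W2, B2)                  # n2 = 3 maps of size 1 x 2
```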

SLIDE 22

Super-resolution CNN (SRCNN)

Reconstruction

$F(Y) = W_3 * F_2(Y) + B_3$

$W_3$: filters of size $n_2 \times f_3 \times f_3$
$B_3$: 1-dimensional bias
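The filter shapes of the three layers determine both the number of weights and how much the output shrinks. The sketch below assumes unpadded convolutions, one image channel (c = 1), and n1 = 64, n2 = 32 filters; those filter counts and the helper name are assumptions for illustration and do not appear on the slides.

```python
def srcnn_shape_summary(input_size, c, f1, n1, f2, n2, f3):
    """Output size, weight count and bias count for three unpadded
    convolution layers with the filter shapes of the three stages."""
    output_size = input_size - (f1 - 1) - (f2 - 1) - (f3 - 1)
    weights = c * f1 * f1 * n1 + n1 * f2 * f2 * n2 + n2 * f3 * f3 * c
    biases = n1 + n2 + c
    return output_size, weights, biases


# 33 x 33 input, 9 x 9 first layer, 1 x 1 mapping, 5 x 5 reconstruction.
summary = srcnn_shape_summary(33, c=1, f1=9, n1=64, f2=1, n2=32, f3=5)
# Same input with a 5 x 5 middle layer (a 9-5-5 structure).
summary_955 = srcnn_shape_summary(33, c=1, f1=9, n1=64, f2=5, n2=32, f3=5)
```

Under these assumptions a 33 x 33 input yields a 21 x 21 output with 8,032 weights, and widening the middle layer to 5 x 5 shrinks the output to 17 x 17 while raising the weight count to 57,184.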

SLIDE 23

Relation to the sparse-coding-based methods

[Figure: sparse-coding-based method viewed as a convolutional network, with stages patch extraction and representation, non-linear mapping, and reconstruction]

Sparse coding:
Extract f1 × f1 low-resolution patch
Mean subtraction
Projected onto a (low-resolution) dictionary, size n1

SRCNN:
Equivalent to applying n1 linear filters (f1 × f1) on the input image
Mean subtraction is absorbed

SLIDE 24

Relation to the sparse-coding-based methods

[Figure: sparse-coding-based method viewed as a convolutional network, with stages patch extraction and representation, non-linear mapping, and reconstruction]

Sparse coding:
Apply sparse coding solver on the projected n1 coefficients
The outputs are n2 coefficients representing the high-resolution patch
Iterative algorithm

SRCNN:
Equivalent to non-linear mapping
Feed-forward

SLIDE 25

Relation to the sparse-coding-based methods

[Figure: sparse-coding-based method viewed as a convolutional network, with stages patch extraction and representation, non-linear mapping, and reconstruction]

Sparse coding:
n2 coefficients are then projected onto another (high-resolution) dictionary to produce a high-resolution patch
The overlapping high-resolution patches are then averaged

SRCNN:
Equivalent to linear convolutions on the n2 feature maps

SLIDE 26

Loss function

Estimate $\Theta = \{W_1, W_2, W_3, B_1, B_2, B_3\}$

Minimizing the loss between the reconstructed images $F(Y; \Theta)$ and the corresponding ground truth high-resolution images $X$:

$L(\Theta) = \frac{1}{n} \sum_{i=1}^{n} \| F(Y_i; \Theta) - X_i \|^2$

The loss is minimized using stochastic gradient descent with the standard backpropagation
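As a small illustration, the loss above can be evaluated directly: the squared error is summed over the pixels of each pair and averaged over the n pairs. The function name and the toy 2 x 2 images are assumptions; a training framework would compute the same quantity on batches and differentiate it automatically.

```python
def mse_loss(reconstructions, targets):
    """L = (1/n) * sum_i ||F(Y_i) - X_i||^2 over n image pairs."""
    n = len(reconstructions)
    total = 0.0
    for recon, target in zip(reconstructions, targets):
        for recon_row, target_row in zip(recon, target):
            for r, t in zip(recon_row, target_row):
                total += (r - t) ** 2
    return total / n


# Two toy "reconstructions" and their ground-truth images.
recon = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
truth = [[[1.0, 2.0], [3.0, 2.0]], [[1.0, 0.0], [0.0, 0.0]]]
loss = mse_loss(recon, truth)   # (4 + 1) / 2
```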

SLIDE 27

Training data

Yang's paper [1]
91 images (on average 200x200).
Total 24,800 patches (33x33).

ImageNet
395,909 images from the ILSVRC 2013 ImageNet detection training partition
Decomposed into 5 million sub-images using a stride of 33

[1] J. Yang et al., "Image super-resolution as sparse representation of raw image patches." CVPR 2008.
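How many sub-images one training image yields depends on the sub-image size and the stride. The helpers below are a hypothetical sketch of that arithmetic; the 200 x 200 image size is only the rough average quoted above, and the stride of 14 in the last line is an arbitrary smaller value for comparison.

```python
def count_sub_images(height, width, size=33, stride=33):
    """Number of size x size crops taken on a regular grid."""
    if height < size or width < size:
        return 0
    rows = (height - size) // stride + 1
    cols = (width - size) // stride + 1
    return rows * cols


def sub_image_origins(height, width, size=33, stride=33):
    """Top-left corners of the crops counted by count_sub_images."""
    return [(r, c)
            for r in range(0, height - size + 1, stride)
            for c in range(0, width - size + 1, stride)]


per_image = count_sub_images(200, 200)             # stride 33
origins = sub_image_origins(200, 200)
denser = count_sub_images(200, 200, stride=14)     # smaller stride, more crops
```

A smaller stride produces overlapping crops and therefore more training samples from the same images.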

SLIDE 28

Training data

More training data leads to better performance

SLIDE 29

Filter number

In general, the performance would improve if we increase the network width, i.e., adding more filters, at the cost of running time.

SLIDE 30

Filter size

A larger filter size leads to better results.
A reasonably larger filter size could grasp richer structural information, which in turn leads to better results.
Trade-off between performance and speed

SLIDE 31

Deeper structure

The deeper the better?
Sensitive to the initialization parameters and learning rate.

SLIDE 32

Comparisons with existing methods

Structure 9-5-5 trained on ImageNet
Upscaling factors x2, x3, x4
Test sets: Set5 (5 images), Set14 (14 images), and BSD200
Evaluation metrics: PSNR, SSIM, IFC, NQM, MSSIM, WPSNR

Yang, C.Y., Ma, C., Yang, M.H.: Single-image super-resolution: A benchmark. ECCV 2014
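Of the metrics listed, PSNR is the simplest to state: it is the mean squared error between the reference and the estimate, expressed in decibels relative to the peak pixel value. The sketch below assumes 8-bit images (peak 255) stored as nested lists; the function name is illustrative, and published evaluations may differ in details such as border cropping or which channel is compared.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf   # identical images
    return 10.0 * math.log10(peak * peak / mse)


reference = [[10.0, 20.0], [30.0, 40.0]]
off_by_one = [[11.0, 21.0], [31.0, 41.0]]
value = psnr(reference, off_by_one)   # MSE = 1, so 20 * log10(255)
```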

SLIDE 33

Comparisons with existing methods

SC: Sparse coding by Yang et al., TIP 2010
NE+LLE: Neighbour embedding + locally linear embedding by Chang et al., CVPR 2004
KK: Sparse regression and natural image prior by Kim and Kwon, TPAMI 2010
ANR: Anchored Neighbourhood Regression by Timofte et al., ICCV 2013
A+: Adjusted Anchored Neighbourhood Regression by Timofte et al., ACCV 2014

[Results on Set5]

SLIDE 34

SC / 25.58 dB
ANR / 25.90 dB
SRCNN / 27.58 dB

Dong et al., Learning a Deep Convolutional Network for Image Super-Resolution, ECCV 2014

SLIDE 35

SRCNN / 34.91 dB
ANR / 34.60 dB
SC / 34.11 dB

Dong et al., Learning a Deep Convolutional Network for Image Super-Resolution, ECCV 2014

SLIDE 36

SC / 33.32 dB
ANR / 33.82 dB
SRCNN / 34.35 dB

Dong et al., Learning a Deep Convolutional Network for Image Super-Resolution, ECCV 2014

SLIDE 37

Speed comparison

SC: Sparse coding by Yang et al., TIP 2010
NE+LLE: Neighbour embedding + locally linear embedding by Chang et al., CVPR 2004
KK: Sparse regression and natural image prior by Kim and Kwon, TPAMI 2010
ANR: Anchored Neighbourhood Regression by Timofte et al., ICCV 2013
A+: Adjusted Anchored Neighbourhood Regression by Timofte et al., ACCV 2014

SLIDE 38

Color channels

Cb, Cr channels barely help in improving the performance.
RGB channels achieve marginally better performance than using Y only.
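For readers unfamiliar with the Y, Cb, Cr channels mentioned above, the sketch below converts one RGB pixel using the full-range BT.601 coefficients used by JPEG/JFIF. This choice is an assumption made for illustration: the slides do not say which YCbCr variant was used, and other variants (for example the 16-235 "studio" range) scale the values differently.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion of one 8-bit RGB pixel.

    Returns unclipped floating-point (Y, Cb, Cr) values.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


white = rgb_to_ycbcr(255, 255, 255)
gray = rgb_to_ycbcr(128, 128, 128)
```

For any gray pixel the two chroma channels sit at the neutral value 128 and all of the signal is in Y, which is the channel a luminance-only model operates on.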

SLIDE 39

Plausible extensions

Deeper network
Better initialization
More training data
Handle multiple upscaling factors simultaneously
Multiple objectives
  E.g., predict image boundary simultaneously
  Faster convergence, but performances converge given long enough training time

SLIDE 40

Summary

Deep model formulation inspired by classical approaches
Super-resolution CNN
Lightweight
Little pre/post-processing beyond optimization
Performance gain by additional layers/filters
Potential in other low-level vision problems
Caffe model and code: http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html

SLIDE 41

Boosting Optical Character Recognition: A Super-Resolution Approach

C. Dong, X. Zhu, Y. Deng, C. C. Loy, Y. Qiao

Technical report, arXiv:1506.02211, 2015
Winning approach in ICDAR Competition on Text Image Super-Resolution

SLIDE 42

Boosting Optical Character Recognition

ICDAR2015 Competition on Text Image Super-Resolution

[Figure: low-resolution and high-resolution text images]

SLIDE 43

Boosting Optical Character Recognition

ICDAR2015 Competition on Text Image Super-Resolution

ICDAR2015-SR-dataset

SLIDE 44

Boosting optical character recognition

Method                      RMSE   PSNR (dB)  SSIM   OCR (%)
Bicubic                     19.04  23.50      0.879  60.64
Lanczos3                    16.97  24.65      0.902  64.36
Zeyde et al.                13.05  27.21      0.941  69.72
ASRS [IJDAR 2015]           12.86  26.98      0.950  71.25
A+ [ACCV 2014]              10.03  29.50      0.966  73.10
Orange Labs [VISAPP 2015]   11.27  28.25      0.953  74.12
SRCNN (Favour PSNR)          7.24  31.99      0.981  76.10
SRCNN (Favour OCR)           7.52  31.75      0.980  77.19

OCR performance using original HR image = 78.81%

SLIDE 45

Boosting optical character recognition

SLIDE 46

Compression Artifacts Reduction by a Deep Convolutional Network

C. Dong, Y. Deng, C. C. Loy, X. Tang

Technical report, arXiv:1504.06993, 2015

SLIDE 47

Before After


SLIDE 48

Artifact Reduction CNN (AR-CNN)

SLIDE 49

Transfer schemes

Transfer shallow to deeper model
Transfer high to low quality
Transfer standard to real use case

SLIDE 50

Results

SA-DCT: A. Foi, V. Katkovnik, and K. Egiazarian. Pointwise shape-adaptive DCT for high-quality denoising and deblocking of grayscale and color images. TIP, 16(5):1395–1411, 2007
PSNR-B: C. Yim and A. C. Bovik. Quality assessment of deblocked images. TIP, 20(1):88–98, 2011

SLIDE 51

[Figure labels: 3264 x 2448, 600 x 450]

Image compression by Twitter

SLIDE 52

Before After


SLIDE 53

Before After


SLIDE 54

Before After


SLIDE 55

Before After


SLIDE 56

Before After


SLIDE 57

Thanks!

DENG, Yubin (邓煜彬), Dong Chao (董超)

SLIDE 58

Backup

SLIDE 59

SRCNN – Set5


SLIDE 60

SRCNN – Set14


SLIDE 61

SRCNN – BSD200

