Unreproducible Research is Reproducible (Xavier Bouthillier, César Laurent, Pascal Vincent)

SLIDE 1

Unreproducible Research is Reproducible

Xavier Bouthillier César Laurent Pascal Vincent

SLIDE 2

Take Home

Bouthillier, Laurent & Vincent

  • There is a spectrum of notions of reproducibility in science.
  • Current focus in DL is on one end of the spectrum.
  • Inferential reproducibility is currently neglected but fundamental for empirical research.

SLIDE 3

Reproducibility Model Ranking

SLIDE 4

Executions

Model Ranking Reproducibility

SLIDE 5

Model Ranking

Executions

Reproducibility

SLIDE 6

Reproducibility Spectrum

  • Method reproducibility
  • Result reproducibility
  • Inferential reproducibility

Terminology by Goodman et al. (2016)

SLIDE 7

Method Reproducibility


Each black dot can be precisely reproduced

[Figure: test error rate]

SLIDE 8

Method Reproducibility


Reproducible method != Reproducible conclusions: One cannot conclude that model A is better than model B with only 2 points!
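The point can be illustrated with a small simulation. This is only a sketch: `train_and_evaluate`, the mean error rates, and the noise level are hypothetical stand-ins for real training runs, not values from the talk. Rerunning with the same seed reproduces a point exactly, yet which model wins a two-point comparison can depend on the seed that happened to be used.

```python
import random

# Hypothetical mean test error rates; illustrative numbers only.
MEAN_ERROR = {"A": 0.100, "B": 0.105}
NOISE_STD = 0.010  # assumed seed-to-seed standard deviation


def train_and_evaluate(model: str, seed: int) -> float:
    """Stand-in for a full training run: returns a simulated test error rate."""
    # All randomness derives from the (model, seed) pair.
    rng = random.Random(f"{model}-{seed}")
    return rng.gauss(MEAN_ERROR[model], NOISE_STD)


# Method reproducibility: the same seed gives back the same point.
first = train_and_evaluate("A", seed=0)
again = train_and_evaluate("A", seed=0)
same_point = first == again

# A two-point comparison (one run of A versus one run of B) for each seed.
seeds = range(20)
a_wins = sum(train_and_evaluate("A", s) < train_and_evaluate("B", s) for s in seeds)

print(f"same seed reproduces the point: {same_point}")
print(f"A beats B for {a_wins} of {len(seeds)} seeds")
```

Under these assumptions, each individual comparison is exactly repeatable, but the count of seeds for which A wins shows how much a single pair of points depends on the choice of seed.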

[Figure: test error rate]

SLIDE 9

Result Reproducibility


Test performance distributions can be reproduced

[Figure: test error rate]

SLIDE 10

Result Reproducibility


Test performance distributions can be reproduced

Reproducible results != Reproducible conclusions: One cannot conclude that model A is better than model B with only 1 dataset!
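As a rough sketch of what reproducing a distribution could look like, the snippet below summarizes a simulated test error over many seeds, once for an "original" set of seeds and once for a disjoint "replication" set. The function names, the Gaussian noise model, and all numbers are assumptions made for illustration, and everything here still concerns a single (simulated) dataset.

```python
import random
import statistics


def simulated_test_error(mean: float, std: float, seed: int) -> float:
    """Stand-in for one training run; the seed controls all randomness."""
    return random.Random(seed).gauss(mean, std)


def summarize(mean: float, std: float, seeds) -> tuple[float, float]:
    """Mean and standard deviation of the simulated test error over seeds."""
    errors = [simulated_test_error(mean, std, s) for s in seeds]
    return statistics.mean(errors), statistics.stdev(errors)


# Two disjoint sets of seeds play the roles of the original study and a replication.
original = summarize(0.10, 0.01, range(0, 200))
replication = summarize(0.10, 0.01, range(200, 400))

print(f"original:    mean={original[0]:.4f} std={original[1]:.4f}")
print(f"replication: mean={replication[0]:.4f} std={replication[1]:.4f}")
```

The two summaries should agree closely even though no individual run is shared between them, which is the sense in which a distribution, rather than a single point, is reproduced.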

[Figure: test error rate]

SLIDE 11

Inferential Reproducibility


Model ranking statistics over 6 datasets

[Figure: P(model | rank)]

A conclusion regarding which is the best architecture cannot be reproduced on different datasets.
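One way such ranking statistics might be computed is sketched below. The error table is made up for illustration (three hypothetical models, six hypothetical datasets) and is not data from the talk; `rank_frequencies` is likewise an assumed helper. For each rank, it reports the fraction of datasets on which each model holds that rank.

```python
from collections import Counter

# Made-up test error rates (model -> one value per dataset); not from the talk.
ERRORS = {
    "model_a": [0.10, 0.21, 0.33, 0.08, 0.15, 0.40],
    "model_b": [0.12, 0.19, 0.35, 0.07, 0.16, 0.38],
    "model_c": [0.11, 0.20, 0.31, 0.09, 0.14, 0.41],
}


def rank_frequencies(errors):
    """Return {rank: {model: fraction of datasets where model holds that rank}}."""
    models = list(errors)
    n_datasets = len(next(iter(errors.values())))
    counts = {rank: Counter() for rank in range(1, len(models) + 1)}
    for d in range(n_datasets):
        # Rank 1 is the model with the lowest error on dataset d.
        ordered = sorted(models, key=lambda m: errors[m][d])
        for rank, model in enumerate(ordered, start=1):
            counts[rank][model] += 1
    return {
        rank: {m: counts[rank][m] / n_datasets for m in models}
        for rank in counts
    }


freqs = rank_frequencies(ERRORS)
for model, p in freqs[1].items():
    print(f"P({model} | rank 1) = {p:.2f}")
```

With these made-up numbers, no model holds rank 1 on more than half of the datasets, so a "best model" claim drawn from any single dataset would not carry over to the others.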

SLIDE 12

Research Methodologies and Reproducibility

  • Method & Result Reproducibility: Exploratory & Constructive Research
  • Inferential Reproducibility: Empirical & Confirmatory Research

SLIDE 13

Thank you!

Come see our poster Thu June 13th 06:30 -- 09:00 PM @ Pacific Ballroom #14

SLIDE 14

References

Goodman, S. N., Fanelli, D., and Ioannidis, J. P. A. What does research reproducibility mean? Science Translational Medicine, 8(341):341ps12–341ps12, 2016. ISSN 1946-6234. doi: 10.1126/scitranslmed.aaf5027.

Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., and Meger, D. Deep reinforcement learning that matters. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

Lucic, M., Kurach, K., Michalski, M., Gelly, S., and Bousquet, O. Are GANs created equal? A large-scale study. In Advances in Neural Information Processing Systems, pp. 698–707, 2018.

Melis, G., Dyer, C., and Blunsom, P. On the state of the art of evaluation in neural language models. In International Conference on Learning Representations, 2018.