Stan: Probabilistic Modeling Language, MCMC Sampler, and Optimizer
Development Team: Andrew Gelman, Bob Carpenter, Matt Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, Allen Riddell
MCMski 2014
mc-stan.org
Goals / Aims
• Scalability
  – model complexity, number of parameters, data size
• Efficiency
  – fast iterations, low memory, high effective sample sizes
• Robustness
  – numerical routines, model structure (i.e., posterior geometry)
• Usability
  – general purpose, clear modeling language, integration (R, Python, command line), expose log prob & gradients/Hessians & I/O
History
• Derived from BUGS
• declarative → imperative
• untyped → strong static typing
• Gibbs sampling → adaptive (R)HMC & optimization
• interpreted → compiled
• restrictive licenses (proprietary/GPL) → liberal (BSD)
Technical Implementation
• Model Specification
  – (trans) data, (trans) parameters, log prob, generated quantities
• Sampling via Adaptive Hamiltonian Monte Carlo
  – warmup converges & estimates mass matrix and step size
  – (Geo)NUTS adapts number of steps
• Optimization via BFGS Quasi-Newton
• Translated to C++ with Template Metaprogramming
  – constraints to transforms + Jacobians; declarations to I/O
  – automatic differentiation for gradients & Hessians
  – custom probability and special functions
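To make the sampling bullet concrete, here is a minimal sketch of a single Hamiltonian Monte Carlo transition in plain Python. It is not Stan's implementation: the step size and number of leapfrog steps are fixed by hand (Stan tunes the step size during warmup and NUTS chooses the number of steps), the mass matrix is the identity, and the gradient is written out by hand instead of coming from automatic differentiation. The target (a standard normal) and all function names are assumptions chosen for illustration.

```python
import math
import random


def log_prob(q):
    """Log density of an independent standard normal, up to a constant (illustrative target)."""
    return -0.5 * sum(x * x for x in q)


def grad_log_prob(q):
    """Hand-written gradient of log_prob; Stan would obtain this by automatic differentiation."""
    return [-x for x in q]


def leapfrog(q, p, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator (unit mass matrix)."""
    q, p = list(q), list(p)
    grad = grad_log_prob(q)
    # Initial half step for the momentum.
    p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, grad)]
    for step in range(n_steps):
        # Full step for the position.
        q = [qi + step_size * pi for qi, pi in zip(q, p)]
        grad = grad_log_prob(q)
        # Full momentum step, except for a half step at the very end.
        scale = step_size if step < n_steps - 1 else 0.5 * step_size
        p = [pi + scale * gi for pi, gi in zip(p, grad)]
    return q, p


def hmc_step(q, step_size, n_steps, rng):
    """One HMC transition: resample momentum, integrate, then accept or reject."""
    p0 = [rng.gauss(0.0, 1.0) for _ in q]
    q_new, p_new = leapfrog(q, p0, step_size, n_steps)
    h_old = -log_prob(q) + 0.5 * sum(p * p for p in p0)
    h_new = -log_prob(q_new) + 0.5 * sum(p * p for p in p_new)
    # Metropolis correction for the integrator's energy error.
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new, True
    return q, False


rng = random.Random(1)
q = [0.0, 0.0]
draws = []
accepted = 0
for _ in range(3000):
    q, ok = hmc_step(q, step_size=0.2, n_steps=10, rng=rng)
    accepted += ok
    draws.append(q[0])

accept_rate = accepted / len(draws)
sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / len(draws)
```

With a small step size the leapfrog integrator nearly conserves energy, so almost every proposal is accepted even though each one moves far from the current point. Choosing that step size and trajectory length by hand is exactly the tuning burden that warmup adaptation and NUTS are meant to remove.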
Strengths
• high effective sample size/second (HMC / RHMC)
• expressive language vs. BUGS; extensible like JAGS
• extensive doc & example models
• active, helpful user community
• large, diverse development team
• integrated into R, Python, command-line (shell)
• reusable template lib (auto-diff, distributions & funs, models)
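Effective sample size (ESS) is the quantity behind the first bullet, so a rough sketch of how it can be estimated from a single chain may help. The estimator below is a deliberately simplified assumption: it sums sample autocorrelations until the first non-positive one, which is cruder than the multi-chain estimator Stan reports. The AR(1) chain is a hypothetical stand-in for a slowly mixing sampler.

```python
import random


def effective_sample_size(draws):
    """Simplified single-chain ESS: n / (1 + 2 * sum of initial positive autocorrelations)."""
    n = len(draws)
    mean = sum(draws) / n
    centered = [x - mean for x in draws]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0.0:
            # Truncate at the first non-positive autocorrelation estimate.
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ar1_chain(n, phi, rng):
    """AR(1) process used as a stand-in for a strongly autocorrelated sampler."""
    x = 0.0
    chain = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


rng = random.Random(0)
n = 2000
iid_draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
sticky_draws = ar1_chain(n, 0.9, rng)

ess_iid = effective_sample_size(iid_draws)
ess_sticky = effective_sample_size(sticky_draws)
```

Dividing an ESS estimate by the wall-clock sampling time gives the ESS/second figure the slide refers to. Two samplers producing the same number of draws can differ greatly on that measure when one of them yields strongly autocorrelated draws.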
Limitations
• no discrete parameters (can marginalize)
• no implicit missing data (code as parameters)
• not parallelized within chains
• language limited relative to black boxes (cf. emcee)
• limited data types and constraints
• C++ template code is complex for user extension
• sampling slow, nonscalable; optimization brittle or approx
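The "can marginalize" remark in the first bullet can be illustrated with a two-component normal mixture, where the discrete component indicator is summed out of the likelihood on the log scale. The sketch below is in plain Python rather than Stan, and the parameter values and function names are assumptions made up for illustration.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-scale terms."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_terms(y, theta, mu, sigma):
    """Joint log terms log p(z = k) + log p(y | z = k) for the two components."""
    return [
        math.log(theta) + normal_lpdf(y, mu[0], sigma),
        math.log(1.0 - theta) + normal_lpdf(y, mu[1], sigma),
    ]


def mixture_log_lik(y, theta, mu, sigma):
    """Marginal log likelihood log p(y) with the discrete indicator z summed out."""
    return log_sum_exp(mixture_terms(y, theta, mu, sigma))


def component_probs(y, theta, mu, sigma):
    """Posterior probabilities of z given y, recovered from the same log terms."""
    terms = mixture_terms(y, theta, mu, sigma)
    total = log_sum_exp(terms)
    return [math.exp(t - total) for t in terms]


# Hypothetical values, chosen only to exercise the functions.
y, theta, mu, sigma = 1.3, 0.3, (-1.0, 2.0), 1.0
marginal = mixture_log_lik(y, theta, mu, sigma)
probs = component_probs(y, theta, mu, sigma)
```

The marginal likelihood depends only on the continuous parameters (here theta, mu and sigma), which is what a gradient-based sampler needs. The discrete indicator is not lost: its posterior probabilities can be computed afterwards from the same log terms.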
Current and Future Development
• (stiff) diff eq solving by integration
• Riemann manifold HMC (more complex geometry)
• approximate inference: [stochastic] VB, EP, max marginal
• structured matrices: Cholesky correlation, sparse
• L-BFGS optimization (more scalable)
• more robust adaptation (cross chain?)
• parallelization within and across chains
• better probabilistic testing for correctness
• faster, cleaner C++ code & more useful interfaces
How Stan Got its Name
• “Stan” is not an acronym; Gelman mashed up
  1. Eminem song about a stalker fan, and
  2. Stanislaw Ulam (1909–1984), co-inventor of Monte Carlo method (and hydrogen bomb).
[Image: Ulam holding the Fermiac, Enrico Fermi’s physical Monte Carlo simulator for random neutron diffusion]