@article{vyas2020computation,
  title     = {Computation through neural population dynamics},
  author    = {Vyas, Saurabh and Golub, Matthew D. and Sussillo, David and Shenoy, Krishna V.},
  journal   = {Annual Review of Neuroscience},
  volume    = {43},
  number    = {1},
  pages     = {249--275},
  year      = {2020},
  publisher = {Annual Reviews},
  doi       = {10.1146/annurev-neuro-092619-094115},
}
@book{murphy2023probabilistic,
  author    = {Murphy, Kevin P.},
  title     = {Probabilistic Machine Learning: Advanced Topics},
  publisher = {MIT Press},
  year      = {2023},
  url       = {http://probml.github.io/book2},
}
@book{sarkka2023bayesian,
  title     = {Bayesian filtering and smoothing},
  author    = {S{\"a}rkk{\"a}, Simo and Svensson, Lennart},
  volume    = {17},
  year      = {2023},
  publisher = {Cambridge University Press},
  doi       = {10.1017/CBO9781139344203},
}
@misc{jax,
32
+
author = {James Bradbury and Roy Frostig and Peter Hawkins and Matthew James Johnson and Chris Leary and Dougal Maclaurin and George Necula and Adam Paszke and Jake Vander{P}las and Skye Wanderman-{M}ilne and Qiao Zhang},
33
+
title = {{JAX}: composable transformations of {P}ython+{N}um{P}y programs},
author={Lee, Hyun Dong and Warrington, Andrew and Glaser, Joshua and Linderman, Scott},
52
+
journal={Advances in Neural Information Processing Systems},
53
+
volume={36},
54
+
pages={57976--58010},
55
+
year={2023},
56
+
doi={10.48550/arXiv.2306.03291}
57
+
}
@inproceedings{chang2023low,
  title     = {Low-rank extended {Kalman} filtering for online learning of neural networks from streaming data},
  author    = {Chang, Peter G. and Dur{\'a}n-Mart{\'\i}n, Gerardo and Shestopaloff, Alex and Jones, Matt and Murphy, Kevin P.},
  booktitle = {Proceedings of The 2nd Conference on Lifelong Learning Agents},
  pages     = {1025--1071},
  year      = {2023},
  editor    = {Chandar, Sarath and Pascanu, Razvan and Sedghi, Hanie and Precup, Doina},
  volume    = {232},
  series    = {Proceedings of Machine Learning Research},
  month     = aug,
  publisher = {PMLR},
  doi       = {10.48550/arXiv.2305.19535},
}
@article{weinreb2024keypoint,
  author  = {Weinreb, Caleb and Pearl, Jonah E. and Lin, Sherry and Osman, Mohammed Abdal Monium and Zhang, Libby and Annapragada, Sidharth and Conlin, Eli and Hoffmann, Red and Makowska, Sofia and Gillis, Winthrop F. and Jay, Maya and Ye, Shaokai and Mathis, Alexander and Mathis, Mackenzie W. and Pereira, Talmo and Linderman, Scott W. and Datta, Sandeep Robert},
  title   = {{Keypoint-MoSeq}: parsing behavior by linking point tracking to pose dynamics},
  journal = {Nature Methods},
  volume  = {21},
  number  = {7},
  pages   = {1329--1339},
  month   = jul,
  year    = {2024},
  doi     = {10.1038/s41592-024-02318-2},
}
@misc{pyhsmm,
88
+
author = {Matthew James Johnson},
89
+
title = {{PyHSMM}: Bayesian inference in HSMMs and HMMs},
author = {Duran-Martin, Gerardo and Murphy, Kevin and Kara, Aleyna},
114
+
title = {{JSL: JAX State-Space models (SSM) Library}},
115
+
url={https://github.com/probml/JSL},
116
+
year={2022}
117
+
}
@inproceedings{seabold2010statsmodels,
  title     = {statsmodels: {Econometric} and statistical modeling with {Python}},
  author    = {Seabold, Skipper and Perktold, Josef},
  booktitle = {Proceedings of the 9th Python in Science Conference},
  year      = {2010},
  doi       = {10.25080/majora-92bf1922-011},
}
@misc{hmmlearn,
128
+
author={Ron Weiss and Shiqiao Du and Jaques Grobler and David Cournapeau and Fabian Pedregosa and Gael Varoquaux and Andreas Mueller and Bertrand Thirion and Daniel Nouri and Gilles Louppe and Jake Vanderplas and John Benediktsson and Lars Buitinck and Mikhail Korobov and Robert McGibbon and Stefano Lattarini and Vlad Niculae and Alexandre Gramfort and Sergei Lebedev and Daniela Huppenkothen and Christopher Farrow and Alexandr Yanenko and Antony Lee and Matthew Danielson and Alex Rockhill},
@book{durbin1998biological,
  title     = {Biological sequence analysis: {Probabilistic} models of proteins and nucleic acids},
  author    = {Durbin, Richard and Eddy, Sean R. and Krogh, Anders and Mitchison, Graeme},
  publisher = {Cambridge University Press},
  month     = apr,
  year      = {1998},
  doi       = {10.1017/cbo9780511790492},
}
@article{patterson2008state,
  title     = {State-space models of individual animal movement},
  author    = {Patterson, Toby A. and Thomas, Len and Wilcox, Chris and Ovaskainen, Otso and Matthiopoulos, Jason},
  journal   = {Trends in Ecology \& Evolution},
  volume    = {23},
  number    = {2},
  pages     = {87--94},
  year      = {2008},
  publisher = {Elsevier},
  doi       = {10.1016/j.tree.2007.10.009},
}
@article{jacquier2002bayesian,
  title     = {Bayesian analysis of stochastic volatility models},
  author    = {Jacquier, Eric and Polson, Nicholas G. and Rossi, Peter E.},
  journal   = {Journal of Business \& Economic Statistics},
  volume    = {20},
  number    = {1},
  pages     = {69--87},
  year      = {2002},
  publisher = {Taylor \& Francis},
  doi       = {10.1198/073500102753410408},
}
@article{ott2004local,
  title     = {A local ensemble {Kalman} filter for atmospheric data assimilation},
  author    = {Ott, Edward and Hunt, Brian R. and Szunyogh, Istvan and Zimin, Aleksey V. and Kostelich, Eric J. and Corazza, Matteo and Kalnay, Eugenia and Patil, D. J. and Yorke, James A.},
  journal   = {Tellus A: Dynamic Meteorology and Oceanography},
  volume    = {56},
  number    = {5},
  pages     = {415--428},
  year      = {2004},
  publisher = {Taylor \& Francis},
  doi       = {10.3402/tellusa.v56i5.14462},
}
@article{stone1975parallel,
  title     = {Parallel tridiagonal equation solvers},
  author    = {Stone, Harold S.},
  journal   = {ACM Transactions on Mathematical Software},
  volume    = {1},
  number    = {4},
  pages     = {289--307},
  year      = {1975},
  publisher = {ACM},
  doi       = {10.1145/355656.355657},
}
@article{sarkka2020temporal,
  title     = {Temporal parallelization of {Bayesian} smoothers},
  author    = {S{\"a}rkk{\"a}, Simo and Garc{\'\i}a-Fern{\'a}ndez, {\'A}ngel F.},
  journal   = {IEEE Transactions on Automatic Control},
  volume    = {66},
  number    = {1},
  pages     = {299--306},
  year      = {2020},
  publisher = {IEEE},
  doi       = {10.1109/TAC.2020.2976316},
}
@article{hassan2021temporal,
207
+
title={Temporal parallelization of inference in hidden {M}arkov models},
208
+
author={Hassan, Syeda Sakira and S{\"a}rkk{\"a}, Simo and Garc{\'\i}a-Fern{\'a}ndez, {\'A}ngel F},
title: 'Dynamax: A Python package for probabilistic state space modeling with JAX'
tags:
  - Python
  - State space models
  - dynamics
  - JAX

authors:
  - name: Scott W. Linderman
    orcid: 0000-0002-3878-9073
    affiliation: "1" # (Multiple affiliations must be quoted)
    corresponding: true
  - name: Peter Chang
    affiliation: "2"
  - name: Giles Harper-Donnelly
    affiliation: "3"
  - name: Aleyna Kara
    affiliation: "4"
  - name: Xinglong Li
    affiliation: "5"
  - name: Gerardo Duran-Martin
    affiliation: "6"
  - name: Kevin Murphy
    affiliation: "7"
    corresponding: true
affiliations:
  - name: Department of Statistics and Wu Tsai Neurosciences Institute, Stanford University, USA
    index: 1
  - name: CSAIL, Massachusetts Institute of Technology, USA
    index: 2
  - name: Cambridge University, England, UK
    index: 3
  - name: Computer Science Department, Technical University of Munich Garching, Germany
    index: 4
  - name: Statistics Department, University of British Columbia, Canada
    index: 5
  - name: Queen Mary University of London, England, UK
    index: 6
  - name: Google DeepMind, USA
    index: 7

date: 19 July 2024
bibliography: paper.bib

---

# Summary

State space models (SSMs) are fundamental tools for modeling sequential data. They are broadly used across engineering disciplines like signal processing and control theory, as well as scientific domains like neuroscience [@vyas2020computation], genetics [@durbin1998biological], ecology [@patterson2008state], computational ethology [@weinreb2024keypoint], economics [@jacquier2002bayesian], and climate science [@ott2004local]. Fast and robust tools for state space modeling are crucial to researchers in all of these application areas.
State space models specify a probability distribution over a sequence of observations, $y_1, \ldots, y_T$, where $y_t$ denotes the observation at time $t$. The key assumption of an SSM is that the observations arise from a sequence of _latent states_, $z_1, \ldots, z_T$, which evolve according to a _dynamics model_ (aka transition model). An SSM may also use inputs (aka controls or covariates), $u_1,\ldots,u_T$, to steer the latent state dynamics and influence the observations.
For example, in a neuroscience application from @vyas2020computation, $y_t$ represents a vector of spike counts from $\sim 1000$ measured neurons, and $z_t$ is a lower dimensional latent state that changes slowly over time and captures correlations among the measured neurons. If sensory inputs to the neural circuit are known, they can be encoded in $u_t$.
In the computational ethology application of @weinreb2024keypoint, $y_t$ represents a vector of 3D locations for several key points on an animal's body, and $z_t$ is a discrete behavioral state that specifies how the animal's posture changes over time.
In both examples, there are two main objectives: First, we aim to infer the latent states $z_t$ that best explain the observed data; formally, this is called _state inference_.
Second, we need to estimate the dynamics that govern how latent states evolve; formally, this is part of the _parameter estimation_ process.
`Dynamax` provides algorithms for state inference and parameter estimation in a variety of SSMs.

There are a few key design choices to make when constructing an SSM:

- What is the type of latent state? E.g., is $z_t$ a continuous or discrete random variable?
- How do the latent states evolve over time? E.g., are the dynamics linear or nonlinear?
- How are the observations distributed? E.g., are they Gaussian, Poisson, etc.?

Some design choices are so common they have their own names. Hidden Markov models (HMMs) are SSMs with discrete latent states, and linear dynamical systems (LDS) are SSMs with continuous latent states, linear dynamics, and additive Gaussian noise. `Dynamax` supports canonical SSMs and allows the user to construct bespoke models as needed, simply by inheriting from a base class and specifying a few model-specific functions. For example, see the _Creating Custom HMMs_ tutorial in the Dynamax documentation.

Finally, even for canonical models, there are several algorithms for state inference and parameter estimation. `Dynamax` provides robust implementations of several low-level inference algorithms to suit a variety of applications, allowing users to choose among a host of models and algorithms for their application. More information about state space models and algorithms for state inference and parameter estimation can be found in the textbooks by @murphy2023probabilistic and @sarkka2023bayesian.

# Statement of need

`Dynamax` is an open-source Python package for state space modeling. Since it is built with `JAX` [@jax], it supports just-in-time (JIT) compilation for hardware acceleration on CPU, GPU, and TPU machines. It also supports automatic differentiation for gradient-based model learning. While other libraries exist for state space modeling in Python [@pyhsmm; @ssm; @eeasensors; @seabold2010statsmodels; @hmmlearn] and Julia [@dalle2024hiddenmarkovmodels], `Dynamax` provides a diverse combination of low-level inference algorithms and high-level modeling objects that can support a wide range of research applications in JAX. Additionally, `Dynamax` implements parallel message passing algorithms that leverage the associative scan (a.k.a., parallel scan) primitive in JAX to take full advantage of modern hardware accelerators. Currently, these primitives are not natively supported in other frameworks like PyTorch. While various subsets of these models and algorithms may be found in other libraries, Dynamax is a "one stop shop" for state space modeling in JAX.

The API for `Dynamax` is divided into two parts: a set of core, functionally pure, low-level inference algorithms, and a high-level, object oriented module for constructing and fitting probabilistic SSMs. The low-level inference API provides message passing algorithms for several common types of SSMs. For example, `Dynamax` provides `JAX` implementations for:

- Forward-Backward algorithms for discrete-state hidden Markov models (HMMs),
- Kalman filtering and smoothing algorithms for linear Gaussian SSMs,
- Extended and unscented generalized Kalman filtering and smoothing for nonlinear and/or non-Gaussian SSMs, and
- Parallel message passing routines that leverage GPU or TPU acceleration to perform message passing in $O(\log T)$ time on a parallel machine [@stone1975parallel; @sarkka2020temporal; @hassan2021temporal]. Note that these routines are not simply parallelizing over batches of time series, but rather using a parallel algorithm with sublinear depth or span.

The high-level model API makes it easy to construct, fit, and inspect HMMs and linear Gaussian SSMs. Finally, the online `Dynamax` documentation and tutorials provide a wealth of resources for state space modeling experts and newcomers alike.

`Dynamax` has supported several publications. The low-level API has been used in machine learning research [@zhao2023revisiting; @lee2023switching; @chang2023low]. Special purpose libraries have been built on top of `Dynamax`, like the Keypoint-MoSeq library for modeling animal behavior [@weinreb2024keypoint] and the Structural Time Series in JAX library, `sts-jax` [@sts-jax]. Finally, the `Dynamax` tutorials are used as reference examples in a major machine learning textbook [@murphy2023probabilistic].

# Acknowledgements

A significant portion of this library was developed while S.W.L. was a Visiting Faculty Researcher at Google and P.C., G.H.D., A.K., and X.L. were Google Summer of Code participants.