Dynamic linear models

I came across a book called Dynamic Linear Models with R. I'm curious whether there are any PyMC3 resources for learning to construct such models? For those unfamiliar, a DLM is basically time series meets differential equations.

Hi @jbuddy_13!
Yes, @brandonwillard wrote a very good blog post showing how to do that with PyMC3 and Theano – spoiler: this should become easier with the coming 4.0.0 :drum:
And you're in luck: I just interviewed Brandon on my podcast, and we talked about exactly that :scream:
Hope that helps :vulcan_salute:

5 Likes

My blog post is rather Theano-specific–i.e. there’s no PyMC3 involved.

Fortunately, the RandomVariables used in that post are now available in our Theano-PyMC fork, so there’s no need for symbolic-pymc.

We’re working on some major updates to PyMC3 that will replace its current use of Distributions with RandomVariables. That, along with some updates to the samplers in PyMC3, will make it possible to specify Dynamic Linear Models in a similar–if not the same–way and make use of PyMC3’s sampler suite.

Otherwise, DLMs can be viewed as discretized differential equations for some stochastic processes, but the usefulness of that relationship–as a description of DLMs–falls off quite rapidly from there.
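
To make that relationship concrete (a standard textbook illustration, not something specific to the blog post): the simplest DLM, the local-level model, is the unit-step Euler–Maruyama discretization of a Brownian-motion state equation paired with a noisy observation equation:

```latex
% Local-level DLM: the state equation is the unit-step Euler-Maruyama
% discretization of Brownian motion dx(t) = sigma_w dB(t).
\begin{aligned}
x_t &= x_{t-1} + w_t, & w_t &\sim \mathcal{N}(0, \sigma_w^2) \quad \text{(state)} \\
y_t &= x_t + v_t,     & v_t &\sim \mathcal{N}(0, \sigma_v^2) \quad \text{(observation)}
\end{aligned}
```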

3 Likes

Hi @brandonwillard, I think I'll wait for the release you mentioned. I'm not an expert in DLMs (or the theory behind them), but I would like to play around with a high-level API and see whether it's relevant to my line of work.

If I did want to do some more reading on the theory behind the subject, any recommendations?

Am I right that the models in the linked blog post could be specified and fit using regular old PyMC3 (no Theano scan), but with two caveats:

  1. You can only generate the smoothed posterior distributions, not the filtered, and

  2. The sampling might not be as efficient as when using FFBS (forward-filtering backward-sampling).

In my opinion, “Bayesian Forecasting and Dynamic Models” is the best reference for DLMs.

I don’t believe any of the filtering or smoothing could be done without a Scan Op, unless the models were of a very specific and simple variety (e.g. they had Markov dependencies that could be reduced to vectorized operations).
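
To illustrate that simple variety, here is a rough sketch of my own (assuming PyMC3 3.x, not code from the blog post): a local-level model whose latent Markov state is a Gaussian random walk, which PyMC3 exposes as a vectorized distribution, so no Scan is needed.

```python
# A rough sketch (my own example, assuming PyMC3 3.x) of the "simple
# variety": a local-level model whose latent Markov state is a Gaussian
# random walk, available as a vectorized distribution -- no Scan needed.
import numpy as np
import pymc3 as pm

# Simulated observations from a noisy random walk
rng = np.random.default_rng(123)
y = np.cumsum(rng.normal(size=100)) + rng.normal(scale=0.5, size=100)

with pm.Model() as local_level:
    sigma_state = pm.HalfNormal("sigma_state", sigma=1.0)
    sigma_obs = pm.HalfNormal("sigma_obs", sigma=1.0)
    # Latent state: all time points at once, no explicit time loop
    x = pm.GaussianRandomWalk("x", sigma=sigma_state, shape=len(y))
    pm.Normal("y", mu=x, sigma=sigma_obs, observed=y)
    trace = pm.sample(1000, tune=1000, cores=1)
```

Sampling x jointly like this gives the smoothed posterior over the states (conditioned on all observations); it does not produce filtered estimates or use FFBS, which matches the caveats above.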

PyMC3 operates almost exclusively on log-likelihoods, which are absent from that exposition. In order to use PyMC3’s log-likelihood driven samplers, one would need to convert those “sample-space” (i.e. random variable) graphs to “measure-space” (i.e. log-likelihood) graphs.
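
To give a toy picture of that distinction (my own sketch, assuming classic Theano and PyMC3 3.x):

```python
# A toy sketch (my own, assuming classic Theano and PyMC3 3.x) of
# "sample-space" vs. "measure-space" graphs.
import pymc3 as pm
from theano.tensor.shared_randomstreams import RandomStreams

# Sample-space graph: a symbolic random variable we can draw from,
# but which carries no log-likelihood of its own.
srng = RandomStreams(seed=123)
x_rv = srng.normal(avg=0.0, std=1.0)
print(x_rv.eval())  # a single draw from N(0, 1)

# Measure-space graph: a PyMC3 Distribution, i.e. a log-likelihood term
# that PyMC3's log-likelihood-driven samplers can evaluate at a point.
with pm.Model() as model:
    x = pm.Normal("x", mu=0.0, sigma=1.0)
print(model.logp({"x": 0.5}))  # log N(0.5 | 0, 1)
```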

Here’s a rough draft describing the manual conversion of such graphs into PyMC3-compatible Distributions. I had started to create some tools to automate the process, but quickly noticed that certain aspects of PyMC3 make that unnecessarily difficult and inconsistent. The correct approach is to make these changes from within PyMC3, and that’s what we’re starting to do.

2 Likes

My go-to reference is Bayesian Filtering and Smoothing by Simo Särkkä.

2 Likes

Maybe that’s what I’m thinking of - a very specific and simple variety. The example I have in mind is a Dynamic Factor model with factor loadings fixed through time, e.g. this example.

Hi @brandonwillard,
Thank you for this great piece of software!

Some time ago, I tried to implement a DLM using the scan function, but without luck.

What is the current status of PyMC3 support for implementing DLMs?

For reference, I read your posts Theano Model Graphs - Brandon T. Willard and
Dynamic Linear Models in Theano - Brandon T. Willard

Thanks!

We’re focusing our efforts on PyMC v4 and the Theano fork, Aesara, that powers it. Within the Aesara ecosystem, we have a project called AePPL that is now capable of generating log-likelihoods for arbitrary Scan Ops: https://github.com/aesara-devs/aeppl/pull/24.

In other words, we’ve just implemented the features from my earlier comment in AePPL, and this functionality will be available in PyMC v4 at some point.

If you’re interested, please, give that new AePPL feature a try and report any issues you have along the way.
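
For orientation, here is a rough sketch of what trying it might look like (an illustrative example only; the joint_logprob signature shown here is assumed from early AePPL versions and may differ in later releases):

```python
# A rough illustrative sketch: deriving a log-likelihood for a Scan-based
# Gaussian random walk with AePPL. The joint_logprob call shown here
# (a dict mapping random variables to value variables) reflects early
# AePPL versions and may differ in later releases.
import aesara
import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream
from aeppl import joint_logprob

srng = RandomStream(seed=123)

def step(x_tm1):
    # One step of a Gaussian random walk
    return srng.normal(x_tm1, 1.0)

x0 = at.as_tensor(0.0)
X_rv, updates = aesara.scan(fn=step, outputs_info=[x0], n_steps=10)

# A value variable at which the log-likelihood will be evaluated
x_vv = X_rv.clone()
x_vv.name = "x_vv"

# An Aesara graph of the random walk's log-density evaluated at x_vv
logp = joint_logprob({X_rv: x_vv})
```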

AePPL looks great! I am so glad I posted the question; it's just in time for me to try it. I will report any issues. Thanks again!

Hi @brandonwillard , thank you for the wonderful blog post – I found it invaluable, coming from some experience in PyMC3 but none in DLMs/DFMs, or time-series data for that matter.

In the post you tease:

In the future, we plan to build optimizations that automate much of the work done here (and more). This exposition sets the foundation for such work by first motivating the use and generality of Bayesian frameworks like DLMs, then by demonstrating the analytic steps that produce customized, efficient samplers. These are the steps that would undergo automation in the future.

Did any of those ideas make it into the current release of PyMC, or can I accomplish what I need with pytensor/aeppl? I am helping an econ friend who wants to use these types of models for a project, and my current plan is to implement them using your blog post as a guide.

I also can't thank @junpenglao, @RavinKumar, and @aloctavodia enough for their incredible work on Bayesian Modeling and Computation in Python. I have never worked on time-series data before, and without chapter 6 of your book I would be lost. Thank you!

1 Like

@jvivian, I’m truly glad that my post was helpful!

Unfortunately, I am not involved or associated with the PyMC group in any capacity, especially their forks/copies of my work (e.g. "pytensor"). All I can say is that the ideas mentioned in my post(s) are planned for the Aesara projects, and that none of our group's work will appear in PyMC-related projects unless it's put there or copied by others.

Hi @brandonwillard ,

Thank you for the clarification and your contributions in this space. I'll start with a basic implementation in Aesara modeled on your Theano example and go from there. Cheers!