Defining random for a CustomDist time series

I’ve created a custom distribution that captures slowly-evolving day-of-week seasonality. It has only one parameter, sigma_omega, the standard deviation of the innovations. Gist here: PyMC day of week seasonality · GitHub

I’d like to define a random function so that:
a) I can still sample the posterior predictive, and
b) when I do the extend-the-index-into-the-future trick, PyMC automatically generates a forecast for me during sampling.

The challenge is that I’d like it to take the observed data into account when it does a) and b). E.g. for b), the random function shouldn’t create just any time series with the given sigma_omega; it should take into account the observed pattern so far.

More concretely, if in my observed data, Mondays have very high values, unobserved Mondays should be high (subject to innovations of magnitude sigma_omega, of course). It shouldn’t produce a sample with very low Mondays, just because the sample has the correct sigma_omega.

Is this possible?

Likelihoods are fixed during sampling. If you have a normal likelihood you will have a normal posterior predictive.

Your distribution should be parametrized in a way that the random and logp methods match.

It sounds like you need to define some extra parameters that are learned during sampling and which influence the magnitude of certain points (not just the magnitude of innovations), so that both the logp and random methods can reflect that behavior of the data.

Otherwise it seems like you are asking PyMC to model some aspect of your data out of thin air.

Thanks @ricardoV94, after sleeping on it I realized I needed to take a step backward.

In pymc3, when I used a GaussianRandomWalk, the sampler would “predict the future” for me if I passed missing data to the observed. Please see this gist.

In the new PyMC, this no longer works. Please see this gist. In that gist, I also show that when the GRW is a latent variable, the sampler can predict its future; it just refuses to do so when the GRW is observed.

So my questions are:

  1. Am I right, or am I simply defining the model incorrectly?
  2. Why is this the new behavior?

In the 2nd gist, I go on to show that just as with the GRW, PyMC can predict the future of my CustomDist for day-of-week seasonality, but only when it is latent, not observed directly. This was the source of my original question in this thread: is it possible to have PyMC predict the future for my CustomDist class, just as it did for the GRW in pymc3?

The new behavior is because GRW is no longer a simple Distribution. What PyMC does when you have some observed data with missing entries is pull those apart and create two variables: one fully observed and one fully unobserved. To be able to do this it needs to know how to pull apart the parameters as well.

It doesn’t know for the GRW, but it does for the Normal, which your last model uses.

In your case it seems like your model is just a GRW plus daily intercepts. I don’t think you need a custom distribution at all.

Untested code:

import pymc as pm
import pytensor.tensor as at  # `import aesara.tensor as at` on older PyMC 4.x

coords = {"days": df.index.values, "weekdays": range(7)}
with pm.Model(coords=coords) as m:
    innovation_mag = pm.HalfNormal("innovation_mag", sigma=0.2)
    latent = pm.GaussianRandomWalk("latent", sigma=innovation_mag, dims="days")

    dow_innovation_mag = pm.HalfNormal("dow_innovation_mag", sigma=0.2)
    dow = pm.Normal(
        "weekly_seasonality",
        sigma=dow_innovation_mag,
        dims="weekdays",
    )

    # Add each weekday's effect at that weekday's positions in the series
    mu = latent
    for i in range(7):
        mu = at.set_subtensor(mu[i::7], mu[i::7] + dow[i])

    obs_noise_mag = pm.HalfNormal("obs_noise_mag", sigma=0.2)
    observed = pm.Normal(
        "observed",
        mu=mu,
        sigma=obs_noise_mag,
        observed=df.value_with_dow_noisy,
    )

If your observed has np.nan for the future entries PyMC will sample those as well.
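For example, the "extend the index into the future" step can be done with a `reindex` before building the model. A sketch with toy data standing in for the real `df` (the column name matches the model above; the dates are made up):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data: 28 observed days
df = pd.DataFrame(
    {"value_with_dow_noisy": np.random.default_rng(0).normal(size=28)},
    index=pd.date_range("2023-01-02", periods=28, freq="D"),
)

# Extend the index 14 days into the future; the new rows are NaN,
# so PyMC will treat them as missing and sample (i.e. forecast) them.
future_index = pd.date_range(df.index[0], periods=len(df) + 14, freq="D")
df = df.reindex(future_index)
```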

These types of seasonality models are just begging for implementation as linear state space systems. Doing so also gives forecasting for free, because you can just sample the relevant matrices and iterate the transition/observation equations from the end of the data.
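A pure-NumPy sketch of that forecasting step, using a standard local-level-plus-seasonal-dummies state space encoding (the transition matrix `T`, observation vector `Z`, and noise magnitudes here are illustrative assumptions, not taken from the gist): given the last filtered state and sampled parameters, you just iterate the transition and observation equations forward.

```python
import numpy as np

rng = np.random.default_rng(0)

# State: [level, s_0, ..., s_5], where s_0 is "today's" day-of-week effect.
# The 6 seasonal states encode a 7-day pattern constrained to sum to zero.
T = np.zeros((7, 7))
T[0, 0] = 1.0                    # level follows a random walk
T[1, 1:] = -1.0                  # new seasonal state = -(sum of last 6)
T[2:, 1:-1] = np.eye(5)          # shift the remaining effects down one slot
Z = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # observe level + today's effect

def forecast(x_last, sigma_level, sigma_season, sigma_obs, steps):
    """Iterate the transition/observation equations from the last state."""
    x = x_last.copy()
    ys = []
    for _ in range(steps):
        eta = np.zeros(7)
        eta[0] = rng.normal(0, sigma_level)   # level innovation
        eta[1] = rng.normal(0, sigma_season)  # seasonal innovation
        x = T @ x + eta
        ys.append(Z @ x + rng.normal(0, sigma_obs))
    return np.array(ys)

y_future = forecast(np.zeros(7), 0.1, 0.05, 0.2, steps=14)
```

In a full Bayesian treatment you would repeat this loop once per posterior draw of the parameters and last state, giving forecast uncertainty for free.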

I implemented the model in a gist here. Obviously it’s still slow (10 minutes to sample 1 parameter, lol), but it’s a nice proof of concept.


Thanks. I’d like to understand more about why the GRW is no longer a simple Distribution. Mostly I am curious, but I work with time series a lot so if something fundamental has changed, I’d like to understand it. Is there any documentation about this? Can you provide a high-level summary as to what changed here? That will help me as I look through the code.

The biggest change is that automatic imputation of multivariate distributions (which all timeseries are) is not supported in pymc>=4.

Everything else should work the same.
