Modelling a time series of cumulative maxima

I’m back with more questions!

I want to make two plots for the model.

  1. A posterior predictive plot: a high-density interval of the model, conditional on the posterior for \mu and \sigma, drawn on top of the actual observations, to see whether the model does a good job of predicting them.
  2. An extrapolation plot: the model’s predictions for the running maximum beyond the observed time range.

For 1) I tried following the example here.

My code is:

import numpy as np
import pymc3 as pm
import arviz as az

# Sample parameters
N = 100
mu = 10.0
sigma = 1.0

# Random sampling: draw one synthetic running-maximum series
def random(point=None, size=None):
  # Retrieve parameters
  mu = point['mu']
  sigma = point['sigma']

  # Generate N normal draws and keep the running maximum
  obss = []
  obs = -np.inf
  for t in range(N):
    aux = mu + sigma * np.random.randn()
    obs = max(obs, aux)
    obss.append(obs)

  return np.array(obss)

obss = random({'mu': mu, 'sigma':sigma})

### Separate the data where there were jumps in the running max
### and the data where the running max stayed constant
jump_mask = np.insert(np.diff(obss) > 0, 0, True)  # the first point is always a new record
jump_data = obss[jump_mask] # Slice out the data where a new record is set
flat_data = obss[~jump_mask] # Slice the data where the record is maintained
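As a sanity check that the masks do what I intend, here is the same slicing on a tiny hand-made running maximum:

```python
import numpy as np

toy = np.array([1.0, 3.0, 3.0, 5.0, 5.0, 5.0])  # a running maximum by hand
mask = np.insert(np.diff(toy) > 0, 0, True)     # True wherever a new record is set
print(toy[mask])   # new records:        [1. 3. 5.]
print(toy[~mask])  # maintained records: [3. 5. 5.]
```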

# Log-likelihood of the joint distribution of the data
def logp(jump_data, flat_data, mu, sigma):
  x_dist = pm.Normal.dist(mu=mu, sigma=sigma)
  
  # Add likelihood contribution from the jump data
  log_likelihood = pm.math.sum(x_dist.logp(jump_data))

  # Add likelihood contribution from the flat data
  log_likelihood += pm.math.sum(x_dist.logcdf(flat_data))
  
  return log_likelihood
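My reasoning for this likelihood, in case it is wrong: at a jump the underlying draw is the new record itself, so it contributes the normal log-pdf; on a flat step the draw fell below the current record m, so it contributes log P(X <= m), the log-cdf evaluated at the running max (which is exactly the value stored in flat_data, since obss already holds the running max). The same quantity computed outside the PyMC3 graph with plain numpy/scipy, on made-up toy data:

```python
import numpy as np
from scipy import stats

def logp_numpy(jump_data, flat_data, mu, sigma):
  # Jumps: the draw equals the new record -> log pdf
  ll = stats.norm.logpdf(jump_data, loc=mu, scale=sigma).sum()
  # Flat steps: the draw fell below the record m -> log P(X <= m) = log cdf(m)
  ll += stats.norm.logcdf(flat_data, loc=mu, scale=sigma).sum()
  return ll

jump = np.array([9.5, 10.2, 11.1])  # times a new record was set
flat = np.array([11.1, 11.1])       # times the record 11.1 was maintained
print(logp_numpy(jump, flat, mu=10.0, sigma=1.0))
```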

# PyMC3 model definition
with pm.Model() as model:
  mu = pm.Uniform('mu', 0., 20.)
  sigma = pm.Uniform('sigma', 0.5, 1.5)
  Ys = pm.DensityDist('Ys', logp, random=random,
                      observed = {'jump_data':jump_data, 
                                  'flat_data':flat_data, 
                                  'mu': mu, 
                                  'sigma': sigma}
                      )
  trace = pm.sample(1000, chains=2, tune=1000,
                    return_inferencedata=True, 
                    idata_kwargs={"density_dist_obs": False})

  post_pred = pm.sample_posterior_predictive(trace, var_names=["Ys"])
  idata = az.concat(trace, az.from_pymc3(posterior_predictive=post_pred))
  az.plot_ppc(idata)

When I execute the above snippet, I get an internal Theano error: MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(mu_interval__), was not provided and not given a value.

For 2), I don’t even know where to start.
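Is something like the sketch below the right direction? It just continues the running max forward by simulating new observations from each posterior draw of \mu and \sigma, seeded at the last observed record. Here mu_draws and sigma_draws are placeholders for the flattened posterior samples (I assume something like trace.posterior['mu'].values.ravel() would give them):

```python
import numpy as np

def extrapolate_running_max(last_max, mu_draws, sigma_draws, horizon, rng=None):
  # One simulated future path per posterior draw; returns shape (n_draws, horizon)
  rng = np.random.default_rng() if rng is None else rng
  draws = np.asarray(mu_draws)[:, None] + np.asarray(sigma_draws)[:, None] \
          * rng.standard_normal((len(mu_draws), horizon))
  # Running max along time, seeded by the last observed record
  return np.maximum(np.maximum.accumulate(draws, axis=1), last_max)

# Placeholder posterior samples (constant here, real draws in practice)
mu_draws = np.full(500, 10.0)
sigma_draws = np.full(500, 1.0)
paths = extrapolate_running_max(last_max=12.0, mu_draws=mu_draws,
                                sigma_draws=sigma_draws, horizon=50)
# Pointwise quantiles across draws give an extrapolation band
lo, hi = np.quantile(paths, [0.03, 0.97], axis=0)
```

The band from the pointwise quantiles could then be drawn to the right of the observed series, the same way the HDI would be drawn in the posterior predictive plot — but I’d appreciate confirmation that this is a sensible way to extrapolate.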

PS: Still very interested in your thoughts about the divergence and what parameters to use for the NUTS sampling if you think more about that!