I’m back with more questions!
I want to make two plots for the model.
- A posterior predictive plot, showing a high density interval of the model conditional on the posterior for \mu, \sigma on top of the plot of actual observations to see whether the model is doing a good job at predicting those
- An extrapolation plot,
For 1) I tried following the example here.
My code is:
# Sample parameters
N = 100
mu = 10.0
sigma = 1.0
import numpy as np
import pymc3 as pm
import arviz as az
# Random sampling
def random(point=None, size=None):
# Retrive parameters
mu = point['mu']
sigma = point['sigma']
# Generate sample
ts = np.arange(N)
obss = []
obs = -np.math.inf
for t in ts:
aux = mu + sigma*np.random.randn()
obs = np.max([obs, aux])
obss.append(obs)
obss = np.array(obss)
return obss
obss = random({'mu': mu, 'sigma':sigma})
### Separate the data where there were jumps in the running max
### and the data where the running max stayed constant
jump_mask = np.insert(np.diff(obss) > 0,0,True)
jump_data = obss[jump_mask] # Slice out the data where a new record is set
flat_data = obss[~jump_mask] # Slice the data where the record is maintained
# Logarithmic likelihood of the joint distribution of data
def logp(jump_data, flat_data, mu, sigma):
x_dist = pm.Normal.dist(mu=mu, sigma=sigma)
# Add likelihood contribution from the jump data
log_likelihood = pm.math.sum(x_dist.logp(jump_data))
# Add likelihood contribution from the flat data
log_likelihood += pm.math.sum(x_dist.logcdf(flat_data))
return log_likelihood
# PyMC3 model definition
with pm.Model() as model:
mu = pm.Uniform('mu', 0., 20.)
sigma = pm.Uniform('sigma', 0.5, 1.5)
Ys = pm.DensityDist('Ys', logp, random=random,
observed = {'jump_data':jump_data,
'flat_data':flat_data,
'mu': mu,
'sigma': sigma}
)
trace = pm.sample(1000, chains=2, tune=1000,
return_inferencedata=True,
idata_kwargs={"density_dist_obs": False})
post_pred = pm.sample_posterior_predictive(trace, var_names=["Ys"])
idata = az.concat(trace, az.from_pymc3(posterior_predictive=post_pred))
az.plot_ppc(idata)
When I execute the above snippet, I am getting an internal Theano error MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(mu_interval__), was not provided and not given a value..
For 2… I don’t even know how to start.
PS: Still very interested in your thoughts about the divergence and what parameters to use for the NUTS sampling if you think more about that!