InferenceData conversion error

I am trying to run prior predictive checks for my extreme value model, but they fail with the error shown below. I am running PyMC3 version 3.11.4 and ArviZ version 0.11.4.
The model setup is:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

def gev_logp(value, loca, sig, xi):
    scaled = (value - loca) / sig
    # General GEV log-density, used when xi is not close to zero
    logp_xi_not_zero = -(tt.log(sig)
             + ((xi + 1) / xi) * tt.log1p(xi * scaled)
             + (1 + xi * scaled) ** (-1/xi))
    # Limiting form, used when xi is close to zero
    logp_xi_zero = -tt.log(sig) + (xi+1)*(-(value - loca)/sig) - tt.exp(-(value - loca)/sig)
    logp = tt.switch(tt.abs_(xi) > 1e-4, logp_xi_not_zero, logp_xi_zero)
    return tt.sum(logp)


with pm.Model() as model_gev:
loca = pm.Normal('loca', mu=1, sigma=100)
sig= pm.Normal('sig',mu=1, sigma=100)
xi = pm.TruncatedNormal('xi', mu=0, sigma=0.4, lower=-0.6, upper=0.6) 
gev = pm.DensityDist('gev', gev_logp, observed = {'value':data, 'loca':loca, 'sig':sig, 'xi':xi})
z_p = pm.Deterministic('z_p', loca - sig/xi*(1 - (-np.log(1-p))**(-xi)))
trace = pm.sample(3000, cores=4, chains=4, tune=2000, return_inferencedata=True, idata_kwargs={"density_dist_obs": False}, target_accept=0.99)

The InferenceData object (the trace) contains the posterior, sample_stats, and log_likelihood groups.
I tried to run prior predictive checks with the code below:

idata= pm.sample_prior_predictive(samples=1000, model=model_gev)
az.plot_ppc(idata, group="prior", figsize=(12, 6))
ax = plt.gca()
ax.set_xlim([2, 6])
ax.set_ylim([0, 2]);

The error I get is

TypeError: `data` argument must have the group "prior_predictive" for ppcplot

In PyMC3 3.11.x, sample_prior_predictive returns a dictionary with the prior samples. To use plot_ppc, you need to convert it to InferenceData so that both the prior predictive samples and the observed data are available for plotting. You can do that with:

with model_gev:
    idata = az.from_pymc3(prior=pm.sample_prior_predictive(samples=1000))

@OriolAbril I already tried to convert idata to InferenceData, but I get a StopIteration error when I do.

With the command above? What error do you get? Can you share the output of print(pm.sample_prior_predictive(...))?

Yes, I tried it before. Here is the error:

StopIteration                             Traceback (most recent call last)
/var/folders/5l/8x6p9r_x6xvdknqb0982r5hh0000gn/T/ipykernel_5628/ in <module>
      1 with model_gev:
----> 2     idata = az.from_pymc3(prior=pm.sample_prior_predictive(samples=1000))

/opt/anaconda3/lib/python3.8/site-packages/arviz/data/ in from_pymc3(trace, prior, posterior_predictive, log_likelihood, coords, dims, model, save_warmup, density_dist_obs)
    578     InferenceData
    579     """
--> 580     return PyMC3Converter(
    581         trace=trace,
    582         prior=prior,

/opt/anaconda3/lib/python3.8/site-packages/arviz/data/ in __init__(self, trace, prior, posterior_predictive, log_likelihood, predictions, coords, dims, model, save_warmup, density_dist_obs)
    166                 )
--> 168             aelem = arbitrary_element(get_from)
    169             self.ndraws = aelem.shape[0]

/opt/anaconda3/lib/python3.8/site-packages/arviz/data/ in arbitrary_element(dct)
    147         def arbitrary_element(dct: Dict[Any, np.ndarray]) -> np.ndarray:
--> 148             return next(iter(dct.values()))
    150         if trace is None:


And I just get an empty dictionary, {}, with:

idata= pm.sample_prior_predictive(samples=500, model=model_gev)

I am not sure why you get an empty dict when sampling the prior, but this is what needs to be fixed. You have no prior samples, so you can’t convert them or plot them.
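The StopIteration in the traceback above is what Python raises when next() is called on an iterator with nothing in it, which is the case for the values of an empty dict. A minimal sketch of that behaviour follows. The helper first_array mirrors the arbitrary_element line shown in the traceback, and check_prior_samples is a hypothetical guard added here for illustration; neither is part of ArviZ or PyMC3.

```python
import numpy as np


def first_array(samples):
    # Same expression as the arbitrary_element line in the traceback.
    return next(iter(samples.values()))


def check_prior_samples(samples):
    """Hypothetical guard: fail with a readable message before converting."""
    if not samples:
        raise ValueError("sample_prior_predictive returned no variables")
    return samples


# An empty dict reproduces the StopIteration from the traceback.
try:
    first_array({})
    raised = False
except StopIteration:
    raised = True

# A non-empty dict of draws works, and the first axis gives the number of draws.
ok = {"loca": np.zeros(1000)}
n_draws = first_array(check_prior_samples(ok)).shape[0]
```

Checking the dict before calling az.from_pymc3 would turn the opaque StopIteration into a message that points at the real problem, which is the empty prior sample.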

Is there any output printed or warning when sampling the prior? I assume your model is the one above but correctly indented?

@OriolAbril Exactly, I am getting the same problem when I try to perform posterior predictive checks. I just get an empty dictionary. I don’t see any warnings; the model setup is the same as above.

It makes sense that the posterior predictive is empty. You are using a DensityDist, so the gev variable, which would be the only one in the posterior predictive, does not have a random method, and PyMC3 can’t generate samples for it. But I think the output of sample_prior_predictive should contain loca, sig and xi.

Can you try passing variable names explicitly for the prior?

If you need prior samples for gev, then I think you need to remove the observed kwarg from the DensityDist, make sure the shape is right, and then sample with NUTS, which will then sample from the prior.

@OriolAbril I am now able to sample from the prior with:

with model_gev:
    idata= pm.sample_prior_predictive(samples=1000,var_names=['xi','loca','sig'])

However, I still can’t convert the resulting dictionary to InferenceData with az.from_pymc3. I get the following error:

ValueError                                Traceback (most recent call last)
/var/folders/5l/8x6p9r_x6xvdknqb0982r5hh0000gn/T/ipykernel_7624/ in <module>
      1 with model_gev:
----> 2     idata = az.from_pymc3(prior=pm.sample_prior_predictive(samples=1000),var_names=['xi','loca','sig'] )

/opt/anaconda3/lib/python3.8/site-packages/pymc3/ in sample_prior_predictive(samples, model, var_names, random_seed)
   1942     names = get_default_varnames(vars_, include_transformed=False)
   1943     # draw_values fails with auto-transformed variables. transform them later!
-> 1944     values = draw_values([model[name] for name in names], size=samples)
   1946     data = {k: v for k, v in zip(names, values)}

/opt/anaconda3/lib/python3.8/site-packages/pymc3/distributions/ in draw_values(params, point, size)
    789                     # This may fail for autotransformed RVs, which don't
    790                     # have the random method
--> 791                     value = _draw_value(next_, point=point, givens=temp_givens, size=size)
    792                     givens[] = (next_, value)
    793                     drawn[(next_, size)] = value

/opt/anaconda3/lib/python3.8/site-packages/pymc3/distributions/ in _draw_value(param, point, givens, size)
    988                 return dist_tmp.random(point=point, size=size)
    989             else:
--> 990                 return param.distribution.random(point=point, size=size)
    991         else:
    992             if givens:

/opt/anaconda3/lib/python3.8/site-packages/pymc3/distributions/ in random(self, point, size, **kwargs)
    621             return samples
    622         else:
--> 623             raise ValueError(
    624                 "Distribution was not passed any random method. "
    625                 "Define a custom random method and pass it as kwarg random"

ValueError: Distribution was not passed any random method. Define a custom random method and pass it as kwarg random

The error message you are sharing comes from prior predictive sampling, not from conversion: the frame that fails in your traceback is inside sample_prior_predictive.

I’m not 100% sure, but it seems to indicate that you can’t sample from the prior because of this missing random method.
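For reference, the kind of random method the error message asks for can be sketched in plain NumPy by inverting the GEV distribution function. This is only an illustration under stated assumptions: gev_random is a hypothetical helper, it assumes sig > 0, it reuses the 1e-4 threshold from gev_logp to switch to the Gumbel limit, and the parameter values in the usage line are made up. The exact signature PyMC3 expects for the random kwarg is not shown in this thread, so the function is not wired into DensityDist here.

```python
import numpy as np


def gev_random(loca, sig, xi, size=None, rng=None):
    """Draw GEV samples by inverse-CDF sampling (illustrative helper, not a PyMC3 API).

    Assumes sig > 0. Parameters must broadcast against `size`.
    """
    rng = np.random.default_rng() if rng is None else rng
    loca, sig, xi = np.broadcast_arrays(
        np.asarray(loca, dtype=float),
        np.asarray(sig, dtype=float),
        np.asarray(xi, dtype=float),
    )
    if size is None:
        size = loca.shape
    u = rng.uniform(size=size)
    e = -np.log(u)  # standard exponential draws
    small = np.abs(xi) < 1e-4  # same threshold as in gev_logp
    safe_xi = np.where(small, 1.0, xi)  # avoid dividing by ~0 in the unused branch
    # Solve u = exp(-(1 + xi*z)**(-1/xi)) for z; Gumbel limit when xi is ~0.
    z = np.where(small, -np.log(e), (e ** (-safe_xi) - 1.0) / safe_xi)
    return loca + sig * z


# Usage with made-up parameter values (assumptions, not taken from the thread).
draws = gev_random(loca=3.0, sig=0.5, xi=0.1, size=1000, rng=np.random.default_rng(0))
```

Drawing loca, sig and xi from their priors first and passing the resulting arrays to a function like this would give prior predictive draws of the data without relying on a random method of the DensityDist.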

@OriolAbril I am not sure what’s wrong here. Sometimes I am able to sample from the prior, and sometimes it fails with these error messages. I got the dictionary of prior samples with the same command, but when I rerun the model, it starts producing errors. Here is the output from one of the successful sampling runs:

{'z_p': array([ -293.55628613,   -11.21258165,  -212.67209372, ...,
          712.71902619, -4229.54844337,   124.87198188]),
 'sig': array([ -92.2917161 ,    5.90225754,  -47.4513203 , ...,   32.56991493,
        -225.82495554,   35.12457156]),
 'xi': array([-0.36085082,  0.01659489, -0.53413003, ...,  0.56778793,
         0.52019784, -0.2345415 ]),
 'loca': array([ -86.4266613 ,  -39.42707952, -131.4457019 , ...,  -11.48877585,
          88.21003654,   26.02548712])}

I tried to create the PPC plot after this successful sampling run, but it still returned the same error:

TypeError: `data` argument must have the group "prior_predictive" for ppcplot
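The dictionary printed above holds only z_p, sig, xi and loca, with no draws of the observed variable gev, which is consistent with the "prior_predictive" group still being missing. A minimal sketch, assuming an ArviZ version with the 0.11-style az.from_dict signature, of building an InferenceData by hand that contains both groups plot_ppc looks for. The arrays are synthetic placeholders, not the data from this thread, and the names fake_obs and fake_prior_pred are hypothetical.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
n_draws, n_obs = 500, 40

# Placeholder "observed" values and prior predictive draws (synthetic, for illustration).
fake_obs = rng.gumbel(loc=3.0, scale=0.5, size=n_obs)
# from_dict expects a leading (chain, draw) pair of dimensions for sampled groups.
fake_prior_pred = rng.gumbel(loc=3.0, scale=0.5, size=(1, n_draws, n_obs))

idata = az.from_dict(
    prior={"loca": rng.normal(1.0, 100.0, size=(1, n_draws))},
    prior_predictive={"gev": fake_prior_pred},
    observed_data={"gev": fake_obs},
)
```

With both prior_predictive and observed_data present under the same variable name, az.plot_ppc(idata, group="prior") has the groups that the TypeError refers to.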