I am unable to save traces to file using
gp.marginal_likelihood is conditioned on random variables. To see this, consider the following toy example:
import numpy as np
import pymc3 as pm
x = np.linspace(-1, 1, 1000)
y = np.random.normal(loc=0.0, scale=0.001, size=len(x)) + 0.1 + np.sin(x)
with pm.Model() as model:
mean = pm.Normal("mean", mu=0.1, sd=0.5)
gp = pm.gp.Marginal(cov_func=pm.gp.cov.Linear(input_dim=1, c=1))
y_pred = gp.marginal_likelihood('y_pred', X=x.reshape(-1, 1), y=y.flatten() - mean, noise=0.1)
trace = pm.sample(return_inferencedata=True, tune=100, draws=300)
This raises the error
ValueError: unable to infer dtype on variable 'gp'; xarray cannot serialize arbitrary Python objects
Has anyone encountered this problem before, knows how it is meant to be solved in PyMC, or has a suggestion for a workaround?
Thank you all very much in advance.
You should be able to fix this adding
trace.posterior = trace.posterior.astype(float) between creation and saving.
Still, can you share the output of
print(trace.posterior)? It is very strange that it gets assigned an
object dtype instead of a numeric one, and that must come from earlier on, when converting to inferencedata arviz leaves the input arrays completely untouched and you can for example pass dask arrays instead of numpy without any problem (xarray supports both).
Dimensions: (chain: 2, draw: 300)
* chain (chain) int64 0 1
* draw (draw) int64 0 1 2 3 4 5 6 7 8 ... 292 293 294 295 296 297 298 299
mean (chain, draw) float64 0.9928 0.997 0.999 ... 0.9992 1.02 0.9917
The problem is, however, not with the posterior, but with
AttributeError: 'TensorVariable' object has no attribute 'item'
while the following works perfectly fine, but doesn’t save the observed data.
trace.to_netcdf(filename='test', groups=["posterior", "log_likelihood", "sample_stats"])
trace.observed_data.astype(float) doesn’t work, raising
Cannot interpret '<built-in function array>' as a data type
Not sure what is going on here, maybe @michaelosthege could help? It looks like gp.marginal_likelihood registers an observed variable that is not retrieved properly.
My only guess is it might be because all other distributions do not allow FreeRVs as to be passed with
observed kwarg and therefore the converter doesn’t support this case, not sure why it would be allowed here though.
y parameter to the
gp.marginal_likelihood is a
Looks like PyMC can deal with that when it comes to the differentiation & sampling, but like Oriol correctly pointed out, the code that converts to
InferenceData never expected that the
observed may be symbolic.
When it comes to PyMC
v3, this will be a
wontfix bug, but there’s a chance the problem persists with
v4 in which case something should be done about it.
However, there may be a valid reparametrization of the model as a
pm.gp.Latent gaussian process.
Essentially something in the direction of
gp = pm.gp.Latent(mean, cov)
pred = gp.prior(X=x.reshape(-1, 1))
L = pm.Normal("likelihood", mu=pred + mean, sigma=0.1, observed=y)
mean_func=pm.gp.mean.Constant(mean) should also work.here.