UserWarning: The effect of Potentials on other parameters is ignored during prior predictive sampling. This is likely to lead to invalid or biased predictive samples

import pymc as pm

# our model
with pm.Model() as marginal_gp_model:
    # Specify the covariance function.
    length_scale = pm.HalfCauchy("length_scale", beta=0.1, shape=(11,))
    cov_func = pm.gp.cov.Matern52(11, ls=length_scale)

    # Specify the GP. The default mean function is `Zero`.
    gp = pm.gp.MarginalApprox(cov_func=cov_func, approx="FITC")
    Xu = pm.gp.util.kmeans_inducing_points(20, train_X.values)

    # The scale of the noise term can be provided.
    sigma = pm.HalfCauchy("sigma", beta=0.01)
    y_ = gp.marginal_likelihood("y_", X=train_X.values, Xu=Xu, y=train_Y.values, sigma=sigma)

I am getting the error here:

with marginal_gp_model:
    # `trace` is assumed to come from an earlier pm.sample() run (not shown)
    f_pred = gp.conditional("f_pred", test_X.values, given={"X": train_X.values, "y": train_Y.values})
    pred_samples = pm.sample_posterior_predictive(trace, random_seed=42, var_names=["f_pred"])

TL;DR: Ignore the warning; the results you get from gp.conditional are correct :+1:

The FITC, DTC, and VFE approximations in MarginalApprox all effectively approximate a Normal likelihood. That approximate likelihood is implemented with pm.Potential, which is how you add arbitrary terms to the log probability that NUTS uses in pm.sample. Since there is no way to draw samples from an arbitrary term added to the log probability, PyMC naturally can't draw predictive samples from it. That warning (not an error!) appears whenever your model contains a pm.Potential term, and since you know your model best, you're the one who can judge whether those potential terms will cause problems.
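If you want to see exactly which terms the warning refers to, the Potential terms in a model are listed on the model object. A quick check, using the marginal_gp_model defined above:

print(marginal_gp_model.potentials)  # expect the MarginalApprox likelihood term, e.g. [y_]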

The GP approximations will work fine, because gp.conditional creates a new distribution (not a potential) whose samples are what you’re after.
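For completeness, here is a minimal sketch of pulling those draws out, assuming pred_samples is the InferenceData returned by your sample_posterior_predictive call above:

# Each draw of "f_pred" is a sample of the latent GP at the test inputs
f_draws = pred_samples.posterior_predictive["f_pred"]
f_mean = f_draws.mean(dim=("chain", "draw"))  # pointwise predictive mean
f_sd = f_draws.std(dim=("chain", "draw"))     # pointwise predictive spread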

A case where the warning does deserve attention is something like the following two models, which are equivalent as far as pm.sample is concerned:

import numpy as np
import pymc as pm

y = 3 + np.random.randn(500)

with pm.Model() as model1:
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)
    pm.Normal("likelihood", mu=mu, sigma=1.0, observed=y)

with pm.Model() as model2:
    mu = pm.Flat("mu")
    pm.Potential("mu_prior", pm.logp(rv=pm.Normal.dist(mu=0.0, sigma=10.0), value=mu))
    pm.Potential("likelihood", pm.logp(rv=pm.Normal.dist(mu=mu, sigma=1.0), value=y))

The results are the same if you run pm.sample() on both models. But posterior predictive sampling from model2 won't work, because PyMC only sees the potential terms; there is no observed random variable it can resample.
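Concretely, a sketch of both calls (a hypothetical run; the exact warning text may differ):

with model1:
    idata1 = pm.sample(random_seed=1)
    ppc1 = pm.sample_posterior_predictive(idata1)  # resamples "likelihood"

with model2:
    idata2 = pm.sample(random_seed=1)              # posterior for mu matches model1
    ppc2 = pm.sample_posterior_predictive(idata2)  # warns about the Potentials and
                                                   # produces no predictive draws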

Note that PyMC can't draw forward samples from a pm.Flat (it has no random generator), but you could replace it with a Normal with a huge sigma and get the same effect.
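For example, a hypothetical, essentially-flat stand-in:

mu = pm.Normal("mu", mu=0.0, sigma=1e6)  # effectively flat, but forward-sampleable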


Hey,
So I tried running my model, which is:

with pm.Model() as model_3:
    # Specify the covariance function.
    length_scale_3 = pm.Normal("length_scale_3", 0, 10)
    cov_func = pm.gp.cov.ExpQuad(train_X.shape[1], ls=[length_scale_3])

    # Specify the GP. The default mean function is `Zero`.
    gp_3 = pm.gp.MarginalApprox(cov_func=cov_func, approx="FITC")
    Xu = pm.gp.util.kmeans_inducing_points(20, train_X.values)

    # The scale of the noise term can be provided.
    sigma = pm.HalfCauchy("sigma", beta=5)
    y_3 = gp_3.marginal_likelihood("y_3", X=train_X.values, Xu=Xu, y=train_Y.values, sigma=sigma)

and while running this,

with model_3:
    pred_samples_3 = pm.sample_posterior_predictive(trace_3, model=gp_3, var_names=["f_pred_3"], return_inferencedata=True)

I am getting the error:

AttributeError: 'MarginalApprox' object has no attribute 'potentials'

I am using PyMC v4.4.0.
I need to check the performance of the model and extract feature importance.

What am I doing wrong here?