Computing out-of-sample deviance

I have been trying to compute deviance out-of-sample. The following code works okay in-sample,

waic = pm.stats.waic(trace_, model_)
deviance = waic.WAIC - waic.p_WAIC

where by 'in-sample' I mean a model whose input variables are defined explicitly as Theano shared variables set to the training dataset, with the model then estimated via NUTS.

When I call the set_value method on my Theano shared variables to swap in the values from the test dataset and then repeat the above code snippet, I get the following error,

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/Dropbox/data_science/workspace/python/pymc3-example-project/venv/lib/python3.6/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    902             outputs =\
--> 903                 self.fn() if output_subset is None else\
    904                 self.fn(output_subset=output_subset)

ValueError: Input dimension mis-match. (input[3].shape[0] = 3340, input[4].shape[0] = 835)

which suggests to me that the test data has not been set correctly (the test dataset has 835 observations, while the training dataset has 3340). Note that I observe sample_ppc correctly sampling from the posterior predictive distribution for the test dataset.
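For example, after swapping in the test data with set_value, the following runs fine and returns posterior predictive draws for the test set (trace_ and model_ as in the snippet above):

ppc = pm.sample_ppc(trace_, samples=1000, model=model_)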

I experience the same phenomenon when I 'borrow' the general approach found in the thread 'Evaluate logposterior at sample points'.

If someone could point me in the right direction, it would be greatly appreciated!

This is an interesting use case. I think in principle it should work, for example:

import numpy as np
import theano
import pymc3 as pm

y = theano.shared(np.asarray([6., 8., 8.]))
with pm.Model() as m:
    p = pm.Beta('p', 1., 1.)
    obs = pm.Binomial('y', p=p, n=10, observed=y)
    tr = pm.sample()

In [8]: pm.waic(tr, m)
Out[8]: WAIC_r(WAIC=10.390547057421303, WAIC_se=0.929771615409605, p_WAIC=0.5304594854066094, var_warn=0)

y.set_value(np.ones(4) * 8)

In [12]: pm.waic(tr, m)
Out[12]: WAIC_r(WAIC=12.335752053177222, WAIC_se=0.0, p_WAIC=0.43684314201765617, var_warn=0)

So I would suggest checking your model to see if there are:
1. Random variables with explicit shape
2. Deterministic nodes wrapped in pm.Deterministic. If so, remove the wrapper:
ymu = pm.Deterministic('mu', X.dot(beta)) --> ymu = X.dot(beta)
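To illustrate point 2, here is a minimal sketch of what that change looks like in a small regression model (X_train and y_train are placeholder arrays, not taken from your model):

X_shared = theano.shared(X_train)  # covariates as a shared variable

with pm.Model() as m:
    beta = pm.Normal('beta', mu=0., sd=1., shape=X_train.shape[1])
    sigma = pm.HalfCauchy('sigma', beta=2.)

    # avoid: mu = pm.Deterministic('mu', X_shared.dot(beta))
    mu = X_shared.dot(beta)  # plain Theano tensor, re-evaluated when X_shared changes

    y_obs = pm.Normal('y_obs', mu=mu, sd=sigma, observed=y_train)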

Thank you very much for the quick reply - it’s greatly appreciated.

I am using RVs with explicit shape, but in this instance I don’t see why it should cause a problem.

To be more specific, the code I am using to switch between train and test datasets is,

from theano import shared

def load_test_data():
    lots.set_value(X_test[:, 0])
    mean_rsv.set_value(X_test[:, 1])
    avg_start_bid.set_value(X_test[:, 2])
    BPL.set_value(X_test[:, 3])
    auction_mech.set_value(X_test[:, 4].astype('int32'))
    country.set_value(X_test[:, 5].astype('int32'))
    start_bids.set_value(X_test[:, 6].astype('int32'))
    return None

def load_train_data():
    lots.set_value(X_train[:, 0])
    mean_rsv.set_value(X_train[:, 1])
    avg_start_bid.set_value(X_train[:, 2])
    BPL.set_value(X_train[:, 3])
    auction_mech.set_value(X_train[:, 4].astype('int32'))
    country.set_value(X_train[:, 5].astype('int32'))
    start_bids.set_value(X_train[:, 6].astype('int32'))
    return None

Y = y_train
lots = shared(X_train[:, 0])
mean_rsv = shared(X_train[:, 1])
avg_start_bid = shared(X_train[:, 2])
BPL = shared(X_train[:, 3])
auction_mech = shared(X_train[:, 4].astype('int32'))
country = shared(X_train[:, 5].astype('int32'))
start_bids = shared(X_train[:, 6].astype('int32'))

And the model is defined as,

load_train_data()
ror_model_basic = pm.Model()

n_entities = len(set(X_df['country'].tolist()))
n_auction_mech = len(set(X_df['auction_mech'].tolist()))

with ror_model_basic:

    # priors for unknown model parameters
    alpha_country = pm.Normal('alpha_country', mu=0, sd=0.25, shape=n_entities)
    alpha_auction_mech = pm.Normal('alpha_auction_mech', mu=0, sd=0.25, shape=n_auction_mech)
    alpha_start_bids = pm.Normal('alpha_start_bids', mu=0, sd=0.25, shape=2)
    
    beta_lots = pm.Normal('beta_lots', mu=0, sd=0.25, shape=1)
    beta_mean_rsv = pm.Normal('beta_mean_res', mu=0, sd=0.25, shape=1)
    beta_BPL = pm.Normal('beta_BPL', mu=0, sd=0.25, shape=1)

    sigma = pm.HalfCauchy('sigma', beta=2)

    # expected value
    mu = (alpha_country[country] + alpha_auction_mech[auction_mech] + alpha_start_bids[start_bids]
          + beta_lots[0] * lots + beta_mean_rsv[0] * mean_rsv + beta_BPL[0] * BPL)    

    # likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

If you’re curious and want some context, see https://github.com/AlexIoannides/pymc3-example-project (but bear in mind I’m working on a new branch).

… and I believe I have just spotted the error: I am not changing Y between the train and test datasets. Is this diagnosis correct?
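If so, then I presume the fix is to make Y a Theano shared variable as well and to swap it alongside the covariates, along these lines (y_test being the held-out response, not shown above):

Y = shared(y_train)  # response as a shared variable, like the covariates

def load_test_data():
    # ... set_value on the covariate shared variables, as before ...
    Y.set_value(y_test)

def load_train_data():
    # ... set_value on the covariate shared variables, as before ...
    Y.set_value(y_train)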

This obviously wasn't a problem for pm.sample_ppc, as I'm guessing it doesn't need Y (it must only need the independent variables). Is my understanding correct?

Many thanks.

Yes, you are correct :wink: And yes, pm.sample_ppc only does a forward pass and does not evaluate against the new y.
Glad you found the solution!

Nice one - thanks @junpenglao!

For reference, I was interested in calculating the out-of-sample deviance as an alternative to looking at metrics such as RMSE/MAD on the test data, and in comparing it with WAIC, which aims to estimate the out-of-sample deviance.
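Concretely, the comparison I had in mind goes something like this, assuming the shared-variable setup above with Y also made a shared variable (and swapped inside load_test_data/load_train_data as sketched earlier), and with trace_basic as an illustrative name for the NUTS trace:

# WAIC estimated on the training data
load_train_data()
waic_train = pm.stats.waic(trace_basic, ror_model_basic)

# deviance evaluated directly on the held-out data
load_test_data()
waic_test = pm.stats.waic(trace_basic, ror_model_basic)
deviance_test = waic_test.WAIC - waic_test.p_WAIC  # as in my original snippet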
