Making predictions with a different number of data points than used for estimating the parameters

Is it possible to estimate the parameters of my model using N data points, and then swap out the theano shared input with other values (for out-of-sample prediction, for example) from a dataset that has a different number of data points?

I don’t think set_value on a theano shared variable allows that? And if not, is there a workaround?

If you are using a theano.shared variable as input, you can do that with input_X.set_value, as long as no shape information related to input_X is hard-coded in your model.
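
As a rough illustration of that pattern, here is a minimal sketch. The simple linear model, the data sizes, and every name other than input_X and input_Y are placeholders for illustration, not the actual BayesianNeuralNetwork code from this thread:

```python
import numpy as np
import theano
import pymc3 as pm

# Placeholder training data: 500 points.
X_train = np.random.randn(500, 1)
y_train = 2.0 * X_train[:, 0] + np.random.randn(500)

# Wrap the inputs in theano.shared so their values can be swapped later.
input_X = theano.shared(X_train)
input_Y = theano.shared(y_train)

with pm.Model() as model:
    w = pm.Normal('w', mu=0.0, sd=10.0, shape=1)
    sigma = pm.HalfNormal('sigma', sd=1.0)
    mu = pm.math.dot(input_X, w)
    # No explicit shape= argument here, so the model itself does not
    # hard-code the number of training points.
    y_obs = pm.Normal('y_obs', mu=mu, sd=sigma, observed=input_Y)
    trace = pm.sample(1000)

# Later, for out-of-sample prediction, swap in new inputs of a
# different length (whether input_Y must also be updated is the
# question discussed below).
X_new = np.random.randn(1000, 1)
input_X.set_value(X_new)
```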

There must be some kind of weirdness in my code that I’m not seeing. I don’t believe I’m explicitly hard-coding the length of my training set in my model.

Do I need to use set_value on input_Y as well? When I do, it works. When I only call set_value on input_X, I get the following error:

```
File "…/BayesianNeuralNetwork.py", line 98, in predict
  preds = pm.sample_ppc(self.trace, model=self.model)
File "~/.local/lib/python2.7/site-packages/pymc3/sampling.py", line 1129, in sample_ppc
  draw_values(vars, point=model.test_point, size=size)))
File "~/.local/lib/python2.7/site-packages/pymc3/distributions/distribution.py", line 321, in draw_values
  evaluated[param_idx] = _draw_value(param, point=point, givens=givens.values(), size=size)
File "~/.local/lib/python2.7/site-packages/pymc3/distributions/distribution.py", line 405, in _draw_value
  return dist_tmp.random(point=point, size=size)
File "~/.local/lib/python2.7/site-packages/pymc3/distributions/continuous.py", line 427, in random
  size=size)
File "~/.local/lib/python2.7/site-packages/pymc3/distributions/distribution.py", line 524, in generate_samples
  '''.format(size=size, dist_shape=dist_shape, broadcast_shape=broadcast_shape))

TypeError: Attempted to generate values with incompatible shapes:
size: 1
dist_shape: (1000,)
broadcast_shape: (1000, 1)

Exception KeyError: KeyError(<weakref at 0x7f14ff3402b8; to 'tqdm' at 0x7f14fbe35b10>,) in <bound method tqdm.__del__ of 0%| | 0/1000 [00:00<?, ?it/s]> ignored
```

In this error message, the 1000 stands for the length of the prediction dataset (while my training dataset’s length was 500).

Do I need to use set_value on input_Y as well?

Yes, you need to set both for the dimensionality of the model to be internally consistent.
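
Continuing the hypothetical sketch from above (same input_X, input_Y, model, and trace names), the prediction step would then look roughly like this. Only the shape of the new input_Y matters: posterior predictive sampling draws from the likelihood using the parameters already in the trace, so the placeholder values themselves are not used.

```python
# New inputs: 1000 points instead of the 500 used for fitting.
X_new = np.random.randn(1000, 1)
input_X.set_value(X_new)

# Resize input_Y to match the new number of points; zeros are just
# placeholders so the observed node's shape agrees with input_X.
input_Y.set_value(np.zeros(X_new.shape[0]))

ppc = pm.sample_ppc(trace, model=model)
print(ppc['y_obs'].shape)  # posterior predictive draws of length 1000
```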