Thanks for the pointer. Am I correct in understanding that your example works because they’re all zeros?
I need each row to be a different, well-defined and tractable series, so that I can sample the posterior of the parameters for specific realisations.
An example might help.
Say each row of the data is a realisation of a known Gaussian process, but each realisation has some (normally distributed) shift. We aim to find the parameters of that shift distribution, and wish to obtain a posterior for the shift of any individual realisation.
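In symbols (my own notation, nothing PyMC3-specific), with $k$ the known covariance function and $i$ indexing realisations, the setup I have in mind is roughly:

$$
y_i(x) = f_i(x) + s_i, \qquad f_i \sim \mathcal{GP}(0, k), \qquad s_i \sim \mathcal{N}(\mu, \sigma^2),
$$

where the unknowns are $\mu$ and $\sigma$, and the quantity I ultimately want is $p(s_i \mid y_i)$ for a chosen $i$.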
Data generation and model:
import numpy as np
import pymc3 as pm

n_subjects = int(1e3)
n_obs = 200
# known GP covariance: scaled rational quadratic (input_dim=1, alpha=0.2, ls=0.1)
cov = 2. * pm.gp.cov.RatQuad(1, 0.2, 0.1)
X = np.linspace(0, 2, n_obs)[:, None]
K = cov(X).eval()
# one GP draw per subject, plus a per-subject shift drawn from Normal(2, 3)
fulldata = pm.MvNormal.dist(mu=np.zeros(K.shape[0]), cov=K).random(size=n_subjects) + \
    pm.Normal.dist(2, 3).random(size=(n_subjects, 1))
batchsize = 128
batches = pm.Minibatch(fulldata, batchsize)
with pm.Model():
    mu = pm.Normal('hyper_mu', 0, 1e2)
    tau = pm.HalfCauchy('hyper_tau', 2.5)
    # one local shift per minibatch row, scaled up to all n_subjects
    shifts = pm.Normal('shifts', mu, tau=tau, shape=batchsize, total_size=n_subjects)
    # subtract each row's shift so the remainder follows the known GP
    observed = batches - shifts.dimshuffle(0, 'x')
    pm.MvNormal('realisations', mu=np.zeros(K.shape[0]), cov=K, observed=observed, shape=(batchsize, n_obs), total_size=(n_subjects, n_obs))
    approximation = pm.fit(30000)
Now, how do I get posterior samples of the shift for fulldata[0,:], or even for the first 128 realisations, i.e. fulldata[:128,:]?
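For reference, if the hyperparameters are held fixed, the shift of a single realisation has a closed-form Gaussian posterior, which would be a handy sanity check for whatever the variational route returns. Below is a minimal NumPy sketch of that check, under several assumptions: the helper names `rat_quad` and `shift_posterior` are hypothetical (not PyMC3 API), the kernel is the usual rational-quadratic form that I believe matches `pm.gp.cov.RatQuad`, the shift spread is given as a standard deviation `sigma` rather than the precision `tau` used in the model, and the hyperparameters are plugged in as point values rather than integrated over.

```python
import numpy as np


def rat_quad(x, alpha, ls):
    """Rational-quadratic kernel matrix on a 1-D grid (hypothetical helper)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return (1.0 + d2 / (2.0 * alpha * ls ** 2)) ** (-alpha)


def shift_posterior(y, K, mu, sigma):
    """Mean and sd of p(s | y) for y = f + s, f ~ N(0, K), s ~ N(mu, sigma^2).

    Hyperparameters mu and sigma are treated as known (an assumption).
    """
    ones = np.ones(len(y))
    Kinv_ones = np.linalg.solve(K, ones)            # K^{-1} 1
    precision = ones @ Kinv_ones + 1.0 / sigma ** 2  # 1' K^{-1} 1 + 1/sigma^2
    mean = (y @ Kinv_ones + mu / sigma ** 2) / precision
    return mean, np.sqrt(1.0 / precision)


# Small illustrative grid (20 points rather than 200) plus jitter,
# chosen only to keep K comfortably well conditioned.
X = np.linspace(0, 2, 20)
K = 2.0 * rat_quad(X, alpha=0.2, ls=0.1) + 1e-6 * np.eye(len(X))

# Simulate one realisation with a known shift, then recover its posterior.
rng = np.random.default_rng(0)
true_shift = rng.normal(2.0, 3.0)
y = rng.multivariate_normal(np.zeros(len(X)), K) + true_shift

post_mean, post_sd = shift_posterior(y, K, mu=2.0, sigma=3.0)
print(true_shift, post_mean, post_sd)
```

This conditions on fixed values of the hyperparameters, so it ignores their uncertainty; what I am really after is the same quantity with the hyperparameters integrated out under the fitted approximation.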