I’m experiencing what seems to be the same problem highlighted in several other posts (#1983, #1083, #1304), but none of the proposed solutions have worked for my model.
I am running a Gaussian process regression in a for loop, where on each iteration the target ‘y’ variable changes but the model structure stays the same. The code runs fine (although significantly slower) when the step method is pm.NUTS(), but fails with pm.Metropolis(), giving the error: “The broadcastable pattern of the input is incorrect for this op. Expected (True,), got (False,).”
I’ve tried both flattening and reshaping my priors, and also setting `patternbroadcast` to True for each, but have had no luck. I would appreciate any help! Any comments on the general structure of the code are also welcome: I’ve been trying to set this up correctly for some time, as running in a for loop has been causing “memory leakage” problems (#1825), so I hope my current setup is memory-efficient. On that note, I’ve found I have to define a new conditional function on each iteration, because if I keep the variable name the same I get an error that the variable already exists. Is this normal? Many thanks.
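For concreteness, these are roughly the variants I tried on each prior (shown here for `log(ℓ)` only, with names matching the model below; this is a sketch of my attempts rather than working code, since none of them helped):

```python
# tried one at a time, not together:
logℓ = pm.Normal('log(ℓ)', mu=ℓ_mu, sd=ℓ_sd)
logℓ = tt.flatten(logℓ)                                  # flattening
logℓ = logℓ.reshape((1,))                                # reshaping
logℓ = tt.patternbroadcast(logℓ.reshape((1,)), (True,))  # forcing the broadcastable pattern
```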
Below is my code. I’ve just randomly generated some fake data for the example:
```python
import numpy as np
import theano
import theano.tensor as tt
import pymc3 as pm

# 'data' is a 3D array; each (i, j) entry holds a length-n series to be predicted
# X is the same for each prediction and has shape n x N
# Xnew is a single test input with shape 1 x N
np.random.seed(1)
data = np.random.rand(60, 60, 30)   # n = 30
X = np.random.rand(30, 100)         # N = 100
Xnew = np.random.rand(1, 100)

shared_data = theano.shared(data[0, 0, :], borrow=True)

with pm.Model() as model:
    # hyperpriors
    ℓ_mu = pm.Normal('ℓ_mu', mu=-10, sd=2)
    ℓ_sd = pm.Gamma('ℓ_sd', alpha=10, beta=10)
    σf_mu = pm.Normal('σf_mu', mu=-1, sd=2)
    σf_sd = pm.Gamma('σf_sd', alpha=5, beta=10)
    σn_mu = pm.Normal('σn_mu', mu=-1, sd=2)
    σn_sd = pm.Gamma('σn_sd', alpha=5, beta=10)

    # priors (sampled on the log scale, then exponentiated)
    logℓ = pm.Normal('log(ℓ)', mu=ℓ_mu, sd=ℓ_sd)
    logσf = pm.Normal('log(σf)', mu=σf_mu, sd=σf_sd)
    logσn = pm.Normal('log(σn)', mu=σn_mu, sd=σn_sd)
    ℓ = tt.exp(logℓ)
    σf = tt.exp(logσf)
    σn = tt.exp(logσn)

    μprior = pm.gp.mean.Zero()
    Σprior = σf * pm.gp.cov.ExpQuad(input_dim=X.shape[1],
                                    active_dims=np.arange(X.shape[1]), ls=ℓ)

    # GP function, conditioned on the shared target data
    gp = pm.gp.Marginal(mean_func=μprior, cov_func=Σprior)
    y_ = gp.marginal_likelihood('y_', X=X, y=shared_data, noise=σn)

for i in range(60):
    for j in range(60):  # dimensions of the 'data' array above
        ID = i * 60 + j
        shared_data.set_value(data[i, j, :])  # swap in the next target vector
        with model:
            db = pm.backends.Text('./backends/' + str(ID))
            pm.sample(draws=1000, chains=5, step=pm.Metropolis(), random_seed=1,
                      progressbar=False, trace=db)
            trace = pm.backends.text.load('./backends/' + str(ID))
            # point estimate for each variable: mean of the per-chain means,
            # discarding the first half of each chain as burn-in
            point = {}
            for obj in trace[-1]:
                array = np.asarray([np.nanmean(np.nanmean(
                    trace.get_values(str(obj), burn=1000 // 2, combine=False), axis=1))])
                point[str(obj)] = array
            # a new conditional (with a fresh name) has to be defined on every iteration
            func = gp.conditional(str(ID), Xnew=Xnew, pred_noise=True)
            μpost, Σpost = gp.predict(Xnew, point=point, pred_noise=True)
```
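Finally, in case it is the shared-variable swapping rather than the GP that triggers the error, here is a stripped-down sketch of the same pattern without the GP (a hypothetical simplification, untested; names like `toy` and `shared_y` are mine):

```python
import numpy as np
import pymc3 as pm
import theano

np.random.seed(1)
toy_data = np.random.rand(5, 30)
shared_y = theano.shared(toy_data[0, :], borrow=True)  # same swap-in-place pattern as above

with pm.Model() as toy:
    mu = pm.Normal('mu', mu=0, sd=1)
    pm.Normal('obs', mu=mu, sd=1, observed=shared_y)

for k in range(toy_data.shape[0]):
    shared_y.set_value(toy_data[k, :])  # swap in the next target vector
    with toy:
        pm.sample(draws=200, step=pm.Metropolis(), progressbar=False)
```

If this loop also fails with the same broadcastable-pattern error under pm.Metropolis(), that would at least rule out the GP machinery.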