I am working on a bayesian model using pymc3. I have a data set with 29 rows and three columns: “K”, “B”, “M”. Where “B” and “M” are predictor variables. column B has a few missing values.
I have built the following model, which usages pm.MvNormal distribution to support the imputation of missing values.
Model definition
Ki ∼ Normal(μi, sigma)
μi = a + bB * Bi + bM log Mi
sigma ∼ Exponential(1)
a ∼ Normal(0, 0.5)
bB ∼ Normal(0, 0.5)
bM ∼ Normal(0, 0.5)
(Mi, Bi) ∼ MVNormal((muM, muB), sigma_BM)
muM ~ Normal(0.5, 1)
muB ~ Normal(0.5,1)
# data fields
# standardize is a method to make mean 0 and standard deviation 1
# output variable
K_ = standardize(d['kcal.per.g'])
#predictor variable
B_ = standardize(d['neocortex.prop']) # data with missing values
M_ = standardize(d['logmass'])
X_masked = np.ma.masked_invalid(np.stack([B_, M_]).T)
with pm.Model() as m15_7:
a = pm.Normal("a", 0, 0.5)
bB = pm.Normal("bB", 0, 0.5)
bM = pm.Normal("bM", 0, 0.5)
sigma = pm.Exponential("sigma", 1)
# means of the multivariate Normal distribution
muB = pm.Normal("muB", 0, 0.5)
muM = pm.Normal("muM", 0, 0.5)
# Cholesly decomposition
chol_BM, rho_BM, sigma_BM = pm.LKJCholeskyCov("chol_cov_BM",
eta=2,
n = 2,
sd_dist=pm.Exponential.dist(1),
compute_corr=True)
BM = pm.MvNormal("BM",
tt.stack([muB, muM]),
chol=chol_BM,
observed = X_masked)
# computing mean for output variable distribution
mu = a + bB * BM[:,0] + bM * M_
K = pm.Normal("K", mu, sigma, observed=K_)
trace_15_7 = pm.sample()
After executing this I am getting the following error
ValueError: operands could not be broadcast together with shapes (29,2) () (29,)
Please help