Hello, I want to perform Bayesian regression where the observed data is distributed as per Normal distribution with mean represented by the sum of Gaussians (let’s say 5 Gaussians, each with 3 parameters). All these parameters in each Gaussian are random variables with prior distribution. How can I add these random variables?
Something along these lines maybe?
import pymc3 as pm
import theano.tensor as tt
my_data = [1,2,3,4,5]
with pm.Model() as model:
s1 = pm.Normal('summand1', mu=0, sigma=1)
s2 = pm.Normal('summand2', mu=0, sigma=1)
s3 = pm.Normal('summand3', mu=0, sigma=1)
s4 = pm.Normal('summand4', mu=0, sigma=1)
s5 = pm.Normal('summand5', mu=0, sigma=1)
my_sum = pm.Deterministic('my_sum', s1+s2+s3+s4+s5)
likelihood = pm.Normal('likelihood',
mu=my_sum,
sigma=1,
observed=my_data)
trace = pm.sample()
Or, more compactly:
with pm.Model() as model:
summands = pm.Normal('summands', mu=0, sigma=1, shape=5)
my_sum = pm.Deterministic('my_sum', tt.sum(summands))
likelihood = pm.Normal('likelihood',
mu=my_sum,
sigma=1,
observed=my_data
trace = pm.sample()
Hi @cluhmann,
To be more precise, I have a mean function that I know via Least square’s regression whose function is given by:
f(x) = a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2) +
a3*exp(-((x-b3)/c3)^2) + a4*exp(-((x-b4)/c4)^2) +
a5*exp(-((x-b5)/c5)^2)
Here, “x” is the predictor variable and all others are the unknown parameters (a1,a2,…b1,b2,…c1,c2,…).
I want to model this in PyMC3 considering all these unknown parameters as RVs. I have already tried defining a function and using it as follows:
with pm.Model() as model:
def sumGaussians(a,b,c,x):
sum = tt.zeros (shape=(1,26))
for g in range(numG):
sum = sum + a[g] * pm.math.exp(pm.math.sqr((x - b[g])/c[g]))
return sum
a_ = pm.Normal("a_", mu=0, sigma=sigma_theta, shape=(numG, 1))
b_ = pm.Normal("b_", mu=0, sigma=sigma_theta, shape=(numG, 1))
c_ = pm.Normal("c_", mu=0, sigma=sigma_theta, shape=(numG, 1))
mu_y = sumGaussians(a_,b_,c_,x)
sigma_y = pm.InverseGamma("sigma_y",1,1)
Y_obs = pm.Normal('Y_obs', mu_y, sigma_y, observed=obs_data)
This doesn’t work for me. What do you think is going wrong for me?
Can you say more about how it is not working?
It shows “pymc3.exceptions.SamplingError: Bad initial energy”. Any pointers?
Placing c_
in the denominator and using a prior that sits directly on zero is probably not helping things. With mu=0
it’s likely the sampler is starting off at a point in the parameter space where c_
is exactly zero. What happens if you alter the prior for c_
to be even slightly offset from zero?
Hey, it was because of the parameterization. I don’t see any problem if I consider c_
as a constant. except for some divergences. I really need to look at the other parameters. Thanks for the help. @cluhmann