Hello, I want to perform Bayesian regression where the observed data is distributed as per Normal distribution with mean represented by the sum of Gaussians (let’s say 5 Gaussians, each with 3 parameters). All these parameters in each Gaussian are random variables with prior distribution. How can I add these random variables?
Something along these lines maybe?
import pymc3 as pm import theano.tensor as tt my_data = [1,2,3,4,5] with pm.Model() as model: s1 = pm.Normal('summand1', mu=0, sigma=1) s2 = pm.Normal('summand2', mu=0, sigma=1) s3 = pm.Normal('summand3', mu=0, sigma=1) s4 = pm.Normal('summand4', mu=0, sigma=1) s5 = pm.Normal('summand5', mu=0, sigma=1) my_sum = pm.Deterministic('my_sum', s1+s2+s3+s4+s5) likelihood = pm.Normal('likelihood', mu=my_sum, sigma=1, observed=my_data) trace = pm.sample()
Or, more compactly:
with pm.Model() as model: summands = pm.Normal('summands', mu=0, sigma=1, shape=5) my_sum = pm.Deterministic('my_sum', tt.sum(summands)) likelihood = pm.Normal('likelihood', mu=my_sum, sigma=1, observed=my_data trace = pm.sample()
To be more precise, I have a mean function that I know via Least square’s regression whose function is given by:
f(x) = a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2) + a3*exp(-((x-b3)/c3)^2) + a4*exp(-((x-b4)/c4)^2) + a5*exp(-((x-b5)/c5)^2)
Here, “x” is the predictor variable and all others are the unknown parameters (a1,a2,…b1,b2,…c1,c2,…).
I want to model this in PyMC3 considering all these unknown parameters as RVs. I have already tried defining a function and using it as follows:
with pm.Model() as model: def sumGaussians(a,b,c,x): sum = tt.zeros (shape=(1,26)) for g in range(numG): sum = sum + a[g] * pm.math.exp(pm.math.sqr((x - b[g])/c[g])) return sum a_ = pm.Normal("a_", mu=0, sigma=sigma_theta, shape=(numG, 1)) b_ = pm.Normal("b_", mu=0, sigma=sigma_theta, shape=(numG, 1)) c_ = pm.Normal("c_", mu=0, sigma=sigma_theta, shape=(numG, 1)) mu_y = sumGaussians(a_,b_,c_,x) sigma_y = pm.InverseGamma("sigma_y",1,1) Y_obs = pm.Normal('Y_obs', mu_y, sigma_y, observed=obs_data)
This doesn’t work for me. What do you think is going wrong for me?
Can you say more about how it is not working?
It shows “pymc3.exceptions.SamplingError: Bad initial energy”. Any pointers?
c_ in the denominator and using a prior that sits directly on zero is probably not helping things. With
mu=0 it’s likely the sampler is starting off at a point in the parameter space where
c_ is exactly zero. What happens if you alter the prior for
c_ to be even slightly offset from zero?
Hey, it was because of the parameterization. I don’t see any problem if I consider
c_ as a constant. except for some divergences. I really need to look at the other parameters. Thanks for the help. @cluhmann