I am trying to perform a Bayesian regression on a 2-D dataset (see the attached 2D-data).
I want to formulate it as a sum of Gaussians, and for this purpose I am using a Gaussian Mixture Model with PyMC3. The code snippet is shown below.
import numpy as np
import pymc3 as pm
from sklearn.preprocessing import normalize

# (x_train, y_train and seed are defined earlier in the script)

nbr_gauss = 15

data = np.column_stack((x_train, y_train))

# standardize the data
x_train = normalize(x_train, norm='max', axis=0)
y_train = normalize(y_train, norm='max', axis=0)

with pm.Model() as multiVarModel:
    # Proportion of each component (prior) -> mixture weights
    p = pm.Dirichlet('p', a=np.ones(nbr_gauss))

    # Prior on the component means (mu_k), kept ordered to avoid label switching
    mu = pm.Normal(
        'mu',
        mu=np.linspace(x_train.min(), 1, nbr_gauss),
        sigma=0.1,
        transform=pm.transforms.ordered,
        shape=(nbr_gauss,),
        testval=np.linspace(x_train.min(), 1, nbr_gauss),
    )

    # Prior on the precisions (i.e. the inverse variances)
    tau = pm.Gamma('tau', alpha=10, beta=1.0, shape=(nbr_gauss,))

    # Likelihood
    Y_obs = pm.NormalMixture('Y_obs', w=p, mu=mu, tau=tau, observed=y_train)

    prior_checks = pm.sample_prior_predictive(samples=50, random_seed=seed)

    # Start the sampler
    trace = pm.sample(draws=3000, tune=1000, step=pm.NUTS(), chains=1, cores=1)

    # Sample posterior predictive samples
    ppc_trace = pm.sample_posterior_predictive(
        trace, var_names=["mu", "tau", "p", "Y_obs"], keep_size=True
    )
After estimating the parameters (p, mu, tau), I am unable to reconstruct the original curve from them. Is the implementation correct? Thanks.
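For concreteness, the kind of reconstruction I have in mind is sketched below: it evaluates the mixture density at the posterior means of (p, mu, tau) on an illustrative grid x_grid. The grid and the use of posterior means as point estimates are my own assumptions and are not part of the model above.

import numpy as np
from scipy import stats

# Posterior means as point estimates (an assumption; one could instead average
# the per-draw densities over the whole trace)
p_hat = trace['p'].mean(axis=0)        # mixture weights, shape (nbr_gauss,)
mu_hat = trace['mu'].mean(axis=0)      # component means
tau_hat = trace['tau'].mean(axis=0)    # component precisions
sigma_hat = 1.0 / np.sqrt(tau_hat)     # convert precisions to standard deviations

# Illustrative evaluation grid (hypothetical, chosen to match the max-normalized data)
x_grid = np.linspace(0.0, 1.0, 200)

# Mixture density: sum_k p_k * N(x | mu_k, sigma_k)
mixture_pdf = np.sum(
    p_hat[None, :] * stats.norm.pdf(x_grid[:, None], loc=mu_hat[None, :], scale=sigma_hat[None, :]),
    axis=1,
)

Note that using posterior means collapses the posterior to a single point estimate; averaging the mixture density over all posterior draws would be an alternative, but either way the result does not recover the original curve for me.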