Hi Pymc3 community,
I am trying to fit a mixture model, where each observation (s
) is sampled from a number of component distributions. The number of component distributions is unknown and to be estimated.
The possible component distributions are one Uniform(0, 25)
(to model contaminates) and an unknown number (z
) of Normal distributions, where z can be one of {0,1,2,3,4,5}.
I am trying to infer:
- The number of normal distributions:
z
- The mean of each of the normal distributions:
gss
- A number of standard deviations (```r_E``) associated with the normal distributions, which are shared across observations.
- The mapping from each data point to each component distribution:
w
, wherew <= z
The model seems to produce reasonable estimates of P(gss | z)
but z
rarely changes. When I print z, I can see different values being proposed, but the trace contains a constant value for each chain, which leads me to believe that alternate values of z are always rejected.
I also have a lot of divergences and some bad Gelman Rubin stats.
Questions:
- Can i parameterize my model differently for more effective sampling/less divergences?
- How can i sample from
z
more effectively? - Can i/should i use different samplers?
Ive also posted the code here:
with pm.Model() as model:
# True s
gss = pm.Uniform('gss', lower=0., upper=25., shape=(1,max_s),
testval=gss_spindle_testvals)
gss_per_obs = gss[data['epoch_i'],:] #broadcast, 1 for each obs
# The number of gss
gss_prior = pm.Dirichlet('gss_prior', a=s_number_prior)
z = pm.Categorical('z', p=gss_prior).reshape((1,1)) #change shape to fix theano reshape error
z_per_obs = z[data['epoch_i']] #broadcast, 1 for each obs
w_prior_possibilities = tt.tril(tt.ones((max_s + 1, max_s + 1)))
w = pm.Categorical('w', p=w_prior_possibilities[z_per_obs[:,0],:], shape=n_obs)
# --- Raters ability to detect gss --- #
r_E = pm.Bound(pm.Normal, lower=0.)('r_E', mu=0.5, sd=0.5, shape=n_raters)
r_E_per_obs = r_E[data['rater_i']]
# --- Behaviour --- #
contaminate_dist_s = pm.Uniform.dist(lower=0., upper=25., shape=n_obs)
contaminate_dist_s.mean = 12.5
possible_dists = [contaminate_dist_s]
for i in range(0, 5):
dist = pm.Normal.dist(mu=gss_per_obs[:, i], sd=r_E_per_obs)
dist.mean = gss_spindle_testvals[i]
possible_dists.append(dist)
w_array = tt.extra_ops.to_one_hot(w, nb_class=max_s + 1)
s = pm.Mixture('s', w=w_array,
comp_dists=possible_dists,
observed=data['s'])
I’ve added some questions that seem related to my problem below for anyone who might come with a similar issue later down the track.