Question: need help with mixture modeling for unpooled data.
I'm in the middle of running code for a mixture model, hoping to compare a Gaussian mixture and a gamma mixture.
This is my code for the Gaussian mixture:
area_categories = data['Kec'].unique()
area_to_index = {area: index for index, area in enumerate(np.unique(area_categories))}
areas = np.array([area_to_index[area] for area in data['Kec']])
data['Kec_idx'] = data['Kec'].map(area_to_index)
n_components = 2
coords = {
    'area': np.unique(areas),
    'obs_id': np.arange(data['Harga_Miliar'].shape[0]),
    'comp': np.arange(n_components)
}
with pm.Model(coords=coords) as gmm_model:
    Y = pm.MutableData('Y', data['Harga_Miliar'], dims='obs_id')
    X1 = pm.MutableData('X1', data['Luas_Tanah'], dims='obs_id')
    area_coords = pm.MutableData('area_coord', data.Kec_idx, dims='obs_id')
    beta_0_global = pm.Normal('beta_0_global', mu=0, sigma=5, initval=data['Harga_Miliar'].mean())
    beta_1_global = pm.Normal('beta_1_global', mu=0, sigma=5, initval=0)
    beta_0_area = pm.Normal('beta_0_area', mu=0, sigma=5, dims=('comp', 'area'))
    beta_1_area = pm.Normal('beta_1_area', mu=0, sigma=5, dims=('comp', 'area'))
    mu = beta_0_global + beta_1_global * X1
    mu_comp = []
    for comp_idx in range(n_components):
        mu_c = mu + beta_0_area[comp_idx, area_coords] + beta_1_area[comp_idx, area_coords] * X1
        mu_comp.append(mu_c)
    sigma = pm.HalfNormal('sigma', sigma=1, dims=('comp', 'area'))
    sigma_broadcasted = sigma[:, area_coords]
    w_shape = (n_components, len(np.unique(areas)))
    w = pm.Dirichlet('w', a=np.ones(w_shape), shape=w_shape, dims=('comp', 'area'))
    w_broadcasted = w[:, area_coords].T
    comp_dists = [pm.Normal.dist(mu=mu_comp[comp_idx], sigma=sigma_broadcasted[comp_idx]) for comp_idx in range(n_components)]
    y = pm.Mixture('y', w=w_broadcasted, comp_dists=comp_dists, observed=Y)
When I run it, I get an error like this:
index 2 is out of bounds for axis 0 with size 2
Apply node that caused the error: AdvancedSubtensor1(w_simplex___simplex, area_coord)
Toposort index: 70
Inputs types: [TensorType(float64, (None, None)), TensorType(int32, (None,))]
Inputs shapes: [(2, 26), (579,)]
Inputs strides: [(208, 8), (4,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Sum{axis=[1], acc_dtype=float64}(AdvancedSubtensor1.0), Elemwise{le,no_inplace}(AdvancedSubtensor1.0, TensorConstant{(1, 1) of 1}), Elemwise{ge,no_inplace}(AdvancedSubtensor1.0, TensorConstant{(1, 1) of 0}), Elemwise{Composite{(log(i0) + i1)}}[(0, 0)](AdvancedSubtensor1.0, Join.0)]]
Backtrace when the node is created (use Aesara flag traceback__limit=N to make it longer):
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
Any help would be appreciated.
Thanks.
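
One possible reading of the traceback, sketched in plain NumPy rather than PyMC: the weights have shape (2, 26), i.e. components on axis 0 and areas on axis 1, and the node that fails appears to index axis 0 (size 2) with the per-observation area indices (values up to 25). The snippet below only reproduces that shape mismatch and shows an alternative layout with areas on axis 0 and components last. The shapes come from the traceback; `area_idx` is a made-up stand-in for `data['Kec_idx']`, and `index_first_axis` is a hypothetical helper used only for illustration.

```python
import numpy as np

# Shapes taken from the traceback: weights (2, 26), 579 observations.
n_components, n_areas, n_obs = 2, 26, 579
rng = np.random.default_rng(0)

# Hypothetical stand-in for data['Kec_idx']: one area index (0..25) per row.
area_idx = rng.integers(0, n_areas, size=n_obs)
area_idx[0] = 2  # make sure an index >= n_components is present

# Layout used in the post: components on axis 0, areas on axis 1.
w_comp_area = np.ones((n_components, n_areas))


def index_first_axis(weights, idx):
    """Index axis 0 of `weights` with `idx`; return the error text on failure."""
    try:
        return weights[idx]
    except IndexError as err:
        return str(err)


# Area indices applied to the component axis (size 2) go out of bounds.
failure = index_first_axis(w_comp_area, area_idx)
print(failure)

# Alternative layout: areas on axis 0, components on the last axis.
# Each row is a simplex over components, so every area has weights summing to 1.
w_area_comp = rng.dirichlet(np.ones(n_components), size=n_areas)  # (26, 2)
w_obs = w_area_comp[area_idx]                                     # (579, 2)
print(w_obs.shape, np.allclose(w_obs.sum(axis=1), 1.0))
```

If this reading is right, the analogous change in the model would be to declare the weights with the component axis last, for example `pm.Dirichlet('w', a=np.ones(n_components), dims=('area', 'comp'))`, and pass `w[area_coords]` to `pm.Mixture` without the transpose. `pm.Dirichlet` normalizes over the last axis, so with dims `('comp', 'area')` the weights sum to 1 across areas rather than across components. I have not run this against the data above, so treat it as a suggestion to try rather than a confirmed fix.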