Unpooled Gaussian mixture regression model

Question: I need help with mixture modeling for unpooled data.

I'm in the middle of running code for a mixture model, hoping to compare a Gaussian mixture and a gamma mixture.
This is my code for the Gaussian mixture.

import numpy as np
import pymc as pm

# Map each area label in 'Kec' to an integer index
area_categories = data['Kec'].unique()
area_to_index = {area: index for index, area in enumerate(np.unique(area_categories))}

areas = np.array([area_to_index[area] for area in data['Kec']])
data['Kec_idx'] = data['Kec'].map(area_to_index)

n_components = 2
coords = {
    'area': np.unique(areas),
    'obs_id': np.arange(data['Harga_Miliar'].shape[0]),
    'comp': np.arange(n_components)
}

with pm.Model(coords=coords) as gmm_model:
    # Observed response, predictor, and per-observation area index
    Y = pm.MutableData('Y', data['Harga_Miliar'], dims='obs_id')
    X1 = pm.MutableData('X1', data['Luas_Tanah'], dims='obs_id')
    area_coords = pm.MutableData('area_coord', data.Kec_idx, dims='obs_id')

    # Global intercept and slope
    beta_0_global = pm.Normal('beta_0_global', mu=0, sigma=5, initval=data['Harga_Miliar'].mean())
    beta_1_global = pm.Normal('beta_1_global', mu=0, sigma=5, initval=0)

    # Per-component, per-area offsets to the intercept and slope
    beta_0_area = pm.Normal('beta_0_area', mu=0, sigma=5, dims=('comp', 'area'))
    beta_1_area = pm.Normal('beta_1_area', mu=0, sigma=5, dims=('comp', 'area'))

    # Linear predictor for each mixture component
    mu = beta_0_global + beta_1_global * X1
    mu_comp = []
    for comp_idx in range(n_components):
        mu_c = mu + beta_0_area[comp_idx, area_coords] + beta_1_area[comp_idx, area_coords] * X1
        mu_comp.append(mu_c)

    # Per-component, per-area noise scale, indexed per observation
    sigma = pm.HalfNormal('sigma', sigma=1, dims=('comp', 'area'))
    sigma_broadcasted = sigma[:, area_coords]

    # Mixture weights, indexed per observation
    w_shape = (n_components, len(np.unique(areas)))
    w = pm.Dirichlet('w', a=np.ones(w_shape), shape=w_shape, dims=('comp', 'area'))
    w_broadcasted = w[:, area_coords].T

    # One Normal distribution per component, then the mixture likelihood
    comp_dists = [
        pm.Normal.dist(mu=mu_comp[comp_idx], sigma=sigma_broadcasted[comp_idx])
        for comp_idx in range(n_components)
    ]
    y = pm.Mixture('y', w=w_broadcasted, comp_dists=comp_dists, observed=Y)

    

and I got an error like this:

 index 2 is out of bounds for axis 0 with size 2
Apply node that caused the error: AdvancedSubtensor1(w_simplex___simplex, area_coord)
Toposort index: 70
Inputs types: [TensorType(float64, (None, None)), TensorType(int32, (None,))]
Inputs shapes: [(2, 26), (579,)]
Inputs strides: [(208, 8), (4,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Sum{axis=[1], acc_dtype=float64}(AdvancedSubtensor1.0), Elemwise{le,no_inplace}(AdvancedSubtensor1.0, TensorConstant{(1, 1) of 1}), Elemwise{ge,no_inplace}(AdvancedSubtensor1.0, TensorConstant{(1, 1) of 0}), Elemwise{Composite{(log(i0) + i1)}}[(0, 0)](AdvancedSubtensor1.0, Join.0)]]

Backtrace when the node is created (use Aesara flag traceback__limit=N to make it longer):
  

HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

Please help.
Thanks.

It is hard to say exactly without seeing the full code and the full error stack, but since the error reports an out-of-bounds index involving w_simplex and area_coord, my guess is that the problem is in the last line of the following:

w_shape = (n_components, len(np.unique(areas)))
w = pm.Dirichlet('w', a=np.ones(w_shape), shape=w_shape, dims=('comp', 'area'))
w_broadcasted = w[:, area_coords].T

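Judging by the trace, w has shape (2, 26) and is being indexed by area_coord along axis 0, which has size 2 (the comp axis). The values in area_coord run past that, so any area index of 2 or more is out of bounds, hence the error. It is likely that the area index is being applied to the wrong axis of w. Also, the error message mentions Aesara, which suggests you are on an older version of PyMC. Updating is recommended.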
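To make the shape argument concrete, here is a minimal NumPy-only sketch (not the original model). It assumes the sizes reported in the traceback (2 components, 26 areas, 579 observations) and uses a synthetic `area_idx` as a stand-in for `data['Kec_idx']`; the names `area_idx`, `w_obs`, and `w_comp_area` are made up for illustration. It shows that a weight array laid out as (area, comp) can be indexed safely with the area index, while a (comp, area) array indexed along axis 0 fails the same way the trace does.

```python
import numpy as np

# Sizes taken from the traceback: 2 components, 26 areas, 579 observations.
n_components, n_areas, n_obs = 2, 26, 579

rng = np.random.default_rng(0)
# Synthetic stand-in for data['Kec_idx'] (an assumption, not the real data).
area_idx = rng.integers(0, n_areas, size=n_obs)
area_idx[0] = n_areas - 1  # make sure the largest index is present

# Layout (area, comp): each row holds one area's weights over the components.
w = rng.dirichlet(np.ones(n_components), size=n_areas)  # shape (26, 2)

# Sanity check before indexing: every index must fit the axis it is applied to.
index_fits = area_idx.max() < w.shape[0]

# One weight vector per observation, shape (579, 2); each row sums to 1.
w_obs = w[area_idx]

# Layout (comp, area) indexed along axis 0 fails like the reported error.
w_comp_area = w.T  # shape (2, 26)
try:
    w_comp_area[area_idx]
    error_message = None
except IndexError as err:
    error_message = str(err)

print(w_obs.shape, index_fits)
print(error_message)
```

If this is indeed the cause, the analogous change in the model would be to declare w with dims=('area', 'comp') and index it as w[area_coords], since pm.Dirichlet normalizes over the last axis and the mixture weights should sum to 1 across components. I have not run this against your data, so treat it as a starting point.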
