Mixing pytensor.dot and dimensions

You shouldn’t need to take the mean of alpha. This will give the same intercept value to all the units in your sample – the average of the two cluster intercepts. Indexing X by cluster will also not do what you think, it will duplicate rows 0 and 1 over and over.

Note that mu is the conditional expectation, given the X values and the clusters, so you don’t want to take the average of everything.

Here’s a corrected model:

import pymc as pm
coords = {
    'num_cols':['BMI','WTGAIN','M_Ht_In'],
    'obs':example_data.index,
    'clusters':["cluster_3", "cluster_2"] 
}
with pm.Model(coords=coords) as Hier: 
    # add data containers
    X = pm.MutableData("X", example_data[['BMI','WTGAIN','M_Ht_In']].values)
    y = pm.MutableData("y", example_data['DBWT'].values)
    
    # hyper-prior
    αμ = pm.Normal("αμ", mu=0., sigma=3.,)
    ασ = pm.HalfNormal("ασ", sigma=3.,)
    
    βμ = pm.Normal("βμ", mu=0., sigma=3.,)
    βσ = pm.HalfNormal("βσ", sigma=3.,)
    
    # error
    ϵ = pm.Normal("ϵ", mu=0, sigma=3.)
        
    # using the hyperpior 'δ' 
    α = pm.Normal("α", mu=αμ, sigma=ασ, dims="clusters")
    β = pm.Normal("β", mu=βμ, sigma=βσ, dims= ("clusters", "num_cols"))
    
    μ = (X * β[cluster_index, :]).sum(axis=-1) + α[cluster_index]
    # response
    yhat = pm.Normal("yhat", mu=μ, sigma=ϵ, observed=y)
    
    # sample
    trace = pm.sample(
        250,
        tune=50,
        chains=4,
        return_inferencedata=True,
        idata_kwargs={'log_likelihood':True}
    )
1 Like