Problems moving from shared variable to data container

Hi I’m trying to uprgrade to pymc5 in order to try out nutpie but am struggling to understand how to convert the model which used theano/aesera shared variables to the new pm.Data container. My model is something like this

age = data_df.age_num.astype(int).values
age_sh = shared(age)
n_age = len(set(age))

region = data_df.region
n_region = len(set(region))

observed_data = data_df.obs

def hierarchical_normal(name, shape, μ=0.0):
    Δ = pm.Normal("Δ_{}".format(name), 0.0, 1, shape=(shape, 4))  # 4 is num parties
    σ = pm.HalfCauchy("σ_{}".format(name), 1)
    return pm.Deterministic(name, μ + Δ * σ)

with pm.Model() as model:
    α_region = hierarchical_normal("state", n_region)
    β0 = pm.Normal(
        "β0",
        0.0,
        0.5,
        shape=(1, 4),
    )
    η =  β0 + α_age[age_sh ]
    p = pm.math.softmax(η, axis=1)
    obs = pm.Multinomial("O", n_, p, shape=(len(observed_data ), 4), observed=observed_data )

Currently this returns:
TypeError: TensorType does not support iteration. Maybe you are using builtins.sum instead of aesara.tensor.math.sum? (Maybe .max?)

Is there a simple way to make this work in pymc5? If not how should I write this using the pm.Data ?

Thanks
Rhys

Is that example complete? I don’t see where α_age is defined and α_region doesn’t seem to be used at all.

Yes sorry I’ve made a mistake when trying to make a minimal example, (I can’t seem to be able to edit the original post) the code should be

age = data_df.age_num.astype(int).values
age_sh = shared(age)
n_age = len(set(age))

region = data_df.region
region_sh = shared(region)
n_region = len(set(region))

observed_data = data_df.obs

def hierarchical_normal(name, shape, μ=0.0):
    Δ = pm.Normal("Δ_{}".format(name), 0.0, 1, shape=(shape, 4))  # 4 is num parties
    σ = pm.HalfCauchy("σ_{}".format(name), 1)
    return pm.Deterministic(name, μ + Δ * σ)

with pm.Model() as model:
    α_region = hierarchical_normal("state", n_region)
    α_age = hierarchical_normal("age", n_age)
    β0 = pm.Normal(
        "β0",
        0.0,
        0.5,
        shape=(1, 4),
    )
    η =  β0 + α_age[age_sh ] + α_region[region_sh ]
    p = pm.math.softmax(η, axis=1)
    obs = pm.Multinomial("O", n_, p, shape=(len(observed_data ), 4), observed=observed_data )

There are going to be several different ways to handle this, but it’s going to impact both your model and your hierarchical_normal() function. The simplest way would be to just do this kind of thing:

age = data_df.age_num.astype(int).values
age_idx,ages = pd.factorize(data_df["age_num"])
n_age = len(ages)
region_idx,regions = pd.factorize(data_df["region"])
n_region = len(regions)

and then index into your parameter arrays like this:

η =  β0 + α_age[age_idx] + α_region[region_idx]

However, you can definitely utilize pm.MutableData objects if the flexibility would be of use to you. I would also recommend utilizing the dimension/coordinate functionality to have more meaningful index labels. Check here and here.