Hello,
first of all, congratulations on the new and shiny version 4!
I’m trying to define a simple hierarchical model, and I’m having trouble understanding how to use implicit dimensions. My (simplified) code is here:
import aesara.tensor as at
import pymc as pm

coords = {
    "dim1": ["one", "two", "three"],
    "dim2": list(range(50)),
}

with pm.Model(coords=coords) as m:
    # Define priors
    γ = pm.Normal("γ", -2.8, 0.3, dims=("dim1",))
    τ = pm.HalfNormal("τ", 1, dims=("dim1",))
    α = pm.Normal("α", γ, τ, dims=("dim2", "dim1"))

    # Define the log-link function
    linear_part = α[dim2_idx, dim1_idx] + at.log(exposure)
    μ = at.exp(linear_part)

    # The likelihood
    y = pm.Poisson("y", μ, observed=claims)
When I try to sample from the prior, I get this error: IndexError: index 22 is out of bounds for axis 0 with size 3
I know this is related to having both γ and τ as hyperpriors in the definition of α, because the code works if I replace τ with a constant. So I suspect it has something to do with broadcasting, but I’m not sure how to tell PyMC that I want both γ and τ to be indexed by the same dim1_idx in the linear part of the GLM.
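To make the intended indexing concrete, here is a minimal NumPy-only sketch of what I expect to happen. It does not diagnose the PyMC model. It only shows the shapes involved and one way an error of the same form can arise, namely when the array being indexed has size 3 on axis 0. The sizes, the random data, and the names n_obs, per_row, and error_message are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 50 levels for dim2, 3 for dim1, 200 observations.
n_dim2, n_dim1, n_obs = 50, 3, 200

# Stand-ins for the per-row integer codes (made-up data).
dim1_idx = rng.integers(0, n_dim1, size=n_obs)
dim2_idx = rng.integers(0, n_dim2, size=n_obs)
dim2_idx[0] = 22  # make sure an index larger than 2 comes first

# Hyperparameters of shape (3,) broadcast along the last axis of (50, 3).
gamma = rng.normal(-2.8, 0.3, size=n_dim1)
tau = np.abs(rng.normal(0.0, 1.0, size=n_dim1))
alpha = rng.normal(gamma, tau, size=(n_dim2, n_dim1))

# Paired integer indexing picks one element of alpha per observation.
per_row = alpha[dim2_idx, dim1_idx]
print(alpha.shape, per_row.shape)

# If the indexed array has only 3 entries on axis 0 (here: the transpose),
# the same indexing expression raises an IndexError.
try:
    alpha.T[dim2_idx, dim1_idx]
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)
```

So the message suggests that, somewhere during prior sampling, an array whose first axis has length 3 is being indexed with dim2_idx values, rather than the (50, 3) array I intended.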
Here dim1_idx and dim2_idx come from categorical columns in a pandas DataFrame, and exposure and claims are pandas Series with the same length as dim1_idx and dim2_idx.
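For reference, this is roughly how such inputs can be built. It is a minimal sketch on a made-up DataFrame: the column names "group" and "unit" and all values are hypothetical, and converting to plain NumPy arrays is only one possible way to pass the data on, not a confirmed fix.

```python
import pandas as pd

# Hypothetical data; the column names and values are made up for illustration.
df = pd.DataFrame(
    {
        "group": pd.Categorical(
            ["one", "three", "two", "one"], categories=["one", "two", "three"]
        ),
        "unit": pd.Categorical([0, 22, 49, 22], categories=list(range(50))),
        "exposure": [1.0, 0.5, 2.0, 1.5],
        "claims": [0, 1, 0, 2],
    }
)

# .cat.codes gives each value's integer position within .cat.categories,
# so the codes line up with the coordinate labels built below.
dim1_idx = df["group"].cat.codes.to_numpy()
dim2_idx = df["unit"].cat.codes.to_numpy()

coords = {
    "dim1": list(df["group"].cat.categories),
    "dim2": list(df["unit"].cat.categories),
}

# Plain arrays with one entry per row, the same length as the index arrays.
exposure = df["exposure"].to_numpy()
claims = df["claims"].to_numpy()
```

Building coords from the same categories that produce the codes keeps the labels and the integer indices in the same order.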
Any help would be greatly appreciated!
Cheers,
Omri