Poisson regression model for beginner

I’m also new to PyMC, but I’ve been modeling with the related negative binomial distribution, likewise with categorical predictors. I’ve had trouble finding much documentation on indexing into distributions, but I pieced together some code from other posts, so I adapted it to your model. As best I can tell it builds an equivalent model, and it samples in under two minutes for me.


Imports

import arviz as az
import matplotlib.pyplot as plt
import pymc as pm
import pandas as pd
from pandas.api.types import CategoricalDtype

Preprocessing

print(f"Running on PyMC v{pm.__version__}")
print(f"Running on ArviZ v{az.__version__}")

df = pd.read_csv('data/SwedishMotorInsurance.csv')
# convert to pandas Categorical; you could map names onto the numerical codes if you wish
df['Kilometres'] = df['Kilometres'].astype(CategoricalDtype())
df['Zone'] = df['Zone'].astype(CategoricalDtype())
df['Bonus'] = df['Bonus'].astype(CategoricalDtype())
df['Make'] = df['Make'].astype(CategoricalDtype())

df['Rate'] = df['Claims'] / df['Insured']

df.head()
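For anyone new to pandas Categoricals: `cat.codes` maps each row's category to an integer index, which is what makes the coefficient indexing in the model below work. A minimal sketch with made-up data (the values here are illustrative, not from the insurance dataset):

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# toy column with repeated categorical values
s = pd.Series([3, 1, 3, 2]).astype(CategoricalDtype())

print(s.cat.categories.values)  # sorted unique levels: [1 2 3]
print(s.cat.codes.to_numpy())   # integer index of each row's level: [2 0 2 1]
```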

Model

COORDS = {'Kilometres_Vals':df['Kilometres'].cat.categories.values, 
          'Zone_Vals':df['Zone'].cat.categories.values, 
          'Bonus_Vals':df['Bonus'].cat.categories.values, 
          'Make_Vals':df['Make'].cat.categories.values,
          "obs_idx": df.index}

with pm.Model(coords=COORDS) as model:

    # define priors, weakly informative Normal
    b0 = pm.Normal("Intercept", mu=0, sigma=10)
    b1 = pm.Normal("Kilometres", mu=0, sigma=10, dims="Kilometres_Vals")
    b2 = pm.Normal("Zone", mu=0, sigma=10, dims="Zone_Vals")
    b3 = pm.Normal("Bonus", mu=0, sigma=10, dims="Bonus_Vals")
    b4 = pm.Normal("Make", mu=0, sigma=10, dims="Make_Vals")

    km_data = pm.ConstantData('Kilometres_data', df['Kilometres'].cat.codes.to_numpy(), dims="obs_idx")
    zone_data = pm.ConstantData('Zone_data', df['Zone'].cat.codes.to_numpy(), dims="obs_idx")
    bonus_data = pm.ConstantData('Bonus_data', df['Bonus'].cat.codes.to_numpy(), dims="obs_idx")
    make_data = pm.ConstantData('Make_data', df['Make'].cat.codes.to_numpy(), dims="obs_idx")

    # not wrapping this in a Deterministic: that would store a value per datapoint
    # on every draw and bloat the trace
    theta = b0 + b1[km_data] + b2[zone_data] + b3[bonus_data] + b4[make_data]

    # Poisson likelihood. Note: Poisson is a distribution over integer counts,
    # while df['Rate'] is continuous; a more conventional formulation observes
    # df['Claims'] and adds log(df['Insured']) to theta as an exposure offset
    y = pm.Poisson("y", mu=pm.math.exp(theta), observed=df['Rate'].values)

pm.model_to_graphviz(model)

Sample

with model:
    # Metropolis is fast per step here, but PyMC's default NUTS sampler
    # usually mixes far better (note the high r_hat / low ess values below)
    trace = pm.sample(draws=5000, tune=500, cores=4, random_seed=45, step=pm.Metropolis())

az.summary(trace)

100.00% [22000/22000 01:50<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 500 tune and 5_000 draw iterations (2_000 + 20_000 draws total) took 110 seconds.

| |mean|sd|hdi_3%|hdi_97%|mcse_mean|mcse_sd|ess_bulk|ess_tail|r_hat|
|---|---|---|---|---|---|---|---|---|---|
|Intercept|-21.138|8.832|-38.752|-5.647|2.836|2.069|10.0|40.0|1.31|
|Kilometres[1]|-8.326|7.546|-22.524|5.354|1.062|0.755|49.0|352.0|1.06|
|Kilometres[2]|-8.384|7.528|-22.646|5.808|0.996|0.708|54.0|321.0|1.05|
|Kilometres[3]|0.082|5.562|-10.302|10.515|1.475|1.065|15.0|65.0|1.20|
|Kilometres[4]|-8.937|7.757|-23.532|5.206|0.691|0.490|97.0|423.0|1.04|
|Kilometres[5]|1.182|5.566|-8.483|11.945|1.522|1.100|14.0|54.0|1.22|
|Zone[1]|-5.257|8.245|-21.075|9.332|0.287|0.203|760.0|1336.0|1.01|
|Zone[2]|-4.591|8.190|-19.726|10.162|0.291|0.206|707.0|931.0|1.01|
|Zone[3]|-4.900|8.247|-20.375|10.100|0.311|0.220|621.0|1090.0|1.01|
|Zone[4]|-4.683|8.157|-20.335|9.482|0.408|0.288|377.0|828.0|1.02|
|Zone[5]|-5.012|8.275|-21.077|9.448|0.327|0.232|583.0|1014.0|1.01|
|Zone[6]|-4.879|8.370|-20.631|10.338|0.327|0.232|585.0|1063.0|1.01|
|Zone[7]|8.438|5.107|-1.407|17.877|0.900|0.642|32.0|113.0|1.08|
|Bonus[1]|-6.244|7.774|-21.171|7.367|0.273|0.193|743.0|1166.0|1.01|
|Bonus[2]|-6.905|7.864|-21.760|7.405|0.278|0.197|718.0|853.0|1.01|
|Bonus[3]|-6.822|7.791|-21.670|6.898|0.331|0.234|495.0|1052.0|1.01|
|Bonus[4]|4.317|4.472|-3.805|12.705|0.560|0.398|61.0|185.0|1.07|
|Bonus[5]|3.225|4.568|-5.171|11.791|0.569|0.404|63.0|194.0|1.07|
|Bonus[6]|-6.820|7.575|-21.056|6.621|0.321|0.227|493.0|834.0|1.01|
|Bonus[7]|-7.098|7.639|-21.793|6.213|0.310|0.220|548.0|905.0|1.02|
|Make[1]|-5.948|8.054|-21.779|8.020|0.312|0.221|628.0|766.0|1.00|
|Make[2]|-5.653|7.975|-21.086|8.439|0.316|0.223|582.0|892.0|1.01|
|Make[3]|-5.326|8.008|-20.667|8.656|0.270|0.191|815.0|1060.0|1.00|
|Make[4]|-6.222|8.060|-21.833|7.777|0.321|0.227|561.0|839.0|1.01|
|Make[5]|-5.364|7.761|-20.000|8.180|0.215|0.152|1116.0|1702.0|1.01|
|Make[6]|5.070|4.270|-2.444|13.567|0.464|0.329|85.0|162.0|1.04|
|Make[7]|6.183|4.122|-0.865|14.679|0.457|0.324|81.0|141.0|1.04|
|Make[8]|-5.438|7.866|-20.660|8.451|0.323|0.228|520.0|862.0|1.01|
|Make[9]|-5.686|7.940|-21.525|7.588|0.292|0.206|623.0|851.0|1.01|

Plot Trace

az.plot_trace(
    trace,
    backend_kwargs={"layout": "constrained"}, legend=True
);


I hope this helps. If any more advanced users can improve on it, that would help my own modeling as well. Also, does anyone know where the actual documentation on indexing lives?
e.g. b1[km_data] or b1[km_data, zone_data]
I’m particularly interested in generalizing to higher-dimensional distributions.
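In case it helps anyone else reading along: as far as I can tell, the indexing in the model follows NumPy-style integer ("fancy") indexing, so the two forms in the question behave like this (toy arrays purely for illustration):

```python
import numpy as np

b1 = np.array([10.0, 20.0, 30.0])   # one coefficient per category level
km_data = np.array([0, 2, 1, 0])    # per-observation category codes

# 1-D case: picks one coefficient per observation
print(b1[km_data])                  # [10. 30. 20. 10.]

# 2-D case: a coefficient per (km, zone) level combination
b = np.arange(6.0).reshape(3, 2)    # shape (n_km_levels, n_zone_levels)
zone_data = np.array([1, 0, 1, 1])

# the two index arrays are paired element-wise:
# b[0,1], b[2,0], b[1,1], b[0,1]
print(b[km_data, zone_data])        # [1. 4. 3. 1.]
```

So for a 2-D effect you would give the coefficient `dims=("Kilometres_Vals", "Zone_Vals")` and index it with both code arrays at once, assuming PyMC tensors keep NumPy's indexing semantics here.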