Random Walk Model - Terrible sampling but realistic inference?

Really good suggestion, it makes a lot of sense, thank you! I ended up getting a better fit by changing to a Gaussian random walk (GRW): I switched from a multivariate model to a no-pooling GRW and added an AR1 term. Here's the model:

```python
with pm.Model() as m2c:
    # latent variables
    team_bias = pm.GaussianRandomWalk('team_bias', sigma=1,
                                      shape=(len(np.unique(t)), len(team_dct)))

    mu_form = pm.Normal('mu_form', 0, 0.5)
    team_form = pm.AR1('team_form', k=mu_form, tau_e=0.1,
                       shape=(len(np.unique(t)), len(team_dct)))

    # home team advantage
    mu_home = pm.Normal('mu_home', 0.3, 0.4)
    home = pm.Normal('home', mu_home, 0.5, shape=len(team_dct))

    # priors
    sigma_y = pm.HalfNormal('sigma_y', 0.1)  # match variance
    ability = pm.Deterministic('ability', team_bias + team_form)

    # likelihood
    mu_match = home[team1_] + ability[t, team1_] - ability[t, team2_]
    y = pm.Normal('y', mu=mu_match, sd=sigma_y, observed=result)

    trace2c = pm.sample(1000, tune=1000, return_inferencedata=True)
```

Two issues I’m running into:

  1. I can’t figure out why having both a RW and an AR1 process leads to the best model. Is that weird theoretically?
  2. Is there a clever way to add a prior that limits extreme match results and puts more weight on predictions around 0? An example of why my current model is problematic is below.

I don’t think it makes much sense that the probability of Milan scoring 4+ more goals over Parma is the same probability as them drawing or losing.
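To illustrate the issue numerically: because the likelihood is a symmetric Gaussian, any mass placed above the mean is mirrored below it. A quick sketch with made-up numbers (mu and sigma here are hypothetical, not taken from the fitted model):

```python
# Hypothetical illustration: with a Gaussian likelihood, a predicted
# goal difference of mu = 2 and a fairly wide sigma assigns exactly as
# much probability to an extreme win as to a draw or loss.
from scipy.stats import norm

mu, sigma = 2.0, 2.0  # made-up values, not from the fitted model

p_big_win = 1 - norm.cdf(4, loc=mu, scale=sigma)   # winning by 4+ goals
p_draw_or_loss = norm.cdf(0, loc=mu, scale=sigma)  # drawing or losing

print(p_big_win, p_draw_or_loss)  # both ≈ 0.159, identical by symmetry
```

Both tails get probability Φ(−1) ≈ 0.159, which is the symmetry the question is pointing at.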