Basically I’m doing something along the lines of (this is just a code sketch):
mu = pm.Normal("mu", mu=0, sigma=0.5)
sigma = pm.HalfNormal("sigma", sigma=0.2)
traffic_0 = pm.LogNormal("traffic_0", mu=mu, sigma=sigma)
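To make the sketch concrete: assume some synthetic observed data and an Exponential likelihood mirroring the full model further down (both hypothetical here, just so that `idata` below is defined):

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
obs = rng.exponential(scale=1.0, size=200)  # synthetic placeholder data

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=0.5)
    sigma = pm.HalfNormal("sigma", sigma=0.2)
    traffic_0 = pm.LogNormal("traffic_0", mu=mu, sigma=sigma)
    # hypothetical likelihood, mirroring the full model below
    pm.Exponential("delay", traffic_0, observed=obs)
    idata = pm.sample(random_seed=0)
```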
I expect the posterior samples of traffic_0 to be similar to what I get by taking the posterior samples of mu and sigma and sampling from a lognormal parameterised with them. However, I actually get quite different distributions:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm

# Pushforward: draw from a lognormal parameterised by the posterior samples
mus = az.extract(idata.posterior, var_names=["mu"])
sigmas = az.extract(idata.posterior, var_names=["sigma"])
samples = pm.draw(pm.LogNormal.dist(mu=mus.values, sigma=sigmas.values))

# Posterior samples of traffic_0 itself
traffic_0 = az.extract(idata.posterior, var_names=["traffic_0"])

bins = np.linspace(0, 5, 50)
plt.hist(samples, density=True, alpha=0.5, bins=bins)
plt.hist(traffic_0.squeeze(), density=True, alpha=0.5, bins=bins)
plt.legend([r"Lognormal with $\mu$,$\sigma$ posteriors", "traffic_0 posterior"])
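As a cross-check, the same pushforward samples can (if I understand the API correctly) be drawn with `pm.sample_posterior_predictive`, which resamples any free variable listed in `var_names` conditioned on the posterior draws of its parents; a sketch, reusing the model context from above:

```python
with model:
    # re-draws traffic_0 ~ LogNormal(mu, sigma) using the posterior
    # samples of mu and sigma, instead of drawing manually as above
    pp = pm.sample_posterior_predictive(idata, var_names=["traffic_0"])

pushforward = az.extract(pp.posterior_predictive, var_names=["traffic_0"])
```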
This feels like I must be missing something very basic, but I can’t see what. What I’m actually doing is slightly more complex, but not by much.
Autoregressive traffic model
I’m setting up an AR model where the initial value and the innovations come from the same lognormal distribution. In the true data-generating process the initial values and innovations are actually different, so this posterior “inconsistency” actually fits the data better, but I wanted to try a simpler version first, and I just fail to see how the results I’m getting are possible under the model I’ve defined.
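Spelled out, the generative process I’m defining (matching `ar_step` below) is

$$
\text{traffic}_0 \sim \text{LogNormal}(\mu, \sigma), \qquad
\text{traffic}_t = \rho \, \text{traffic}_{t-1} + (1 - \rho) \, \varepsilon_t, \qquad
\varepsilon_t \sim \text{LogNormal}(\mu, \sigma),
$$

so the initial value and the innovations $\varepsilon_t$ are governed by the same $\mu$ and $\sigma$.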
The following is a simplified version of the model I’m using. If the issue is not straightforward and requires debugging the model, let me know and I’ll work on an MRE.
import pytensor
from pymc.pytensorf import collect_default_updates

def ar_dist(ar_init, rho, mu, sigma, n_steps, size):
    def ar_step(traffic_t1, rho, mu, sigma):
        # innovation drawn from the same lognormal as the initial value
        innov = pm.LogNormal.dist(mu=mu, sigma=sigma)
        traffic = traffic_t1 * rho[0] + innov * (1 - rho[0])
        return traffic, collect_default_updates([innov])

    # `lags` comes from the enclosing scope
    ar_innov, _ = pytensor.scan(
        fn=ar_step,
        outputs_info=[{"initial": ar_init, "taps": range(-lags, 0)}],
        non_sequences=[rho, mu, sigma],
        n_steps=n_steps,
        strict=True,
    )
    return ar_innov
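Note that `ar_dist` closes over a global `lags`, and the model below also refers to `coords`, `stops`, and `obs`, which I define elsewhere; for reference they look something like this (illustrative values, not my real data):

```python
import numpy as np

lags = 1           # AR order; read by ar_dist from the enclosing scope
stops = 20         # total length of the series
n_instances = 100  # hypothetical number of observed instances

coords = {
    "lags": range(lags),
    "steps": range(stops - lags),
    "stops": range(stops),
    "instances": range(n_instances),
}
obs = np.ones((n_instances, stops))  # placeholder for the observed delays
```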
import pytensor.tensor as pt

with pm.Model(coords=coords, check_bounds=False) as ar_model:
    # AR parameters
    rho = pm.Beta("rho", 5, 1, dims=("lags",))
    mu = pm.Normal("mu", mu=0, sigma=0.5)
    sigma = pm.HalfNormal("sigma", sigma=0.2)

    # AR process: initial value and innovations share mu and sigma
    traffic_0 = pm.LogNormal("traffic_0", mu=mu, sigma=sigma, dims=("lags",))
    traffic_t = pm.CustomDist(
        "traffic_t",
        traffic_0,
        rho,
        mu,
        sigma,
        stops - lags,  # n_steps
        dist=ar_dist,
        dims=("steps",),
    )
    traffic = pm.Deterministic(
        name="traffic",
        var=pt.concatenate([traffic_0, traffic_t], axis=-1),
        dims="stops",
    )

    # Likelihood
    pm.Exponential("delay", traffic, observed=obs, dims=("instances", "stops"))
I first realised the issue by plotting the first 20 steps (in my real model I’m using a shifted geometric likelihood rather than an exponential, hence the discrete HDI). In the plot below you can see that, in the posterior, the initial traffic and the innovations (the latter being the value towards which the traffic converges as the steps increase) come from different distributions, despite being modelled with the same parameters. At first I thought it could be an issue with scan and the updating of $\mu$ and $\sigma$, but the innovations are actually consistent with these posteriors; it’s the initial value (i.e., the one outside scan) that is not.
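For a quick numeric version of the same check: the stationary mean of the process equals the innovation mean $e^{\mu + \sigma^2/2}$ (from $m = \rho m + (1 - \rho)\,E[\varepsilon]$, which gives $m = E[\varepsilon]$), so all three quantities below should agree if the initial value and the innovations were consistent (a sketch, reusing the arrays extracted earlier):

```python
pushforward_mean = np.exp(mus.values + sigmas.values ** 2 / 2).mean()
print("lognormal mean from mu/sigma posteriors:", pushforward_mean)
print("traffic_0 posterior mean:", traffic_0.mean().item())
print("late-step traffic posterior mean:", idata.posterior["traffic"].isel(stops=-1).mean().item())
```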
Versions
- pymc 5.17.0