I have a model that I am now regretting trying too many new things at once.
I am trying to estimate parameters that capture the influence of two independent variables while also getting the model to weight the likely trade off between them over another index variable.
I also have observations that can only be trusted (and I only care about matching) at an aggregate level so I have implimented a scan version of pandas df.groupby(‘id’).sum() over all columns. Then attempting a multinomial likelihood. code below.
When I call pm.sample() My model gives the error “samplingerror: initial evaluation of model at starting point failed!” and says that the “count” variable is evaluated as -inf
I usually chase these errors down using prior_predictive but that is working perfectly. Does anyone have any pointers or know why it works fine (and there is no -inf or nulls in “count”) when I run sample_prior_predictive??
I checked this via “idata.prior_predictive.counts.isnull().max()” and “np.isinf(idata.prior_predictive.counts).max()” and they returns False
def bc_on_col(j, matrix,IDS):
return at.extra_ops.bincount(x=IDS,weights=matrix[:,j])
with pm.Model(coords=coords) as distribution_calibration:
a = pm.Gamma("a", alpha=6., beta=2.)
b = pm.Gamma("b", alpha=1.5, beta=4.5)
F = pm.Gamma("F", alpha=a, beta = b, dims="area_NAME")
Pr_D_dis = pm.Deterministic("Pr_D_dis",at.exp(pm.logp(F,hts_destination_gc_dollars.values)),dims=("tid","area_NAME"))
#a destcat_based factor of the importance of attraction power <1 makes differences less pronouned >1 more
ap_importance = pm.Lognormal("ap_importance", mu=0, sigma=0.2, dims='DESTCAT')
#attraction power, scaled by importance
Pr_D_ap = at.pow(total_ap.values,ap_importance[destcad_ids].dimshuffle(0,'x'))
#re-normalise
Pr_D_ap = Pr_D_ap / at.sum(Pr_D_ap, axis=1, keepdims=True)
Pr_D = Pr_D_ap * Pr_D_dis
Pr_D = pm.Deterministic("Pr_D",Pr_D/at.sum(Pr_D, axis=-1, keepdims=True),dims=("tid","area_NAME"))
rolled_up_O, _ = aesara.scan(fn = bc_on_col, sequences=[at.arange(Pr_D.shape[1])], non_sequences=[Pr_D * tid_weights,estOcat_ids])
rolled_up_O = rolled_up_O / at.sum(rolled_up_O,axis=1,keepdims=True)
rolled_up_O2 = pm.Deterministic("rolled_up_O2",rolled_up_O.T,dims=("estOcat","area_NAME"))
counts = pm.Multinomial(
"counts", n=obs_n, p=rolled_up_O2,dims=("estOcat","area_NAME"),observed = obs_counts.values
)
idata = pm.sample_prior_predictive(samples=200, random_seed=rng)
#idata = pm.sample()