Posterior predictive does not fit the observed data well when a group and a predictor are included

I’ve been struggling with this model for a while and do not know what to do. My data are log-normally distributed, so I take the log of my response variable and fit a normal model like the one below.

with pm.Model() as pooled_model:
    # Single mean and noise scale shared by all observations
    mu = pm.Normal('mu', mu=0, sigma=1)
    sigma = pm.Exponential('sigma', 1)
    # nu = pm.Exponential('nu', 10)

    work_time = pm.Normal('work_time', mu=mu, sigma=sigma, observed=d.work_time_log_std)
    # Alternative likelihood I tried:
    # work_time = pm.StudentT('work_time', nu=nu, mu=mu, sigma=sigma, observed=d.work_time_log_std)

    pooled_trace = pm.sample(2000, tune=1000, random_seed=seed)

The PPC plot for the model above is below.

Here’s the issue I am having: I want to fit the same model as above, but with a predictor and a separate intercept for each group.

with pm.Model() as unpooled_glm:
    # One intercept per group, one slope shared by all groups
    a = pm.Normal('a', mu=0, sigma=10, shape=n_groups)
    b1 = pm.Normal('b1', mu=0, sigma=10)

    mu = pm.Deterministic('mu', a[g_idx] + b1 * d['feet_std'])

    # One noise scale per group
    sigma = pm.HalfCauchy('sigma', 1, shape=n_groups)
    nu = pm.Exponential('nu', 1)

    work_time = pm.StudentT('work_time', nu=nu, mu=mu, sigma=sigma[g_idx], observed=d.work_time_log_std)
    # Alternative likelihood I tried:
    # work_time = pm.Normal('work_time', mu=mu, sigma=sigma[g_idx], observed=d.work_time_log_std)

    unpooled_glm_trace = pm.sample(2000, tune=1000, random_seed=seed)

The PPC plot for this model is below. I cannot figure out how to parameterize the model so that the posterior predictive fits the observed data as well as it does in the first model. Any idea what I’m doing wrong? Thanks for the help.

Hi,
Have you tried with a Normal likelihood? It looks like your observed data isn’t really fat-tailed.

Also, maybe try estimating only one sigma – right now you’re estimating one per group, which can be difficult when you don’t have a lot of data (and sometimes it also doesn’t make sense from a domain knowledge perspective: do you really expect the observational noise to be different for each group?).
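
To give a feel for why per-group sigmas are hard to pin down with little data, here is a small NumPy simulation. It is only a sketch and does not use your data: the number of groups, the group sizes, and the true sigma of 1 are all made-up assumptions. It compares how much a separate sample SD per group jumps around from one simulated data set to the next against a single pooled SD.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

true_sigma = 1.0  # assumed common noise level (hypothetical)
n_groups = 8      # hypothetical number of groups
n_sims = 1000     # simulated data sets per group size

spread = {}
for n_per_group in (5, 20, 100):
    # Simulated residuals, shape (n_sims, n_groups, n_per_group)
    y = rng.normal(0.0, true_sigma, size=(n_sims, n_groups, n_per_group))
    group_var = y.var(axis=2, ddof=1)            # one variance estimate per group
    group_sd = np.sqrt(group_var)                # separate sigma per group
    pooled_sd = np.sqrt(group_var.mean(axis=1))  # single shared sigma (equal group sizes)
    # How much each kind of estimate varies across simulated data sets
    spread[n_per_group] = (group_sd.std(), pooled_sd.std())
    print(n_per_group, spread[n_per_group])
```

With small groups the per-group estimates should be visibly noisier than the pooled one. In the model itself, a single sigma just means dropping `shape = n_groups` from the `sigma` prior and passing `sigma = sigma` instead of `sigma = sigma[g_idx]` to the likelihood.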

Finally, you can do prior predictive checks, to make sure your priors make sense on the outcome scale, and you can make custom posterior predictive checks, looking at each group in a different way, in addition to ArviZ’s standard PPC plot.
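
Both checks can be sketched by hand in NumPy. The block below is an illustration under assumptions, not your model: `feet_std` and `g_idx` are random stand-ins, the likelihood is Normal, and the tighter Normal(0, 1) / Exponential(1) priors are just one alternative to compare against the Normal(0, 10) / HalfCauchy(1) priors from the question. For a real PyMC model, `pm.sample_prior_predictive` does the first part for you. The helper `groupwise_check` is a hypothetical name; it expects an array of replicated outcomes with shape (draws, observations).

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws, n_obs, n_groups = 1000, 200, 4

# Hypothetical stand-ins for the real data: standardized predictor and group labels
feet_std = rng.normal(size=n_obs)
g_idx = rng.integers(0, n_groups, size=n_obs)


def prior_predictive(prior_sd, sigma_draws):
    """Simulate outcomes with Normal(0, prior_sd) priors on a and b1."""
    a = rng.normal(0.0, prior_sd, size=(n_draws, n_groups))
    b1 = rng.normal(0.0, prior_sd, size=(n_draws, 1))
    mu = a[:, g_idx] + b1 * feet_std  # shape (n_draws, n_obs)
    return rng.normal(mu, sigma_draws[:, None])


# Priors from the question: Normal(0, 10) and HalfCauchy(1)
wide = prior_predictive(10.0, np.abs(rng.standard_cauchy(n_draws)))
# Tighter alternative (an assumption): Normal(0, 1) and Exponential(1)
tight = prior_predictive(1.0, rng.exponential(1.0, n_draws))

# Share of simulated outcomes more than 3 SD from 0 on the standardized scale
frac_wide = np.mean(np.abs(wide) > 3)
frac_tight = np.mean(np.abs(tight) > 3)
print(f"beyond +/-3: wide priors {frac_wide:.2f}, tight priors {frac_tight:.2f}")


def groupwise_check(y_obs, y_rep, groups):
    """Per group: observed mean and 94% interval of the replicated group means."""
    out = {}
    for g in np.unique(groups):
        mask = groups == g
        rep_means = y_rep[:, mask].mean(axis=1)
        lo, hi = np.percentile(rep_means, [3, 97])
        out[int(g)] = (float(y_obs[mask].mean()), float(lo), float(hi))
    return out


# Demo only: one simulated data set plays the role of the observed data
summary = groupwise_check(tight[0], tight, g_idx)
for g, (obs_mean, lo, hi) in summary.items():
    print(g, round(obs_mean, 2), round(lo, 2), round(hi, 2))
```

If a large share of prior predictive draws lands far outside the range a standardized outcome can plausibly take, the priors are probably too wide. For the per-group check on a fitted model, pass the observed outcome as `y_obs` and the posterior predictive draws for `work_time` as `y_rep`; a group whose observed mean falls outside its interval is one the model is not capturing.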

Hope this helps :vulcan_salute: