Posterior does not fit observed data well when a group and a predictor are included

kwespiipi · October 24, 2020, 2:01pm

I’ve been struggling with this model for a while and do not know what to do. My data are log-normally distributed, so I take the log of my response variable and fit a normal model like the one below.

import pymc3 as pm

# d is the poster's DataFrame and seed an integer; both are defined elsewhere
with pm.Model() as pooled_model:
    mu = pm.Normal('mu', mu=0, sigma=1)
    sigma = pm.Exponential('sigma', 1)
    # nu = pm.Exponential('nu', 10)

    work_time = pm.Normal('work_time', mu=mu, sigma=sigma, observed=d.work_time_log_std)
    # Student-t alternative (needs nu above):
    # work_time = pm.StudentT('work_time', nu=nu, mu=mu, sigma=sigma, observed=d.work_time_log_std)

    pooled_trace = pm.sample(2000, tune=1000, random_seed=seed)

The posterior predictive check (PPC) plot for this model is below.

Here’s the issue I am having. I want to fit the same model as above, but with a predictor and a separate intercept for each of my groups.

with pm.Model() as unpooled_glm:
    # One intercept per group, one shared slope
    a = pm.Normal('a', mu=0, sigma=10, shape=n_groups)
    b1 = pm.Normal('b1', mu=0, sigma=10)

    mu = pm.Deterministic('mu', a[g_idx] + b1 * d['feet_std'])

    # One noise scale per group
    sigma = pm.HalfCauchy('sigma', 1, shape=n_groups)
    nu = pm.Exponential('nu', 1)

    work_time = pm.StudentT('work_time', nu=nu, mu=mu, sigma=sigma[g_idx], observed=d.work_time_log_std)
    # Normal alternative:
    # work_time = pm.Normal('work_time', mu=mu, sigma=sigma[g_idx], observed=d.work_time_log_std)

    unpooled_glm_trace = pm.sample(2000, tune=1000, random_seed=seed)

The PPC plot for this model is below. I cannot figure out how to parameterize the model so that the posterior predictive draws fit the observed data as well as they do in the first model. Any idea what I’m doing wrong? Thanks for the help.

AlexAndorra · October 26, 2020, 9:42am

Hi,
Have you tried with a Normal likelihood? It looks like your observed data isn’t really fat-tailed.
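
One rough way to check whether data are fat-tailed is the sample excess kurtosis, which is near 0 for Normal data and clearly positive for heavy tails. The sketch below is NumPy-only and uses simulated stand-in data, not the data from this thread; the helper name `excess_kurtosis` is made up for illustration.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: about 0 for Normal data, positive for fat tails."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**4) - 3.0)

rng = np.random.default_rng(0)
normal_like = rng.normal(size=5000)            # stand-in for thin-tailed data
fat_tailed = rng.standard_t(df=3, size=5000)   # stand-in for fat-tailed data

k_normal = excess_kurtosis(normal_like)
k_fat = excess_kurtosis(fat_tailed)
print(f"Normal draws:    {k_normal:.2f}")
print(f"Student-t draws: {k_fat:.2f}")
```

Applied to the real response (here that would presumably be `d.work_time_log_std`), a value close to 0 would support trying the Normal likelihood first.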

Also, maybe try estimating only one sigma – right now you’re estimating one per group, which can be difficult when you don’t have a lot of data (and sometimes it also doesn’t make sense from a domain knowledge perspective: do you really expect the observational noise to be different for each group?).
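
To see why one sigma per group can be hard to estimate, here is a small NumPy simulation. It is only an illustration under assumed numbers (8 groups, 5 observations each, the same true noise level everywhere), not a Bayesian fit and not the data from this thread.

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_per_group, true_sigma = 8, 5, 1.0   # hypothetical sizes

# Every group shares the same true noise level.
y = rng.normal(0.0, true_sigma, size=(n_groups, n_per_group))

# One estimate per group: noisy, because each uses only 5 points.
per_group_sd = y.std(axis=1, ddof=1)

# One shared estimate: average the within-group variances, then take the root.
pooled_sd = float(np.sqrt(y.var(axis=1, ddof=1).mean()))

print("per-group sd:", np.round(per_group_sd, 2))
print("pooled sd:   ", round(pooled_sd, 2))
```

The per-group estimates scatter around the true value even though nothing differs between groups, while the pooled estimate uses all the data at once. In the model above, the corresponding change would be to drop `shape = n_groups` from `sigma` and pass `sigma` rather than `sigma[g_idx]` to the likelihood.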

Finally, you can do prior predictive checks, to make sure your priors make sense on the outcome scale, and you can make custom posterior predictive checks, looking at each group in a different way, in addition to ArviZ’s standard PPC plot.
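
Both checks can be sketched with plain NumPy. The first part draws from the priors of the unpooled model in the question by hand (it imitates what a prior predictive check does, it is not PyMC's own sampler). The second part compares observed and replicated group means. The helper `groupwise_ppc` and all arrays are hypothetical stand-ins; with a real fit, `y_rep` would be the posterior predictive draws with shape (draws, observations).

```python
import numpy as np

rng = np.random.default_rng(2)

# --- Prior predictive check, simulated by hand ---
n_draws = 10_000
x = rng.normal(size=n_draws)                        # hypothetical standardized predictor
a = rng.normal(0.0, 10.0, size=n_draws)             # a  ~ Normal(0, 10)
b1 = rng.normal(0.0, 10.0, size=n_draws)            # b1 ~ Normal(0, 10)
sigma = np.abs(rng.standard_cauchy(size=n_draws))   # sigma ~ HalfCauchy(1)
nu = rng.exponential(1.0, size=n_draws)             # nu ~ Exponential(1)
y_prior = a + b1 * x + sigma * rng.standard_t(df=nu)

# A standardized outcome should mostly fall within about +/- 3.
frac_extreme = float(np.mean(np.abs(y_prior) > 3))
print(f"share of prior predictive draws with |y| > 3: {frac_extreme:.2f}")

# --- Group-wise posterior predictive check ---
def groupwise_ppc(y_obs, y_rep, g_idx):
    """Per group: observed mean and the 3%-97% interval of replicated means."""
    out = {}
    for g in np.unique(g_idx):
        mask = g_idx == g
        rep_means = y_rep[:, mask].mean(axis=1)      # one mean per replicated dataset
        lo, hi = np.percentile(rep_means, [3, 97])
        out[int(g)] = (float(y_obs[mask].mean()), float(lo), float(hi))
    return out

# Stand-in data: 3 groups, 200 observations, 500 replicated datasets.
g_idx = rng.integers(0, 3, size=200)
group_means = np.array([-1.0, 0.0, 1.0])
y_obs = rng.normal(group_means[g_idx], 1.0)
y_rep = rng.normal(group_means[g_idx], 1.0, size=(500, 200))

result = groupwise_ppc(y_obs, y_rep, g_idx)
for g, (obs, lo, hi) in result.items():
    print(f"group {g}: observed mean {obs:.2f}, replicated interval [{lo:.2f}, {hi:.2f}]")
```

If most prior predictive draws land far outside the plausible range of a standardized outcome, tighter priors are worth trying. A group whose observed mean falls outside its replicated interval points to where the model misfits.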

Hope this helps
