Model fails to recover posterior when using correct prior (but does with incorrect prior)

jewelltaylor · March 5, 2023, 9:12pm

I am having trouble getting a certain parameters in my bayesian GLM to converge even though it follows the data generating process.

The data is generated according to the following (simulated) process. There are a number of user who give scores a number of weeks. A users score for a given week is sampled from a normal where the mean is given by the sum of a user specific term, a week specific term and a given treatment effect (user specific) thats scaled by a change factor that grows across weeks. This normal has constance variance across all users and weeks. The user and week specific terms are simply selected from normal distribution. The treatment effect is the product of global user weights and user specific features. A treatment effect sampled from a normal for each user is used to scale the global user weights prior to this operation to achieve the desired user specific treatment effect.

More specifically, here is python like pseudocode to generate the data:

num_users = 10000
num_weeks = 12
user_terms = Normal(user_mean, user_std, size=num_users) 
week_terms = Normal(0, week_std, size=num_weeks)
treatment_effects = Normal(effect_size, effect_size / 2)
global_user_weights = Laplace(weights_mean, weights_decay, size=num_features) 

# Normalize global user weights to unit norm 1
scaled_global_user_weights = global_user_weights / sum(global_user_weights) 

score_matrix = ones(rows=num_users, cols=num_weeks)
for user_term, user_id in enumerate(user_terms): 
    user_features = choice([0, 1], size=num_features) # binary features 
    for week_term, week_id in enumerate(week_terms): 
        change_factor = log(week_id * (velocity - 1) / (num_weeks - 1) + 1) / log(velocity)
        # Scale global user weights to induce a maximum treatment_effect of effect_size 
        # when thew dot product is taken with user specific features
        weights = scaled_global_user_weights * treatment_effects[user_id] 
        treatment_effect = dot(weights, user_features) * change_factor) 
        treatment_effect =  treatment_effects[user_id] * change_factor
        score_mean = user_term + weekly_term + treatment_effect
        score = Normal(score_mean, score_std) 
        score_matrix[user_id, week_id] = score

Based on this data generating process, we are looking to construct a bayesian glm to recover the global user weights, user terms, weekly terms and user specific velocity terms. We are able to recover, the global user weights and user specific velocity terms but having trouble recovering weekly terms. Specifically, when using the prior that is used to generate the data (ie Normal(0, week_std), the MCMC converges to a place in which all of the weekly terms are negative. However, when I use Uniform prior for the weekly terms, posterior estimates are a lot more reasonable. Here is the code:

# Set prior for global user weights
global_user_weights = pm.Laplace("weights", mu=0, b=b, shape=num_features)

# Set prior for velocity 
velocity = pm.Normal("velocity", velocity_mean, velocity_std, size=num_users)

# Set prior for user terms 
user_terms = pm.Uniform("intercept", 0, 1, size=num_users)

# Set prior for weekly terms 
weekly_terms = pm.Uniform(
    "exogenous_means", -weekly_std, weekly_std, size=num_weeks
)

# Setup feature values
user_features = pm.Data("user_features", user_features, mutable=False)
week = pm.Data("week", weeks, mutable=False)
user_id = pm.Data("user_id", user_indices, mutable=False)
week_id = pm.Data("week_id", week_indices, mutable=False)

likelihood = (
    pm.math.dot(user_features, global_user_weights)
    * pm.math.log(week * (velocity[user_id] - 1) / num_weeks + 1)
    / pm.math.log(velocity[user_id])
    + user_terms[user_id]
    + weekly_terms[week_idx]
)
obs = pm.Normal(
    "observation",
    likelihood,
    sigma=score_std,
    observed=scores,
)

trace = pm.sampling_jax.sample_numpyro_nuts(tune=2000, target_accept=0.95, data_kwargs= dict(log_likelihood=False))

Here is an example of the recovered weekly_terms (exogenous_means) when using uniform:

And here is an example of the recovered weekly_terms when using normal:

As you can see, the estimates are all negative even if i very prior to make it very accurate. Here is some of the sample chains for certain weekly terms:

Any help debugging this strange behaviour would be appreciated!

Topic		Replies	Views
Trouble fitting Bayesian GLM with large number of parameters version agnostic development , large_model , modeling	2	380	November 18, 2023
Problems with posteriors of parameters in Bayesian hierarchical model (partial pooling) v5	0	237	August 8, 2023
Issues with Hierarchical model version agnostic modeling	3	76	February 7, 2025
Ideas for reparameterizing models/changing priors to avoid divergences Questions	3	912	December 5, 2019
Calculated Values not matching Posterior Predictive version agnostic modeling	8	1028	May 29, 2022

Model fails to recover posterior when using correct prior (but does with incorrect prior)

Related topics