Question about multi level structure for likelihood

JAB · June 16, 2023, 4:17pm

I am working on a modeling problem where the regression is perfectly suited for a hierarchical structure. However, after inspecting the data, it seems that the response variable has different mean and standard deviation for each group. This might be a straightforward problem, but I wondering how to extend the multilevel structure to the noise in the likelihood. e.g., if we have some x-data and some y-data with four groups, we might model like likelihood as:

intercept = pm.Normal(..., dims=groups)
slope     = pm.Normal(..., dims=groups)

mu = intercept[group] + slope[group]*x-data

err = pm.HalfNormal(...)

likelihood = pm.Normal(..., mu = mu, sigma = err, dims="obs_id")

This works fine in general, but it assumes a single noise param for all groups. Is it as simple as changing the noise to be something like:

err = pm.__SOME_DISTRIBUTION__('err', .... , dims=groups)

I wrote a full working example with simulated data. This question is based off of a problem that I am actually working on, but for simplicity in understanding, I am working with the simulated set. I was able to get the above approach to work, but (a) I am not sure if this is a correct approach, and (b) how to appropriately model the noise when I need prior and hyperprior.

Here is the notebook on colab. I thought it was clearer than trying to fit all the code/figures into one long post.

Thanks!

ricardoV94 · June 16, 2023, 8:40pm

You can use a hierarchical normal prior and exponentiate it. Or a LogNormal which is exactly the same.

JAB · June 16, 2023, 8:51pm

Great, thanks for the reply! So you mean something like this:

# Normal prior with exponential
err_mu     = pm.Normal("err_mu", mu=0, sigma=10)
err_sigma  = pm.HalfNormal("err_sigma", 1)
err        = pm.math.exp(pm.Normal("err", mu=err_mu, sigma=err_sigma, dims=("group"))

# Log normal for all?
err_mu    = pm.LogNormal("err_mu", mu=0, sigma=0.25)
err_sigma = pm.LogNormal("err_sigma", mu=0, sigma=0.25)
err       = pm.LogNormal("err", mu=err_mu, sigma=err_sigma, dims=("group"))

# Mean of distribution
mu = intercept[groupidx] + slope[groupidx]*xdata
    
# Likelihood
y_likelihood = pm.Normal("y_likelihood", mu=mu, sigma=err[groupidx], observed=ydata, dims="obs_id")

or am I getting it entirely wrong?

On a more fundamental note, are there any suggested resources for how to model noise? Is it something that really makes a difference? EDIT: what I mean by makes a difference, for some processes we may not have any idea what generates the noise in the likelihood, so in this case do we just want to test distributions until our model achieves good performance?

UPDATE:

I tried both

    # Log normal distribution for lambda param
    sigma_err  = pm.Lognormal('sigm_err',mu=0, sigma=0.5)
    # Exponentional distribution for likelihood noise
    err        = pm.Exponential("err",lam=sigma_err,dims=("group"))

   # All log normal
    err_mu    = pm.LogNormal("err_mu", mu=0, sigma=0.25)
    err_sigma = pm.LogNormal("err_sigma", mu=0, sigma=0.25)
    err       = pm.LogNormal("err", mu=err_mu, sigma=err_sigma, dims=("group"))

and both worked w/o divergence. Both converged on nearly the same solution, but neither quite accurately captures the true likelihood noise. I also tried

err_mu     = pm.Normal("err_mu", mu=0, sigma=10)
err_sigma  = pm.HalfNormal("err_sigma", 1)
err        = pm.math.exp(pm.Normal("err", mu=err_mu, sigma=err_sigma, dims=("group")))

and that failed miserably with hundreds of divergences.

Any input on this? Truthfully, the only intuition I have is that the noise should be positive. Otherwise, I am sort of guessing as to how to select and appropriate distribution.

Topic		Replies	Views
Problems when changing the likelihood slightly v5 modeling	2	293	August 31, 2023
Feedback on a Hierarchical Model for Learning Shared Coefficient of Variation v5 modeling , hierarchical	4	230	February 29, 2024
Hierarchical model with Additional Features for Certain Groups v5 modeling	4	89	June 25, 2024
Hierarchical Gaussian Mixture Regression Model Questions	5	1792	March 16, 2019
Feasibility of following Hierarchical model Questions	2	469	July 31, 2020

Question about multi level structure for likelihood

Related topics