Random variable transformations and Bambi

Hi, I’m training a simple hierarchical model in Bambi.

My data generation process is

outcome ~ Poisson(mu)
log(mu) = intercept_group + log(β) * slope_group

I know from the definition of the problem that 0 <= slope_group <= 1.
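For concreteness, here is a quick NumPy simulation of that data-generating process (all numbers made up; `beta` is a hypothetical predictor, not your actual data):

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up values for a single group
intercept_group = 0.2
slope_group = 0.7                          # constrained to [0, 1] by design
beta = rng.uniform(1.0, 10.0, size=100)    # hypothetical positive predictor

# log(mu) = intercept_group + log(beta) * slope_group
mu = np.exp(intercept_group + np.log(beta) * slope_group)
outcome = rng.poisson(mu)
print(outcome[:5])
```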

My Bambi model looks like this:

     Formula: outcome ~  0 + group_id + log(β) + (0 + log(β)|group_id)
        Family: poisson
          Link: mu = log
  Observations: 580
        Priors: 
    target = mu
        Common-level effects
            group_id ~ Normal(mu: 0.0, sigma: 2.0)
            log(β) ~ Beta(alpha: 1.0, beta: 1.0)
        
        Group-level effects
            log(β)|group_id ~ Normal(mu: 0.0, sigma: HalfNormal(sigma: 1.0))
        

It fits fine, but my problem is that when I compute the individual group slopes as idata.posterior['log(β)'] + idata.posterior['log(β)|group_id'], I may end up with values outside the range [0, 1], which blows up some of my post-processing steps.

I would like to have something like slope_group = invlogit(slope_global + slope_correction|group) to constrain the slope between 0 and 1. I wonder if that’s possible without going back to a pure PyMC implementation.
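What I mean is a sketch like this (toy numbers; `invlogit` here is the standard inverse-logit/sigmoid, which maps any real number into (0, 1)):

```python
import math

def invlogit(x):
    # sigmoid: maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

# made-up values: an unbounded global slope plus a group-specific correction
slope_global = 1.2
slope_correction = -2.5

slope_group = invlogit(slope_global + slope_correction)
print(slope_group)  # always strictly between 0 and 1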


CC @tcapretto @zweli

You could achieve this with a custom prior on the group-specific slope.

Notice that you only add the group-specific part, but under the hood, you also add the population mean.

The following is an example I used to check I was setting the priors correctly, but don’t try to use it with these data, because it won’t work well.

import bambi as bmb
import pymc as pm

data = bmb.load_data("sleepstudy")

def custom_prior_centered(name, mu, sigma, dims=None):
    # unbounded slope on the logit scale, then squashed into (0, 1)
    param_untransformed = pm.Normal(f"{name}_untransformed", mu=mu, sigma=sigma, dims=dims)
    return pm.Deterministic(name, pm.math.invlogit(param_untransformed), dims=dims)


def custom_prior_noncentered(name, mu, sigma, dims=None):
    # non-centered parametrization: standardized offsets scaled by sigma
    param_raw = pm.Normal(f"{name}_raw", mu=0, sigma=1, dims=dims)
    param_untransformed = pm.Deterministic(f"{name}_untransformed", mu + sigma * param_raw, dims=dims)
    return pm.Deterministic(name, pm.math.invlogit(param_untransformed), dims=dims)

formula = "Reaction ~ 0 + Subject + (0 + Days | Subject)"

priors = {
    "Subject": bmb.Prior("Normal", mu=0, sigma=200),
    "Days|Subject": bmb.Prior(
        "TransformedNormal",
        mu=bmb.Prior("Normal", mu=0, sigma=1),
        sigma=bmb.Prior("HalfNormal", sigma=1),
        dist=custom_prior_centered, # or custom_prior_noncentered
    )
}

bmb.Model(formula, data, priors=priors, categorical="Subject")

In your case, you may also want to explore pymc.LogitNormal (see the PyMC 5.21.0 documentation).

This may help to understand how Bambi does hierarchical modeling.
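As a plain-NumPy sketch of the idea (toy numbers, not Bambi’s actual internals): the slope used for each group is the population mean plus a zero-centered group-specific deviation.

```python
import numpy as np

rng = np.random.default_rng(42)

# population-level slope and spread of the group deviations (toy values)
mu_slope = 0.5
sigma_group = 0.1
n_groups = 5

# group-specific deviations around zero, as in (0 + x | group)
deviations = rng.normal(0.0, sigma_group, size=n_groups)

# the slope actually used for each group is the sum of both parts
group_slopes = mu_slope + deviations
print(group_slopes)
```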

@tcapretto is a ninja at this kind of stuff, but I am thinking some kind of custom prior. I don’t have time right now but will look at it this evening.


Lol. He responded while I was typing. Like I said, @tcapretto is a ninja.


Thank you for your reply!

I’m a bit confused, does it say I should not use a population slope in the model because it is added automatically?

One of the purposes of my study is to estimate the population slope.

Reading from that GitHub explanation you linked:

This is exactly what Bambi represents with 1 and (1 | Pig) in the model. 1 is the common intercept, and (1 | Pig) is the pig specific deviation around that common intercept. As in the math notation, the sum of these two components give the intercept for each pig.

That’s what I understand. My goal is to limit 1 + (1|Pig) to be within 0 and 1; if those are two independent distributions, that’s not possible.

I’m a bit confused, does it say I should not use a population slope in the model because it is added automatically?

It’s not added in the formula, but it is added in the model because of the way the custom prior is constructed. In the example above, mu is the population slope.


If, in your example, you want both the population slope (mu) and the group slopes (mu + group-specific deviation) to be between 0 and 1, perhaps you need to use something different.

You could do the following

import bambi as bmb

data = bmb.load_data("sleepstudy")

formula = "Reaction ~ 0 + Subject + (0 + Days | Subject)"

priors = {
    "Subject": bmb.Prior("Normal", mu=0, sigma=200),
    "Days|Subject": bmb.Prior(
        "Beta",
        mu=bmb.Prior("Beta", alpha=3, beta=3),
        nu=bmb.Prior("HalfNormal", sigma=10),
    )
}

bmb.Model(formula, data, priors=priors, categorical="Subject")

Again, don’t run it against that data because it will not work.

In the example, I’m partially pooling slope parameters that are, by design, between 0 and 1. It uses the mean and “sample size” parametrization of the beta distribution.
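Under that parametrization (assuming my reading of it), mu is the mean and nu = alpha + beta acts as the "sample size", so alpha = mu * nu and beta = (1 - mu) * nu. A quick check in plain Python:

```python
def beta_from_mean_nu(mu, nu):
    # convert the (mean, sample-size) parametrization to (alpha, beta)
    alpha = mu * nu
    beta = (1.0 - mu) * nu
    return alpha, beta

alpha, beta = beta_from_mean_nu(mu=0.5, nu=6.0)
print(alpha, beta)             # 3.0 3.0
print(alpha / (alpha + beta))  # recovers the mean: 0.5
```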


With your syntax it would be:

formula = "outcome ~ 0 + group_id + (0 + log(β)|group_id)"

priors = {
    "log(β)|group_id": bmb.Prior(
        "Beta",
        mu=bmb.Prior("Beta", alpha=3, beta=3),
        nu=bmb.Prior("HalfNormal", sigma=10),
    )
}

bmb.Model(formula, data, priors=priors)

I try to avoid the Beta, as I don’t have a good prior on the effective sample size. I tried LogitNormal, but what confused me a lot is that the mu hyperprior is not even in the graph. Is this a bug?

The formulation is:

bmb.Model(
    "actions ~ 0 + group_id + (0 + log(β)|group_id)",
    data=df,
    family="poisson",
    priors={
        "log(β)|group_id": bmb.Prior(
            "LogitNormal",
            mu=bmb.Prior("Normal", mu=0, sigma=1),
            sigma=bmb.Prior("HalfNormal", sigma=1),
        ),
        "group_id": bmb.Prior("Normal", mu=0, sigma=2),
    },
)

It’s the same behavior with a “Normal” prior for the slope, by the way, but not so for the Beta:

"log(β)|group_id": bmb.Prior("Beta", mu=bmb.Prior("Beta", alpha=1, beta=1), nu=bmb.Prior("HalfNormal", sigma=1)),

That’s a really bad prior for nu, but it’s just for the sake of the example.

I think you need to pass bmb.Model(..., noncentered=False), and then it should work. If it doesn’t, it’s a bug.

Regarding the Beta prior and its sample size, take it as a measure of precision: the larger it is, the smaller the variance. You could use PreliZ (Exploring and eliciting probability distributions — PreliZ 0.15.0 documentation) to do prior elicitation. For example, look at Direct elicitation in 1D — PreliZ 0.15.0 documentation.
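To see the "precision" reading: for a Beta with mean mu and sample size nu, the variance is mu * (1 - mu) / (nu + 1), so a larger nu means a tighter prior. A quick sanity check, assuming that formula:

```python
def beta_variance(mu, nu):
    # variance of a Beta distribution in the (mean, sample-size) parametrization
    return mu * (1.0 - mu) / (nu + 1.0)

# for a fixed mean, the variance shrinks as nu grows
print(beta_variance(0.5, 2.0))    # ~0.0833
print(beta_variance(0.5, 20.0))   # ~0.0119
print(beta_variance(0.5, 200.0))  # ~0.00124
```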

I was a bit skeptical of a centered parametrization, but it samples really well with LogitNormal. Thank you very much for your help.

As feedback: I guess the API should not allow specifying the mu prior when the parametrization is non-centered, as it’s quite confusing.


Thanks. I think the source of the confusion comes from the blend of frequentist and Bayesian approaches to mixed-effects models.