Sensible prior for logistic regression with one or no input variable

leomein · December 2, 2021, 2:02pm

Hi all,

I have what feels like a very basic question. I am trying to get the HPD interval for the mean of a binary outcome (i.e. the interval for a probability value). I am currently running a logistic regression with just one input variable, which is used for stratifying the data, so essentially I just want to model probability values for different subgroups within the data. Currently I am using bambi, and it uses a Normal prior. I was wondering if this is how one would normally go about this, or if there’s another way (maybe another prior) to do this better?

Also, if I just have the outcome values and no input variable (so I just want to estimate the overall HPD interval for the probability for the whole dataset), how would I ideally do that? I used a Beta prior with alpha=beta=1 to do this for now, but I don’t know if this is sensible.

Thank you!

vb690 · December 3, 2021, 10:28am

Hello,

If I understood this correctly, this may be what you are looking for:

# simulate data
import numpy as np
import pymc3 as pm

# true success probability for each group
groups = {
    0: 0.3,
    1: 0.1,
    2: 0.7
}
n = 100  # observations per group

groups_idx = []
observed = []

for group, p in groups.items():
    groups_idx.extend([group] * n)
    observed.extend(np.random.binomial(1, p, size=n))

groups_idx = np.array(groups_idx)
observed = np.array(observed)

# build the model and sample from it
with pm.Model() as bernoulli_model:

    groups_data = pm.Data(
        'Groups Indices',
        groups_idx
    )
    observed_data = pm.Data(
        'Observed Data',
        observed
    )

    # weakly informative prior centred on 0.5, one p per group
    p = pm.Beta(
        'p',
        alpha=2,
        beta=2,
        shape=len(groups)
    )

    # named `likelihood` so the NumPy array `observed` is not overwritten
    likelihood = pm.Bernoulli(
        'Observed',
        p=p[groups_data],
        observed=observed_data
    )

with bernoulli_model:
    trace_bernoulli = pm.sample()

If what I illustrated reflects your case, you could also model this as a binomial model by summing the outcomes of the Bernoulli trials within each group.
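A minimal sketch of that binomial aggregation, using NumPy only. With a Beta prior and a binomial likelihood the posterior for each group is again a Beta distribution, Beta(alpha + successes, beta + failures), so for this simple case no sampler is strictly needed. The helper `hdi_from_draws`, the seed, the 94% interval and the number of draws are all illustrative assumptions, not part of the model above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# Simulate the same kind of data as above: three groups, n trials each.
true_p = np.array([0.3, 0.1, 0.7])
n = 100
groups_idx = np.repeat(np.arange(len(true_p)), n)
observed = rng.binomial(1, true_p[groups_idx])

# Collapse the Bernoulli trials into one (successes, trials) pair per group.
trials = np.bincount(groups_idx, minlength=len(true_p))
successes = np.bincount(
    groups_idx, weights=observed, minlength=len(true_p)
).astype(int)

# Conjugate update: Beta(a, b) prior + binomial data -> Beta posterior.
prior_alpha, prior_beta = 2.0, 2.0
post_alpha = prior_alpha + successes
post_beta = prior_beta + (trials - successes)


def hdi_from_draws(draws, prob=0.94):
    """Narrowest interval containing a fraction `prob` of the draws."""
    sorted_draws = np.sort(draws)
    n_draws = len(sorted_draws)
    window = int(np.ceil(prob * n_draws))
    widths = sorted_draws[window - 1:] - sorted_draws[:n_draws - window + 1]
    start = int(np.argmin(widths))
    return sorted_draws[start], sorted_draws[start + window - 1]


# Draw from each group's posterior and report an HPD interval for p.
draws = rng.beta(post_alpha, post_beta, size=(20_000, len(true_p)))
for g in range(len(true_p)):
    low, high = hdi_from_draws(draws[:, g])
    print(f"group {g}: {successes[g]}/{trials[g]} successes, "
          f"94% HPD for p = ({low:.3f}, {high:.3f})")
```

In PyMC3 the same counts could instead be passed to `pm.Binomial('Observed', n=trials, p=p, observed=successes)`, and ArviZ's `az.hdi` computes this kind of interval from the sampled trace. The case with no input variable is the same calculation with a single group. A Beta prior with alpha=beta=1 is the uniform distribution on [0, 1], so it is a reasonable default when you have no prior information about the probability.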

leomein · December 5, 2021, 7:01pm

Great, thanks a lot! That’s exactly what I was looking for
