Why does my logistic regression model summary give a mean for each observation?

Kynnemall · August 26, 2022, 11:58am

I’ve built a logistic regression model on discrete data (2 binary predictors, 2 ordinal predictors) and expected the model summary to output a single row for the predicted variable. Instead, it outputs a row for each observation in my dataset.

Is there an error in my code? Or is it what the model predicts given each observation?

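import pymc as pm

# cleaned is assumed to be a pandas DataFrame with columns A, B, C, D and Predict
with pm.Model() as model: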
    # priors on parameters
    beta_0 = pm.Normal("beta_0", mu=0, sigma=1)
    
    a = pm.Normal("A", mu=0, sigma=1)
    b = pm.Normal("B", mu=0, sigma=1)
    c = pm.Normal("C", mu=0, sigma=1)
    d = pm.Normal("D", mu=0, sigma=1)

    # probability of belonging to class 1
    p = pm.Deterministic("p(Class=1)", pm.math.sigmoid(beta_0+
                                                     a*cleaned["A"]+
                                                     b*cleaned["B"]+
                                                     c*cleaned["C"]+
                                                     d*cleaned["D"])
                        )
with model:
    # likelihood of the observed class labels
    observed = pm.Bernoulli("Class=1", p, observed=cleaned["Predict"])
    start = pm.find_MAP()
    step = pm.Metropolis()

    # draw samples from the posterior distribution
    trace = pm.sample(25000, step=step, initvals=start)

Also, the model trace shows a lot of variability in the distribution of the predicted variable. Does this indicate that my model is a poor fit, or rather that there’s a lot of uncertainty in the predictions?

Any feedback is greatly appreciated

ricardoV94 · August 26, 2022, 12:33pm

You are recording p (via pm.Deterministic), which is going to have the same shape as your cleaned data, i.e. one entry per observation, so the summary shows one row for each.
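A NumPy-only sketch of why this happens, not the original model: the data and the "posterior draws" below are made up (hypothetical stand-ins for cleaned and the trace). It only shows that p is evaluated for every draw and every row, so averaging over draws still leaves one value per observation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for `cleaned`: 6 observations, 4 predictors (A-D).
n_obs = 6
X = rng.integers(0, 2, size=(n_obs, 4)).astype(float)

# Hypothetical posterior draws: 1000 samples of the intercept and 4 slopes.
n_draws = 1000
beta_0 = rng.normal(0.0, 1.0, size=n_draws)
slopes = rng.normal(0.0, 1.0, size=(n_draws, 4))

# Linear predictor and sigmoid, evaluated for every draw and every row.
eta = beta_0[:, None] + slopes @ X.T  # shape (n_draws, n_obs)
p = 1.0 / (1.0 + np.exp(-eta))

# A summary averages over draws, which leaves one value per observation.
p_mean = p.mean(axis=0)
print(p.shape, p_mean.shape)
```

If only the coefficients are of interest, az.summary accepts a var_names argument, e.g. az.summary(trace, var_names=["beta_0", "A", "B", "C", "D"]). Alternatively, dropping the pm.Deterministic wrapper means p is not stored in the trace at all.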

The uncertainty doesn’t look unreasonable given the spread of the parameters A-D.
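As a rough illustration of how the spread of the coefficients carries over to p, here is a small sketch under made-up assumptions: the observation x, the coefficient means (0.5) and the two spreads are all hypothetical, and independent normal draws stand in for a real posterior.

```python
import numpy as np

rng = np.random.default_rng(1)

# One hypothetical observation with values for A, B, C, D.
x = np.array([1.0, 0.0, 2.0, 1.0])


def p_interval(coef_sd, n_draws=5000):
    """Equal-tailed 94% interval of p for `x` given a coefficient spread."""
    beta_0 = rng.normal(0.0, coef_sd, size=n_draws)
    slopes = rng.normal(0.5, coef_sd, size=(n_draws, 4))
    p = 1.0 / (1.0 + np.exp(-(beta_0 + slopes @ x)))
    lo, hi = np.percentile(p, [3, 97])
    return lo, hi


narrow = p_interval(0.05)  # tightly determined coefficients
wide = p_interval(0.5)     # loosely determined coefficients
print("narrow:", narrow, "wide:", wide)
```

The wider the coefficient draws, the wider the interval for p, so a broad distribution for p(Class=1) can simply reflect uncertainty in A-D; it is not by itself evidence of a bad fit.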
