LogitNormal vs. Beta vs. Logistic

samslusher · August 14, 2018, 5:53pm

Hi -

I am attempting to predict a percentage variable (bounded by 0 and 1, with no instances of 0 or 1 occurring), using a continuous normal variable and a bounded normal variable. Here is my code:

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
from theano import shared

homespread = shared(data.HomeSpread.values)
homeodds = shared(data.ImpliedHomeOdds.values)
y=data.MidML_Home.values

with pm.Model() as model: 

    # Define priors
    intercept = pm.Normal('Intercept', 0, sd=10)
    x0 = pm.Normal('x0', mu=0, sd=20)
    x1 = pm.Normal('x1', mu=0.5, sd=.1)
    
    y_est = pm.math.sigmoid(intercept+x0*homespread+x1*homeodds)
    model_err = pm.Normal('model_err',mu=0.5,sd=.1)

    # Data likelihood
    y_like = pm.LogitNormal('y_like',mu=y_est,sd=model_err,observed=y)

    trace = pm.sample(20000,tune=5000)

With the LogitNormal likelihood, the model (1) does not seem to handle values of the independent variables above 0 appropriately and (2) predicts a tighter posterior than occurs in reality, as tested via:

homespread.set_value(np.array([-7.]))
homeodds.set_value(np.array([.5]))

ppc = pm.sample_ppc(trace, model=model, samples=10000)
_, ax = plt.subplots(figsize=(12, 6))
ax.hist([n.mean() for n in ppc['y_like']], bins=19, alpha=0.5)
ax.axvline(data[data['HomeSpread']==-7.].MidML_Home.mean())
ax.set(title='Posterior predictive of the mean', xlabel='mean(x)', ylabel='Frequency');

Any suggestions for how best to model this problem in PyMC3 would be much appreciated. Of note, I have also tried a Beta distribution in place of the LogitNormal, but I struggled to identify appropriate alpha/beta priors and ended up with a posterior far too tight around .5.

Thanks!

junpenglao · August 15, 2018, 4:32am

Without the data or a figure it is a bit difficult to say what the problem is, so just FYI: you can parameterize the Beta distribution with a mean and sd, as long as the sd satisfies sd < sqrt((1-mu)*mu).
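
To make the mean/sd parameterization concrete, here is a minimal sketch of the moment matching behind it, in plain Python. The helper name beta_shape_from_mean_sd is hypothetical and not part of PyMC3; it only shows which alpha and beta a given mean and sd correspond to, and why the variance has to stay below mu*(1-mu).

```python
import math


def beta_shape_from_mean_sd(mu, sd):
    """Convert a mean/sd pair into Beta(alpha, beta) shape parameters.

    Hypothetical helper for illustration only (not a PyMC3 function).
    """
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    max_var = mu * (1.0 - mu)  # upper bound on the variance of a Beta
    var = sd * sd
    if not 0.0 < var < max_var:
        raise ValueError("sd must satisfy 0 < sd**2 < mu * (1 - mu)")
    kappa = max_var / var - 1.0  # kappa = alpha + beta
    return mu * kappa, (1.0 - mu) * kappa


# Example: a Beta centred on 0.5 with standard deviation 0.1
alpha, beta = beta_shape_from_mean_sd(0.5, 0.1)
print(alpha, beta)
```

In the model above, the same idea would mean passing the sigmoid output as mu and a positive-only variable as sd to pm.Beta, rather than putting priors on alpha and beta directly.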
