Help with Censored Regression

Censored model solved :slight_smile:

Correct model is this…

def censored_regression(x, y, bounds):
    
    # data pre-processing
    left_censored = (y <= bounds[0])
    x_lc = x[left_censored]
    y_lc = y[left_censored]
    
    right_censored = (y >= bounds[1])
    x_rc = x[right_censored]
    y_rc = y[right_censored]
    
    uncensored = (y>bounds[0]) & (y<bounds[1])
    x = x[uncensored]
    y = y[uncensored]

    with pm.Model() as model:
        m = pm.Normal("m", mu=0, sd=1)
        c = pm.Normal("c", mu=0, sd=1)
        σ = pm.HalfNormal("σ", sd=1)
        y_likelihood = pm.Normal("y_likelihood", mu=m * x + c, sd=σ, observed=y)
        left_censored = pm.Potential("left_censored", normal_lcdf(m * x_lc + c, σ, y_lc))
        right_censored = pm.Potential("right_censored", normal_lccdf(m * x_rc + c, σ, y_rc))

    with model:
        trace = pm.sample()

    return model, trace


If people can confirm that this looks kosher, then I’m happy to put the truncated and censored regression code together in an example notebook.

4 Likes

Looks good to me!

1 Like

@AlexAndorra I was suggesting to @drbenvincent to develop an example of Censored and Truncated regression to the pymc-examples, since he has spent quite some time tinkering and debugging about these in pymc3. Since you are more familiar with the pymc-examples, could you share your opinion as to whether this makes sense (I don’t want to waste anyone’s time :slight_smile: )

I’ve made a pull request for a notebook on truncated regression. I’m “ok” with GitHub, so I have little confidence that I’ve done it correctly.