Different approaches to the Challenger Example


I’m quite new to probabilistic programming and am trying to solve a problem that I just realized is very similar to the Challenger example from Chapter 2 of Cam Davidson-Pilon’s Bayesian Methods for Hackers:

I.e., my data is a sequence of tuples (x, t), where x is 0 or 1 and t is a positive number. The probability p that x is 1 depends on the value of t in some way.

I’m still getting my head around the probabilistic programming paradigm, but I would like to do something a bit more flexible than what is done in the Challenger example. There, the problem is to model a failure rate as a function of temperature, and the model is the following:

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:
    beta = pm.Normal("beta", mu=0, tau=0.001, testval=0)
    alpha = pm.Normal("alpha", mu=0, tau=0.001, testval=0)
    p = pm.Deterministic("p", 1.0 / (1.0 + tt.exp(beta * temperature + alpha)))

    observed = pm.Bernoulli("bernoulli_obs", p, observed=failure)
    trace = ....

I.e., it’s essentially a logistic regression. Now I’m wondering what other approaches there are to this. In particular, in my case it would also be interesting to model the distribution of the t variable itself, so I’m thinking I want the final outcome to be a joint posterior distribution over p and t.
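To make the "joint posterior" idea concrete, here is a minimal sketch, deliberately written with plain NumPy and a hand-rolled Metropolis sampler rather than PyMC3, and with made-up parameter values: give t its own parametric model (an Exponential with rate lam is assumed here purely for illustration), let p depend on t through a logistic link, and sample (alpha, beta, lam) jointly. Because all three parameters enter one joint log-posterior, the samples naturally carry any posterior covariance between the p-curve and the t-distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data in the (x, t) format described above (hypothetical values):
# t is a positive number, x is 0/1 with P(x=1) depending on t.
true_lam, true_alpha, true_beta = 1.0, -2.0, 1.5
t = rng.exponential(1.0 / true_lam, size=200)
p_true = 1.0 / (1.0 + np.exp(-(true_alpha + true_beta * t)))
x = rng.binomial(1, p_true)

def log_post(alpha, beta, lam):
    """Joint log-posterior: logistic likelihood for x, Exponential model
    for t, and flat priors (an assumption made for this sketch)."""
    if lam <= 0:
        return -np.inf
    eta = alpha + beta * t
    # Bernoulli log-likelihood, written with log1p for numerical stability
    ll_x = np.sum(x * eta - np.log1p(np.exp(eta)))
    ll_t = np.sum(np.log(lam) - lam * t)  # Exponential(lam) density for t
    return ll_x + ll_t

# Random-walk Metropolis over (alpha, beta, lam) jointly
theta = np.array([0.0, 0.0, 1.0])
cur = log_post(*theta)
samples = []
for _ in range(5000):
    prop = theta + rng.normal(0, 0.1, size=3)
    cand = log_post(*prop)
    if np.log(rng.uniform()) < cand - cur:  # symmetric-proposal accept rule
        theta, cur = prop, cand
    samples.append(theta)
samples = np.array(samples[1000:])  # drop burn-in
```

In PyMC3 the same structure would just be two observed variables in one model block (one for t, one for x), which gives you the joint trace automatically; the point of the sketch is only that a single joint log-posterior is what produces the covariance you were looking for.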

How would I achieve this? I’ve tried specifying priors for p and t separately, but I don’t see any covariance in the posterior. Do I need to specify a joint prior using DensityDist? I tried this as well but couldn’t get the code to run:

def logp(p, V):
    # Super simple uninformative priors...
    # (.dist() is needed to evaluate a distribution outside a model context)
    return pm.Uniform.dist().logp(p) + pm.Uniform.dist().logp(V)

with pm.Model() as model:
    joint = pm.DensityDist('joint', logp) 
    #Now what?

Any suggestions? Please let me know if I need to be more specific and I’ll try my best.


I think this is usually modelled using a Poisson point process. I have never fit one myself, but using the GP module you can do something similar: http://docs.pymc.io/notebooks/GP-slice-sampling.html#Gaussian-Process-Classification