Observed variable is an integral of a spline defined by PyMC variables

Hi all,

I’ve been struggling on this example for a while and can’t seem to get the right approach.

For my observed data I have many tuples of [A, B, Y] where Y = Integral of F(x) from A to B + Error. I am trying to define F as a simple spline with e.g. 5 knots with the knots themselves being PyMC variables.

Ideally this would allow me to see the posterior distribution on each of the knots and have a sense of what spline fits my data best.

However the result of the below is quite poor – with the betas effectively having HPDs across the entire space prior.

aps = theano.shared(dummydata.ap.values)
dps = theano.shared(dummydata.dp.values)
dmsact = theano.shared(dummydata.dm.values)
knot_vals = np.linspace(0, 100, 5)
knot_vals_ = theano.shared(knot_vals)

@theano.compile.ops.as_op(itypes=[t.dvector, t.dvector, t.dvector, t.dvector], otypes=[t.dvector])
def predict(knot_vals, betas, ap, dp):
    spl = UnivariateSpline(knot_vals, betas)
    return spl.integral(ap, dp) / (dp - ap)
  with pm.Model() as model:
    betas = pm.Uniform("betas", lower=0, upper=1000, shape=knot_vals.shape[0])
    dmspred = predict(knot_vals_, betas, aps, dps)    
    sigma = pm.HalfNormal("sigma", sigma=1)
    dmsobs = pm.Normal('dm_obs', mu=dmspred, sigma=sigma, observed=dmsact)