Hi everyone,
I need to fit some experimental data using a model that can be expressed as: y(t)=N(S_0.f(t, \theta),\sigma^2 where S_0 is a scaling parameter, f(t, \theta) is a non linear function of the time t and of the model parameters \theta, and \sigma is the standard deviation of the noise.
Using non informative priors on S_0 and \sigma^2, the integration of the likelihood over S_0 and \sigma^2 yields the marginalized likelihood, which is expressed as:
p(\mathbf{y} \vert \theta) \propto [\mathbf{y}^T.\mathbf{y} - (\mathbf{y}^T.\mathbf{f})^2/(\mathbf{f}^T.\mathbf{f})]^{-N/2}
\mathbf{y} is the vector containing the N observations of y(t), sampled at time t, and \mathbf{f} is the vector containing the model f(t,\theta) sampled at the same time t for a given set of parameters \theta.
To keep it simple, lets assume that f is a mono exponential decay model, where \theta is the characteristic time:
f(t,\theta)=(1-\exp(-t/\theta)).
Furthermore, lets assume that the prior on \theta is normal prior with large standard deviation: p(\theta)\ \sim N(0,100).
How can I build such a custom likelihood function? I have been looking on this forum and on the documentation of PyMC3, so I guess that the solution will use the function DensityDist
and an implementation of the function f using Theano. Here is my code:
import pymc3 as pm
import numpy as np
from matplotlib import pyplot as plt
import theano.tensor as T
def logp(theta, time, y):
f = (1 - T.exp(-time/ theta))
yty = T.dot(y, y)
ytf = T.dot(y, f)
ftf = T.dot(f, f)
N = T.shape(time)[0]
return -N / 2 * (yty - ytf ** 2 / ftf)
if __name__ == '__main__':
# Generate noisy data
S0=10
sigma=1
theta=3
time=np.linspace(0,9,10)
mu=S0*(1-np.exp(-time/theta))
y=np.random.normal(mu,sigma)
with pm.Model() as model:
theta=pm.Normal('theta',mu=0, sd=100)
likelihood=pm.DensityDist('Likelihood', logp, observed={'theta':theta, 'time':time, 'y':y})
# trace = pm.sample(1000, step=pm.Metropolis())
trace = pm.sample(1000)
pm.traceplot(trace)
plt.show()
The code throw an error if I use the default NUTS algorithm
ValueError: Bad initial energy: nan. The model might be misspecified.
“”"
The above exception was the direct cause of the following exception:
ValueError: Bad initial energy: nan. The model might be misspecified.
The above exception was the direct cause of the following exception:
If I change the sampling method to Metropolis
, there is no error but the analysis of the trace shows that \theta is constant on the full length of the chain.
So now I am stuck and I don’t know how to go further.
Could someone explain me how I can build my likelihood and use it in PyMC3?
Thanks in advance,
Best