I’m trying to model a gaussian signal (pA) in a background (pB), where the model is
p_model(x | f, x0, s) = f * pA(x | x0, s) + (1-f) * pB(x)
where pA is gaussian with mean x0 and std dev s.
Given a set of data {x}, the likelihood is given by the product of p_model evaluated at each each x in {x}. Note {x} is ‘datax’ in the code below.
Here is a snippet of my code:
# Define stochastics
f = pymc.Uniform('f', fmin, fmax)
x0 = pymc.Uniform('x0', xmin, xmax)
@pymc.deterministic
def tau(sd=s_sd):
return sd**(-2)
s = pymc.Normal('s', s_mu, tau)
#define the likelihood. We must input logp, which is just the LL.
likelihood = pymc.stochastic_from_dist('likelihood',
logp=LogLikelihood, #this is a func defined elsewhere
dtype=np.float, mv=True)
pmodel = likelihood('model', f, x0, s, observed=True, value=datax)
model = dict(datax=datax, f=f, x0=x0, s=s, pmodel=pmodel)
N = pymc.NormApprox(model)
N.fit()
I occasionally get the following error
ValueError Traceback (most recent call last)
<ipython-input-6-c0a7f2f34e19> in <module>
30
31 N = pymc.NormApprox()
---> 32 N.fit()
33
34 mu = N.mu[N.f, N.x0]
~/anaconda3/envs/py3custom/lib/python3.6/site-packages/pymc/NormalApproximation.py in fit(self, *args, **kwargs)
587 self.grad_and_hess()
588
--> 589 self._C = -1. * self.hess.I
590 self._sig = msqrt(self._C).T
591
~/anaconda3/envs/py3custom/lib/python3.6/site-packages/numpy/matrixlib/defmatrix.py in getI(self)
834 else:
835 from numpy.dual import pinv as func
--> 836 return asmatrix(func(self))
837
838 def getA(self):
~/anaconda3/envs/py3custom/lib/python3.6/site-packages/scipy/linalg/basic.py in inv(a, overwrite_a, check_finite)
943
944 """
--> 945 a1 = _asarray_validated(a, check_finite=check_finite)
946 if len(a1.shape) != 2 or a1.shape[0] != a1.shape[1]:
947 raise ValueError('expected square matrix')
~/anaconda3/envs/py3custom/lib/python3.6/site-packages/scipy/_lib/_util.py in _asarray_validated(a, check_finite, sparse_ok, objects_ok, mask_ok, as_inexact)
237 raise ValueError('masked arrays are not supported')
238 toarray = np.asarray_chkfinite if check_finite else np.asarray
--> 239 a = toarray(a)
240 if not objects_ok:
241 if a.dtype is np.dtype('O'):
~/anaconda3/envs/py3custom/lib/python3.6/site-packages/numpy/lib/function_base.py in asarray_chkfinite(a, dtype, order)
496 if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
497 raise ValueError(
--> 498 "array must not contain infs or NaNs")
499 return a
500
ValueError: array must not contain infs or NaNs
The error occurs because the hessian seems to contain NaNs and therefore cannot be inverted. I think this is because when calculating the elements of the hessian, the first order derivative of self.func_for_diff is found to be inf, so the second order derivative is NaN.
I’ve been running the code for many different random realisations of data. For some datasets it works fine, but for others I get this error. I’ve tried tweaking parameters such as the tolerance, eps, floating point precision etc., it helps a bit, but I’ve been unable to prevent this error occurring for all of the datasets.
Is there a way to avoid this error? Or could it be that the normal approximation method just fails from time to time?
Thank you!