I’m a PyMC3 (and Bayesian coding) novice, trying to sample what should be a very simple model. Specifically, I’m trying to fit a straight line where two sources contribute to the observed scatter around the relation: first, the measurement uncertainties (which I have for each point); and second, an intrinsic scatter.
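In other words, the generative model I have in mind is the following (writing m and b for the gradient and intercept, sigma_i for the per-point measurement uncertainties, and psi for the intrinsic scatter, corresponding to Gradient, Intercept, y_uncs and Psi in the code below):

y_i ~ Normal( mu = m*x_i + b,  sd = sqrt(psi**2 + sigma_i**2) )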
However, whenever I try to sample this, the NUTS sampler crashes with the error: ‘ValueError: Bad initial energy: inf. The model might be misspecified.’
Oddly, the pm.Metropolis sampler does work, but it is far slower (needing >250000 draws to get comfortably past the autocorrelation). I struggle to see how a model this simple could be so badly misspecified that it crashes the sampler.
Any help much appreciated! Complete code as follows:
# Import smorgasbord
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
import theano.tensor as tt
# Data points (hardcoded for this example)
x_points = np.array([0.48574, 0.54206, 0.50821, 0.44593, 0.40873, 0.20569, 0.08038, 0.30518, 0.0117, 0.29974, 0.28977, 0.32806, 0.32496, 0.38357, 0.45617, 0.50823, 2.51709, 1.98823, 1.78779, 1.58633, 1.23946, 1.25441, 1.26188, 1.12205, 0.94541, 1.19891, 0.76849, 0.72581, 1.46885, 0.66112, 0.74459, 0.75426, 2.10072, 0.68885, 0.67268, 0.61371, 2.45453, 1.6277, 2.37253, 1.13018, 2.12139, 1.59796, 1.58462, 0.62712, 1.34953, 1.18484, 1.25131, 1.22118, 0.76266, 1.13954, 0.76647, 1.11719, 0.67664, 0.98378, 1.4421, 1.09343])
y_points = np.array([8.62361, 8.58473, 8.67064, 8.66751, 8.61314, 8.62788, 8.66276, 8.62828, 8.72265, 8.67101, 8.65285, 8.58985, 8.68368, 8.63772, 8.61623, 8.57461, 8.3868, 8.35645, 8.37346, 8.32565, 8.48552, 8.43744, 8.41082, 8.45783, 8.47807, 8.29072, 8.54428, 8.65165, 8.4924, 8.57238, 8.57266, 8.58867, 8.35816, 8.62443, 8.5474, 8.56266, 8.25485, 8.38961, 8.30199, 8.46212, 8.4369, 8.37571, 8.25825, 8.58458, 8.37271, 8.35321, 8.36903, 8.33309, 8.62743, 8.36992, 8.61193, 8.38307, 8.61207, 8.54401, 8.42055, 8.53708])
y_uncs = np.array([0.0248, 0.03579, 0.02606, 0.02971, 0.03022, 0.02502, 0.02804, 0.03037, 0.03231, 0.03425, 0.02947, 0.02574, 0.03058, 0.04417, 0.03605, 0.03535, 0.02104, 0.06762, 0.06982, 0.02122, 0.0294, 0.0323, 0.03731, 0.02887, 0.0274, 1.22413, 0.03399, 0.01714, 0.07119, 0.03327, 0.03677, 0.02801, 0.0187, 0.02771, 0.02941, 0.02072, 0.05694, 0.03161, 0.03679, 0.01321, 0.02408, 0.01769, 0.07938, 0.02357, 0.08276, 0.01337, 0.01767, 0.06916, 0.01688, 0.01702, 0.0297, 0.01779, 0.01614, 0.04123, 0.0342, 0.02855])
# Initially, use simple least-squares fit to inform priors and test values
leastsq_fit = np.polyfit(x_points, y_points, 1, w=y_uncs**-1)
# Set up PyMC3 model of the relation
model = pm.Model()
with model:
    # Specify prior for intercept
    intercept = pm.Normal('Intercept', mu=leastsq_fit[1], sd=y_points.std(), testval=leastsq_fit[1])
    # Specify prior for gradient
    gradient = pm.Normal('Gradient', mu=leastsq_fit[0], sd=leastsq_fit[0], testval=leastsq_fit[0])
    # Determine expected value
    z_expected = ( gradient * x_points ) + intercept
    # Calculate residuals, to inform psi
    residuals = z_expected - y_points
    # Specify prior for psi, the scale of the additional scatter arising from intrinsic variation
    psi = pm.HalfNormal('Psi', sd=tt.mean(tt.abs_(residuals)), testval=tt.mean(tt.abs_(residuals)))
    # Determine combined scatter by adding psi and uncertainties in quadrature
    sigma = ( psi**2.0 + y_uncs**2.0 )**0.5
    # Specify likelihood distribution of points
    z_likelihood = pm.Normal('z_likelihood', mu=z_expected, sd=sigma, observed=y_points)
    # Sample posterior distribution
    trace = pm.sample(draws=5000, njobs=4, tune=1000)
    #trace = pm.sample(step=pm.Metropolis(), draws=100000, njobs=4, tune=5000)
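In case it helps with diagnosing the ‘Bad initial energy’ message, my understanding is that one can inspect the starting point and the per-variable log-probabilities there with something like the sketch below (assuming PyMC3’s Model.test_point attribute and Model.check_test_point() method, which I believe exist in 3.x):

# Sketch (assumption): look at the model's starting values and the
# log-probability of each variable evaluated at that starting point,
# to see which term (if any) comes out as -inf.
print(model.test_point)           # initial values the sampler starts from
print(model.check_test_point())   # per-variable log-probability at that point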