I would like to implement an extended likelihood function in my PyMC3 model. I've attempted using pm.DensityDist and pm.Potential, but I haven't had any luck so far.
In the code, I generate a series of measurements and then histogram them into a one-dimensional spectrum with a single Gaussian peak. I then use the model to estimate the location of that peak via an unbinned fit.
To estimate the height of the peak along with its location, I add an extra random variable to the model and an extra term to the likelihood. The new variable explores the amplitude of the peak, and the extra likelihood term constrains the model's predicted number of measurements to stay close to the observed number of measurements.
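For concreteness, the form I am aiming for (as I understand the standard extended maximum-likelihood construction) is

ln L_ext(mu, w) = -w + N*ln(w) + sum_{i=1}^{N} ln f(x_i; mu)

where N is the observed number of measurements (size in the code below), w is the expected yield, f is the unit-normalized Gaussian PDF, and constant terms are dropped.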
The code below shows a basic working example sans the extra likelihood term, as well as two failed attempts at including the extra likelihood term (commented out).
Would anyone be able to spot what I’ve done incorrectly? Thank you very much for your help.
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
################################################################################
# Generate data
################################################################################
# prepare plot ranges
xMin, xMax = 0, 20
xRange = (xMin, xMax)
xBinsN = 100
xBinWid = (xMax - xMin)/xBinsN
xBinCenters = np.arange(xMin + 0.5*xBinWid, xMax + 0.5*xBinWid, xBinWid)
# generate the observed data
loc = 4.2
scale = .6
size = 500 # number of observed counts
x_obs = np.random.normal(loc = loc, scale = scale, size = size)
# bin the observed data
hist_obs = np.histogram(a = x_obs, bins = xBinsN, range = xRange, density = False)[0]
# plot the measured data as a plt.errorbar() with poisson errors
plt.figure('fig_hist_obs')
plt.errorbar(x = xBinCenters, y = hist_obs, xerr = None, yerr = np.sqrt(hist_obs), color = 'k', marker = 'o', markersize = 3, linestyle = 'None', label = 'observed')
plt.legend()
################################################################################
# A working unbinned fit for mu
################################################################################
def logp(x, mu):
    return pm.Normal.dist(mu = mu, sd = scale).logp(x) # log of likelihood

with pm.Model() as model:
    mu = pm.Uniform('mu', lower = xMin, upper = xMax, testval = (xMax-xMin)/2.)
    LL = pm.DensityDist('LL', logp, observed = {'x': x_obs, 'mu': mu})
    step = pm.Metropolis([mu])
    trace = pm.sample(draws = 10000, step = step, chains = 1)
################################################################################
# Non-working unbinned fit for mu & w using an extended likelihood function
################################################################################
# def logp(x, mu, w):
#     LL = pm.Normal.dist(mu = mu, sd = scale).logp(x) # log of likelihood
#     L_integral = pm.math.sum( pm.math.exp(LL) ) * xBinWid # should equal 1 b/c PDF
#     L_integral = w * L_integral # scale PDF by w in hopes of matching data
#     LL_ext = pm.Poisson.dist(mu = L_integral).logp(size) # log of extended likelihood term
#     return LL_ext + LL # LL_ext + LL = log(L_ext * L)
# with pm.Model() as model:
#     mu = pm.Uniform('mu', lower = xMin, upper = xMax, testval = (xMax-xMin)/2.)
#     w = pm.Uniform('w', lower = 1., upper = 2*size, testval = 1.2*size)
#     LL_tot = pm.DensityDist('LL_tot', logp, observed = {'x': x_obs, 'mu': mu, 'w': w})
#     step = pm.Metropolis([mu, w])
#     trace = pm.sample(draws = 10000, step = step, chains = 1)
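Two things I am unsure about in the attempt above: pm.math.sum( pm.math.exp(LL) ) * xBinWid sums the PDF over the observed x values rather than over an evenly spaced grid, so it is not really a quadrature of the PDF, and the scalar LL_ext is added to the per-point LL vector, so it may be broadcast and counted once per data point. A grid-based variant I have been sketching (untested; xGrid and logp_grid are names I am introducing here, not from the code above) evaluates the integral on a grid and returns a scalar total instead:

xGrid = np.arange(xMin + 0.5*xBinWid, xMax, xBinWid) # evenly spaced grid over the fit range

def logp_grid(x, mu, w):
    LL = pm.math.sum( pm.Normal.dist(mu = mu, sd = scale).logp(x) ) # total log-likelihood (scalar)
    pdf_grid = pm.math.exp( pm.Normal.dist(mu = mu, sd = scale).logp(xGrid) ) # PDF evaluated on the grid
    L_integral = w * pm.math.sum(pdf_grid) * xBinWid # expected yield; ~w since the PDF integrates to ~1
    LL_ext = pm.Poisson.dist(mu = L_integral).logp(size) # log of extended likelihood term
    return LL + LL_ext # scalar, so LL_ext enters the total only once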
################################################################################
# Non-working unbinned fit for mu & w using pm.Potential
################################################################################
# def logp(x, mu):
#     return pm.Normal.dist(mu = mu, sd = scale).logp(x) # log of likelihood
# def logp_ext(x, mu, w):
#     LL = logp(x, mu)
#     L_integral = pm.math.sum( pm.math.exp(LL) ) * xBinWid # should equal 1 b/c PDF
#     L_integral = w * L_integral # scale PDF by w in hopes of matching data
#     return pm.Poisson.dist(mu = L_integral).logp(size) # log of extended likelihood term
# with pm.Model() as model:
#     mu = pm.Uniform('mu', lower = xMin, upper = xMax, testval = (xMax-xMin)/2.)
#     w = pm.Uniform('w', lower = 1., upper = 2*size, testval = 1.9*size)
#     LL = pm.DensityDist('LL', logp, observed = {'x': x_obs, 'mu': mu})
#     LL_ext = pm.Potential('LL_ext', logp_ext(x_obs, mu, w))
#     step = pm.Metropolis([mu, w])
#     trace = pm.sample(draws = 10000, step = step, chains = 1)
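For what it's worth, if the PDF integral over the fit range is effectively 1 (the peak sits well inside the (xMin, xMax) window), I would expect the integral term to reduce to w, so the extended piece becomes a plain Poisson term in w alone. A minimal untested sketch of that simplification, reusing logp from the working fit above:

with pm.Model() as model:
    mu = pm.Uniform('mu', lower = xMin, upper = xMax, testval = (xMax-xMin)/2.)
    w = pm.Uniform('w', lower = 1., upper = 2*size, testval = 1.2*size)
    LL = pm.DensityDist('LL', logp, observed = {'x': x_obs, 'mu': mu})
    # extended term: log Poisson(N | w); as a Potential it is added to the joint log-probability once
    LL_ext = pm.Potential('LL_ext', pm.Poisson.dist(mu = w).logp(size))
    step = pm.Metropolis([mu, w])
    trace = pm.sample(draws = 10000, step = step, chains = 1)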
################################################################################
# Get results of fit
################################################################################
print(pm.summary(trace))
pm.traceplot(trace);
mu = loc
if 'mu' in pm.summary(trace).index:
    mu = pm.summary(trace).loc['mu']['mean']
w = size
if 'w' in pm.summary(trace).index:
    w = pm.summary(trace).loc['w']['mean']
print('mu, w:', mu, w)
def result(x, mu):
    return pm.math.exp(pm.Normal.dist(mu = mu, sd = scale).logp(x)).eval()
print('size:', size)
print('should equal 1:', np.sum(result(xBinCenters, mu) * xBinWid))
print('should equal size:', np.sum(result(xBinCenters, mu) * w * xBinWid))
plt.figure('fig_result')
plt.errorbar(x = xBinCenters, y = hist_obs, xerr = None, yerr = np.sqrt(hist_obs), color = 'k', marker = 'o', markersize = 3, linestyle = 'None', label = 'observed')
plt.plot(xBinCenters, result(xBinCenters, mu) * w * xBinWid, color = 'tab:red', linewidth=1, dashes = [2, 2, 2, 2], label = 'model')
plt.legend()
plt.show()