A long-time frequentist here trying to learn new ways. My science question is related to dynamic light scattering. The basic problem is fitting the auto-correlation curves, which can be approximated as a series of exponential decays. For the math folks, this is a Fredholm problem of the first kind with an exponential kernel. It is notoriously ill-conditioned because the solution is effectively non-unique. The classic way to handle it is to build a regularizer on the second derivative of the solution.
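Roughly, in my notation, the continuous problem is

$$ g(\tau) = b + \int_0^\infty A(t)\, e^{-\tau/t}\, dt , $$

and the classic regularized fit picks a discretized, non-negative $A$ by minimizing something like

$$ \lVert y - (b + E a) \rVert^2 + \alpha \lVert L_2 a \rVert^2 , \qquad E_{ji} = e^{-\tau_j / t_i} , $$

where $L_2$ is a second-difference matrix and $\alpha$ sets the smoothing strength.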
I have built a simulation that illustrates the basic idea. I first set up a grid of fixed decay times and then try to select a series of non-negative coefficients that fit the data (the short NumPy sketch just below shows what I mean by the grid). The problem is that it runs pretty slowly. If I set target_accept too low (0.9) I get divergences; if I set it too high (0.98) I get max-tree-depth warnings. I suspect it is not sampling efficiently. My question is: does anyone see a way to do this problem better? Note I'm on Windows 10, so I can't seem to get multiple cores to work.
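To make the words concrete, here is a minimal NumPy-only sketch of the discretization (no PyMC; the variable values are just placeholders for illustration):

import numpy as np

# grid of candidate decay times and measurement lag times
t = np.logspace(-2, 2, num=30)      # candidate decay times
x = np.logspace(-2, 2, num=400)     # lag times of the correlation data

# kernel (design) matrix: E[j, i] = exp(-x_j / t_i)
E = np.exp(-x[:, None] / t[None, :])

# with non-negative weights a and background b, the model curve is
a = np.full(30, 1.0 / 30)           # placeholder non-negative weights
b = 0.1                             # placeholder background
model_curve = b + E @ a             # same curve the term-by-term sum in my model builds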
Here is the code:
# Carl Houtman 4/10/25
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
import xarray as xr
import sys
import pytensor
pytensor.config.blas__ldflags = '-L\"C:/Program Files (x86)/Intel/oneAPI/mkl/2025.0/lib\" -lmkl_core -lmkl_intel_lp64 -lmkl_sequential'
az.style.use("arviz-darkgrid")
size = 400
true_bkgrnd = 0.1
true_amp1 = 0.5
true_time1 = 1.0
true_amp2 = 0.5
true_time2 = 10.0
true_noise = 0.01
# build data set (note log scale)
# y = background + a0 * exp(-x/t0) + a1 * exp(-x/t1)
x = np.logspace(-2, 2, num = size)
true_regression_line = true_bkgrnd + true_amp1 * np.exp( - x / true_time1) + true_amp2 * np.exp( - x / true_time2)
y = true_regression_line + np.random.normal(scale=true_noise, size=size) # add noise
# set the number of time points and build an array
number_times = 30
t = np.logspace(-2, 2, num = number_times)
# construct a model
with pm.Model() as gauss_model:
    bkgrnd = pm.Gamma('bkgrnd', mu=1, sigma=0.5)           # baseline of the correlation curve
    a = pm.Exponential('a', lam=2, size=number_times)      # non-negative amplitude for each grid time
    sigma = pm.HalfCauchy("sigma", beta=2)                  # measurement noise
    # mean curve: background plus the sum of exponential decays on the fixed time grid
    lam = bkgrnd + sum(a[i] * np.exp(-x / t[i]) for i in range(number_times))
    obs = pm.Normal('y', lam, sigma=sigma, observed=y)
    trace = pm.sample(draws=1000,
                      cores=1,
                      chains=4,
                      target_accept=0.95,
                      progressbar=False,
                      idata_kwargs={"log_likelihood": True})
    posterior_predictive = pm.sample_posterior_predictive(trace).posterior_predictive
print(az.summary(trace))
#az.plot_pair(trace, marginals=True, point_estimate="median", figsize=(12, 12), scatter_kwargs={"alpha": 0.01})
#az.plot_trace(trace)
#az.plot_posterior(trace)
a_array = np.array(trace.posterior["a"].mean(dim=("chain", "draw")))
a_std = np.array(trace.posterior["a"].std(dim=("chain", "draw")))
y_predict = np.array(posterior_predictive["y"].mean(dim=("chain", "draw")))
y_sd = np.array(posterior_predictive["y"].std(dim=("chain", "draw")))
y_upper = y_predict + y_sd
y_lower = y_predict - y_sd
fig, ax = plt.subplots(2,1,figsize=(7, 7))
ax[0].plot(x,y, "b.")
ax[0].plot(x,y_predict, "b-")
ax[0].plot(x,y_upper, "r-")
ax[0].plot(x,y_lower, "r-")
ax[0].set_xlim(0.01,100)
ax[0].set_ylim(0,1.5)
ax[0].set_xscale("log")
ax[0].set_xlabel("tau [ms]")
ax[1].plot(t,a_array, "k-")
ax[1].set_xlim(0.01,100)
ax[1].set_ylim(0,1)
ax[1].set_xscale("log")
ax[1].set_xlabel("tau [ms]")
plt.show()
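One thing I wondered about but have not tested: replacing the Python-level sum over grid times with a single matrix product, so the graph contains one dot product instead of 30 separate terms. A sketch of what I mean (reuses x, t, number_times, and y from the script above; any speed-up is only my assumption):

# precompute the kernel matrix once as a NumPy constant
E = np.exp(-x[:, None] / t[None, :])        # shape (size, number_times)

with pm.Model() as gauss_model_vec:
    bkgrnd = pm.Gamma('bkgrnd', mu=1, sigma=0.5)
    a = pm.Exponential('a', lam=2, size=number_times)
    sigma = pm.HalfCauchy("sigma", beta=2)
    # same mean curve as before, written as one matrix-vector product
    lam = bkgrnd + pm.math.dot(E, a)
    obs = pm.Normal('y', lam, sigma=sigma, observed=y)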