Hello everyone, I’m having a problem with initializing my model in Jupyter notebooks. I am building a simple model for modeling the spreading of COVID-19, and am using a dataset from The Covid Tracking Project. I have the following model:
def sampleMCMC(data, pop):
# splitting data into infections and time as numpy arrays
dataDeath = data['death'].to_numpy()
time = np.linspace(0,len(data)-1, len(data))
# establishing model
with pm.Model() as model:
# create population number priors
error = pm.Normal('error', mu=0, sigma=pop/1000)
i0 = pm.Poisson('i0', mu=pop/100 + error)
s0 = pm.Deterministic('s0', pop - i0)
# create starting values based on data, does not inform inference but starts at a reasonable value
# start beta as average of positive-to-test and positive-to-population
beta_start = float((data['positive'].iloc[-1]/data['total'].iloc[-1] + data['positive'].iloc[-1]/pop) / 2)
# start gamma as average of recovered-to-positive and recovered-to-population
gamma_start = float((data['recovered'].iloc[-1]/data['total'].iloc[-1] + data['recovered'].iloc[-1]/pop) / 2)
# start rho as average of death-to-positive and death-to-population
rho_start = float((data['death'].iloc[-1]/data['positive'].iloc[-1] + data['death'].iloc[-1]/pop) / 2)
# creating priors for beta, gamma, and rho
beta = pm.InverseGamma('beta', mu=.05, sigma=.5, testval=beta_start)
gamma = pm.InverseGamma('gamma', mu=.047, sigma=.5, testval=gamma_start)
rho = pm.TruncatedNormal('rho', mu=.036, sigma=.1, lower=0, upper=1, testval=rho_start)
# create number of removed based on analytic solution and above parameters
sirRem = pm.Deterministic('sirRem',
pop - ((s0 + i0)**(beta/(beta - gamma)))*
(s0 + i0*tt.exp(time*(beta - gamma)))**(-gamma/(beta - gamma)))
# create number of deaths as a fraction of number of removed
sirDeath = pm.Deterministic('sirDeath', rho*sirRem)
# create variance prior
sigma = pm.HalfCauchy('sigma', beta=4)
# create likelihood with modelled counts and observed counts
obsDeath = pm.TruncatedNormal('obsDeath', mu=sirDeath, sigma=sigma,
lower=0, upper=pop, observed=dataDeath)
# specifying model conditions
step=pm.NUTS(target_accept=.99)
start=pm.find_MAP()
init="advi+adapt_diag"
print(model.check_test_point())
# execute sampling
model_trace = pm.sample(draws=500, tune=500, step=step, start=start, init=init, chains=8)
# plot results
pm.plot_trace(model_trace, var_names=('i0','beta','gamma','rho','sigma'))
pm.plot_posterior(model_trace, var_names=('i0','beta','gamma','rho','sigma'))
# return posterior samples and other information
return model_trace
This is the most recent of many many many iterations, and I’m quite happy with how it’s performing on data for about half the states in the US. For many of the other states, however, I get stuck in the following place:
I find this error really weird because none of my variables themselves are nan, but I get that “logp = nan” next to the progress bar. I’m at a loss for why this doesn’t work for some states. It works like a charm for others.
Any help would be greatly appreciated!