# Problems when changing the likelihood slightly

Hi all. I’ve been successfully running a hierarchical Bayesian model for repeated measurements, then changed the likelihood model and got a long error message I don’t understand.

In my problem, each patient has an unequal number of repeated measurements, the values and times are stored in the 1 dimensional arrays Y_flat_no_nans and t_flat_no_nans. The structure of which measurements belong to which patients is stored in group_id, a list of patient ids.
Additionally, each patient has a parameter set: psi, rho_r, rho_s and pi_r.
To get the mean of the likelihood from the parameters, group_id is used to index into the parameter arrays.

The change that caused the error was to change the likelihood from two populations r and s growing and decreasing independently:
M(t_{ij}; \psi_i, \pi_i^r, \rho_i^r, \rho_i^s) = \psi_i^0 \Big( \pi_i^r \exp\Big(\rho_i^r t_{ij}\Big) + (1-\pi_i^r) \exp \Big(\rho_i^s t_{ij}\Big) \Big), which in vector form is:

mu_Y = psi[group_id] * (pi_r[group_id]*np.exp(rho_r[group_id]*t_flat_no_nans) + (1-pi_r[group_id])*np.exp(rho_s[group_id]*t_flat_no_nans))


to a model where the two populations interact:
M(t_{ij}; \psi_i, \pi_i^r, \rho_i^r, \rho_i^s) = \psi_i^0 \Big( \pi_i^r \exp\Big(\rho_i^r \psi_i^0(1-\pi_i^r) (t_{ij} + \frac{1 - e^{\rho_i^s t_{ij}}}{\rho_i^s}) \Big) + (1-\pi_i^r) \exp \Big(\rho_i^s t_{ij}\Big) \Big), which is:

mu_Y = psi[group_id] * (pi_r[group_id]*np.exp(rho_r[group_id] * psi[group_id] * (1-pi_r[group_id]) * (t_flat_no_nans + (1 - np.exp(rho_s[group_id]*t_flat_no_nans)/(rho_s[group_id])))) + (1-pi_r[group_id])*np.exp(rho_s[group_id] * t_flat_no_nans))


Here’s the code; I’ll add the error message below.

from utilities import *
import arviz as az
import pymc as pm
##############################
# Function argument shapes:
# X is an (N_patients, P) shaped pandas dataframe
# patient dictionary contains N_patients patients in the same order as X

def linear_model(X, patient_dictionary):
df = pd.DataFrame(columns=["patient_id", "mprotein_value", "time"])
for ii in range(len(patient_dictionary)):
patient = patient_dictionary[ii]
mprot = patient.Mprotein_values
times = patient.measurement_times
for jj in range(len(mprot)):
single_entry = pd.DataFrame({"patient_id":[ii], "mprotein_value":[mprot[jj]], "time":[times[jj]]})
df = pd.concat([df, single_entry], ignore_index=True)
group_id = df["patient_id"].tolist()
Y_flat_no_nans = np.array(df["mprotein_value"].tolist())
t_flat_no_nans = np.array(df["time"].tolist())

N_patients, P = X.shape
P0 = int(P / 2) # A guess of the true number of nonzero parameters is needed for defining the global shrinkage parameter
X_not_transformed = X.copy()
X = X.T
yi0 = np.zeros(N_patients)
for ii in range(N_patients):
yi0[ii] = patient_dictionary[ii].Mprotein_values[0]

with pm.Model(coords={"predictors": X_not_transformed.columns.values}) as linear_model:
# Observation noise (std)
sigma_obs = pm.HalfNormal("sigma_obs", sigma=1)

# alpha
alpha = pm.Normal("alpha",  mu=np.array([np.log(0.002), np.log(0.002), np.log(0.5/(1-0.5))]),  sigma=1, shape=3)

# beta (with horseshoe priors):
# Global shrinkage prior
tau_rho_s = pm.HalfStudentT("tau_rho_s", 2, P0 / (P - P0) * sigma_obs / np.sqrt(N_patients))
tau_rho_r = pm.HalfStudentT("tau_rho_r", 2, P0 / (P - P0) * sigma_obs / np.sqrt(N_patients))
tau_pi_r = pm.HalfStudentT("tau_pi_r", 2, P0 / (P - P0) * sigma_obs / np.sqrt(N_patients))
# Local shrinkage prior
lam_rho_s = pm.HalfStudentT("lam_rho_s", 2, dims="predictors") # shape = (P,) without predictors
lam_rho_r = pm.HalfStudentT("lam_rho_r", 2, dims="predictors")
lam_pi_r = pm.HalfStudentT("lam_pi_r", 2, dims="predictors")
c2_rho_s = pm.InverseGamma("c2_rho_s", 1, 0.1)
c2_rho_r = pm.InverseGamma("c2_rho_r", 1, 0.1)
c2_pi_r = pm.InverseGamma("c2_pi_r", 1, 0.1)
z_rho_s = pm.Normal("z_rho_s", 0.0, 1.0, dims="predictors")
z_rho_r = pm.Normal("z_rho_r", 0.0, 1.0, dims="predictors")
z_pi_r = pm.Normal("z_pi_r", 0.0, 1.0, dims="predictors")
# Shrunken coefficients
beta_rho_s = pm.Deterministic("beta_rho_s", z_rho_s * tau_rho_s * lam_rho_s * np.sqrt(c2_rho_s / (c2_rho_s + tau_rho_s**2 * lam_rho_s**2)), dims="predictors")
beta_rho_r = pm.Deterministic("beta_rho_r", z_rho_r * tau_rho_r * lam_rho_r * np.sqrt(c2_rho_r / (c2_rho_r + tau_rho_r**2 * lam_rho_r**2)), dims="predictors")
beta_pi_r = pm.Deterministic("beta_pi_r", z_pi_r * tau_pi_r * lam_pi_r * np.sqrt(c2_pi_r / (c2_pi_r + tau_pi_r**2 * lam_pi_r**2)), dims="predictors")

# Latent variables theta
omega = pm.HalfNormal("omega",  sigma=1, shape=3) # Patient variability in theta (std)
theta_rho_s = pm.Normal("theta_rho_s", mu= alpha[0] + pm.math.dot(beta_rho_s, X), sigma=omega[0]) # Individual random intercepts in theta to confound effects of X
theta_rho_r = pm.Normal("theta_rho_r", mu= alpha[1] + pm.math.dot(beta_rho_r, X), sigma=omega[1]) # Individual random intercepts in theta to confound effects of X
theta_pi_r  = pm.Normal("theta_pi_r",  mu= alpha[2] + pm.math.dot(beta_pi_r, X),  sigma=omega[2]) # Individual random intercepts in theta to confound effects of X

xi = pm.HalfNormal("xi", sigma=1)
log_psi = pm.Normal("log_psi", mu=np.log(yi0), sigma=xi, shape=N_patients)
psi = pm.Deterministic("psi", np.exp(log_psi))

# Transformed latent variables
rho_s = pm.Deterministic("rho_s", -np.exp(theta_rho_s))
rho_r = pm.Deterministic("rho_r", np.exp(theta_rho_r))
pi_r  = pm.Deterministic("pi_r", 1/(1+np.exp(-theta_pi_r)))

# Old, works fine:
#mu_Y = psi[group_id] * (pi_r[group_id]*np.exp(rho_r[group_id]*t_flat_no_nans) + (1-pi_r[group_id])*np.exp(rho_s[group_id]*t_flat_no_nans))

# New, raising an error:
mu_Y = psi[group_id] * (pi_r[group_id]*np.exp(rho_r[group_id] * psi[group_id] * (1-pi_r[group_id]) * t_flat_no_nans + (1 - np.exp(rho_s[group_id]*t_flat_no_nans)/(rho_s[group_id])))) + (1-pi_r[group_id])*np.exp(rho_s[group_id] * t_flat_no_nans))

# Likelihood (sampling distribution) of observations
Y_obs = pm.Normal("Y_obs", mu=mu_Y, sigma=sigma_obs, observed=Y_flat_no_nans)
return linear_model


And this is the error message, in two parts:

The error message suggests you have either nan sigma or nu in at least one of your HalfStudentT distributions. Can those P or P0 be zero when casted to int?

As a sanity check I would suggest you use keyword arguments nu and sigma when defining the distribution. There was a bug sometime ago when not passing the argument via keyword.

Thanks for the reply Ricardo. How could you tell it was in the HalfStudentT?
P or P0 will never be zero, I have made sure of that.
I’ve added keyword arguments to the HalfStudentT and InverseGamma distributions, yet I still get a similar error message.

Best,
Even