Sure.
I am trying to build a model whose observed y_data contains many zeros along with some positive values, so I am considering a zero-inflated exponential likelihood.
Here are all the details of the code:
r, SM1, SM2, dsm, and dt are np.array data.
def logp_zero_inflated_exponential(value: TensorVariable, psi: TensorVariable, mu: TensorVariable) -> TensorVariable:
    if value == 0:
        return np.log(psi)
    else:
        return np.log1p(-psi) - mu * value
with pm.Model() as model:
    z = pm.HalfNormal('Z', sigma=500)
    a = pm.HalfNormal('α', sigma=1000)
    b = pm.Normal('β', mu=0, sigma=1000)
    psi = pm.Beta('psi', 1, 1)
    y_obs = r
    t_est = a*(SM1**b + SM2**b)*dt/2
    mu = z*np.abs(dsm) - t_est
    Y_obs = pm.CustomDist('Y_obs', psi, mu, dist=logp_zero_inflated_exponential, observed=y_obs)
obj_optimizer = pm.adam(learning_rate=0.01)
with model:
    fit_advi = pm.fit(
        method=pm.ADVI(),
        n=200000,
        callbacks=[pm.callbacks.CheckParametersConvergence(diff='relative')],
        obj_optimizer=obj_optimizer,
        progressbar=False,
    )
    idata = fit_advi.sample(2000)
When I ran this code, I got the following error:
ValueError: Incompatible Elemwise input shapes [(451,), (0,)]
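(I suspect the plain Python if/else in my logp function cannot actually branch on a symbolic TensorVariable, so it may not be evaluating the way I intended. A symbolic version along the lines of my old PyMC3 code would presumably look something like the sketch below; I have also added the log(mu) term that, as far as I understand, a normalized exponential density requires and my original function omits.)

import pytensor.tensor as pt

def logp_zero_inflated_exponential(value, psi, mu):
    # P(y = 0) = psi; for y > 0 the density is (1 - psi) * mu * exp(-mu * y)
    return pt.switch(
        pt.eq(value, 0),
        pt.log(psi),
        pt.log1p(-psi) + pt.log(mu) - mu * value,
    )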
Previously, when I used PyMC3, I was able to run similar code successfully with a different logp_zero_inflated_exponential function and pm.DensityDist:
import theano.tensor as tt

def logp_zero_inflated_exponential(value, psi, mu):
    return tt.switch(tt.eq(value, 0), tt.log(psi), tt.log1p(-psi) - mu * value)

Y_obs = pm.DensityDist('Y_obs', logp_zero_inflated_exponential,
                       observed={'value': y_obs, 'psi': psi, 'mu': mu})
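My understanding (and I may well be wrong here) is that in the current PyMC API the equivalent would pass the function through CustomDist's logp keyword rather than dist, with psi and mu supplied positionally, roughly:

Y_obs = pm.CustomDist(
    'Y_obs', psi, mu,
    logp=logp_zero_inflated_exponential,
    observed=y_obs,
)

Is that the right way to port this?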
If you have any alternative approaches for dealing with a large number of zeros in the observed data while still accounting for the positive values of y_obs, please feel free to suggest them.
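For example, one alternative I have been wondering about is a hurdle-style formulation that fits the zero/non-zero indicator and the strictly positive values as two separate likelihoods (a rough sketch that ignores my covariate structure for simplicity; is_zero and r_pos are hypothetical arrays I would precompute from r):

import numpy as np
import pymc as pm

is_zero = (r == 0).astype('int64')  # 1 where the observation is exactly zero
r_pos = r[r > 0]                    # the strictly positive observations

with pm.Model() as hurdle_model:
    psi = pm.Beta('psi', 1, 1)            # probability of a zero
    lam = pm.HalfNormal('lam', sigma=10)  # rate of the positive part
    pm.Bernoulli('zeros', p=psi, observed=is_zero)
    pm.Exponential('positives', lam=lam, observed=r_pos)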
Thank you for your help.