pm.DensityDist issue (pymc3 vs pymc5)

Hyung · May 22, 2023, 5:22pm

Hi,

I am attempting to utilize the continuous zero-inflated exponential likelihood distribution using pm.DensityDist with:

def  logp_zero_inflated_exponential(value, psi, mu):
     return tt.switch(tt.eq(value, 0), tt.log(psi), tt.log1p(-psi) - mu * value)

psi = pm.Beta('psi', 1, 1)
z  = pm.HalfNormal('Z', sigma=500)
a  = pm.HalfNormal('α', sigma=1000) # weak informative prior (we know it should be positive) 
b  = pm.Normal('β', mu=0, sigma=1000)

ET_I_est = f_ET_I(SM1, SM2, a, b, dt)
mu = z*np.abs(dsm) - ET_I_est

Y_obs = pm.DensityDist('Y_obs', logp_zero_inflated_exponential, observed={'value': y_obs, 'psi': psi, 'mu': mu})

This is working fine with the version 3, however, when I use it with
Y_obs = pm.DensityDist('Y_obs', logp_zero_inflated_exponential, observed=np.array([y_obs, psi, mu]))
in the version 5, I got the following error:

    860                 Y_obs = pm.DensityDist('Y_obs', logp_zero_inflated_exponential, 
    861                                        observed={'value': y_obs, 'psi': psi, 'mu': mu})
    862             elif pmv == 5:
    863                 Y_obs = pm.DensityDist('Y_obs', logp_zero_inflated_exponential, 
--> 864                                        observed=np.array([y_obs, psi, mu]))
    865 # sampling or fit
    866 rng = 321

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.

I might be missing something with the Version 5 API. If you can provide me with instructions on how to use the Version 5 API correctly, that would be really helpful. If it works, I won’t need to use pymc3. By the way, the conda-forge pymc3 library installed on macOS works just fine with the mkl-service library. I’m not sure why the conda-forge pymc3 library for Linux comes with a numpy version that is not 1.22.

Additional information:

I use different logp_zero_inflated_exponential function for pymc3 (pmv == 3) and pymc5:

if pmv == 3:
  def logp_zero_inflated_exponential(value, psi, mu):
      return tt.switch(tt.eq(value, 0), tt.log(psi), tt.log1p(-psi) - mu * value)

if pmv == 5:
   def logp_zero_inflated_exponential(value, psi, mu):
      if value == 0:
        return pm.math.log(psi)
      else:
        return pm.math.log1p(-psi) - mu * value

Thank you for your help.

ricardoV94 · May 22, 2023, 5:46pm

The API has changed, you can check the new documentation here: pymc.CustomDist — PyMC dev documentation

Hyung · May 22, 2023, 5:54pm

Hi Ricardo,
I have tried the updated API and tried to modify the previous pymc3 code accordingly, but I had no luck. Could you please help me modify my previous pymc3 code to the pymc5 version?
Thank you for your time.

ricardoV94 · May 22, 2023, 6:23pm

If will be easier for me or others to help if you share what you tried.

Hyung · May 22, 2023, 7:16pm

Sure.

I am trying to create a model with observed y_data that includes a lot of zeros and some positive values. Therefore, I am considering using a zero-inflated exponential likelihood distribution.

Here are all the details of the code:

r, SM1, SM2, and dt are np.array data.

 def logp_zero_inflated_exponential(value: TensorVariable, psi: TensorVariable, mu: TensorVariable) -> TensorVariable:
      if value == 0:
        return np.log(psi)
      else:
        return np.log1p(-psi) - mu * value

with pm.Model() as model:

z  = pm.HalfNormal('Z', sigma=500)
a  = pm.HalfNormal('α', sigma=1000)
b  = pm.Normal('β', mu=0, sigma=1000)

psi = pm.Beta('psi', 1, 1)

y_obs = r
t_est = a*(SM1**b + SM2**b)*dt/2
mu = z*np.abs(dsm) - t_est
Y_obs = pm.CustomDist('Y_obs', psi, mu, dist=logp_zero_inflated_exponential, observed=y_obs)

obj_optimizer = pm.adam(learning_rate=0.01)
with model:
  fit_advi = pm.fit(method = pm.ADVI(), n = 200000, callbacks=[pm.callbacks.CheckParametersConvergence(diff='relative')], obj_optimizer=obj_optimizer, progressbar=False)
  idata    = fit_advi.sample(2000)

When I ran this code, I got the following error:

ValueError: Incompatible Elemwise input shapes [(451,), (0,)]

Previously, when I used PyMC3, I was able to successfully run similar code with different logp_zero_inflated_exponential functions and pm.DensityDist.

def logp_zero_inflated_exponential(value, psi, mu):
  return tt.switch(tt.eq(value, 0), tt.log(psi), tt.log1p(-psi) - mu * value)

Y_obs = pm.DensityDist('Y_obs', logp_zero_inflated_exponential, 
                                               observed={'value': y_obs, 'psi': psi, 'mu': mu})

If you have any alternative approaches for dealing with a large number of zeros in the observed data while still considering the positive values of y_obs, please feel free to suggest them.

Thank you for your help.

Topic		Replies	Views
Shape Issue with custom logp in DensityDist v5 modeling	7	334	January 13, 2024
Discrete DensityDist v5 modeling	5	505	January 7, 2023
pm.DensityDist problem under PMYC 4 v5	17	2407	June 14, 2022
How to do linear fitting with pymc3 by using pm.DensityDist Questions	6	1025	May 4, 2018
How to use costum logliklihood function effectively using pm.DensityDist	8	149	April 18, 2024

pm.DensityDist issue (pymc3 vs pymc5)

Related topics