Geometric distribution breaks when data contains 0

Hi!

It seems that the Geometric distribution as implemented in PyMC3 does not allow the observed data to contain 0.

Here’s something that works on my end:

import numpy as np
import pandas as pd
import pymc3 as pm

geom_test_df = pd.DataFrame({
    'indep_true_false': np.random.binomial(n=1, p=0.6, size=1000),
    'discrete_geom': np.random.geometric(0.06, size=1000)
})

Here’s the model:

with pm.Model() as geom_test_model:
    p_1 = pm.Uniform('prior_1', 0.01, 0.1)
    p_0 = pm.Uniform('prior_0', 0.01, 0.1)
    
    obs_1 = pm.Geometric("obs_1", p_1, observed=geom_test_df[
        geom_test_df['indep_true_false'] == 1
    ]['discrete_geom'])
    
    obs_0 = pm.Geometric("obs_0", p_0, observed=geom_test_df[
        geom_test_df['indep_true_false'] == 0
    ]['discrete_geom'])
    
    geom_test_trace = pm.sample(3000, tune=1000)

The lowest value of geom_test_df['discrete_geom'] is 1. I was expecting the value 0 to be possible. See Geometric distribution applet

Changing some values to be 0 leads to ValueError: Bad initial energy: inf. The model might be misspecified.

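The Geometric distribution in PyMC3 is not supported at x = 0. It is defined on the positive integers (x = 1, 2, ...), counting the number of trials up to and including the first success, which is also the convention np.random.geometric uses. See https://en.wikipedia.org/wiki/Geometric_distribution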
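To make this concrete, here is a minimal NumPy-only sketch of the two conventions. It is not PyMC3's actual code, and the helper names geom_logpmf_trials and geom_logpmf_failures are made up for illustration. It shows that under the "number of trials" convention an observation of 0 has log-probability -inf, which is presumably what surfaces as the "Bad initial energy: inf" error.

```python
import numpy as np

def geom_logpmf_trials(x, p):
    """log P(X = x) where X counts trials up to and including the
    first success, so the support is x = 1, 2, 3, ..."""
    x = np.asarray(x, dtype=float)
    logp = np.log(p) + (x - 1) * np.log1p(-p)
    # Values below 1 are outside the support and get probability zero.
    return np.where(x >= 1, logp, -np.inf)

def geom_logpmf_failures(k, p):
    """log P(K = k) where K counts failures before the first success,
    so the support is k = 0, 1, 2, ..."""
    k = np.asarray(k, dtype=float)
    logp = np.log(p) + k * np.log1p(-p)
    return np.where(k >= 0, logp, -np.inf)

p = 0.06
failures = np.array([0, 2, 5])  # hypothetical data that includes a zero

# Fed directly to the "trials" convention, the zero is impossible.
logp_raw = geom_logpmf_trials(failures, p)

# Shifting by one turns "failures before success" into "trials until success".
logp_shifted = geom_logpmf_trials(failures + 1, p)

print(logp_raw)
print(logp_shifted)
```

If your data really counts failures (so 0 is a legitimate value), shifting the observations by one before passing them as observed should line them up with the positive-integer convention. That is an assumption about what your data means, so check it before applying the shift.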

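Oh I see. Based on the Wikipedia link you gave, the geometric distribution can be defined in two different ways: one supported on the non-negative integers (0, 1, 2, ...), counting failures before the first success, and one supported on the positive integers only (1, 2, 3, ...), counting trials up to and including the first success. Thanks!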