Mean changing every data point

Hello everyone,

I have the following model.

with basic_model:
    lambda1 = pm.Gamma("lambda1", alpha=0.001, beta=0.001)
    p = pm.Beta('p', 1, 1)
    z = [0.0] * len(x['Length'].values)
    Y_obs = [0.0] * len(x['Length'].values)

    for i in range(len(x['Length'].values)):
        z[i] = pm.Bernoulli('z[i]', p)
        Y_obs[i] = pm.Poisson("Y_obs[i]", mu=lambda1*z[i]*x['Length'].values+0.001, observed=x['Count'].values[i])
    trace = pm.sample(7000, tune=2000, cores=1, return_inferencedata=True)

It produces an error for the names Y_obs[i] and z[i]. I understand that I cannot reuse the same name for several variables, but I couldn’t figure out how to change the rate of Y_obs[i] at every iteration. At a later stage, I will also be changing the rates with if-else conditions. How do I define a different mean for every data point?

To avoid repeated variable names you could use f-strings, for example, so that ...Bernoulli(f"z[{i}]",... generates z[0], z[1], … instead of generating z[i] every single time.
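A minimal sketch of the naming difference, using plain Python strings only (no PyMC involved). The value of n_points is a made-up stand-in for len(x['Length'].values):

```python
# Hypothetical stand-in for len(x['Length'].values) in the model above.
n_points = 4

# The literal string 'z[i]' is the same on every iteration,
# so every variable would be registered under one name.
literal_names = ['z[i]' for i in range(n_points)]

# An f-string interpolates the loop index, giving one name per data point.
unique_names = [f"z[{i}]" for i in range(n_points)]

print(sorted(set(literal_names)))
print(unique_names)
```

The same pattern would apply to the name passed to the Poisson variable, e.g. f"Y_obs[{i}]".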

That being said, creating one variable per data point in a Python loop is generally not a good idea. Why don’t you use vectorized statements?

I am new to PyMC3 and coming from JAGS, so the loop was easier for me to understand. I tried to do it the vectorized way, but I got a gazillion tensor-related errors.

I think something like this code below will technically work:

with basic_model:
    lambda1 = pm.Gamma("lambda1", alpha=0.001, beta=0.001)
    p = pm.Beta('p', 1, 1)

    pm.Poisson("Y_obs", mu=lambda1*p*x['Length'].values+0.001, observed=x['Count'])
    trace = pm.sample(7000, tune=2000, cores=1, return_inferencedata=True)

The problem is that p and lambda1 are completely degenerate: they only ever appear multiplied together, so only their product is constrained.
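The degeneracy can be checked numerically without PyMC. The sketch below evaluates the Poisson log-likelihood with mean lambda1*p*Length+0.001 by hand; the lengths and counts are made-up illustration values, not data from this thread. Any two (lambda1, p) pairs with the same product should give the same log-likelihood:

```python
import math

def poisson_logpmf(k, mu):
    # log of the Poisson probability mass function at count k with mean mu
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def loglik(lambda1, p, lengths, counts):
    # Same mean as in the vectorized model: lambda1 * p * Length + 0.001
    return sum(
        poisson_logpmf(k, lambda1 * p * length + 0.001)
        for length, k in zip(lengths, counts)
    )

# Hypothetical data, for illustration only.
lengths = [1.0, 2.5, 4.0]
counts = [0, 3, 5]

ll_a = loglik(2.0, 0.5, lengths, counts)    # product lambda1 * p = 1.0
ll_b = loglik(4.0, 0.25, lengths, counts)   # product lambda1 * p = 1.0
ll_c = loglik(2.0, 0.9, lengths, counts)    # product lambda1 * p = 1.8

print(ll_a, ll_b, ll_c)
```

Here ll_a and ll_b agree because the data see only the product, while ll_c differs because the product changed.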

This works but doesn’t capture the “Counts are either 0 or Poisson(lambda1) and the degenerate probability is p” structure of Y_obs. I will also have to extend this model to a point where Y_obs[i] depends on Y_obs[i-1].
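One way to keep the “either 0 or Poisson” structure without a discrete z[i] per data point is to sum z[i] out of the likelihood, which gives a zero-inflated Poisson. The sketch below writes that marginal log-likelihood in plain Python. It assumes a count is exactly 0 with probability 1-p (the model above uses a small rate of 0.001 instead), and the data values are made up. PyMC3 also ships a pm.ZeroInflatedPoisson distribution, which may be the more convenient route inside a model:

```python
import math

def zip_logpmf(k, p, mu):
    # Marginal over z: Y = 0 with probability 1 - p, Y ~ Poisson(mu) with probability p.
    log_pois = k * math.log(mu) - mu - math.lgamma(k + 1)
    if k == 0:
        # A zero can come from either component.
        return math.log((1.0 - p) + p * math.exp(log_pois))
    # A positive count can only come from the Poisson component.
    return math.log(p) + log_pois

def zip_loglik(lambda1, p, lengths, counts):
    # A different mean for every data point: mu_i = lambda1 * Length_i
    return sum(
        zip_logpmf(k, p, lambda1 * length)
        for length, k in zip(lengths, counts)
    )

# Hypothetical data, for illustration only.
lengths = [1.0, 2.5, 4.0]
counts = [0, 3, 5]

ll_a = zip_loglik(2.0, 0.5, lengths, counts)
ll_b = zip_loglik(4.0, 0.25, lengths, counts)
print(ll_a, ll_b)
```

Unlike the plain Poisson version, the two calls have the same product lambda1*p but different log-likelihoods, so p and lambda1 are separately constrained. The dependence of Y_obs[i] on Y_obs[i-1] is not covered by this sketch.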
