Sum of binomial random variables with different p?

Suppose that we are sampling from three binomial distributions with n=2 but different p: x_i \sim \text{Binomial}(2, p_i), and then we sum those binomial realizations to create a new random variable z = x_1 + x_2 + x_3

Given some data for z, I want to build a Bayesian model to infer the distribution of p_i of each of the Binomials, I used the following code to generate some fake data:

# True probabilities of each of the Binomials
true_probs = [0.5, 0.85, 0.35]

N = 50  # dataset size
draws = []

np.random.seed(42)
for i in range(len(true_probs)):
    draw = np.random.binomial(n=2, p=true_probs[i], size=N)
    draws.append(draw)

draws = np.hstack(
    [np.array(draws[i])[:, np.newaxis] for i in range(3)]
)
z = np.sum(draws, axis=1)

I know that z will have 7 distinct values (from 0 to 6), I want my model to infer the posterior distribution of each p_i given the observed data of z, and this what I tried before getting stuck in likelihood function selection:

with pm.Model() as model:
    p = pm.Uniform('p', 0, 1, shape=(3,))

    x1 = pm.Binomial('x1', n=2, p=p[0])
    x2 = pm.Binomial('x2', n=2, p=p[1])
    x3 = pm.Binomial('x3', n=2, p=p[2])

    z_obs = pm.Deterministic('z_obs', x1 + x2 + x3)

I know that I cannot plug in observed= in pm.Deterministic, but I am stuck in linking the observed data to the prior distribution of p.

Hopefully my question is clear, can anyone suggest how to proceed?

I think you’re in the same case as this post, where you want to do a convolution of binomials. This results in a Poisson-Binomial distribution.

Unlike the linked post, you have small n, so you should be able to implement the distribution by implementing the logp on the Wikipedia page and then passing it to a pm.CustomDist

3 Likes

Many thanks, will look into that!