How to infer parameters for a continuous mixture model: Poisson-Beta model

Luo · November 23, 2022, 2:04pm

Hi, I want to implement Bayesian inference on a continuous mixture model, Poisson-Beta model, which can be written as follow formula:

p|\alpha,\beta\sim Beta(\alpha,\beta)
y|\mu,p \sim Poisson(\mu\cdot p)

where \alpha, \beta, \mu are the inferred parameters. To obtain the posterior distribution of these parameters, I set the corresponding prior distribution and hyperparameters:

a\sim uniform(-2,2),b\sim uniform(-2,2)
\alpha=10^{a},\beta=10^{b}
p\sim Beta(\alpha,\beta)
\mu\sim Uniform(0,50)
y\sim Poisson(\mu\cdot p)

where the prior distributions are \alpha,\beta\sim loguniform(-2,2) and \mu\sim uniform(0,50)
First, I generate the synthetic data:

import matplotlib.pyplot as plt
import scipy.stats as stats
import numpy as np
import pandas as pd
import seaborn as sns
import pymc as pm
import arviz as az
az.style.use('arviz-darkgrid')

num_n = 1000
a_real = -0.3
b_real = -0.3
alpha_real = 10**(a_real)
beta_real = 10**(b_real)
mu_real = 20
data_p = stats.beta.rvs(alpha_real, beta_real, size = num_n)
data = stats.poisson.rvs(mu_real*data_p)

plt.figure(figsize=(4, 3))
plt.hist(data, edgecolor = 'white', density=True)
plt.xlabel('counts', fontsize = 12)
plt.ylabel('PMF', fontsize = 12)
plt.tick_params(labelsize = 10)

question_2_0
Second, I build the model by the pymc

rng = 111
hierarchical_model = pm.Model()
with hierarchical_model:

    # Priors for unknown model parameters
     a = pm.Uniform('a', lower=-2., upper=2.)
     b = pm.Uniform('b', lower=-2., upper=2.)
     p = pm.Beta('p', 10**a, 10**b)
     mu = pm.Uniform('mu', 0, 50)
     y = pm.Poisson('y', mu*p, observed=data)
     idata = pm.sample_prior_predictive(samples=200, random_seed=rng)

Sampling: [a, b, mu, p, y]

However, I got the follow wrong results:

# Inference button (TM)!
with hierarchical_model:
    idata.extend(pm.sample(1000, tune=2000, chains=2, random_seed=rng))

az.plot_trace(idata)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [a, b, p, mu]

Sampling 2 chains for 2_000 tune and 1_000 draw iterations (4_000 + 2_000 draws total) took 37 seconds.

array([[<AxesSubplot: title={'center': 'a'}>,
        <AxesSubplot: title={'center': 'a'}>],
       [<AxesSubplot: title={'center': 'b'}>,
        <AxesSubplot: title={'center': 'b'}>],
       [<AxesSubplot: title={'center': 'p'}>,
        <AxesSubplot: title={'center': 'p'}>],
       [<AxesSubplot: title={'center': 'mu'}>,
        <AxesSubplot: title={'center': 'mu'}>]], dtype=object)

with hierarchical_model:
    pm.sample_posterior_predictive(idata, extend_inferencedata=True, random_seed=rng)

az.plot_ppc(idata, num_pp_samples=100)

<AxesSubplot: xlabel='y / y'>

I read the section on hierarchical and mixture models and found that the Poisson-beta model is a continuous mixture model. Therefore, I raise two problems:

I suspect that this wrong result is due to treating p as a parameter. However, p is a latent variable which is obey beta distribution. In fact, the likelihood function of poisson-beta model requires the integration of p. How do I implement this model in pymc?
If given any finite continuous mixture model, is it possible to implement Bayesian inference in pymc?

Thanks!

ckrapu · November 23, 2022, 4:38pm

The distributions shouldn’t match because in your creation of the array data, the values of a_real and b_real are fixed instead of sampled. Thus, these will have much lower dispersion (and a different distribution) than the outputs sampled under hierarchical_model.

~~For an apples-to-apples comparison, you should sample a large number of a_real and b_real values with numpy / scipy and then compare that distribution with the PyMC outputs.~~

I reread your code and had misunderstood the comparison. Can you change the following:

p = pm.Beta('p', 10**a, 10**b)

To instead read like

p = pm.Beta('p', 10**a, 10**b, shape=num_n)

As it stands, the PyMC model assumes a single p value. The data you simulated includes num_n values in p, thus the two distributions will be different.

Luo · November 24, 2022, 6:57am

Thank you so much for your time and help! The code works perfectly！

Topic		Replies	Views
Struggling with Beta mixture models Questions	3	1839	February 2, 2019
How to improve fit of Poisson Mixture Model in PyMC3? Questions	20	5026	May 24, 2018
Computing Bayes Factor with Bernoulli Distribution v5 modeling	11	92	June 3, 2025
Poisson Mixture Model & Label Switching v5	7	1509	August 3, 2022
Geometric-Beta mixture? Questions	0	406	January 28, 2021

How to infer parameters for a continuous mixture model: Poisson-Beta model

Related topics