I have conversion rates for every Monday over the past year for posts published on my webpage:
YMD        n_orders  n_views
20200120       5750    191265
20200127       5725    193661
20200203       6182    203081
20200210       6008    189918
20200217       5543    181347
If I take the average conversion rate over my dataset (i.e. the average of success/trials), I get about 0.027. This is the range of my success and trials variables:
       success   trials
min     1194.0
max     6182.0
Based on the min/max range of my observations I built the following model:
The alpha and beta parameters go up to their maximum values, the sampler doesn't seem to converge, and during sampling it complains about a low acceptance probability.
What am I doing wrong? Does my model make sense at all?
So first things first: you should ignore all aspects of this result because of the large number of divergences (visually indicated by the black tick marks on the x-axis; pymc3 would also have complained about them at the terminal). Once you see those, the details of the posterior can't really be interpreted.
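If you want to check the count programmatically, something along these lines should work in pymc3 (assuming trace is the result of pm.sample()):

    # count the divergent transitions recorded by the NUTS sampler
    n_divergent = trace.get_sampler_stats('diverging').sum()
    print(f'{n_divergent} divergent transitions')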
Two quick observations about your model. First, your priors over the alpha and beta parameters imply extremely precise/strong beliefs about the value of p_B. Depending on how much data you have available and your specific needs, this may or may not be desirable. Second, uniform priors often cause sampling difficulties because of the bounds that they create at each end of the interval.
Without knowing exactly what your data looks like, my suggestion would be to weaken the priors and probably re-specify the model so that your parameters characterize the mean and SD of the beta distribution rather than alpha and beta. Because beta distributions have support over [0, 1], an uninformative prior over the mean is somewhat easier to select. For the SD, you can choose a prior with support over the positive reals and build in as much confidence as you wish; a gamma prior is fairly conventional for scale parameters in hierarchical models. Here is a sketch of something that may be useful (beware, extremely untested):
import pymc3 as pm

with pm.Model() as comparing_days:
    # mean conversion rate: flat prior over the whole unit interval
    mu = pm.Uniform('mu', lower=0, upper=1)
    # SD of the beta prior: positive, concentrated near small values
    sd = pm.Gamma('sd', alpha=1, beta=10)
    # conversion probability, parameterized by mean/SD instead of alpha/beta
    p_B = pm.Beta('p_B', mu=mu, sigma=sd)
    obs = pm.Binomial('obs', n=df_cr.trials, p=p_B,
                      observed=df_cr.success)
Zero (unfortunately). Even a single divergence should be a reason for concern.
What can you do? Besides changing your model (or priors, or data) you can also try to increase target_accept in pm.sample(). That sometimes helps the sampler enough to not diverge anymore. But first you should ask yourself if there might be something fundamentally flawed with your model, as that is often the reason why divergences crop up.
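For example (assuming the comparing_days model sketched above), something like:

    with comparing_days:
        # target_accept above the 0.8 default makes NUTS take smaller steps,
        # which sometimes removes divergences at the cost of slower sampling
        trace = pm.sample(2000, tune=2000, target_accept=0.95)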
Sad but true! 29 divergences out of 100K draws can likely be taken care of by tweaking the sampling.
It’s massive overkill, but if you (or anyone else) are looking for a deep dive into sampling diagnostics, I would highly recommend Aki’s keynote from PyMCon 2020.