Hierarchical AB testing in PyMC3

I am working on a simple Hierarchical AB testing model with beta - binomial distributions.
I am following this tutorial which is written for pymc2.

import pymc

@pymc.stochastic(dtype=np.float64)
def hyperpriors(value=[1.0, 1.0]):
    a, b = value[0], value[1]
    if a <= 0 or b <= 0:
        return -np.inf
    else:
        return np.log(np.power((a + b), -2.5))

a = hyperpriors[0]
b = hyperpriors[1]

# This is what we don't know, but would like to find out
true_rates = pymc.Beta('true_rates', a, b, size=10)

# This is what we observed
trials = np.array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100])
successes = np.array([40, 44, 47, 54, 63, 46, 44, 49, 58, 50])
observed_values = pymc.Binomial('observed_values', trials, true_rates, observed=True, value=successes)

model = pymc.Model([a, b, true_rates, observed_values])
mcmc = pymc.MCMC(model)

mcmc.sample(1000000, 500000)

I can’t find the right way to replicate the hyperpriors stochastic function in PyMC3. Googling returns hints about using a pm.Potential method but I’m at a loss as to how to constrain the “prior sample size” a+b.

1 Like

I’m not familiar at all with the potential functionality in pymc3, so I would do this a completely different way. For one, that joint distribution is not-normalized and it’s not clear how to draw from it properly (maybe \theta \propto \mathrm{Power}(-2.5), a \sim U(0, \theta), b=\theta - a …?). Instead I’d use the parameterization that makes the prior uniform:

with pm.Model() as hm:
  ratio = pm.Uniform('ratio', 0, 1)
  inv_root_sample = pm.Uniform('inv_root_sample', 0, 1)
  a = pm.Deterministic('a', (ratio/inv_root_sample**2))
  b = pm.Deterministic('b', (1 - ratio)/(inv_root_sample**2))
1 Like

I did something similar here. You write the model with pymc3 as:

import pymc3 as pm
import theano.tensor as tt
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

plt.style.use('seaborn-darkgrid')

trials = np.array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100])
successes = np.array([40, 44, 47, 54, 63, 46, 44, 49, 58, 50])

First, the hyperpriors

def logp_ab(value):
    """
    Prior density for this problem
    """
    return tt.log(tt.pow(tt.sum(value), -5/2))

with pm.Model() as model:
    # Uninformative prior for alpha and beta
    ab = pm.HalfFlat('ab', shape=2, testval=np.asarray([1., 1.]))
    pm.Potential('p(a, b)', logp_ab(ab))
    
    # Our alpha and beta. Remove this code if you don't want to track alpha
    # and beta
    alpha = pm.Deterministic('alpha', ab[0])
    beta = pm.Deterministic('beta', ab[1])

Now

with model: 
    theta = pm.Beta('theta', alpha=ab[0], beta=ab[1], shape=trials.shape[0]) 
    p = pm.Binomial('y', p=theta, observed=successes, n=trials) 

The magic behind pymc3

with model:
    trace = pm.sample(draws=5_000, tune=5_000)

If you want to see the traceplot:

pm.traceplot(trace)
plt.show()

If you want to see the posterior distribution of each theta

pm.plot_posterior(trace, varnames=['theta']) 
plt.show()

He uses seaborn to show all of them in one plot

with model:
    poste = pm.sample_posterior_predictive(trace, vars=[theta], size=2000)

for i in range(poste['theta'].shape[1]):
    sns.kdeplot(poste['theta'][:, i], shade=True)

plt.show()

3 Likes

Ah this is brilliant! Specify the distribution of the ratio as Uniform and use the ratio to bind a and b. I’m getting stable and coherent results now. Thank you @ chartl

Thank you rosgori this is the direct translation of the pymc code to PyMC3. This helps a ton.

Interestingly I’m getting different values for (a, b), (alpha, beta), same order of magnitude, but very similar ones for theta.

@alexisperrier could you please share your experience with hierarchical models in context when you need to cooperate multiple days of trials?

I found out that it is not clear how exactly utilize a dataset that contains multiple days of data for multiple variants.