Revenue metric for ab testing

Raul_Carlomagno · August 29, 2019, 9:00pm

hi, i am new to pymc and was able to build a model for conversion rate metric A/B testing
but now, i am having to implement a model for revenue for A/B testing and i really dont know to model it in pymc. i am following this paper https://cdn2.hubspot.net/hubfs/310840/VWO_SmartStats_technical_whitepaper.pdf

i am modelling the bernoulli as binomial and inputs for model are conversions, sessions and total money, but i am stucked

A_money = 5000
A_conversions = 200
A_sessions = 10000
A_theta = 1/(1+A_money)

with pm.Model() as model:
   prior_control = pm.Beta('prior_control', alpha=1, beta=1) #uninformative prior
   prior_money_control = pm.Gamma()

   #  likelihood
   like_control = pm.Binomial('like_control', n=A_sessions, p=prior_control, observed=A_conversions)
   like_money_control = pm.Exponential('like_money_control', )

   # alpha
   #alpha = pm.Exponential('alpha',)
   # beta
   #beta = pm.Exponential('beta',)
   #g = pm.Gamma('g', alpha=alpha, beta=beta, observed=bw)

   trace = pm.sample(draws=10000, tune=1000, discard_tuned_samples=True, progressbar=True)

thank you in advance

chartl · September 1, 2019, 7:13pm

So this paper makes some very particular choices for prior distributions on the parameters; in particular it uses conjugate priors which can be fit explicitly.

Taking a look at the model, the most useful assumption is that purchase price is independent of purchase probability. This allows the posterior distribution to factor as:

P[\lambda, \theta | n, c, r] = P[\lambda | n, c]P[\theta | r]

The first of these is Beta (as Beta is conjugate to the bernoulli); and the second is Gamma (as gamma is conjugate to the exponential). By using the closed-form posteriors, the paper gets around MCMC entirely; so all the informatic questions (confidence intervals &c) can be obtained by using the regular samplers available in scipy.

If you want to explore arbitrary priors, you can set up a model like:

import pymc3 as pm
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sbn

lambda_true = 0.082
theta_true = 0.431
n_users=10
with pm.Model() as model_true:
    alpha = pm.Bernoulli('alpha', lambda_true, shape=n_users)
    r = pm.Exponential('r', theta_true, shape=n_users)
    user_rev = pm.Deterministic('revenue', alpha * r)
    test_data = pm.sample_prior_predictive(200)
    
plt.plot(test_data['revenue'][:,0])  # revenue from user 1

and perform inference like:

obs_purchase = test_data['alpha'][:,0]
obs_cost = test_data['revenue'][:,0]
obs_cost = obs_cost[obs_cost > 0]
with pm.Model() as model_infer:
    lam = pm.Beta('lambda', 1., 1.)
    thet = pm.HalfNormal('theta', 1.)
    alpha = pm.Bernoulli('alpha', lam, observed=obs_purchase, shape=1)
    r = pm.Exponential('r', thet, observed=obs_cost, shape=1)
    alpha_pred = pm.Bernoulli('alpha_p', lam)
    r_pred = pm.Exponential('r_p', thet)
    revenue = pm.Deterministic('predicted_revenue', r_pred * alpha_pred)
    trace = pm.sample(500, chains=3)
    
pm.traceplot(trace);

Note that using pm.Gamma as the prior on theta should give you the same posterior in the paper – though yours will be approximate (from MCMC) as opposed to exact.

Raul_Carlomagno · September 2, 2019, 8:01pm

@chartl , i really appreciate your help! nice explanation. i was trying to reproduce it without using arrays, just having aggregated info like this:

A_money = 5000
A_conversions = 200
A_sessions = 10000

so, thats why i am changing bernoulli to binomial
as this calculator does: https://vidogreg.shinyapps.io/bayes-arpu-test/
based on the same paper

but, it is not working as expected, i dont know what i am doing wrong, i tried lot of things
sorry for my first steps on this

with pm.Model() as model_infer:
    lam = pm.Beta('lambda', 1., 1.)
    #thet = pm.HalfNormal('theta', 1.)
    thet = pm.Gamma('theta', 1., 1.) #using gamma as you suggested

    #alpha = pm.Bernoulli('alpha', lam, observed=obs_purchase, shape=1)
    alpha = pm.Binomial('alpha', n=A_sessions, p=lam, observed=A_conversions)

    #r = pm.Exponential('r', thet, observed=obs_cost, shape=1)
    r = pm.Exponential('r', thet, observed=A_theta, shape=A_conversions)

    alpha_pred = pm.Bernoulli('alpha_p', lam)
    #alpha_pred = pm.Binomial('alpha_p', n=A_conversions, p=lam)

    r_pred = pm.Exponential('r_p', thet)

    revenue = pm.Deterministic('predicted_revenue', r_pred * alpha_pred)

    trace2 = pm.sample(10000, chains=4)

chartl · September 2, 2019, 9:03pm

This is all correct except for the shape in r:

A_money = 5000
A_conversions = 200
A_sessions = 10000

TH=0.40
A_theta = np.random.exponential(1./TH, size=(A_conversions,))

with pm.Model() as model_infer:
    lam = pm.Beta('lambda', 1., 1.)
    thet = pm.Gamma('theta', 1., 1.)
    alpha = pm.Binomial('alpha', n=A_sessions, p=lam, observed=A_conversions)
    r = pm.Exponential('r', thet, observed=A_theta)
    alpha_pred = pm.Bernoulli('alpha_p', lam)
    r_pred = pm.Exponential('r_p', thet)
    revenue = pm.Deterministic('predicted_revenue', r_pred * alpha_pred)
    trace2 = pm.sample(500, chains=4)

pm.traceplot(trace2);

Raul_Carlomagno · September 2, 2019, 9:37pm

thank you again @chartl

this

is it arbitrary? does it come from A_conversions / A_money? anyway it gives me 0.04

thank you!

chartl · September 2, 2019, 10:21pm

Don’t worry about it – I just use it to generate A_theta since I didn’t have any data. The parameter is just the exponential parameter: and where pymc3 uses theta, numpy uses 1/theta.

Raul_Carlomagno · September 3, 2019, 1:37pm

thank you @chartl
but for example, can i get A_theta from A_money?
because that exponential distribution should come from spent money
you wrote
A_theta = np.random.exponential(1./A_theta, size=(A_conversions,))
i also tried
A_theta = np.random.exponential(1./A_money, size=(A_conversions,))

but it didnt work as expected
thank you for your patience and time

chartl · September 3, 2019, 5:35pm

So A_money is the total money spent over all conversions right? The problem here is that you aren’t actually observing how much money is spent on each transaction; but only the total money.

It turns out that a sum of k exponential distributions is in fact a gamma distribution \Gamma(k, \theta)

so to do this

A_money = 5000
A_conversions = 200
A_sessions = 10000

with pm.Model() as model_infer:
    lam = pm.Beta('lambda', 1., 1.)
    thet = pm.Gamma('theta', 1., 1.)

    alpha = pm.Binomial('alpha', n=A_sessions, p=lam, observed=A_conversions)

    r = pm.Gamma('r', A_conversions, thet, observed=A_money)

    alpha_pred = pm.Bernoulli('alpha_p', lam)

    r_pred = pm.Exponential('r_p', thet)

    revenue = pm.Deterministic('predicted_revenue', r_pred * alpha_pred)

    trace2 = pm.sample(500, chains=4)

pm.traceplot(trace2);

Raul_Carlomagno · September 5, 2019, 1:12pm

hi @chartl,
“The problem here is that you aren’t actually observing how much money is spent on each transaction”
which would be this problem? it is that i loss precision in the model?

chartl · September 5, 2019, 5:09pm

You lose some modeling flexibility. On the one hand you can choose to model the distribution of spending directly

r_i \sim \mathrm{Exp}(\theta) or r_i \sim \mathrm{Pareto}(\alpha) -1

which requires the vector of money spent. This lets you use any distribution to infer the parameter and predict future expenditure.

Alternatively, you can choose to model the total spending

R = \sum_{i=1}^k r_i \sim \Gamma(\theta, k)

to infer the parameters, but now if you want to predict future expenditure (the distribution of r_i), you need some link between the parameters of P(R|\theta) and the parameters of P(r_i|\theta). This limits the distributions to those with closed-form sums, such as the exponential (sum of i.i.d. exponentials is gamma), normal (sum of normals is normal), or gamma (sum of gammas is gamma); but not things like lognormal or power.

Raul_Carlomagno · September 9, 2019, 1:31pm

perfect @chartl
understood
but, taking account scalibility, i am modelling a big ecommerce site, money spent vector could have 15 millons items.
i tried pymc3 with such a big numbers and it is very slow, or maybe i am doing something wrong

Borism · March 26, 2020, 1:53pm

Hey Raul,

I guess you never have to use the 15 million items. The idea is that you are doing an AB-test on a certain topic on the website right? You only collect items from both groups until you are certain enough that one of both is better than the other. 15 million items would be far more than you would need for such a test.

Zappageck · September 19, 2022, 9:45am

When comparing the above described approach with the pymc guide on AB testing the revenue is modeled a bit differently in the sense that in the guide the theta from the Beta distribution is used while here we use the theta to model the conversion via Binomial and Bernoulli which also leads to different results for revenue_per_visitor (mean and especially sd).

Above:

alpha = pm.Binomial('alpha', n=A_sessions, p=lam, observed=A_conversions)
r = pm.Gamma('r', A_conversions, thet, observed=A_money)
alpha_pred = pm.Bernoulli('alpha_p', lam)
r_pred = pm.Exponential('r_p', thet)
revenue_per_visitor  = pm.Deterministic('predicted_revenue', r_pred * alpha_pred)

vs.

Guide:

revenue_per_visitor = pm.Deterministic("revenue_per_visitor", thet * (1 / lam))

What is the reason for these differences? Is one of them wrong?

Topic		Replies	Views
Looking for any resources about modeling/predicting revenue/sales Questions	0	489	December 5, 2017
Conversion rate model: Beta parameters from pm.Uniform Questions	5	1119	February 7, 2021
Product of two probabilities distributions Questions	9	3901	June 8, 2021
Computing Bayes Factor with Bernoulli Distribution v5 modeling	11	81	June 3, 2025
Simulating AB test outcome with prior data from same test on different platform version agnostic modeling	0	272	August 30, 2023

Revenue metric for ab testing

Related topics