Bayesian Sequential A/B Testing if you can only track successful conversions


There are good articles about Bayesian A/B testing with PyMC3, such as What is A/B testing or Bayesian A/B Testing in PyMC3, but these typically assume that you can track all users who see versions A and B. Imagine the case where users must give GDPR consent before they can be tracked. In that case you will not know about every user who saw your versions, but you will know about every successful conversion, because the user will have bought something.

The article Simple Sequential A/B Testing describes an approach from a frequentist viewpoint. I was wondering what a PyMC3 model would look like if I can only track the successful conversions (every user who bought something).

One option might be to simply ignore the fact that I can only track a fraction of the overall users who saw my versions. In the end both groups should have a similar rate of rejecting cookies/tracking.

Another option I was thinking about is a Poisson rate model that uses the base case as a kind of “clock”. For example, the observed rate is the number of successful Bs observed between two consecutive successful As. If the rate is larger than 1, B wins; if it is smaller than 1, A wins. The intuition is that users should arrive at both versions at the same speed (slower at night, faster during the day), so taking one version as the “clock” should give a kind of rate.
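To make the “clock” idea concrete, here is a minimal sketch. It assumes the conversions are available as a time-ordered list of "A"/"B" labels; that format, the function name, and the example log are hypothetical and only serve as illustration.

```python
from typing import List


def b_counts_between_as(events: List[str]) -> List[int]:
    """Count the B conversions that fall between consecutive A conversions."""
    counts = []
    current = None  # stays None until the first A starts the "clock"
    for label in events:
        if label == "A":
            if current is not None:
                counts.append(current)  # close the interval between two As
            current = 0
        elif label == "B" and current is not None:
            current += 1
    return counts


# Hypothetical, time-ordered conversion log
events = ["A", "B", "B", "A", "A", "B", "A"]
gaps = b_counts_between_as(events)
print(gaps)                   # [2, 0, 1]
print(sum(gaps) / len(gaps))  # 1.0, the average number of Bs per A interval
```

Bs that arrive before the first A or after the last A are ignored here, because they do not lie between two As.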

How would people with more experience in Bayesian A/B testing than I have handle this case?

Thanks a lot,

If you feel comfortable assuming that everything else is equal between the two variants (i.e. the same share of users lost to rejected cookies/tracking, equal sample sizes, etc.), then just use a Poisson model. There’s no need to make it complicated by modelling a ratio of A to B, though. Instead, just model the counts of both A and B and compare them.


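import numpy as np
import pymc as pm

# Hypothetical placeholders: conversions per time window, and each window's group (0 = A, 1 = B)
data = np.array([12, 9, 15, 14, 11, 17])
group = np.array([0, 0, 0, 1, 1, 1])

with pm.Model(coords={'experiment_groups': ['A', 'B']}) as pois_model:
  # adjust prior to something more reasonable. Could also use a gamma prior
  lambd = pm.Exponential('lambd', .001, dims='experiment_groups')
  # lambd is the expected count per window, so it is passed to the Poisson as mu directly
  count = pm.Poisson('count', mu=lambd[group], observed=data)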

You can also use Gamma-Poisson conjugacy to solve for the posterior analytically and generate samples much faster than with MCMC.
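As a rough sketch of that conjugate update: with a Gamma(alpha, beta) prior (shape and rate) on the Poisson rate and n observed counts, the posterior is Gamma(alpha + sum of counts, beta + n). The function names and default values below are assumptions for illustration. Gamma(1, 0.001) is the same distribution as the Exponential(0.001) prior used in the model above.

```python
import numpy as np


def gamma_poisson_posterior(counts, alpha=1.0, beta=0.001):
    """Return (shape, rate) of the Gamma posterior for a Poisson rate."""
    counts = np.asarray(counts)
    return alpha + counts.sum(), beta + counts.size


def sample_rate(counts, n_samples=10_000, seed=0, alpha=1.0, beta=0.001):
    """Draw posterior samples of the rate directly, without MCMC."""
    shape, rate = gamma_poisson_posterior(counts, alpha, beta)
    rng = np.random.default_rng(seed)
    # NumPy parameterises the Gamma by shape and scale, where scale = 1 / rate
    return rng.gamma(shape, 1.0 / rate, size=n_samples)


# Hypothetical conversion counts per time window for one variant
samples = sample_rate([12, 9, 15])
print(samples.mean())
```

Running this once per variant gives independent posterior samples for A and B that can be compared directly.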

As for the sequential testing you were interested in, the Bayesian approach means you don’t have to worry about the math as much, especially with appropriate priors. You can just assess simple quantities like P(variant > control), or calculate the difference between groups A and B to estimate the treatment effect and its uncertainty.
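Both quantities are easy to compute once you have posterior samples of the rate for each group. The sketch below uses hypothetical Gamma draws as stand-ins for real posterior samples; the function name, the interval width, and the numbers are assumptions for illustration.

```python
import numpy as np


def compare_groups(samples_a, samples_b):
    """Summarise P(B > A) and the posterior of the difference B - A."""
    diff = np.asarray(samples_b) - np.asarray(samples_a)
    low, high = np.quantile(diff, [0.03, 0.97])
    return {
        "p_b_gt_a": float((diff > 0).mean()),  # share of draws where B beats A
        "mean_diff": float(diff.mean()),       # estimated treatment effect
        "interval_94": (float(low), float(high)),
    }


rng = np.random.default_rng(0)
# Hypothetical posterior draws; in practice these come from the fitted model
samples_a = rng.gamma(120, 1 / 10, size=20_000)
samples_b = rng.gamma(150, 1 / 10, size=20_000)
summary = compare_groups(samples_a, samples_b)
print(summary)
```

With a PyMC model, the two sample arrays would be the posterior draws of the rate for each group instead of the simulated ones used here.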

Just be warned that Bayesian analysis can still suffer from frequentist problems like peeking and p-hacking, because we all tend to bring frequentist ideas into our experiment design despite using Bayesian modeling. Frank Harrell talks about ideas like that a lot; I’m sure you could find more on the topic if you googled it. I usually just wait until the end of the planned experiment duration unless it’s a large and obvious difference.