Hello all, I’ve been trying to learn and apply Bayesian sampling approaches, particularly to A/B tests, and I am in the weeds. I am trying to follow the BEST approach documented in the PyMC documentation here, but I am running into problems and have questions. I hope I can ask them here to get unstuck.
The setup: I have very right-skewed data, duration times for a control and a test group. This is my attempt at simulating that data.
```python
import numpy as np
import scipy as sc
import scipy.stats  # makes sc.stats available

ctrl_ = sc.stats.skewnorm(loc=2.8, scale=3, a=25).rvs(10000)
ctrl = np.r_[ctrl_, np.random.randint(20, high=500, size=300)]
test_ = sc.stats.skewnorm(loc=1.4, scale=2.8, a=16).rvs(10000)
test = np.r_[test_, np.random.randint(20, high=380, size=300)]
```
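One side check I did while writing this up (my own addition, and relevant to my first question below): the `loc` of a skew-normal is not its mean, so with `a=25` the bulk of my simulated control data is not centered on 2.8 at all.

```python
import numpy as np
import scipy.stats

# skew-normal parameters from my control simulation
a, loc, scale = 25, 2.8, 3.0

# closed-form mean of a skew-normal: loc + scale * delta * sqrt(2/pi),
# where delta = a / sqrt(1 + a^2)
delta = a / np.sqrt(1 + a**2)
analytic_mean = loc + scale * delta * np.sqrt(2 / np.pi)

# scipy's frozen distribution agrees with the closed form
scipy_mean = scipy.stats.skewnorm(a=a, loc=loc, scale=scale).mean()
print(analytic_mean, scipy_mean)  # both ~5.19, not 2.8
```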
The goal is to determine whether the test group is some % smaller than the control group, in a way that is statistically meaningful.
Since the data is so badly right-skewed, I decided to take the logarithm of each group.
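To sanity-check that decision, I re-created the control data with a fixed seed (a sketch of my simulation above, using a `numpy` `Generator` instead of the legacy `np.random.randint`) and compared the skewness before and after the log transform:

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)
# bulk of the control durations plus a handful of large outliers, as in my simulation
ctrl_ = scipy.stats.skewnorm(a=25, loc=2.8, scale=3).rvs(10000, random_state=rng)
ctrl = np.r_[ctrl_, rng.integers(20, high=500, size=300)]

raw_skew = scipy.stats.skew(ctrl)
log_skew = scipy.stats.skew(np.log(ctrl))
print(raw_skew, log_skew)  # the log transform reduces, but does not eliminate, the skew
```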
```python
import numpy as np
import pymc as pm

ctrl_log = np.log(ctrl)
test_log = np.log(test)
pooled_mn = np.r_[ctrl_log, test_log].mean()
pooled_std = np.r_[ctrl_log, test_log].std()
df = np.min([ctrl_log.size, test_log.size]) - 1

with pm.Model() as m1:
    # priors on the log-scale group means
    ctrl_mu = pm.Normal("ctrl_mu", mu=pooled_mn, sigma=pooled_std * 2)
    test_mu = pm.Normal("test_mu", mu=pooled_mn, sigma=pooled_std * 2)
    # degrees-of-freedom parameter shared by both likelihoods
    nu = pm.Uniform("nu", lower=int(df / 4), upper=df)
    ctrl_sigma = pm.HalfCauchy("ctrl_sigma", beta=2)
    ctrl_lam = ctrl_sigma ** -2
    test_sigma = pm.HalfCauchy("test_sigma", beta=2)
    test_lam = test_sigma ** -2
    ctrl_duration = pm.StudentT("ctrl_duration", nu=nu, mu=ctrl_mu, lam=ctrl_lam, observed=ctrl_log)
    test_duration = pm.StudentT("test_duration", nu=nu, mu=test_mu, lam=test_lam, observed=test_log)
    idata = pm.sample(2000, tune=5000, target_accept=0.95, return_inferencedata=True)
```
So I had some questions that I am struggling with:
- When I run this model, the posteriors for `ctrl_mu` and `test_mu` are very close to `mean(ctrl_log)` and `mean(test_log)`, respectively. Is that what I am going for, or should they be closer to the parameters in the skew-normals, log(2.8) and log(1.4)?
- When I try to convert the log-transformed posteriors for the mu’s back with `np.exp()`, they are very different from the means of the untransformed `ctrl` and `test` data. Is that a sign that the model is bad, or that I am doing a naive back-conversion of the transformed variable?
- My completely fabricated strategy was to produce the posterior of the difference between `ctrl_mu` and `test_mu`, and if that difference did not include 0, claim the two are different. Then I would produce the posterior predictive using `pred = pm.sample_posterior_predictive(idata, var_names=["ctrl_duration", "test_duration"])` and plot the posterior of `np.exp(pred["test_duration"]) / np.exp(pred["ctrl_duration"]) - 1` to see the % change. Is that a correct way to show the percent-change posterior?
- I originally computed the percent change of the posteriors as `test_mu / ctrl_mu - 1`, but since they are log-transformed, that seemed different from what I really wanted: the distribution of the percent change of the original-scale mu’s.
- Does it matter what transformation I use (e.g. square root), or whether I standardize? The BEST example doesn’t address that, and for some really badly skewed data the sampler seems to die on me unless I transform.
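To make the percent-change question concrete, here is a toy version of what I mean, with plain `numpy` arrays standing in for the posterior draws (the means and spreads are made up, not from my actual model):

```python
import numpy as np

rng = np.random.default_rng(42)
# made-up stand-ins for posterior draws of the log-scale group means
ctrl_mu = rng.normal(1.50, 0.02, size=4000)
test_mu = rng.normal(1.25, 0.02, size=4000)

# difference on the log scale, and its 95% credible interval
diff = test_mu - ctrl_mu
lo, hi = np.percentile(diff, [2.5, 97.5])
excludes_zero = (hi < 0) or (lo > 0)

# what I originally computed: ratio of the log-scale means
naive_pct = test_mu / ctrl_mu - 1                  # ~ -0.167 here

# percent change of the back-transformed means, i.e. exp(diff) - 1
geo_pct = np.exp(test_mu) / np.exp(ctrl_mu) - 1    # ~ -0.221 here

print(excludes_zero, naive_pct.mean(), geo_pct.mean())
```

The two versions clearly disagree, which is why I suspect my original `test_mu / ctrl_mu - 1` was wrong. If I understand correctly, `np.exp()` of a log-scale mean is a geometric mean rather than the arithmetic mean of the raw data, which may also explain my second bullet above.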
I hope all of this made sense, and that some kind soul can help me better use and understand Bayesian analysis for A/B testing. I really don’t want to go back to stats.ttest_ind . . . .
Thank you !!!