Fitting LogNormal to payment amounts data gives weird PPC plot

matjazmuhic · December 5, 2019, 3:35pm

Hi there,

I have payment amount data for two years (2018 and 2019). Plotting the data shows it’s heavily skewed. I thought LogNormal might be a good fit, since applying np.log and plotting the result produces something that roughly resembles a Normal distribution.
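A quick numeric version of that check, as a minimal sketch: it compares the sample skewness before and after the log transform. The `amounts` array is synthetic and only stands in for the real payment column, and `skewness` is a helper written for this illustration.

```python
import numpy as np

# Synthetic stand-in for the payment amounts (assumption: not the real data).
rng = np.random.default_rng(0)
amounts = rng.lognormal(mean=3.0, sigma=1.2, size=5000)


def skewness(x):
    """Sample skewness: the third standardized moment."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


raw_skew = skewness(amounts)          # large and positive for a heavy right tail
log_skew = skewness(np.log(amounts))  # close to 0 if the data is roughly log-normal

print(f"skewness raw: {raw_skew:.2f}, log: {log_skew:.2f}")
```

If the log-scale skewness is still far from zero on the real data, that would suggest LogNormal is only a rough approximation.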

Here’s the histogram for the “raw” data at different “zoom levels” (x-axis is payment amount):

Here is the model (I used Exponential priors to keep the values small; is that OK?):

import pymc3 as pm

with pm.Model() as model:
    # Exponential priors (support is positive values only) on the log-scale parameters
    mu1 = pm.Exponential('mu1', lam=1)
    sigma1 = pm.Exponential('sigma1', lam=1)
    mu2 = pm.Exponential('mu2', lam=1)
    sigma2 = pm.Exponential('sigma2', lam=1)

    # One LogNormal likelihood per year
    payments_2018 = pm.Lognormal('payments_2018', mu=mu1, sigma=sigma1, observed=bq_data_payments_2018['transaction_amount'])
    payments_2019 = pm.Lognormal('payments_2019', mu=mu2, sigma=sigma2, observed=bq_data_payments_2019['transaction_amount'])

    # Difference and ratio of the log-scale location parameters
    diff = pm.Deterministic('diff', mu2 - mu1)
    lift = pm.Deterministic('lift', mu2 / mu1)

    trace = pm.sample(10000, tune=2000)

Here’s the trace plot (it looks ok, at least to me):

But the PPC check (plotted) looks weird:

Any ideas why this is happening? I can’t wrap my head around it.
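One possible explanation, offered only as a guess since it depends on what the plot actually shows: with a heavy-tailed likelihood, a handful of extreme simulated values can stretch the x-axis so much that the bulk of the density is squashed against zero. The sketch below uses NumPy only, with synthetic "observed" data and made-up posterior draws standing in for the real trace, to show how the same replicated data can look extreme on the raw scale yet match the observations on the log scale.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "observed" payments (assumption: stand-in for the real column).
observed = rng.lognormal(mean=3.0, sigma=1.2, size=2000)

# Hypothetical posterior draws for mu and sigma (stand-ins for the trace).
n_draws = 200
mu_draws = rng.normal(3.0, 0.03, size=n_draws)
sigma_draws = np.abs(rng.normal(1.2, 0.02, size=n_draws))

# One replicated dataset per posterior draw, shape (n_draws, n_obs).
replicated = rng.lognormal(
    mean=mu_draws[:, None],
    sigma=sigma_draws[:, None],
    size=(n_draws, observed.size),
)

# Raw scale: the largest replicated value dwarfs a typical payment,
# which is what stretches the x-axis of a density plot.
raw_ratio = replicated.max() / np.median(observed)

# Log scale: compare a few quantiles of observed vs. replicated data.
qs = [5, 25, 50, 75, 95]
obs_q = np.percentile(np.log(observed), qs)
rep_q = np.percentile(np.log(replicated), qs, axis=1).mean(axis=1)

print(f"max replicated / median observed: {raw_ratio:.0f}")
for q, o, r in zip(qs, obs_q, rep_q):
    print(f"{q:>2}% quantile (log scale): observed {o:.2f}, replicated {r:.2f}")
```

If the log-scale quantiles line up like this on the real data, the model may be fitting reasonably and the odd plot is mostly a matter of scale; if they do not, the likelihood or priors deserve a closer look.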

Also, is there a better way to model this kind of data? I’m mainly looking to compare the two years in an “A/B experiment” fashion.
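For the comparison itself, note that mu is the mean of the log amounts, so mu2/mu1 is a ratio of log-scale parameters rather than a ratio of average payments. The mean of a LogNormal(mu, sigma) is exp(mu + sigma^2/2), so one option is to compare that quantity across years. A minimal sketch, assuming posterior draws are available as arrays; the arrays below are made up and only stand in for something like trace['mu1']:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior draws (assumption: stand-ins for the real trace).
mu1 = rng.normal(3.00, 0.02, size=4000)
sigma1 = np.abs(rng.normal(1.20, 0.01, size=4000))
mu2 = rng.normal(3.10, 0.02, size=4000)
sigma2 = np.abs(rng.normal(1.20, 0.01, size=4000))

# Mean of a LogNormal(mu, sigma) is exp(mu + sigma**2 / 2).
mean_2018 = np.exp(mu1 + sigma1 ** 2 / 2)
mean_2019 = np.exp(mu2 + sigma2 ** 2 / 2)

# Lift on the payment scale: ratio of expected payment amounts.
mean_lift = mean_2019 / mean_2018

# Posterior probability that 2019 has the higher mean, plus a 95% interval.
prob_increase = (mean_lift > 1).mean()
lower, upper = np.percentile(mean_lift, [2.5, 97.5])

print(f"P(mean 2019 > mean 2018) = {prob_increase:.3f}")
print(f"95% interval for lift: [{lower:.3f}, {upper:.3f}]")
```

The same calculation could be added to the model as Deterministic variables so that it appears in the trace directly.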

Thanks in advance!
