I have purchase data that looks close to lognormal, but it still has some skewness on the log scale. (I’m assuming this is what’s referred to as a log skew normal distribution, but I could be wrong here.)
I’m trying to fit a model that estimates the expected revenue per purchase and the expected total revenue of the dataset, but it doesn’t seem to fit the data well or recover the proper parameters. Is there a better way to fit this model?
Also, I think the expectation is wrong, since I’m basing it on the expectation of a lognormal distribution. How would I improve it (the variables RevA and total_revA)?
    import pandas as pd
    import numpy as np
    import scipy.stats as scs
    import pymc3 as pm
    import arviz as az

    # Simulate data that is skew-normal on the log scale
    alpha, loc, scale = 1.5, 4.8, 1.013
    d_log_scale = scs.skewnorm(alpha, loc=loc, scale=scale).rvs(1000)
    d = np.exp(d_log_scale)
    print(d_log_scale.mean(), d_log_scale.std())

    with pm.Model() as model:
        alpha_theta_a = pm.Normal('alpha_theta_a', 1, 1)
        sig_theta_a = pm.Exponential('sig_theta_a', 1)
        mu_theta_a = pm.Normal('mu_theta_a', 2, 2)
        theta_a = pm.SkewNormal('theta_a', mu=mu_theta_a, sigma=sig_theta_a, alpha=alpha_theta_a)
        sig_a = pm.Exponential('sig_a', 1)
        Ra = pm.Lognormal('Ra', theta_a, sig_a, observed=d)
        RevA = pm.Deterministic('Revenue', np.exp(theta_a + 0.5*sig_a**2))
        total_revA = pm.Deterministic('total_revenue', RevA * len(d))

        # sampling
        prior = pm.sample_prior_predictive()
        trace = pm.sample(1000, tune=1000)
        posterior_predictive = pm.sample_posterior_predictive(trace)

    # Save results to xarray object
    data = az.from_pymc3(
        trace=trace,
        prior=prior,
        posterior_predictive=posterior_predictive,
        model=model
    )
    pm.summary(data.posterior)
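On the expectation part, one possible starting point: if the log of the data is skew-normal with location ξ, scale ω and shape α, the mean of the data follows from the skew-normal moment generating function evaluated at t = 1, which gives E[exp(X)] = 2 · exp(ξ + ω²/2) · Φ(δω) with δ = α / sqrt(1 + α²). The sketch below uses only NumPy and SciPy to compare that expression with a simulated sample mean and with the plain lognormal formula used for RevA. The helper name `log_skew_normal_mean`, the seed and the larger sample size are illustrative assumptions, not part of the model above.

```python
import numpy as np
import scipy.stats as scs


def log_skew_normal_mean(loc, scale, alpha):
    """E[exp(X)] for X ~ SkewNormal(loc, scale, alpha), via the skew-normal MGF at t=1."""
    delta = alpha / np.sqrt(1.0 + alpha**2)
    return 2.0 * np.exp(loc + 0.5 * scale**2) * scs.norm.cdf(delta * scale)


# Same simulation parameters as in the question; the seed and the larger
# sample size are assumptions made here so the comparison is stable.
alpha, loc, scale = 1.5, 4.8, 1.013
rng = np.random.default_rng(0)
d_log_scale = scs.skewnorm(alpha, loc=loc, scale=scale).rvs(size=200_000, random_state=rng)
d = np.exp(d_log_scale)

analytic_mean = log_skew_normal_mean(loc, scale, alpha)
lognormal_style_mean = np.exp(loc + 0.5 * scale**2)  # the form used for RevA

print("sample mean of d:     ", d.mean())
print("log-skew-normal mean: ", analytic_mean)
print("lognormal-style mean: ", lognormal_style_mean)
```

With alpha = 0 the factor 2 · Φ(0) equals 1 and the expression reduces to the usual lognormal mean, so for positive skew the lognormal formula underestimates the expected revenue at the same location and scale. Inside the PyMC3 model this would presumably mean placing the `pm.SkewNormal` likelihood directly on `np.log(d)` and computing the deterministic from its `mu`, `sigma` and `alpha`, though I have not verified that variant here.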