I have two separate datasets. I want to build a posterior distribution based on an Asymmetric Laplace distribution for the first and a Pareto distribution for the second.
In both cases, I would take the distribution's parameters and then empirically find the distribution that best fits each parameter's values, to use as its prior. My problem is: how do I go from the raw data to the parameters I need, namely b, kappa and mu for the Asymmetric Laplace and alpha and m for the Pareto?
For reference, if I were modeling a beta distribution, there is an explicit formula to go from the mean (mu) and standard deviation (sigma) of the dataset to alpha and beta. I don't see an analogue for the distributions described above.
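To make the beta case concrete, here is a minimal sketch of the moment-matching formula I have in mind. The function name `beta_params_from_moments` and the sample values are made up for illustration, and the sketch assumes the data lie strictly between 0 and 1 and that the population variance is used.

```python
from statistics import fmean, pvariance


def beta_params_from_moments(data):
    """Method-of-moments estimates (alpha, beta) for data in (0, 1).

    Hypothetical helper: matches the sample mean and (population)
    variance to the mean and variance of a beta distribution.
    """
    mu = fmean(data)
    var = pvariance(data, mu)
    # The formula is only valid when 0 < var < mu * (1 - mu).
    if not 0 < var < mu * (1 - mu):
        raise ValueError("moments are not compatible with a beta distribution")
    common = mu * (1 - mu) / var - 1
    return mu * common, (1 - mu) * common


# Made-up values, only to show the call.
sample = [0.2, 0.35, 0.4, 0.5, 0.55, 0.7]
alpha_hat, beta_hat = beta_params_from_moments(sample)
print(alpha_hat, beta_hat)
```

What I am looking for is the equivalent of this data-to-parameters step for the other two distributions.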
As per the request in the comments, I am adding example code for clarity:
from fitter import Fitter
import numpy as np
import pandas as pd
import pymc3 as pm
# Fitter looks distributions up by their scipy.stats names
pymc3_avail_distr = ['norm', 'halfnorm', 'beta', 'expon', 'laplace',
                     'laplace_asymmetric', 't', 'cauchy', 'halfcauchy',
                     'gamma', 'invgamma', 'lognorm', 'chi2',
                     'wald', 'pareto', 'vonmises', 'triang',
                     'rice', 'logistic']
def get_best_distr(data):
    # Fit every candidate and return the best-fitting one with its parameters
    f = Fitter(data, distributions=pymc3_avail_distr)
    f.fit()
    res = f.get_best()
    return res
df_for_prior = pd.DataFrame(np.random.uniform(low=1700000, high=1900000, size=(22, 5)),
                            index=np.arange(1998, 2020))
# transform the values in df_for_prior to get the Asymmetric Laplace parameters
# this is the step I am missing
# run the fitter function on each parameter to get the best-fitting distribution,
# along with the parameter values to use for specifying the priors
obs = np.array([1.726567e+06, 1.589836e+06, 1.643981e+06, 1.584314e+06])
with pm.Model() as model1:
    b = ...      # placeholder: some distribution based on results above
    kappa = ...  # placeholder, as above
    mu = ...     # placeholder, as above
    # Based on distribution of raw data
    pred = pm.AsymmetricLaplace("modeling_var", b=b, kappa=kappa, mu=mu, observed=obs)
    trace = pm.sample(1000, tune=800)
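The same data-to-parameters step is needed for the second dataset. For the Pareto case only, a closed-form maximum-likelihood estimate exists, so a minimal sketch could look like the following. It assumes maximum likelihood is an acceptable estimator here and that all values are positive. The helper name `pareto_mle` and the sample values are hypothetical, and this does not cover the Asymmetric Laplace case.

```python
import math


def pareto_mle(data):
    """Closed-form maximum-likelihood estimates (alpha, m) for a Pareto sample.

    Hypothetical helper: m is estimated by the sample minimum and
    alpha by n / sum(log(x_i / m)).
    """
    if len(data) < 2 or min(data) <= 0:
        raise ValueError("need at least two strictly positive values")
    m_hat = min(data)
    log_ratio_sum = sum(math.log(x / m_hat) for x in data)
    if log_ratio_sum == 0:
        raise ValueError("all values are identical; alpha is not identifiable")
    alpha_hat = len(data) / log_ratio_sum
    return alpha_hat, m_hat


# Made-up values, only to show the call.
sample = [2.0, 2.5, 3.1, 4.0, 8.0]
alpha_hat, m_hat = pareto_mle(sample)
print(alpha_hat, m_hat)
```

Applied to each column of a frame like `df_for_prior`, this would give one (alpha, m) pair per column, which could then be passed to `get_best_distr` parameter by parameter.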