Let’s say I’ve generated a bivariate gamma from a gaussian copula model, such as:
import numpy as np
import pymc3 as pm
import scipy as sp
import matplotlib.pyplot as plt
import seaborn as sns
sigma=np.array([[1,-.8],[-.8,1]])
alphas=np.array([1,2])
betas=np.array([2,2])
N=1000
u=np.random.multivariate_normal((0,0),sigma,size=N)
c=sp.stats.norm.cdf(u)
generated=sp.stats.gamma.ppf(c,alphas,betas)
joint-plot for reference:
sns.jointplot(generated[:, 0],generated[:, 1], kind='kde', stat_func=None)
Now I want to fit a copula model with pymc3. In my understanding this would mean doing something like:
with pm.Model() as model:
chol_packed = pm.LKJCorr('chol_packed',n=2, eta=2)
chol = pm.expand_packed_triangular(2, chol_packed)
sigma = pm.Deterministic('sigma', chol.dot(chol.T))
us_dist = pm.MvNormal.dist(mu=np.zeros(2), chol=chol,shape=2)
alphas=pm.HalfNormal('alphas',1,shape=2)
betas=pm.HalfNormal('betas',1,shape=2)
ps=gamma_cdf(generated,alphas,betas)
logp=pm.Potential('logp',us_dist.logp(pm.math.probit(ps)).sum())
where gamma_cdf is a theano-functioning gamma cumulative density function implementation which unfortunately there’s not any (both in theano and pymc3). Moreover I have to mention that I need to fit this model in a large dataset with ADVI therefore i need also the grads.
So it all boils down to have the (lower) incomplete gamma function implemented as a theano ops in order to calculate the cdf:
This can be done (quite) easily wrapping up the scipy special funciton gammainc but the big issue is around getting the derivatives in. Some reference here; as you can see the partial derivative wrt s involves a Meijer G-function. This is where I got stuck…
Is there any way out?
The nearest thing already implemented in pymc3 is the beta incomplete function that is somehow related (but I haven’t found much reference about this).
Any help would be appreciated.
Thank you