# Empirical Bayes - estimation of parameters with mixture of gamma priors and observed data

Hello,

I am trying to calculate a disproportionality measure using an Empirical Bayes method. A really good work has been done in SAS to achieve the goal and available here : Disproportionality-Measures-with-SAS/ebgm.sas at master · statmike/Disproportionality-Measures-with-SAS · GitHub

I am trying to replicate this method in Python using Pymc library. However I’m can’t get the MCMC to work to estimate the parameters. The distribution is assumed to be a mixture of two gamma distribution (priors) and the data (N11, and E11 observed) allow to generate the posterior. I get an output in python but it is not aligned with the SAS results.

In particular here is the SAS code I am trying to replicate in Python :

ods output PostSumInt = work.MCMC_PARMS;
proc mcmc data=&outds. seed=32259 nmc=10000 thin=5 nthread=6 propcov=quanew monitor=(Mix_p alpha1 alpha2 beta1 beta2);
parms Mix_p 0.3333 alpha1 .2 alpha2 2 beta1 .1 beta2 4;

/**Generic gamma priors*/
prior alpha1 beta1 ~ gamma(1, iscale=1);
prior alpha2 beta2 ~ gamma(1, iscale=1);
prior Mix_p ~ uniform(0,1);

const1=lgamma(alpha1+N11) - lgamma(alpha1) - lgamma(N11+1);
const2=lgamma(alpha2+N11) - lgamma(alpha2) - lgamma(N11+1);
LogL1=const1 - N11*log(1+beta1/E11) - alpha1*log(1+E11/beta1);
LogL2=const2 - N11*log(1+beta2/E11) - alpha2*log(1+E11/beta2);
llike = log(Mix_p*exp(LogL1) + (1-Mix_p)*exp(LogL2));
model N11 ~ general(llike);
run;


Here is my code in Python :

with pm.Model() as model2:

def likelihood(theta,E11,N11):
Mix_p,alpha1,alpha2,beta1,beta2 = theta

const1=at.gammaln(alpha1+N11) - at.gammaln(alpha1) - at.gammaln(N11+1)
const2=at.gammaln(alpha2+N11) - at.gammaln(alpha2) - at.gammaln(N11+1)
LogL1=const1 - N11*np.log(1+beta1/E11) - alpha1*np.log(1+E11/beta1)
LogL2=const2 - N11*np.log(1+beta2/E11) - alpha2*np.log(1+E11/beta2)

return(np.log(Mix_p*np.exp(LogL1) + (1-Mix_p)*np.exp(LogL2)))

Mix_p = pm.Uniform('Mix_p', 0, 1)

alpha1 = pm.Exponential('alpha1', lam=0.2)
beta1 = pm.Exponential('beta1', lam=0.1)

alpha2 = pm.Exponential('alpha2', lam=2)
beta2 = pm.Exponential('beta2', lam=4)

#g1 = pm.Gamma('gamma1',alpha = alpha1, beta = beta1)
#g2 = pm.Gamma('gamma2',alpha = alpha2, beta = beta2)

theta = (Mix_p,alpha1,alpha2,beta1,beta2)

like = pm.Potential('likelihood',likelihood(theta,E11,N11))


I am not very familiar with Pymc so I have tried to user other people experience on other feeds but I don’t seem to get the right result.

Thank you in advance
Victor

Hi I’m also trying to replicate that sas code. Have you found a solution to this. if yes can you please share your code.

If I understand correctly, the observed N is actually from a mixture of beta-binomial distribution? (formula 6 from DuMouchel, W. (1999)): If that’s the case, expressing it as such is much easier:

with pm.Model() as m:
Mix_p = pm.Uniform('Mix_p', 0, 1)

alpha1 = pm.Exponential('alpha1', lam=0.2)
beta1 = pm.Exponential('beta1', lam=0.1)

alpha2 = pm.Exponential('alpha2', lam=2)
beta2 = pm.Exponential('beta2', lam=4)

# g1_dist = pm.Gamma.dist(alpha=alpha1, beta=beta1)
# g2_dist = pm.Gamma.dist(alpha=alpha2, beta=beta2)
# lmbda = pm.Mixture('lmbda', w = [Mix_p, 1-Mix_p], comp_dists = [g1_dist, g2_dist])

comp1_dist = pm.BetaBinomial.dist(n=E11, alpha=alpha1, beta=beta1)
comp2_dist = pm.BetaBinomial.dist(n=E11, alpha=alpha2, beta=beta2)
obs = pm.Mixture('obs', w = [Mix_p, 1-Mix_p], comp_dists = [comp1_dist, comp2_dist], observed=N11)
idata = pm.sample()


And then after inference you can use the posterior of Mix_p, alpha1, alpha2, beta1, beta2 to reconstruct the posterior of \lambda following formula 6 & 7 in the paper. 