Empirical Bayes - estimation of parameters with mixture of gamma priors and observed data

Hello,

I am trying to calculate a disproportionality measure using an Empirical Bayes method. Really good work has been done in SAS to achieve this goal, available here: Disproportionality-Measures-with-SAS/ebgm.sas at master · statmike/Disproportionality-Measures-with-SAS · GitHub

I am trying to replicate this method in Python using the PyMC library, but I can't get the MCMC to estimate the parameters correctly. The prior is assumed to be a mixture of two gamma distributions, and the observed data (N11 and E11) are used to generate the posterior. I get an output in Python, but it is not aligned with the SAS results.

In particular, here is the SAS code I am trying to replicate in Python:

ods output PostSumInt = work.MCMC_PARMS;
	proc mcmc data=&outds. seed=32259 nmc=10000 thin=5 nthread=6 propcov=quanew monitor=(Mix_p alpha1 alpha2 beta1 beta2);
		parms Mix_p 0.3333 alpha1 .2 alpha2 2 beta1 .1 beta2 4;
	
		/**Generic gamma priors*/
		prior alpha1 beta1 ~ gamma(1, iscale=1);
		prior alpha2 beta2 ~ gamma(1, iscale=1);
		prior Mix_p ~ uniform(0,1);
	
		const1=lgamma(alpha1+N11) - lgamma(alpha1) - lgamma(N11+1);
		const2=lgamma(alpha2+N11) - lgamma(alpha2) - lgamma(N11+1);
		LogL1=const1 - N11*log(1+beta1/E11) - alpha1*log(1+E11/beta1);
		LogL2=const2 - N11*log(1+beta2/E11) - alpha2*log(1+E11/beta2);
		llike = log(Mix_p*exp(LogL1) + (1-Mix_p)*exp(LogL2));
		model N11 ~ general(llike); 
	run;
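
For reference, the llike term above is just the log of a two-component mixture likelihood evaluated pointwise. Here is a plain NumPy version of the same expression, which I find handy for cross-checking a Python port against SAS at fixed parameter values (the parameter values below are only the SAS starting values from the parms statement, and the N11/E11 pairs are made up):

```python
import numpy as np
from scipy.special import gammaln

def mixture_loglik(N11, E11, Mix_p, alpha1, beta1, alpha2, beta2):
    """Log-likelihood mirroring the llike expression in the SAS program above."""
    logL1 = (gammaln(alpha1 + N11) - gammaln(alpha1) - gammaln(N11 + 1)
             - N11 * np.log(1 + beta1 / E11) - alpha1 * np.log(1 + E11 / beta1))
    logL2 = (gammaln(alpha2 + N11) - gammaln(alpha2) - gammaln(N11 + 1)
             - N11 * np.log(1 + beta2 / E11) - alpha2 * np.log(1 + E11 / beta2))
    # log(p*exp(logL1) + (1-p)*exp(logL2)), computed stably
    return np.logaddexp(np.log(Mix_p) + logL1, np.log1p(-Mix_p) + logL2)

# Hypothetical (N11, E11) pairs, evaluated at the SAS starting values
N11 = np.array([1, 5, 20])
E11 = np.array([0.8, 3.2, 4.1])
print(mixture_loglik(N11, E11, 0.3333, 0.2, 0.1, 2.0, 4.0))
```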

Here is my code in Python:

import numpy as np
import pymc as pm
import aesara.tensor as at

# N11, E11: observed and expected counts (NumPy arrays), defined elsewhere

with pm.Model() as model2:

    def likelihood(theta, E11, N11):
        Mix_p, alpha1, alpha2, beta1, beta2 = theta

        # Same terms as in the SAS program, written with Aesara ops
        # (mixing np.log/np.exp with tensor variables was my original bug)
        const1 = at.gammaln(alpha1 + N11) - at.gammaln(alpha1) - at.gammaln(N11 + 1)
        const2 = at.gammaln(alpha2 + N11) - at.gammaln(alpha2) - at.gammaln(N11 + 1)
        LogL1 = const1 - N11 * at.log(1 + beta1 / E11) - alpha1 * at.log(1 + E11 / beta1)
        LogL2 = const2 - N11 * at.log(1 + beta2 / E11) - alpha2 * at.log(1 + E11 / beta2)

        return at.log(Mix_p * at.exp(LogL1) + (1 - Mix_p) * at.exp(LogL2))

    Mix_p = pm.Uniform('Mix_p', 0, 1)

    alpha1 = pm.Exponential('alpha1', lam=0.2)
    beta1 = pm.Exponential('beta1', lam=0.1)

    alpha2 = pm.Exponential('alpha2', lam=2)
    beta2 = pm.Exponential('beta2', lam=4)

    # g1 = pm.Gamma('gamma1', alpha=alpha1, beta=beta1)
    # g2 = pm.Gamma('gamma2', alpha=alpha2, beta=beta2)

    theta = (Mix_p, alpha1, alpha2, beta1, beta2)

    like = pm.Potential('likelihood', likelihood(theta, E11, N11))
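
I then sample and summarize roughly like this (standard PyMC/ArviZ calls; the draw counts below are just an arbitrary rough analogue of the SAS settings nmc=10000 with thin=5, i.e. 2000 kept draws):

```python
import arviz as az

with model2:
    idata = pm.sample(draws=2000, tune=1000, random_seed=32259)

print(az.summary(idata, var_names=['Mix_p', 'alpha1', 'alpha2', 'beta1', 'beta2']))
```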

I am not very familiar with PyMC, so I have tried to use other people's experience from other threads, but I don't seem to get the right result.

Thank you in advance
Victor

Hi, I'm also trying to replicate that SAS code. Have you found a solution to this? If yes, can you please share your code?

Thanks in advance

If I understand correctly, the observed N actually follows a mixture of two negative binomial distributions (formula 6 from DuMouchel, W. (1999)):

P(N = n) = P·f(n; α1, β1, E) + (1 − P)·f(n; α2, β2, E), where f(n; α, β, E) = (1 + β/E)^(−n) (1 + E/β)^(−α) Γ(α + n) / (Γ(α) n!)

If that’s the case, expressing it as such is much easier:

with pm.Model() as m:
    Mix_p = pm.Uniform('Mix_p', 0, 1)
    
    alpha1 = pm.Exponential('alpha1', lam=0.2)
    beta1 = pm.Exponential('beta1', lam=0.1)
    
    alpha2 = pm.Exponential('alpha2', lam=2)
    beta2 = pm.Exponential('beta2', lam=4)

    # g1_dist = pm.Gamma.dist(alpha=alpha1, beta=beta1)
    # g2_dist = pm.Gamma.dist(alpha=alpha2, beta=beta2)
    # lmbda = pm.Mixture('lmbda', w = [Mix_p, 1-Mix_p], comp_dists = [g1_dist, g2_dist])

    # Marginalizing lambda out of Poisson(lambda*E11) with a Gamma(alpha, beta)
    # prior gives a negative binomial with n=alpha and p=beta/(beta+E11)
    comp1_dist = pm.NegativeBinomial.dist(n=alpha1, p=beta1 / (beta1 + E11))
    comp2_dist = pm.NegativeBinomial.dist(n=alpha2, p=beta2 / (beta2 + E11))
    obs = pm.Mixture('obs', w=[Mix_p, 1 - Mix_p], comp_dists=[comp1_dist, comp2_dist], observed=N11)
    idata = pm.sample()

And then, after inference, you can use the posteriors of Mix_p, alpha1, alpha2, beta1, beta2 to reconstruct the posterior of \lambda following formulas 6 and 7 in the paper:
λ | N = n ~ Q_n·Gamma(α1 + n, β1 + E) + (1 − Q_n)·Gamma(α2 + n, β2 + E), where Q_n = P·f(n; α1, β1, E) / [P·f(n; α1, β1, E) + (1 − P)·f(n; α2, β2, E)]
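
For example, a rough sketch of that reconstruction from the posterior draws (assumes the idata from the model above; the single-cell N11/E11 values here are made up, and Gamma uses the rate parametrization, matching the formulas):

```python
import numpy as np
from scipy.stats import nbinom

# Flatten the posterior draws of each parameter
post = idata.posterior
p_, a1, b1, a2, b2 = (post[v].values.ravel()
                      for v in ['Mix_p', 'alpha1', 'beta1', 'alpha2', 'beta2'])

n, E = 12, 4.5  # hypothetical observed and expected counts for one cell

# f(n; alpha, beta, E) is a negative binomial pmf with n=alpha, p=beta/(beta+E)
f1 = nbinom.pmf(n, a1, b1 / (b1 + E))
f2 = nbinom.pmf(n, a2, b2 / (b2 + E))

# Q_n: posterior probability that lambda came from the first component
Q = p_ * f1 / (p_ * f1 + (1 - p_) * f2)

# Posterior mean of lambda: mixture of Gamma(alpha+n, beta+E) means
post_mean_lambda = Q * (a1 + n) / (b1 + E) + (1 - Q) * (a2 + n) / (b2 + E)
print(post_mean_lambda.mean())
```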