Maximizing custom likelihood


How can I find the above in pymc3?
For example, suppose p(a) is the prior distribution over the concentration parameters a of a Dirichlet, p(x | a) is the Dirichlet density, and p(\theta | x) is some likelihood that depends on x. How can I marginalise over x and then find the maximiser of the resulting likelihood in a?



You can try using a custom theano Op to do numerical integration. Write down your model in PyMC3 as usual, then get model.logpt (a tensor representing \log(p(\theta \mid x)\, p(x \mid a)\, p(a))), use the custom theano numerical-integration Op to integrate it over x, and then maximize the resulting function.
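To illustrate the idea without the theano machinery, here is a minimal sketch that marginalises a latent variable numerically and then maximises over the parameter, using scipy instead of a custom theano Op. The densities are illustrative stand-ins (a 1-D Normal latent in place of the Dirichlet), and `theta_obs` is a made-up observation; this is not the thread's actual model.

```python
# Sketch: L(a) = \int p(theta | x) p(x | a) dx, approximated with
# scipy.integrate.quad, then maximised over a with scipy.optimize.
# All distributions here are illustrative stand-ins.
import numpy as np
from scipy import integrate, optimize, stats

theta_obs = 2.0  # hypothetical observed data point

def marginal_loglik(a):
    # integrand: p(theta | x) * p(x | a), with stand-in Normal densities
    integrand = lambda x: (stats.norm.pdf(theta_obs, loc=x, scale=0.1)
                           * stats.norm.pdf(x, loc=a, scale=1.0))
    val, _ = integrate.quad(integrand, -10, 10)  # marginalise over x
    return np.log(val)

# maximise the marginal likelihood in a (here the analytic optimum is
# a = theta_obs, since the marginal is Normal(theta; a, sqrt(1 + 0.01)))
res = optimize.minimize_scalar(lambda a: -marginal_loglik(a),
                               bounds=(-5, 5), method='bounded')
```

The same structure carries over to the PyMC3 case: the integrand would be `exp(model.logpt)` evaluated on a grid over x, wrapped in a theano Op so that the outer maximisation can use gradients.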

Not sure how straightforward it is though…


That’ll run? OK I’ll try it, post code here and update.

But before that, can you explain what this code here is doing?

import pymc3 as pm
import theano.tensor as tt
import pandas as pd
import numpy as np

pred_arr = [[2, 3, 4], [5, 1, 9]]
data = [2, 3, 4]  # observed values; must match the shape of mu below
with pm.Model() as model:
    a_vec = pm.HalfNormal('la', sd=100, shape=3)
    prior = pm.Dirichlet('a', a=a_vec, shape=3)
    # Dirichlet draws already lie on the simplex; the renormalisation is redundant
    det = pm.Deterministic('truval', prior / tt.sum(prior))
    likelihood = pm.Normal('y', mu=det, sd=0.1, observed=np.array(data))
    abra = pm.find_MAP()

Presumably this isn’t doing what I wrote in the first post. Does it just find the alphas that maximise the probability of the values on the simplex which in turn maximise the likelihood of the data?

Are there any other tools that you know of that could help solve this problem? Could I write a computational graph representation of it and do the maximisation?

Thank you,
Abhishek Cherath.


No, this isn’t doing what you wrote above. Your code is computing \operatorname{argmax}_{(a,\, x)} \log(p(\theta \mid x)\, p(x \mid a)\, p(a)), i.e. maximizing the logp with respect to all the input parameters jointly.

In general, you need to find a way to approximate the integral, which is not an easy task. Numerical integration as above is one option; otherwise there are approaches such as the Laplace approximation, or Monte Carlo simulation (writing the integral as an expectation) with tricks like importance sampling, etc.
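The Monte Carlo route can be sketched in a few lines: write the integral as an expectation, \int p(\theta \mid x)\, p(x \mid a)\, dx = E_{x \sim p(x \mid a)}[\, p(\theta \mid x)\,], and estimate it by sampling. The distributions below are illustrative stand-ins, not the Dirichlet model from the thread.

```python
# Monte Carlo estimate of the marginal likelihood as an expectation:
#   \int p(theta | x) p(x | a) dx = E_{x ~ p(x|a)}[ p(theta | x) ]
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta_obs = 2.0  # hypothetical observation

def mc_marginal_lik(a, n=100_000):
    x = rng.normal(loc=a, scale=1.0, size=n)            # draw x ~ p(x | a)
    return stats.norm.pdf(theta_obs, loc=x, scale=0.1).mean()  # avg p(theta | x)
```

When p(\theta \mid x) is sharply peaked relative to p(x \mid a), the plain estimator has high variance; that is exactly where importance sampling (drawing x from a proposal concentrated near the peak and reweighting) helps.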


I think there might be ways to do what you want using the posterior samples. At least for estimating the marginal likelihood (as a single number, not as a function of a as you want), there are multiple approaches (e.g., see my blog post on this).