Using a custom step method for latent mixing variables

I’m trying to model a fairly complicated self-exciting spatio-temporal point process model. The big details are in a preprint, but broadly, any event is seen as either drawn from the background (a fixed spatial distribution of events) or triggered by a previous event (a distribution centered at the previous event).

In maximum likelihood it’s fit a lot like a standard mixture model, using expectation-maximization. Get expected mixing proportions given current parameters, use these to maximize the expected log-likelihood to get new parameters, and iterate until convergence.

I’m trying to build a Bayesian version of this. There is a method for these kinds of models which proceeds along similar lines:

  1. Draw a possible mixture structure. That is, assign each event to either the background or to a prior event which triggered it, using the current parameter value estimates.
  2. Condition on this mixture structure and draw from the posterior of each parameter after this conditioning. (The conditioning makes the posteriors much more tractable.)
  3. Return to step 1 and repeat.

I’d like to implement a version of this in PyMC3 so my method is more flexible than my current hard-coded version.

To do this, I gather I’d need to write a custom step method which implements step 1, drawing the mixing structure, so I can use that structure when writing out step 2. I already have a very fast implementation of step 1 using Cython.

The PyMC3 documentation says it “easily incorporates custom step methods”, but I can’t find any examples or further documentation on how to do this. How can I build a sampler which uses a custom method to draw the mixture structure, so I can then use PyMC3’s built-in samplers for the parameters of the model?

(And is this even a reasonable approach to the problem?)



I think “easily incorporates custom step methods” might be an overstatement… we should probably change that in the doc…
One of the potential solutions I can think of (instead of implementing a new step method), is to take advantage of the pm.SMC. The key in SMC is a random method. So in your case, you can write a discrete distribution for your mixture structure, with a random method that generates possible mixture structure (you will also need a logp).
However, this won’t work if you don’t have a logp. In those case, a better solution is a likelihood free inference (e.g.,, we dont have it yet in PyMC3).

By logp do you mean the full log-likelihood? If so, that’s easy enough to calculate, although I’m not sure I understand how SMC works; I’ll need to read the paper.

An alternate approach I considered was simply sampling the mixture structure in my own code, then drawing one sample from the posterior of the remaining parameters given that structure – that is, only using PyMC3 for one sample at a time, and using that sample to draw a new structure. But I’m not sure if that’s reasonable; my simple model now has conjugate posteriors, so I can get exact posterior draws, but if I use MCMC then taking a single draw makes less sense to me.

Alex, did you manage to implement your model in pymc3? If so, could you share how you did it?

I did not – I ended up writing a custom Metropolis-Hastings sampler which draws the mixture structure on each iteration, then does the fairly simple draws from the parameters while conditioning on this structure.

(I also tried implementing the model in Stan directly, without using the latent mixture structure at all, but it is quite slow. The log-likelihood has O(n^2) terms in it, and I think Stan doesn’t like the cost of evaluating its gradients.)