Profiling the CategoricalGibbsMetropolis sampler

For models with three or more categorical variables and potentially missing data, I compared the sampling time of the data-complete model (no latent discrete variables) against the data-missing model (many missing discrete variables). The latter takes considerably longer to sample (~5-10x), and the cost of the CategoricalGibbsMetropolis sampler appears to be the reason. For context, here’s a minimal model that uses both the continuous and discrete samplers:

import numpy as np
import pymc3 as pm

n = 1000

with pm.Model() as model:
    # NUTS handles the continuous p; CategoricalGibbsMetropolis handles the discrete y
    p = pm.Dirichlet('p', a=np.ones(n))
    y = pm.Categorical('y', p=p)
    trace = pm.sample()

I’ve spent some time looking through the PyMC implementation and am unsure which pieces take the most time to execute and what might be optimized. What profiling strategy would help shine light on this? I’ve been working in Jupyter, so any extensions there are fair game.
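The approach I've been considering is wrapping the sampling call in the standard-library profiler and filtering the report down to the hot functions. Here is a minimal sketch of that pattern using a stand-in workload (actually running `pm.sample()` inline here would be slow; in practice the `run_sampler` call below would be replaced by the sampling call inside the model context):

```python
import cProfile
import io
import pstats


def slow_step():
    # Stand-in for one expensive sampler step; in practice the
    # profiled call would be pm.sample(...) inside the model context.
    return sum(i * i for i in range(50_000))


def run_sampler(draws):
    # Stand-in for the sampling loop.
    return [slow_step() for _ in range(draws)]


profiler = cProfile.Profile()
profiler.enable()
run_sampler(20)
profiler.disable()

# Sort by cumulative time and filter the report by a substring;
# filtering on 'pymc3' (or 'step_methods') would narrow the output
# to the sampler internals.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats('cumulative')
stats.print_stats('slow_step')
report = stream.getvalue()
print(report)
```

In Jupyter, the `%%prun -s cumulative` cell magic wraps the same machinery around the cell body, which may be the most convenient way to apply this to a cell containing `pm.sample()`. Note this only profiles the Python-level step-method code; if I understand correctly, PyMC3 also exposes `model.profile(model.logpt)` for profiling the compiled Theano log-probability function separately.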

EDIT: I should also note that this does not appear to stem from the posterior geometry under the alternating Gibbs / NUTS scheme; the slowdown persists when using HMC with a fixed number of steps.