Building PyMC model programatically

japamat · November 20, 2022, 10:08pm

I am new to PyMC but have some experience working with Stan models.
Unfortunately, Stan support in python is not great, and so I am exploring PyMC.

For this project, I would like to build a model programmatically. That is, the user can select prior distributions and parameters for the distributions with a graphical user interface; the model is build according to the specifications and fit to data. So, the model can change and there is a large combination of possible model configurations. For the Stan-based implementation, I have a python function that generates Stan code, which is then compiled and executed. Now, I am wondering if my PyMC approach is valid/fast.

Here is a very simple example of how I would approach building a PyMC model with different priors (and later other differences in the model).
Using if statements outside the model context, model parts are added step-by-step:

import numpy as np
import pymc as pm

use_uniform = True # set by the user

obs = np.random.normal(size=100) + 2.3

mymodel = pm.Model()

if use_uniform:
    with mymodel:
        mean = pm.Uniform('mean', lower=-10, upper=10)
else:
    with mymodel:
        mean = pm.Normal('mean', mu=0, sigma=10)

with mymodel:
    y = pm.Normal('y', mu=mean, sigma=2.0, observed=obs)

    idata = pm.sample(2000)

Is this the right approach, or is there a better one?

ckrapu · November 21, 2022, 4:02am

I’ve done things like this before, and understand why you’d like to have model logic which you can toggle on and off. My 2 cents: either add logging info or make your variable names (e.g. ‘mean’ in pm.Uniform('mean',...) include something about the model configuration so you can more easily debug / trace where your results come from, in case you have multiple traces or estimates to compare.

If you can represent your model as the sum of multiple RVs, then you can use a pattern like below which can help with understanding the flow and keeping the clutter down.

if mode == 'hierarchical':
    mu += pm.Normal('group_effect',...)
if mode == 'autocorrelated':
    mu += pm.GRW('dynamic_intercept',...)

japamat · November 22, 2022, 12:00am

Thanks, that is very useful. I agree with the point about adding information for better debugging, which I haven’t yet had the pleasure to do (but which will likely be easier than debugging a pystan model).

The resulting model will be somewhat complex. It is for an application in the geosciences and certain parts, like optional use of a lookup table, will require being able to modify more than just the RVs or their priors. The approach of adding pieces of code to the model context, as outlined in the first post, should still work.

ricardoV94 · November 22, 2022, 1:47pm

@japamat Your approach should be perfectly fine

japamat · November 23, 2022, 8:16pm

Excellent, thanks for the comment! I will go ahead with that approach.

Topic		Replies	Views
Smooth Transition to PyMC3 v5	1	148	June 24, 2024
Difference in results between PyMC and STAN version agnostic modeling	4	814	July 5, 2022
Porting stan model to PyMC version agnostic	14	851	August 5, 2022
Different values between pymc3 and stan version agnostic modeling	2	630	March 31, 2022
Translating a Stan model to the most performant PyMC3 form Questions	1	746	October 27, 2021

Building PyMC model programatically

Related topics