Building PyMC model programatically

I am new to PyMC but have some experience working with Stan models.
Unfortunately, Stan support in python is not great, and so I am exploring PyMC.

For this project, I would like to build a model programmatically. That is, the user can select prior distributions and parameters for the distributions with a graphical user interface; the model is build according to the specifications and fit to data. So, the model can change and there is a large combination of possible model configurations. For the Stan-based implementation, I have a python function that generates Stan code, which is then compiled and executed. Now, I am wondering if my PyMC approach is valid/fast.

Here is a very simple example of how I would approach building a PyMC model with different priors (and later other differences in the model).
Using if statements outside the model context, model parts are added step-by-step:

import numpy as np
import pymc as pm

use_uniform = True # set by the user

obs = np.random.normal(size=100) + 2.3

mymodel = pm.Model()

if use_uniform:
    with mymodel:
        mean = pm.Uniform('mean', lower=-10, upper=10)
else:
    with mymodel:
        mean = pm.Normal('mean', mu=0, sigma=10)

with mymodel:
    y = pm.Normal('y', mu=mean, sigma=2.0, observed=obs)

    idata = pm.sample(2000)

Is this the right approach, or is there a better one?

I’ve done things like this before, and understand why you’d like to have model logic which you can toggle on and off. My 2 cents: either add logging info or make your variable names (e.g. ‘mean’ in pm.Uniform('mean',...) include something about the model configuration so you can more easily debug / trace where your results come from, in case you have multiple traces or estimates to compare.

If you can represent your model as the sum of multiple RVs, then you can use a pattern like below which can help with understanding the flow and keeping the clutter down.

if mode == 'hierarchical':
    mu += pm.Normal('group_effect',...)
if mode == 'autocorrelated':
    mu += pm.GRW('dynamic_intercept',...)

Thanks, that is very useful. I agree with the point about adding information for better debugging, which I haven’t yet had the pleasure to do (but which will likely be easier than debugging a pystan model).

The resulting model will be somewhat complex. It is for an application in the geosciences and certain parts, like optional use of a lookup table, will require being able to modify more than just the RVs or their priors. The approach of adding pieces of code to the model context, as outlined in the first post, should still work.

@japamat Your approach should be perfectly fine

Excellent, thanks for the comment! I will go ahead with that approach.