Separation between model and data/parameters

bob-carpenter · July 23, 2024, 3:19pm

If you generate x ~ normal(mu, sigma) and sort, you get the same distribution as if you have an ordering constraint and apply the same distribution to the ordered variables.

We think the same way in Stan. It’s just a matter of how you’re defining the forward model. In Stan, it looks like this:

parameters {
  ordered[K] y;
}
model {
  y ~ normal(mu, sigma);
}

It’d be the same with the ordered transformation that @jessgrabowski mentioned.

This can be a problem with constraints. The sorting trick is actually how to do Monte Carlo sampling most easily in this setting. You see the same thing in Stan with something like sum-to-zero constraints, which we don’t have built in. Or an ICAR model on spatial effects. It’s not easy to generate a random sum-to-zero or ICAR variate with Monte Carlo, so we have to resort to MCMC for prior and posterior checks (but then we can’t interleave discrete sampling).

P.S. Hope it’s OK to jump on your forums. I’m starting to work with normalizing flows in JAX and need a way to generate JAX models for testing, so I’m checking out PyMC and Pyro.

P.P.S. The original design for Stan assumed a graphical modeling framework like BUGS, but then I realized it was easy to add imperative features to make coding within Stan easier. We’ve come to realize that we gave up a lot of automation through that decision. I really like how data/parameter neutral and graphically-oriented Turing.jl is, but I don’t use Julia.

ricardoV94 · July 23, 2024, 8:54pm

What do you mean? Something about model being defined separately form the observations/ constraining transformations? Or something else?

Haha definitely OK

bob-carpenter · July 24, 2024, 3:13pm

Exactly. It’s like BUGS that way. Whereas in PyMC, you have things like Normal("y", mu=intercept + slope * x, sigma=sigma, observed=y) where observed=y is part of the model code for observations whereas something like y = Normal(...) would be used for parameters.

I just did a search through your examples and see you have an example

data = pm.MutableData("data", observed_data[0])
...
pm.Normal("y", mu=mu, sigma=1, observed=data)

which gives you some of what I was looking for in terms of allowing you to plug-and-play different data in the same model (I think). But it still builds observed=data in. So what I was wondering if there’s a way you could write a single PyMC model that you could use for data simulation to do something like simulation based calibration.

The other thing BUGS does which is super-convenient is allow partially instantiated variables (R directly supports this, too, which is costly in compute but useful). So you can do posterior prediction in a regression of y on x by having a partially known y and fully known x. In Stan this all has to get declared up front. We precompile the C++ class for the model with all of the data, parameters and generated quantities declared as such, but we don’t construct an instance until we have data. New data means an entirely new model instance because our models are immutable once constructed.

ricardoV94 · July 24, 2024, 3:59pm

@bob-carpenter I moved the thread to its own topic.

You can use pm.observe to define observed nodes in pre-existing model thse days. You can also do a lot of model mutation with pm.do, including setting parameters to specific values, taking prior predictive draws and then conditioning on those for calibration workflows.

Our README shows an example of such workflow: pymc/README.rst at main · pymc-devs/pymc · GitHub

ricardoV94 · July 24, 2024, 4:38pm

Colab that was used as the basis for that readme section: Google Colab

ricardoV94 · July 24, 2024, 4:41pm

I didn’t follow this part. Could you give it another try?

bob-carpenter · July 24, 2024, 5:42pm

BUGS lets you have missing data, for example,

y = c(3.2, 1.9, NA, NA)
x = c(1.5, 2.8, 3.1, -1.3)

A BUGS model like

y ~ dnorm(alpha + beta * x, sigma)
alpha ~ dnorm(0, 1) 
beta ~ dnorm(0, 1);
sigma ~ dlnorm(0, 1)

and use the above x, y as data, then you get posterior predictive inference for y[3] and y[4]. If you write a probabilistic model for x, then you can impute missing covariates. It’s super convenient.

ricardoV94 · July 24, 2024, 5:51pm

Ahh, I see. PyMC does exactly the same when it finds nan in the observation (but not yet when observations are set via the pm.observe command).

Topic		Replies	Views
How to generate an increasing sequence? version agnostic	10	120	July 24, 2024
Building PyMC model programatically version agnostic modeling	4	560	November 23, 2022
Simple generative model, but divergent Questions	0	481	November 4, 2020
Specifying prior parameter as input in data section? Questions	2	286	November 18, 2021
Did anybody replicate the ADVI examples in the "Automatic Variational Inference in Stan" paper? Questions	4	1772	April 9, 2024

Separation between model and data/parameters

Related topics