# Separation between model and data/parameters

If you generate `x ~ normal(mu, sigma)` and sort, you get the same distribution as if you had imposed an ordering constraint and applied the same distribution to the ordered variables.
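For concreteness, here's a small NumPy check of that equivalence (my illustration, not from the thread): sorting i.i.d. draws yields the joint distribution of the order statistics, so for example the top component should match the known order-statistic mean.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, K, draws = 0.0, 1.0, 2, 200_000

# Monte Carlo: draw i.i.d. normals and sort each row to get
# draws from the ordered (order-statistic) distribution.
x = rng.normal(mu, sigma, size=(draws, K))
x_sorted = np.sort(x, axis=1)

# For K = 2 standard normals, E[max] = 1/sqrt(pi) in closed form,
# so the sorted draws should reproduce the order-statistic mean.
print(x_sorted[:, 1].mean())  # ≈ 0.5642
```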

We think the same way in Stan. It’s just a matter of how you’re defining the forward model. In Stan, it looks like this:

```stan
parameters {
  ordered[K] y;
}
model {
  y ~ normal(mu, sigma);
}
```

It’d be the same with the ordered transformation that @jessgrabowski mentioned.

Constraints can make this tricky in general. The sorting trick is the easiest way to do Monte Carlo sampling in this setting. You see the same thing in Stan with something like sum-to-zero constraints, which we don’t have built in, or with an ICAR model on spatial effects. It’s not easy to generate a random sum-to-zero or ICAR variate with straight Monte Carlo, so we have to resort to MCMC for prior and posterior checks (but then we can’t interleave discrete sampling).

P.S. Hope it’s OK to jump on your forums. I’m starting to work with normalizing flows in JAX and need a way to generate JAX models for testing, so I’m checking out PyMC and Pyro.

P.P.S. The original design for Stan assumed a graphical modeling framework like BUGS, but then I realized it was easy to add imperative features to make coding within Stan easier. We’ve come to realize that we gave up a lot of automation through that decision. I really like how data/parameter neutral and graphically-oriented Turing.jl is, but I don’t use Julia.


What do you mean? Something about the model being defined separately from the observations/constraining transformations? Or something else?

Haha definitely OK

Exactly. It’s like BUGS that way. In PyMC, by contrast, you have things like `Normal("y", mu=intercept + slope * x, sigma=sigma, observed=y)`, where `observed=y` marks the observations as part of the model code, whereas something like `y = Normal(...)` would be used for parameters.

I just searched through your examples and found one like

```python
data = pm.MutableData("data", observed_data[0])
...
pm.Normal("y", mu=mu, sigma=1, observed=data)
```

which gives you some of what I was looking for in terms of plugging different data into the same model (I think). But it still builds `observed=data` in. So what I was wondering is whether there’s a way to write a single PyMC model that you could also use for data simulation, to do something like simulation-based calibration.
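To make the simulation-based calibration idea concrete, here's a minimal NumPy sketch of the SBC loop (my illustration, not PyMC API) on a toy conjugate model where the posterior is available in closed form, so no sampler is needed:

```python
import numpy as np

rng = np.random.default_rng(1)
n_sims, n_draws, n_obs = 500, 99, 10

ranks = []
for _ in range(n_sims):
    # 1. Draw a parameter from the prior: theta ~ normal(0, 1).
    theta = rng.normal()
    # 2. Simulate data from the model: y | theta ~ normal(theta, 1).
    y = rng.normal(theta, 1.0, size=n_obs)
    # 3. Draw from the posterior; with this conjugate prior it is
    #    normal(n * ybar / (n + 1), 1 / sqrt(n + 1)) in closed form.
    post_mean = n_obs * y.mean() / (n_obs + 1)
    post_sd = 1.0 / np.sqrt(n_obs + 1)
    post_draws = rng.normal(post_mean, post_sd, size=n_draws)
    # 4. Rank of the prior draw among the posterior draws; if the
    #    posterior is computed correctly, ranks are uniform on 0..n_draws.
    ranks.append(int(np.sum(post_draws < theta)))

ranks = np.array(ranks)
```

With a real model, step 3 would be replaced by MCMC draws, and non-uniform ranks would flag a miscalibrated sampler or a bug in the model.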

The other thing BUGS does which is super-convenient is allow partially instantiated variables (R directly supports this, too, which is costly in compute but useful). So you can do posterior prediction in a regression of `y` on `x` by having a partially known `y` and fully known `x`. In Stan this all has to get declared up front. We precompile the C++ class for the model with all of the data, parameters and generated quantities declared as such, but we don’t construct an instance until we have data. New data means an entirely new model instance because our models are immutable once constructed.

@bob-carpenter I moved the thread to its own topic.

You can use `pm.observe` to define observed nodes in a pre-existing model these days. You can also do a lot of model mutation with `pm.do`, including setting parameters to specific values, taking prior predictive draws, and then conditioning on those for calibration workflows.

Our README shows an example of such a workflow: pymc/README.rst at main · pymc-devs/pymc · GitHub

Colab that was used as the basis for that readme section: Google Colab

I didn’t follow this part. Could you give it another try?

BUGS lets you have missing data, for example,

```r
y = c(3.2, 1.9, NA, NA)
x = c(1.5, 2.8, 3.1, -1.3)
```

A BUGS model like

```r
y ~ dnorm(alpha + beta * x, sigma)
alpha ~ dnorm(0, 1)
beta ~ dnorm(0, 1)
sigma ~ dlnorm(0, 1)
```

with the above `x` and `y` as data then gives you posterior predictive inference for `y[3]` and `y[4]`. If you write a probabilistic model for `x`, then you can impute missing covariates. It’s super convenient.
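What BUGS does implicitly can be sketched by hand in NumPy (an illustration, not from the thread): the `NA` entries of `y` become posterior predictive draws given the posterior over `alpha`, `beta`, and `sigma`. The posterior draws below are hypothetical stand-ins; in practice they would come from MCMC on the observed `(x, y)` pairs.

```python
import numpy as np

rng = np.random.default_rng(2)

y = np.array([3.2, 1.9, np.nan, np.nan])
x = np.array([1.5, 2.8, 3.1, -1.3])
missing = np.isnan(y)

# Hypothetical posterior draws for alpha, beta, sigma.
n_draws = 1000
alpha = rng.normal(0.5, 0.1, size=n_draws)
beta = rng.normal(1.0, 0.1, size=n_draws)
sigma = rng.lognormal(-1.0, 0.1, size=n_draws)

# Posterior predictive draws for each missing y[i]: one predictive
# value per posterior draw, y[i] ~ normal(alpha + beta * x[i], sigma).
y_pred = (alpha[:, None]
          + beta[:, None] * x[missing]
          + sigma[:, None] * rng.normal(size=(n_draws, missing.sum())))
print(y_pred.shape)  # (1000, 2)
```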

Ahh, I see. PyMC does exactly the same thing when it finds `nan` in the observations (but not yet when observations are set via `pm.observe`).