Stepwise Sampling of Multiple Observed Variables

Hi everyone!

I’m trying to do something fairly “simple”. Imagine I observe a process generating y2 and a sub-process which generates y, where y2 = f(y) I have an assumption about how y comes into place (i.e., I have a model of y) and how y2 and y are related (i.e., about f()).

I want to sample the parameters of the function that constitute y and those that make up the connection of y and y2. So far I just call multiple sample statements with obs= (which works for simpler models) but Im wondering how to approach this properly.

One idea would be the sample first from the posterior of the parameters of y (maybe in a separate call) or try to formulate a prior based on the empirical posterior samples of the parameters of y but I’m a bit lost right now.

This can be set up quite quickly (and as I said works), but for more complex models the chains dont converge.

Here is my MWE:

x = np.random.normal(0, 1, 1000)
m = 5.
b = 2.
y = m*x + b + np.random.normal(0, 1, 1000)

scale = np.random.normal(0, 1, 1000) + 5

y2 = y*scale

with pm.Model() as model:
    m = pm.Normal('m', mu=0, sigma=1)
    b = pm.Normal('b', mu=0, sigma=1)

    error = pm.HalfNormal('error', sigma=1)

    scale = pm.HalfNormal('scale', sigma=1)

    y = pm.Normal('y', mu=m*x + b, sigma=error, observed=y)

    y2 = pm.Normal('y2', mu=y*scale, sigma=1, observed=y2)

    trace = pm.sample(1000, tune=1000, cores=4, chains=4, init="adapt_diag", return_inferencedata=True)


I appreciate your time and help! Thanks!

If I understand correctly, you are already on the right track. Having multiple observed variables is conventional and if one contributes to the generative process of another, that’s fine. If you are having convergence problems, my guess is that the issue lies elsewhere (but that’s a guess). Do you have an example of a model with multiple observed variables that doesn’t converge?

Hi @cluhmann thanks for reaching out!

It took me a while to come up with something more complicated, and in the end, I just decided it’s best to work with a setting as close as possible to my data. I simulated some data and applied my routine, which you can find in a gist here:

We see that even with simulated data, the chains don’t converge and do not fully recover the parameters of the first stage (which I called the “performance measurement phase”); What makes things worse is probably that I observe only a single observation per person in the second stage (which I try to predict based on the performance measures).

Let me know if you need more! I know it’s a bit complicated setup in the Gist, but the modeling part in itself is (hopefully) quite straightforward.

Thank you for your time and patience!