Hi, I am somewhat new to PyMC but have some modelling experience (though not with latent variable models).

I am trying to estimate a latent (generalization) factor for a language model, where we observe its performance on a set of tasks. Performance is a continuous value between 0 and 1 (rarely, if ever, exactly 0 or 1).

In a simplified case we expect a DAG somewhat like:

g → performance on task A

g → performance on task B

g → performance on task C
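Concretely, the generative story I have in mind is something like the following plain-NumPy sketch (the intercept values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

def invlogit(x):
    return 1.0 / (1.0 + np.exp(-x))

g = rng.normal(0, 1)                 # latent generalization factor
phi = rng.gamma(shape=2, scale=0.5)  # precision, Gamma(alpha=2, beta=2)
intercepts = {"A": 0.5, "B": -0.2, "C": 0.1}  # made-up per-task intercepts

performance = {}
for task, b in intercepts.items():
    mu = invlogit(g + b)  # expected performance on this task, in (0, 1)
    # Beta in mean-precision form: alpha = mu * phi, beta = (1 - mu) * phi
    performance[task] = rng.beta(mu * phi, (1 - mu) * phi)
```

So each task's score depends on g only through the shared linear predictor.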

The model currently looks like this (simplified):

```python
coords = {"coeffs": tasks}
with pm.Model(coords=coords) as model:
    g = pm.Normal("g", mu=0, sigma=1)  # generalization factor of the model
    phi = pm.Gamma("phi", alpha=2, beta=2)
    for task in tasks:
        # create observed variable
        task_score = pm.MutableData(task, X_train[task])
        # we assume that each task has its own intercept:
        task_intercept = pm.Normal(f"{task}_intercept", mu=0, sigma=1)
        mu_linear = g + task_intercept
        mu = pm.math.invlogit(mu_linear)
        # we parameterize the beta distribution to allow for a linear model
        alpha = mu * phi
        beta = (1 - mu) * phi
        y = pm.Beta(f"{task}_y_obs", alpha=alpha, beta=beta, observed=task_score)
```
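The mean–precision parameterization above is intended to give a Beta with mean `mu` and concentration `phi`; as a quick sanity check outside the model (values here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, phi = 0.7, 20.0  # made-up mean and precision
alpha, beta = mu * phi, (1 - mu) * phi
samples = rng.beta(alpha, beta, size=200_000)
print(samples.mean())  # should be close to mu = 0.7
```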

This model works (though it could probably work better). However, to validate the model I am trying to predict the performance on one task (say "A") from some of the other tasks ("B" and "C"). I believe this should be possible by setting the observed values for the other tasks' `task_score` data and inferring the remaining one.

The closest thing I believe I have to a solution is:

```python
# generate predictions
task_to_predict = "A"
observed_tasks = [col for col in labels if col != task_to_predict]
with model:
    # set observed values
    for task in observed_tasks:
        model[task].observations = X_test[task]
    # sample
    posterior_predictive = pm.sample_posterior_predictive(trace, var_names=['C'])
```

However, this posterior predictive just reflects the training data (X_train), not the newly observed data.