Hi @bob-carpenter,
Reading through your solution, I think you may have misinterpreted my goal, due to my admittedly poor naming convention.
To be more explicit: (X_seen, Y_seen) are known ahead of time and can be thought of as the training data that determines the regression model’s posterior. It seems you read (X_unseen, Y_unseen) as both being unknown, which is a sensible interpretation given the naming. In actuality, Y_unseen is observed; I named it that only to indicate that, although we observed these Y values, we did not get to observe the corresponding X_unseen values that generated them. The goal is purely to find the posterior for X_unseen, i.e. which values of X_unseen could plausibly have generated the Y_unseen data.
A better naming convention might have been:
- X_seen (observed)
- Y_xseen (observed)
- X_unseen (unobserved)
- Y_xunseen (observed)
In the code snippet I originally posted, you can see the logic, where:
- X_seen comes from fixed data
- X_unseen comes from priors
- X is the concatenation of those two
- Y comes from fixed data and contains Y_seen followed by Y_unseen, in the same order as X
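
To make that structure concrete, here is a minimal plain-Python sketch of the joint log density. It is not the snippet I originally posted: the linear model, the Normal(0, 10) priors, and the names `normal_logpdf` and `log_joint` are assumptions made up purely for illustration. The only point is that X_unseen is treated as a parameter with a prior, while all of Y enters the likelihood as data.

```python
import math


def normal_logpdf(value, mean, sd):
    """Log density of Normal(mean, sd) evaluated at value."""
    z = (value - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_joint(alpha, beta, sigma, x_unseen, x_seen, y):
    """Unnormalized log posterior over (alpha, beta, sigma, x_unseen).

    Assumed for illustration only: y ~ Normal(alpha + beta * x, sigma)
    and Normal(0, 10) priors on every unknown.
    """
    if sigma <= 0:
        return -math.inf
    # X is the concatenation: fixed data first, then the latent values.
    x = list(x_seen) + list(x_unseen)
    assert len(x) == len(y), "Y must be ordered to match X"
    # Priors on the regression parameters (half-normal on sigma, up to a constant).
    lp = normal_logpdf(alpha, 0.0, 10.0) + normal_logpdf(beta, 0.0, 10.0)
    lp += normal_logpdf(sigma, 0.0, 10.0)
    # Prior on the unobserved X values.
    lp += sum(normal_logpdf(xu, 0.0, 10.0) for xu in x_unseen)
    # Likelihood over all of Y: the Y_seen and Y_unseen parts are both data.
    lp += sum(normal_logpdf(yi, alpha + beta * xi, sigma) for xi, yi in zip(x, y))
    return lp


x_seen = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0, 11.0]  # the last Y is the one whose X was not observed

# With alpha=1 and beta=2, the last Y is explained far better by an X near 5.
near = log_joint(1.0, 2.0, 0.5, [5.0], x_seen, y)
far = log_joint(1.0, 2.0, 0.5, [0.0], x_seen, y)
print(near > far)
```

Any sampler that targets this density would return draws of X_unseen alongside the regression parameters, and those draws are the posterior I am after.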