Combining the outputs from two models trained using different data

I am new to PyMC so this may be a silly question.

Say I have height and age data from School A. I use this to model in PyMC the linear relationship between age and height.

I also have age data for one class from School B. I use this to generate a model of the distribution of the mean age in the class in PyMC.

Is there a way to combine these models to generate an estimate of the mean height of the class in School B? Perhaps by passing the estimate of age from School B into the linear relationship derived from School A?

It sounds like you could use a GLM. You fit the regression with School A data, then do posterior predictive sampling on School B.
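The idea can be sketched outside PyMC with plain NumPy: take posterior draws of the School A regression parameters, posterior draws of the School B mean age, and push each age draw through the matching regression draw so both sources of uncertainty propagate into the height estimate. The draws below are synthetic stand-ins; in practice they come from the sampled traces.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for posterior draws of the School A regression:
intercept_draws = rng.normal(0.0, 0.01, size=1000)
slope_draws = rng.normal(2.5, 0.1, size=1000)

# Synthetic stand-in for posterior draws of School B's mean age:
age_draws = rng.normal(-0.07, 0.01, size=1000)

# Pair each age draw with one regression draw, propagating the
# uncertainty of both models into a distribution over heights:
height_draws = intercept_draws + slope_draws * age_draws

print(height_draws.mean(), height_draws.std())
```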

Thanks @lucianopaz for the quick reply. It was helpful. I think doing posterior predictive could work.

Is the only way to combine them to take samples from the School B model and run posterior predictive sampling? That produces sampled estimates of School B heights, but not a Model() of heights, which is what I am after.

Here is some example code showing where I am so far (ignore the values, I know they don’t make sense as heights/ages):

import numpy as np
from pymc3 import Normal, Model, HalfCauchy, sample, sample_ppc
from theano import shared

x = np.array([-0.016,  0.002, -0.028,  0.072, -0.02 ,  0.02 ,  0.054, -0.054,
       -0.002,  0.07 ,  0.008,  0.008])
y = np.array([-0.16216216, -0.06122449, -0.10344828,  0.31468531,  0.08843537,
        0.11564626,  0.18367347, -0.12162162,  0.        ,  0.24137931,
        0.04827586,  0.01369863])

# Wrap the School A ages in a shared variable so they can be
# swapped for School B ages before posterior predictive sampling.
x_shared = shared(x)

with Model() as schoolA_model:

    # Priors for the age-height regression
    intercept = Normal('intercept', 0, sd=10)
    slope = Normal('slope', 0, sd=10)
    σ = HalfCauchy('σ', 1)

    μ = intercept + slope * x_shared

    # Likelihood of the observed School A heights
    height = Normal('height', μ, sd=σ, observed=y)

    schoolA_sample = sample(1000)

biases = np.array([ 0.0045, -0.02311111, -0.06044444, -0.04533333, -0.05511111,
        0.0045, -0.052, -0.042])
stds = np.array([0.02815772, 0.02494216, 0.03052504, 0.02027313, 0.04192984,
       0.02815772, 0.03565109, 0.02389561])
ages = np.array([-0.04, -0.04, -0.1 , -0.08, -0.1 , -0.08, -0.06, -0.06])

# model school B
with Model() as schoolB_model:
    # Prior on the (unknown) mean age of the School B class
    age = Normal('age', 0, sd=10)
    errors = ages - age
    # Likelihood of the observed measurement biases
    age_like = Normal('age_like', errors, sd=stds, observed=biases)

    schoolB_sample = sample(2000, tune=2000)

# Feed the posterior age samples from School B into the School A
# regression (keeping only the draws after index 2000).
x_shared.set_value(schoolB_sample.get_values('age')[2000:])

with schoolA_model:
    # Note: sample_ppc was renamed sample_posterior_predictive
    # in later PyMC3 releases.
    post_pred = sample_ppc(schoolA_sample, samples=500)
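The resulting post_pred['height'] is a matrix of posterior predictive draws (one row per predictive sample, one column per School B age sample). If a single height distribution is wanted, the matrix can be collapsed and summarised; the array below is a synthetic stand-in for the real post_pred['height'].

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for post_pred['height'] (500 ppc draws x 1000 age samples):
ppc_height = rng.normal(-0.15, 0.05, size=(500, 1000))

# Collapse to a single distribution over School B heights:
flat = ppc_height.ravel()
mu_hat, sd_hat = flat.mean(), flat.std()
# These moments could seed, say, a Normal prior in a downstream model.
print(mu_hat, sd_hat)
```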

I think I’ve built what I was after - you can put it all in one model! Here is what I ended up with:

import numpy as np
from pymc3 import Normal, Model, HalfCauchy, sample

schoolA_age_obs = np.array([-0.016,  0.002, -0.028,  0.072, -0.02 ,  0.02 ,  0.054, -0.054,
       -0.002,  0.07 ,  0.008,  0.008])
schoolA_height_obs = np.array([-0.16216216, -0.06122449, -0.10344828,  0.31468531,  0.08843537,
        0.11564626,  0.18367347, -0.12162162,  0.        ,  0.24137931,
        0.04827586,  0.01369863])


schoolB_bias_obs = np.array([ 0.0045, -0.02311111, -0.06044444, -0.04533333, -0.05511111,
        0.0045, -0.052, -0.042])
schoolB_std_obs = np.array([0.02815772, 0.02494216, 0.03052504, 0.02027313, 0.04192984,
       0.02815772, 0.03565109, 0.02389561])
schoolB_age_obs = np.array([-0.04, -0.04, -0.1 , -0.08, -0.1 , -0.08, -0.06, -0.06])


with Model() as model:

    # Model School A: age-height regression
    intercept = Normal('intercept', 0, sd=10)
    slope = Normal('slope', 0, sd=10)
    σ = HalfCauchy('σ', 1)

    μ = intercept + slope * schoolA_age_obs

    age2height_like = Normal('age2height_like', μ, sd=σ, observed=schoolA_height_obs)

    # Model School B: mean age of the class
    schoolB_age = Normal('schoolB_age', 0, sd=10)
    errors = schoolB_age_obs - schoolB_age
    age_like = Normal('age_like', errors, sd=schoolB_std_obs, observed=schoolB_bias_obs)

    # Combine: push the School B age through the School A regression.
    # schoolB_height has no observed data, so its posterior is the
    # estimate of the class height in School B.
    schoolB_height_μ = intercept + slope * schoolB_age
    schoolB_height = Normal('schoolB_height', schoolB_height_μ, sd=σ)

    model_sample = sample(2000, tune=2000)
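Once this combined model is sampled, the posterior of schoolB_height already carries the uncertainty from both data sets, and summarising it is just array arithmetic. A synthetic array stands in here for model_sample.get_values('schoolB_height'):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for model_sample.get_values('schoolB_height'):
height_draws = rng.normal(-0.15, 0.04, size=4000)

# Point estimate and 95% credible interval for the class height:
mean = height_draws.mean()
lo, hi = np.percentile(height_draws, [2.5, 97.5])
print(f"School B height: {mean:.3f} (95% interval {lo:.3f} to {hi:.3f})")
```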