# Combining the outputs from two models trained using different data

I am new to PyMC so this may be a silly question.

Say I have height and age data from School A. I use this to model in PyMC the linear relationship between age and height.

I also have age data for one class from School B. I use this to generate a model of the distribution of the mean age in the class in PyMC.

Is there a way to combine these models to generate an estimate of the height of the class in School B? Perhaps by passing the estimate of age from School B into the model of the linear relationship derived from School A?

It sounds like you could use a GLM. You fit the regression with School A's data and then draw posterior predictive samples for School B.

Thanks @lucianopaz for the quick reply. It was helpful. I think doing posterior predictive could work.

Is taking samples of School B and doing a posterior predictive the only way to combine them? This produces sampled estimates of School B heights, but it doesn't produce a Model() of heights, which is what I am after.

Here is some example code showing where I am so far (ignore the values; I know they don't make sense as heights/ages):

```python
import numpy as np
from pymc3 import Normal, Model, HalfCauchy, sample, sample_ppc
from theano import shared

# School A observations
x = np.array([-0.016,  0.002, -0.028,  0.072, -0.02 ,  0.02 ,  0.054, -0.054,
              -0.002,  0.07 ,  0.008,  0.008])
y = np.array([-0.16216216, -0.06122449, -0.10344828,  0.31468531,  0.08843537,
              0.11564626,  0.18367347, -0.12162162,  0.        ,  0.24137931,
              0.04827586,  0.01369863])

# Shared variable so the regression input can be swapped out later
x_shared = shared(x)

# Model school A: linear relationship between age and height
with Model() as schoolA_model:
    intercept = Normal('intercept', 0, sd=10)
    slope = Normal('slope', 0, sd=10)
    σ = HalfCauchy('σ', 1)

    μ = intercept + slope * x_shared

    height = Normal('height', μ, sd=σ, observed=y)

    schoolA_sample = sample(1000)

# School B observations
biases = np.array([ 0.0045, -0.02311111, -0.06044444, -0.04533333, -0.05511111,
                    0.0045, -0.052, -0.042])
stds = np.array([0.02815772, 0.02494216, 0.03052504, 0.02027313, 0.04192984,
                 0.02815772, 0.03565109, 0.02389561])
ages = np.array([-0.04, -0.04, -0.1 , -0.08, -0.1 , -0.08, -0.06, -0.06])

# Model school B: distribution of the mean age of the class
with Model() as schoolB_model:
    age = Normal('age', 0, sd=10)
    errors = ages - age
    age_like = Normal('age_like', errors, sd=stds, observed=biases)

    schoolB_sample = sample(2000, tune=2000)

# Feed the school B age draws into the school A regression
x_shared.set_value(schoolB_sample.get_values('age')[2000:])

with schoolA_model:
    post_pred = sample_ppc(schoolA_sample, samples=500)
```
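For intuition, the posterior-predictive step above amounts to pushing each draw of School B's mean age through a draw of School A's regression parameters. A minimal plain-numpy sketch, using made-up stand-in draws (the arrays below are hypothetical placeholders for the trace values, not real output of the models above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for posterior draws from the two models:
intercepts = rng.normal(0.0, 0.01, size=1000)       # ~ schoolA_sample['intercept']
slopes = rng.normal(3.0, 0.1, size=1000)            # ~ schoolA_sample['slope']
sigmas = np.abs(rng.normal(0.05, 0.01, size=1000))  # ~ schoolA_sample['σ']
age_draws = rng.normal(-0.07, 0.01, size=1000)      # ~ schoolB_sample['age']

# Pair each school B age draw with one school A regression draw,
# then add the regression's observation noise.
mu = intercepts + slopes * age_draws
heights = rng.normal(mu, sigmas)

print(heights.shape)  # (1000,)
```

The result is exactly the kind of sampled height estimate you describe: an array of draws, not a `Model()` object.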

I think I’ve built what I was after - you can put it all in one model! Here is what I ended up with:

```python
import numpy as np
from pymc3 import Normal, Model, HalfCauchy, sample

# School A observations
schoolA_age_obs = np.array([-0.016,  0.002, -0.028,  0.072, -0.02 ,  0.02 ,  0.054, -0.054,
                            -0.002,  0.07 ,  0.008,  0.008])
schoolA_height_obs = np.array([-0.16216216, -0.06122449, -0.10344828,  0.31468531,  0.08843537,
                               0.11564626,  0.18367347, -0.12162162,  0.        ,  0.24137931,
                               0.04827586,  0.01369863])

# School B observations
schoolB_bias_obs = np.array([ 0.0045, -0.02311111, -0.06044444, -0.04533333, -0.05511111,
                              0.0045, -0.052, -0.042])
schoolB_std_obs = np.array([0.02815772, 0.02494216, 0.03052504, 0.02027313, 0.04192984,
                            0.02815772, 0.03565109, 0.02389561])
schoolB_age_obs = np.array([-0.04, -0.04, -0.1 , -0.08, -0.1 , -0.08, -0.06, -0.06])

with Model() as model:

    # Model school A: linear relationship between age and height
    intercept = Normal('intercept', 0, sd=10)
    slope = Normal('slope', 0, sd=10)
    σ = HalfCauchy('σ', 1)

    μ = intercept + slope * schoolA_age_obs

    age2height_like = Normal('age2height_like', μ, sd=σ, observed=schoolA_height_obs)

    # Model school B: distribution of the mean age of the class
    schoolB_age = Normal('schoolB_age', 0, sd=10)
    errors = schoolB_age_obs - schoolB_age
    age_like = Normal('age_like', errors, sd=schoolB_std_obs, observed=schoolB_bias_obs)

    # Combine: pass school B's age through school A's regression
    schoolB_height_μ = intercept + slope * schoolB_age
    schoolB_height = Normal('schoolB_height', schoolB_height_μ, sd=σ)

    model_sample = sample(2000, tune=2000)
```
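With everything in one model, the estimate of School B's class height is just the posterior of `schoolB_height`. A small sketch of summarizing such draws with plain numpy, using hypothetical stand-in values for `model_sample['schoolB_height']`:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical stand-in for model_sample['schoolB_height']
height_draws = rng.normal(-0.2, 0.05, size=4000)

mean = height_draws.mean()
lo, hi = np.percentile(height_draws, [2.5, 97.5])
print(f"height estimate: {mean:.3f}, 95% interval: [{lo:.3f}, {hi:.3f}]")
```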