Posterior and data mismatch in linear model with observed predictors

Following this question Combination of bayesian models in pymc3
and have more questions. Thanks in advance for your time.

I have defined a linear model with multiple variables so that all nodes are observed
HRmax ~ Age + BMI + HRrest:

with pm.Model() as model:
      df_hrmax = pm.Data('df_hrmax', df[['hrrest', 'age', 'bmi']])

      pred_hrrest = pm.Normal('pred_hrrest', mu=60, sd=20, observed=df_hrmax[:,0])
      pred_age = pm.TruncatedNormal('pred_age', mu=60, sigma=10,lower=20,upper=100, observed=df_hrmax[:,1]
      pred_bmi = pm.Normal('pred_bmi', mu=30, sd=5, observed=df_hrmax[:,2])

      resp_hrmax = pm.Data('resp_hrmax', df['hrmax'])

      intercept_hrmax = pm.Normal('intercept_hrmax', sd=100)
      
      error_std_hrmax = pm.HalfNormal('error_std_hrmax', sd=5)

      c_hrmax_age = pm.Normal('c_hrmax_age', sd=1)
      c_hrmax_bmi = pm.Normal('c_hrmax_bmi ', sd=1)
      c_hrmax_hrrest = pm.Normal('c_hrmax_hrrest ', sd=1)
      
      mu_hrmax = intercept_hrmax + pred_age* c_hrmax_age + pred_bmi*c_hrmax_bmi + pred_hrrest*c_hrmax_hrrest
     
      hrmax = pm.Normal('hrmax', mu=mu_hrmax, sd=error_std_hrmax, observed=resp_hrmax)
  1. Is this ok to define all nodes as observed? What are the downsides of this?

  2. Now when I want to test the model on unseen data and predict hrmax I do

     with model:
           pm.set_data({'df_hrmax ': X_test})
           post_pred = pm.sample_posterior_predictive(trace, samples=nsamples)
    

    while X_test can have some missing values like
    X_test = pd.DataFrame({'age': np.nan, 'bmi':25, 'hrrest': 60}, index=[0]) and then ‘age’ is derived from the model and is very close to its mu= 60,
    but if I know for sure that ‘age’ is 70 (as well as bmi 25 and hrrest 60:
    X_test = pd.DataFrame({'age': 70, 'bmi':25, 'hrrest': 60}, index=[0])
    I expect the prediction to take those TRUE values.
    Instead I get the same distribution no matter which values I pass in X-test.
    What I am doing wrong here?

  3. If I know the response hrmax, how do I visualize distributions of all other nodes, to see what is the most likely age, bmi and hrrest?

  4. If I get more data (predictors and response), how do I update my model? Is there a way to continue training, rather then do it from the beginning with extended df_hrmax?

  5. If I want to compare the model to the data, I overlap the datapoints (red) with posterior samples (blue) of two predictors (xaxis) and response (y axis). NB: variables here are different from age, bmi and hrrest, so the values are different, but they are also modeled as Normals.

I know that it should look like this


but my model generates different plots

Datapoints are the same for top and bottom, also x and y ranges, but in my model generated data doesn’t match actual data: the regression lines are actually opposite.
How to interpret this mismatch in posterior and what do adjust in the model
priors?