Two likelihoods dependent on missing predictors

Hi,

I am trying to solve a silly indexing issue and realized I’m not sure I understand the pymc3 syntax. Suppose I want to model a normally distributed y with predictors x0 and x1. x1 has only sometimes been measured, but I do not wish to impute it. Instead, I want to define two likelihoods for y with different linear models: the less complex one, used where x1 is missing, drops x1 as a predictor and instead carries an additional residual uncertainty (sigma). So when complete data are available, I want to model y like so:

$$y \sim \mathcal{N}(\mu, \sigma), \qquad \mu = a + \beta_0 x_0 + \beta_1 x_1$$

and in the case where I only know x0:

$$y \sim \mathcal{N}(\mu, \sigma + \sigma_{x_1}), \qquad \mu = a + \beta_0 x_0$$

In Stan I can do this like so (with N observations, and x1_measured an indicator vector):

    // likelihood, which we only evaluate conditionally
    if (run_estimation == 1) {
      for (i in 1:N) {
        if (x1_measured[i] == 1) y[i] ~ normal(mu[i], sigma);
        if (x1_measured[i] == 0) y[i] ~ normal(mu[i], sigma + sigma_x1); // more uncertainty when not measured
      }
    }

I tried this approach in pymc3 (after defining priors etc.; idx_with_m and idx_without_m are arrays of indices into the observations with and without x1 measured):

    with pm.Model() as model:
        # define priors... (not shown)

        # linear predictors, restricted to the rows each likelihood uses
        mu_with = a + beta_x0 * x0[idx_with_m] + beta_x1 * x1[idx_with_m]
        mu_without = a + beta_x0 * x0[idx_without_m]

        # two likelihoods over the disjoint subsets of observations
        y_likelihood_with = pm.Normal("y_with", mu_with, sigma,
                                      observed=np.array(dw)[idx_with_m])
        y_likelihood_without = pm.Normal("y_lik_without", mu_without, sigma + sigma_m,
                                         observed=np.array(dw)[idx_without_m])
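
For reference, the index arrays come from the missingness pattern of x1; a minimal sketch of how I build them, assuming missing x1 entries are coded as NaN:

    import numpy as np

    # split observation indices by whether x1 was measured (NaN = missing)
    idx_with_m = np.where(~np.isnan(x1))[0]
    idx_without_m = np.where(np.isnan(x1))[0]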

Is this “canonical”? Should I use Theano shared variables, etc.?

That looks fine, at least pymc syntax-wise (I really don’t know about the imputation/non-imputation question).
There is no need to use Theano shared variables; those are only something you need when you want to switch out datasets.
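For example, a shared variable lets you swap in new data (say, for out-of-sample prediction) without rebuilding the model; a minimal sketch, where x0_new is a hypothetical replacement dataset:

    import numpy as np
    import theano

    # wrap the predictor in a shared variable so its value can be swapped later
    x0_shared = theano.shared(np.asarray(x0, dtype=float))
    # ... build the model using x0_shared wherever x0 appears ...

    # later: swap in new data without touching the model graph
    x0_shared.set_value(np.asarray(x0_new, dtype=float))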
One thing, though: shouldn’t the second sigma be something like tt.sqrt(sigma**2 + sigma_m**2)? Independent noise sources add on the variance scale, not the standard-deviation scale.
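That is, combine the two noise terms in quadrature; roughly:

    import theano.tensor as tt

    # independent variances add, so the total sd is the root of the summed variances
    sigma_total = tt.sqrt(sigma**2 + sigma_m**2)
    y_likelihood_without = pm.Normal("y_lik_without", mu_without, sigma_total,
                                     observed=np.array(dw)[idx_without_m])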


Thanks! And you are right.