Bayesian model with constraints

Hi everyone,

I have a model that I can simplify to this:

import pymc3 as pm
import theano.tensor as tt

# X_train, X_test, y_train, y_test, m, n assumed defined
with pm.Model():
    beta_se = pm.HalfNormal('beta_se', 1.)
    err_se = pm.HalfNormal('err_se', 1.)
    beta = pm.Normal('beta', 0., beta_se, shape=(m, n))

    y_pred_train = pm.Deterministic('y_pred_train', tt.dot(X_train, beta))
    y_pred_test = pm.Deterministic('y_pred_test', tt.dot(X_test, beta))

    ##### not sure how to do this step (pseudocode):
    if (y_pred_test >= y_test).all():
        y_obs = pm.Normal('y_lik', mu=y_pred_train, sd=err_se, observed=y_train)
    else:
        likelihood = -infinity  # i.e., reject these beta values

Basically: I'm only interested in the beta values that make my predictions of y in the test period greater than or equal to the true values of y_test. Is there an idiomatic way to write this in pymc3? I can easily code this up in emcee, but the chains take forever to converge and have very poor autocorrelation.
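
For reference, my emcee version is essentially hard rejection in the log-probability, roughly like this (simplified sketch: flat parameter vector, noise scale fixed at 1, n_walkers illustrative):

import numpy as np
import emcee

# Sketch: reject any beta that violates the test-period constraint.
# X_train, y_train, X_test, y_test, m, n as above.
def log_prob(theta):
    beta = theta.reshape(m, n)
    if not np.all(X_test @ beta >= y_test):
        return -np.inf  # constraint violated -> likelihood 0
    resid = y_train - X_train @ beta
    return -0.5 * np.sum(resid ** 2)  # Gaussian log-likelihood up to a constant

sampler = emcee.EnsembleSampler(n_walkers, m * n, log_prob)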

You can add a pm.Potential that contributes -np.inf to the log-probability whenever the constraint is violated; this effectively makes the likelihood 0 there. It can be a problem if the sampler can never find a point with positive probability, though, so you should provide a valid test point.

So in your case:

y_obs = pm.Normal("y_lik", mu=y_pred_train, sigma=err_se, observed=y_train)
# -inf log-probability whenever any test prediction falls below y_test:
y_constr = pm.Potential("y_constr", tt.switch(tt.all(y_pred_test >= y_test), 0., -np.inf))
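
About the test point: if the default initialization violates the constraint, the sampler starts at -inf and fails immediately, so give beta a feasible starting value, e.g. (beta_init is hypothetical, any array of shape (m, n) satisfying the constraint):

# Hypothetical feasible starting value for beta:
beta = pm.Normal('beta', 0., beta_se, shape=(m, n), testval=beta_init)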

I think a better way to do this would be to define your own continuous Distribution and do the likelihood manipulation there. Or use pm.HalfNormal with an extra shift parameter to model the one-sided test-period residual; that might be the easiest to do, actually.
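
A minimal sketch of that HalfNormal idea, inside the model above (test_err_se is a new, illustrative scale parameter): the constraint says y_test = y_pred_test - slack with slack >= 0, so score the slack with a HalfNormal log-density through a Potential:

slack = y_pred_test - y_test  # non-negative exactly when the constraint holds
test_err_se = pm.HalfNormal('test_err_se', 1.)  # illustrative scale for the one-sided error
y_slack_lik = pm.Potential(
    'y_slack_lik',
    tt.sum(pm.HalfNormal.dist(sigma=test_err_se).logp(slack))
)

HalfNormal's logp is -inf for negative values, so the constraint is still enforced exactly, but feasible samples now see a smooth gradient instead of a flat region ending in a cliff, which should help NUTS mix.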
