Hi everyone,
I’m trying to create a model with multiple observed variables while trying to also investigate the dependency between these variables.
Example: I have some observed features x1, x2 and x3 and some observed output variables y1, y2. I want to do a bayesian regression using these variables, also using y1 as feature to perform to regression in y2 and vice-versa. Here is a code example:
alpha_y1 = pm.Normal('alpha_y1', mu=0, sd=1, shape=100)
alpha_y2 = pm.Normal('alpha_y2', mu=0, sd=1)
beta_x1 = pm.Normal('beta_x1', mu=0, sd=1)
beta_x2 = pm.Normal('beta_x2', mu=0, sd=1)
beta_x3 = pm.Normal('beta_x2', mu=0, sd=1)
beta_y1 = pm.Normal('beta_y1', mu=0, sd=1)
beta_y2 = pm.Normal('beta_y2', mu=0, sd=1)
p_y1 = pm.math.invlogit(alpha_y1[cat_idx] + beta_x1 * x1 + beta_y2 * y2) # varying effect indexed by some category + linear comb of other variable
y1 = pm.Bernoulli('y1', p=p_y1, observed=y1_obs)
mu_y2 = alpha_y2 + beta_y1 * p_y1[cat_idx]
y2 = pm.Normal('y2', mu=mu_y2, sd=1, observed=y2_obs)
Is this possible in PyMC3? What’s bugging me is the fact that the shape of p_y1 (100) is not necessarily equal to the shape of the observed arrays. That’s why I am being confused by the indexes here.
Thanks in advance for your help.