Adding a covariate that is a difference between conditions to a linear model

I have a multilevel linear model on some physiological data. It’s pretty straightforward: a 2x2 factorial design with an interaction. I’ve been asked to include a covariate that is the difference between two conditions from the experiment, and I have no idea how to include it in the model. The data are in tidy format (i.e., each row holds one observation per sample, ordered according to factor). The predictors are binary (0/1, e.g., x_1 = 0/1, x_2 = 0/1), while the covariate is a proportion (x_4(0) = 0.76, etc.). The conditions whose difference the covariate should represent are the same ones coded by x_1 (condition A vs. condition B). The reason for the covariate is to see whether some variance is explained when the behavioral score in condition A is greater than in B, so that people who were better at A than at B would also show this in the physiological recordings (I’m not sure this makes sense, but that’s what I’ve been asked).

I thought about doing it like this:

y = a + b_1 * x_1 + b_2 * x_2 + b_3 * x_1 * x_2 + b_4 * x_4 * x_1

So I model the covariate as an interaction between the proportion and the difference between the conditions in x_1. Does that make sense, or is it complete nonsense?

If you model it as a 2x2 factorial (i.e., your design matrix has 4 columns, so that each condition is uniquely represented by a row), then the difference between two conditions is already encoded in your model.
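To make that concrete, here is a minimal sketch of the four cells of a 2x2 design with an interaction. The coefficient values are made up purely for illustration, and the coding (intercept, x_1, x_2, x_1 * x_2) is an assumption about how the model is set up.

```python
import numpy as np

# The four cells of a 2x2 design, coded as: intercept, x_1, x_2, x_1 * x_2.
cells = np.array([
    [1, 0, 0, 0],  # x_1 = 0, x_2 = 0
    [1, 1, 0, 0],  # x_1 = 1, x_2 = 0
    [1, 0, 1, 0],  # x_1 = 0, x_2 = 1
    [1, 1, 1, 1],  # x_1 = 1, x_2 = 1
])

# Made-up coefficients (a, b_1, b_2, b_3), for illustration only.
b = np.array([2.0, 0.5, -0.3, 0.2])

# Predicted mean of each cell.
cell_means = cells.dot(b)

# The x_1 difference is a linear combination of coefficients:
# b_1 when x_2 = 0, and b_1 + b_3 when x_2 = 1.
diff_at_x2_0 = cell_means[1] - cell_means[0]
diff_at_x2_1 = cell_means[3] - cell_means[2]
```

With this coding, the difference between the two levels of x_1 never needs its own column; it is read off the coefficients.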

If you want to see

You can use the posterior samples to compute that: treat each draw as if it were the true parameter values, feed it into your function, and repeat until you have gone through all the samples.

Thanks for the quick reply!

The thing is, I do indeed have 4 columns (of physiological data) in the design matrix. But I also have another type of data (proportion correct), which likewise has 4 columns, one for each of the 4 conditions. I want to take the difference between 2 of these conditions (proportion correct) and use this difference as a regressor in the first design matrix (physiological data). But I don’t know how to do that in a linear model.

You can use the posterior samples to compute that: treat each draw as if it were the true parameter values, feed it into your function, and repeat until you have gone through all the samples.

I’m not so sure I understand what you mean here. Which function are you thinking about?

Not sure why you would have 4 columns of proportion correct. If the data are in tidy format, there should be only 1 response column, so that you can have something like y_hat = X.dot(b).
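As a rough sketch of what that could look like with a per-subject difference score added as a column: the data frames, column names (subject, condition, prop_correct) and values below are all hypothetical, and the last design-matrix column simply follows the x_4 * x_1 term from the model proposed earlier in the thread, not a recommendation.

```python
import numpy as np
import pandas as pd

# Hypothetical behavioural data: one proportion-correct value per subject and condition.
behav = pd.DataFrame({
    "subject": [1, 1, 2, 2],
    "condition": ["A", "B", "A", "B"],
    "prop_correct": [0.76, 0.70, 0.60, 0.85],
})

# One row per subject, then the A - B difference as a single covariate x_4.
wide = behav.pivot(index="subject", columns="condition", values="prop_correct")
diff = (wide["A"] - wide["B"]).rename("x_4").reset_index()

# Hypothetical tidy physiological data: one response column y, one row per observation.
physio = pd.DataFrame({
    "subject": [1] * 4 + [2] * 4,
    "x_1": [0, 1, 0, 1] * 2,
    "x_2": [0, 0, 1, 1] * 2,
    "y": [1.0, 1.4, 0.9, 1.6, 1.1, 0.8, 1.0, 0.7],
})

# Each subject's difference score is repeated on all of that subject's rows.
tidy = physio.merge(diff, on="subject")

# Design matrix columns: intercept, x_1, x_2, x_1 * x_2, x_4 * x_1.
X = np.column_stack([
    np.ones(len(tidy)),
    tidy["x_1"],
    tidy["x_2"],
    tidy["x_1"] * tidy["x_2"],
    tidy["x_4"] * tidy["x_1"],
])
```

Given a coefficient vector b of length 5, the prediction is then X.dot(b), with a single response column y.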

In this case, it would be a linear function (or rather, a linear contrast) of condition A and B: b_A - b_B
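A minimal sketch of computing that contrast draw by draw. The arrays b_A and b_B below are simulated stand-ins for posterior draws (an assumption for illustration); in a real analysis they would come from the fitted model's posterior samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-ins for posterior draws of b_A and b_B (not real results).
n_draws = 4000
b_A = rng.normal(1.0, 0.2, size=n_draws)
b_B = rng.normal(0.6, 0.2, size=n_draws)

# Apply the function to every draw: here, the linear contrast b_A - b_B.
contrast = b_A - b_B

# Summarise the resulting distribution of the contrast.
contrast_mean = contrast.mean()
lower, upper = np.percentile(contrast, [2.5, 97.5])
prob_positive = (contrast > 0).mean()  # share of draws with b_A > b_B
```

Any other function of the parameters can be handled the same way: evaluate it once per draw and summarise the resulting values.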