Linear regression: Bias (intercept) term "baked into" covariate data?

I’m working with a standard Bayesian linear regression model, **y** ~ N(X theta, sigma^2), where we have k covariates and n individuals so X = n-by-k, theta = k-by-1, and y = n-by-1 matrices. If we want a bias term in this model, it’s easy to have the last element of theta as the bias weight and then have the last covariate, i.e. the last column of X, be all ones.

My issue is all of the posts on pymc3 linear regression I’ve seen, both on this discourse and abroad, model the bias term explicitly as a separate variable, e.g. like this.

Given my problem setup (that goes beyond standard Bayesian linear regression), it’s much easier for me to specify the model with the bias term baked into the covariate data. Is that possible to do within a pymc3 model?

X = pm.distributions.continuous.Normal('X', mu=mu, sd=sd, shape=(n, k))
X = np.concatenate((X, np.ones(n, 1))  # is there a way to do this in pymc3?

Eventually figured out the necessary theano manipulations with help from this post. Figured I’d post anyways in case it helps others. (Mods, feel free to edit/delete as necessary.)

The ones column vector below needs to be created in a thean-ic way. Everything else is standard pymc3. Note that this specific code won’t work for k=1 because of the way pymc3 internally handles shape. This code can be modified as needed to handle that corner case, I’m just not posting it here for clarity. Also note the below code doesn’t incorporate any of the priors needed for actual Bayesian linear regression.

import numpy as np
import pymc3 as pm
import theano.tensor as T

k = 2 # number of covariates, not including bias term
n = 100 # number of individuals
with pm.Model():

    x_temp = pm.distributions.continuous.Normal('X', mu=np.array([0] * k), sd=1, shape=(n, k))
    ones = T.shape_padright(pm.math.ones_like(x_temp[:, 0]))
    
    x = pm.math.concatenate((x_temp, ones), axis=1)

    thetas = pm.distributions.multivariate.MvNormal('t', mu=np.array([0] * (k + 1)),
                                                         cov=np.diag(np.ones(k+1)),
                                                         shape=(1, k + 1))

    y = pm.distributions.continuous.Normal('y', mu=pm.math.flatten(pm.math.dot(thetas, x.T)), sd=1, shape=n)

    trace = pm.sampling.sample()
1 Like