Linear regression: Bias (intercept) term "baked into" covariate data?

Eventually figured out the necessary Theano manipulations with help from this post. Figured I'd post anyway in case it helps others. (Mods, feel free to edit/delete as necessary.)

The ones column vector below needs to be created in a Theano-friendly way; everything else is standard pymc3. Note that this specific code won't work for k=1 because of the way pymc3 internally handles shape. The code can be modified to handle that corner case; I'm omitting that here for clarity. Also note that the code below doesn't incorporate any of the priors needed for actual Bayesian linear regression.
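As a plain-NumPy analogy for what the `T.shape_padright` plus `pm.math.concatenate` steps below accomplish (a sketch with fixed arrays rather than symbolic tensors; `n`, `k`, and the random data are just placeholders):

```python
import numpy as np

n, k = 100, 2
x_temp = np.random.randn(n, k)               # covariates, shape (n, k)
ones = np.expand_dims(np.ones(n), -1)        # analog of shape_padright: (n,) -> (n, 1)
x = np.concatenate((x_temp, ones), axis=1)   # design matrix with bias column, (n, k + 1)
```

The bias term then comes along for free in the matrix product, since the last column of `x` multiplies the last coefficient.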

import numpy as np
import pymc3 as pm
import theano.tensor as T

k = 2    # number of covariates, not including the bias term
n = 100  # number of individuals

with pm.Model():

    # Latent covariates, shape (n, k)
    x_temp = pm.Normal('X', mu=np.zeros(k), sd=1, shape=(n, k))

    # Symbolic column of ones: (n,) padded on the right to (n, 1)
    ones = T.shape_padright(pm.math.ones_like(x_temp[:, 0]))

    # Design matrix with the bias column appended, shape (n, k + 1)
    x = pm.math.concatenate((x_temp, ones), axis=1)

    # Coefficients, with the last entry acting as the intercept
    thetas = pm.MvNormal('t', mu=np.zeros(k + 1),
                         cov=np.diag(np.ones(k + 1)),
                         shape=(1, k + 1))

    # (1, k + 1) dot (k + 1, n) -> (1, n), flattened to (n,)
    y = pm.Normal('y', mu=pm.math.flatten(pm.math.dot(thetas, x.T)), sd=1, shape=n)

    trace = pm.sample()
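Worth noting: the Theano dance is only needed because the covariates above are themselves random variables. If the covariates are observed data, you can build the design matrix in plain NumPy up front and skip the symbolic ones column entirely. A sketch under that assumption (the data, `theta_true`, and noise level are made up for illustration):

```python
import numpy as np

n, k = 100, 2
rng = np.random.RandomState(0)
X_data = rng.randn(n, k)

# Append the bias column before the model is ever built
X = np.column_stack([X_data, np.ones(n)])     # shape (n, k + 1)

theta_true = np.array([1.5, -2.0, 0.5])       # last entry is the intercept
y = X @ theta_true + 0.1 * rng.randn(n)

# Inside pm.Model(), mu would then just be pm.math.dot(X, thetas)
# with thetas of shape (k + 1,) -- no ones_like/shape_padright needed.
```

Ordinary least squares on this fixed design matrix recovers the intercept along with the other coefficients, which is the whole point of baking it into the data.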