I’m trying to do a simple multivariate linear regression.
import pymc3 as pm
import theano.tensor as tt
# my_data has dimensions m x n, set elsewhere
# y has dimension n, set elsewhere
# Both are np.array
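with pm.Model() as model:
    # One coefficient per row of my_data (one per predictor)
    coeffs = pm.Uniform("coeffs", lower=-1, upper=1, shape=my_data.shape[0])
    # (m,) dot (m, n) -> (n,), which matches the shape of y
    modeled_y = tt.dot(coeffs, my_data)
    obs = pm.Normal("obs", mu=modeled_y, sd=1, observed=y)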
I think this works, but I feel uneasy about how I define modeled_y, which is a tensor. In the examples I’ve seen, mu is passed as an np.array. It seems that if modeled_y is a tensor, then y should be a tensor as well. But when I pass observed=tt.as_tensor_variable(y), the results are strange and bad. I am confused about what is going on.
Can somebody help me understand this better? coeffs is a set of trial values. modeled_y is a tensor where each element is a set of trial points. Then what is obs, or what would it be if I hadn’t passed “observed=y”? Would it be a tensor of the same shape, but with a transformed set of points for each element? When I pass “observed=y”, with y the same shape as the tensor but as an np.array, does that mean (as I think it does) that the i-th element of y is treated as an observation drawn from the i-th element of modeled_y? And what happens if I pass y as a tensor?
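To make my mental model concrete, here is a minimal NumPy-only sketch of what I assume “observed=y” does: for a given trial value of coeffs, each y[i] is scored against a Normal centred on modeled_y[i], and the per-element log densities are summed into one number. The sizes, the synthetic data, and the helper names normal_logpdf and log_likelihood are my own inventions for illustration, not PyMC3 functions, and I am assuming one coefficient per row of my_data.

```python
import numpy as np

# Hypothetical sizes and synthetic data, for illustration only.
m, n = 3, 5
rng = np.random.default_rng(0)
my_data = rng.normal(size=(m, n))  # m predictors, n observations
y = rng.normal(size=n)

def normal_logpdf(x, mu, sd):
    """Elementwise log density of Normal(mu, sd) evaluated at x."""
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((x - mu) / sd) ** 2

def log_likelihood(coeffs):
    """Sum of per-observation log densities for one trial value of coeffs."""
    modeled_y = coeffs @ my_data  # (m,) @ (m, n) -> (n,)
    return normal_logpdf(y, modeled_y, 1.0).sum()

coeffs_trial = np.array([0.1, -0.2, 0.3])  # one trial point inside (-1, 1)
# y[i] is paired with modeled_y[i]; there is one log density per observation.
elementwise = normal_logpdf(y, coeffs_trial @ my_data, 1.0)
total = log_likelihood(coeffs_trial)
```

If that picture is right, then y plays the role of fixed data in this sum while coeffs is the only thing that varies between trial points.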
Thank you.