Confusion about shape/format of variable

I’m trying to do a simple multivariate linear regression.

import pymc3 as pm
import theano.tensor as tt

# my_data has dimensions m x n, set elsewhere
# y has dimension n, set elsewhere
# Both are np.array
with pm.Model() as model:
    coeffs = pm.Uniform("coeffs", lower=-1, upper=1, shape=n)
    modeled_y = tt.tensordot(coeffs, my_data)
    obs = pm.Normal("obs", mu=modeled_y, sd=1, observed=y)

I think this works, but I feel uneasy about the way I enter modeled_y, which is a tensor. In the examples I've seen, mu is passed as an np.array. It seems like if modeled_y is a tensor, then I would want y to be a tensor as well. But when I pass observed=tt.as_tensor_variable(y), the results are strange and bad. I'm confused about what is going on.
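
For concreteness, here is a self-contained toy version of the two calls I mean (the sizes, the use of tt.dot, and the length-m coeffs are just placeholders so the snippet runs on its own; my real model is the one above):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

# Toy stand-ins for my real my_data / y
m, n = 3, 50
rng = np.random.default_rng(0)
my_data = rng.normal(size=(m, n))                               # m x n
y = np.array([0.5, -0.2, 0.1]) @ my_data + rng.normal(size=n)   # length n

# Version 1: observations passed as a plain np.array
with pm.Model() as model_array:
    coeffs = pm.Uniform("coeffs", lower=-1, upper=1, shape=m)
    modeled_y = tt.dot(coeffs, my_data)    # shape (n,)
    pm.Normal("obs", mu=modeled_y, sd=1, observed=y)

# Version 2: observations wrapped as a theano tensor first
with pm.Model() as model_tensor:
    coeffs = pm.Uniform("coeffs", lower=-1, upper=1, shape=m)
    modeled_y = tt.dot(coeffs, my_data)
    pm.Normal("obs", mu=modeled_y, sd=1, observed=tt.as_tensor_variable(y))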

Can somebody help me understand this better? coeffs is a set of trial values, and modeled_y is a tensor whose elements are each a set of trial points. So what is obs, or what would it be if I hadn't passed observed=y? Would it be a tensor of the same shape, but with a transformed set of points for each element? When I do pass observed=y, with y the same shape as the tensor but as an np.array, does that mean (as I think it does) that the i-th element of y is treated as an observation of the i-th element of modeled_y? And what is happening if I pass y as a tensor instead?

Thank you.

Internally, you can think of everything as a tensor, so observed=tt.as_tensor_variable(y) should also work, but sometimes there are shape problems. The easiest way to validate is to check the logp: if the two ways give different values, try reshaping y to shape=(n, 1).
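
For instance, with the two toy models from your post (a rough sketch; check_test_point, logp and test_point are standard PyMC3 Model attributes):

# Per-variable logp at the default test point; -inf or nan here usually
# points at a shape/broadcasting problem.
print(model_array.check_test_point())
print(model_tensor.check_test_point())

# The total logp should come out identical for the two ways of passing observed.
print(model_array.logp(model_array.test_point))
print(model_tensor.logp(model_tensor.test_point))

# If the numbers disagree, try passing the observations as a column vector:
# pm.Normal("obs", mu=modeled_y, sd=1, observed=y.reshape(n, 1))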


Sorry, I have to run out and don't have a chance to fully read or respond, but I coincidentally just posted a related topic that might be helpful.

I think I did have a shape issue that was causing a problem.