Multidimensional indexing

Hi,

Quite a simple question, but I could not find an answer: I am trying to draw from a Gaussian with shape (2, 3), then index its rows with one independent variable and its columns with another.

As in Ip = Ip_mu[v_i, v_h], in the following:

import numpy as np
import pymc3 as pm

obs = np.random.normal(0,1, size=(20,1))
v_i = np.random.binomial(1,0.5, size=(20,1))
v_h = np.random.binomial(2,0.5, size=(20,1))
N_i = len(np.unique(v_i))
N_h = len(np.unique(v_h))

model = pm.Model()
with model:
    Ip_mu = pm.Normal('Ip_mu', mu=0, sd=1, shape = (N_i, N_h))
    Ip = Ip_mu[v_i, v_h]
    nx = pm.Normal('nx', mu=Ip, sd = 1, observed=obs)

…/lib/python3.6/site-packages/theano/tensor/subtensor.py:2190: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.

I get the above warning - which I do not really understand.

I am wondering if I should go about this in a different way?

Thanks

P.S. I realize the toy example doesn’t make much sense - just constructed it to clarify the question.

I just ran your model, but I’m not getting any error.

Right, so on my system it gives the warning I mentioned (not an error). Maybe it’s a version thing (running version 3.5, on python 3.6).

But in any case, I was more worried about whether indexing the RV like this: Ip = Ip_mu[v_i, v_h]
is actually how this is meant to be done (rather than the warning itself). Can’t find it documented anywhere.

In other words, I am wondering if someone can confirm that pymc3 internally does something equivalent to this (in terms of indexing):

Ip = np.zeros(np.shape(obs))
for ix_row in range(len(obs)):
    Ip_mu = np.random.normal(0,1, size=(N_i, N_h))
    Ip[ix_row] = Ip_mu[v_i[ix_row], v_h[ix_row]]

Thanks

Sorry, I'm not sure about that. I hope someone else can help.

The indexing behavior in theano is the same as in numpy. If you are unsure about it, you can also assign a test value to Ip_mu and validate the Ip using Ip.tag.test_value:

test_value = np.random.randn(N_i, N_h)
with model:
    Ip_mu = pm.Normal('Ip_mu', mu=0, sd=1, shape=(N_i, N_h), testval=test_value)
    Ip = Ip_mu[v_i, v_h]

Ip2 = np.zeros(np.shape(obs))
for ix_row in range(len(obs)):
    Ip2[ix_row] = test_value[v_i[ix_row], v_h[ix_row]]

np.testing.assert_equal(Ip2, Ip.tag.test_value)
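
To see the same indexing rule in plain NumPy, without a model, here is a minimal sketch. The names table, picked and looped are made up for illustration, and the table shape is fixed at (2, 3) rather than derived from np.unique as in the original post. It indexes one fixed array with two (20, 1) integer arrays and compares the result against an explicit loop.

```python
import numpy as np

rng = np.random.RandomState(0)

# Index arrays shaped like v_i and v_h in the question
v_i = rng.binomial(1, 0.5, size=(20, 1))  # values in {0, 1}
v_h = rng.binomial(2, 0.5, size=(20, 1))  # values in {0, 1, 2}

# Hypothetical stand-in for a single draw of Ip_mu (shape assumed to be (2, 3))
table = rng.randn(2, 3)

# Integer-array indexing: the index arrays are paired element by element,
# and the result takes their shape, here (20, 1)
picked = table[v_i, v_h]

# The same selection written as an explicit loop over rows
looped = np.zeros((20, 1))
for k in range(20):
    looped[k, 0] = table[v_i[k, 0], v_h[k, 0]]
```

Note that table is created once, outside the loop, so every row looks up entries in the same array.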

I guess you are using the latest Numpy release 1.15.0?

Your warning is the result of NumPy deprecating multidimensional indexing with anything but a tuple. See the release notes for NumPy 1.15.0. Downstream libraries like Theano will need to update their indexing code.
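
As a rough illustration of what the deprecation is about (this is not Theano's actual code; arr and seq are made-up names), the sketch below builds a list holding one index per axis and converts it to a tuple before indexing, which is the form the warning recommends. A plain list of integers, by contrast, is treated as a single array index along the first axis.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# A sequence holding one index per axis: all rows, column 1
seq = [slice(None), 1]

# Recommended form from the warning: arr[tuple(seq)] instead of arr[seq]
col = arr[tuple(seq)]  # equivalent to arr[:, 1]

# A list of integers is interpreted as an array index on axis 0,
# so this selects rows 0 and 1 rather than the element arr[0, 1]
rows = arr[[0, 1]]
```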


Thanks for the testing tip, @junpenglao - very useful. And @JWarmenhoven, that does seem to be the source of the warning (that is my NumPy version).

I started getting the same warning for my models after I updated numpy… is it fine to ignore?

Yes.
Downstream libraries such as Theano will most likely be updated before a future NumPy version starts raising errors on non-tuple indexing of ndarrays.
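
If the warning is too noisy in the meantime, one option is to filter it with Python's standard warnings module. This is only a sketch: the two warnings.warn calls simulate the messages so the example runs without Theano, and matching on the start of the message text is an assumption about how you would want to target it.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record everything by default

    # Ignore only FutureWarnings whose message starts with this text
    warnings.filterwarnings(
        "ignore",
        message="Using a non-tuple sequence",
        category=FutureWarning,
    )

    # Simulated warnings standing in for what the libraries would emit
    warnings.warn(
        "Using a non-tuple sequence for multidimensional indexing is deprecated",
        FutureWarning,
    )
    warnings.warn("some other future change", FutureWarning)

# Only the unrelated warning was recorded
remaining = [str(w.message) for w in caught]
```

In a real script the filterwarnings call would go near the top, before the model is built, so that other FutureWarnings still show up.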