Any suggestions on the recurring error "ValueError: shape mismatch: value array of shape (62,) could not be broadcast to indexing result of shape (0,)"

Hello,
I am a new user of PyMC. I cannot debug my code because I am really confused by the "shape mismatch" error. In my mind, the line flagged by the error,

expend_con = pt.set_subtensor(expend_con[i, ind_tem[i]], concatenated_values)

has no problem, but the error appears when pm.sample() runs. I have attached the code and the data it needs; I would really appreciate it if anyone could point me in the right direction.

Many thanks in advance!

Combo_Dmatrix.csv (11.0 KB)
COMBO_OTU.csv (15.5 KB)
truevalues.csv (1.0 KB)
ynew.csv (35.2 KB)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pymc as pm
import pytensor.tensor as pt
import arviz as az

D_phylo = pd.read_csv('Combo_Dmatrix.csv')
COMBO_OTU = pd.read_csv('COMBO_OTU.csv')
y_obs = pd.read_csv('ynew.csv')
y_true = pd.read_csv('truevalues.csv')

D_phylo = D_phylo.to_numpy()
D_phylo = D_phylo[:,1:]
COMBO_OTU = COMBO_OTU.to_numpy()
COMBO_OTU = COMBO_OTU[:,1:]

y_obs = y_obs.to_numpy()
y_obs = y_obs[:,1:]
y_true = y_true.to_numpy()

K1_combo = int(62 * 0.85)

cov = 1**2 * np.exp(-2 * 0.25 * D_phylo)  # kernel: sigma**2 * exp(-2 * rho * D), with sigma = 1, rho = 0.25
samples = y_obs.shape[0]
specise = y_obs.shape[1]
phai = np.zeros([samples,1])

N = np.sum(COMBO_OTU[0:samples,:], axis=1).reshape(-1, 1)

tt = 1 #samples
spe = ['spe{}'.format(i) for i in range(specise)]
sam = ['sam{}'.format(i) for i in range(tt)]

coords = {"specie": spe, "sample": sam}
ind = np.arange(specise)
ind = pt.as_tensor_variable(ind)
cov = pt.as_tensor_variable(cov)

pp = pt.as_tensor_variable(np.zeros([tt,specise]))
true_abundance = pt.as_tensor_variable(np.zeros([tt,specise]))
obs_like = np.array(y_obs[:,:].tolist())
NN = np.sum(np.array(y_obs[:,:].tolist()),axis = 1)
  
if __name__ == '__main__':
    with pm.Model(coords=coords) as Multi:        
        #########Prior########
        alpha = pm.Normal('alpha',mu =  10,   sigma = 1,  shape = specise)       
        theta = pm.Normal('theta', mu = 10 , sigma = 1, shape = specise)
       
        T = pm.HalfNormal('T',sigma = 10)
        d = pm.HalfNormal('d',sigma = 10, shape = specise)
        
        p = pm.Deterministic('p',1-pm.math.exp(-d/T)) #1/(1+pm.math.exp(-alpha))
        delta = pm.Bernoulli("delta",p = p, dims = ("sample", "specie"))   #np.ones([specise])
                               
        ind_tem = [ind[pm.math.lt(delta[i,:],1)] for i in range(tt)]
        pll = [ind_tem[i].shape[0] for i in range(tt)]       
                   
       
        th_t = []
        ct = []
        for i in range(tt):
            theta_tem = pt.as_tensor_variable(np.ones([specise]))
            cov_tem = pt.as_tensor_variable(np.diag(np.ones([specise])))
            th_t.append(pt.set_subtensor(theta_tem[0:pll[i]], theta[ind_tem[i]])) 
            ct.append(pt.set_subtensor(cov_tem[0:pll[i],0:pll[i]], cov[ind_tem[i]][:, ind_tem[i]]))
        
        
        U = pm.MvNormal("U", mu = th_t, cov = ct)    
        U = pt.ones([tt,specise])  # note: this immediately overwrites the MvNormal "U" defined above
        expend_con = pt.zeros([tt,specise])
        
        for i, p2 in enumerate(pll):
            tem = pm.math.sum(pm.math.exp(U[i, :p2-1]))
            concatenated_values = pm.math.concatenate((pm.math.exp(U[i, :p2-1]) / (1 + tem), [1 / (1 + tem)]))
    
            if  pm.draw(p2) == 0:
                expend_con = pt.set_subtensor(expend_con[i,:], concatenated_values)                              
            else:
                expend_con = pt.set_subtensor(expend_con[i, ind_tem[i]], concatenated_values)#concatenated_values[0]
          
        ########Likelihood######## 
                         
        likelihood_y = pm.Multinomial('likelihood', n =NN[0:tt], p = expend_con[0:tt,:], observed = obs_like[0:tt,:]) #N should be total counts of species of each sample

        ########sampling######## 
        trace = pm.sample(step = pm.Slice())
        az.plot_trace(trace)
        
        ########output########
        summary = az.summary(trace,hdi_prob=0.95,stat_focus = 'mean')
        print(summary)

Hi,

Since eval statements work fine for the likelihood (which means the model has a valid structure at whatever random point eval happens to choose) but evaluation fails at the initial point (as suggested by the error stack involving point = model.initial_point()), something in your model does not make sense at the initial point. Indeed, if you look at

pll[0].eval( {'delta': np.array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])})

it gives a value of [0]. Later on you use its first element, p2, in

U[i, :p2-1]

which relies on p2 being positive; with p2 = 0 the selected index set ind_tem[i] is empty, and that is likely the root of your problem (the error says a value array of shape (62,) could not be broadcast to an indexing result of shape (0,), i.e. 62 values being written into zero selected slots).
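You can reproduce that outside the model. Here is a minimal toy sketch (my own, not taken from your code) that triggers the identical error:

import numpy as np
import pytensor.tensor as pt

# Writing 62 values through an empty advanced index raises the same
# "shape mismatch ... (62,) ... (0,)" ValueError when evaluated.
row = pt.zeros(62)
empty_idx = pt.as_tensor_variable(np.array([], dtype="int64"))  # like ind_tem[i] when all delta == 1
bad = pt.set_subtensor(row[empty_idx], pt.ones(62))
bad.eval()  # ValueError: shape mismatch: value array of shape (62,) ...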

You can for instance try setting something like


initvals={"delta":np.array([[0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1,
        0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1,
        0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0]])}

and later

delta = pm.Bernoulli("delta",p = p, dims = ("sample", "specie"),
                             initval=initvals["delta"])

Now the model starts running, though very slowly on my end. I don't know why; maybe it is a bad initial point. But I also see that you have a peculiar structure here, where

for i, p2 in enumerate(pll):
    tem = pm.math.sum(pm.math.exp(U[i, :p2-1]))
    concatenated_values = pm.math.concatenate((pm.math.exp(U[i, :p2-1]) / (1 + tem), [1 / (1 + tem)]))

    if  pm.draw(p2) == 0:
        expend_con = pt.set_subtensor(expend_con[i,:], concatenated_values)
    else:
        expend_con = pt.set_subtensor(expend_con[i, ind_tem[i]], concatenated_values)#concatenated_values[0]

you update some of your values based on pm.draw(p2). I wonder if there is a better way to achieve this (why not just the symbolic comparison p2 == 0, for instance? You could probably also use a switch statement to make it vectorized; see the toy sketch below). Or maybe it is fine, I don't know, but in my limited experience this is the first time I have seen a model structure with pm.draw. I will let more experienced folks comment on that.
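For illustration, a toy sketch of that switch idea (not your model; all names here are made up):

import numpy as np
import pytensor
import pytensor.tensor as pt

# pt.switch keeps both branches in the graph and selects at run time,
# so no pm.draw is needed while building the graph.
p2 = pt.lscalar("p2")  # symbolic count, standing in for pll[i]
fallback = pt.ones(4) / 4  # uniform row used when nothing is selected
filled = pt.as_tensor_variable(np.array([0.7, 0.1, 0.1, 0.1]))
row = pt.switch(pt.eq(p2, 0), fallback, filled)

f = pytensor.function([p2], row)
print(f(0))  # [0.25 0.25 0.25 0.25]
print(f(3))  # [0.7  0.1  0.1  0.1 ]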

PS: I also noticed you are using slice sampling for everything, but if you let PyMC choose with trace = pm.sample(), it actually runs much faster.

You don’t want to use vanilla Python if/else in PyMC, as only one of the branches will be used from that point on. You need to use pytensor.ifelse.ifelse or pytensor.tensor.switch.

In general, if you have a very nested graph, you should probably use a Scan for it (scan – Looping in PyTensor — PyTensor dev documentation), which is the PyTensor way of representing loops.
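For reference, a minimal scan looks like this (a toy example of the pattern, not the model above):

import numpy as np
import pytensor
import pytensor.tensor as pt

# Softmax-normalize each row of a matrix inside the scan body,
# instead of looping over the rows in Python.
X = pt.matrix("X")

def normalize_row(row):
    e = pt.exp(row)
    return e / e.sum()

rows, _ = pytensor.scan(fn=normalize_row, sequences=[X])
f = pytensor.function([X], rows)
print(f(np.zeros((2, 3))))  # every entry becomes 1/3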

This pymc-example may be good to browse: Fitting a Reinforcement Learning Model to Behavioral Data with PyMC — PyMC example gallery

Just out of curiosity, which branch gets chosen when building the graph? Whatever the initial value of p2 decides?

Yes, all PyTensor sees is the Python branch that was actually executed when that code ran.
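A tiny demonstration of that (a made-up toy model, just to show the mechanics):

import pymc as pm

# pm.draw evaluates z once, while the Python code runs, so the `if`
# permanently bakes a single branch into the graph.
with pm.Model():
    z = pm.Bernoulli("z", p=0.5)
    if pm.draw(z) == 1:  # one concrete draw decides the branch, forever
        out = z + 10
    else:
        out = z - 10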

Thank you so much for your suggestions. They indeed solved the problem.

One more quick question:
My current likelihood is Multinomial. When the observation matrix is 1 × 62, the program takes around 4 minutes, but it is very, very slow when the observation matrix is 96 × 62. I checked the time taken by the code before pm.sample(): less than 1 s. I wonder, is my data large, or is the Multinomial inherently slow? Any suggestions for speeding it up?

Hi,

Thank you so much for helping me debug, and sorry for the late reply. As suggested by @ricardoV94, I had mixed PyTensor and “normal” Python if/else; pytensor.ifelse.ifelse and pytensor.scan solved my problem.

I am putting the revised code here for my notes.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pymc as pm
import pytensor.tensor as pt
import arviz as az
import pytensor
import time

D_phylo = pd.read_csv('Combo_Dmatrix.csv')
COMBO_OTU = pd.read_csv('COMBO_OTU.csv')
y_obs = pd.read_csv('ynew3.csv')
y_true = pd.read_csv('truevalues.csv')

D_phylo = D_phylo.to_numpy()
D_phylo = D_phylo[:,1:]
COMBO_OTU = COMBO_OTU.to_numpy()
COMBO_OTU = COMBO_OTU[:,1:]

y_obs = y_obs.to_numpy()
y_obs = y_obs[:,1:]
y_true = y_true.to_numpy()

K1_combo = int(62 * 0.85)
# rho = 0.25
# sigma = 1
cov = 1**2 * np.exp(-2 * 0.25 * D_phylo)
samples = y_obs.shape[0]
specise = y_obs.shape[1]
phai = np.zeros([samples,1])

N = np.sum(COMBO_OTU[0:samples,:], axis=1).reshape(-1, 1)

tt = samples
chain_length = 2000
tune_length = 1000
spe = ['spe{}'.format(i) for i in range(specise)]
sam = ['sam{}'.format(i) for i in range(tt)]

coords = {"specie": spe, "sample": sam}
ind = np.arange(specise)
ind = pt.as_tensor_variable(ind)
cov = pt.as_tensor_variable(cov)

pp = pt.as_tensor_variable(np.zeros([tt,specise]))
true_abundance = pt.as_tensor_variable(np.zeros([tt,specise]))
obs_like = np.array(y_obs[:,:].tolist())
NN = np.sum(np.array(y_obs[:,:].tolist()),axis = 1)



zk = np.arange(specise, dtype='int32')
zk = pt.as_tensor_variable(zk)
zp = np.ones([specise], dtype='int64')*int(62)
zp = pt.as_tensor_variable(zp)

start = time.time() 


def def_test2(i,pll, U,expend_con,ind_tem):
    p2 = pll
    tem = pm.math.sum(pm.math.exp(U[i, :p2-1]))
    concatenated_values = pm.math.concatenate((pm.math.exp(U[i, :p2-1]) / (1 + tem), [1 / (1 + tem)]))
    expend_con = pytensor.ifelse.ifelse(pt.eq(p2,0),
                                        pt.set_subtensor(expend_con[i,:], pt.zeros(62)+1/62),
                                        pt.set_subtensor(expend_con[i, ind_tem], concatenated_values))
    return expend_con
  
if __name__ == '__main__':
    with pm.Model(coords=coords) as Multi:        
        #########Prior########
        #alpha = pm.Normal('alpha',mu =  5,   sigma = 1,  shape = specise)       
        theta = pm.Normal('theta', mu = 1 , sigma = 1, shape = specise)
        
        # T = pm.Gamma('T',7.5,1) #15  gamma or lognormal
        T = pm.Gamma('T',0.01,0.01) #15  gamma or lognormal
        #D = pm.HalfNormal('D',sigma = 1, shape = specise)
        p = pm.Deterministic('p',1-pm.math.exp(-D_phylo[-1,:]/T)) # the last column is the reference taxon
        delta = pm.Bernoulli("delta",p = p, shape = specise)   #np.ones([specise])
        # print(time.time()-start)          
        
        ind_tem = ind[pm.math.lt(delta[:],1)]
        pll = ind_tem.shape[0] 
        
        z_ind_tem = zk   
        ind_tem = pytensor.ifelse.ifelse(~pt.all(pll), z_ind_tem, ind_tem)
        zpll =    pytensor.ifelse.ifelse(~pt.all(pll), ind_tem.shape[0], pll)
        
        th_t = []
        ct = []
        for i in range(1):
            theta_tem = pt.as_tensor_variable(np.ones([specise]))
            cov_tem = pt.as_tensor_variable(np.diag(np.ones([specise])))
            t1 =pt.set_subtensor(theta_tem[0:zpll], theta[ind_tem])
            t2 = pt.set_subtensor(cov_tem[0:zpll,0:zpll], cov[ind_tem][:, ind_tem])
            th_t.append(t1) 
            ct.append(t2)  
               
        U = pm.MvNormal("U", mu = th_t, cov = ct, dims = ("sample", "specie"))            
                               
        i = pt.arange(tt)
        expend_con = pt.zeros([tt,specise])+1e-10
        Results, _ = pytensor.scan(def_test2,
                                    sequences = [i],
                                    non_sequences = [pll, U,expend_con,ind_tem])
        
        for i in range(tt):
            expend_con = pt.set_subtensor(expend_con[i,:],Results[i,i,:])
              
        ########Likelihood######## 
                         
        likelihood_y = pm.Multinomial('likelihood', n =NN[0:tt], p = expend_con, observed = obs_like[0:tt,:]) #N should be total counts of species of each sample
        print(time.time()-start) 
        ########sampling######## 
        trace = pm.sample(chains = 4, cores = 8,  step = pm.Metropolis()) #chain_length, tune = tune_length,
        az.plot_trace(trace)
        
        ########output########
        summary = az.summary(trace,hdi_prob=0.95,stat_focus = 'mean')
        print(summary)

I think the slow sampling is related to

U = pm.MvNormal("U", mu = th_t, cov = ct, shape = (tt, specise))

This 62-dimensional normal distribution accounts for most of the time spent, which is understandable.

I have seen in some instances that supplying chol instead of cov speeds up MvNormal sampling, and that seems to be the recommended usage:

https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.MvNormal.html
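For illustration, a rough sketch of the chol usage with a fixed covariance (cov_np is a placeholder here; in your model ct depends on delta, so the factorization would have to happen inside the graph):

import numpy as np
import pymc as pm

# Precompute the Cholesky factor once and pass it via `chol` instead of `cov`.
cov_np = np.eye(62)  # placeholder; stands in for the phylo kernel
chol_np = np.linalg.cholesky(cov_np)

with pm.Model():
    U = pm.MvNormal("U", mu=np.zeros(62), chol=chol_np, shape=(96, 62))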

Also, have you tried NUTS instead of Metropolis? Finally, you can also try numpyro:
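If I remember the API correctly (PyMC >= 5, with numpyro and JAX installed), that is a one-line change inside the model context:

# Samples with NUTS via the numpyro/JAX backend instead of the default one.
trace = pm.sample(nuts_sampler="numpyro")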