Multivariate model not behaving as expected

    import numpy as np
    import pymc as pm

    model = pm.Model()

    with model:

        mus = pm.Normal("mus", mu=hads["distance_modulus"],
                        tau=1.0 / (hads["distance_modulus_error"]**2),
                        shape=(len(hads),), initval=hads["distance_modulus"])

        M_0 = pm.Normal("M_0", mu=0,
                        tau=1.0 / ((5.0 * np.ones(len(band_list)))**2),
                        shape=(len(band_list),))

        alpha = pm.Normal("alpha", mu=0,
                          tau=1.0 / ((5.0 * np.ones(len(band_list)))**2),
                          shape=(len(band_list),))

        EBV = pm.Normal("EBV", mu=hads['gaia_extinction'],
                        tau=1.0 / (np.ones(len(hads))**2),
                        shape=(len(hads),), initval=hads['gaia_extinction'])

        # only the lower bound is set; the upper bound defaults to 1
        sigma_by_band = pm.Uniform("sigma_by_band",
                                   lower=0.00001 * np.ones(len(band_list)),
                                   shape=(len(band_list),))

        b = pm.math.concatenate((mus, M_0, alpha, EBV), axis=0)

        all_sigma = pm.math.stack([sigma_by_band[k] for k in observed_apparent_mag_band_mapping])

        likelihood = pm.Normal('y',
                               mu=pm.math.dot(X, b),
                               tau=1.0 / (all_sigma**2 + observed_apparent_mag_errs**2),
                               observed=observed_apparent_mags)

Here, X is a matrix (numpy.ndarray). I think something goes wrong when I use pm.math.dot(X, b), because the model tends to predict all the variables (for example, all the 'mus') to be quite close to each other, although they should be quite different. How could I solve this problem?

Can you provide complete code, including some data and the hads object you are using to initialize your priors?

Otherwise, one thing to do is to break inside the model and try things like

    b.shape.eval()
    pm.math.dot(X, b).shape.eval()

to see whether you are getting objects of the shape you think you should. Another option is to do a prior predictive check

https://www.pymc.io/projects/docs/en/stable/learn/core_notebooks/posterior_predictive.html

to see if your priors generate results for y that are reasonable to start with. Finally, similar to the above, you can also check things like

    pm.math.dot(X, b).eval({mus: mus_init_value, M_0: ...})

to check whether reasonable initial values for your priors give reasonable mus for your regression.

Finally, since your sigma seems very small, you may want to check again by eval that

    tau = 1.0 / (all_sigma**2 + observed_apparent_mag_errs**2)

is not very large (which would indicate that the sigma of your likelihood is very small, which may be problematic for sampling).
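A quick numeric sanity check of that precision, with made-up numbers: if the observational errors are also tiny, tau blows up and the likelihood becomes nearly a delta function, which is hard to sample.

```python
import numpy as np

all_sigma = np.array([1e-5, 1e-5])              # near the Uniform's lower bound
observed_apparent_mag_errs = np.array([0.02, 0.05])  # fabricated example errors

tau = 1.0 / (all_sigma**2 + observed_apparent_mag_errs**2)
print(tau)  # roughly [2500., 400.] -- fine here; with errs ~ 1e-4 it would be ~1e8
```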

Continuing the discussion from Multivariate model not behaving as expected:

Thanks for your answer! 'hads' records the periods and absolute magnitudes of each star (this is an astrophysical project). Actually, if I write the code like this,

likelihood = {}
counter = 0
for i in range(len(band_list)):
    for j in range(len(hads)):
        likelihood[f"m_{counter}"] = pm.Normal(f'm_{counter}',
            mu = mus[j] + alpha[i] * hads['fundamentalized_log_periods'][j] + M_0[i] 
                    + EBV[j] * (extinction_ccm_a[i] * const_R_V + extinction_ccm_b[i]),
            sigma = sigma,
            observed = observed_apparent_mags[counter])
        counter += 1

the result will be alright (but on a MacBook it reports 'bracket nesting level exceeded maximum of 256'), so I think there is something wrong with pm.math.dot. Though the looped code works on Windows with a dataset of 15 stars, if I expand the dataset to over 100 stars it exceeds the maximum nesting level, so I may have to use a matrix dot product. How could I change the code?

Well, if you have a situation where you can write the looped and non-looped versions together in one model (probably with a smaller version of the data you are using, to make things easier), it is easy to debug where the two values deviate from each other using .eval() on the random variables. Just be careful: if you call, say, RV.eval() and RV depends on some other parameter a, then if you do not initialize a it will be assigned randomly, so it is better to write something like RV.eval({a: 1}) (note that a is supplied to the dict as a variable, not as its string name).

However, if your i's are rows (bands) and your j's are columns (stars), then your normal's mu should be roughly

    mus[None, :] + alpha[:, None] * Lp[None, :] +
    M_0[:, None] +
    EBV[None, :] * (ExtA[:, None] * RV + ExtB[:, None])

where Lp = logperiods, ExtA = extinction_ccm_a, etc. Then you can use observed as it is, modulo correctly specifying the shape of the likelihood, maybe. Can't check without fully running code, though.

In general, if you have two (one-dimensional) vectors A and B and want to multiply them to get a matrix whose (i, j)-th element is A_i B_j, you can do M = A[:, None] * B[None, :]. I suspect that is what you need to do a lot here to vectorise things.
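The outer-product trick in plain NumPy (the broadcasting semantics are the same for PyMC/PyTensor tensors):

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([10.0, 20.0])

# A[:, None] has shape (3, 1), B[None, :] has shape (1, 2);
# broadcasting gives M with shape (3, 2) and M[i, j] == A[i] * B[j].
M = A[:, None] * B[None, :]
print(M)
# [[10. 20.]
#  [20. 40.]
#  [30. 60.]]
```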