Modeling of multiple regression model with array form

jessegrabowski · July 7, 2023, 2:59am

I don’t really know what “structured multiple regression” means to be honest, but from your code it seems you want to do several regression simultaneously, which is no problem. Here’s a similar example, where a user wanted to run a collection of independent regressions all in one go. In your case, you want to run 3 regressions together on the same MCMC chain, which is no problem. There are just some rules to respect:

Don’t mix numpy and pytensor. As I already stated earlier in the thread, you cannot use numpy functions on PyMC random variables, because they are symbolic pytensor variables, and numpy does not know how to handle these. If you want to group together variables, you should generate them together using the size keyword, pass sequences to the distribution prarameters (from which PyMC will use to infer the size), or use pm.math.stack/pm.math.concatenate (as the case may be).
The batch dimension always needs to come first. There’s a demonstration of this here. It will be important for you to familiarize yourself with how dimensions work to do this kind of problem.

Your specific error comes from breaking rule (1) above (mixing pytensor and numpy). When you do this you will get nonsense error like this one, that result from unexpected interactions between the two packages. Never mix them. But there are other problems in your code:

y has shape (3, 1000). Remember, the batch dimension needs to come first, so we will need to transpose it.
You cannot specify pm.Uniform over \mathbb R. That’s called pm.Flat.
You are taking the sum of Slope_Temp over the batch dimension, so the quantity Intercept + Slope_Temp.sum(axis=1) will be only 3 numbers, because Intercept is shape (3,), x is shape (6, 1000), and Slope is shape (3, 6), so (3,) + ((3,6) @ (6, 1000)).sum(axis=1) = (3,) + (3, 1000).sum(axis=1) = (3,) + (3,) = (3,). meaning that all your regressions will be flat lines. I doubt this is what you intended.

Here is how I would start to re-write your code:

with pm.Model() as Structured_Model:
    Sigma = pm.HalfCauchy('Sigma', beta=[1, 6, 4])
    Intercept = pm.Normal("Intercept", mu=0, sigma=[9,4,5])
    
    Slope = pm.math.stack([
        pm.math.stack([pm.Normal('beta11',10,3),
                       pm.Normal('beta12',1,4), 
                       pm.Flat('beta13'),
                       pm.Normal('beta14',7,1),
                       pm.Normal('beta15',2,0.25),
                       pm.Flat('beta16')]),
        pm.math.stack([pm.Flat('beta21'),
                       pm.Flat('beta22'),
                       pm.Normal('beta23',6,2),
                       pm.Normal('beta24',8,np.sqrt(3)),
                       pm.Normal('beta25',3,1),
                       pm.Normal('beta26',5,10)]),
        pm.math.stack([pm.Normal('beta31',9,1),
                       pm.Flat('beta32'),
                       pm.Normal('beta33',12,0.9),
                       pm.Flat('beta34'),
                       pm.Normal('beta35',12,0.9),
                       pm.Flat('beta36')])])

From here, fix the remaining shape and code errors using the above rules.

Topic		Replies	Views
Matrix Multiplication With Multiple Dimensions in PYMC Model v5 modeling	4	1091	July 7, 2022
Create model matrix Questions	7	1490	June 23, 2021
Hierarchical multiple regression with multiple level groups v5 modeling	5	1141	June 23, 2023
Problem with coords/dims in hierarchical model v5	4	816	January 5, 2023
Multiple output regression with cross-dependent variables Questions	0	497	October 5, 2020

Modeling of multiple regression model with array form

Related topics