Passing normal distributions as mean for

Is it possible to use a vector of normal distributions as mean for a multivariate normal?

#Defined data
x_data
y_data
z_data
with pm.Model() as basic_model:
    x = pm.Normal('xval', mu=x_data.mean(), sigma=x_data.std())
    y = pm.Normal('yval', mu=y_data.mean(), sigma=y_data.std())
    z = pm.Normal('zval', mu=z_data.mean(), sigma=z_data.std())
    
    likelihood = pm.MvNormal(
        "observations", mu=np.array([x, y, z]), cov=np.eye(3,3) * 0.1, 
        observed=np.array([x_data, y_data, z_data])

    trace.sample()

Hi Zepater,

You can do exactly this, but you need to use pytensor functions instead of numpy. You can see some examples and explanation here.

In your case, you could also simplify things greatly by using vectorized distributions. Basically this:

    x = pm.Normal('xval', mu=0, sigma=1)
    y = pm.Normal('yval', mu=1, sigma=2)
    z = pm.Normal('zval', mu=2, sigma=3)
    d = pt.stack([x, y, z])

Is the same as this:

d = pm.Normal('d', mu=[0,1,2], sigma=[1,2,3])

So in your case, you could write:

# Matrix of data:
X = np.c_[x_data, y_data, z_data]

with pm.Model() as basic_model:
    mu = pm.Normal('mu', mu=X.mean(axis=0), sigma=X.std(axis=0))
    y_hat = pm.MvNormal('y_hat', mu=mu, cov=pt.eye(3) * 0.1, observed=X)
    idata = pm.sample()

Thank you for the help!

Before I pass x, y, z to the data, I perform some operations on it.

x = pm.Normal('xval', mu=0, sigma=1)
y = pm.Normal('yval', mu=1, sigma=2)
z = pm.Normal('zval', mu=2, sigma=3)

#Perform some rotations of the x,y,z here

d = pt.stack([x, y, z])
# Matrix of data:
X = np.c_[x_data, y_data, z_data]

with pm.Model() as basic_model:
    y_hat = pm.MvNormal('y_hat', mu=d, cov=pt.eye(3) * 0.1, observed=X)
    idata = pm.sample()

An error I got during the process

ValueError: Incompatible Elemwise input shapes [(length_of_xyz_data, 3), (3,3)]

This snippet runs for me without problems:

X = np.random.normal(size=(100, 3))

with pm.Model() as basic_model:
    x = pm.Normal('xval', mu=0, sigma=1)
    y = pm.Normal('yval', mu=1, sigma=2)
    z = pm.Normal('zval', mu=2, sigma=3)

    #Perform some rotations of the x,y,z here

    mu = pm.Deterministic('mu', pt.stack([x, y, z]))
    cov = pt.eye(3) * 0.1

    y_hat = pm.MvNormal('y_hat', mu=mu, cov=cov, observed=X)
    idata = pm.sample()

If that runs for you, we could take a look at the transformations of x, y, z you are doing?

Yes, that ran without problems!

Here’s the transformations I am doing where I define three angles.

with pm.Model() as basic_model: 
    x = pm.Normal('xval', mu=0, sigma=1)
    y = pm.Normal('yval', mu=1, sigma=2)
    z = pm.Normal('zval', mu=2, sigma=3)
    angle1 = pm.Norma('angle1', mu=0, sigma=np.pi/4)
    angle2 = pm.Norma('angle2', mu=0, sigma=np.pi/4)
    angle3 = pm.Norma('angle2', mu=0, sigma=np.pi/4)

    #Perform some rotations of the x,y,z here

    phi = np.linspace(0,2 * np.pi, 100)
    theta = np.linspace(0, np.pi, 50)
    phi, theta = np.meshgrid(phi, theta)

    xc = x*np.sin(theta) * np.cos(phi)
    yc = y*np.sin(theta) * np.sin(phi)
    zc = z*np.cos(theta)

    R1 = np.array([np.cos(angle1), -np.sin(angle1), 0], [np.sin(angle1), np.cos(angle1), 0], [0,0,1]])
    R2 = np.array([1,0,0], [0, np.cos(angle2), -np.sin(angle2)], [0, np.sin(angle2), np.cos(angle2)]])
    R3 = np.array([[np.cos(angle3), -np.sin(angle3), 0], [np.sin(angle3), np.cos(angle3), 0], [0,0,1]])

   rotated_coords = np.dot(R3, np.dot(R2, np.dot(R1, np.array([xc.flatten(), yc.flatten(), zc.flatten()]))))

   rotated_coords = rotated_coords.reshape(3, -1)

    mu = pm.Deterministic('mu', pt.stack([rotated_coords[0][0], rotated_coords[1][0], rotated_coords[2][0]]))
    cov = pt.eye(3) * 0.1

    y_hat = pm.MvNormal('y_hat', mu=mu, cov=cov, observed=X)
    idata = pm.sample()

You want to get into the habit of using the pytensor functions instead of numpy inside pymc models. Sometimes things work as expected, but it’s not technically supported. In particular, all of those np.array calls with pymc variables will not work as expected.

I guess the only thing missing is meshgrid, which just looks like this:

def pt_meshgrid(*xi):
    ndim = len(xi)
    s0 = (1,) * ndim
    outputs = [x.reshape(s0[:i] + (-1,) + s0[i + 1:]) for i, x in enumerate(xi)]
    return pt.broadcast_arrays(*outputs)

So you should rewrite you code as:

with pm.Model() as basic_model: 
    x = pm.Normal('xval', mu=0, sigma=1)
    y = pm.Normal('yval', mu=1, sigma=2)
    z = pm.Normal('zval', mu=2, sigma=3)
    angle1 = pm.Normal('angle1', mu=0, sigma=np.pi/4)
    angle2 = pm.Normal('angle2', mu=0, sigma=np.pi/4)
    angle3 = pm.Normal('angle3', mu=0, sigma=np.pi/4)

    #Perform some rotations of the x,y,z here

    phi = pt.linspace(0,2 * np.pi, 100)
    theta = pt.linspace(0, np.pi, 50)
    phi, theta = pt_meshgrid(phi, theta)

    xc = x*pt.sin(theta) * pt.cos(phi)
    yc = y*pt.sin(theta) * pt.sin(phi)
    zc = z*pt.cos(theta)

    R1 = pt.stack([[pt.cos(angle1), -pt.sin(angle1), 0], [pt.sin(angle1), pt.cos(angle1), 0], [0,0,1]])
    R2 = pt.stack([[1,0,0], [0, pt.cos(angle2), -pt.sin(angle2)], [0, pt.sin(angle2), pt.cos(angle2)]])
    R3 = pt.stack([[pt.cos(angle3), -pt.sin(angle3), 0], [pt.sin(angle3), pt.cos(angle3), 0], [0,0,1]])

    rotated_coords = pt.dot(R3, pt.dot(R2, pt.dot(R1, pt.stack([xc.flatten(), yc.flatten(), zc.flatten()]))))

    rotated_coords = rotated_coords.reshape((3, -1))

    mu = pm.Deterministic('mu', pt.stack([rotated_coords[0][0], rotated_coords[1][0], rotated_coords[2][0]]))
    cov = pt.eye(3) * 0.1

    y_hat = pm.MvNormal('y_hat', mu=mu, cov=cov, observed=X)
    idata = pm.sample()

That meshgrid function would make a good first PR, you’d just have to write a couple tests for it :slight_smile:

Okay, I will look into that! Thank you! When I evaluated rotated_coords, I obtain the following error.

Value Error: Cannot reshape array of size 100 into shape ()

The code I posted runs for me without problems. What versions of pymc/pytensor are you running?

Version 5.10.3. I think the issue that I found was that the size of the observed data (as opposed to X in this minimum working example) is (2600,3). When I only use a smaller subset, say (50,3), it works. Is there a size limit?

No, there shouldn’t be

What I noticed was that the size of phi, theta in the linspace definitions seem to have an effect of the number of observed values allowed. Could you increase the size of the random X to 200 and see if that causes any issues on your end?

In the code I posted all the computation is independent of X (it’s just used for observations). Could you share exactly the code you are running?