# Matrix operations involving observed and latent variables

Hi all,

Long time Stan user giving PyMC3 a shot. I’m trying to wrap my head around implementing a custom log likelihood that involves matrix operations with observed data and latent variables. Specifically, I don’t understand how to do math with random variables and `TensorVariables`.

Here’s an dumb model that illustrates my confusion (this is not the model I actually want to implement but illustrates where I’m getting tripped up). The generative model is

``````vg, ve ~ priors
b | vg ~ MVN(0, C*vg)
y | b, ve ~ MVN(Xb, I * ve)
``````

where `C` is a known m-by-m matrix and `b` is a vector of latent effects. I simulate these data via:

``````import numpy as np
import pymc3 as pm
from pymc3 import math as pmm

n=200
m=100
vg=.4
ve=.6
X = np.random.multivariate_normal(np.zeros(m), np.eye(m), size=n)
u = .1
C = (np.eye(m) + np.outer(u,u))
S = C*vg/m
beta = np.random.multivariate_normal(np.zeros(m), S)
e = np.random.multivariate_normal(np.zeros(n), np.eye(n)*ve, size=1)
y = X @ beta + e
``````

I have no trouble implementing this using the built-in MVN functions. Here’s a working PyMC3 implementation:

``````with pm.Model() as lmm:
## theta
vg  = pm.HalfCauchy(name='vg', beta=1.0);
ve  = pm.HalfCauchy(name='ve', beta=1.0);

## b | theta
b = pm.MvNormal(name='b',mu=np.zeros(m),cov=C*vg/m,shape=(1,m))
linpred = b @ X.T

## y | Xb, theta
loglik = pm.Normal(name="loglik", mu=linpred, sigma=ve, observed=y, shape =n)
``````

Even though `b` is a latent vector and `X` is some observed data, there’s no problem computing `linpred`. However, when I try to implement the MVN lpdf manually, I run into `TypeError: unsupported operand type(s) for @: 'TensorVariable' and 'MultiObservedRV'`. Here’s my broken attempt to implement this manually:

``````with pm.Model() as lmm2:
## theta
vg  = pm.HalfCauchy(name='vg', beta=1.0);
ve  = pm.HalfCauchy(name='ve', beta=1.0);

## b | theta
def mvn_lpdf(C):
cov=C*vg/m
return -(pmm.logdet(cov)+ pmm.dot(pmm.matrix_inverse(cov) @ b,b))
b = pm.DensityDist(name='b',logp=mvn_lpdf,observed={"C":C})
linpred = b @ X.T

## y | Xb, theta
loglik = pm.Normal(name="loglik", mu=linpred, sigma=ve, observed=y, shape =n)
``````

What’s causing the type errors in the ` return -(pmm.logdet(cov)+ pmm.dot(pmm.matrix_inverse(cov) @ b,b))` line?

Thanks,
Richard

Welcome!

I can’t get your example to run at all because the `b` in that same line is undefined. what am I missing?

Quick note:

in this line both `b`s should be `b.T` but I don’t appear to be able to edit this.

To answer your question, my understanding of PyMC3 is worse than I realized! I didn’t know that a variable defined in the context of one model `lmm` is within scope for a separate model `lmm2`. Since I was running `lmm` first, I didn’t even get the undefined error, so I’ve been trying to debug the wrong thing.

`b` is supposed to be a latent variable with density `mvn_lpdf`. I would like model `lmm2` to be the same as `lmm`. It’s not clear to me how I define a latent variable in terms of its density. FWIW, in Stan, I’d construct a function `mvn_lpdf(b,C, vg)`, declare `b` as a vector parameter, and then call `target+=mvn_lpdf( b | C, vg)` in the model block.

You can check out the documentation for `DensityDist` here and the developers guide here for some info about the signature of the logp function passed when creating a `DensityDist` object.

I believe that the following is a working model that is roughly halfway between the two models you presented. It uses a DensityDist to define a custom RV, but instead of hand-coding the `logp()`, I just used the `logp()` from `pm.MvNormal`. But hopefully that illustrates how to use the DensityDist. Now you just need to replace the `logp()` with your own custom function. The `random` argument is used during prior predictive sampling and may not be needed in your case (not 100% on that).

``````with pm.Model():
## theta
vg  = pm.HalfCauchy(name='vg', beta=1.0);
ve  = pm.HalfCauchy(name='ve', beta=1.0);

mvnormdist = pm.MvNormal.dist(mu=np.zeros(m),
cov=C*vg/m,
shape=(1,m))
b = pm.DensityDist(
'b',
logp=mvnormdist.logp,
random = mvnormdist.random,
shape=(1,m)
)
linpred = b @ X.T

## y | Xb, theta
loglik = pm.Normal(name="loglik", mu=linpred, sigma=ve, observed=y, shape=n)
``````

The main difference between the two original codes is the use of the `observed` kwarg. There are two things to be aware of. DensityDist objects are very flexible and powerful and at the same time (or because of that) are tricky to use.

Observed variables don’t take different values for each draw while sampling. You said you have used Stan. Stan does the same. When using the `~` notation in Stan, both observed and latent variables are declared with the `x ~ distribution()` syntax, and being observed or not depends on whether `x` is defined in the `data` block or not. Like in PyMC, you can use `x` at any other point in the model too, but if it’s observed its values will be constant. Your first `b` definition sets a multivariate normal prior given a mu and a covariance, the second doesn’t.

In PyMC <4.0, DensityDist can take a single array as `observed` or can take a dictionary. When taking a dictionary it doesn’t matter if the dictionary has a single item, the generated object is this `MultiObservedRV` object which has potentially multiple observed variables in it (that are also completely arbitrary) and therefore can’t be used for any linear algebra operation.

Bonus note. DensityDist is generally used for observed variables, that only need the logp term to sample from the posterior. To still be able to do forward sampling (i.e. sample from the prior, prior predictive or posterior predictive distributions) you need to also use the random kwarg as @cluhmann showed in his example which also avoids using the observed kwarg and is the recommended way to use DensityDist.

1 Like

Thanks @cluhmann and @OriolAbril for the help so far, but I’m still having difficulty seeing how to define a custom density. Using a built in logpdf such as `mvnormdist.logp` in @cluhmann 's example works fine, but my reason for wanting to use `DensityDist()` is to implement a distribution that isn’t built in. The only example of a custom lpdf I can find in the documentation is the exponential survival distribution, but this is a simple univariate density. How can I implement a lpdf that involves matrix operations?

Here’s a simpler example that hopefully helps get my question across. First I generate the data

``````import numpy as np
import pymc3 as pm
from pymc3 import math as pmm

n=200 ## sample size
m=100 ## number of random effects
var_beta=.5/np.sqrt(m) ## variance of  latent random effects
var_resid=.5 ## residual variance

## design matrix
X = np.random.multivariate_normal(np.zeros(m), np.eye(m), size=n)

## latent random effects
beta = np.random.multivariate_normal(np.zeros(m), np.eye(m)*var_beta, size=1).T

## residuals
e = np.random.multivariate_normal(np.zeros(n), np.eye(n)*var_resid, size=1).T

## outcome
y = X @ beta + e
``````

I can estimate the variance components using the builtin MVN lpdf without issue:

``````with pm.Model() as lmm:
## priors on variance components
var_beta_hat = pm.HalfCauchy(name='var_beta_hat', beta=1.0);
var_resid_hat  = pm.HalfCauchy(name='var_resid_hat', beta=1.0);

## distribution of latent random effects beta given variance components
mvnormdist = pm.MvNormal.dist(mu=np.zeros(m),
cov=np.eye(m)*var_beta_hat/m,
shape=(1,m))
## DensityDist with built in MVN logp
beta_hat = pm.DensityDist('beta_hat', logp=mvnormdist.logp, shape=(1,m))

## Do some linear algebra with the latent variable
linear_predictor = beta_hat @ X.T

## distribution of outcome given everything else
y_hat = pm.Normal(name="y_hat", mu=linear_predictor, sigma=var_resid_hat, observed=y, shape=n)
``````

However, I’m unable to figure out how to code the density (`-.5*(log det C + b^T C^-1 b + m log 2 pi)`) by hand:

``````with pm.Model() as lmm2:
## priors on variance components
var_beta_hat = pm.HalfCauchy(name='var_beta_hat', beta=1.0);
var_resid_hat  = pm.HalfCauchy(name='var_resid_hat', beta=1.0);

###########  What can I do to make this work without using a built lpdf
## distribution of latent random effects beta given variance components
def custom_logp(beta_hat):
mu_beta = np.zeros(m) ## mean of latent variable
cov_beta = np.eye(m)*var_beta_hat2 ## variance of latent variable
## components of MVN lpdf
tmp = np.linalg.solve(cov_beta, beta_hat)
qform = tmp @ beta_hat
ldet = np.log(np.linalg.det(cov_beta))
return -.5*(qform + ldet + m * log(2*np.pi)
###########

## DensityDist with built in MVN logp
beta_hat = pm.DensityDist('beta_hat', logp=custom_logp, shape=(1,m))

## Do some linear algebra with the latent variable
linear_predictor =  beta_hat.T @ X

## distribution of outcome given everything else
y_hat = pm.Normal(name="y_hat", mu=linear_predictor, sigma=var_resid_hat, observed=y, shape=n)
``````

How can I rewrite the emphasized block of code above to make this work? Again, I explicitly want to avoid using the built-in MVN lpdf as, though it works for this toy example, my intended application involves a niche distribution that isn’t part of PyMC. Thanks again

Doesn’t the `custom_logp` you have (but using theano and pymc.math operations) work already?

I have modified your 2nd model to this:

``````with pm.Model() as lmm2:
## priors on variance components
var_beta_hat = pm.HalfCauchy(name='var_beta_hat', beta=1.0);
var_resid_hat  = pm.HalfCauchy(name='var_resid_hat', beta=1.0);

###########  What can I do to make this work without using a built lpdf
## distribution of latent random effects beta given variance components
def custom_logp(beta_hat):
cov_beta = tt.eye(m)*var_beta_hat ## variance of latent variable
## components of MVN lpdf
qform = pmm.dot(beta_hat, pmm.dot(pmm.matrix_inverse(cov_beta), beta_hat.T))
ldet = tt.log(pmm.det(cov_beta))
return -.5*(qform + ldet + m * tt.log(2*np.pi))
###########

## DensityDist with built in MVN logp
beta_hat = pm.DensityDist('beta_hat', logp=custom_logp, shape=(1,m))

## Do some linear algebra with the latent variable
linear_predictor =  pmm.dot(X, beta_hat.T)

## distribution of outcome given everything else
y_hat = pm.Normal(name="y_hat", mu=linear_predictor, sigma=var_resid_hat, observed=y, shape=n)
``````

which compiles and seems to sample. I am not sure I have converted everything right, my theano knowledge is very basic (I suppose there is some solve there too, but I don’t know about it so I haven’t used it) and it looks quite slow (not sure what could be done about this though).

1 Like

I understand that you don’t want to use one of the built-in distributions. My intention was to simply illustrate how your `logp()` function would need to operate. You can also just check out the `logp()` method of pymc’s MVNormal class (defined here) if you want to see how that works.

1 Like

@OriolAbril Aha I see–now I understand your first comment about observed being the problem. This is very helpful, I should be able to implement my desired model. And, re:the model running slowly, that’s not too surprising given the way I wrote the model. E.g., it explicitly constructs a diagonal matrix and inverts it, doesn’t use the Cholesky used by `pm.MvNormal`, explicitly computes the determinant of an isotropic matrix, etc. So hopefully I should be able to get something speedier in my application.

@cluhmann Apologies, I hope I didn’t sound combative at all, your recommendations, especially pointing me toward the dev documentation, are greatly appreciated!

1 Like

And just for completeness, here’s a working version of the second model with a more efficient representation of the custom lpdf (a speed up of 2 orders of magnitiude relative to my naive use of `pm.MvNormal` in `lmm` above):

``````import pymc3 as pm
from pymc3 import math as pmm
from theano import tensor as tt

with pm.Model() as lmm3:
## priors on variance components
var_beta_hat = pm.HalfCauchy(name='var_beta_hat', beta=1.0);
var_resid_hat  = pm.HalfCauchy(name='var_resid_hat', beta=1.0);

## distribution of latent random effects beta given variance components
def custom_logp(beta_hat):
qform = pmm.dot(beta_hat.T,beta_hat)/var_beta_hat
ldet = m*tt.log(var_beta_hat)
return -.5*(qform + ldet + m * tt.log(2*np.pi))
###########

## DensityDist with built in MVN logp
beta_hat = pm.DensityDist('beta_hat', logp=custom_logp, shape=(m,1))

## Do some linear algebra with the latent variable
linear_predictor =  pmm.dot(X, beta_hat)

## distribution of outcome given everything else
y_hat = pm.Normal(name="y_hat", mu=linear_predictor, sigma=var_resid_hat, observed=y, shape=n)
``````
3 Likes