# Using multi-multivariate PDFs when defining the model's likelihood and sampling step method

## TL;DR:

I want to draw samples from the posterior PDF of some parameters using Metropolis. The parameters are many vectors that are statistically independent. How can I define the step method so that it draws proposals from many low-dimensional normal PDFs?

My data is a set of bidimensional (plane) coordinates, an array of shape `(n, m, 2)`. There is a likelihood for every 2-D coordinate, expressed as a bivariate mean and a specific covariance. How can I define in my model many multivariate normal distributions associated with the data?

## In extenso:

I have some data and a probabilistic model with parameters. The model involves some calculations on the data that I can't write using Theano, so I created a new Theano Op. The model depends on many parameters: an 8-dimensional vector and thirty-three 6-dimensional vectors, so in total 8+33*6=206 scalar parameters. I also know that all these vectors are statistically independent, but I expect statistical dependence between the elements of a single vector. So no inter-vector dependence, but yes intra-vector dependence. When fed some input data and the parameters, my model outputs the means and covariances of a set of 2-dimensional coordinates, so the likelihood is actually many 2-D gaussians.

First I wrote my own implementation of Metropolis using numpy and scipy. It works, but it took a lot of work to test and to deal with the complexities of the model and the data. I would also prefer to rely on a more standardised implementation; this way I can just cite a known library instead of explaining at length something that is not really central to my work.

So I came upon PyMC3 and immediately started re-coding stuff to adapt my code to the library. It works, but it is amazingly slow! It takes about 40 seconds for a single iteration, where my own implementation took about two orders of magnitude less. I suspect the following sources of slowness:

1- Having to draw proposals from a 206-D gaussian. I had to stack all parameters into one long vector because I don't know how to define a step function for many variables, each with its own covariance, in PyMC3. How can I define a step function `step = pm.Metropolis(....)` that can then be passed to `pm.sample` as an argument? I want this step function to draw proposals from one 8-D and thirty-three 6-D normal distributions. I guess that this should be faster, and it's also a better representation of the model, I think. (Another post talks about defining a custom step function, but I just want to compose a list/dictionary/whatever of 34 existing step functions.)

2- Defining a multi-multivariate likelihood. The probability of the data (`n*m` bidimensional coordinates) is evaluated as the probability of a very long vector with all the `n*m*2` (`33*54*2=3564`) values, given a normal distribution in 3564 dimensions. In my own implementation I could evaluate `n*m=1782` separate bidimensional gaussians. The problem is that I have many (bivariate) means, each with its own covariance. I tried to follow the examples, but it results in an error because when defining the likelihood as a multivariate normal, it only accepts a matrix with two indexes as covariance. How can I define the likelihood as many multivariate normal distributions?

3- Maybe Theano is just slow? … I don't know.

So basically stacking everything into huge multidimensional vectors is working, but I figure PyMC3 might have some way to do it more efficiently and elegantly.

## The relevant part of code

```python
class ProjectionT(theano.Op):
    '''
    here I just wrap some numeric python routine that uses numpy and others
    so that it can be called by theano
    '''
    # xInt, xExternal are sized (8,) and (33,6) respectively
    itypes = [T.dvector, T.dmatrix]
    # xyM, cM are sized (33,54,2) and (33,54,2,2) respectively
    otypes = [T.dtensor3, T.dtensor4]

    # Python implementation:
    def perform(self, node, inputs_storage, output_storage):
        Xint, Xext = inputs_storage
        xyM, cM = output_storage

        # do something with data and inputs to calculate xy, cm
        # this is something I can't port to theano because of reasons...
        xy = ...
        cm = ...

        xyM[0] = xy  # write to outputs
        cM[0] = cm

    # optional:
    check_input = True


# ==== model definition
projectionModel = pm.Model()
projTheanoWrap = ProjectionT()  # instance of the function written in python

with projectionModel:
    # Priors for unknown model parameters, uniform PDF between xLow and xUpp
    xAll = pm.Uniform('xAll', lower=xLow, upper=xUpp, shape=allLow.shape)

    xIn = xAll[:Ns[-1]]                  # the 8-D vector
    xEx = xAll[Ns[-1]:].reshape((n, 6))  # the n=33 6-D vectors

    xyM, cM = projTheanoWrap(xIn, xEx)  # apply function written in python

    # the following is the way I found of defining the PDF of the data.
    # It should be 54*33=1782 bidimensional gaussians, each with its own
    # mean and covariance, but I stack everything into one big vector
    mu = T.reshape(xyM, (-1,))  # the data's PDF should be centered here

    # a big covariance with the smaller covariance blocks on the diagonal
    bigC = T.zeros((nTot, nTot))
    c3Diag = T.reshape(cM, (-1, 2, 2))  # list of 2x2 covariance blocks
    bigC = T.set_subtensor(bigC[xInxs, yInxs], c3Diag)

    # this is the PDF that the data (3564 dimensions) should follow
    Y_obs = pm.MvNormal('Y_obs', mu=mu, cov=bigC, observed=yObs)
    # I would prefer to express this likelihood as a list of 2-D gaussians

# ==== Metropolis sampling
start = {'xAll': xAll0}

nDraws = 500
nChains = 6

Scov = allUpp - allLow  # some

with projectionModel:
    # define one big step function for all 206 parameters
    step = pm.Metropolis(S=Scov, scaling=1e-8)
    # I would prefer a list of steps, i.e.:
    # step = [pm.Metropolis(S=...) for i in range(...)]

    trace = pm.sample(draws=nDraws, step=step, start=start, njobs=nChains,
                      chains=nChains)
```

You can try profiling your code to find out where it is slow. I think the theano Op might be one culprit (you said you can't port it to theano, so I guess you are calling an outside library?); the other one is the MvNormal with the large cov matrix. So to answer your questions:

1, when you said "define a step function for many variables each with its own covariance in PyMC3", you meant the cov of the proposal, right? You can try the following:

```python
xIn = pm.Uniform('xIn', lower=xLow, upper=xUpp, shape=8)
xEx = pm.Uniform('xEx', lower=xLow, upper=xUpp, shape=(n, 6))

...
step1 = pm.Metropolis(vars=[xIn], S=Scov1)
step2 = pm.Metropolis(vars=[xEx], S=Scov2)

pm.sample(step=[step1, step2])
```

A few points to be careful about:

• Scov2 is a `33*6 by 33*6` cov matrix. Since you would like thirty-three 6-D normal distributions, you just need to take the Kronecker product of a 33-D identity matrix and a 6-D cov matrix.
• PyMC3 automatically transforms bounded parameters onto the real line, so you should either scale your proposal accordingly, or disable the transform with something like `xIn = pm.Uniform('xIn', lower=xLow, upper=xUpp, shape=8, transform=None)`
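The Kronecker construction for Scov2 is a one-liner in numpy. A minimal sketch, assuming (for illustration only) a shared 6×6 proposal covariance `cov6`:

```python
import numpy as np

# hypothetical 6x6 proposal covariance for the 6-D vectors
cov6 = 0.01 * np.eye(6)

# block-diagonal (33*6)x(33*6) proposal covariance:
# Kronecker product of a 33-D identity with the 6-D covariance
Scov2 = np.kron(np.eye(33), cov6)

print(Scov2.shape)  # (198, 198)
```

Each 6×6 block on the diagonal is `cov6`, and all off-diagonal blocks are zero, which encodes exactly the "independent vectors, dependent elements" structure.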

2, yes, a large cov matrix is likely very slow. But since you have "n*m=1782 separate bidimensional gaussians", you can either try the new `pm.MatrixNormal` distribution, or write your own logp function to compute the logp more efficiently.
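For the custom-logp route, the sum of many independent 2-D gaussian log-densities has a cheap closed form, since 2×2 determinants and inverses are trivial. A NumPy sketch of the arithmetic (the function name and argument shapes are my own illustration, not PyMC3 API; the same expressions can be written with Theano tensor ops and plugged into e.g. `pm.DensityDist`):

```python
import numpy as np

def bivariate_normal_logp(y, mu, cov):
    """Sum of log-densities of many independent 2-D gaussians.
    y, mu: (..., 2) arrays; cov: (..., 2, 2) array of covariance blocks."""
    d = y - mu
    # closed-form determinant of each 2x2 block
    det = cov[..., 0, 0] * cov[..., 1, 1] - cov[..., 0, 1] * cov[..., 1, 0]
    # quadratic form d^T C^{-1} d, using the 2x2 adjugate (no explicit inverse)
    quad = (cov[..., 1, 1] * d[..., 0] ** 2
            - (cov[..., 0, 1] + cov[..., 1, 0]) * d[..., 0] * d[..., 1]
            + cov[..., 0, 0] * d[..., 1] ** 2) / det
    # per-point logp: -log(2*pi) - 0.5*log|C| - 0.5*quad, summed over points
    return np.sum(-np.log(2 * np.pi) - 0.5 * np.log(det) - 0.5 * quad)
```

This avoids ever materialising the 3564×3564 covariance; the cost is linear in the number of points instead of cubic in the stacked dimension.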


1, I implemented the changes and now I use two step functions, just like in your code snippet. So thanks, I like it much more that way. For Scov2 I gave the standard deviations of each element, so a (33, 6) matrix was enough and it works perfectly.

2, okay, I wrote my own logp function because I figured that MatrixNormal would be faster, but not as fast as it could get (and figuring out how to calculate `U` and `V` was going to be as much work as learning to write my own logp function). I'm having some issues with some tensor shapes, but I'll get it to work eventually.

3, profiling? I'll check it out.


2, I circumvented the problem of writing the logp function. Instead of evaluating many bidimensional vectors against their corresponding bivariate normal PDFs, I project each vector onto the basis vectors that diagonalise its PDF, effectively obtaining many points that can be evaluated against a standard normal distribution (zero mean and unit covariance).
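That projection can be sketched with a Cholesky factor instead of an explicit eigendecomposition (either one whitens the residuals; the function name here is illustrative, not from the thread's code):

```python
import numpy as np

def whiten(y, mu, cov):
    """Project residuals into a basis where each 2x2 covariance becomes
    the identity, so they can be scored against a standard normal."""
    d = y - mu                      # (..., 2) residuals
    L = np.linalg.cholesky(cov)     # cov = L @ L.T, per 2x2 block
    # solving L z = d whitens d: Cov(z) = I under the model
    z = np.linalg.solve(L, d[..., None])[..., 0]
    return z
```

One caveat: if the covariances depend on the sampled parameters, the per-point log-determinant term of each gaussian still has to be added to the logp, since whitening alone drops it.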


linear algebra for the win