Multivariate LogNormal distribution

Hi,

I was trying to define a Multivariate LogNormal distribution in PyMC 5.1.1, i.e., the entry-wise exponential of a Multivariate Normal distribution. I need it both for sampling using .dist and inside an MCMC model.

Should I create a class inheriting from pm.MvNormal, or should I use pm.CustomDist? Could someone provide me with an example? I think it should not be a problem once I understand how custom distributions should be defined if they are obtained from another distribution through an invertible function.

Thank you

CustomDist should be fine.

If you are not in a hurry this will actually be trivial after Infer logp for elemwise transformations of multivariate variables by ricardoV94 Âˇ Pull Request #6607 Âˇ pymc-devs/pymc Âˇ GitHub getâs merged, probably in the next release.

In that case all you will need to do is something like

``````def dist(mu, cov, size):
return pm.math.exp(pm.MvNornal.dist(mu, cov, size=size))

with pm.Model():
mu = ...
cov = ...
x = pm.CustomDist("x", mu, cov, dist=dist)
``````

PyMC will be able to figure out the logp automatically from the dist function. For now you would still implement a `logp` function.

Note that if you donât need the normalized density directly you can just define a MvNormal variable and exponentiate it afterwards.

Thanks for your help. However, I would like to have a distribution that works as the others, i.e. I would like to have a class on which I can call a `.dist` and I can use inside a model. For this reason, I tried the following:

``````def MvLogNormal_logp(value, mu, cov):
res = pt.switch(pt.gt(value, 0), pt.log(value), -np.inf)
return pm.MvNormal.logp(res, mu, cov)

def MvLogNormal_dist(mu, cov, size):
return pt.exp(pm.MvNormal.dist(mu, cov, size=size))

class MvLogNormal:
def __new__(
cls,
name,
mu,
cov,
):
return pm.CustomDist(
name, mu, cov, dist=MvLogNormal_dist, logp=MvLogNormal_logp
)

@classmethod
def dist(cls, mu=None, cov=None, **kwargs):
if mu is None:
mu = kwargs["mu"]
if cov is None:
cov = kwargs["cov"]
return pm.CustomDist.dist(
mu,
cov,
class_name="MvLogNormal",
logp=MvLogNormal_logp,
dist=MvLogNormal_dist,
)
``````

However, I get the following error when trying to use it inside a model:

`**TypeError** : Cannot convert Type TensorType(float64, ()) (of Variable Alloc.0) into Type TensorType(float64, (9,)). You can try to manually convert Alloc.0 into a TensorType(float64, (9,)).`

What might be causing it? Do you have any other way to accomplish what I would like to do, in a simpler manner? Thanks a lot for your help!

I would have to check what youâre doing in more detail to find the error, but it should be possible to write it like you did.

Note that your logp is not a proper density, as you didnât had the jacobian from the exponentiation operation. This may or not matter for your use case

Sure, I am completely available to discuss the detail.

I am not using the NUTS sampler, which requires gradients, but just a plain Metropolis sampler. Therefore, not having the gradients should not be a problem.

The error is raised when doing `model.initial_point()` inside the initialization of the Metropolis Sampler. I donât know what might be causing it, but it seems that the `size` argument is not correctly passed to some function, causing some shape-inconsistency problems.

What would you like to check in more detail? Do you want me to provide you with a minimal working example where it fails?

Thanks!

We just released pymc 5.2.0, which can infer logp of transformed multivariate distributions.

I am able to run the following snippet:

``````import pymc as pm
import numpy as np
import arviz as az

def mvlognormal_dist(mu, cov, size):
return pm.math.exp(pm.MvNormal.dist(mu, cov, size=size))

class MvLogNormal:
def __new__(cls, name, mu, cov, **kwargs):
return pm.CustomDist(name, mu, cov, dist=mvlognormal_dist, ndim_supp=1, **kwargs)

@classmethod
def dist(cls, mu, cov, **kwargs):
return pm.CustomDist.dist(mu, cov, class_name="MvLogNormal", dist=mvlognormal_dist, ndim_supp=1, **kwargs)

with pm.Model() as m:
mu = pm.Normal("mu", shape=(3,))
x = MvLogNormal("x", mu=mu, cov=np.eye(3)*1e-3, observed=np.exp([1, 2, 3]))
trace = pm.sample()

az.summary(trace)
``````
``````        mean     sd  hdi_3%  hdi_97%  ...  mcse_sd  ess_bulk  ess_tail  r_hat
mu[0]  0.999  0.031   0.940    1.055  ...      0.0    7103.0    3448.0    1.0
mu[1]  1.999  0.032   1.935    2.054  ...      0.0    6510.0    3054.0    1.0
mu[2]  2.997  0.032   2.937    3.055  ...      0.0    5575.0    2972.0    1.0
``````

Thanks for your help. However, I am still experiencing problems when using the new distribution as a prior. You can reproduce the error I get by using

``````import pymc as pm
import numpy as np
import arviz as az

def mvlognormal_dist(mu, cov, size):
return pm.math.exp(pm.MvNormal.dist(mu, cov, size=size))

class MvLogNormal:
def __new__(cls, name, mu, cov, **kwargs):
return pm.CustomDist(name, mu, cov, dist=mvlognormal_dist, ndim_supp=1, **kwargs)

@classmethod
def dist(cls, mu, cov, **kwargs):
return pm.CustomDist.dist(mu, cov, class_name="MvLogNormal", dist=mvlognormal_dist, ndim_supp=1, **kwargs)

with pm.Model() as m:
x = MvLogNormal("x", mu=np.zeros(3), cov=np.eye(3)*1e-3)
y = pm.MvNormal("y", mu=x, cov=np.eye(3), observed=np.exp([1, 2, 3]))
trace = pm.sample(chains=1, step=pm.Metropolis())

az.summary(trace)
``````

You probably need to specify an initial point `initval=pt.exp(mu)` as the default is zero. Otherwise you can specify a `moment` function to produce the initial point used for sampling.

If you are sampling with NUTS, you should also specify `transform=pm.distributions.transforms.log` so the sampler wonât propose invalid values.

BUT, in that case you might as well use a MvNormal directly with a log transform for the same result. Or, even better, a MvNormal that you manually exponentiate without the artificial transform. You only need custom distributions when they are used as likelihood or in distribution factories like Mixture or RandomWalk.

I am sampling using the Metropolis sampler, so the transform should not be a problem. I did not understand very well from the documentation what the `moment` function should be, but I tried implementing it using

``````def mvlognormal_moment(rv, size, mu, cov):
return pt.exp(pm.MvNormal.moment(rv, size, mu, cov))
``````

Still, the following script gives me the error `mvlognormal_moment() takes 4 positional arguments but 5 were given`. Why does it happen, and what should the `moment` function do?

``````import pymc as pm
import numpy as np
import arviz as az
import pytensor.tensor as pt

def mvlognormal_dist(mu, cov, size):
return pt.exp(pm.MvNormal.dist(mu, cov, size=size))

def mvlognormal_moment(rv, size, mu, cov):
return pt.exp(pm.MvNormal.moment(rv, size, mu, cov))

class MvLogNormal:
def __new__(cls, name, mu, cov, **kwargs):
return pm.CustomDist(
name,
mu,
cov,
dist=mvlognormal_dist,
moment=mvlognormal_moment,
ndim_supp=1,
**kwargs
)

@classmethod
def dist(cls, mu, cov, **kwargs):
return pm.CustomDist.dist(
mu,
cov,
class_name="MvLogNormal",
dist=mvlognormal_dist,
moment=mvlognormal_moment,
ndim_supp=1,
**kwargs
)

with pm.Model() as m:
x = MvLogNormal("x", mu=np.zeros(3), cov=np.eye(3)*1e-3,)
y = pm.MvNormal("y", mu=x, cov=np.eye(3), observed=np.exp([1, 2, 3]))
trace = pm.sample(chains=1, step=pm.Metropolis())

az.summary(trace)
``````

Problem is `MvNormal.moment` I think. You should call `moment(MvNormal)` like is done for the logp. You might need to import `moment` as it might not be available at the pymc root level.

But the moment of the MvNormal is just mu, so your function can just return `pt.exp(mu)` directly.

Moment returns the initial point to be used for sampling a distribution. Itâs how `model.initial_point()` is populated.

I tried also this solution, but still it gives me some error when computing the logp of a vector that has negative entries (while I expected it to return zero probability). I tried specifying the logp as:

``````def mvlognormal_logp(value, mu, cov):
res = pt.switch(pt.gt(value, 0), pt.log(value), -np.inf)
return pm.MvNormal.logp(res, mu, cov)
``````

or as

``````def mvlognormal_logp(value, mu, cov, *args, **kwargs):
res = pt.switch(pt.gt(value, 0), pt.log(value), -np.inf)
return pm.logp(pm.MvNormal.dist(mu, cov), res)
``````

but still, this gives some errors. I am attaching here the output

``````---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[33], line 36
32 bip_fem.set_mcmc_model(prior_type, prior_kwargs, noise_type, noise_kwargs)
34 with bip_fem.mcmc:
35     trace_fem = pm.sample(
---> 36         DRAWS, tune=TUNE, step=Adaptive_Metropolis(**metropolis_kwargs), chains=CHAINS
37     )
38 trace_fem.to_netcdf(path+".nc")

File c:\Users\ivanb\Documents\GitHub\BayesianSurrogateModeling_SemesterProject\src\utils\mcmc_sampler.py:24, in Adaptive_Metropolis.__init__(self, vars, S, proposal_dist, scaling, tune, tune_interval, model, mode, elemwise_update, **kwargs)
11 def __init__(
12     self,
13     vars=None,
(...)
22     **kwargs
23 ):
---> 24     super().__init__(
25         vars, S, proposal_dist, scaling, tune, tune_interval, model, mode, **kwargs
26     )
27     self.elemwise_update = elemwise_update

File ~\AppData\Roaming\Python\Python310\site-packages\pymc\step_methods\metropolis.py:225, in Metropolis.__init__(self, vars, S, proposal_dist, scaling, tune, tune_interval, model, mode, **kwargs)
222 self.mode = mode
224 shared = pm.make_shared_replacements(initial_values, vars, model)
--> 225 self.delta_logp = delta_logp(initial_values, model.logp(), vars, shared)
226 super().__init__(vars, shared)

File ~\AppData\Roaming\Python\Python310\site-packages\pymc\model.py:759, in Model.logp(self, vars, jacobian, sum)
757 rv_logps: List[TensorVariable] = []
758 if rvs:
--> 759     rv_logps = joint_logp(
760         rvs=rvs,
761         rvs_to_values=self.rvs_to_values,
762         rvs_to_transforms=self.rvs_to_transforms,
763         jacobian=jacobian,
764     )
765     assert isinstance(rv_logps, list)
767 # Replace random variables by their value variables in potential terms

File ~\AppData\Roaming\Python\Python310\site-packages\pymc\logprob\basic.py:319, in joint_logp(rvs, rvs_to_values, rvs_to_transforms, jacobian, **kwargs)
315 if values_to_transforms:
316     # There seems to be an incorrect type hint in TransformValuesRewrite
317     transform_rewrite = TransformValuesRewrite(values_to_transforms)  # type: ignore
--> 319 temp_logp_terms = factorized_joint_logprob(
320     rvs_to_values,
321     extra_rewrites=transform_rewrite,
322     use_jacobian=jacobian,
323     **kwargs,
324 )
326 # The function returns the logp for every single value term we provided to it.
327 # This includes the extra values we plugged in above, so we filter those we
328 # actually wanted in the same order they were given in.
329 logp_terms = {}

File ~\AppData\Roaming\Python\Python310\site-packages\pymc\logprob\basic.py:209, in factorized_joint_logprob(rv_values, warn_missing_rvs, ir_rewriter, extra_rewrites, **kwargs)
206 while q:
207     node = q.popleft()
--> 209     outputs = get_measurable_outputs(node.op, node)
211     if not outputs:
212         continue

File ~\AppData\Roaming\Python\Python310\site-packages\pymc\logprob\abstract.py:164, in get_measurable_outputs(op, node)
162 """Return only the outputs that are measurable."""
163 if isinstance(op, MeasurableVariable):
--> 164     return _get_measurable_outputs(op, node)
165 else:
166     return []

File c:\Users\ivanb\anaconda3\envs\bayesiansurrogatemodeling\lib\functools.py:889, in singledispatch.<locals>.wrapper(*args, **kw)
885 if not args:
886     raise TypeError(f'{funcname} requires at least '
887                     '1 positional argument')
--> 889 return dispatch(args[0].__class__)(*args, **kw)

File ~\AppData\Roaming\Python\Python310\site-packages\pymc\distributions\distribution.py:410, in _get_measurable_outputs_symbolic_random_variable(op, node)
402 @_get_measurable_outputs.register(SymbolicRandomVariable)
403 def _get_measurable_outputs_symbolic_random_variable(op, node):
404     # This tells PyMC that any non RandomType outputs are measurable
(...)
407     # In the rare case this is not what one wants, a specialized _get_measuarable_outputs
408     # can dispatch for a subclassed Op
409     if op.default_output is not None:
--> 410         return [node.default_output()]
412     # Otherwise assume that any outputs that are not of RandomType are measurable
413     return [out for out in node.outputs if not isinstance(out.type, RandomType)]

File ~\AppData\Roaming\Python\Python310\site-packages\pytensor\graph\basic.py:200, in Apply.default_output(self)
198     raise ValueError(f"{self.op}.default_output should be an int or long")
199 elif do < 0 or do >= len(self.outputs):
--> 200     raise ValueError(f"{self.op}.default_output is out of range.")
201 return self.outputs[do]

ValueError: CustomSymbolicDistRV_y{inline=True}.default_output is out of range.
``````

Yes it was a bug, it was already fixed in Allow default_output to be any valid Python index by ricardoV94 Âˇ Pull Request #274 Âˇ pymc-devs/pytensor Âˇ GitHub

The next release of PyMC should incorporate that

Do you know approximately when it will be released? Is there a way to get a pre-release versione?

Sometime next week the latest I think

With the new version I was finally able to get something that works. I am posting it here so that others may benefit from it as well. Thank you @ricardoV94 for your help, and please feel free to correct my code below if you think there are any mistakes.

``````def mvlognormal_dist(mu, cov, size):
return pt.exp(pm.MvNormal.dist(mu, cov, size=size))

def mvlognormal_moment(rv, size, mu, cov, *args, **kwargs):
return pt.exp(pm.MvNormal.moment(rv, size, mu, cov))

def mvlognormal_logp(value, mu, cov, *args, **kwargs):
res = pt.switch(pt.gt(value, 0.0), pt.log(value), 1) # 1 is a dummy value
product = pt.prod(value, axis=-1)
product = pt.switch(pt.neq(product, 0.0), product, 1)
logp_addition = pt.switch(pt.all(pt.gt(value, 0.0), axis=-1), 0, -np.inf)
return pm.logp(pm.MvNormal.dist(mu, cov), res) / product + logp_addition

class MvLogNormal:
def __new__(cls, name, mu, cov, **kwargs):
return pm.CustomDist(
name,
mu,
cov,
dist=mvlognormal_dist,
moment=mvlognormal_moment,
logp=mvlognormal_logp,
ndim_supp=1,
**kwargs
)

@classmethod
def dist(cls, mu, cov, **kwargs):
return pm.CustomDist.dist(
mu,
cov,
class_name="MvLogNormal",
dist=mvlognormal_dist,
moment=mvlognormal_moment,
logp=mvlognormal_logp,
ndim_supp=1,
**kwargs
)
``````
2 Likes

You donât even need the `mvlognormal_logp`. PyMC can figure it out from the `dist` function alone

1 Like

import pymc as pm
import numpy as np
import arviz as az

def mvlognormal_dist(mu, cov, size):
return pm.math.exp(pm.MvNormal.dist(mu, cov, size=size))

class MvLogNormal:
def new(cls, name, mu, cov, **kwargs):
return pm.CustomDist(name, mu, cov, dist=mvlognormal_dist, ndim_supp=1, **kwargs)

``````@classmethod
def dist(cls, mu, cov, **kwargs):
return pm.CustomDist.dist(mu, cov, class_name="MvLogNormal", dist=mvlognormal_dist, ndim_supp=1, **kwargs)
``````

with pm.Model() as m:
mu = pm.Normal(âmuâ, shape=(3,))
x = MvLogNormal(âxâ, mu=mu, cov=np.eye(3)*1e-3, observed=np.exp([1, 2, 3]))
trace = pm.sample()

az.summary(trace)

but I have got a mistake
AttributeError: module âpymcâ has no attribute âCustomDistâ
Why and How can I solve it?

You probably have an older version of pymc, try to update to the latest.