Creating a LogStudentT distribution

Hi everyone,

What is the suggested way to use a LogStudentT variable, a la LogNormal? Is there a more straightforward way than implementing a new Distribution?

The use-case is the following. I’d like to do model comparison between a StudentT and a LogStudentT using the “loo” criterion. The data I’m working with has some outliers, hence the robust distribution.

Thanks for any help or suggestions.

PS. I really like pymc. Thanks for maintaining and developing it.

3 Likes

If you have the last version of PyMC installed, you can use CustomDist (untested snippet):

import pymc as pm

 def log_studentt(nu, mu, sigma, size):
     return pm.math.exp(pm.StudentT.dist(nu=nu, mu=mu, sigma=sigma, size=size))

 with pm.Model() as m:
     nu = ...
     mu = ...
     sigma = ....
     pm.CustomDist("log_studentt", nu, mu, sigma, random=log_studentt, observed=data)

https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.CustomDist.html

3 Likes

Im curious @ricardoV94, in the docs you linked for CustomDist it says:

In some cases, if a user provides a random function that returns an Aesara PyTensor graph, PyMC will be able to automatically derive the appropriate logp graph when performing MCMC sampling.

This is such a case right?

Yes, although I am not very happy for having overloaded the random kwarg, maybe a separate one would be better.

Other examples here https://twitter.com/pymc_devs/status/1615733480347897858?s=20

1 Like

Thanks, I really appreciate the proposed solution and comments!

When I run the following code:

import pymc as pm
import numpy as np

def log_studentt(nu, mu, sigma, size):
    pm.math.exp(pm.StudentT.dist(nu, mu = mu, sigma=sigma, size=size))

data = np.random.standard_t(10,size=(15))
data = data + 1.

with pm.Model() as m:
    nu = 10
    mu = pm.Normal("mu",mu=0,sigma=10)
    sigma = 1
    pm.CustomDist("log_studentt", nu, mu, sigma, random=log_studentt, observed=data)

    idata = pm.sample(4000, tune = 2000)

az.plot_trace(idata)

I get the error:

NotImplementedError: Attempted to run logp on the CustomDist ‘log_studentt’, but this method had not been provided when the distribution was constructed. Please re-build your model and provide a callable to 'log_studentt’s logp keyword argument.

I’m running PyMC v5.0.2.

Does that mean it didn’t infer a logp graph, and/or did I make a mistake?

It was a bug in my code. As I mentioned I hadn’t tested it. The log_studdentt was not returning anything :stuck_out_tongue:

Should be

def log_studentt(nu, mu, sigma, size):
    return pm.math.exp(pm.StudentT.dist(nu, mu = mu, sigma=sigma, size=size))

Note that the way you are generating data in your example, can easily produce negative numbers, meaning your model will have -inf logp

1 Like

Oh right, I also overlooked it :sweat_smile:.

It’s running now, thanks a lot!

This is the working snippet of code:

def log_studentt(nu, mu, sigma, size):
    return pm.math.exp(pm.StudentT.dist(nu, mu = mu, sigma=sigma, size=size))

data = np.exp(np.random.standard_t(10,size=(150)) + 1.)
data = data

with pm.Model() as m:
    nu = 10
    mu = pm.Normal("mu",mu=0,sigma=10)
    sigma = 1
    pm.CustomDist("log_studentt", nu, mu, sigma, random=log_studentt, observed=data)
    
    idata = pm.sample(4000, tune = 2000)
    
az.plot_trace(idata)
1 Like

I updated the examples above to be correct as well. Apologies for the clumsy copy paste :slight_smile:

1 Like

No problem, we got there :). I made some silly errors as well.

1 Like

Hi,

Thank you for the valuable suggestions for pymc.

I would like to ask you if this script can be run in pymc5 because I receive some errors.

Should anything be changed on how the functions are called?

Thank you

You should pass the function to the dist kwarg now, instead of random

pm.CustomDist(... dist=log_studentt, ...)