Creating a LogStudentT distribution

agwy · January 19, 2023, 9:00pm

Hi everyone,

What is the suggested way to use a LogStudentT variable, a la LogNormal? Is there a more straightforward way than implementing a new Distribution?

The use-case is the following. I’d like to do model comparison between a StudentT and a LogStudentT using the “loo” criterion. The data I’m working with has some outliers, hence the robust distribution.

Thanks for any help or suggestions.

PS. I really like pymc. Thanks for maintaining and developing it.

ricardoV94 · January 19, 2023, 10:26pm

If you have the latest version of PyMC installed, you can use CustomDist (untested snippet):

import pymc as pm

def log_studentt(nu, mu, sigma, size):
    return pm.math.exp(pm.StudentT.dist(nu=nu, mu=mu, sigma=sigma, size=size))

with pm.Model() as m:
    nu = ...
    mu = ...
    sigma = ...
    pm.CustomDist("log_studentt", nu, mu, sigma, random=log_studentt, observed=data)

https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.CustomDist.html

bwengals · January 20, 2023, 3:28am

I'm curious, @ricardoV94: in the docs you linked for CustomDist, it says:

In some cases, if a user provides a random function that returns an ~~Aesara~~ PyTensor graph, PyMC will be able to automatically derive the appropriate logp graph when performing MCMC sampling.

This is such a case, right?

ricardoV94 · January 20, 2023, 7:55am

Yes, although I am not very happy about having overloaded the random kwarg. Maybe a separate one would be better.

There are other examples here: https://twitter.com/pymc_devs/status/1615733480347897858?s=20

agwy · January 20, 2023, 9:59am

Thanks, I really appreciate the proposed solution and comments!

When I run the following code:

import arviz as az
import numpy as np
import pymc as pm

def log_studentt(nu, mu, sigma, size):
    pm.math.exp(pm.StudentT.dist(nu, mu = mu, sigma=sigma, size=size))

data = np.random.standard_t(10,size=(15))
data = data + 1.

with pm.Model() as m:
    nu = 10
    mu = pm.Normal("mu",mu=0,sigma=10)
    sigma = 1
    pm.CustomDist("log_studentt", nu, mu, sigma, random=log_studentt, observed=data)

    idata = pm.sample(4000, tune = 2000)

az.plot_trace(idata)

I get the error:

NotImplementedError: Attempted to run logp on the CustomDist 'log_studentt', but this method had not been provided when the distribution was constructed. Please re-build your model and provide a callable to 'log_studentt's logp keyword argument.

I’m running PyMC v5.0.2.

Does that mean it didn’t infer a logp graph, and/or did I make a mistake?

ricardoV94 · January 20, 2023, 10:11am

It was a bug in my code. As I mentioned, I hadn't tested it. The log_studentt function was not returning anything.

It should be:

def log_studentt(nu, mu, sigma, size):
    return pm.math.exp(pm.StudentT.dist(nu, mu = mu, sigma=sigma, size=size))

Note that the way you are generating data in your example can easily produce negative numbers. A LogStudentT variable is strictly positive, so any negative observation gives your model a logp of -inf.
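To make the -inf concrete, here is a minimal standard-library sketch of the density that exponentiating a StudentT implies. It uses the textbook change-of-variables formula and is not PyMC's internal code; the function names `studentt_logpdf` and `log_studentt_logpdf` are made up for this illustration.

```python
import math


def studentt_logpdf(t, nu, mu, sigma):
    """Log density of a location-scale Student-t evaluated at t."""
    z = (t - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )


def log_studentt_logpdf(x, nu, mu, sigma):
    """Log density of X = exp(T), where T ~ StudentT(nu, mu, sigma)."""
    if x <= 0:
        # exp(T) is strictly positive, so non-positive values have zero density
        return -math.inf
    # Change of variables: log p_X(x) = log p_T(log x) - log x
    return studentt_logpdf(math.log(x), nu, mu, sigma) - math.log(x)


print(log_studentt_logpdf(2.5, nu=10, mu=1.0, sigma=1.0))   # a finite number
print(log_studentt_logpdf(-0.3, nu=10, mu=1.0, sigma=1.0))  # -inf
```

A single negative observation is enough to make the summed log-likelihood -inf, which is why the observed data should be generated on the positive scale, for example by exponentiating the StudentT draws.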

agwy · January 20, 2023, 10:28am

Oh right, I also overlooked it.

It’s running now, thanks a lot!

This is the working snippet of code:

import arviz as az
import numpy as np
import pymc as pm

def log_studentt(nu, mu, sigma, size):
    return pm.math.exp(pm.StudentT.dist(nu, mu=mu, sigma=sigma, size=size))

# Exponentiate the StudentT draws so that the data are strictly positive
data = np.exp(np.random.standard_t(10, size=150) + 1.)

with pm.Model() as m:
    nu = 10
    mu = pm.Normal("mu", mu=0, sigma=10)
    sigma = 1
    pm.CustomDist("log_studentt", nu, mu, sigma, random=log_studentt, observed=data)

    idata = pm.sample(4000, tune=2000)

az.plot_trace(idata)

ricardoV94 · January 20, 2023, 11:32am

I updated the examples above to be correct as well. Apologies for the clumsy copy-paste.

agwy · January 20, 2023, 11:35am

No problem, we got there :). I made some silly errors as well.

Tedvi · May 6, 2023, 3:38pm

Hi,

Thank you for the valuable suggestions for pymc.

I would like to ask whether this script can be run in PyMC 5, because I get some errors.

Should anything be changed on how the functions are called?

Thank you

ricardoV94 · May 6, 2023, 7:19pm

You should now pass the function to the dist kwarg instead of random:

pm.CustomDist(... dist=log_studentt, ...)
