Robust positive continuous likelihood

This may be the wrong place to ask this, so feel free to redirect.

I have data that is positive and continuous but has outliers. I normally use the Gamma distribution to model positive continuous data or parameters. Is there an equivalent robust version, the way the Student’s t is used as a robust likelihood in place of the Normal distribution? If not based on the Gamma, is there any standard way to build a robust likelihood for positive continuous variables?

(Added in edit:) There is a generalized Gamma distribution which has an extra shape parameter p. I also thought that the easiest thing to do (programmatically) is just to use a truncated Student’s t.

Opher

You could try a LogStudentT distribution. The generative graph is just the exp of a Student’s t, so it’s easy to implement using a custom dist:

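import pymc as pm

def log_student_t(mu, sigma, nu, size=None):
    # exp of a Student's t variable; PyMC derives the logp from this graph
    return pm.math.exp(pm.StudentT.dist(mu=mu, sigma=sigma, nu=nu, size=size))

with pm.Model() as m:
    # priors for mu, sigma, nu go here; y_data is the observed positive data
    # CustomDist needs a variable name as its first argument
    y_hat = pm.CustomDist("y_hat", mu, sigma, nu, dist=log_student_t, observed=y_data)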
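
To make the "exp of a Student’s t" idea concrete, here is a minimal standard-library sketch of the density it implies, assuming the usual change of variables (the density of Y = exp(X) at y is the density of X at log y, divided by y). The function names are hypothetical and only for illustration; this is not PyMC code and not necessarily how PyMC computes the logp internally.

```python
import math

def student_t_logpdf(x, mu, sigma, nu):
    """Log-density of a location-scale Student's t evaluated at x."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )

def log_student_t_logpdf(y, mu, sigma, nu):
    """Log-density of Y = exp(X) with X ~ StudentT(mu, sigma, nu)."""
    if y <= 0:
        return -math.inf  # support is strictly positive
    # Change of variables: log p_Y(y) = log p_X(log y) - log y
    return student_t_logpdf(math.log(y), mu, sigma, nu) - math.log(y)

def lognormal_logpdf(y, mu, sigma):
    """Log-density of a LogNormal(mu, sigma), used here only for comparison."""
    z = (math.log(y) - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi) - math.log(y)

# Far in the right tail, the log-Student-t assigns much more density than
# a LogNormal with the same mu and sigma, which is the robustness property.
tail_gap = log_student_t_logpdf(1000.0, 0.0, 1.0, 3.0) - lognormal_logpdf(1000.0, 0.0, 1.0)
print(tail_gap > 0)
```

For large nu the two densities nearly coincide, so nu plays the same role here as it does when a Student’s t replaces a Normal.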

For the prior on nu, @bwengals was doing some work on a model like this with a special prior for nu, but the specifics escape me. Hopefully he can chime in himself.

We also recently experimented with a mixture of LogNormals: each datapoint was classified via a latent variable into either a low-sigma or a high-sigma distribution, both with the same mean. The idea was an analogy to the Student’s t as an infinite scale mixture of normals.
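
As a rough illustration of the two-component LogNormal idea, the sketch below writes out the mixture density with the per-datapoint latent indicator summed out, rather than sampled as described above. The parameter names (`sigma_low`, `sigma_high`, `w_high`) are hypothetical, and this is plain Python under those assumptions, not the model that was actually run.

```python
import math

def lognormal_logpdf(y, mu, sigma):
    """Log-density of a LogNormal(mu, sigma) at y > 0."""
    z = (math.log(y) - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi) - math.log(y)

def lognormal_mixture_logpdf(y, mu, sigma_low, sigma_high, w_high):
    """Two LogNormals sharing mu; w_high is the weight of the wide component.

    The latent class indicator is marginalized out analytically.
    """
    a = math.log1p(-w_high) + lognormal_logpdf(y, mu, sigma_low)
    b = math.log(w_high) + lognormal_logpdf(y, mu, sigma_high)
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# An outlying point is far less surprising under the mixture than under
# the narrow component alone.
outlier = 50.0
print(lognormal_mixture_logpdf(outlier, 0.0, 0.5, 2.0, 0.1) > lognormal_logpdf(outlier, 0.0, 0.5))
```

Summing out the indicator avoids discrete latent variables in the sampler; keeping it explicit instead gives a per-datapoint outlier classification.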

As for whether there’s a “standard” approach here, I haven’t seen one.

Thanks, this is helpful. I added some ideas I’d had before to my post above. If there’s nothing standard, we’ll just pick something and go with it.

Re the prior on nu, the PC prior is a good choice. There’s a PR for it in PyMC experimental if you’re brave and not on Windows. Or, to keep it simple, you can use Gamma(alpha=2, beta=0.1). That’s very close to the PC prior that says, “I think there’s a 50% chance that nu is greater than 30”, where 30 is the commonly cited point beyond which a Student’s t looks very close to normal.
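
To get a feel for what Gamma(alpha=2, beta=0.1) by itself implies about nu, the sketch below computes its mean and its tail probability beyond 30. It assumes beta is a rate parameter, as in PyMC's Gamma, and uses the closed-form survival function for shape 2; `gamma2_sf` is a hypothetical helper name. It describes only the Gamma prior and says nothing about the PC prior.

```python
import math

def gamma2_sf(x, beta):
    """P(nu > x) for nu ~ Gamma(alpha=2, rate=beta).

    For shape 2 the survival function has the closed form
    (1 + beta * x) * exp(-beta * x).
    """
    return (1.0 + beta * x) * math.exp(-beta * x)

beta = 0.1                       # rate parameter (assumed PyMC convention)
prior_mean = 2 / beta            # mean of Gamma(alpha, rate) is alpha / rate
p_gt_30 = gamma2_sf(30.0, beta)  # prior mass on "nearly normal" tails

print(f"prior mean of nu: {prior_mean:.1f}")
print(f"P(nu > 30):       {p_gt_30:.3f}")
```

Changing `beta` in the sketch shows how quickly the prior mass above 30 shifts, which may help when tuning the prior for a particular dataset.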