This may be the wrong place to ask this, so feel free to redirect.

I have data that is positive and continuous but has outliers. I normally use the Gamma distribution to model positive continuous data or parameters. Is there some equivalent robust version the way that the Student’s T is used as a robust likelihood in place of the Normal distribution? If not based on Gamma, is there any standard way to do a robust likelihood for positive continuous variables?

(Added in edit:) There is a generalized Gamma distribution, which has an extra shape parameter `p`. I also thought that the easiest thing to do (programmatically) is just to use a truncated Student’s t.

Opher

You could try a LogStudentT distribution. The generative graph is just the exp of a Student T, so it’s easy to implement using a custom dist:

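```
import pymc as pm

def log_student_t(mu, sigma, nu, size=None):
    # exp of a Student T variable, so the result is strictly positive
    return pm.math.exp(pm.StudentT.dist(mu=mu, sigma=sigma, nu=nu, size=size))

with pm.Model() as m:
    # priors for mu, sigma, nu
    # CustomDist takes the variable name as its first argument
    y_hat = pm.CustomDist("y_hat", mu, sigma, nu, dist=log_student_t, observed=y_data)
```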
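To get a feel for what this likelihood implies before fitting anything, here is a rough standard-library sketch of the same generative process (exp of a Student T). The function name and the parameter values are made up for illustration; it is not part of PyMC.

```python
import math
import random

def sample_log_student_t(mu, sigma, nu, rng):
    """Draw exp(mu + sigma * T), with T a Student T with nu degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square(nu) draw is a Gamma(shape=nu/2, scale=2) draw
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    t = z / math.sqrt(chi2 / nu)
    return math.exp(mu + sigma * t)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Illustrative values only: mu=0, sigma=0.5, nu=4
draws = sorted(
    sample_log_student_t(mu=0.0, sigma=0.5, nu=4.0, rng=rng) for _ in range(5000)
)
median = draws[len(draws) // 2]
print(f"min={draws[0]:.3g}  median={median:.3g}  max={draws[-1]:.3g}")
```

Every draw is positive, and lowering `nu` stretches the upper tail, which is the behaviour you want for positive data with outliers.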

For the prior on `nu`, @bwengals was doing some work on a model like this with a special prior for `nu`, but the specifics escape me. Hopefully he can chime in himself.

We also recently experimented with a mixture of LogNormals: each datapoint was classified via a latent variable into either a low-sigma or a high-sigma distribution, both with the same mean. The idea was an analogy to the Student T as an infinite mixture of normals.
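A minimal sketch of that mixture idea, with the latent class summed out so only the marginal log density is left. I'm assuming "same mean" means both components share the LogNormal location `mu`; the function names, the weight `w_high`, and the numbers are hypothetical, not what was actually used.

```python
import math

def lognormal_logpdf(y, mu, sigma):
    """Log density of LogNormal(mu, sigma) at y > 0."""
    z = (math.log(y) - mu) / sigma
    return -math.log(y * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z

def mixture_logpdf(y, mu, sigma_low, sigma_high, w_high):
    """Log density of a two-component LogNormal mixture sharing mu.

    w_high is the prior probability that a point belongs to the wide component.
    """
    a = math.log1p(-w_high) + lognormal_logpdf(y, mu, sigma_low)
    b = math.log(w_high) + lognormal_logpdf(y, mu, sigma_high)
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Illustrative numbers: an outlying point at y=50 when most mass is near 1
narrow_only = lognormal_logpdf(50.0, mu=0.0, sigma=0.3)
robust = mixture_logpdf(50.0, mu=0.0, sigma_low=0.3, sigma_high=2.0, w_high=0.1)
print(f"narrow LogNormal: {narrow_only:.1f}   mixture: {robust:.1f}")
```

The outlier is penalised far less under the mixture than under the narrow LogNormal alone, which is what keeps it from dragging `mu` around.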

But as to the question of whether there’s a “standard” approach here, I haven’t seen one if there is.

Thanks, this is helpful. I added some ideas I’d had before above. If there’s nothing standard, we’ll just pick something and go with it.

RE the prior on `nu`, the PC prior is a good choice. There’s a PR for it in PyMC experimental if you’re brave and not on Windows. Or, to keep it simple, you can use Gamma(alpha=2, beta=0.1). That’s very close to the PC prior that says, “I think there’s a 50% chance that `nu` is greater than 30”, where 30 is the sort of recognized place where Student t’s look very normal.
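If you want to see what the simple Gamma(alpha=2, beta=0.1) option implies on its own, here is a quick standard-library check. It assumes `beta` is a rate (as in PyMC's `pm.Gamma`) and uses the closed-form tail for shape 2; the helper name is made up. It says nothing about the PC prior itself.

```python
import math

def gamma2_sf(x, beta):
    """P(X > x) for X ~ Gamma(alpha=2, rate=beta); closed form for shape 2."""
    return math.exp(-beta * x) * (1.0 + beta * x)

alpha, beta = 2.0, 0.1
prior_mean = alpha / beta           # mean of a Gamma with rate parameterisation
p_above_30 = gamma2_sf(30.0, beta)  # prior mass on the "looks normal" region
print(f"prior mean of nu: {prior_mean:.1f}")
print(f"P(nu > 30): {p_above_30:.3f}")
```

You can compare those numbers against whatever tail probability you want the prior on `nu` to encode and adjust `beta` accordingly.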