I’m trying to implement a robust logistic regression, using logic analogous to robust linear regression. I found a Stan implementation of this (called “Robit”), the main section of which is copied below (original post here):
In the model block, the Bernoulli p is the linear predictor transformed by the Student-t CDF. I’m wondering if there is a straightforward way of translating this into pymc3. To my knowledge, there is no Student-t distribution in the math module, and I don’t have enough knowledge to code Theano-compliant math myself.
data {
  int N;
  vector[N] x;
  int y[N];
  real df;
}
parameters {
  real a;
  real b;
}
model {
  vector[N] p;
  for (n in 1:N)
    p[n] = student_t_cdf(a + b * x[n], df, 0, sqrt((df - 2) / df));
  y ~ bernoulli(p);
}
Robit, not probit? Strange naming… Anyway, I had to create the CDF and inverse CDF of a couple of distributions recently, and Theano is generally quite kind to this pursuit. I found the harder part was the actual math!
There’s a StudentT and a Cauchy (equivalent when nu=1) in pymc3 (Continuous — PyMC3 3.10.0 documentation), both of which have a logcdf. So possibly you could use a dirty hack and exponentiate the logcdf output, or dig into the code to guide the writing of your own cdf function. The Cauchy looks much more pleasant: pymc3/continuous.py at 97b54f0da10074b0f3f62ff7040de2dc7c79a7ad · pymc-devs/pymc3 · GitHub
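To make the hack concrete, here’s a minimal sketch, assuming pymc3 3.10+ (where StudentT has a logcdf). The toy data, priors, and df value are placeholders, not part of the original posts:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

# Toy data standing in for your own x and y (placeholder).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.binomial(1, 0.5, size=200)

df = 4  # robit degrees of freedom; must be > 2 for the unit-variance scaling

with pm.Model() as robit_model:
    a = pm.Normal("a", mu=0.0, sigma=5.0)
    b = pm.Normal("b", mu=0.0, sigma=5.0)
    eta = a + b * x

    # Unbound StudentT standardized to unit variance, matching the Stan model.
    t_dist = pm.StudentT.dist(nu=df, mu=0.0, sigma=np.sqrt((df - 2.0) / df))

    # The "dirty hack": exponentiate logcdf to recover the CDF (the robit link).
    # Clipping keeps p away from exactly 0 or 1 for extreme linear predictors.
    p = tt.clip(tt.exp(t_dist.logcdf(eta)), 1e-6, 1.0 - 1e-6)

    pm.Bernoulli("obs", p=p, observed=y)
    trace = pm.sample(1000, tune=1000)

Fair warning: I haven’t checked how well NUTS gradients flow through the logcdf, so sampling speed may vary.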
Robit = Robust Probit
So essentially it replaces the normal errors with a fatter-tailed distribution such as the Student-t. I think you should be able to accomplish this with a hierarchical latent-variable probit model (see the marginalization sketch after the link)? Just getting my feet wet, so I won’t try to share my bad code, just the GLM hierarchical example:
https://docs.pymc.io/en/v3/pymc-examples/examples/generalized_linear_models/GLM-hierarchical.html#Hierarchical-Model
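For reference, the latent-variable formulation marginalizes back to the CDF model in the original question. With

z[n] = a + b * x[n] + eps[n],   eps[n] ~ student_t(df, 0, sigma)
y[n] = 1 if z[n] > 0, else 0

integrating out the latent z gives, by symmetry of the t distribution,

Pr(y[n] = 1) = Pr(eps[n] > -(a + b * x[n])) = student_t_cdf(a + b * x[n], df, 0, sigma)

which with sigma = sqrt((df - 2) / df) is exactly the Stan model above.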