Help with modeling gamma distribution

I am trying to solve a non-linear regression problem where the observations (Y) follow a gamma distribution.

The model I have used takes the form
Yi ~ Gamma(alpha = f(Xi), beta = beta),
where beta has a prior with positive support and f is a neural network that takes Xi, the ith input sample, as input. Beta is shared across all data points, and alpha is estimated by the neural network. I am using ADVI to estimate the parameters of this model.

When I sample from the posterior, the result almost matches the data in terms of shape, but the mean is far off. In my case the shape matters, but the mean of Yi matters even more. If I use a Normal instead of a Gamma, the mean is very accurate but the shape of the distribution is obviously off. Is there a model choice or parameter estimation strategy I could use to improve both the mean and the shape? Any suggestions @jessegrabowski?

You could try the mu/sigma parametrization of the Gamma instead. That way f(Xi) can model the mean of Yi directly.
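For reference, here is a minimal sketch of how a mean/standard-deviation pair maps onto the shape/rate pair, using only the Python standard library. The helper name `gamma_shape_rate` and the numbers are made up for illustration; the identities used are mean = alpha / beta and variance = alpha / beta^2.

```python
import random
import statistics


def gamma_shape_rate(mu, sigma):
    """Convert a mean/std pair into Gamma shape (alpha) and rate (beta).

    Hypothetical helper: solves mean = alpha / beta and
    variance = alpha / beta**2 for alpha and beta.
    """
    if mu <= 0 or sigma <= 0:
        raise ValueError("mu and sigma must be positive")
    alpha = (mu / sigma) ** 2
    beta = mu / sigma ** 2
    return alpha, beta


rng = random.Random(0)

# Illustrative values only: pretend the network predicted mu_i for one Xi
# and sigma is the shared spread parameter.
mu_i, sigma = 3.0, 1.5
alpha, beta = gamma_shape_rate(mu_i, sigma)

# random.gammavariate expects shape and *scale*, so pass 1 / beta.
draws = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_sd = statistics.stdev(draws)
print(alpha, beta, sample_mean, sample_sd)
```

With this parametrization the network output is the conditional mean itself, so errors in the mean are no longer coupled to the shared rate parameter.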


Thanks for the suggestion. I tried that, but it did not help much. I think I may be doing the whole analysis wrong. First, I learn the distribution of sigma and all the other parameters from the data. When I get a new set of data (n values of Xi), I feed them to the model to get a distribution of estimated Yi (2000 samples) for each Xi. How should I compare the ground-truth Yi values against the distribution I get from the model? Should I compare against the mean, median, or mode of the 2000 samples? What kind of estimate should I take from the distributions of the individual Yi so that I can compare it against the mean of the ground-truth Yi?

Also, is there a way to force the mean of the output of f(Xi) to follow the gamma distribution?

You can search for posterior predictive checks. That’s the standard way to evaluate how well a model can “fit” a dataset.
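As a rough illustration of one such check, the sketch below computes how often the observed Yi fall inside the central 90% interval of their own predictive draws. It uses only the standard library, and the data are synthetic stand-ins (the names `central_interval`, `coverage`, `y_true`, and `draws_per_point` are assumptions, not anything from the model above). For a well-calibrated model the coverage should be close to the nominal mass.

```python
import random


def central_interval(draws, mass=0.9):
    """Empirical central interval holding roughly `mass` of the draws."""
    s = sorted(draws)
    tail = (1.0 - mass) / 2.0
    lo_idx = int(tail * (len(s) - 1))
    hi_idx = int((1.0 - tail) * (len(s) - 1))
    return s[lo_idx], s[hi_idx]


def coverage(y_true, draws_per_point, mass=0.9):
    """Fraction of observations that land inside their own interval."""
    hits = 0
    for y, draws in zip(y_true, draws_per_point):
        lo, hi = central_interval(draws, mass)
        hits += lo <= y <= hi
    return hits / len(y_true)


# Synthetic stand-in for "ground truth Yi plus 2000 predictive draws per Xi".
rng = random.Random(1)
n_points, n_draws = 200, 2000
rate = 2.0  # illustrative shared rate
shapes = [rng.uniform(1.0, 5.0) for _ in range(n_points)]  # plays the role of f(Xi)

y_true = [rng.gammavariate(a, 1.0 / rate) for a in shapes]
draws_per_point = [
    [rng.gammavariate(a, 1.0 / rate) for _ in range(n_draws)] for a in shapes
]

cov = coverage(y_true, draws_per_point, mass=0.9)
print(f"90% interval coverage: {cov:.2f}")
```

Coverage far below the nominal level would suggest the predictive distributions are too narrow or shifted; coverage near 100% would suggest they are too wide.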

Taking a step back, it’s perhaps worth elaborating on what exactly you’re trying to do (and what looks like a failure to you).

For instance, in generalized linear models you don’t usually care about the marginal distribution of the data, but about the conditional distributions (conditioned on the predictors).

The spread of those conditional distributions is sometimes called the “error”, and perhaps that’s what you want to check? If you share more details, it may help.
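To make the conditional view concrete, here is a hedged sketch of a per-point error check on synthetic data (standard library only; `point_errors` and the data are hypothetical). Each Yi is compared with a summary of its own predictive draws rather than with the pooled distribution. Which summary to use depends on the loss you care about: the mean of the draws minimizes expected squared error and the median minimizes expected absolute error.

```python
import random
import statistics


def point_errors(y_true, draws_per_point):
    """Compare each observation with summaries of its own predictive draws."""
    means = [statistics.fmean(d) for d in draws_per_point]
    medians = [statistics.median(d) for d in draws_per_point]

    residuals = [y - m for y, m in zip(y_true, means)]
    abs_errors = [abs(y - m) for y, m in zip(y_true, medians)]

    return {
        "bias": statistics.fmean(residuals),  # systematic over/under-prediction
        "rmse": statistics.fmean([r * r for r in residuals]) ** 0.5,
        "mae_median": statistics.fmean(abs_errors),
        "residuals": residuals,  # inspect or plot these against Xi
    }


# Synthetic stand-in: the "model" here is the true data-generating process,
# so the bias should be close to zero.
rng = random.Random(2)
n_points, n_draws = 300, 1000
rate = 2.0  # illustrative shared rate
shapes = [rng.uniform(1.0, 5.0) for _ in range(n_points)]

y_true = [rng.gammavariate(a, 1.0 / rate) for a in shapes]
draws_per_point = [
    [rng.gammavariate(a, 1.0 / rate) for _ in range(n_draws)] for a in shapes
]

report = point_errors(y_true, draws_per_point)
print(report["bias"], report["rmse"], report["mae_median"])
```

A bias well away from zero, or residuals that trend with the predictors, would point to a problem in the conditional mean even when the pooled histogram of predictions looks like the data.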

Hi @ricardoV94, you are right: I should be checking the error and its distribution. I had also confused the marginal and conditional distributions in this case, but things are clear now.