My target variable is TOTAL_CONV.
I started with regular multiple linear regression, with no luck at all.
Then I tried a Bayesian model with pymc3 and a Normal likelihood, but the posterior predictive check (ppc) gives me negative values, which is not acceptable.
Then I changed the likelihood to a LogNormal distribution and the prior on beta (the slope) to HalfNormal, and it looks better: the results are reasonable and positive.
Now I'm trying StudentT or InverseGamma for the likelihood, and it gives me:
There were 5 divergences after tuning. Increase target_accept or reparameterize.
How do I check the accuracy of my models?
Thank you for your help
Ok. I would suggest that your difficulty is due to the fact that your models (e.g., linear regression) are not well-suited to your data. You have already plotted data['TOTAL_CONV'] and it's clear that it's very (very very) skewed. So a model assuming normally distributed error is not going to go well, and neither is a Student T: both are symmetric and thus won't deal well with the skew.
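To make the symmetry point concrete, here is a minimal sketch using only the Python standard library. The data are synthetic (a log-normal sample standing in for TOTAL_CONV, which I don't have), and the "models" are just moment fits rather than a pymc3 model, so treat it as an illustration and not as your actual setup. It fits a Normal on the raw scale and a LogNormal on the log scale, draws predictive samples from each, and counts how many come out negative.

```python
import math
import random
import statistics

random.seed(0)

# Synthetic stand-in for a right-skewed, strictly positive outcome.
# (Assumption: the real TOTAL_CONV data is not available here.)
y = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]

# Normal model: estimate mean and standard deviation on the raw scale.
mu, sigma = statistics.fmean(y), statistics.stdev(y)
normal_draws = [random.gauss(mu, sigma) for _ in range(2000)]

# Log-normal model: estimate mean and standard deviation on the log scale.
log_y = [math.log(v) for v in y]
log_mu, log_sigma = statistics.fmean(log_y), statistics.stdev(log_y)
lognormal_draws = [random.lognormvariate(log_mu, log_sigma) for _ in range(2000)]

# Share of predictive draws that are negative under each model.
frac_negative_normal = sum(d < 0 for d in normal_draws) / len(normal_draws)
frac_negative_lognormal = sum(d < 0 for d in lognormal_draws) / len(lognormal_draws)

print(f"negative draws, normal model:     {frac_negative_normal:.1%}")
print(f"negative draws, log-normal model: {frac_negative_lognormal:.1%}")
```

The symmetric model has to put mass below zero to cover the long right tail, which is the same thing you saw in your ppc; the log-scale model cannot produce negative values by construction.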
I would probably step back and try to figure out exactly what you want out of your model. If you want to try to predict the continuous variation in your outcome variable, I'm not sure the data will be of much use to you (you don't have enough variability). If you could instead get by with predicting a dichotomous version of your outcome variable (e.g., data['TOTAL_CONV']>1), then you might be able to make a bit more headway with the data you have. But it really depends on what you need.
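As a rough sketch of the dichotomous route, again with the standard library and made-up data (the list `total_conv` is a hypothetical stand-in for data['TOTAL_CONV']): build the binary target, look at the class balance, and compute the accuracy of always guessing the more common class. Any classifier you fit afterwards would need to beat that baseline to be worth using.

```python
import random

random.seed(1)

# Hypothetical stand-in for data['TOTAL_CONV']: mostly small counts, a few large ones.
total_conv = [int(random.lognormvariate(0.0, 1.0)) for _ in range(1000)]

# Dichotomous target, mirroring data['TOTAL_CONV'] > 1 from the post above.
converted = [int(v > 1) for v in total_conv]

# Share of rows in the positive class.
positive_rate = sum(converted) / len(converted)

# Accuracy of always predicting the more common class.
baseline_accuracy = max(positive_rate, 1 - positive_rate)

print(f"share with TOTAL_CONV > 1: {positive_rate:.1%}")
print(f"majority-class baseline accuracy: {baseline_accuracy:.1%}")
```

The threshold of 1 is just the example from above; the right cut-off depends on what decision you need the prediction for.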
Thank you for the tip.
I thought that was one of the strengths of PyMC3: we can apply it to any kind of target distribution.
In my case I need to predict TOTAL_CONV (its distribution is skewed to the right).