Heya folks. I need to model data on a 5-point integer scale (human ratings from 1 to 5) that look roughly normally distributed: ~50% of scores are 3 and fewer than ~10% are 1 or 5.
If I just use Normal for the Prior and the Likelihood, but all the observations are integers, does that lead to some kind of mis-specification of the model?
Yes, if your observations are integers, I think you should choose a likelihood that deals with integers. That said, you can always try a Normal likelihood and see how it goes. After all, a likelihood is nothing more than an assumption that you can test!
There isn’t enough info here to be sure, but you may need to take the order of the data into account (1 is better than 2-5, for instance), in which case an ordered logistic likelihood could be appropriate.
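To make the ordered logistic idea concrete, here is a minimal stdlib-only sketch of how such a likelihood turns a latent score and four cutpoints into probabilities for the five ratings. The function names, the cutpoint values, and the ratings are made up for illustration; in a real model the cutpoints and the latent score `eta` would be inferred.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logistic_probs(eta, cutpoints):
    """Category probabilities for ratings 1..K given K-1 increasing cutpoints."""
    # Cumulative probabilities P(y <= k) = sigmoid(c_k - eta), padded with 0 and 1.
    cum = [0.0] + [sigmoid(c - eta) for c in cutpoints] + [1.0]
    # P(y = k) is the difference between consecutive cumulative probabilities.
    return [hi - lo for lo, hi in zip(cum[:-1], cum[1:])]

def log_likelihood(ratings, eta, cutpoints):
    """Log-likelihood of integer ratings (1-based) under the ordered logistic."""
    probs = ordered_logistic_probs(eta, cutpoints)
    return sum(math.log(probs[r - 1]) for r in ratings)

# Hypothetical cutpoints and ratings, chosen only for illustration.
cutpoints = [-3.0, -1.0, 1.0, 3.0]
ratings = [3, 3, 2, 4, 3, 1, 5, 3]

probs = ordered_logistic_probs(0.0, cutpoints)
print([round(p, 3) for p in probs])
print(log_likelihood(ratings, 0.0, cutpoints))
```

Raising `eta` shifts probability mass toward the higher ratings, which is how the ordering of the scale enters the model.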
Hope this helps
When modeling ratings, the Multinomial distribution (or the Categorical, if you model one rating at a time) is usually a good choice. Unlike the Normal distribution, it has discrete (integer) and finite (1 to 5 stars) support.
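As a rough sketch of what this looks like in practice, the snippet below tallies ratings into counts and computes the posterior mean of the five category probabilities under a symmetric Dirichlet prior (the standard conjugate prior for this likelihood). The ratings and the prior strength `alpha` are made up for illustration. Note that this treats the five levels as unordered categories.

```python
import math
from collections import Counter

# Made-up ratings, for illustration only.
ratings = [3, 3, 2, 4, 3, 1, 5, 3, 2, 4]
levels = [1, 2, 3, 4, 5]

counts = Counter(ratings)
n = len(ratings)

# Symmetric Dirichlet(alpha) prior; alpha = 1 is an assumed, flat choice.
alpha = 1.0

# Conjugacy: the posterior is Dirichlet(alpha + counts), so the posterior
# mean of each category probability is (count + alpha) / (n + K * alpha).
post_mean = [(counts[k] + alpha) / (n + alpha * len(levels)) for k in levels]

# Log-likelihood of the observed ratings at the posterior mean,
# dropping the multinomial coefficient, which does not depend on the probabilities.
log_lik = sum(counts[k] * math.log(p) for k, p in zip(levels, post_mean))

for k, p in zip(levels, post_mean):
    print(f"P(rating={k}) ~ {p:.3f}")
```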
In Kruschke’s Doing Bayesian Data Analysis, he models data like yours as if there were an underlying normal distribution that is then digitized using (inferred) cutpoints. Check out this example: https://gist.github.com/DanielWeitzenfeld/d9ac64f76281e6c1d29217af76449664
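For intuition, here is a small stdlib-only sketch of the thresholded-normal idea: the probability of each rating is the area of the latent normal between two adjacent cutpoints. This is not the code from the gist; the function name and the values of `mu`, `sigma`, and the cutpoints are assumptions picked by hand, whereas in the actual model they would be inferred from the data.

```python
from statistics import NormalDist

def thresholded_normal_probs(mu, sigma, cutpoints):
    """P(rating = k) when a latent Normal(mu, sigma) is binned at the cutpoints."""
    latent = NormalDist(mu, sigma)
    # Cumulative probability at each cutpoint, padded with 0 and 1 at the ends.
    cdf = [0.0] + [latent.cdf(c) for c in cutpoints] + [1.0]
    # Each rating gets the latent mass between its lower and upper cutpoint.
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]

# Hypothetical values chosen by hand, not fitted to any data.
probs = thresholded_normal_probs(mu=3.0, sigma=0.75, cutpoints=[1.5, 2.5, 3.5, 4.5])

for rating, p in enumerate(probs, start=1):
    print(f"P(rating={rating}) = {p:.3f}")
```

With these hand-picked values, about half of the mass lands on 3 and only a few percent on each of 1 and 5, which is roughly the shape described in the question.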