How to model Normally distributed but integer data?

Matthew_Francis · December 18, 2019, 8:38pm

Heya folks. I need to model data that comes on a 5-point integer scale (human ratings from 1-5) and that looks roughly normally distributed: ~50% of scores are 3, and fewer than ~10% are 1 or 5.

If I just use Normal for the Prior and the Likelihood, but all the observations are integers, does that lead to some kind of mis-specification of the model?

AlexAndorra · December 23, 2019, 5:11pm

Hi Matthew,
Yes, if your observations are integers, I think you should choose a likelihood that deals with integers. That said, you can always try a Normal likelihood and see how it goes. After all, a likelihood is nothing more than an assumption that you can test!
There isn’t enough info here, but you may need to take the order of the ratings into account (1 is better than 2-5, for instance), in which case an ordered logistic likelihood could be appropriate.
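
To make the ordered logistic idea concrete, here is a minimal sketch in plain Python (no PyMC). The ratings, the cutpoint values and the linear predictor `eta` are all made up for illustration. In a real model the cutpoints and `eta` would be inferred rather than fixed.

```python
import math

def sigmoid(x):
    """Logistic function, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logistic_probs(eta, cutpoints):
    """Category probabilities under an ordered logistic model.

    Uses P(y <= k) = sigmoid(c_k - eta) for increasing cutpoints c_k,
    then differences the cumulative probabilities.
    """
    cum = [0.0] + [sigmoid(c - eta) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cum, cum[1:])]

def log_likelihood(ratings, eta, cutpoints):
    """Log-likelihood of integer ratings 1..K, sharing one value of eta."""
    probs = ordered_logistic_probs(eta, cutpoints)
    return sum(math.log(probs[r - 1]) for r in ratings)

# Hypothetical data and parameter values, for illustration only
ratings = [3, 3, 2, 4, 3, 1, 3, 4, 2, 5]
cutpoints = [-2.5, -1.0, 1.0, 2.5]  # 4 cutpoints give 5 ordered categories

probs = ordered_logistic_probs(eta=0.0, cutpoints=cutpoints)
ll = log_likelihood(ratings, eta=0.0, cutpoints=cutpoints)
print([round(p, 3) for p in probs])
print(round(ll, 3))
```

Raising `eta` shifts probability mass toward the higher ratings, which is how the model respects the ordering of the scale.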
Hope this helps

rlouf · December 24, 2019, 9:27am

Hi Matthew,

When modeling ratings, the Multinomial distribution is usually a good choice. Unlike the Normal distribution, it has a discrete (integer-valued) and finite (1 to 5 stars) support.
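
As a rough sketch of what this looks like in practice, the snippet below estimates the five rating probabilities with a symmetric Dirichlet prior, which is conjugate to the Multinomial, so the posterior mean has a closed form. The ratings and the prior strength `alpha` are assumptions for illustration, and `posterior_mean_probs` is a hypothetical helper name, not a library function.

```python
from collections import Counter

def posterior_mean_probs(ratings, categories=(1, 2, 3, 4, 5), alpha=1.0):
    """Posterior mean of the category probabilities.

    With a Dirichlet(alpha, ..., alpha) prior and Multinomial counts n_k,
    the posterior is Dirichlet(alpha + n_k), with mean
    (n_k + alpha) / (N + K * alpha).
    """
    counts = Counter(ratings)
    total = len(ratings) + alpha * len(categories)
    return {k: (counts[k] + alpha) / total for k in categories}

# Hypothetical ratings, for illustration only
ratings = [3, 3, 2, 4, 3, 1, 3, 4, 2, 5]
probs = posterior_mean_probs(ratings)
for rating, p in probs.items():
    print(f"P(rating = {rating}) = {p:.3f}")
```

Note that this treats the five ratings as unordered labels, so it does not use the fact that 2 sits between 1 and 3.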

DanWeitzenfeld · December 30, 2019, 7:53pm

Hi Matthew,

In Kruschke’s Doing Bayesian Data Analysis, he models data like yours as if there were an underlying normal distribution that is then digitized using (inferred) cutpoints. Check out this example: https://gist.github.com/DanielWeitzenfeld/d9ac64f76281e6c1d29217af76449664
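
This is not the code from the gist, just a small standard-library sketch of the underlying idea: a latent Normal is cut at thresholds, and the area between neighbouring thresholds is the probability of each rating. The values of `mu`, `sigma` and the cutpoints are assumed for illustration. In the full model they would be inferred from the data.

```python
from statistics import NormalDist

def ordinal_probs(mu, sigma, cutpoints):
    """Probability of each rating when a latent Normal(mu, sigma) is cut at the thresholds."""
    latent = NormalDist(mu, sigma)
    # CDF at each cutpoint, padded with 0 and 1 for the open-ended outer categories
    cdf = [0.0] + [latent.cdf(c) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

# Illustrative values only, not fitted to any data
cutpoints = [1.5, 2.5, 3.5, 4.5]
probs = ordinal_probs(mu=3.0, sigma=0.74, cutpoints=cutpoints)
for rating, p in enumerate(probs, start=1):
    print(f"P(rating = {rating}) = {p:.3f}")
```

With these assumed values, about half of the mass falls on a rating of 3 and well under 10% on each of 1 and 5, which is the shape described in the original question.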
