I'm working on a regression task; to simplify things, let's assume the response can be modelled as follows: y = b1*x.
For some reason we have to scale both the input data and the output data. One reason could be, e.g., to construct general priors that work across a wide variety of datasets.
My question is this: assuming y can only take positive integer values and we scale it by, for example, dividing all responses by the maximum of y, does it still make sense to work with, e.g., a Normal or Student-T likelihood? Or should we tweak a Poisson likelihood to act on these values, which are now decimal but still discrete?
When you’ve scaled so that you don’t have integers, then likelihoods that operate on integers won’t work. You can try it yourself with something like
import numpy as np
import pymc as pm

y = np.arange(10)
pm.logp(pm.Poisson.dist(mu=5), y / 4).eval()
You’ll see that you get the same result as
pm.logp(pm.Poisson.dist(mu=5), np.floor(y/4)).eval()
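For comparison, scipy's discrete distributions handle this differently: as far as I know, `rv_discrete` checks `floor(k) == k`, so non-integer values simply get probability zero (log-probability `-inf`) rather than being floored. A quick sketch, assuming scipy is available:

```python
import numpy as np
from scipy import stats

y = np.arange(10)

# Non-integer values get pmf 0, i.e. log-probability -inf;
# only y = 0, 4, 8 give integers after dividing by 4.
logp = stats.poisson.logpmf(y / 4, mu=5)
print(logp)
```

Either way, the discrete likelihood is not doing anything sensible off the integers.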
If I understand correctly, tweaking like you suggest basically amounts to a potentially massive filtering of your data and will most likely end up in tears.
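To see why it amounts to filtering: flooring the scaled responses collapses many distinct counts into the same value, discarding information. With the same toy data:

```python
import numpy as np

y = np.arange(10)

# Ten distinct counts collapse to just three values after
# scaling by 4 and flooring back to integers.
floored = np.floor(y / 4)
print(floored)  # [0. 0. 0. 0. 1. 1. 1. 1. 2. 2.]
```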
So the obvious thing to do is to use a likelihood for continuous data.
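As a minimal sketch of that route (the data and noise scale here are made up for illustration): generate counts from y = b1*x, scale by the maximum, and fit b1 with a Normal likelihood on the scaled values, where everything stays well-defined:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical count data following y = b1 * x with b1 = 3.
x = np.arange(1, 21)
y = rng.poisson(3 * x)

# Scale responses by the maximum; scaling x by the same factor
# keeps the slope b1 unchanged under y = b1 * x.
y_scaled = y / y.max()
x_scaled = x / y.max()

# Least-squares estimate of b1 (the MLE under a Normal likelihood),
# evaluated on the continuous scaled data.
b1_hat = np.sum(x_scaled * y_scaled) / np.sum(x_scaled**2)
loglik = stats.norm.logpdf(y_scaled, loc=b1_hat * x_scaled, scale=0.05).sum()
print(b1_hat, np.isfinite(loglik))
```

The Normal log-likelihood is finite everywhere on the scaled data, and the slope estimate recovers something close to the true b1.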