I have some sampled data (x1, x2), but it’s a biased sample. I also know the population means (and potentially the variances). I want to generate a weight w_i for each row in the sample so that the weighted sample mean matches the population mean (closely). How would you do this in a Bayesian way?
I could put a normal or beta prior on the weights:
w_i ~ N(1, …)
w_i ~ Beta(1, 1) + 0.5
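But I think there is also an identifiability problem: there is one weight per row, yet we only compare two values (the weighted mean vs. the true mean of x1 and of x2), so many different sets of weights fit equally well.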
Actually, a simple heuristic works well: set each row's weight to the inverse of its distance from the true mean.
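For anyone who wants to try the heuristic, here is a minimal sketch in NumPy. The data, the population mean, and the small `eps` guard are all made up for illustration. It only shows the mechanics, and how close the weighted mean ends up will depend on the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: population centred at (0, 0), sample biased toward (1, 1).
true_mean = np.array([0.0, 0.0])
sample = rng.normal(loc=[1.0, 1.0], scale=1.0, size=(500, 2))

# Inverse-distance weights; eps (an assumption) guards against division by zero.
eps = 1e-8
dist = np.linalg.norm(sample - true_mean, axis=1)
weights = 1.0 / (dist + eps)
weights /= weights.sum()  # normalise so the weights sum to 1

raw_mean = sample.mean(axis=0)
weighted_mean = weights @ sample

print("unweighted mean:", raw_mean)
print("weighted mean:  ", weighted_mean)
```

Rows close to the true mean get large weights, so the weighted mean is pulled toward it, but nothing in the heuristic forces an exact match.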
I would suggest taking a look at this mixture model notebook for some ideas about an approach to a similar, but not identical, problem. As you suggest, if you have many rows/observations, want to infer a mixture of those observations, and only have the grand mean (and possibly the variance), you are unlikely to make much progress. However, if you have additional information, it may be possible. For example, if your observations can be categorized (e.g., by gender, state, time, etc.) or have other attributes, then you can ask about the mixture of types rather than the mixture of observations (and then you’re closer to the problem the above notebook is trying to address).
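To make the "mixture of types" idea concrete, here is a rough, non-Bayesian sketch (not the notebook's method). It assumes three hypothetical types and a known 2-D population mean, so the type shares are exactly determined by the two mean constraints plus the requirement that the shares sum to 1. All data and names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample: three types with different centres, type 0 over-represented.
centres = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
counts = np.array([300, 50, 50])
labels = np.repeat(np.arange(3), counts)
sample = centres[labels] + rng.normal(scale=0.5, size=(labels.size, 2))

# Assumed known population mean (chosen to lie inside the triangle of type means).
pop_mean = np.array([1.5, 1.5])

# Per-type sample means, shape (3, 2).
type_means = np.array([sample[labels == k].mean(axis=0) for k in range(3)])

# Solve for shares pi: sum_k pi_k * type_means[k] = pop_mean and sum_k pi_k = 1.
A = np.vstack([type_means.T, np.ones(3)])
b = np.append(pop_mean, 1.0)
pi = np.linalg.solve(A, b)

# Spread each type's share evenly over its rows.
weights = pi[labels] / counts[labels]

print("type shares:", pi)
print("weighted mean:", weights @ sample)
```

With more types than constraints the shares are no longer pinned down, which is where a prior on the shares (for example a Dirichlet) would come in. If the population mean falls outside the region spanned by the type means, some solved shares will be negative, which signals that the types cannot reproduce that mean.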