Newbie question - How to estimate a distribution if I only have the min, max and mean of observations

Hi all,

Let’s say I have 20 observations from a random variable Normal(mu, sd)

I only know the min, max and mean of these 20 observations, but not the individual observations

Is there enough information to estimate mu and sd? How would you do it in pymc3?

Thanks for your help in advance!

So you basically have 3 observed data (min, max, and mean) and you try to estimate 2 parameters (mu, sd). I would say you dont have enough information in the observed to do it reliably.

Thanks for the reply. There is actually one more piece of information - the number of observations (20), but I take your point.

I guess what I am wondering is whether there is a way to feed information into a model that is not direct observations of random variables in the model, but rather some description of those observations…

Only if these description are sufficient statistics Sufficient statistic - Wikipedia.

As @junpenglao said, you have 3 observed (or 4 if you would want to model the number of observations n) and are trying to estimate 2 parameters. If you want to try to make a pymc3 model out of these, you need to know the probability distribution that governs each of your observed.

The mean of the n observations is computed as:

m = \sum\limits_{i=1}^{n}\frac{x_i}{n}

where each x_i is a normal with mean mu and standard deviation sigma. There are a bunch of properties that normal distributions have, you can compute the probability distribution of m by using the sum of normals and the product of a normal with a real value. This means that

m = pm.Normal('m', mu=mu, sigma=sigma/tt.sqrt(n))

The real problem is how to compute the probability density of the min and max estimators:

\mathrm{min} = \min\limits_{i}x_i

\mathrm{max} = \max\limits_{i}x_i

You can work this out by induction. Take the max estimator to start. If you have two x_i, then you can compute the conditional probability that P(x_1 > x_2| x_1) like

P(x_1 > x_2| x_1) = \int\limits_{-\infty}^{x_1}N(x_2 | \mu, \sigma) dx_2

This integral ends up being something that depends on the error function. If you have more than two observations, you need x_1 to be bigger than x_2, x_3, …, x_n, and given that they are all i.i.d. you get

P(x_1 > x_2 \wedge x_1 > x_3 \wedge ... \wedge x_1 > x_n | x_1) \propto \left[\int\limits_{-\infty}^{x_1}N(x_j | \mu, \sigma) dx_j\right]^{n-1}

where the proportionality constant depends on the number of possible permutations of x_{i\neq1}, which is (n-1)!.

Then to get the probability density of the maximum, you just need to multiply the conditional probability computed as above with x_1's marginal probability, which is the normal distribution. Then the probability density of the maximum will be the:

P(\max) \propto N(\max|\mu,\sigma)\left[\int\limits_{-\infty}^{\max}N(x_j | \mu, \sigma) dx_j\right]^{n-1}

where the proportionality constant depends on the combinations of x_1 being the maximum and the rest of the observations are exchangeable in order, or x_2 being the maximum and the rest of the observations are exchangeable in order, etc. I’m not sure but I think that the proportionality constant should be the combinatorial number \left(\begin{array}{c} n \\ n-1 \\ \end{array}\right).

The calculations for min are similar but you have to change the integration limits.

Of course, this probability distribution, and the corresponding one for the min, are not implemented in pymc3, but you could write them down as your custom distribution classes max_dist and min_dist (keep in mind that you need to supply logp not the pdf). After that, you can do some inference of mu and sigma as:

from theano import tensor as tt
import pymc3 as pm


n = 20
m_obs = # some value
min_obs = # some value
max_obs = # some value

with pm.Model():
    mu = pm.Normal('mu', mu=0, sigma=1)
    sigma = pm.HalfNormal('sigma', sigma=1)
    m = pm.Normal('m', mu=mu, sigma=sigma/tt.sqrt(n), observed=m_obs)
    min = min_dist('min', mu=mu, sigma=sigma, n=n, observed=min_obs)
    max = max_dist('max', mu=mu, sigma=sigma, n=n, observed=max_obs)
    trace = pm.sample()