As @junpenglao said, you have 3 observed (or 4 if you would want to model the number of observations n
) and are trying to estimate 2 parameters. If you want to try to make a pymc3 model out of these, you need to know the probability distribution that governs each of your observed.
The mean of the n
observations is computed as:
m = \sum\limits_{i=1}^{n}\frac{x_i}{n}
where each x_i
is a normal with mean mu
and standard deviation sigma
. There are a bunch of properties that normal distributions have, you can compute the probability distribution of m
by using the sum of normals and the product of a normal with a real value. This means that
m = pm.Normal('m', mu=mu, sigma=sigma/tt.sqrt(n))
The real problem is how to compute the probability density of the min
and max
estimators:
\mathrm{min} = \min\limits_{i}x_i
\mathrm{max} = \max\limits_{i}x_i
You can work this out by induction. Take the max
estimator to start. If you have two x_i
, then you can compute the conditional probability that P(x_1 > x_2| x_1) like
P(x_1 > x_2| x_1) = \int\limits_{-\infty}^{x_1}N(x_2 | \mu, \sigma) dx_2
This integral ends up being something that depends on the error function. If you have more than two observations, you need x_1
to be bigger than x_2
, x_3
, …, x_n
, and given that they are all i.i.d. you get
P(x_1 > x_2 \wedge x_1 > x_3 \wedge ... \wedge x_1 > x_n | x_1) \propto \left[\int\limits_{-\infty}^{x_1}N(x_j | \mu, \sigma) dx_j\right]^{n-1}
where the proportionality constant depends on the number of possible permutations of x_{i\neq1}, which is (n-1)!.
Then to get the probability density of the maximum, you just need to multiply the conditional probability computed as above with x_1
's marginal probability, which is the normal distribution. Then the probability density of the maximum will be the:
P(\max) \propto N(\max|\mu,\sigma)\left[\int\limits_{-\infty}^{\max}N(x_j | \mu, \sigma) dx_j\right]^{n-1}
where the proportionality constant depends on the combinations of x_1
being the maximum and the rest of the observations are exchangeable in order, or x_2
being the maximum and the rest of the observations are exchangeable in order, etc. I’m not sure but I think that the proportionality constant should be the combinatorial number \left(\begin{array}{c} n \\ n-1 \\ \end{array}\right).
The calculations for min
are similar but you have to change the integration limits.
Of course, this probability distribution, and the corresponding one for the min, are not implemented in pymc3, but you could write them down as your custom distribution classes max_dist
and min_dist
(keep in mind that you need to supply logp
not the pdf). After that, you can do some inference of mu
and sigma
as:
from theano import tensor as tt
import pymc3 as pm
n = 20
m_obs = # some value
min_obs = # some value
max_obs = # some value
with pm.Model():
mu = pm.Normal('mu', mu=0, sigma=1)
sigma = pm.HalfNormal('sigma', sigma=1)
m = pm.Normal('m', mu=mu, sigma=sigma/tt.sqrt(n), observed=m_obs)
min = min_dist('min', mu=mu, sigma=sigma, n=n, observed=min_obs)
max = max_dist('max', mu=mu, sigma=sigma, n=n, observed=max_obs)
trace = pm.sample()