Adding observed to result of maximum

Jaime_Sevilla · August 1, 2021, 1:23pm

I want to fit a model to the maximum of two random variables.
But it is not clear to me what is the syntax to do so.

import numpy as np
import pymc3 as pm

data = np.array([np.max([10 + np.random.randn(), 10 + np.random.randn()])])

with pm.Model() as model:
  a = pm.Normal('a', mu=10, sigma=1)
  b = pm.Normal('b', mu=10, sigma=1)
  c = pm.math.maximum(a, b, observed = data) # This doesn't work!

How can I feed my observations to the model?

jlindbloom · August 1, 2021, 4:13pm

I don’t think there is currently a simple way to do this for general RVs using PyMC3’s API. However, I can show you a trick for your case with two Gaussians.

The reason your code doesn’t work is because the maximum of two random variables is another random variable, whereas you are trying to use the vector maximum function that just selects the maximal element of the array. This is why observed = data throws the error, since this only works as a kwarg for random variables.

To get this working, you will need a PyMC3 distribution that represents the distribution of the maximum of two Gaussians. So what is this distribution? In the general case of n i.i.d. random variables X_1, \ldots, X_n, we can express the pdf of the maximum \rho(x) in terms of the pdf f(x) and cdf F(x) of X_1 as

\rho(x) = n [F(x)]^{n-1} f(x).

So for a general random variable, you could write a custom PyMC3 distribution that calls the pdf/cdf of the underlying random variable in order to define the distribution over the maximum. I think this would be an interesting but simple feature request for PyMC3 that wouldn’t be too hard to implement, and it wouldn’t surprise me if someone has already done this on their own.

In the meantime, you are lucky because it just so happens that the distribution for the maximum of two Gaussians is just a skew-Normal distribution, which is already implemented as a distribution in PyMC3. From here the pdf for the max of two standardized but correlated Gaussians is

\rho(x) = 2 \phi(x) \Phi \Big( \cfrac{1 - r}{\sqrt{1 - r^2}} x \Big)

where r is the correlation coefficient. For your case, it seems like you want r = 0. Since you have non-standardized Gaussians the above becomes

\rho(x) = 2 \phi \Bigg( \frac{x - \mu}{\sigma} \Bigg) \Phi \Bigg( \frac{1 - r}{\sqrt{1 - r^2}} \cdot \frac{x - \mu}{\sigma} \Bigg)

where for your case you have \mu = 10, \sigma = 1. We are almost there. The pdf above is implemented in PyMC3’s skew-normal distribution, where you take the parameters to be \mu = \mu, \sigma = \sigma, and \alpha = \frac{1- r}{\sqrt{1 - r^2}}. If your Gaussians are uncorrelated just take \alpha = 1.0.

TL;DR Here’s a Collab notebook that should solve your problem. One thing to note is that you aren’t actually inferring any parameters in your model, so maybe you want to pretend you don’t know what \mu and \sigma are. The code I used was:

# You only had one observation so I added more
data = np.array([np.max([10 + np.random.randn(), 10 + np.random.randn()]) for j in range(100)])

with pm.Model() as model:
  # Priors on mu and sigma
  mu = pm.Normal('mu', mu=9, sigma=3)
  sigma = pm.Uniform("sigma", lower=0.2, upper=4.0)

  # Likelihood
  likelihood = pm.SkewNormal("likelihood", mu=mu, sigma=sigma, alpha=1.0, observed=data)

Jaime_Sevilla · August 1, 2021, 7:38pm

This is an amazingly complete answer, thank you so much!

I was actually trying to implement this as an intermediate step of a more complex model that involved successively taking the maximum of a series of random variables, so (from what I understood from your answer) this solution doesn’t quite work out for me.

I have opened a new topic with more context, if you find the time to look at that I would really appreciate it!

Topic		Replies	Views
Modelling a timeseries of cumulative maximum Questions	19	1155	August 10, 2021
Newbie question - How to estimate a distribution if I only have the min, max and mean of observations Questions	4	2530	January 7, 2019
Sampling MaxEnt Distribution Questions	20	1986	September 28, 2017
Problem model definition PyMC3 Questions	12	4730	September 21, 2017
The Normal distribution with upper and lower bound Questions	2	1789	December 14, 2017

Adding observed to result of maximum

Related topics