Modelling a timeseries of cumulative maximum

The wall of text is very much appreciated!

Let me see if I understand the proposed solution. We can decompose the conditional CDF as:

F_{Y_{n+1} \mid Y_n = o_n}(o_{n+1}) = P(Y_{n+1} \le o_{n+1} \mid Y_n = o_n) = \\ = P(X_{n+1} \le o_{n+1}) \, I(o_n \le o_{n+1}) = \\ = I(o_n \le o_{n+1}) \left[ F_X(o_n) + \int_{o_n}^{o_{n+1}} f_X \right] = \\ = I(o_n \le o_{n+1}) \left[ F_X(o_n) + \frac{1 - F_X(o_n)}{1 - F_X(o_n)} \int_{o_n}^{o_{n+1}} f_X \right] = \\ = I(o_n \le o_{n+1}) \left[ F_X(o_n) + (1 - F_X(o_n)) \frac{\int_{o_n}^{o_{n+1}} f_X}{\int_{o_n}^{\infty} f_X} \right] = \\ = \lambda_{n+1} \, P(A_{n+1} \le o_{n+1}) + (1 - \lambda_{n+1}) \, P(X_{n+1} \le o_{n+1} \mid X_{n+1} > o_n)

Where we have that \lambda_{n+1} = F_X(o_n) and A_{n+1} is a constant random variable with value o_{n}.

Hence we have expressed the conditional random variable Y_{n+1} | Y_n=y_n as a mixture of A_{n+1} and B_{n+1} = X_{n+1} | X_{n+1} > o_n, with mixing probability \lambda_{n+1} = F_X(o_n).
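As a sanity check on this mixture decomposition, we can compare the formula against a Monte Carlo estimate of the conditional CDF. This is just a quick sketch assuming a standard normal X; the particular values of o_n and the evaluation point are arbitrary choices for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
o_n, y = 0.5, 1.2  # conditioning value o_n and evaluation point o_{n+1}

# Empirical: Y_{n+1} = max(o_n, X_{n+1}) for fresh draws X ~ N(0, 1)
x = rng.normal(size=200_000)
empirical = np.mean(np.maximum(o_n, x) <= y)

# Mixture: lambda * P(A_{n+1} <= y) + (1 - lambda) * P(X <= y | X > o_n)
lam = norm.cdf(o_n)
truncated = (norm.cdf(y) - norm.cdf(o_n)) / (1.0 - norm.cdf(o_n))
mixture = lam * float(o_n <= y) + (1.0 - lam) * truncated

print(abs(empirical - mixture))  # small, up to Monte Carlo error
```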

Using the trick in the link you sent, we can then write the conditional likelihood for o_{n+1} given o_n as

\mathcal{L}_{n+1} (o_{n+1} \mid o_n) = \lim_{\epsilon\to 0} \frac{F_{Y_{n+1} \mid Y_n=o_n}(o_{n+1} + \epsilon) - F_{Y_{n+1} \mid Y_n=o_n}(o_{n+1} - \epsilon)}{2\epsilon} = \\ = \left.\begin{cases} \lambda_{n+1}, & \text{if } o_{n} = o_{n+1}\\ (1 - \lambda_{n+1}) f_{B_{n+1}}(o_{n+1}), & \text{if } o_{n} < o_{n+1}\\ \end{cases}\right\} = \\ = \left.\begin{cases} F_X(o_n), & \text{if } o_{n} = o_{n+1}\\ (1 - F_X(o_n)) \frac{f_X(o_{n+1})}{\int_{o_n}^\infty f_X}, & \text{if } o_{n} < o_{n+1}\\ \end{cases}\right\} = \\ = \left.\begin{cases} F_X(o_n), & \text{if } o_{n} = o_{n+1}\\ f_X(o_{n+1}), & \text{if } o_{n} < o_{n+1}\\ \end{cases}\right\}

(where in the point-mass case the limit should be read as the mass \lambda_{n+1} of the atom at o_n, the usual mixed discrete/continuous density convention).

Denoting by D the set of indices i > 1 where o_i = o_{i-1}, and by C its complement among the indices i > 1, we can then write the total likelihood as:

\mathcal{L}(\mathbf o) = f_X(o_1)\prod_{i\in D} F_X(o_i) \prod_{i\in C} f_X(o_{i})
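In plain NumPy/SciPy this total likelihood can be evaluated directly in vectorized form. This is a sketch for a normal X; since the repeats in a running maximum are exact copies, plain equality is enough to identify the set D (the function name is just mine, for illustration):

```python
import numpy as np
from scipy.stats import norm

def cummax_loglik(os, mu, sigma):
    """Log-likelihood of a cumulative-maximum series under X_i ~ N(mu, sigma)."""
    os = np.asarray(os, dtype=float)
    repeated = os[1:] == os[:-1]  # indices in D: exact copies from the running max
    ll = norm.logpdf(os[0], mu, sigma)                     # f_X(o_1)
    ll += norm.logcdf(os[1:][repeated], mu, sigma).sum()   # prod over D of F_X(o_i)
    ll += norm.logpdf(os[1:][~repeated], mu, sigma).sum()  # prod over C of f_X(o_i)
    return ll
```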

To make this work in PyMC3, we have to code the logarithm of this expression as a Python function and wrap it in a distribution object (which I am still confused about how to do).

So far, what I could implement is the code for the log-likelihood part.
If we assume that the underlying distribution of the X_i is normal, then it would be:

import numpy as np
import pymc3 as pm

def logp(os, mu, sigma, epsilon=1e-5):
  x_dist = pm.Normal.dist(mu=mu, sigma=sigma)
  log_likelihood = 0
  o_n_minus_1 = -np.inf
  for o_n in os:
    if np.abs(o_n - o_n_minus_1) < epsilon: # value equal to the last one: contributes F_X(o_n)
      log_likelihood += x_dist.logcdf(o_n).eval()
    elif o_n > o_n_minus_1: # value greater than the last one: contributes f_X(o_n)
      log_likelihood += x_dist.logp(o_n).eval()
    else: # value smaller than the last one: impossible for a running maximum
      log_likelihood += -np.inf
    o_n_minus_1 = o_n
  return log_likelihood

We can test this log likelihood function on some sampled data:

# TEST THE LOGP FUNCTION

# Model parameters
N = 20
mu = 10.
sigma = 0.1

# Sample generation
ts = np.arange(N)
obss = []
obs = -np.inf
for t in ts:
  aux = mu + sigma*np.random.randn()
  obs = np.max([obs, aux])
  obss.append(obs)
obss = np.array(obss)

mus = np.linspace(0., 20.)
log_likelihoods = np.vectorize(lambda mu : logp(obss, mu=mu, sigma=1.))(mus)

import matplotlib.pyplot as plt
plt.plot(mus, log_likelihoods);

(image: plot of the log-likelihood as a function of \mu)

The peak of the loglikelihood corresponds to the value of \mu we used to generate the data, so it seems to be working!

I tried implementing this as a DensityDist in PyMC3 like this:

with pm.Model() as model:
  mu = pm.Uniform('mu', 0., 20.)
  sigma = pm.Uniform('sigma', 0.5, 1.5)
  Ys = pm.DensityDist('Ys', logp, observed = {'os':obss, 'mu': mu, 'sigma': sigma})
  trace = pm.sample(1000)
  pm.traceplot(trace, ['mu', 'sigma'])

But PyMC3 didn’t like my logp; I get TypeError: Variables do not support boolean operations.
I guess I need to implement the function as you said, with a different distribution depending on whether consecutive observations are the same or different.
