The wall of text is very much appreciated!
Let me see if I understand the proposed solution. We can decompose the conditional cdf as

F_{Y_{n+1} | Y_n = o_n}(y) = \lambda_{n+1} F_{A_{n+1}}(y) + (1 - \lambda_{n+1}) F_{B_{n+1}}(y),

where \lambda_{n+1} = F_X(o_n) and A_{n+1} is a constant random variable with value o_n.
Hence we have expressed the conditional random variable Y_{n+1} | Y_n = o_n as a mixture of A_{n+1} and B_{n+1} = X_{n+1} | X_{n+1} > o_n, with mixing probability \lambda_{n+1} = F_X(o_n).
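Spelling the step out for myself (assuming Y_{n+1} = \max(Y_n, X_{n+1}) with X_{n+1} independent of the past), the decomposition can be checked directly:

F_{Y_{n+1} | Y_n = o_n}(y) = P(\max(o_n, X_{n+1}) \le y) = \mathbf{1}\{y \ge o_n\} F_X(y)
= \underbrace{F_X(o_n)}_{\lambda_{n+1}} \underbrace{\mathbf{1}\{y \ge o_n\}}_{F_{A_{n+1}}(y)} + (1 - F_X(o_n)) \underbrace{\frac{F_X(y) - F_X(o_n)}{1 - F_X(o_n)} \mathbf{1}\{y \ge o_n\}}_{F_{B_{n+1}}(y)}.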
Using the trick in the link you sent, we can then write the conditional likelihood for o_{n+1} given o_n as

p(o_{n+1} | o_n) = F_X(o_n) if o_{n+1} = o_n,   f_X(o_{n+1}) if o_{n+1} > o_n,   0 otherwise

(the atom contributes its probability \lambda_{n+1} = F_X(o_n); for a strict increase the factor (1 - \lambda_{n+1}) cancels against the normalizing constant of B_{n+1}, leaving just the density f_X of the underlying distribution).
Denoting by D the set of indices i > 1 where o_i = o_{i-1}, and by C its complement (which includes i = 1, since the first observation always contributes a density term), the total likelihood of a non-decreasing sequence o_1, ..., o_N is

L(o_1, \ldots, o_N) = \prod_{i \in C} f_X(o_i) \prod_{i \in D} F_X(o_i).
To make this work in PyMC3, we have to code the logarithm of this expression as a Python function and implement it as a joint distribution object (which I am still confused about how to do).
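Written out, the log likelihood that such a function has to compute is

\log L = \sum_{i \in C} \log f_X(o_i) + \sum_{i \in D} \log F_X(o_i).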
So far, what I have been able to implement is the code for the log likelihood part. If we assume that the underlying distribution of the X_i is normal, it would be:
import numpy as np
import pymc3 as pm

def logp(os, mu, sigma, epsilon=1e-5):
    x_dist = pm.Normal.dist(mu=mu, sigma=sigma)
    log_likelihood = 0
    o_n_minus_1 = -np.inf
    for o_n in os:
        if np.abs(o_n - o_n_minus_1) < epsilon:
            # value equal to the last one: the new draw stayed below the
            # running max, which has probability F_X(o_n)
            log_likelihood += x_dist.logcdf(o_n).eval()
        elif o_n > o_n_minus_1:
            # value greater than the last one (this branch also catches the
            # first observation): the new draw is the new max, density f_X(o_n)
            log_likelihood += x_dist.logp(o_n).eval()
        else:
            # value smaller than the last one: impossible for a running maximum
            log_likelihood += -np.inf
        o_n_minus_1 = o_n
    return log_likelihood
We can test this log likelihood function on some sampled data:
# TEST THE LOGP FUNCTION
# Model parameters
N = 20
mu = 10.
sigma = 0.1

# Sample generation: a running maximum of N normal draws
ts = np.arange(N)
obss = []
obs = -np.inf
for t in ts:
    aux = mu + sigma * np.random.randn()
    obs = np.max([obs, aux])
    obss.append(obs)
obss = np.array(obss)

# Evaluate the log likelihood on a grid of mu values (sigma held fixed at 1.)
mus = np.linspace(0., 20.)
log_likelihoods = np.vectorize(lambda mu: logp(obss, mu=mu, sigma=1.))(mus)

import matplotlib.pyplot as plt
plt.plot(mus, log_likelihoods);

The peak of the log likelihood is close to the value of \mu we used to generate the data, so it seems to be working!
I tried implementing this as a DensityDist in PyMC3 like this:
with pm.Model() as model:
    mu = pm.Uniform('mu', 0., 20.)
    sigma = pm.Uniform('sigma', 0.5, 1.5)
    Ys = pm.DensityDist('Ys', logp, observed={'os': obss, 'mu': mu, 'sigma': sigma})
    trace = pm.sample(1000)

pm.traceplot(trace, ['mu', 'sigma'])
But PyMC3 didn't like my logp: I get TypeError: Variables do not support boolean operations, presumably because the Python if/elif tries to branch on symbolic Theano variables. I guess I need to implement the function as you said, with a different distribution depending on whether the observations are the same or different.
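For what it's worth, here is a rough sketch of how I imagine that could be done: the tie/no-tie comparison can be expressed with tt.switch so that everything stays symbolic instead of branching in Python. The name logp_running_max and the epsilon tolerance are mine, and I haven't run this against a real PyMC3 installation, so please treat it as a sketch of the idea rather than a verified fix:

import numpy as np
import theano.tensor as tt
import pymc3 as pm

def logp_running_max(os, mu, sigma, epsilon=1e-5):
    os = tt.as_tensor_variable(os)
    x_dist = pm.Normal.dist(mu=mu, sigma=sigma)
    # o_{i-1} for every i, with a dummy -inf in front so the first
    # observation always falls in the "new maximum" branch
    prev = tt.concatenate([[-np.inf], os[:-1]])
    tie = tt.lt(tt.abs_(os - prev), epsilon)
    # ties contribute log F_X(o_i), new maxima contribute log f_X(o_i);
    # a decrease can never occur in data that really are a running maximum,
    # so that branch is dropped here
    terms = tt.switch(tie, x_dist.logcdf(os), x_dist.logp(os))
    return tt.sum(terms)

with pm.Model() as model:
    mu = pm.Uniform('mu', 0., 20.)
    sigma = pm.Uniform('sigma', 0.5, 1.5)
    Ys = pm.DensityDist('Ys', logp_running_max,
                        observed={'os': obss, 'mu': mu, 'sigma': sigma})
    trace = pm.sample(1000)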