Observed Deterministic

Hi there :)
As the title says, I’m looking for a way to build a model where the observed variable is a deterministic combination of several latent ones, with the uncertainties attached to those latent variables. To give a very simple example: suppose I measure values a_i, each with some uncertainty, but observe only the differences a_i - a_0. I would like to represent this as a = pm.Normal(..., shape=n) together with a_diff = pm.Deterministic(a[1:] - a[0], observed=data), but currently there is no such possibility. The closest thing I found is something like a_diff = pm.Normal(mu=a[1:] - a[0], sd=0.001, observed=data). It works, but the sampler has difficulties for small sd (as expected), and the sd value itself has to be specified by hand.
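For concreteness, a minimal runnable sketch of that workaround (the prior on a and the data array here are just placeholders):

import numpy as np
import pymc3 as pm

n = 5
data = np.random.randn(n - 1)  # placeholder for the observed differences a_i - a_0

with pm.Model():
    a = pm.Normal('a', mu=0.0, sd=1.0, shape=n)  # latent values a_i
    # "soft" observation: the artificial small sd is exactly the part I'd like to avoid
    a_diff = pm.Normal('a_diff', mu=a[1:] - a[0], sd=0.001, observed=data)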
So, what do you think about this?

I don’t think there is an easy way to model this in PyMC3, as this class of problem falls under likelihood-free inference. To model it in PyMC3 you need to find a way to express the likelihood.
For example, in the case above, if a_i and a_0 both have Normal distributions, say a_i ~ N(mu_i, var_i) and a_0 ~ N(mu_0, var_0), then a_i - a_0 ~ N(mu_i - mu_0, var_i + var_0). You can try using Sympy to work such expressions out.
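As a sketch of what that looks like in PyMC3 (the priors on mu and sigma below are placeholders, and note that this uses only the marginal distribution of each difference; jointly the differences are correlated through a_0):

import numpy as np
import pymc3 as pm

n = 5
data = np.random.randn(n - 1)  # observed differences a_i - a_0 (placeholder)

with pm.Model():
    mu = pm.Normal('mu', mu=0.0, sd=10.0, shape=n)   # latent means
    sigma = pm.HalfNormal('sigma', sd=1.0, shape=n)  # latent standard deviations
    # a_i - a_0 ~ N(mu_i - mu_0, var_i + var_0), written directly as the likelihood
    a_diff = pm.Normal('a_diff',
                       mu=mu[1:] - mu[0],
                       sd=pm.math.sqrt(sigma[1:] ** 2 + sigma[0] ** 2),
                       observed=data)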
If a closed-form likelihood is difficult to find, you need a likelihood-free inference approach (approximate Bayesian computation, etc.) such as https://github.com/elfi-dev/elfi

[Edit] You can also use SMC-ABC introduced in https://github.com/pymc-devs/pymc3/pull/3404

Well, this is basically what I do currently (i.e. a_i - a_0 ~ N(mu_i - mu_0, var_i + var_0)); I just hoped there might be a way to specify the model as it is. And what’s the deal-breaking difference between a_diff = pm.Normal(mu=a[1:] - a[0], sd=0.001, observed=data) (which is possible) and a_diff = pm.Deterministic(a[1:] - a[0], observed=data) (which isn’t)? I thought they would require basically the same processing.

Although they seem similar, in practice attaching an observation to a deterministic node is really difficult. In PyMC3 a deterministic node is meant to record a deterministic mapping $f(x_1, x_2, \dots) \to y$, and if $y$ is observed we need to find the likelihood $\pi(y \mid x_1, x_2, \dots)$. This is difficult to do except in a few limited cases, as we need to apply a Jacobian correction to keep the probability volume consistent before and after the deterministic mapping.
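To spell out one of those limited cases: when $y = f(x)$ is invertible and differentiable with a single argument, the Jacobian correction is just the change-of-variables formula

\pi(y) = \pi_x\left(f^{-1}(y)\right)\left|\frac{d}{dy}f^{-1}(y)\right|

but for a many-to-one mapping such as $f(x_1, x_2) = x_1 - x_2$ no such inverse exists, and the extra dimensions would have to be integrated out instead.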

Not to shift the focus of the discussion, but what are some of the drawbacks of likelihood-free inference approaches?

Hi,

I was just wondering whether anything has changed since this issue was raised.

I have a similar situation as the original poster.

Thanks in advance!

I wouldn’t expect much progress in this area: the limitation is not that PyMC3 lacks an implementation of observed deterministics, but rather that no good algorithm for doing this has been found.

To expand on @junpenglao’s comment, the issue lies in the fact that saying

pm.Deterministic('F', f(x1, x2, x3, x4), observed=x_obs)

corresponds to the likelihood

\prod_{i=1}^N \int \int \int \int \mathbb{I}[f(x_1, x_2, x_3, x_4) = x_{obs}^{(i)}]P(x_1,x_2,x_3,x_4)dx_1dx_2dx_3dx_4

which is a nasty high-dimensional integral. This comes with the added bonus that the event set is typically of measure zero, even in the simplest case of f(x_1, x_2) = x_1 + x_2: the constraint corresponds to the line x_2 = x_{obs}^{(i)} - x_1, which has zero volume in \mathbb{R}^2 and is thus impossible to sample from. So we would need some way of performing high-dimensional integration over an arbitrary implicit surface. Naturally, this problem is a great deal harder than sampling from a likelihood.
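To see what that integral takes even in this simplest case (assuming x_1 and x_2 independent for simplicity), the indicator collapses the double integral into a convolution:

\int \int \mathbb{I}[x_1 + x_2 = x_{obs}^{(i)}]\,P(x_1)P(x_2)\,dx_1\,dx_2 = \int P(x_1)\,P(x_{obs}^{(i)} - x_1)\,dx_1

which is exactly the closed-form-likelihood route mentioned earlier: for Normal x_1, x_2 the convolution is again Normal, but for a general f no such reduction exists.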

@aplavin’s comment, in this light, is even more striking. Why is it that, in the above likelihood, the parameters of P(x_1, \dots) are in general intractable, while using

x_{obs} \sim N(f(x_1, x_2, x_3, x_4), \sigma_e^2)

or, equivalently, replacing

y = f(x)

with

y = f(x) + \epsilon

makes them tractable? (As an aside: this is the entry point into everything Gaussian Process.) After all, the situation is barely any different, as now:

P(y) = \int \int \int \int Z_\sigma\, e^{-(y-f(x))^2/(2\sigma^2)}\,P(x_1,x_2,x_3,x_4)\,dx_1\,dx_2\,dx_3\,dx_4

The short answer is that \epsilon provides a volume, so that there is no longer a measure-0 constraint set, making the integral amenable to standard numerical integration (such as MCMC).
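One way to see the connection between the two formulations: as \sigma \to 0 the Gaussian likelihood collapses onto the constraint set,

\lim_{\sigma \to 0} N\left(y \mid f(x), \sigma^2\right) = \delta\left(y - f(x)\right)

which recovers the measure-zero indicator from before, and is also why the sampler in the original post struggled as sd was made small.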



Thank you @chartl, this is exactly the explanation I was hoping for. So, in summary, we can assume obs - model ~ N(0, sigma), and

with pm.Model() as model:
    x1 = pm.Uniform('x1', 0, 1)
    x2 = pm.Uniform('x2', 0, 1)

def fun(x1, x2):
    return x1 + x2

with model:
    # deterministic combination of the latent variables
    modeled_mu = pm.Deterministic('modeled_mu', fun(x1, x2))
    # Gaussian likelihood centred on the deterministic value
    obs = pm.Normal('obs', mu=modeled_mu, tau=1, observed=observed_data)

because we are assuming that the likelihood is Gaussian and we are only modeling mu (we could also model tau). Is this correct?

Thank you!