Using a random variable as observed

It appears to be impossible to use a random variable as the observed for another variable. For example, consider this simple model:

import pymc3 as pm
import matplotlib.pyplot as plt
import scipy.stats as stats
import numpy as np

raw_observed = 0.3
with pm.Model() as model:
    alpha = pm.Exponential('alpha', 0.2)
    beta = pm.Exponential('beta', 0.2)
    raw_with_slight_adjustment = pm.Normal(
        'raw_with_slight_adjustment', raw_observed, 0.0001) 
    wpp = pm.Beta(
        'wpp', alpha=alpha, beta=beta, 
        observed=raw_with_slight_adjustment) 

This model raises a TypeError.

If the observed is a Deterministic instead of an RV, the model can be defined. For example, this variation of the above does not raise an error:

raw_observed = 0.3
with pm.Model() as model:
    alpha = pm.Exponential('alpha', 0.2)
    beta = pm.Exponential('beta', 0.2)
    slight_adjustment = pm.Normal('slight_adjustment', 0, 0.0001)
    raw_with_slight_adjustment = pm.Deterministic(
        'raw_with_slight_adjustment', 
        raw_observed + slight_adjustment)
    wpp = pm.Beta(
        'wpp', alpha=alpha, beta=beta, 
        observed=raw_with_slight_adjustment) 

But this new model does raise an error when the prior predictive is sampled.
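
For concreteness, the error shows up with the usual prior predictive call, something like:

with model:
    prior = pm.sample_prior_predictive()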

Is using a variable as observed for another variable just a bad idea? Or is this a sensible feature that just has not yet been implemented?

PyMC v3.9.3

It is something I am still trying to understand intuitively, but the consensus seems to be that we should treat our data as completely deterministic, and any other kind of noise should be modeled explicitly outside the likelihood.

Within the PyMC3 API in particular, a Deterministic should definitely not be used as observed; that it works at all is just an edge case that is not checked for.

Does this achieve the same goal?

raw_observed = 0.3
with pm.Model() as model:
    alpha = pm.Exponential('alpha', 0.2)
    beta = pm.Exponential('beta', 0.2)
    # model the "true" value as latent, and the data as a noisy observation of it
    latent_raw = pm.Beta('latent_raw', alpha=alpha, beta=beta)
    wpp = pm.Normal('wpp', mu=latent_raw, sigma=0.0001, observed=raw_observed)

Just to echo Ricardo’s point, I can’t offhand think of a conventional model (GLM, time series, etc.) where you’d want to evidence against an RV: you’d typically estimate that variance explicitly. That said, I have a nagging doubt that this could be useful as part of a belief network, to evidence at different nodes for message passing…

To @ricardoV94’s point, this simple example could be reparameterized without the RV in the observed. But my actual model is both much more complex, and not a conventional model. I was attempting to reparameterize it around another issue, and hit upon the ill-advised idea of using an RV in the observed, only to discover that it is not possible in pymc3.

I have two ideas on how to trick PyMC3 into using a random variable as observed; I am not sure it’s a good idea though. It may also be important to note that if PyMC3 were to allow a random variable as observed, that would break after sampling when trying to convert to ArviZ and compute convergence checks.

Option 1

Use a pm.Potential. I believe something like pm.Potential("wpp", pm.Beta.dist(alpha=alpha, beta=beta).logp(raw_with_slight_adjustment)) will add the right term to the joint probability.
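
In the original model, that could look something like this sketch (variable names taken from the first post):

raw_observed = 0.3
with pm.Model() as model:
    alpha = pm.Exponential('alpha', 0.2)
    beta = pm.Exponential('beta', 0.2)
    raw_with_slight_adjustment = pm.Normal(
        'raw_with_slight_adjustment', raw_observed, 0.0001)
    # Potential adds an arbitrary term to the joint log-probability; here it
    # plays the role the observed Beta likelihood would have played
    wpp = pm.Potential(
        'wpp',
        pm.Beta.dist(alpha=alpha, beta=beta).logp(raw_with_slight_adjustment))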

Option 2

Use the cursed DensityDist. There are at least 5 ways to do that with a DensityDist, 3 of which should work in this situation; as far as I know, all 5 should work when using proper observed values (and many won’t need the density_dist_obs thingy when doing so). Remember to use idata_kwargs={"density_dist_obs": False} (see the sampling sketch after 2.5).

2.1
DensityDist(
    "wpp", 
    pm.Beta.dist(alpha=alpha, beta=beta).logp, 
    observed=raw_with_slight_adjustment
)

This won’t work: when observed is not a dict, DensityDist runs the same check all other distributions do and doesn’t allow FreeRVs to be passed.

2.2
DensityDist(
    "wpp", 
    pm.Beta.dist(alpha=alpha, beta=beta).logp, 
    observed={"value": raw_with_slight_adjustment}
)

This should now trigger the dict checks and avoid the FreeRV one. I still don’t know whether this is a feature or a bug; at the very least it feels wrong to me. One can see in the docs that the logp method of Beta takes a value argument, so that part is public knowledge; what I don’t think is documented anywhere official is the array vs dict input trick. I have lost count of how many discourse posts and issues I have answered explaining that and saying density_dist_obs=False should be used.

2.3
DensityDist(
    "wpp", 
    lambda a, b, raw: pm.Beta.dist(alpha=a, beta=b).logp(raw), 
    observed={"raw": raw_with_slight_adjustment, "a": alpha, "b": beta}
)
2.4
def logp_fun(raw):
    return pm.Beta.dist(alpha=alpha, beta=beta).logp(raw)
pm.DensityDist("wpp", logp_fun, observed=raw_with_slight_adjustment)

Again, this won’t work because the observed is not a dict.

2.5
def logp_fun(raw):
    return pm.Beta.dist(alpha=alpha, beta=beta).logp(raw)
pm.DensityDist(
    "wpp", logp_fun, observed={"raw": raw_with_slight_adjustment}
)
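
With any of the dict-based variants that work, sampling would then look something like this sketch:

with model:
    idata = pm.sample(
        return_inferencedata=True,
        # keep ArviZ from trying to store the dict-valued observed data
        idata_kwargs={"density_dist_obs": False},
    )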

Disclaimer: I don’t have the slightest idea which of these approaches is the one DensityDist was designed for, whether there are any differences between them when it comes to performance, nor whether one of these (or a 6th alternative) should be the “recommended” approach. I personally prefer the Potential option, but this opinion is based only on style and conceptual coherence in the use of observed.

Good idea with the Potential. I always forget about it! The only issue is that you can’t do (automatic) predictive sampling that way.

@OriolAbril: Like @ricardoV94, I always forget about Potential. And I did not even know about DensityDist. Will try both. Meanwhile marked as solved, betting the odds.

You know, in light of @OriolAbril’s answer, I must not have interpreted your question properly and/or was not thinking straight!

You can indeed use a Potential in this situation, and I actually have, quite recently too. It was not quite the same situation, but I used a distribution to transform observed data and then evidenced that transformed data against another RV, as part of a copula transform.
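
A minimal sketch of that pattern (hypothetical names, and ignoring the Jacobian correction a real copula transform would need):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

raw = np.array([0.2, 0.5, 0.7])  # observed data on (0, 1)

with pm.Model() as sketch:
    mu = pm.Normal('mu', 0.0, 1.0)
    # transform the observed data using a model parameter...
    z = tt.log(raw / (1 - raw)) - mu
    # ...then evidence the transformed values against another distribution
    pm.Potential('z_logp', pm.Normal.dist(0.0, 1.0).logp(z).sum())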

So +1 for Potential 🙂
