Cannot sample with DEMetropolis* when using pm.Constant

I’m working with the ABC example provided on pymc.io and have tried the following combinations of changes:

  • Changing from the default “Slice”/CompoundStep sampler to DEMetropolis or DEMetropolisZ (Works)
  • Changing the m parameter from a Uniform prior to pm.Constant("m", c = 0.4) (Works)
  • Changing the m parameter from a Uniform prior to pm.Constant("m", c = 0.4) and using step = DEMetropolis() or step = DEMetropolisZ() (DOES NOT WORK)

Here is the full script that reproduces the problem:
import numpy as np
import pymc as pm
import aesara.tensor as at
import arviz as az
from matplotlib import pyplot

# define an Aesara Op for our likelihood function
class LogLike(at.Op):
    """
    Specify what type of object will be passed to and returned by the Op when
    it is called. In our case we will be passing it a vector of values (the
    parameters that define our model) and returning a single "scalar" value
    (the log-likelihood).
    """

    itypes = [at.dvector]  # expects a vector of parameter values when called
    otypes = [at.dscalar]  # outputs a single scalar value (the log likelihood)

    def __init__(self, loglike, data, x, sigma):
        """
        Initialise the Op with various things that our log-likelihood function
        requires. Below are the things that are needed in this particular
        example.

        Parameters
        ----------
        loglike:
            The log-likelihood (or whatever) function we've defined
        data:
            The "observed" data that our log-likelihood function takes in
        x:
            The independent variable (aka 'x') that our model requires
        sigma:
            The noise standard deviation that our function requires.
        """

        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data
        self.x = x
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        (theta,) = inputs  # this will contain my variables

        # call the log-likelihood function
        logl = self.likelihood(theta, self.x, self.data, self.sigma)

        outputs[0][0] = np.array(logl)  # output the log-likelihood


def main():
    def my_model(theta, x):
        m, c = theta
        return m * x + c

    def my_loglike(theta, x, data, sigma):
        model = my_model(theta, x)
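        # Gaussian log-likelihood up to an additive constant; the normalising
        # term is dropped because it does not affect the sampler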
        return -(0.5 / sigma**2) * np.sum((data - model) ** 2)

    # set up our data
    N = 10  # number of data points
    sigma = 1.0  # standard deviation of noise
    x = np.linspace(0.0, 9.0, N)

    mtrue = 0.4  # true gradient
    ctrue = 3.0  # true y-intercept

    truemodel = my_model([mtrue, ctrue], x)

    # make data
    rng = np.random.default_rng(716743)
    data = sigma * rng.normal(size=N) + truemodel

    # create our Op
    logl = LogLike(my_loglike, data, x, sigma)

    # use PyMC to sample from the log-likelihood
    with pm.Model():
        # uniform priors on m and c
        # m = pm.Uniform("m", lower=-10.0, upper=10.0)
        m = pm.Constant("m", c = 0.4)
        c = pm.Uniform("c", lower=-10.0, upper=10.0)

        # convert m and c to a tensor vector
        theta = at.as_tensor_variable([m, c])

        # use a Potential to "call" the Op and include it in the logp computation
        pm.Potential("likelihood", logl(theta))

        step_func = pm.DEMetropolisZ(vars = [m, c])

        # use a custom number of draws to replace the HMC-based defaults
        trace = pm.sample(
            draws = 3000, step = step_func,
            chains = 4, return_inferencedata = True)

    # plot the traces
    az.plot_trace(trace, lines=[("m", {}, mtrue), ("c", {}, ctrue)])
    pyplot.show()


if __name__ == '__main__':
    main()

The output when using DEMetropolis* and a Constant parameter is:

Multiprocess sampling (4 chains in 4 jobs)
DEMetropolisZ: [c, m]
Sampling 4 chains for 1_000 tune and 3_000 draw iterations (4_000 + 12_000 draws total) took 4 seconds.
/usr/local/python-env/pymc4/lib/python3.8/site-packages/arviz/stats/diagnostics.py:586: RuntimeWarning: invalid value encountered in double_scalars
  (between_chain_variance / within_chain_variance + num_samples - 1) / (num_samples)
/usr/local/python-env/pymc4/lib/python3.8/site-packages/arviz/stats/density_utils.py:491: UserWarning: Your data appears to have a single value or no finite values
  warnings.warn("Your data appears to have a single value or no finite values")
/usr/local/python-env/pymc4/lib/python3.8/site-packages/arviz/stats/density_utils.py:491: UserWarning: Your data appears to have a single value or no finite values
  warnings.warn("Your data appears to have a single value or no finite values")

The resulting trace plot shows both parameters stuck at their initial values (not just m), and inspecting the posterior in the inference data confirms that the same initial value is stored at every draw for both parameters.

Versions and main components

  • PyMC/PyMC3 Version: 4.0.0b6
  • Aesara/Theano Version: 2.5.1
  • Python Version: 3.8.10
  • Operating system: Ubuntu GNU/Linux 20.04
  • How did you install PyMC/PyMC3: pip

pm.Constant requires a clever sampler, because it can only ever propose one value. Usually you should not have to use it; instead, add the constant values directly in whatever distribution needs them.
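
A minimal sketch of what I mean (toy data and names purely for illustration):

import numpy as np
import pymc as pm

# toy data generated with a known fixed mean of 0.0
data = np.random.default_rng(0).normal(loc=0.0, scale=1.0, size=10)

with pm.Model():
    # rather than wrapping the fixed mean as mu = pm.Constant("mu", c=0.0),
    # write the number directly into the distribution that needs it
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y", mu=0.0, sigma=sigma, observed=data)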

That’s a bummer. Hard-coding the constant values would require changing the model API, which limits experimentation. Unless it’s possible to do something like this:

m = pm.Constant('m', c = 0.4)
theta = at.as_tensor_variable([m, c]) # Instead of this

theta = at.as_tensor_variable([0.4, c]) # Do this?

This does seem to work (in this toy example), so maybe I don’t have to change my API. Both pm.Constant and pm.DiscreteUniform did work with DEMetropolis* samplers in v3.11.5 (numerical issues notwithstanding), so I was caught off-guard here. But thanks for your help!
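
For what it’s worth, I could keep a single entry point by branching on an optional argument instead of editing the model body each time. A rough sketch (build_model and m_fixed are just names I made up; it reuses the logl Op and the imports from the script above):

def build_model(logl, m_fixed=None):
    # logl is the LogLike Op instance created earlier; pm and at as imported above
    with pm.Model() as model:
        # free Uniform prior on m unless a fixed value is supplied
        m = pm.Uniform("m", lower=-10.0, upper=10.0) if m_fixed is None else m_fixed
        c = pm.Uniform("c", lower=-10.0, upper=10.0)
        theta = at.as_tensor_variable([m, c])
        pm.Potential("likelihood", logl(theta))
    return model

# model = build_model(logl)               # m free, both m and c sampled
# model = build_model(logl, m_fixed=0.4)  # m fixed, only c is sampled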

Yes, that is perfectly valid.