I wish to model a system that has a certain level of performance, and then various kinds of degradations.

The basic model is

``````   perf = (ax + b) * gamma
``````

The most important constraint is that the degradation satisfies 0 < gamma <= 1.0. In the ultimate model I’ll have a couple of different multiplicative degradations, each dependent on a different independent variable.

Here is a simpler example: additive instead of multiplicative, and with just a constant base. I have a base level of performance, and then that level is degraded by a ‘cognitive’ term that is positive. I don’t know how to attach observed data to anything other than a distribution, so I added a small Normal (noise) term with a very small sigma.

This feels ill-posed, although I know the answer I want. The base performance can be any value, as long as the degradation term has a hard cutoff (zero in the example below). With the real data, I want to model the degradations with an exponential falloff starting at 1.

How do I write this in PyMC?

Thanks.

– Malcolm

``````import numpy as np
import pymc as pm

n = 1000
data = np.random.uniform(size=n) * .1 + 0.7

with pm.Model() as simple_model:
    base = pm.Normal('base', mu=0.5, sigma=0.5)
    cog_state = pm.Uniform('state')     # positive degradation in [0, 1]
    degraded = base - cog_state         # base performance minus degradation
    y_pred = pm.Normal('y_pred', mu=degraded, sigma=.01, observed=data)

    idata_simple = pm.sample()
``````

I found the basic algorithm elsewhere on Discourse, and had to update it a bit. So now I have a shifted Exponential distribution, as described here: Exponential likelihood at different location. Hurray.

But I really want to flip the distribution, so I define this version, and I can sample from it.

``````import matplotlib.pyplot as plt
import pymc as pm
from pytensor.tensor import TensorVariable

def dist(lam: TensorVariable, shift: TensorVariable, size: TensorVariable) -> TensorVariable:
    return -pm.Exponential.dist(lam, size=size) + shift

with pm.Model() as m:
    lam = pm.HalfNormal("lam")
    shift = -1
    pm.CustomDist(
        "custom_dist",
        lam, shift, dist=dist,
        # observed=[-1, -1, 0],
        observed=[-2, -2, -1],  # Make sure observed data is valid (not 0 probability)
    )

    prior = pm.sample_prior_predictive()
    posterior = pm.sample()

for shift in [-1, 2, 3]:
    x = pm.CustomDist.dist(4, shift, dist=dist)
    x_draws = pm.draw(x, draws=4000)
    plt.hist(x_draws, bins=20, label=shift)
plt.legend();
``````

All is good.

My real data looks like this, adding the linear regression first.

``````import numpy as np
import matplotlib.pyplot as plt

size = 200
true_intercept = 4
true_slope = 6

x = np.linspace(0, 1, size)
# y = a + b*x
true_regression_line = true_intercept + true_slope * x
# multiplicative degradation; for np.random.exponential, rate = 1/scale
y = true_regression_line * 1 / (1 + np.random.exponential(scale=.5, size=size))

# Here's what it looks like
plt.figure(figsize=(3, 3))
plt.scatter(x, y)
plt.plot(x, true_regression_line, color='red', linewidth=3)
plt.xlabel('x'); plt.ylabel('y');
``````

But when I try to do inference it fails with: `RuntimeError: The logprob terms of the following value variables could not be derived:`

Here is the code I’m running. Why can’t it derive the logp? And how do I do it myself?

``````with pm.Model() as linreg3:
    lam = pm.HalfNormal('sigma', initval=0.0)  # , beta=10, initval=1.)
    intercept = pm.Normal('Intercept', 0, sigma=20)
    x_coeff = pm.Normal('Slope', 0, sigma=20)

    likelihood = pm.CustomDist('y', lam,
                               intercept + x_coeff * x,
                               dist=dist,
                               observed=y,
                               )
    trace2 = pm.sample()
``````

I’m confused about what I broke, and I’m not sure how to proceed.

Thanks for any help.

– Malcolm

Negating should be fine, but there are still some things that need to be fixed in the automatic logprob inference. Let me see if I can reproduce it.

I don’t see your error, only that the initial point has non-finite logp:

``````linreg3.point_logps()
# {'sigma': -inf, 'Intercept': -3.91, 'Slope': -3.91, 'y': -inf}
``````

This is however a problematic model to sample. Your priors should try to exclude impossible values, so the sampler doesn’t find them, or you should reparametrize so the data is on a more unconstrained space (like how positive data is usually modeled on a log-scale, or equivalently using an exponential link).

Thank you!!!

I assembled all the test code and results into this colab.

I took your suggestion about just the initial point failing, and I provided initial points for all the distributions. Not because this is the final solution, but it’s a quick and good test. If I start everything with the right initvals, then all is good. But if I leave one out…

I get an error I don’t understand.

``````Initial evaluation of model at starting point failed!
Starting values:
{'lam_log__': array(-0.1364851), 'Intercept': array(3.05363133), 'Slope': array(0.07561529)}

Logp initial evaluation results:
{'lam': -0.74, 'Intercept': -3.93, 'Slope': -3.91, 'y': -inf}
You can call `model.debug()` for more details.
``````

I have specified that lam is a HalfNormal. Why the !#%#% should it be trying to evaluate it (initially) at -0.74??? That is not allowed by the prior! And why doesn’t the Intercept have the initval I specified (4)?

My real model is going to add a logistic regression, so it will be harder to initialize and specify priors. So I want to understand what is going on and what is wrong with this simple model.

– Malcolm

-0.74 is not the point but the density evaluated at the initial point. The value is shifted because we jitter the initial point for each chain when starting NUTS. Tweaking the initial point will only delay the inevitable encounter with hard boundaries that screw up NUTS. It’s fine for your initial debugging of course.

OK, so the probability of the model (y?) evaluated to -inf at this starting point, even though all the prior distributions were small negative integer numbers. Right?

Would a “soft” exponential distribution mitigate that problem? There is a hard zero for half the space. That would make the derivatives small. Would MCMC be happier if the distribution was not so binary? Perhaps add a low-amplitude, wide Gaussian to the Exponential.

Is that (adding a low-amplitude Gaussian to the Exponential) something I can do in a CustomDist? Can I add two distributions? Or do I need to modify distributions.Exponential to return a small number instead of zero? What would you advise?

You also recommended tighter priors. That would keep the sampling closer to the good solutions, and reduce the chance of a -inf log probability? Is that the right model for the suggestion?

Thanks for the help.

– Malcolm

Thank you.

As you suggested, a tighter prior did mitigate the starting point failure.

I’m still curious as to whether a less drastic distribution might also help. But I’m not sure how to combine an exponential and (perhaps) a low-level Gaussian into a new distribution.

Thanks for the help.

– Malcolm

Maybe a SkewNormal gives you something like that soft exponential?

Hmm… interesting. That does seem to work. Now I have to figure out which approach works best. But that ball’s in my court!

Thank you, thank you, thank you Sir Ricardo!

– Malcolm
