Exponential likelihood at different location

Hi!

New to this discourse :slight_smile: I have searched for similar questions but couldn’t find one (apologies if I overlooked something).

My problem is the following (also described in somewhat more detail here). I have a linear function that I would like to recover, but the “noise” on the data is exponential, only going upwards from the regression line I am after: y = intercept + slope·x + ε with ε ~ Exponential(λ), so every residual is non-negative.
[Figure: Regression_exponential_likelihood]
The red line is the line to be recovered.

Usually (read: with e.g. a Normal likelihood) I can do something like

import pymc3 as pm

with pm.Model() as linreg:
    # priors for the noise scale and the line parameters
    sigma = pm.HalfCauchy('sigma', beta=10, testval=1.)
    intercept = pm.Normal('Intercept', 0, sigma=20)
    x_coeff = pm.Normal('Slope', 0, sigma=20)

    # Normal likelihood centered on the regression line
    likelihood = pm.Normal('y', mu=intercept + x_coeff * x,
                           sigma=sigma, observed=y)

    trace = pm.sample()

This works because the Normal distribution has a location parameter (the mean, mu). The Exponential does not have one; its support always starts at 0. I need it to start away from zero, on the regression line I am trying to find.
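For comparison, scipy’s exponential exposes exactly this kind of shift through its loc argument; a minimal illustration:

import scipy.stats

# scipy parameterizes the exponential with loc (the shift) and
# scale (= 1/lambda); all probability mass sits at or above loc
dist = scipy.stats.expon(loc=5.0, scale=2.0)
print(dist.support())  # (5.0, inf)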

I have solved it in a bit of a hacky way:

class shiftedExp(pm.Exponential):
    def __init__(self, lam=1., shift=0, *args, **kwargs):
        super().__init__(lam=lam, *args, **kwargs)
        self.shift = shift

    def logp(self, value):
        # evaluate the Exponential log-density at the shifted value
        # (note: only logp is shifted, not random() draws)
        return super().logp(value - self.shift)
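I then use shiftedExp as my likelihood, roughly like this (a sketch; the priors are the same as in the Normal example above, with lam as the rate):

with pm.Model() as shifted_linreg:
    lam = pm.HalfCauchy('lam', beta=10)
    intercept = pm.Normal('Intercept', 0, sigma=20)
    x_coeff = pm.Normal('Slope', 0, sigma=20)

    # exponential noise shifted onto the regression line
    likelihood = shiftedExp('y', lam=lam, shift=intercept + x_coeff * x,
                            observed=y)
    trace = pm.sample()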

As far as I can tell this works nicely, but I was wondering:

  1. Did I overlook a built-in way of doing this?
  2. Is this acceptable/recommended?
  3. Would it be worth having an optional shift (or loc, in more scipy-like lingo) parameter in the definition of the Exponential distribution? I think the changes to the code would be fairly trivial. If this sounds worthwhile to people, I’d be happy to give it a try, but I would likely need some guidance, as it would be my first contribution…

Thanks!
Marcel


Looks like a simple and clean solution to me.

@harcel I agree with @ricardoV94 that your solution is a simple and clean one. But I also agree with your third point: I’ve been wanting an offset parameter (akin to scipy’s loc) for some of the distributions, including the Exponential class, and was wondering whether this is something the developers are considering.

But to address your question: I was trying to think of a built-in way of doing this, in line with your Normal example. The solution I came up with is to include the exponentially distributed error as an additive term in the mu= of the pm.Normal() likelihood and to use a small, constant value for sigma=, as shown below.

First some test data:

import numpy as np
import scipy as sp
import scipy.stats  # makes sp.stats available
import matplotlib.pyplot as plt

# linear data with one-sided, exponentially distributed noise
x = np.arange(100)
m = 0.1
b = 10
scale = 2
err = sp.stats.expon.rvs(scale=scale, size=len(x), random_state=42)

y = m*x + b + err

plt.plot(x, y, '.')
plt.plot(x, m*x + b, '-')

[Figure: the simulated points scattered above the true line]
I model this by creating a hierarchical parameter for the exponential decay rate (lam) and then using it to generate an independent error variable err for each of your data points, like so:

sigma_vals = np.array([0.1]*len(x))

with pm.Model() as model:
    # hierarchical rate parameter for the exponential errors
    lam = pm.HalfCauchy('err_lam', beta=10)
    intercept = pm.Normal('intercept', mu=0, sigma=20)
    slope = pm.Normal('slope', mu=0, sigma=20)

    # one independent exponential error term per data point
    err = pm.Exponential('errors', lam=lam, shape=len(x))

    y_vals = slope*x + intercept + err
    # narrow Normal likelihood centered on line + exponential error
    obs = pm.Normal('y', mu=y_vals, sigma=sigma_vals, observed=y)

    samps = pm.sample(1000)

This returns the correct posteriors:

import arviz
arviz.plot_posterior(samps, var_names=['intercept', 'slope', 'err_lam']);

The downsides to this approach seem two-fold:

  1. One needs to specify an arbitrary value for sigma in the model (i.e. sigma_vals).
  2. One ends up with a bunch of extraneous variables associated with the error on each data point (i.e. err), which I have omitted in the plot_posterior call.

I’m curious if there’s anything else people think is particularly wrong-headed about this approach or if they have suggestions for improvements.


Thank you both for responding! @jstanley, I like your approach! I would worry a little about the interaction between your fixed value of sigma and the inferred posterior for err_lam, but I guess that is something that can be checked after the fact.

Either way, it’s good to hear that I’m not the only one inventing workarounds for this “issue”. As I said, I would be willing to contribute this change to pm.Exponential, and perhaps to other distributions, if the people in the lead think it is a good idea and someone more experienced is willing to keep a mentoring eye on what I do.


Yeah, I agree that you’d want to keep an eye on the interaction between the sigma value and the err_lam posterior. In principle, as sigma_vals approaches zero you recover exactly the case you are looking for, but I wonder about sampling stability for very small values of sigma.
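One way to probe this (a hypothetical sketch, reusing the x, y, and prior choices from above) is to rerun the model with progressively smaller fixed sigma values and compare the err_lam posteriors:

# rerun with shrinking fixed sigma; divergences or a drifting
# err_lam posterior would signal sampling trouble
for sig in [0.1, 0.01, 0.001]:
    sigma_vals = np.full(len(x), sig)
    with pm.Model():
        lam = pm.HalfCauchy('err_lam', beta=10)
        intercept = pm.Normal('intercept', mu=0, sigma=20)
        slope = pm.Normal('slope', mu=0, sigma=20)
        err = pm.Exponential('errors', lam=lam, shape=len(x))
        obs = pm.Normal('y', mu=slope*x + intercept + err,
                        sigma=sigma_vals, observed=y)
        samps = pm.sample(1000)
    print(sig, np.mean(samps['err_lam']))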

I am but a humble pymc3 user, but I wholeheartedly support your plan to incorporate a location parameter. I started looking at the implementation of pm.Laplace, which has essentially the same functional form (except two-sided) and does have a location parameter (mu). It seems like that could get you most of the way there with Exponential:
https://docs.pymc.io/api/distributions/continuous.html#pymc3.distributions.continuous.Laplace
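For concreteness, here is a hypothetical sketch of what a shifted Exponential logp could look like, modeled on the Laplace implementation linked above (not actual pymc3 code; mu is the proposed location parameter):

import theano.tensor as tt
from pymc3.distributions.dist_math import bound

def logp(value, lam, mu):
    # density lam * exp(-lam * (value - mu)) on value >= mu;
    # bound() returns -inf outside the support, as in pm.Laplace.logp
    return bound(tt.log(lam) - lam * (value - mu),
                 value >= mu, lam > 0)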

I’m curious what the regular devs have to say.

I suggest you guys open an issue for discussion on the GitHub repo. I don’t see any good reason not to add this functionality in general, especially if there is demand for it. Several distributions already do this, but it’s inconsistent which ones do and which ones don’t.

Thanks @ricardoV94. Just opened an issue:

I couldn’t make the suggested shiftedExp code work (pymc has changed since this thread started), but I found a CustomDist example which did work.

Just for the record:

– Malcolm

# CustomDist is documented at: https://www.pymc.io/projects/docs/en/v5.10.4/api/distributions/generated/pymc.CustomDist.html

import pymc as pm
from pytensor.tensor import TensorVariable

def shiftedExp(
    lam: TensorVariable,
    shift: TensorVariable,
    size: TensorVariable,
) -> TensorVariable:
    # a shifted Exponential; PyMC derives the logp of the sum automatically
    return pm.Exponential.dist(lam, size=size) + shift

with pm.Model() as linreg:
    lam = pm.HalfCauchy('lam', beta=10, initval=1.)
    intercept = pm.Normal('Intercept', 0, sigma=20)
    x_coeff = pm.Normal('Slope', 0, sigma=20)

    # dist parameters (lam, then the shift) are passed positionally
    likelihood = pm.CustomDist('y', lam, intercept + x_coeff * x,
                               observed=y, dist=shiftedExp)
    trace2 = pm.sample()

Yes, CustomDist was the actual solution we decided on in the end, because it’s open-ended: you can shift, scale, and transform however you want.
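For instance, a scaled and shifted Exponential is just another dist function (a sketch with made-up names; PyMC’s logprob inference handles affine transforms of base distributions):

def scaledShiftedExp(lam, shift, scale, size):
    # scale and shift an Exponential; the logp is derived automatically
    return scale * pm.Exponential.dist(lam, size=size) + shift

# used the same way, with parameters passed positionally before observed:
# obs = pm.CustomDist('y', lam, mu, 2.0, dist=scaledShiftedExp, observed=y)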