Dealing with negative values and the exponential distribution

Hi everyone,

a very newbie question:

I’m looking to model time between events with a relatively large range (~1-6000 seconds). The empirical data is well described by an exponential distribution, e.g.:

with pm.Model() as model0:

    a = pm.Gamma("a", mu=0.1, sigma=0.5)
    sec = pm.Exponential("sec", a, observed=d0.tdeltas)
    trace0 = pm.sample(1000, tune=1000)

I run into issues when I want to model additional effects, for example an effect of time. I have ~10,000 sequential observations, which I turn into a ‘time’ variable as follows (I realize this can be done more accurately later by using the actual time between events):

time = np.arange( len(d0) )
time = (time - time.mean()) / time.std()

As the effect of time could be positive or negative, I wanted to use a Gaussian:

with pm.Model() as model0:

    a = pm.Gamma("a", mu=0.1, sigma=0.5)
    b = pm.Normal("b", mu=0, sd=1)
    lam = pm.Deterministic("lam", a + b*time)
    sec = pm.Exponential("sec", lam, observed=d0.tdeltas)
    
    trace0 = pm.sample(1000, tune=1000)

This results in zero division errors and “initial evaluation errors” like so:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'a_log__': array(-1.620602, dtype=float32), 'b': array(0.9321959, dtype=float32)}

Initial evaluation results:
a_log__   -3.38
b         -1.35
sec        -inf
Name: Log-probability of test_point, dtype: float32

I assume this has to do with the fact that some samples could result in negative (and thus invalid) values for lam? I haven’t found a solution yet going through the documentation/forums and would be very thankful for any pointers!


Yes, that is correct. The exponential distribution requires lam to be a positive quantity, and your b parameter can be initialized at a negative value, which can push a + b*time below zero for some observations.
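
One direct fix would be to keep the linear predictor on the log scale, so that lam stays positive no matter which value b takes. A rough, untested sketch reusing your time and d0.tdeltas (the priors are placeholders you would want to adjust):

with pm.Model() as model_loglink:
    a = pm.Normal("a", mu=0, sd=1)
    b = pm.Normal("b", mu=0, sd=1)
    # exp() keeps the rate positive even when a + b*time is negative
    lam = pm.Deterministic("lam", pm.math.exp(a + b*time))
    sec = pm.Exponential("sec", lam, observed=d0.tdeltas)
    trace_loglink = pm.sample(1000, tune=1000)

With this parameterization the expected waiting time is 1/lam = exp(-(a + b*time)), so a positive b corresponds to shorter intervals later in the sequence.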

An alternative approach that comes to mind would be to log your data and fit a linear model to it. Perhaps something like:

with pm.Model() as model1:
    a = pm.Gamma("a", mu=0.1, sigma=0.5)  # maybe use a Normal prior here?
    b = pm.Normal("b", mu=0, sd=1)
    lam = pm.Deterministic("lam", a + b*time)
    # note: the likelihood sd defaults to 1 log-unit here; you may want an explicit noise prior
    sec = pm.Normal("sec", lam, observed=np.log(d0.tdeltas))

    trace1 = pm.sample(1000, tune=1000)
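
If you go this route, the posterior predictive draws come out in log-seconds, so you would exponentiate them to get back to the original scale; a minimal sketch using the model1/trace1 names from the snippet above:

with model1:
    ppc = pm.sample_posterior_predictive(trace1)

# draws are log-seconds; exponentiate to return to seconds
sec_pred = np.exp(ppc["sec"])
print("predicted mean (s):", sec_pred.mean())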

Thanks a lot for your suggestion! (And thanks to whoever formatted my post :))

I’ve experimented with it and I’m wondering whether it’s a problem that the resulting distribution (after taking the log) is not really Gaussian?

I’ve been playing around with a simplified version to understand log-transformations:

with pm.Model() as model0:
    a = pm.Normal("a", mu=6, sd=1)
    b = pm.Exponential("b", 1/2)
    sec = pm.Normal("sec", mu=a, sd=b, observed=np.log(d0.tdeltas+1e-8))
    
    trace0 = pm.sample(1000, tune=2000)

The empirical mean is close to 600, but I can’t recover this accurately:

with model0:
    ppc = pm.sample_posterior_predictive(trace0)
    
az.plot_ppc(az.from_pymc3(posterior_predictive=ppc, model=model0));
print("mean=", np.exp(ppc["sec"]).mean())
mean= 734.969762489091


(Not sure why my posterior mean is larger than the empirical mean - the ppc plot seems to indicate the opposite.)

Anyway - I just wanted to check whether I’m misinterpreting something or whether this is an unavoidable downside of transforming my data?
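
Regarding the mean discrepancy: if the log of the data is modelled as Normal with mean mu and standard deviation sigma, the implied mean on the original scale is exp(mu + sigma^2/2), which is larger than exp(mu) whenever sigma > 0. A quick numeric check with purely hypothetical values:

import numpy as np

mu, sigma = 6.0, 1.0  # hypothetical posterior values for the log-scale model
print(np.exp(mu))                 # back-transformed mean of the logs: ~403
print(np.exp(mu + sigma**2 / 2))  # implied mean on the original scale: ~665

So a back-transformed posterior mean sitting above the empirical mean is not necessarily a sign that something went wrong; the log-scale model targets the mean of the logs, not the mean of the raw seconds.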

I would suggest looking at either Poisson regression or negative binomial regression for inspiration/ideas. These models are traditionally used to model counts and other non-negative outcome variables (e.g., the length of inter-event intervals). In Poisson regression, you construct a linear expression (e.g., $\alpha + \beta X$) and the (canonical) link function is an exponential (so $E[Y]=e^{\alpha + \beta X}$, or $\ln(E[Y])=\alpha + \beta X$, which is similar to the suggestion by @larryshamalama), which forces the expectation to be non-negative. However, as you note, the transformation warps things (e.g., a prior over $\beta$ that is normally distributed in the linear expression will not “look” normal in the transformed space). But hopefully that gives you some resources to check out!
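
For a concrete starting point, here is a minimal Poisson-regression sketch with the canonical log link, using synthetic X and y purely for illustration:

import numpy as np
import pymc3 as pm

# synthetic data, purely for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=200)
y = rng.poisson(lam=np.exp(0.5 + 0.3*X))

with pm.Model() as poisson_glm:
    alpha = pm.Normal("alpha", mu=0, sd=1)
    beta = pm.Normal("beta", mu=0, sd=1)
    # canonical log link: ln(E[Y]) = alpha + beta*X, so E[Y] = exp(alpha + beta*X) > 0
    mu = pm.math.exp(alpha + beta*X)
    y_obs = pm.Poisson("y_obs", mu=mu, observed=y)
    trace_glm = pm.sample(1000, tune=1000)

The same log-link idea carries over to a NegativeBinomial likelihood by adding a dispersion parameter.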


Thank you for the suggestions; I’m going to look into them!

Much appreciated.
