How do I implement an upper limit log normal distribution

grinsted · June 7, 2018, 11:00am

The upper limit log normal (ULLN) distribution is similar to a lognormal distribution but with an extra parameter for the upper limit (ymax). I am having difficulty implementing it in pymc3 (i am new to pymc3). This is my attempt at coding it, but it does not work as expected.

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm

## generate surrogate data
sigma=1
mu=0
ymax=5
q = np.exp(np.random.randn(500)*sigma+mu)
y = (q*ymax)/(q + ymax)

plt.hist(np.log(y))
plt.xlabel('log surrogate')


## setup model

ullnmodel = pm.Model()
yobsmax = np.amax(y)

with ullnmodel:
    ymax = pm.Pareto('ymax',alpha=1,m=yobsmax)
    
    mu = pm.Normal('mu', mu=0, sd=5)
    sigma = pm.Lognormal('sigma', mu=0, sd=5)

    qt =  pm.math.log(y*ymax) - pm.math.log(ymax  - y)
    q = pm.Normal('q', mu=mu, sd = sigma, observed = qt)
    

    
startpoint = {'mu': np.mean(np.log(y)), 'sigma': np.std(np.log(y)), 'ymax': yobsmax*2.0}
map_estimate = pm.find_MAP(model=ullnmodel,start=startpoint)
map_estimate

junpenglao · June 7, 2018, 11:26am

Your use case is more similar to Truncated Inverse normal distribution (also known as Wald distribution). You can have a look at the discussion there which is quite in-depth.

In general, if you are transforming the observed variable, you need to make sure to adjust the volume changes by adding the jacobian. In your code above, pt ~ Normal(mu, sd) does not imply y ~ULLN(exp(mu), sd)

grinsted · June 7, 2018, 12:52pm

I’ve read the Wald discussion, but I don’t fully understand how it relates to my problem. If i further simplify then this is what i am trying to build in pseudocode:

theta ~ Normal(mu=0,sd=1) # a prior on theta
q = f(y, theta) #a function that makes a nonlinear transformation of my observations (y) which depends on theta
q ~ Normal(mu=0,sd=1)

Does anybody know of any other examples where something similar is done?

junpenglao · June 7, 2018, 1:16pm

Your model in the main post is the same as the pseudocode. But you might find that the estimated parameter theta is not correct, that’s because the volume change in the forward and backward transformation is not accounted for.
You can have a look at an example here: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/Miscellaneous/Regression%20with%20a%20twist.ipynb
where f(y, theta) is a linear function. In the notebook, model2_ and model2 is quite similar, but only the output from model2 that accounted for the volume change is correct.

grinsted · June 7, 2018, 4:25pm

Thank you. This is super helpful (and a great example). I hope I can figure it out now

grinsted · June 8, 2018, 8:32am

For anybody stumbling on this thread. Here’s a link with some reading material explaining why this is needed. https://tpapp.github.io/post/jacobian-chain/

grinsted · June 8, 2018, 8:45am

I have just implemented it. I post it here for other people to learn from. Please correct me if I have misunderstood something.

In my case i have:

f(y) = log(ymax*y/(ymax-y))
f(y) ~ Normal(mu,sigma)
det(log(J)) = log(df/dy) = log(-ymax/(y*(y - ymax)))

which I turn into this code:

import numpy as np
import pymc3 as pm

## generate surrogate data
sigma=1
mu=0
ymax=5
q = np.exp(np.random.randn(500)*sigma+mu)
y = (q*ymax)/(q + ymax)

## setup model

ullnmodel = pm.Model()
yobsmax = np.amax(y)

with ullnmodel:
    ymax = pm.Pareto('ymax',alpha=1,m=yobsmax)
    
    mu = pm.Normal('mu', mu=0, sd=5)
    sigma = pm.Lognormal('sigma', mu=0, sd=5)
    
    qt =  pm.math.log(y*ymax) - pm.math.log(ymax  - y)
    q = pm.Normal('q', mu=mu, sd = sigma, observed = qt)
    pm.Potential('jacob_det',pm.math.log(-ymax/(y*(y - ymax))))
    
startpoint = {'mu': np.mean(np.log(y)), 'sigma': np.std(np.log(y)), 'ymax': 5}
map_estimate = pm.find_MAP(model=ullnmodel,start=startpoint)
map_estimate

Topic		Replies	Views
Truncated lognormal observations Questions	1	815	February 19, 2021
Sampling with a general lognormal model Questions	9	936	December 3, 2019
Truncated Normal Questions	8	1899	January 29, 2021
Truncated Inverse normal distribution (also known as Wald distribution) Questions	15	2899	April 27, 2018
The Normal distribution with upper and lower bound Questions	2	1775	December 14, 2017

How do I implement an upper limit log normal distribution

Related topics