Low discrepancy sampler

development

#1

Hi,

pymc3.sampling.sample draws from distributions using plain Monte Carlo.
Do you plan to introduce some low discrepancy sequences, such as Sobol', Halton, or even Latin Hypercube Sampling (LHS)?

I am willing to contribute :slight_smile:

Cheers

Tupui


#2

I don't think there is any plan to introduce low discrepancy sequences in PyMC3. Do you know any good references on using low discrepancy sequences in Bayesian statistics?


#3

Yeah, if you want to construct a surrogate model using a GP, this is standard procedure in today's literature, especially when the function is expensive to evaluate.

You can check out this paper, for example. It is written by well-known people in the field and has lots of references.

http://arxiv.org/abs/1704.07090

You can also look at the Handbook of Uncertainty Quantification.


#4

@junpenglao For the record, there is one package which does all of that, OpenTURNS (LGPL):

http://openturns.org

But it is a bit overkill to use it just for design of experiments.


#5

Thanks a lot @tupui! I will have a look.


#6

If you are interested I have done some work here (MIT licensed): https://gist.github.com/tupui/cea0a91cc127ea3890ac0f002f887bae

I can open a PR or anything else if needed :slight_smile:


#7

Hi @tupui, sorry, I still don't quite get how to make use of these random sequences for sampling from a distribution. Do you have more examples in this regard?


#8

If you want to sample from a distribution (or a joint distribution), you just apply the inverse transformation (the inverse CDF) to the uniform sample generated by the low discrepancy sequence.

Using my code, here is an example with SciPy:

import matplotlib.pyplot as plt
from scipy import stats

# 200 points of a 2D Halton sequence, uniform on [0, 1]^2
# (halton() comes from the gist linked above)
x = halton(2, 200)

plt.figure()
plt.plot(x[:, 0], x[:, 1], '+')
plt.title('uniform distribution')

# map the uniform sample through the inverse CDF (ppf)
plt.figure()
xn = stats.norm.ppf(x)
plt.plot(xn[:, 0], xn[:, 1], '+')
plt.title('normal distribution')

plt.figure()
plt.plot(stats.t.ppf(x[:, 0], df=3), stats.t.ppf(x[:, 1], df=3), '+')
plt.title('t-distribution')
plt.show()

This way you can see that we can sample from these distributions, and the estimates will converge faster than with pure MC.
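To make the faster convergence concrete, here is a minimal, self-contained sketch (the `halton` helper below is a standalone re-implementation based on the radical inverse, not the code from the gist): it estimates E[x*y] for independent uniforms on [0, 1]^2, whose exact value is 0.25, with plain MC and with a Halton sequence, and compares the absolute errors.

```python
import numpy as np

def van_der_corput(n, base):
    """First n terms of the radical-inverse (van der Corput) sequence."""
    seq = np.zeros(n)
    for i in range(n):
        k, f, r = i + 1, 1.0, 0.0
        while k > 0:
            f /= base
            r += f * (k % base)
            k //= base
        seq[i] = r
    return seq

def halton(dim, n):
    """First n points of the Halton sequence in [0, 1)^dim."""
    primes = [2, 3, 5, 7, 11, 13]
    return np.column_stack([van_der_corput(n, primes[d]) for d in range(dim)])

n = 1024
exact = 0.25  # E[x*y] for independent uniforms on [0, 1]

# plain Monte Carlo estimate
rng = np.random.default_rng(0)
x_mc = rng.random((n, 2))
err_mc = abs(np.mean(x_mc[:, 0] * x_mc[:, 1]) - exact)

# quasi-Monte Carlo estimate with the same number of points
x_qmc = halton(2, n)
err_qmc = abs(np.mean(x_qmc[:, 0] * x_qmc[:, 1]) - exact)

# in most runs the Halton error is noticeably smaller
print(err_mc, err_qmc)
```

MC error shrinks like 1/sqrt(n), while the low discrepancy error shrinks roughly like (log n)^d / n, which is why the Halton estimate is usually much closer for the same budget.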


#9

@junpenglao For info, I have done a PR with this here: https://github.com/statsmodels/statsmodels/pull/4104#issuecomment-341980641


#10

Thanks for the update. I am still thinking about this, i.e., how to make use of it to sample from the logp.