Low discrepancy sampler

development

#1

Hi,

pymc3.sampling.sample draws from distributions using plain Monte Carlo.
Do you plan to introduce some low discrepancy sequences, such as Sobol', Halton, or even Latin Hypercube Sampling (LHS)?

I am willing to contribute :slight_smile:

Cheers

Tupui


#2

I don't think there is any plan to introduce low discrepancy sequences in PyMC3. Do you know any good references on using low discrepancy sequences in Bayesian statistics?


#3

Yeah, if you want to construct a surrogate model using a GP, this is standard procedure in today's literature, especially when the function is expensive to evaluate.

You can check out this paper, for example. It is written by well-known people in the field and has lots of references.

http://arxiv.org/abs/1704.07090

You can also look at the Handbook of Uncertainty Quantification.


#4

@junpenglao For the record, there is one package which does all of that, OpenTURNS (LGPL):

http://openturns.org

But it is a bit overkill to use it just for design of experiments.


#5

Thanks a lot @tupui! I will have a look.


#6

If you are interested I have done some work here (MIT licensed): https://gist.github.com/tupui/cea0a91cc127ea3890ac0f002f887bae

I can open a PR or anything else if needed :slight_smile:


#7

Hi @tupui, sorry, I still don't quite get how to make use of these random sequences for sampling from a distribution. Do you have more examples in this regard?


#8

If you want to sample from a distribution (or a joint distribution), you just apply the inverse transformation (the inverse CDF) to the uniform sample generated by the low discrepancy sequence.

Using my code, here is an example with SciPy:

import matplotlib.pyplot as plt
from scipy import stats

# 200 points of a 2D Halton sequence, uniform on [0, 1]^2
# (halton() comes from the gist linked above)
x = halton(2, 200)

plt.figure()
plt.plot(x[:, 0], x[:, 1], '+')
plt.title('uniform distribution')

# map the uniform sample through the inverse CDF (ppf)
plt.figure()
xn = stats.norm.ppf(x)
plt.plot(xn[:, 0], xn[:, 1], '+')
plt.title('normal distribution')

plt.figure()
plt.plot(stats.t.ppf(x[:, 0], df=3), stats.t.ppf(x[:, 1], df=3), '+')
plt.title('t-distribution')
plt.show()

This way you can see that we can sample from these distributions, and the estimates will converge faster than with pure MC.
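To make the faster convergence concrete, here is a minimal, self-contained sketch (the `halton` helper below is a standalone re-implementation based on the radical inverse, not the code from the gist): it estimates E[x*y] for independent uniforms on [0, 1]^2, whose exact value is 0.25, with plain MC and with a Halton sequence, and compares the absolute errors.

```python
import numpy as np

def van_der_corput(n, base):
    """First n terms of the radical-inverse (van der Corput) sequence."""
    seq = np.zeros(n)
    for i in range(n):
        k, f, r = i + 1, 1.0, 0.0
        while k > 0:
            f /= base
            r += f * (k % base)
            k //= base
        seq[i] = r
    return seq

def halton(dim, n):
    """First n points of the Halton sequence in [0, 1)^dim."""
    primes = [2, 3, 5, 7, 11, 13]
    return np.column_stack([van_der_corput(n, primes[d]) for d in range(dim)])

n = 1024
exact = 0.25  # E[x*y] for independent uniforms on [0, 1]

# plain Monte Carlo estimate
rng = np.random.default_rng(0)
x_mc = rng.random((n, 2))
err_mc = abs(np.mean(x_mc[:, 0] * x_mc[:, 1]) - exact)

# quasi-Monte Carlo estimate with the same number of points
x_qmc = halton(2, n)
err_qmc = abs(np.mean(x_qmc[:, 0] * x_qmc[:, 1]) - exact)

# in most runs the Halton error is noticeably smaller
print(err_mc, err_qmc)
```

MC error shrinks like 1/sqrt(n), while the low discrepancy error shrinks roughly like (log n)^d / n, which is why the Halton estimate is usually much closer for the same budget.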


#9

@junpenglao For info, I have done a PR with this here: https://github.com/statsmodels/statsmodels/pull/4104#issuecomment-341980641


#10

Thanks for the update. I am still thinking about this, i.e., how to make use of it to sample from the logp.