Significant digits of pm.stats.hpd

Hi PyMC3,
Is there a principled way to determine the precision of the HPD estimate? I don't mean the posterior uncertainty of the parameters themselves, but the number of significant digits in the interval endpoints.

Here’s a minimal example.

import numpy as np
import pymc3 as pm

sample = np.random.randn(100)
with pm.Model() as example:
    mu_prior = pm.Normal('mu_prior', mu=0, sigma=100)
    # sigma must be positive, so use a half-normal prior
    sigma_prior = pm.HalfNormal('sigma_prior', sigma=100)

    likelihood = pm.Normal('lik', mu_prior, sigma_prior, observed=sample)

    example_trace = pm.sample(1000)
print(pm.stats.hpd(example_trace['mu_prior']))

That returns:

array([-0.22033622,  0.17845608])

which simply reports the full precision of the float64 data type, not the precision actually supported by the samples.

Thank you! :slight_smile:

Perhaps one way to achieve this is to perform the estimate multiple times and identify which digits change across the HPD estimates. This can be approximated with a bootstrap, e.g.:

data = example_trace['mu_prior']
for i in range(10):
    bstrap = np.random.choice(data, data.shape[0])
    print(pm.stats.hpd(bstrap))

gives:

[-0.17369228  0.20168696]
[-0.17219421  0.20168696]
[-0.17386337  0.20208916]
[-0.17709522  0.2024287 ]
[-0.17709522  0.2024287 ]
[-0.15893305  0.21802071]
[-0.15893305  0.21916593]
[-0.17386337  0.20050872]
[-0.17709522  0.19793949]
[-0.1675457   0.20622163]

so the HPD, reported only to its stable digits, would be -0.2 to 0.2.
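The "which digits are stable" check can be automated. Here is a small helper (my own sketch, not part of PyMC3) that rounds the bootstrap replicates of one endpoint to fewer and fewer decimal places until they all agree:

```python
import numpy as np

def stable_decimals(estimates, max_digits=6):
    """Number of decimal places at which all bootstrap replicates
    of an estimate agree -- a rough precision heuristic."""
    estimates = np.asarray(estimates)
    # Try the most precise rounding first; the first level at which
    # every replicate rounds to the same value is the stable one.
    for d in range(max_digits, -1, -1):
        if np.all(np.round(estimates, d) == np.round(estimates[0], d)):
            return d
    return 0
```

Applied separately to the column of lower bounds and the column of upper bounds from the bootstrap loop above, this would report 1 stable decimal place for each, matching the -0.2 to 0.2 read-off.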

This matches intuition, in that if you repeat the example code above with 5000 samples instead of 1000, you get more significant figures (this run uses a fresh random sample, so the estimate differs slightly):

[-0.26531642  0.1494834 ]
[-0.26617875  0.14188749]
[-0.26580921  0.14320279]
[-0.26718867  0.14182161]
[-0.26922152  0.14190669]
[-0.26713285  0.14182161]
[-0.26713907  0.14326737]
[-0.26635723  0.1480868 ]
[-0.26594888  0.14637803]
[-0.26619511  0.14885934]

This time you get 2 significant figures: -0.27 to 0.14.
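A related framing: treat the standard deviation of the bootstrap replicates as a Monte Carlo standard error for each interval endpoint, and report each endpoint only down to the digit where that error lives. The sketch below is self-contained: `shortest_interval` is my own re-implementation of the shortest-interval idea behind `pm.stats.hpd` for a unimodal sample, and the 94% credible mass is an assumption (PyMC3 versions differ on the default).

```python
import numpy as np

rng = np.random.default_rng(0)
draws = rng.normal(size=5000)  # stand-in for example_trace['mu_prior']

def shortest_interval(x, cred=0.94):
    """Shortest interval containing `cred` of the draws
    (the same idea pm.stats.hpd uses for a unimodal sample)."""
    x = np.sort(x)
    n_in = int(np.floor(cred * len(x)))
    # Width of every candidate interval spanning n_in + 1 sorted draws.
    widths = x[n_in:] - x[:len(x) - n_in]
    i = np.argmin(widths)
    return x[i], x[i + n_in]

# Bootstrap the interval; the spread of the replicates is a
# Monte Carlo standard error for each endpoint.
reps = np.array([shortest_interval(rng.choice(draws, draws.size))
                 for _ in range(200)])
se_lower, se_upper = reps.std(axis=0)
print(shortest_interval(draws), se_lower, se_upper)
```

The bootstrap spread should shrink roughly like 1/sqrt(n) as the number of posterior draws grows, which is consistent with the extra digit gained in the 5000-sample run above.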