Significant digits of pm.stats.hpd

Hi PyMC3,
Is there a principled way to determine the precision of the HPD estimate? I don't mean the posterior uncertainty of the parameters themselves, but the number of significant digits in the interval endpoints.

Here’s a minimal example.

import numpy as np
import pymc3 as pm

sample = np.random.randn(100)
with pm.Model() as example:
    mu_prior = pm.Normal('mu_prior', mu=0, sigma=100)
    # sigma must be positive, so use a half-normal prior
    sigma_prior = pm.HalfNormal('sigma_prior', sigma=100)

    likelihood = pm.Normal('lik', mu_prior, sigma_prior, observed=sample)

    example_trace = pm.sample(1000)
print(pm.stats.hpd(example_trace['mu_prior']))

That returns:

array([-0.22033622,  0.17845608])

which simply reports the full precision of the float64 data type, not the precision actually supported by the samples.

Thank you! :slight_smile:

Perhaps one way to achieve this is to perform the estimate multiple times and identify which digits change across the HPD estimates. This can be approximated with a bootstrap, e.g.:

data = example_trace['mu_prior']
for i in range(10):
    bstrap = np.random.choice(data, data.shape[0])
    print(pm.stats.hpd(bstrap))

gives:

[-0.17369228  0.20168696]
[-0.17219421  0.20168696]
[-0.17386337  0.20208916]
[-0.17709522  0.2024287 ]
[-0.17709522  0.2024287 ]
[-0.15893305  0.21802071]
[-0.15893305  0.21916593]
[-0.17386337  0.20050872]
[-0.17709522  0.19793949]
[-0.1675457   0.20622163]

so the HPD, reported only to its stable digits, would be -0.2 to 0.2.
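The "which digits are stable" check can be automated. Here is a small helper (my own sketch, not part of PyMC3) that rounds the bootstrap replicates of one endpoint to fewer and fewer decimal places until they all agree:

```python
import numpy as np

def stable_decimals(estimates, max_digits=6):
    """Number of decimal places at which all bootstrap replicates
    of an estimate agree -- a rough precision heuristic."""
    estimates = np.asarray(estimates)
    # Try the most precise rounding first; the first level at which
    # every replicate rounds to the same value is the stable one.
    for d in range(max_digits, -1, -1):
        if np.all(np.round(estimates, d) == np.round(estimates[0], d)):
            return d
    return 0
```

Applied separately to the column of lower bounds and the column of upper bounds from the bootstrap loop above, this would report 1 stable decimal place for each, matching the -0.2 to 0.2 read-off.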

This matches intuition, in that if you repeat the example code above with 5000 samples instead of 1000, you get more significant figures (this run uses a fresh random sample, so the estimate differs slightly):

[-0.26531642  0.1494834 ]
[-0.26617875  0.14188749]
[-0.26580921  0.14320279]
[-0.26718867  0.14182161]
[-0.26922152  0.14190669]
[-0.26713285  0.14182161]
[-0.26713907  0.14326737]
[-0.26635723  0.1480868 ]
[-0.26594888  0.14637803]
[-0.26619511  0.14885934]

This time you get 2 significant figures: -0.27 to 0.14.
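A related framing: treat the standard deviation of the bootstrap replicates as a Monte Carlo standard error for each interval endpoint, and report each endpoint only down to the digit where that error lives. The sketch below is self-contained: `shortest_interval` is my own re-implementation of the shortest-interval idea behind `pm.stats.hpd` for a unimodal sample, and the 94% credible mass is an assumption (PyMC3 versions differ on the default).

```python
import numpy as np

rng = np.random.default_rng(0)
draws = rng.normal(size=5000)  # stand-in for example_trace['mu_prior']

def shortest_interval(x, cred=0.94):
    """Shortest interval containing `cred` of the draws
    (the same idea pm.stats.hpd uses for a unimodal sample)."""
    x = np.sort(x)
    n_in = int(np.floor(cred * len(x)))
    # Width of every candidate interval spanning n_in + 1 sorted draws.
    widths = x[n_in:] - x[:len(x) - n_in]
    i = np.argmin(widths)
    return x[i], x[i + n_in]

# Bootstrap the interval; the spread of the replicates is a
# Monte Carlo standard error for each endpoint.
reps = np.array([shortest_interval(rng.choice(draws, draws.size))
                 for _ in range(200)])
se_lower, se_upper = reps.std(axis=0)
print(shortest_interval(draws), se_lower, se_upper)
```

The bootstrap spread should shrink roughly like 1/sqrt(n) as the number of posterior draws grows, which is consistent with the extra digit gained in the 5000-sample run above.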