Poor convergence using interpolated distribution for observed data


I’m trying to use the interpolated distribution provided by Pymc3 to fit observed data. I’m having a poor convergence.
For the sake of simplicity, I’m made a toy example to show the problem.

This is my dataset : a classic gaussian data.

import pymc3 as pm
import numpy as np
from scipy import stats

data = np.random.normal(size=10000) # default is mu=0, sd=1

To compute interpolated pdf, I use the following function found here : https://docs.pymc.io/notebooks/updating_priors.html

def from_posterior(samples):
    smin, smax = np.min(samples), np.max(samples)
    width = smax - smin
    x = np.linspace(smin, smax, 100)
    y = stats.gaussian_kde(samples)(x)

    # what was never sampled should have a small probability but not 0,
    # so we'll extend the domain and use linear approximation of density on it
    x = np.concatenate([[x[0] - 3 * width], x, [x[-1] + 3 * width]])
    y = np.concatenate([[0], y, [0]])
    return pm.Interpolated.dist(x, y)

Then this is my model : (basically I’m trying to find the “hidden parameter” mu & sd)

with pm.Model() as model:
    mu = pm.Normal('mu',mu=0,sd=5) # mu prior
    sd = pm.HalfNormal('sd',sd=1) # sd prior
    s = pm.Normal('s',mu=mu,sd=sd)

    obs =from_posterior(data) #  creating interpolated dist
    pot = pm.Potential('pot', obs.logp(s).sum()) # using dist to create a "potential" in order to fit the normal distribution (and hoping to have an good estimate of mu & sd)
    trace = pm.sample(draws=1000, tune=1000)

This is the sampling log :

And the trace plot :

These are quite bad estimation of mu & sd ! (and the mean of sd trace is around 0.8 instead of 1)
I say bad because my reference is the sampling of a model directly using a normal distribution from which I get far better results.
This is the ‘direct’ model :

with pm.Model() as model:
    mu = pm.Normal('mu',mu=0,sd=5)
    sd = pm.HalfNormal('sd',sd=1)
    obs = pm.Normal('obs',mu=mu,sd=sd,observed=data)
    trace = pm.sample(draws=1000, tune=1000)

This is the sampling log …

And the trace plot :

We see here far more accurate estimation of mu & sd !

Question : is the accuracy difference only because it’s an interpolated pdf ?

My guess is no, because I raised the number of samples to built a better interpolation, but it did’nt help.

I have no other clue, and augmenting ‘target_accept’ or tuning phase does not help. I’m certainly doing something wrong but I don’t know what…

Any help appreciated.

I am not sure but I think Interpolation is meant to only used as prior because it is using a scipy function (which means no gradient flow through the parameter which makes inference very difficult).
You can try metropolis on this and check it is doing better.

Using Metropolis sampling give the same bad results…

The page where I get the function does say this (in the middle of the page)

It is still possible to continue using NUTS sampling method because Interpolated class implements calculation of gradients that are necessary for Hamiltonian Monte Carlo samplers.

So the NUTS sampler should work ok… But it doesn’t ;(