Parameterize a multi-dimensional pymc.Interpolated distribution

Hi, I’m trying to calibrate a PyMC Bayesian model with new data. Basically I would like to update the prior with the posterior distributions of the pre-trained model. I followed a similar workflow of this work: Updating priors — PyMC3 3.11.5 documentation. I understand general idea is to find the gaussian kernel density estimation of the posterior samples and rebuild corresponding priors with pymc.Interpolated distribution.
The only difference between my work and the demo notebook is that some of the coefficients in my model are multi-dimensional. I parameterized the argument shape and pass an array to distribution parameters, such as mu and sigma, to build a multi-dimensional Normal distribution object. See example codes below:

import pymc3 as pm
import numpy as np
import theano.tensor as tt

model = pm.Model()
with model:
    ...
    beta = pm.Normal(
        name="beta", 
        mu=tt._shared(np.array([1, 2, 3])), 
        sigma=tt._shared(np.array([1, 2, 3])),
        shape=3
    )
   ...
   trace = pm.sample(draws=2000, tune=2000, ...)

Then I modified the from_posterior function with np.apply_along_axis to generate a batch of x_points and pdf_points.So that I can similarly pass arrays of shape (3, 1000) to argument x_points and pdf_points, where the slice along axis 1 parameterize each dimension of the distribution. See code examples below:

import pymc3 as pm
from scipy import stats
import numpy as np

# I modified the function a little bit because my sampled posteriors are multi-dimensional
def from_posterior(tag, samples, bins=1000):
    if len(samples.shape) == 1:
        samples = np.expand_dims(samples, axis=1)
    smin, smax = np.min(samples, axis=0), np.max(samples, axis=0)
    width = smax - smin
    x = np.linspace(smin, smax, bins)
    y= np.apply_along_axis(func1d=lambda a: stats.gaussian_kde(a[:-bins])(a[-bins:]),
                           axis=0,
                           arr=np.concatenate((samples, x), axis=0))

    x_points = np.concatenate((x[[0], :] - 3 * width, x, x[[-1], :] + 3 * width), axis=0).T 
    # x_points.shape = (samples.shape[1], bins+2)
    pdf_points = np.concatenate([np.zeros((1, y.shape[1])), y, np.zeros((1, y.shape[1]))], axis=0).T
    # pdf_points.shape = (samples.shape[1], bins+2)
    return pm.Interpolated(tag, x_points=x_points,  pdf_points=pdf_points, shape=samples.shape[1])

with model:
    ...
    beta = from_posterior(
        tag="beta", 
        samples=beta_posteriors, # beta_posteriors.shape = (8000, 3)
        bins=1000
    )
   ...

However, the codes above returned this error:

ValueError: too many axes: 2 (effrank=2), expected rank=1

It seems the class pymc.Interpolated doesn’t function similarly as pm.Normal. The source codes called scipy.interpolate which doesn’t support the syntax above. So I just wonder if anyone encountered a similar problem and knows a way to code the dimension size of pm.Interpolated dynamically? Thank you very much!

I think the current suggestion is to try the histrogram_approximation distribution found in pymc-experimental (here) rather than pm.Interpolated. Not sure that helps with the multivariate case, but figured I would mention it. Also, if you can update pymc (5.3 is the current release) that would probably be ideal.

@yang92 Were you able to find a way to do this?