Mixture model with Box-Cox transformation

I think it might be better to start with a mixture model. I spent some time thinking about a smart way to pass a transformation to Normal to model this, but in the end it is probably more straightforward to write a Box-Cox Normal distribution:

import numpy as np
import pymc3 as pm
import theano.tensor as tt
from scipy import special

from pymc3.distributions.continuous import Continuous
from pymc3.theanof import gradient


class BoxCoxNormal(Continuous):
    """Normal distribution on the Box-Cox-transformed scale:
    y ~ BoxCoxNormal(mu, sd, lmbda) means boxcox(y, lmbda) ~ Normal(mu, sd).
    """
    def __init__(self, mu=0., sd=1., lmbda=1., **kwargs):
        self.sd = tt.as_tensor_variable(sd)
        self.mu = tt.as_tensor_variable(mu)
        self.lmbda = tt.as_tensor_variable(lmbda)

        super(BoxCoxNormal, self).__init__(**kwargs)

    def inv_boxcox_func(self, x):
        # inverse Box-Cox transform: (1 + lmbda * x) ** (1 / lmbda)
        return tt.exp(tt.log1p(self.lmbda * x) / self.lmbda)

    def boxcox_func(self, y):
        # Box-Cox transform: (y ** lmbda - 1) / lmbda
        return tt.expm1(self.lmbda * tt.log(y)) / self.lmbda

    def jacobian_det(self, x):
        # log|d boxcox(x) / dx|, the change-of-variables correction
        x = tt.as_tensor_variable(x)
        grad = tt.reshape(
            gradient(tt.sum(self.boxcox_func(x)), [x]), x.shape)
        return tt.log(tt.abs_(grad))

    def logp(self, value):
        sd = self.sd
        mu = self.mu
        # evaluate the Normal logp on the transformed scale and add the
        # log-Jacobian so the density is w.r.t. the original scale
        value_ = self.boxcox_func(value)
        return pm.Normal.dist(mu, sd).logp(value_) + self.jacobian_det(value)

I cherry-picked code from the SciPy Box-Cox (and its inverse) transformation functions, and the jacobian_det computation from PyMC3's transforms.
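As a quick sanity check (not in the original notebook), you can compare the Theano implementation against scipy.special.boxcox, and check jacobian_det against the analytic log-derivative, since d/dy [(y**lmbda - 1) / lmbda] = y**(lmbda - 1):

y_np = np.linspace(.5, 5., 10)
lam_test = -.2
bc = BoxCoxNormal.dist(0., 1., lmbda=lam_test)

# forward transform agrees with scipy
np.testing.assert_allclose(
    bc.boxcox_func(y_np).eval(),
    special.boxcox(y_np, lam_test), rtol=1e-6)

# log-Jacobian agrees with (lmbda - 1) * log(y)
y_ = tt.dvector('y_')
np.testing.assert_allclose(
    bc.jacobian_det(y_).eval({y_: y_np}),
    (lam_test - 1) * np.log(y_np), rtol=1e-6)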

So it works like this:

mu0, sd0 = 3., .05
y = np.random.normal(mu0, sd0, size=200)
lam = -.2
# push the Normal samples through the inverse Box-Cox, so that
# boxcox(y_tr, lam) is Normal(mu0, sd0) again
y_tr = special.inv_boxcox(y, lam)

with pm.Model() as m:
    mu = pm.Normal('mu', 0., 10.)
    sd = pm.HalfNormal('sd', 5.)
    y_latent = BoxCoxNormal('y', mu, sd, lmbda=lam, observed=y_tr)
    trace = pm.sample(5000, tune=1000)

pm.traceplot(trace, lines=dict(mu=mu0, sd=sd0));

It recovers mu and sd quite well.
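(In what follows, ld_1, ld_2, and combi_data refer to the data from earlier in the thread; if you want to run the mixture model standalone, a hypothetical simulation with made-up values along these lines would do:)

# hypothetical stand-in for ld_1, ld_2 and combi_data (made-up values,
# just so the mixture model below can be run on its own)
ld_1, ld_2 = -.2, -.25
y1 = special.inv_boxcox(np.random.normal(2.5, .1, size=150), ld_1)
y2 = special.inv_boxcox(np.random.normal(3.0, .1, size=100), ld_2)
combi_data = np.concatenate([y1, y2])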

To model the mixture, I go with something like the following:

with pm.Model() as m:
    w = pm.Dirichlet('w', a=np.ones(2))
    mus = pm.Normal('mus', 2.8, 5., shape=2)
    sds = pm.HalfNormal('sds', .5, shape=2)
    # one BoxCoxNormal component per mixture class, with known lambdas
    mix_logp = [BoxCoxNormal.dist(mus[0], sds[0], lmbda=ld_1),
                BoxCoxNormal.dist(mus[1], sds[1], lmbda=ld_2)]
    obs = pm.Mixture('y', w, mix_logp, observed=combi_data)

Inference for mixture models is pretty difficult with NUTS (you can search the Discourse for related discussions), but for this model, initializing with find_MAP seems to perform quite well:

with m:
    # start sampling from the MAP estimate
    map1 = pm.find_MAP()
    trace = pm.sample(1000, tune=1000, start=map1)
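You can then eyeball whether the recovered weights and component parameters look reasonable:

pm.summary(trace, varnames=['w', 'mus', 'sds'])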

Also inferring lambda, however, is a bit difficult, as I think the Box-Cox transformation is very sensitive to the lambda parameter. I tried the model below, but the inference is very slow. I think a reparameterization, a more informative prior, and/or a specialized inference method for this parameter is needed.

with pm.Model() as m:
    w = pm.Dirichlet('w', a=np.ones(2))
    mus = pm.Normal('mus', 2.8, 5., shape=2)
    sds = pm.HalfNormal('sds', .5, shape=2)
    # the InverseGamma keeps lmbdas positive; the sign is flipped below
    # because the true lambdas are negative
    lmbdas = pm.InverseGamma('lambdas', 8., 2., shape=2,
                             testval=np.asarray([-ld_1, -ld_2]))
    mix_logp = [BoxCoxNormal.dist(mus[0], sds[0], lmbda=-lmbdas[0]),
                BoxCoxNormal.dist(mus[1], sds[1], lmbda=-lmbdas[1])]
    obs = pm.Mixture('y', w, mix_logp, observed=combi_data)
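One thing to try for the "specialized inference" idea (untested, just a sketch) is to assign a separate step method to the sensitive lambdas, e.g. Metropolis for lmbdas while PyMC3 auto-assigns NUTS to the remaining parameters:

with m:
    # Metropolis updates for lmbdas only; NUTS handles the rest
    step_lmbda = pm.Metropolis(vars=[lmbdas])
    trace = pm.sample(2000, tune=2000, step=step_lmbda)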

You can find everything in the notebook here: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/PyMC3QnA/Box-Cox%20transformation.ipynb
