Simple Dirichlet Process Binomial Mixture Model samples slow

#1

I’m trying to build a simple Dirichlet process mixture model (DPMM) with binomial distributions as the components.
However, even Metropolis sampling is extremely slow (about 10 draws/s at most) with N=200 observations.

NUTS and ADVI are also extremely slow.

Is there any reason such a simple model should take so long to sample?

Here is the model code:


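# Imports assumed from the np / pm / tt aliases used below
import numpy as np
import pymc3 as pm
import theano.tensor as tt


def stick_breaking(beta):
    # Fraction of the stick remaining before each break
    portion_remaining = tt.concatenate([[1], tt.extra_ops.cumprod(1 - beta)[:-1]])

    return beta * portion_remaining


# Observed successes: 100 draws with p=0.1 and 100 with p=0.5; shape (200, 1)
d0 = np.concatenate([np.random.binomial(15, .1, size=(100, 1)), np.random.binomial(15, .5, size=(100, 1))])
# Number of trials per observation; shape (200,)
d1 = np.ones(200) * 15

with pm.Model() as model:

    # Truncated stick-breaking prior over 30 mixture weights
    alpha = pm.Gamma('alpha', 1., 1.)
    beta = pm.Beta('beta', 1, alpha, shape=30)
    w = pm.Deterministic('w', stick_breaking(beta))

    # Component success probabilities on the logit scale
    dpmm_comp_mu = pm.Normal('dpmm_comp_mu', 0., 100., shape=30)

    visit_rate_like = pm.Mixture(
        'visit_rate_like',
        w,
        pm.Binomial.dist(
            p=pm.math.invlogit(dpmm_comp_mu),
            n=d1.astype('int32')[:, None]
        ),
        observed=d0.astype('int32')[:, None]
    )

with model:
    trace = pm.sample(step=pm.Metropolis())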
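
For reference, the stick-breaking transform used in the model can be checked outside of Theano. The following is a minimal NumPy sketch, not part of the original model: the name `stick_breaking_np`, the fixed `alpha = 1.0`, and the random seed are assumptions made for illustration only.

```python
import numpy as np


def stick_breaking_np(beta):
    """NumPy analogue of the stick_breaking function in the model code."""
    beta = np.asarray(beta, dtype=float)
    # Fraction of the stick left before each break: 1, (1-b1), (1-b1)(1-b2), ...
    portion_remaining = np.concatenate([[1.0], np.cumprod(1.0 - beta)[:-1]])
    return beta * portion_remaining


rng = np.random.default_rng(0)  # hypothetical seed, for reproducibility only
alpha = 1.0  # hypothetical fixed value; the model draws alpha from Gamma(1, 1)
beta = rng.beta(1.0, alpha, size=30)  # same truncation level (30) as the model

w = stick_breaking_np(beta)
# Mass not assigned to any of the 30 components; equals prod(1 - beta)
leftover = 1.0 - w.sum()
```

Because the stick is truncated at 30 breaks, the weights sum to slightly less than 1, and `leftover` is the mass that the truncation discards.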

#2

I did notice that reducing N from 200 to 20 increases the sampling speed by about 100x.
I'm not sure why that would be, either…


#3

Changing the data to n=1 trials, to simulate a single Bernoulli trial, did not improve the speed at all.
However, when I then switched to an actual pm.Bernoulli instead of Binomial(n=1), the speed increased by over 100x.
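
One possible explanation for both this and the N scaling reported in #2, which nobody in the thread confirms, is array broadcasting. In the model code, `d0` already has shape (N, 1), so `observed=d0[:, None]` has shape (N, 1, 1), while `n=d1[:, None]` has shape (N, 1). The sketch below only checks the resulting shapes with NumPy; it assumes the component log-probabilities are evaluated with ordinary broadcasting of the observed values against `n` and `p`, and the helper name `logp_shape` is hypothetical.

```python
import numpy as np


def logp_shape(n_obs, n_comp=30, with_trials=True):
    """Shape of the elementwise log-probability array under NumPy broadcasting."""
    value = np.empty((n_obs, 1, 1))  # observed=d0[:, None], with d0 of shape (N, 1)
    p = np.empty((n_comp,))          # one success probability per component
    if with_trials:
        n = np.empty((n_obs, 1))     # n=d1[:, None], with d1 of shape (N,)
        return np.broadcast(value, n, p).shape
    # Bernoulli case: there is no per-observation n to broadcast against
    return np.broadcast(value, p).shape


print(logp_shape(200))                     # (200, 200, 30)
print(logp_shape(20))                      # (20, 20, 30)
print(logp_shape(200, with_trials=False))  # (200, 1, 30)
```

Under that assumption, the Binomial version evaluates N x N x 30 terms, so going from N=200 to N=20 shrinks the work by a factor of 100, and the Bernoulli version avoids the extra N axis entirely. If this is the cause, passing the observed counts with shape (N, 1) rather than (N, 1, 1) would be the thing to try.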


#4

I get a bit of an improvement if I break the Binomial down into one component per mixture weight with a list comprehension:

    visit_rate_like = pm.Mixture(
        'visit_rate_like',
        w,
        [
            pm.Binomial.dist(
                p=pm.math.invlogit(dpmm_comp_mu[i]),
                n=d1.astype('int32')[:, None]
            )
            for i in range(30)
        ],
        observed=d0.astype('int32')[:, None]
    )

In general, though, these models are tricky to sample.
