Is your CPU AMD? I think amdlibm only applies to AMD CPUs.
What I found surprising is that some of the operations are not compiled to C, but I have not looked at any profiling results for a long time, so I might be misinformed.
Pinging @aseyboldt to see if he has some tips on profiling and speeding up ops.
Mmmh, not sure how to answer – sorry, I am truly clueless about this stuff
Here is my machine’s info (it’s a Mac), hoping that it answers your question:
```
compiler   : Clang 4.0.1 (tags/RELEASE_401/final)
system     : Darwin
release    : 15.6.0
machine    : x86_64
processor  : i386
CPU cores  : 2
interpreter: 64bit
```
Macs only use Intel CPUs, I think – and did you set up your environment using conda? (If so, Intel MKL should be set up correctly.)
Yes, it’s a conda env – don’t know if it’s useful, but my Mac is quite old (mid-2009).
FYI, I just ran the same model, but without modeling the covariation between intercepts and slopes, and it took 10 minutes to sample. So it seems that it really is the addition of the covariance structure (MvNormal) that considerably slows sampling.
Just realized I can mark this as solved thanks to @junpenglao’s suggestion of giving non-available categories probability 0 – this implies we know in advance which categories are absent. Of course, I’m still curious about @aseyboldt’s thoughts on profiling and speed-ups, if he has any!
Here is a simplified version of the model, for the sake of readability and understanding how to implement the idea with PyMC:
```python
import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as m_multi:
    # varying intercepts for each cluster and each non-pivot category:
    a_cluster = pm.Normal("a_cluster", -1.8, 0.5, shape=(Nclusters, Ncategories - 1))
    # fixed intercept for the pivot (reference) category:
    a_pivot = tt.as_tensor_variable(np.full(shape=(Nclusters, 1), fill_value=-2.2))
    a_cluster_f = tt.horizontal_stack(a_cluster, a_pivot)
    # latent category preferences:
    lat_p = tt.nnet.softmax(a_cluster_f[cluster_id])

    # zero-inflation process:
    # keep only preferences for available categories:
    slot_p = cats_available * lat_p
    # normalize preferences:
    slot_p = slot_p / tt.sum(slot_p, axis=1, keepdims=True)

    R = pm.Multinomial("R", n=N, p=slot_p, observed=sim_R)
    trace_multi = pm.sample(2000, tune=3000, init="adapt_diag")
```
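To see what the masking-and-renormalization step does on its own, here is a minimal NumPy sketch (the preference values and availability mask below are made up for illustration): unavailable categories are zeroed out, and the remaining probabilities are rescaled so each row still sums to 1.

```python
import numpy as np

# Hypothetical latent preferences for 3 categories (e.g. from a softmax),
# with the last category unavailable (mask = 0):
lat_p = np.array([[0.5, 0.3, 0.2]])
cats_available = np.array([[1.0, 1.0, 0.0]])

# Zero out unavailable categories, then renormalize each row:
slot_p = cats_available * lat_p
slot_p = slot_p / slot_p.sum(axis=1, keepdims=True)

print(slot_p)  # [[0.625 0.375 0.   ]]
```

The relative preferences among available categories are preserved (0.5 : 0.3 stays 0.625 : 0.375), while the absent category gets exactly probability 0, which is what the Multinomial likelihood needs.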
Hope this will help future users!
For the curious out there, the real model is much more complicated (varying-effects and covariance) but I’ll open source it – as well as real and simulated data – once I’ve had time to clean it up!
And again, a big thank you to Junpeng! Couldn’t have done it without your help! Now I owe you a second beer when you come to Paris.