Several seconds per iteration with GPU


I am developing a bivariate Gaussian mixture model with time-invariant component distributions and time-varying mixing coefficients, to model a spatiotemporal distribution. The model is defined as follows:

P(\mathcal{D}|\theta; t) = \displaystyle \sum_{i=1}^K a_i(t)\phi(\mu_i, \Sigma_i)

with the priors:

a(t) = (a_1(t), \dots, a_K(t)) \sim \operatorname{Dir}(1)
\mu_i \sim \mathcal{N}(0, 10)
\Sigma_i \mid \sigma_i \sim \operatorname{LKJ}(2, \sigma_i)
\sigma_i \sim \operatorname{Half-Cauchy}(1)

I have modeled the data for two time points, t=1 and t=2, with 6K and 13K data points respectively. The code for the model is given below:

import numpy as np
import pymc3 as pm

with pm.Model() as model:
    n_comps = 5
    n_hours = 2

    packed_L = []
    L = []
    sigma = []
    mu = []
    obs = []
    w = []

    for i in range(n_comps):
        # LKJ(eta=2) prior on the 2x2 correlation, Half-Cauchy(1) on the scales
        packed_L.append(pm.LKJCholeskyCov('packed_L' + str(i), n=2, eta=2.,
                                          sd_dist=pm.HalfCauchy.dist(1.)))
        L.append(pm.expand_packed_triangular(2, packed_L[-1]))
        # reconstruct the covariance matrix from its Cholesky factor
        sigma.append(pm.Deterministic('sigma' + str(i), L[-1].dot(L[-1].T)))
        mu.append(pm.Normal('mu' + str(i), 0., 2., shape=2))
        obs.append(pm.MvNormal.dist(mu=mu[-1], chol=L[-1]))

    mixt = []

    for i in range(n_hours):
        w.append(pm.Dirichlet('w' + str(i), a=np.ones(n_comps)))

        x, y = df[df.pickup_hour == i][['pickup_latitude', 'pickup_longitude']].values.T
        x = (x - mean_x) / std_x
        y = (y - mean_y) / std_y

        mixt.append(pm.Mixture('mixt' + str(i), w=w[i], comp_dists=obs,
                               observed=np.stack([x, y], axis=1)))

and the following sampler:

with model:
    trace = pm.sample(njobs=1)

The best I can get from this model is about 1 iteration per second, with GPU acceleration (tested on an AWS p2.xlarge). Could anybody point out what I have done wrong, or suggest how to model this better?


1 iteration per second for a large model like this is actually not too bad. Did you try using CPU only?


It seems like it takes 2.6 s/it with the GPU, but without the GPU it wasn't able to sample anything at all. When I limit the data to 500 points per hour, it runs at 3.0 s/it on the CPU.


You can also try profiling to see which part of the model is slow (e.g. `model.profile(model.logpt).summary()`). However, this seems difficult to optimize unless you are willing to write down the multivariate mixture explicitly.


Would you mind explaining what you mean by writing down the multivariate mixture explicitly?


Something similar to this doc: (see cell 2)
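To make the idea concrete: "writing the mixture explicitly" means computing the marginalized mixture log-likelihood yourself as a log-sum-exp over the K component log-densities, instead of handing `pm.Mixture` a list of `MvNormal.dist` objects. Here is a minimal NumPy/SciPy sketch of that quantity (my own illustration of the math, not the code from the linked notebook):

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal


def mixture_logp(data, weights, mus, covs):
    """Log-likelihood of a K-component bivariate Gaussian mixture.

    data    : (N, 2) observations
    weights : (K,) mixing coefficients, summing to 1
    mus     : (K, 2) component means
    covs    : (K, 2, 2) component covariances
    """
    # (N, K) matrix of per-point, per-component log-densities
    comp_logps = np.stack(
        [multivariate_normal(mus[k], covs[k]).logpdf(data)
         for k in range(len(weights))], axis=1)
    # log sum_k w_k * phi_k(x_n), computed stably in log space, summed over n
    return logsumexp(np.log(weights) + comp_logps, axis=1).sum()
```

In the PyMC3 model the same expression would be built from `theano.tensor` operations (e.g. `pm.math.logsumexp`) and added to the model via `pm.Potential` or `pm.DensityDist`, so the whole likelihood becomes one vectorized graph rather than a Python-level loop over component distributions.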