Hi - new to pymc3, and currently all of the tutorial examples that I run are going very slowly. As an example, I took this code, reduced the number of samples from 10,000 to 1,000, but it is still taking around 10 minutes to run (~10 draws/s.) There’s another script that my coworker can run in 15 minutes with a similar machine, but when I try to run it it was estimated to take 25 hours.
Here are the software versions I’m running:
My processor is an Intel Core 15-8350U 1.7GHz 8 cores (no discrete GPU)
Here is the code I was running (takes ~10 minutes to complete.) (Note, I also wrapped it in an “if
__name__ == ‘
__main__’:” statement to avoid getting “RuntimeError: The communication pipe between the main process and its spawned children is broken.”)
Any help would be appreciated! There are a few different projects I’d like to use pymc3 for, but right now it’s hard to even do tutorials because everything takes to long to sample.
import numpy as np, pandas as pd, matplotlib.pyplot as plt, seaborn as sns
if __name__ == '__main__':
sns.set_context('paper') sns.set_style('darkgrid') import pymc3 as pm, theano.tensor as tt # simulate data from a known mixture distribution np.random.seed(12345) # set random seed for reproducibility k = 3 ndata = 500 spread = 5 centers = np.array([-spread, 0, spread]) # simulate data from mixture distribution v = np.random.randint(0, k, ndata) data = centers[v] + np.random.randn(ndata) plt.hist(data); # setup model model = pm.Model() with model: # cluster sizes p = pm.Dirichlet('p', a=np.array([1., 1., 1.]), shape=k) # ensure all clusters have some points p_min_potential = pm.Potential('p_min_potential', tt.switch(tt.min(p) < .1, -np.inf, 0)) # cluster centers means = pm.Normal('means', mu=[0, 0, 0], sd=15, shape=k) # break symmetry order_means_potential = pm.Potential('order_means_potential', tt.switch(means-means < 0, -np.inf, 0) + tt.switch(means-means < 0, -np.inf, 0)) # measurement error sd = pm.Uniform('sd', lower=0, upper=20) # latent cluster of each observation category = pm.Categorical('category', p=p, shape=ndata) # likelihood for each observed value points = pm.Normal('obs', mu=means[category], sd=sd, observed=data) # fit model with model: step1 = pm.Metropolis(vars=[p, sd, means]) step2 = pm.ElemwiseCategorical(vars=[category], values=[0, 1, 2]) tr = pm.sample(1000, step=[step1, step2])