Slow sampling speed

I am trying to recreate the Facebook Prophet model from scratch in PyMC. Unfortunately, I am already running into performance issues at the very beginning.

I am using the peyton_manning dataset (a univariate time series of 2905 rows) that is used in the FB Prophet docs. This snippet fetches and preprocesses the dataset:

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/facebook/prophet/main/examples/example_wp_log_peyton_manning.csv')

df["ds"] = pd.to_datetime(df["ds"])
# scale the observations by their maximum
y_max = df["y"].max()
df["y_scaled"] = df["y"] / y_max
# min-max scale time to the interval [0, 1]
df["t"] = (df["ds"] - df["ds"].min()) / (df["ds"].max() - df["ds"].min())

My model looks like this:

import numpy as np
import pymc as pm
import pytensor as pyt  # assumed alias, matching the pyt.tensor.dot call below

def trend_model(y, t, n_changepoints=25, changepoints_prior_scale=0.05,
                growth_prior_scale=5, changepoint_range=0.8):
    """
    The piecewise linear trend with changepoint implementation in PyMC.
    :param y: (np.array) MinMax scaled observations.
    :param t: (np.array) MinMax scaled time.
    :param n_changepoints: (int) The number of changepoints to model.
    :param changepoints_prior_scale: (flt) The scale of the Laplace prior on the delta vector.
    :param growth_prior_scale: (flt) The standard deviation of the prior on the growth.
    :param changepoint_range: (flt) Proportion of history in which trend changepoints will be estimated.
    :return: (pm.Model) The model.
    """
    model = pm.Model()
    
    s = np.linspace(0, changepoint_range * np.max(t), n_changepoints + 1)[1:]
    
    # * 1 casts the boolean to integers
    A = (t[:, None] > s) * 1

    with model:
        # initial growth
        k = pm.Normal('k', 0, growth_prior_scale)
        
        # rate of change
        delta = pm.Laplace('delta', 0, changepoints_prior_scale, shape=n_changepoints)
        # offset
        m = pm.Normal('m', 0, 5)
        gamma = -s * delta

        trend = pm.Deterministic("trend", (k + pyt.tensor.dot(A, delta)) * t + (m + pyt.tensor.dot(A, gamma)))

        sigma = pm.HalfCauchy('sigma', 0.5, initval=1)
        pm.Normal('obs', mu=trend, sigma=sigma, observed=y)
        
    return model

model = trend_model(df["y_scaled"].to_numpy(), df["t"].to_numpy())
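
As a sanity check on the trend expression in the model above, the same formula can be evaluated with plain NumPy. This is only a minimal sketch: the function name piecewise_linear_trend and the toy values for t, s, k, m, and delta are made up for illustration and are not part of the original model or of Prophet.

```python
import numpy as np

def piecewise_linear_trend(t, s, k, m, delta):
    """Evaluate (k + A @ delta) * t + (m + A @ gamma) with plain NumPy."""
    # A[i, j] is 1 when observation i lies strictly after changepoint j
    A = (t[:, None] > s).astype(float)
    # gamma shifts the offset so the trend stays continuous at each changepoint
    gamma = -s * delta
    return (k + A @ delta) * t + (m + A @ gamma)

# Toy inputs (hypothetical): two changepoints on a scaled time axis
t = np.linspace(0.0, 1.0, 11)
s = np.array([0.25, 0.5])
delta = np.array([1.0, -2.0])

trend = piecewise_linear_trend(t, s, k=1.0, m=0.0, delta=delta)
print(trend)
```

Before the first changepoint the trend is simply k * t + m. After each changepoint the slope changes by the corresponding entry of delta, and the gamma term keeps the line from jumping.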

Finally, I am sampling:

with model:
    linear_trace = pm.sample(return_inferencedata=True)

I stopped the execution of this code after 20 minutes of sampling (it was 40% done). Is it just me, or is this sampling time ridiculously slow?

Have you looked through this notebook or the materials associated with this talk?

I am using this notebook as a reference. I haven’t checked the other link yet, but does it address any performance issues?

Btw, I just ran this code on Google Colab and it took less than 2 minutes. Not great, but a big improvement, which makes me think that I may have installed PyMC incorrectly (which is weird, since I followed the recommended conda way of installing it). I will start with fresh Ubuntu, conda, and PyMC installations and see if I get better speed.

I am not sure why things are running slowly for you, but I'm working on something similar and am using pm.math.dot instead of pyt.tensor.dot. I'm not sure whether that will make a difference for you.

The pm.math module just aliases pytensor.tensor, so this will not be the source of the performance issues.

If Colab is running significantly faster than your local machine, there's likely an installation issue. Did you follow the official instructions when you installed?

I installed everything using Miniconda instead of Anaconda. Can that be a problem?

Do you see a warning when you import pymc?

No, no warnings. BLAS is installed fine :smiley:
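
For anyone comparing a slow local install against Colab, here is a minimal sketch that collects the installed versions and the BLAS link flags PyTensor reports, so the two environments can be compared side by side. The helper name collect_env_info is hypothetical. The sketch assumes that pytensor.config.blas__ldflags is available in the installed PyTensor version, and it falls back to None for anything it cannot import or read.

```python
import importlib
import platform

def collect_env_info():
    """Gather version and BLAS details that may affect PyMC sampling speed."""
    info = {
        "python": platform.python_version(),
        "machine": platform.machine(),
    }
    for name in ("numpy", "pytensor", "pymc"):
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except Exception:
            # Not installed, or the import itself failed in this environment
            info[name] = None

    info["blas_ldflags"] = None
    if info["pytensor"] is not None:
        try:
            import pytensor
            # An empty string here usually means PyTensor is not linking
            # against an optimized BLAS library
            info["blas_ldflags"] = pytensor.config.blas__ldflags
        except Exception:
            pass
    return info

env_info = collect_env_info()
for key, value in env_info.items():
    print(f"{key}: {value}")
```

Running this both locally and on Colab and comparing the output is a quick way to spot a mismatch in versions or a missing BLAS, although it will not catch every installation problem.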

Then could your machine be considerably worse than what’s on Colab?

No, but in the meantime I finished the Prophet implementation and reinstalled Ubuntu, Anaconda, and PyMC. Now sampling takes 7 minutes on my machine, which is much faster than Google Colab. I guess I somehow made a mistake while installing PyMC :melting_face:

Anyways, thanks for the help and sorry for taking up your time with this nonsense!

P.S. I am working on an idea for improving the Prophet model a bit, so once I finish that, I will post my repo and my results here. Maybe they will be useful for someone.
