Difficulty plotting multi-switchpoint posterior distributions

Victor_Adejola · May 1, 2021, 11:49am

Hi,

I am trying to plot the multi-switchpoint posterior distributions, but the graph output looks different from the posterior distributions, can anyone tell me what I am doing wrong, please?

The results of the posterior distributions are:

And my code for the plot is below

early_rate_samples = trace['lambda_1']
late_rate_samples = trace['lambda_2']
later_rate_samples = trace['lambda_3']
changepoint_samples = trace['tau_1']
changepoint_2_samples = trace['tau_2']

#N samples from the corresponding posterior distribution.
N = ((changepoint_samples.shape[0]) + (changepoint_2_samples.shape[0]))

expected_Outbreaks_per_week = np.zeros(number_of_weeks)
for week in range(0, number_of_weeks):
    # ix is a bool index of all changepoint samples corresponding to
    # the changepoint occurring prior to value of 'week'
    ix = (week <= changepoint_samples) & (changepoint_2_samples >= week)
    expected_Outbreaks_per_week[week] = (early_rate_samples[ix].sum() + late_rate_samples[ix].sum() + later_rate_samples[~ix].sum()) / N

plt.plot(range(number_of_weeks), expected_Outbreaks_per_week, lw=4, color="#E24A33",
         label="expected number of Outbreak incident")
plt.xlim(0, number_of_weeks)
plt.xlabel("Time (weeks)")
plt.ylabel("Expected # Outbreaks")
plt.title("Expected number of Outbreak Incident")
plt.ylim(0, 15)
plt.bar(np.arange(len(twitter_df['twitter_stream_count'])), twitter_df['twitter_stream_count'], color="#348ABD", alpha=0.65,
        label="observed Outbreak per week")

plt.legend(loc="upper left");

The graph below plotted from the code looks completely way off the posterior distribution outputs.

Can you tell me what I am doing wrong here, please?

Thanks

OriolAbril · May 7, 2021, 8:28am

Can you share the output of the corresponding InferenceData? Without this nor the model it’s difficult to make sense of some of the code like the definition of N.

It also looks like ix is not doing what you want it to do. ix indicates if the week is between the two changepoints only, and you are using both early rate and late rate when it’s true, later rate when it’s false. It should probably be something like:

early_mask = (week <= changepoint_samples)
later_mask = (week >= changepoint_2_samples)
late_mask = ~(early_mask | later_mask)  # equal to (week > changepoint_samples) & (week < changepoint_2_samples)
expected_Outbreaks_per_week[week] = (early_rate_samples[early_mask].sum() + late_rate_samples[late_mask].sum() + later_rate_samples[later_mask].sum()) / N

Victor_Adejola · May 7, 2021, 9:17am

Thank you for your reply.

The model is as follows

with pm.Model() as twitter_model:

    alpha = pm.Uniform('alpha', lower=0, upper=100)
    #alpha = pm.Gamma('alpha', mu=50, sigma=20)

    #lambda_1 = pm.Exponential("lambda_1", alpha)
    #lambda_2 = pm.Exponential("lambda_2", alpha)
    #lambda_3 = pm.Exponential("lambda_3", alpha)
    
    lambda_1 = pm.Uniform("lambda_1", lower=0, upper=20)
    lambda_2 = pm.Uniform("lambda_2", lower=0, upper=20)
    lambda_3 = pm.Uniform("lambda_3", lower=0, upper=20)

    tau_1 = pm.Uniform("tau_1", lower=0, upper= number_of_weeks - 1)
    tau_2 = pm.Uniform("tau_2", lower=tau_1, upper= number_of_weeks - 1)

    idx = np.arange(number_of_weeks) # Index
    
    lamda_0 = pm.math.switch(tau_1 >= idx, lambda_1, lambda_2)
    lambda_ = pm.math.switch(tau_2 >= idx, lamda_0, lambda_3)

    observation = pm.NegativeBinomial("obs", lambda_, alpha, observed=twitter_obs)

with twitter_model:

 trace = pm.sample(1000, cores=2, tune=2000)
 ````

````
Sampling 2 chains for 2_000 tune and 1_000 draw iterations (4_000 + 2_000 draws total) took 475 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.
````



Updating the plot with your suggestion gives a reasonable plot, this is exactly what I am looking for. Thanks.

````
#N samples from the corresponding posterior distribution.

N = (changepoint_samples.shape[0]) 

expected_Outbreaks_per_week = np.zeros(number_of_weeks)

for week in range(0, number_of_weeks):

    # ix is a bool index of all changepoint samples corresponding to

    # the changepoint occurring prior to value of 'week'

    #ix = (week <= changepoint_samples) & (changepoint_2_samples >= week)

    early_mask = (week <= changepoint_samples)

    later_mask = (week >= changepoint_2_samples)

    late_mask = ~(early_mask | later_mask)  # equal to (week > changepoint_samples) & (week < changepoint_2_samples)

    expected_Outbreaks_per_week[week] = (early_rate_samples[early_mask].sum() + late_rate_samples[late_mask].sum() + later_rate_samples[later_mask].sum()) / N

plt.plot(range(number_of_weeks), expected_Outbreaks_per_week, lw=4, color="#E24A33",

         label="expected number of Outbreak incident")

plt.xlim(0, number_of_weeks)

plt.xlabel("Time (weeks)")

plt.ylabel("Expected # Outbreaks")

plt.title("Expected number of Outbreak Incident")

plt.ylim(0, 15)

plt.bar(np.arange(len(twitter_df['twitter_stream_count'])), twitter_df['twitter_stream_count'], color="#348ABD", alpha=0.65,

        label="observed Outbreak per week")

plt.legend(loc="upper left")

`

OriolAbril · May 7, 2021, 11:19am

Can you use 3 backticks (`) to format the code? You can use all markdown and some extra features here, using > for quoting content doesn’t preserve the indentation so it’s difficult to read python code

Victor_Adejola · May 7, 2021, 12:10pm

Thanks, I have updated the post.

Victor_Adejola · May 7, 2021, 12:50pm

How can one test model effectiveness in PyMC3, similar to frequentist approach such as Mean Absolute Error(MAE), Root Mean Square Error (RMSE), AUC plot, and Goodness-of-fit, which are available for testing regression models?

Topic		Replies	Views
3 Multi Switchpoints Approach with PyMC3 Questions	1	1212	October 1, 2020
Correctly applying switchpoint model? Questions	2	2329	April 23, 2019
Help requested - Mixture model posterior predictive plot Questions	0	381	November 6, 2020
How to improve fit of Poisson Mixture Model in PyMC3? Questions	20	5025	May 24, 2018
How to visualize and discourse about the Switch Operator after making it a Deterministic variable Questions doc	0	395	November 10, 2021

Difficulty plotting multi-switchpoint posterior distributions

Related topics