# Reproducing an example from PyCon 2017

I’m trying to reproduce, with my own data, the example from Christopher Fonnesbeck’s PyCon 2017 talk “Introduction to Statistical Modeling with Python” (on YouTube), starting at 2:06:00.

My code (with some differences from Fonnesbeck’s):

```python
with pm.Model() as my_model:
    a_first = pm.Uniform("a_first", 0, 10)
    a_second = pm.Uniform("a_second", 0, 10)

    b_first = pm.Uniform("b_first", 0, 5)
    b_second = pm.Uniform("b_second", 0, 5)

    # Likelihoods
    first_weeks = pm.Gamma("first_weeks", alpha=a_first, beta=b_first,
                           observed=first_weeks_arr)
    second_weeks = pm.Gamma("second_weeks", alpha=a_second, beta=b_second,
                            observed=second_weeks_arr)

    # Difference of the Gamma means (mean = alpha / beta)
    d = pm.Deterministic("d", a_first / b_first - a_second / b_second)

    # Fit
    # samples = pm.fit(20000, method="svgd").sample(1000)  # I won't use VI
    samples = pm.sample(20000)

with my_model:
    print(az.summary(samples))
    az.plot_posterior(samples, var_names=["d"], ref_val=0)
    az.plot_trace(samples)
```

```python
with my_model:
    pred = pm.sample_posterior_predictive(samples, var_names=["d"])
    az.plot_posterior(pred, var_names="d", ref_val=0)
```

My 2 questions:

1. I think the mean of the deterministic variable `d` does not, by itself, answer the question of how often `first_weeks` is above `second_weeks`, right? In these versions of PyMC3, the answer would be the fraction of posterior samples of `d` above 0: in my case, `first_weeks` > `second_weeks` 61.5% of the time. Am I right here?
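As a sanity check for question 1, the probability is just the fraction of posterior draws of `d` that fall above zero. This is a minimal NumPy sketch; `d_draws` here is a synthetic stand-in for the flattened posterior samples of `d` (with the model above you would pull them from the trace, e.g. `samples.posterior["d"].values.ravel()`):

```python
import numpy as np

# Hypothetical stand-in for the flattened posterior draws of "d";
# with a real trace you'd use samples.posterior["d"].values.ravel()
rng = np.random.default_rng(0)
d_draws = rng.normal(loc=0.3, scale=1.0, size=20_000)

# P(d > 0): fraction of draws above zero
prob_positive = (d_draws > 0).mean()
print(f"P(d > 0) ≈ {prob_positive:.3f}")
```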

2. Now I want to do what Chris does from 2:16:00 onwards: instead of asking about `d` using the historical data, ask what the probability is for a specific point in the future. I think this calls for `sample_posterior_predictive`, but Chris does it differently; perhaps `sample_posterior_predictive` didn’t exist in 2017. Is the last part of my code valid? I obtain the same image as before, so I think there’s something I am missing…
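For intuition on question 2, the posterior of the Gamma mean (`alpha / beta`) and the posterior predictive for a new observation are different things: the predictive also includes sampling noise, so it is wider. A NumPy-only sketch, where `alpha_draws` and `beta_draws` are hypothetical stand-ins for the posterior draws of `a_first` and `b_first` (note NumPy's gamma is parameterized by shape and scale, so scale is `1 / beta`):

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical stand-ins for posterior draws of a_first and b_first
alpha_draws = rng.normal(4.0, 0.2, size=10_000)
beta_draws = rng.normal(2.0, 0.1, size=10_000)

# Posterior of the Gamma mean: one value of alpha/beta per draw
posterior_mean = alpha_draws / beta_draws

# Posterior predictive: one *new* simulated observation per draw
# (NumPy gamma takes shape and scale, so scale = 1/beta)
predictive = rng.gamma(shape=alpha_draws, scale=1.0 / beta_draws)

print(f"sd of posterior mean:       {posterior_mean.std():.2f}")
print(f"sd of posterior predictive: {predictive.std():.2f}")
```

The predictive spread is driven mostly by the Gamma sampling noise, while the posterior of the mean only reflects parameter uncertainty.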

Thx.

Ok, I think I got it. For the second question (for the first, I am 90% confident it’s ok), it would be something like:

```python
with my_model:
    # Sample the posterior predictive for the likelihood variables
    # (not just "d", otherwise pred won't contain them)
    pred = pm.sample_posterior_predictive(
        samples, var_names=["first_weeks", "second_weeks"]
    )

    # To find the difference of the posterior predictives we take the mean
    # over the observations for each of the num_chains * num_samples_per_chain
    # draws, and then subtract
    mean_1st_weeks = np.mean(pred["first_weeks"], axis=1)
    print(mean_1st_weeks.shape)
    mean_2nd_weeks = np.mean(pred["second_weeks"], axis=1)
    print(mean_2nd_weeks.shape)

    # Now:
    diff = mean_1st_weeks - mean_2nd_weeks
    print(diff.shape)
    print(diff)

    az.plot_posterior(diff, ref_val=0)
```

The 94% HDI is wider, as one would expect from posterior predictive samples. Now I am about 90% confident this second question is ok, too.
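One more way to phrase the “specific point in the future” question: instead of averaging the predictive draws, compare single simulated future observations pairwise and count how often the first beats the second. A sketch with hypothetical stand-ins for one flattened predictive draw per posterior sample (in place of `pred["first_weeks"]` and `pred["second_weeks"]`):

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical stand-ins: one simulated future observation per posterior draw
first_pred = rng.gamma(shape=4.0, scale=0.5, size=10_000)
second_pred = rng.gamma(shape=3.0, scale=0.5, size=10_000)

# Fraction of futures in which the first week's value exceeds the second's
p_future = (first_pred > second_pred).mean()
print(f"P(next first-week value > next second-week value) ≈ {p_future:.3f}")
```

This probability is closer to 0.5 than the one based on the means, because individual future observations carry the full Gamma sampling noise.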