I’m trying to reproduce on my own the example from Chris Fonnesbeck’s PyCon 2017 tutorial (Christopher Fonnesbeck - Introduction to Statistical Modeling with Python - PyCon 2017 - YouTube), starting at 2:06:00, using my own data.
My code (with some differences from Fonnesbeck’s code):
with pm.Model() as my_model:
    a_first = pm.Uniform("a_first", 0, 10)
    a_second = pm.Uniform("a_second", 0, 10)
    b_first = pm.Uniform("b_first", 0, 5)
    b_second = pm.Uniform("b_second", 0, 5)
    # likelihood
    first_weeks = pm.Gamma("first_weeks", alpha=a_first, beta=b_first, observed=first_weeks_arr)
    second_weeks = pm.Gamma("second_weeks", alpha=a_second, beta=b_second, observed=second_weeks_arr)
    # Gamma's mean: α/β
    d = pm.Deterministic("d", a_first/b_first - a_second/b_second)
    # Fit
    #samples = pm.fit(20000, method="svgd").sample(1000)  # I won't use VI
    samples = pm.sample(20000)

with my_model:
    print(az.summary(samples))
    az.plot_posterior(samples, var_names=["d"], ref_val=0)
    az.plot_trace(samples)

with my_model:
    pred = pm.sample_posterior_predictive(samples, var_names=["d"])
    az.plot_posterior(pred, var_names="d", ref_val=0)
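As a side check on the comment "Gamma's mean: α/β" used to define `d`: the `alpha`/`beta` arguments of `pm.Gamma` are shape and rate, so the mean is α/β. A minimal NumPy sketch of that identity follows; the values `alpha = 3.0` and `beta = 2.0` are arbitrary illustrations, not taken from my data, and note that NumPy's sampler takes a scale, which is 1/β.

```python
import numpy as np

rng = np.random.default_rng(42)

# Arbitrary illustrative values (assumptions, not fitted parameters)
alpha, beta = 3.0, 2.0

# NumPy parameterizes the Gamma by shape and scale, where scale = 1 / rate
draws = rng.gamma(shape=alpha, scale=1.0 / beta, size=200_000)

theoretical_mean = alpha / beta   # mean under the shape/rate parameterization
empirical_mean = draws.mean()     # Monte Carlo estimate of the same quantity

print(theoretical_mean, empirical_mean)
```

The two printed numbers should agree up to Monte Carlo error, which is why `a_first/b_first - a_second/b_second` is a difference of group means.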
My 2 questions:
- I think the mean of the deterministic variable “d” does not answer the question of how often `first_weeks` is above `second_weeks`, right? In these versions of PyMC3 the answer would be the percentage of the posterior above 0: `first_weeks` > `second_weeks` 61.5% of the time. Am I right here?
- Now I want to do what Chris does from 2:16:00 onwards: instead of asking about `d` using the historical data, ask what the probability is for a specific point in the future. I think this is what `sample_posterior_predictive` is for, but Chris does it differently; perhaps `sample_posterior_predictive` didn’t exist in 2017. Is the last part of my code valid? I get the same image as before, so I think I’m missing something…
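For reference, the two quantities these questions contrast can be sketched with NumPy alone. This is only an illustration under stated assumptions: the "posterior draws" below are made-up stand-ins (in practice they would be extracted from the trace), and the names `p_mean_diff` and `p_future` are hypothetical. The first quantity is the share of posterior draws in which the difference of Gamma means is positive; the second simulates one new observation per group for each draw and compares those, which also carries the observation noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for posterior draws of the four parameters.
# These are NOT real results; real draws would come from the fitted trace.
n_draws = 4000
a_first = rng.uniform(2.0, 3.0, n_draws)
b_first = rng.uniform(0.9, 1.1, n_draws)
a_second = rng.uniform(2.0, 3.0, n_draws)
b_second = rng.uniform(1.0, 1.2, n_draws)

# Question 1: probability that the mean of the first group exceeds
# the mean of the second (share of draws with d > 0).
d = a_first / b_first - a_second / b_second
p_mean_diff = (d > 0).mean()

# Question 2: probability that a single future observation from the first
# group exceeds one from the second. One Gamma draw per posterior draw;
# NumPy uses scale = 1 / rate.
y_first = rng.gamma(shape=a_first, scale=1.0 / b_first)
y_second = rng.gamma(shape=a_second, scale=1.0 / b_second)
p_future = (y_first > y_second).mean()

print(f"P(mean_first > mean_second) = {p_mean_diff:.3f}")
print(f"P(y_first_new > y_second_new) = {p_future:.3f}")
```

Because `d` is a deterministic function of the parameters, it has no observation noise of its own, so resampling it would not be expected to produce anything different from the posterior of `d`; the future-observation comparison needs simulated values of the observed variables instead.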
Thx.