Trying to work through Chapter 2 Question 3 in Osvaldo’s book and I have two questions.

    import numpy as np
    import pandas as pd
    import pymc3 as pm

    tips = pd.read_csv('https://raw.githubusercontent.com/aloctavodia/BAP/master/code/data/tips.csv')
    tip = tips['tip'].values
    # Integer code (0-3) for the day of each observation
    idx = pd.Categorical(tips['day'], categories=['Thur', 'Fri', 'Sat', 'Sun']).codes
    groups = len(np.unique(idx))

    with pm.Model() as comparing_groups:
        # One mean and one standard deviation per day
        μ = pm.Normal('μ', mu=0, sd=10, shape=groups)
        σ = pm.HalfNormal('σ', sd=10, shape=groups)
        y = pm.Normal('y', mu=μ[idx], sd=σ[idx], observed=tip)
        trace_cg = pm.sample(5000)
        ppc_cg = pm.sample_posterior_predictive(trace_cg, samples=500)

- Is it possible to group PPC samples per group (day of the week in this case)? Osvaldo mentioned you have to do this by hand with idx. But I’m not sure how to correlate idx to the trace.
- `ppc_cg["y"].shape` returns `(500, 244)`. The 500 makes sense, but I'm curious where the 244 comes from.
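
For reference, here is a minimal sketch of what grouping "by hand with idx" could look like. It assumes the second axis of `ppc_cg["y"]` lines up with the observed data, one column per tip, in which case 244 would simply be the number of rows in `tips`. The arrays below are small hypothetical stand-ins (`ppc_y` for `ppc_cg["y"]`, a 12-element `idx`), so the snippet runs without PyMC3 or the CSV; `ppc_by_day` is a name made up for illustration.

```python
import numpy as np

# Hypothetical stand-ins so the sketch runs on its own:
# 12 fake observations spread over the 4 days, and 500 fake predictive draws.
rng = np.random.default_rng(0)
days = ['Thur', 'Fri', 'Sat', 'Sun']
idx = np.array([0, 0, 1, 2, 2, 2, 3, 3, 0, 1, 3, 2])  # plays the role of idx
ppc_y = rng.normal(size=(500, len(idx)))               # plays the role of ppc_cg["y"]

# Assumption: column j of ppc_y holds the draws for observation j,
# so a boolean mask on idx picks out the columns that belong to one day.
ppc_by_day = {day: ppc_y[:, idx == g] for g, day in enumerate(days)}

for day, samples in ppc_by_day.items():
    print(day, samples.shape)
```

With the real objects, the same indexing would be `ppc_cg["y"][:, idx == g]`, and the per-day column counts should add up to 244 if the assumption above holds.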