Questions about grouping PPC samples and the PPC shape

Trying to work through Chapter 2 Question 3 in Osvaldo’s book and I have two questions.

import numpy as np
import pandas as pd
import pymc3 as pm

# Load the tips data and encode the day of the week as integer codes 0..3
tips = pd.read_csv('https://raw.githubusercontent.com/aloctavodia/BAP/master/code/data/tips.csv')
tip = tips['tip'].values
idx = pd.Categorical(tips['day'],
                     categories=['Thur', 'Fri', 'Sat', 'Sun']).codes
groups = len(np.unique(idx))

with pm.Model() as comparing_groups:
    # One mean and one standard deviation per day
    μ = pm.Normal('μ', mu=0, sd=10, shape=groups)
    σ = pm.HalfNormal('σ', sd=10, shape=groups)
    # idx picks the parameters of the group each observation belongs to
    y = pm.Normal('y', mu=μ[idx], sd=σ[idx], observed=tip)

    trace_cg = pm.sample(5000)
    ppc_cg = pm.sample_posterior_predictive(trace_cg, samples=500)

  1. Is it possible to split the PPC samples by group (day of the week in this case)? Osvaldo mentioned you have to do this by hand with idx, but I’m not sure how to relate idx to the PPC samples.
  2. ppc_cg["y"].shape returns (500, 244). The 500 makes sense, but I don’t get where the 244 comes from.

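244 is the length of the observed data (i.e., tip has 244 items), so each of the 500 posterior predictive samples contains one simulated value per observation.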

This means that for question 1, you need to index into these 244 columns with idx to isolate the different groups.
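A minimal sketch of that indexing, using random arrays as stand-ins so it runs without sampling. Here ppc_y is a hypothetical placeholder for ppc_cg["y"], and idx is generated randomly rather than taken from the tips data; with the real model you would use your own ppc_cg["y"] and idx instead.

```python
import numpy as np

rng = np.random.default_rng(0)
days = ['Thur', 'Fri', 'Sat', 'Sun']

# Stand-ins (assumptions, not the real data):
# idx   -> one group code per observation, like pd.Categorical(...).codes
# ppc_y -> posterior predictive draws, shape (n_samples, n_observations)
idx = rng.integers(0, len(days), size=244)
ppc_y = rng.normal(loc=3.0, scale=1.0, size=(500, 244))

# A boolean mask over the columns keeps only the observations of one day
ppc_by_day = {day: ppc_y[:, idx == k] for k, day in enumerate(days)}

for day, samples in ppc_by_day.items():
    print(day, samples.shape)
```

Each entry keeps all 500 draws but only the columns belonging to that day, so the column counts across the four days add up to 244.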

Thanks @junpenglao
