How best to build a model on 200k normally distributed observations with a piecewise rather than a simple vector relation (i.e. subsets of the data depend on a combination of parameters)

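I would try indexing. Assuming df is a data frame with the observations in column x, create integer columns i and j that give each row's time bin and day, then use them to index the parameter vectors.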

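import pymc3 as pm

with pm.Model() as model:
    u = pm.Dirichlet('u', ..., shape=n_time_bins)  # or whatever prior you wish
    v = pm.Dirichlet('v', ..., shape=n_days)  # or whatever prior you wish
    # df.i and df.j pick the matching u and v entry for every row, so sd has one value per observation
    sd = pm.Deterministic('sd', (u[df.i.values] * v[df.j.values])**2)
    observed = pm.Normal('observed', mu=0, sd=sd, observed=df.x)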
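
If the data only has raw labels (a time-of-day string, a weekday name) rather than integer codes, the index columns can be built first. The following is a minimal NumPy-only sketch of that step; the label arrays and the fixed values of u and v are made-up stand-ins, not taken from the original question.

```python
import numpy as np

# Hypothetical raw labels, one per observation (stand-ins for the real columns).
time_bin = np.array(["09:00", "09:30", "09:00", "10:00", "09:30"])
day = np.array(["mon", "mon", "tue", "tue", "tue"])

# return_inverse=True maps each label to an integer code in 0..n_levels-1;
# these codes play the role of df.i and df.j.
time_levels, i = np.unique(time_bin, return_inverse=True)
day_levels, j = np.unique(day, return_inverse=True)
n_time_bins, n_days = len(time_levels), len(day_levels)

# Stand-in parameter values; in the model these would be the random variables u and v.
u = np.array([0.5, 0.3, 0.2])
v = np.array([0.6, 0.4])

# Integer-array indexing expands each parameter vector to one value per observation.
sd = (u[i] * v[j]) ** 2
```

The same indexing expression works on the model's random variables, which is why the per-observation sd can be written in a single line without looping over subsets.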