Hello community!
I have a dataset of 5 features with ~800 rows.
As per the quickstart guide, I have to define priors for an alpha (intercept) and 5 betas.
I'm building a model with 2nd-order polynomial terms, so a further 5 beta distributions are added.
Here's how the code looks:
import pymc3 as pm

# x1..x5 and y are the columns of my dataset (~800 rows)
with pm.Model() as my_model:
    a = pm.Normal('a', 0, sd=10)
    b11 = pm.Normal('b11', 0, sd=10)
    b12 = pm.Normal('b12', 0, sd=10)
    b21 = pm.Normal('b21', 0, sd=10)
    b22 = pm.Normal('b22', 0, sd=10)
    b31 = pm.Normal('b31', 0, sd=10)
    b32 = pm.Normal('b32', 0, sd=10)
    b41 = pm.Normal('b41', 0, sd=10)
    b42 = pm.Normal('b42', 0, sd=10)
    b51 = pm.Normal('b51', 0, sd=10)
    b52 = pm.Normal('b52', 0, sd=10)
    eps = pm.HalfCauchy('eps', 5)
    mu = (a
          + b11*x1 + b12*x1**2
          + b21*x2 + b22*x2**2
          + b31*x3 + b32*x3**2
          + b41*x4 + b42*x4**2
          + b51*x5 + b52*x5**2)
    y_obs = pm.Normal('y_obs', mu=mu, sd=eps, observed=y)
    trace = pm.sample(draws=300, chains=2, cores=2)
The expression for mu is quite large, and I had to define a whopping 10 distributions for all the beta priors.
NUTS takes almost 1.5 hours to sample this simple dataset of 800 rows (maybe because of so many parameters?).
Is there any other way to define the model for a larger feature set, or am I missing something?
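
For reference, this is the kind of vectorized formulation I was hoping exists. It's just a sketch; I'm assuming a single prior with shape= and pm.math.dot against a stacked design matrix behave the way I think they do:

import numpy as np
import pymc3 as pm

# Stack the linear and squared terms into one (800, 10) design matrix.
X = np.column_stack([x1, x1**2, x2, x2**2, x3, x3**2,
                     x4, x4**2, x5, x5**2])

with pm.Model() as vectorized_model:
    a = pm.Normal('a', 0, sd=10)
    # One prior of shape 10 instead of ten separate Normals.
    b = pm.Normal('b', 0, sd=10, shape=X.shape[1])
    eps = pm.HalfCauchy('eps', 5)
    mu = a + pm.math.dot(X, b)
    y_obs = pm.Normal('y_obs', mu=mu, sd=eps, observed=y)
    trace = pm.sample(draws=300, chains=2, cores=2)

Would something like this be the recommended way, and should it also help the NUTS sampling time?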