I’m new to PyMC and having trouble understanding how to structure my inputs and model to fit unpooled, independent regressions using coords and dims.
I’ve mocked up some dummy data in a pandas DataFrame that represents a simplified version of the problem. It has three columns: product_name (string), price (float), and sales (integer).
My goal is to fit independent parameters for each product name (sales ~ price, separately for each product_name). Based on initial research, I believe this is done by defining coords/dims and using them in the priors and likelihood, but I’m having trouble working out how to translate that into code.
I could certainly loop and fit a separate model on each subset of the data (e.g. filter to the “bike” product name and fit a model, then filter to “car” and repeat), but I’m sure the better way is to structure things so that a single model can be run once and the posterior results parsed afterwards. The model (a simple log-log linear regression) is defined as follows, but currently only works for a single product_name:
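    import numpy as np
    import pymc as pm  # on older installs: import pymc3 as pm

    with pm.Model() as model:
        # Priors on slope, intercept, and noise scale (sigma= used consistently)
        m = pm.Normal('m', mu=-1, sigma=1)
        b = pm.Normal('b', mu=0, sigma=1)
        sig = pm.HalfNormal('sig', sigma=1)
        # Log-log linear predictor: log(sales) = m * log(price) + b
        y_hat_log = m * np.log(df.price) + b
        y_observed_log = np.log(df.sales)
        lik = pm.Normal('lik', mu=y_hat_log, observed=y_observed_log, sigma=sig)
        trace = pm.sample()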