Hello @morganstockham
I hope I’m following correctly: you want to create a hierarchical model with ("date", "geo") granularity? If so, I’d recommend the following.
First, take advantage of pymc-marketing and use any of the available adstock transformations:
from pymc_marketing.mmm.components.adstock import WeibullAdstock
adstock = WeibullAdstock(l_max=10, normalize=True)
You can choose between WeibullAdstock, DelayedAdstock, or GeometricAdstock.
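For example, switching to a geometric adstock is just a different import (a sketch; the exact import path and constructor arguments may vary with your pymc-marketing version):
from pymc_marketing.mmm.components.adstock import GeometricAdstock
adstock = GeometricAdstock(l_max=10, normalize=True)  # its function takes a single alpha instead of lam and k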
Assuming your dataframe has the following structure:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 5 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   date               30 non-null     datetime64[ns]
 1   geo                30 non-null     object
 2   marketing_spend    30 non-null     float64
 3   sales              30 non-null     float64
 4   marketing_spend_2  30 non-null     float64
dtypes: datetime64[ns](1), float64(3), object(1)
memory usage: 1.3+ KB
where marketing_spend_N is a column with the spend/impressions of a certain channel. The model below expects a coordinates dict plus arrays X_data with shape (date, channel, geo) and y with shape (date, geo).
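One way to build them, as a minimal sketch assuming the toy dataframe df above, with each (date, geo) pair appearing once (the channel_columns list and the xarray pivot are just one possible approach):

channel_columns = ["marketing_spend", "marketing_spend_2"]

# pivot the long dataframe into (date, channel, geo) and (date, geo) arrays via xarray
ds = df.set_index(["date", "geo"]).to_xarray()
X_data = ds[channel_columns].to_array(dim="channel").transpose("date", "channel", "geo")
y = ds["sales"].transpose("date", "geo")

coordinates = {
    "date": X_data.coords["date"].values,
    "channel": channel_columns,
    "geo": X_data.coords["geo"].values,
}

With that in place, you can write your model as follows: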
with pm.Model(coords=coordinates) as hierarchical_model:
    x_data = pm.Data(
        "x_data",
        value=X_data.values,  # array with the shape specified by the dims
        dims=("date", "channel", "geo"),
    )
    y_data = pm.Data(
        "y_data",
        value=y.values,  # array with the shape specified by the dims
        dims=("date", "geo"),
    )

    intercept = pm.Gamma("intercept", mu=500, sigma=300, dims="geo")

    # hyperpriors (global mu & sigma)
    lam_prior_sigma = pm.HalfNormal("lam_prior_sigma", sigma=100)
    lam_prior_mu = pm.Beta("lam_prior_mu", alpha=100, beta=200)
    # prior distribution -> channels share the same hyperpriors
    lam = pm.Gamma("lam", mu=lam_prior_mu, sigma=lam_prior_sigma, dims=("channel", "geo"))

    # hyperpriors
    k_prior_sigma = pm.HalfNormal("k_prior_sigma", sigma=500)
    k_prior_mu = pm.Beta("k_prior_mu", alpha=100, beta=200)
    # prior distribution
    k = pm.Gamma("k", mu=k_prior_mu, sigma=k_prior_sigma, dims=("channel", "geo"))

    # estimating the contributions (only applying the adstock transformation to keep it simple)
    contribution = pm.Deterministic(
        name="contribution",
        var=adstock.function(x=x_data, lam=lam, k=k),
        dims=("date", "channel", "geo"),
    )

    yhat = pm.Deterministic(
        name="yhat",
        var=intercept + contribution.sum(axis=1),  # sum over the channel dimension
        dims=("date", "geo"),
    )

    sigma_likelihood = pm.HalfNormal("sigma_likelihood", sigma=200, dims="geo")
    nu = pm.Gamma(name="nu", alpha=400, beta=200, dims="geo")

    pm.StudentT(
        name="likelihood",
        mu=yhat,
        nu=nu,
        sigma=sigma_likelihood,
        dims=("date", "geo"),
        observed=y_data,
    )
After building the model, you can run pm.sample.
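For example (a minimal sketch; the sampler settings are just illustrative values):

import arviz as az

with hierarchical_model:
    idata = pm.sample(draws=1000, tune=1000, chains=4, target_accept=0.9)

# quick posterior check on a few of the named variables
az.summary(idata, var_names=["intercept", "lam", "k"])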
If you need to apply the transformation to some channels and not to others, it may be worth modifying the coordinates to something like:
coordinates = {
    "date": ...,
    "channel_type_a": ...,
    "channel_type_b": ...,
    "geo": ...,
}
Then you can simply split X_data accordingly:
x_data_a = pm.Data(
    "x_data_a",
    value=X_data_a.values,  # array with the shape specified by the dims
    dims=("date", "channel_type_a", "geo"),
)
x_data_b = pm.Data(
    "x_data_b",
    value=X_data_b.values,  # array with the shape specified by the dims
    dims=("date", "channel_type_b", "geo"),
)
This way, you can decide which data to apply the transformation to!
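For instance, inside the model you could adstock only the type-a channels and keep a plain linear effect for the rest (a rough sketch; beta_b, lam_a and k_a are hypothetical priors defined analogously to the ones above):

contribution_a = pm.Deterministic(
    "contribution_a",
    adstock.function(x=x_data_a, lam=lam_a, k=k_a),  # adstock applied only here
    dims=("date", "channel_type_a", "geo"),
)
contribution_b = pm.Deterministic(
    "contribution_b",
    x_data_b * beta_b,  # no adstock for the type-b channels, just a linear effect
    dims=("date", "channel_type_b", "geo"),
)
yhat = pm.Deterministic(
    "yhat",
    intercept + contribution_a.sum(axis=1) + contribution_b.sum(axis=1),
    dims=("date", "geo"),
)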
I recommend reading the blog post from @twiecki about centered and non-centered hierarchies; the implementation shown here uses a centered parametrization.
Also, take a look at ZeroSumNormal, which is a great alternative when you are dealing with hierarchical models where the number of parameters grows to the point of over-parametrization.
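As a rough sketch of how that could look for lam (illustrative names and priors, not a drop-in replacement for the Gamma priors above), here is a non-centered parametrization on the log scale combined with ZeroSumNormal offsets:

log_lam_mu = pm.Normal("log_lam_mu", mu=0.0, sigma=1.0)
log_lam_sigma = pm.HalfNormal("log_lam_sigma", sigma=0.5)
lam_offset = pm.ZeroSumNormal("lam_offset", dims=("channel", "geo"))
# exp keeps lam positive; the zero-sum constraint removes the redundant per-channel mean
lam = pm.Deterministic(
    "lam",
    pm.math.exp(log_lam_mu + log_lam_sigma * lam_offset),
    dims=("channel", "geo"),
)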