Hello @morganstockham
Hope, I’m following correctly but you want to create a hierarchical model with a ("date","geo")
granularity? If so. I’ll recommend you the following.
First, take advantage from pymc-marketing
and implement any of the available adstocks
from pymc_marketing.mmm.components.adstock import WeibullAdstock
adstock = WeibullAdstock(l_max=10, normalize=True)
You can choose between WeibullAdstock, DelayedAdstock or GeometricAdstock
Assuming your dataframe has the following structure:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 date 30 non-null datetime64[ns]
1 geo 30 non-null object
2 marketing_spend 30 non-null float64
3 sales 30 non-null float64
4 marketing_spend_2 30 non-null float64
dtypes: datetime64[ns](1), float64(3), object(1)
memory usage: 1.3+ KB
Where marketing_spend_N
is a column with the spend/impression of certain channel. Then you can write your model as follow:
with pm.Model(coords=coordinates) as hierarchical_model:
x_data = pm.Data(
"x_data",
value=X_data.values, #transform your dataframe into an array with the shape specify in the dims
dims=("date", "channel", "geo")
)
y_data = pm.Data(
"y_data",
value=y.values, #transform your dataframe into an array with the shape specify in the dims
dims=("date", "geo")
)
intercept = pm.Gamma('intercept', mu=500, sigma=300, dims="geo")
#hyper priors (global mu & sigma)
lam_prior_sigma = pm.HalfNormal('lam_prior_sigma', sigma=100,)
lam_prior_mu = pm.Beta('lam_prior_mu', alpha=100, beta=200,)
# prior distribution -> channels share the same hyper priors
lam = pm.Gamma('lam', mu=lam_prior_mu, sigma=lam_prior_sigma, dims=("channel", "geo"))
#hyper priors
k_prior_sigma = pm.HalfNormal('k_prior_sigma', sigma=500,)
k_prior_mu = pm.Beta('k_prior_mu', alpha=100, beta=200,)
# prior dist
k = pm.Gamma('k', mu=k_prior_mu, sigma=k_prior_sigma, dims=("channel", "geo"))
# estimating contribution (only using the transformation to keep it simple)
contribution = pm.Deterministic(
name="contribution",
var=adstock.function(x=x_data, lam=lam, k=k),
dims=("date","channel", "geo")
)
yhat = pm.Deterministic(
name="yhat",
var=(
intercept +
contribution.sum(axis=1)
),
dims=("date", "geo")
)
sigma_likelihood = pm.HalfNormal("sigma_likelihood", sigma=200, dims="geo")
nu = pm.Gamma(name="nu", alpha=400, beta=200, dims="geo")
pm.StudentT(
name="likelihood",
mu=yhat,
nu=nu,
sigma=sigma_likelihood,
dims=("date", "geo"),
observed=y_data
)
After applying the following model, you can run pm.sample
If you need to apply transformation for some channels and not others then maybe worth to modify the coordinates to be something like:
coordinates = {
"date":...,
"channel_type_a":...,
"channel_type_b":...,
"geo":...
}
Then you can simply modify the shape of the X_data
.
x_data_a = pm.Data(
"x_data_a",
value=X_data_a.values, #transform your dataframe into an array with the shape specify in the dims
dims=("date", "channel_type_a", "geo")
)
x_data_b = pm.Data(
"x_data_b",
value=X_data_b.values, #transform your dataframe into an array with the shape specify in the dims
dims=("date", "channel_type_b", "geo")
)
Using this way, you can decide to what data apply the transformation!
I recommend you to read the block from @twiecki about centered and non-centered hierarchies. The implementation expose here is a centered hierarchy.
As well, take a look to ZeroSumNormal which is a great alternative when you are leading with hierarchical model, where the number of parameters increase to a point of over-parametrization.