Hierarchical Modeling MMM with Geo-Data in PyMC

cetagostini · June 11, 2024, 4:38pm

If I’m following correctly, you want to build a hierarchical model at ("date", "geo") granularity? If so, I’d recommend the following.

First, take advantage of pymc-marketing and use one of its available adstock transformations:

import pymc as pm
from pymc_marketing.mmm.components.adstock import WeibullAdstock

adstock = WeibullAdstock(l_max=10, normalize=True)

You can choose between WeibullAdstock, DelayedAdstock or GeometricAdstock
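
To make it clearer what these transformations do, here is a minimal NumPy sketch of a geometric adstock. It is an illustration only, not pymc-marketing's implementation: the helper name `geometric_adstock_np` and the decay value are made up for the example. Each period's spend carries over to the following `l_max - 1` periods with weight `alpha**lag`, and `normalize=True` rescales the weights so they sum to one.

```python
import numpy as np

def geometric_adstock_np(x, alpha, l_max, normalize=True):
    """Hypothetical helper: carry spend over along axis 0 (the date axis)."""
    x = np.asarray(x, dtype=float)
    weights = alpha ** np.arange(l_max)  # weights for lag 0, 1, ..., l_max - 1
    if normalize:
        weights = weights / weights.sum()
    out = np.zeros_like(x)
    n = x.shape[0]
    for lag, w in enumerate(weights):
        if lag >= n:
            break  # no periods left to receive this lag
        # shift the series by `lag` periods and add its weighted contribution
        out[lag:] += w * x[: n - lag]
    return out

spend = np.array([100.0, 0.0, 0.0, 0.0])
carried = geometric_adstock_np(spend, alpha=0.5, l_max=3, normalize=False)
```

Because the shift happens along axis 0 only, the same helper also works on an array shaped (date, channel, geo).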

Assuming your dataframe has the following structure:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 5 columns):
 #   Column             Non-Null Count  Dtype         
---  ------             --------------  -----         
 0   date               30 non-null     datetime64[ns]
 1   geo                30 non-null     object        
 2   marketing_spend    30 non-null     float64       
 3   sales              30 non-null     float64       
 4   marketing_spend_2  30 non-null     float64       
dtypes: datetime64[ns](1), float64(3), object(1)
memory usage: 1.3+ KB

Here marketing_spend_N is a column with the spend or impressions for a given channel. You can then write your model as follows:

with pm.Model(coords=coordinates) as hierarchical_model:
    x_data = pm.Data(
        "x_data",
        value=X_data.values,  # reshape your dataframe into an array with the shape given by dims
        dims=("date", "channel", "geo")
    )

    y_data = pm.Data(
        "y_data",
        value=y.values,  # reshape your dataframe into an array with the shape given by dims
        dims=("date", "geo")
    )

    intercept = pm.Gamma('intercept', mu=500, sigma=300, dims="geo")
    
    #hyper priors (global mu & sigma)
    lam_prior_sigma = pm.HalfNormal('lam_prior_sigma', sigma=100,)
    lam_prior_mu = pm.Beta('lam_prior_mu', alpha=100, beta=200,)
    # prior distribution -> channels share the same hyper priors
    lam = pm.Gamma('lam', mu=lam_prior_mu, sigma=lam_prior_sigma, dims=("channel", "geo"))
    
    #hyper priors
    k_prior_sigma = pm.HalfNormal('k_prior_sigma', sigma=500,)
    k_prior_mu = pm.Beta('k_prior_mu', alpha=100, beta=200,)
    # prior dist
    k = pm.Gamma('k', mu=k_prior_mu, sigma=k_prior_sigma, dims=("channel", "geo"))
    
    # estimating contribution (only using the transformation to keep it simple)
    contribution = pm.Deterministic(
        name="contribution",
        var=adstock.function(x=x_data, lam=lam, k=k),
        dims=("date","channel", "geo")
    )

    yhat = pm.Deterministic(
        name="yhat", 
        var=(
            intercept + 
            contribution.sum(axis=1)
        ), 
        dims=("date", "geo")
    )

    sigma_likelihood = pm.HalfNormal("sigma_likelihood", sigma=200, dims="geo")
    nu = pm.Gamma(name="nu", alpha=400, beta=200, dims="geo")

    pm.StudentT(
        name="likelihood", 
        mu=yhat, 
        nu=nu, 
        sigma=sigma_likelihood, 
        dims=("date", "geo"), 
        observed=y_data
    )
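
The model above assumes that `coordinates`, `X_data` and `y` already exist. Here is a minimal sketch of one way to build them from the long dataframe, assuming exactly one row per (date, geo) pair and no missing combinations. The toy values and the names `df`, `channels`, `X_array` and `y_array` are assumptions for the example. Since this produces plain NumPy arrays, you would pass them directly (`value=X_array`) rather than calling `.values`.

```python
import numpy as np
import pandas as pd

# Toy long-format data with the same columns as the dataframe above (values are made up)
dates = pd.date_range("2024-01-01", periods=3, freq="D")
geos = ["north", "south"]
df = pd.DataFrame([(d, g) for d in dates for g in geos], columns=["date", "geo"])
rng = np.random.default_rng(0)
df["marketing_spend"] = rng.uniform(0, 100, len(df))
df["marketing_spend_2"] = rng.uniform(0, 100, len(df))
df["sales"] = rng.uniform(500, 1000, len(df))

channels = ["marketing_spend", "marketing_spend_2"]

# Target: wide (date x geo) table, then an array of shape (date, geo)
y_wide = df.pivot(index="date", columns="geo", values="sales")
y_array = y_wide.to_numpy()

# Media: one (date x geo) table per channel, stacked on a new channel axis
X_array = np.stack(
    [
        df.pivot(index="date", columns="geo", values=c)
        .loc[y_wide.index, y_wide.columns]  # keep the same row/column order as the target
        .to_numpy()
        for c in channels
    ],
    axis=1,
)  # shape (date, channel, geo)

coordinates = {
    "date": y_wide.index.to_numpy(),
    "channel": channels,
    "geo": list(y_wide.columns),
}
```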

After defining the model above, you can run pm.sample() inside the model context.

If you need to apply the transformation to some channels but not others, it may be worth splitting the channel coordinate, something like:

coordinates = {
"date":...,
"channel_type_a":...,
"channel_type_b":...,
"geo":...
}

Then you can simply split X_data to match:

    x_data_a = pm.Data(
        "x_data_a",
        value=X_data_a.values,  # reshape your dataframe into an array with the shape given by dims
        dims=("date", "channel_type_a", "geo")
    )

    x_data_b = pm.Data(
        "x_data_b",
        value=X_data_b.values,  # reshape your dataframe into an array with the shape given by dims
        dims=("date", "channel_type_b", "geo")
    )

This way, you can decide which data the transformation is applied to!
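
As a rough NumPy illustration of why the split works: each group can go through its own path, and summing over its channel axis brings both back to a common (date, geo) shape. The `carryover` function is only a stand-in for whichever adstock you pick, and the array sizes and names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_date, n_geo = 6, 2
x_a = rng.uniform(0, 100, size=(n_date, 3, n_geo))  # 3 channels that get the transformation
x_b = rng.uniform(0, 100, size=(n_date, 1, n_geo))  # 1 channel used as-is

def carryover(x, decay=0.5):
    """Stand-in transformation: each period keeps `decay` of the previous period's effect."""
    out = np.empty_like(x)
    out[0] = x[0]
    for t in range(1, x.shape[0]):
        out[t] = x[t] + decay * out[t - 1]
    return out

contribution_a = carryover(x_a)  # transformed, shape (date, channel_type_a, geo)
contribution_b = x_b             # untransformed, shape (date, channel_type_b, geo)

# Summing over the channel axis puts both groups on the same (date, geo) shape
total = contribution_a.sum(axis=1) + contribution_b.sum(axis=1)
```

In the PyMC model, the same pattern would mean summing each group's contribution over axis 1 before adding them to the intercept.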

I recommend reading the blog post from @twiecki about centered and non-centered hierarchies. The implementation shown here is a centered hierarchy.
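
For intuition, here is a small NumPy sketch of the two parameterizations for a normal hierarchy. The values of `mu` and `sigma` are made up, and the model above uses Gamma priors, so this shows the general idea rather than a drop-in change. Both versions describe the same distribution; the non-centered one samples a standard-normal offset and then shifts and scales it.

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 2.0, 0.5  # hypothetical group-level mean and scale
n_draws = 100_000

# Centered: draw the geo-level parameter directly from Normal(mu, sigma)
centered = rng.normal(loc=mu, scale=sigma, size=n_draws)

# Non-centered: draw a standard-normal offset, then shift and scale it
z = rng.normal(size=n_draws)
non_centered = mu + sigma * z
```

The non-centered form tends to sample better when the group-level scale is small or the data per group are sparse, which is the situation the blog post discusses.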

Also, take a look at ZeroSumNormal, which is a great alternative when you are dealing with hierarchical models where the number of parameters grows to the point of over-parameterization.
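
The sum-to-zero idea can be illustrated in NumPy by removing the mean across geos from a set of unconstrained offsets. This only illustrates the constraint, it is not PyMC's implementation, and the sizes and names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(7)
n_geo = 5
raw = rng.normal(scale=1.0, size=(1000, n_geo))   # unconstrained geo offsets, one row per draw
zero_sum = raw - raw.mean(axis=1, keepdims=True)  # remove each draw's mean across geos
```

Because the geo offsets sum to zero, they cannot absorb a shift that belongs to the global intercept, which removes one redundant degree of freedom from the model.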
