Media mix models with negative intercept result

Hi everyone,

I recently used the PyMC-Marketing media mix model (MMM) to build a model for budget optimization. Everything works, except that the fitted model has a negative intercept, which confuses me.

As far as I know, I can give the intercept a HalfNormal prior to force it to be positive. But I wonder whether forcing a positive intercept is statistically sound, since an unconstrained model can legitimately return either a positive OR a NEGATIVE intercept.
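
To make the effect of such a prior concrete, here is a minimal sketch that uses only the Python standard library, not PyMC itself. It relies on the standard definition of HalfNormal(sigma) as the absolute value of a Normal(0, sigma) draw; the sigma of 2, the seed, and the sample size are illustrative choices.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable
SIGMA = 2.0     # illustrative prior scale, not a recommendation
N = 10_000

# Normal(0, sigma): roughly half of the prior mass lies below zero.
normal_draws = [random.gauss(0.0, SIGMA) for _ in range(N)]

# HalfNormal(sigma) is |Normal(0, sigma)|, so no prior mass lies below zero.
half_normal_draws = [abs(x) for x in normal_draws]

share_negative_normal = sum(x < 0 for x in normal_draws) / N
share_negative_half = sum(x < 0 for x in half_normal_draws) / N
half_normal_mean = sum(half_normal_draws) / N
theoretical_mean = SIGMA * math.sqrt(2.0 / math.pi)  # E[HalfNormal(sigma)]

print(f"Normal:     share of negative draws = {share_negative_normal:.2f}")
print(f"HalfNormal: share of negative draws = {share_negative_half:.2f}")
print(f"HalfNormal mean: sampled {half_normal_mean:.2f}, theory {theoretical_mean:.2f}")
```

Because the HalfNormal prior puts no mass below zero, the posterior intercept cannot be negative whatever the data say. Anything an unconstrained fit would have assigned to a negative intercept has to be absorbed by other terms in the model.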

Suppose we assume that revenue would still be positive with no media channels and no media spend. Is it then correct to force a positive intercept? If so, are there any assumptions the model needs to meet?
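
One way to see how a negative intercept can arise from all-positive revenue: the intercept is the model's prediction when every other term is zero, and if the data never come close to that point it is an extrapolation. The sketch below is plain ordinary least squares on made-up numbers, not the PyMC-Marketing model (which also applies adstock and saturation); `fit_line`, `spend`, and `revenue` are hypothetical names used only for illustration.

```python
def fit_line(x, y):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up weekly data: spend never drops below 5 and revenue is always positive.
spend = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
revenue = [5.0, 8.0, 11.0, 14.0, 17.0, 20.0]

intercept, slope = fit_line(spend, revenue)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
print(f"smallest observed revenue = {min(revenue):.2f}")
```

Every observed revenue value is positive, yet the fitted intercept is negative, because zero spend lies outside the observed range. A negative intercept is therefore not wrong in itself; it says that the fitted relationship, extended back to zero, passes below zero.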

Thank you in advance for your insight.

Kelvin

Hi Kelvin,

Have you fit the model in the unconstrained domain? What are the results there?
Does setting the positive-only constraint cause issues with sampling?

Are you using controls in your model as well? If so, do you have any transformations applied to them before model fit?

Hi Will,

Thank you for your response.

Basically, I built the model by following the tutorial notebook. I didn't modify the original code much, apart from plugging in my own revenue and cost data.


Have you fit the model in the unconstrained domain? What are the results there?

Here is my model configuration. I used the default_model_config:

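from pymc_marketing.mmm import DelayedSaturatedMMM

# Throwaway instance, created only to read the default model config.
dummy_model = DelayedSaturatedMMM(date_column="", channel_columns="", adstock_max_lag=4)

# Sampler options, as a dict of option name to value.
my_sampler_config = {"progressbar": True}

mmm = DelayedSaturatedMMM(
    model_config=dummy_model.default_model_config,
    sampler_config=my_sampler_config,
    date_column="date_week",
    channel_columns=cost_channel,
    control_columns=event_column_names + ['t'],
    adstock_max_lag=8,
    yearly_seasonality=2,
)

# rng is a random generator defined elsewhere (not shown in this post).
mmm.fit(X=X, y=y, target_accept=0.95, chains=8, nuts_sampler="numpyro", random_seed=rng)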

The model results look good to me, with no divergences. The channel contributions are aligned with the cost proportions, so I would say the results make sense.

By "make sense" I mean the following. The cost proportions are:
cost channel A = 40%
cost channel B = 30%
cost channel C = 20%
cost channel D = 10%

The estimated channel contributions tend to follow similar proportions, e.g.
contribution of channel A = 32%
contribution of channel B = 38%
contribution of channel C = 18%
contribution of channel D = 12%

I also tried the following model_config, but the results were not much different from those with dummy_model.default_model_config. It seems that the priors don't have much influence on the posterior estimates.

model_config={
    "beta_channel": {
        "dist": "LogNormal",
        "kwargs": {
            "mu": 0,
            "sigma": 1,
        },
    },
    "likelihood":{
        "dist": "Normal",
        "kwargs": {
            "sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}
        }
    }
}


Does setting the positive only constraint cause issues with sampling?

I did try setting the intercept prior as follows:

model_config={
    "intercept": {"dist": "HalfNormal", "kwargs": {"sigma": 2}},
    "beta_channel": {
        "dist": "LogNormal",
        "kwargs": {
            "mu": 0,
            "sigma": 1,
        },
    },
    "likelihood":{
        "dist": "Normal",
        "kwargs": {
            "sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}
        }
    }
}
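
A softer alternative to the hard positivity constraint is a Normal prior centred on a positive baseline, which favours positive values but still lets the data pull the intercept below zero. The sketch below only builds the config dict. It assumes the "dist"/"kwargs" schema used above also accepts "Normal" with "mu" and "sigma" for the intercept, which should be checked against the installed PyMC-Marketing version. The values 0.5 and 0.5 are placeholders, not recommendations.

```python
# Config from the post above, with the hard HalfNormal constraint on the intercept.
hard_config = {
    "intercept": {"dist": "HalfNormal", "kwargs": {"sigma": 2}},
    "beta_channel": {"dist": "LogNormal", "kwargs": {"mu": 0, "sigma": 1}},
    "likelihood": {
        "dist": "Normal",
        "kwargs": {"sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}},
    },
}

# Hypothetical soft version: same config, but the intercept prior is a Normal
# centred above zero. mu and sigma are illustrative placeholders.
soft_config = {
    **hard_config,
    "intercept": {"dist": "Normal", "kwargs": {"mu": 0.5, "sigma": 0.5}},
}

print(soft_config["intercept"])
```

Comparing the posterior intercept and the channel contributions across the default, the hard, and the soft prior shows how much of the result is driven by the constraint rather than by the data.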

The model sampled without any divergences. However, the estimated channel contribution proportions changed as follows, e.g.
contribution of channel A = 22%
contribution of channel B = 48%
contribution of channel C = 18%
contribution of channel D = 12%

This doesn't quite make sense to me: the contribution of channel B looks too high and that of channel A too low relative to their costs.
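
The comparison between the two fits can be put side by side with a little arithmetic. The sketch below uses the example percentages quoted in this post and computes, per channel, the contribution share divided by the cost share; `share_ratio` is a hypothetical helper name, and a ratio near 1 means the contribution is in line with the cost.

```python
cost_share = {"A": 40, "B": 30, "C": 20, "D": 10}
contribution_default = {"A": 32, "B": 38, "C": 18, "D": 12}      # default intercept prior
contribution_half_normal = {"A": 22, "B": 48, "C": 18, "D": 12}  # HalfNormal intercept prior


def share_ratio(contribution, cost):
    """Contribution share divided by cost share, per channel."""
    return {channel: contribution[channel] / cost[channel] for channel in cost}


ratio_default = share_ratio(contribution_default, cost_share)
ratio_half_normal = share_ratio(contribution_half_normal, cost_share)

# Percentage points gained or lost by each channel when the prior changes.
shift = {ch: contribution_half_normal[ch] - contribution_default[ch] for ch in cost_share}

for channel in cost_share:
    print(
        f"channel {channel}: ratio {ratio_default[channel]:.2f} -> "
        f"{ratio_half_normal[channel]:.2f}, shift {shift[channel]:+d} points"
    )
```

In these figures only A and B change between the two fits: ten percentage points move from A to B, while C and D stay exactly where they were.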


Are you using controls in your model as well?
ans: I am. The control_columns are as follows:

mmm = DelayedSaturatedMMM(
    # ...
    control_columns=event_column_names + ['t'],
    # ...
)


event_column_names refers to the weeks that contain public holidays, and 't' is built the same way as in the tutorial notebook:

data['t']=range(n)

If so, do you have any transformations applied to them before model fit?
ans: I didn't apply any transformation such as taking log(), but I did divide revenue and costs by a constant to make the amounts smaller.
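
Whether that manual division matters depends on what the model does to the data before fitting. The sketch below assumes the target is rescaled by its maximum absolute value before sampling; this is an assumption about the library's preprocessing and should be verified for the version in use. `max_abs_scale` is a hypothetical stand-in written for this example, and the revenue figures are made up.

```python
import math


def max_abs_scale(values):
    """Divide every value by the largest absolute value in the series."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values]


revenue = [120_000.0, 95_000.0, 150_000.0, 80_000.0]  # made-up weekly revenue
revenue_in_thousands = [v / 1_000 for v in revenue]    # manual division by a constant

scaled_raw = max_abs_scale(revenue)
scaled_thousands = max_abs_scale(revenue_in_thousands)

# The constant cancels out, so both versions give the same scaled series.
same = all(math.isclose(a, b) for a, b in zip(scaled_raw, scaled_thousands))
print("largest scaled value:", max(scaled_raw))
print("identical after scaling:", same)
```

Under that assumption the intercept is expressed as a fraction of peak revenue, so dividing by a constant beforehand changes neither its sign nor its scaled value. It also means a HalfNormal prior with sigma 2 is wide relative to a target whose maximum is 1.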

Forgive me, I am new to Bayesian statistics, so the way I examined the model may not be very rigorous :slight_smile:

Thank you in advance for your insight.

Kelvin

Can anyone share any thoughts on this? :slight_smile: