I am using PyMC to build a marketing mix model (MMM) and have recently run some uplift tests, which provide ground-truth performance data for one channel. I want to incorporate this information into my model by adding a constraint so that the product of the channel spend and the channel's coefficient (a random variable) equals the uplift test results.
My model:
```python
import pymc as pm

# input data
channel_A_spend = [1, 2, 3, 4, 5, 6, 7, 8, 9]
channel_B_spend = [2, 4, 56, 78, 9, 765, 4, 3, 0]
channel_C_spend = [2, 3, 30, 78, 9, 10, 5, 8, 0]
conversions = [1, 20, 3, 10, 5, 15, 7, 18, 9]

# priors data
channel_C_min_conversions = 3
channel_C_max_conversions = 10

# uplift results
channel_A_uplift = [0, 5, 2, 4, 1, 5, 0, 0, 2]

# ask for HalfNormal parameters with 95% of the prior mass between the bounds
channel_C_cp = pm.find_constrained_prior(
    pm.HalfNormal,
    lower=channel_C_min_conversions,
    upper=channel_C_max_conversions,
    mass=0.95,
    init_guess=dict(sigma=1),
)

with pm.Model() as base_model:
    channel_spend_A_data = pm.MutableData(
        name="channel_spend_A_data",
        value=channel_A_spend,
    )
    channel_spend_B_data = pm.MutableData(
        name="channel_spend_B_data",
        value=channel_B_spend,
    )
    channel_spend_C_data = pm.MutableData(
        name="channel_spend_C_data",
        value=channel_C_spend,
    )

    intercept = pm.Normal("intercept", mu=0, sigma=1)
    channel_A_coef = pm.HalfNormal("channel_A_coef", sigma=1)
    channel_B_coef = pm.HalfNormal("channel_B_coef", sigma=1)
    channel_C_coef = pm.HalfNormal("channel_C_coef", **channel_C_cp)

    # set constraint of pm.math.dot(channel_A_coef, channel_spend_A_data) = channel_A_uplift
    constraint = pm.Potential(
        "channel_A_constraint",
        -100
        * pm.math.abs(
            pm.math.dot(
                channel_spend_A_data,
                channel_A_coef,
            )
            - (channel_A_uplift)
        ),
    )

    mu = (
        intercept
        + pm.math.dot(channel_A_coef, channel_spend_A_data)
        + pm.math.dot(channel_B_coef, channel_spend_B_data)
        + pm.math.dot(channel_C_coef, channel_spend_C_data)
    )
    sigma = pm.HalfNormal("sigma", sigma=1)
    pm.Normal(name="response", sigma=sigma, mu=mu, observed=conversions)

    base_model_prior_predictive = pm.sample_prior_predictive()
```
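To make the constraint concrete outside PyMC, here is a small plain-Python sketch of what I believe the Potential term computes, assuming the element-wise penalty is summed into the model's log-probability. `potential_term` and `implied_coefs` are hypothetical names used only for this illustration; the data is copied from the model above.

```python
# Data copied from the model above.
channel_A_spend = [1, 2, 3, 4, 5, 6, 7, 8, 9]
channel_A_uplift = [0, 5, 2, 4, 1, 5, 0, 0, 2]


def potential_term(coef, spend, uplift, weight=100.0):
    """Hypothetical helper: -weight * sum(|spend * coef - uplift|)."""
    return -weight * sum(abs(s * coef - u) for s, u in zip(spend, uplift))


# Coefficient that each observation would need on its own (uplift / spend).
implied_coefs = [u / s for s, u in zip(channel_A_spend, channel_A_uplift)]

# Penalty for a few candidate values of the single scalar coefficient.
penalties = {
    c: potential_term(c, channel_A_spend, channel_A_uplift)
    for c in (0.0, 0.25, 0.5, 1.0)
}

print("implied per-observation coefficients:", implied_coefs)
print("penalty by candidate coefficient:", penalties)
```

The per-observation coefficients range from 0 to 2.5, so a single scalar `channel_A_coef` cannot satisfy the equality at every data point, and the penalty stays large and negative for any value.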
Questions:
- I have tried implementing the constraint with pm.Potential, but I am not sure this is an appropriate way to set such a constraint, especially because model performance degrades. Also, in the examples I have seen, pm.Potential was applied only to random variables, not to the product of data and random variables.
- I also noticed that if I remove the constrained prior from channel_C, model performance improves slightly. How could this be related?
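As a rough sanity check on the constrained prior, the sketch below computes how much HalfNormal mass can fall between the bounds 3 and 10 for a grid of sigma values. It assumes the standard HalfNormal CDF, erf(x / (sigma * sqrt(2))); `halfnormal_mass` is a hypothetical helper, not part of PyMC, and the grid of sigma values is arbitrary.

```python
import math

lower, upper = 3.0, 10.0  # channel_C_min_conversions, channel_C_max_conversions


def halfnormal_mass(sigma, lo, hi):
    """Probability that a HalfNormal(sigma) variable falls in [lo, hi]."""

    def cdf(x):
        return math.erf(x / (sigma * math.sqrt(2.0)))

    return cdf(hi) - cdf(lo)


# Arbitrary grid of sigma values from 0.5 to about 20.4.
sigmas = [0.5 + 0.1 * i for i in range(200)]
masses = [halfnormal_mass(s, lower, upper) for s in sigmas]

best_mass = max(masses)
best_sigma = sigmas[masses.index(best_mass)]

print(f"largest mass in [{lower}, {upper}]: {best_mass:.3f} at sigma={best_sigma:.1f}")
```

On this grid the mass between the bounds never gets close to the requested 0.95, which may be part of why the channel_C prior behaves differently from what I intended.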
Thanks in advance for any help.