Can I include lift test start and end dates when building PyMC-Marketing priors?

Hi there,

I am trying to add my lift test results into my MMM priors. I recently used the add_lift_test function posted here. My question is: is there a way I can state the start and end date of a lift test for a specific channel? The reason is that a lift test run 6 months ago may generate very different results compared to a more recent period, and this may help with model saturation.

In addition, I also found the resource below, but I'm not sure whether it means I need to write out all the dates covered by a lift test for a certain channel. Some of our channels have experiments that span a couple of months, so I want to see if there's a better way to incorporate this information.
https://www.pymc-marketing.io/en/0.6.0/api/generated/pymc_marketing.mmm.lift_test.add_lift_measurements_to_likelihood.html

Thank you,
Jennifer

Hello Jennifer,

Our model does not incorporate time as a variable in adstock/saturation, so identifying the specific time of the test is unnecessary. The model aims to find the average impact of the channel over the entire training period. Currently, there is no temporal component that adjusts the contribution value based on time.

Furthermore, you don't need to list all the dates. Regardless of whether an experiment lasts 3, 5, or 10 months, each row should represent the contribution values averaged to the same time unit as the model.

For example, suppose your model is trained on 179 weeks of weekly data and you have 3 tests conducted at different times:

First test:

  • Duration: 12 weeks
  • Total expense: $1,000
  • Expense change (delta X): $350 more than the previous 12 weeks
  • Contribution increase: 30 units

Second test:

  • Duration: 4 weeks
  • Total expense: $0
  • Expense change (delta X): $750 less than the previous 4 weeks
  • Contribution change: -40 units

Third test:

  • Duration: 2 weeks
  • Total expense: $50
  • Expense change (delta X): $50 more than the previous 2 weeks
  • Contribution increase: 10 units

Let’s analyze these numbers:

For the first test, a total expense of $1,000 with a delta of $350 means the initial 12 weeks had a total spend of $650. During the subsequent 12 weeks, the spend was $1,000. To find the weekly spend, divide by 12:

  • Initial weekly spend: $650 / 12 = $54
  • Post-intervention weekly spend: $1,000 / 12 = $83

The weekly delta X value is $29 (83 - 54), and the weekly contribution increase is 30 / 12 = 2.5 units. Therefore, we can inform the model that an experiment with a weekly spend increase of 29 units resulted in a 2.5 unit increase in contribution.
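
If it helps, here is the same arithmetic as a small Python sketch (the numbers are just the ones from the first test above; rounding to whole dollars mirrors the example):

```python
# First test: a 12-week experiment converted to the model's weekly time unit
n_weeks = 12
total_spend_during_test = 1_000   # total expense over the 12 test weeks
delta_total_spend = 350           # spend change vs. the previous 12 weeks
total_lift = 30                   # contribution increase over the 12 test weeks

x = total_spend_during_test / n_weeks   # ~83: weekly spend during the test
delta_x = delta_total_spend / n_weeks   # ~29: weekly spend change
delta_y = total_lift / n_weeks          # 2.5: weekly contribution change

print(round(x), round(delta_x), delta_y)  # 83 29 2.5
```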

Here’s how this data should be presented:

channel     x    delta_x   delta_y   sigma
Channel 1   83   29        2.5       0.05

If you repeat this for the other two tests, you will end up with a dataset similar to the example in PyMC-Marketing, and you will be able to fit your model, adding a likelihood over these results to calibrate it.
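
For reference, a rough sketch of what that calibration dataset and the calibration call could look like, applying the same per-week averaging to the second test (0 / 4 = 0, -750 / 4 = -187.5, -40 / 4 = -10) and the third test (50 / 2 = 25, 50 / 2 = 25, 10 / 2 = 5). The channel labels and sigma values are placeholders, and the add_lift_test_measurements call follows the pymc-marketing lift-test docs linked above, so double-check the exact API for your version:

```python
import pandas as pd

# One row per experiment, with everything averaged to the model's weekly unit.
# Channel names must match the channel columns used to build your MMM; the
# sigma values below are placeholders for your own uncertainty estimates.
df_lift_test = pd.DataFrame(
    {
        "channel": ["channel_1", "channel_1", "channel_2"],  # placeholder names
        "x":       [83.0,    0.0, 25.0],   # weekly spend during each test
        "delta_x": [29.0, -187.5, 25.0],   # weekly spend change vs. pre-period
        "delta_y": [2.5,   -10.0,  5.0],   # weekly contribution change
        "sigma":   [0.05,   0.05, 0.05],   # uncertainty around delta_y
    }
)

# With an already-built pymc-marketing MMM named `mmm`, the tests are added as
# extra likelihood terms before fitting (see the calibration docs linked above):
# mmm.add_lift_test_measurements(df_lift_test)
# mmm.fit(X, y)
```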

Let me know if it's clear!


Hey, I came across this, and it really helps – thanks!

I’m still curious though: why doesn’t time matter in the model? Like, if I sell sunglasses and run a lift test in December vs. August, the ad effect would probably change because of seasonality, no? How does the model handle that?

Would love to hear your thoughts! :blush:

Hey @A_Bell

As mentioned before, the base PyMC model does not incorporate time as a variable in adstock/saturation, so identifying the specific time of the test is unnecessary to calibrate the non-linear transformations. Even if you select time-varying media, the design doesn't require specifying the time of the experiments.

Currently, the non-linear transformations always estimate the average contribution. This means that if the actual contribution varies over time, the base models without varying coefficients estimate the mean over time. Like in the image below, the base model will estimate the blue line.

This means that the curve you see is the average response curve; the true curve at each point in time can be different.

As you can see, time is not a dimension in the curve space. Each experiment is added as a point in this space, and the parameters that generate the curve must be adjusted so that the average curve matches as many points as possible. A single experiment will surely shift the average curve towards one side more than the other, but after adding different experiments, even considering the temporal variation, you should find the average contribution.

The process will look like the following:

As the example should show, after several experiments you should find the average contribution correctly. The same applies when you have the time-varying media adjustment; the only difference is that when you calibrate the model after finding the mean contribution, it will try to adjust the time-varying HSGP to match your experiment. Basically, the calibration process adjusts the mean contribution and its deviation at time T, given by the HSGP.

This works because the HSGP is a multiplier (with mean 1 across time) applied over the estimated mean contribution.
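
As a toy illustration of that multiplier idea (plain NumPy, not the pymc-marketing internals; the saturation form, the beta value, and the sine-shaped multiplier standing in for the HSGP are all made up for illustration):

```python
import numpy as np

def logistic_saturation(x, lam=2.0):
    # a common saturating transform, bounded between 0 and 1 for x >= 0
    return (1 - np.exp(-lam * x)) / (1 + np.exp(-lam * x))

rng = np.random.default_rng(0)
n_weeks = 52
spend = rng.uniform(0, 1, size=n_weeks)

beta = 5.0                                      # mean channel effect
mean_contribution = beta * logistic_saturation(spend)

# Hypothetical smooth multiplier standing in for the HSGP: mean 1 across time,
# so scaling the mean curve by it keeps the time-averaged contribution close
# to what the average curve alone would give.
multiplier = 1 + 0.3 * np.sin(2 * np.pi * np.arange(n_weeks) / n_weeks)
time_varying_contribution = mean_contribution * multiplier

print(multiplier.mean())                                            # ~1.0
print(mean_contribution.mean(), time_varying_contribution.mean())   # close values
```

In the actual model the multiplier is learned from the data rather than fixed, but the idea is the same: deviations at time T scale the average curve without changing what it is on average.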

I hope this adds clarity!


Got it, thanks for the detailed explanation @cetagostini ! The concept of adjusting the mean contribution over time while calibrating with experiments makes perfect sense now. Appreciate it!