Optimizer lift issue

Hello,

I’ve developed an MMM with a defensible response decomposition, ROAS, and channel contribution curves. Using sample_posterior_predictive on a 13-week out-of-sample period, I observed a reasonable R-squared and low MAPE.
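For reference, the out-of-sample check looks roughly like this (simplified; X_holdout / y_holdout are just placeholder names for the held-out 13 weeks, and the return shape may differ by pymc-marketing version):

import numpy as np

# Simplified sketch of the holdout check; X_holdout / y_holdout are the
# held-out 13 weeks. Dimension names and return types may differ by version.
pp = mmm.sample_posterior_predictive(
    X=X_holdout,
    extend_idata=False,
    include_last_observations=True,  # carry adstock over from the training window
    original_scale=True,
    var_names=["y"],
    progressbar=False,
)
y_da = pp["y"] if hasattr(pp, "data_vars") else pp  # Dataset vs DataArray
y_hat = y_da.mean(dim=[d for d in y_da.dims if d != "date"]).to_numpy()

mape = np.mean(np.abs((y_holdout - y_hat) / y_holdout))
r2 = 1.0 - np.sum((y_holdout - y_hat) ** 2) / np.sum((y_holdout - np.mean(y_holdout)) ** 2)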

I read through the Budget Allocation documentation and tested it with 20% allocation bounds per channel. This produced an optimized budget allocation response of $22.7M versus $19.1M from the initial budget allocation, for an 18.7% lift.

As a sanity check, I reran the procedure with 0.001% allocation bounds per channel (attempting to keep the budget essentially identical for the initial and optimized allocations), and the response totals were nearly the same as in the 20% optimized trial: an 18.5% lift. This suggests that sample_posterior_predictive and sample_response_distribution produce drastically different response results from the same allocations. I’ve verified that the initial budget allocations for each channel are large enough to avoid the steepest part of the contribution curves, where small spend changes could have an outsized impact.

How can I diagnose what’s causing this discrepancy? Any insights would be greatly appreciated!

Hello,

I need more information to give you a full answer, but sample_response_distribution is just calling sample_posterior_predictive under the hood.

Here is the code:

    def sample_response_distribution(
        self,
        allocation_strategy: DataArray | dict[str, float],
        time_granularity: Literal["daily", "weekly", "monthly", "quarterly", "yearly"],
        num_periods: int,
        noise_level: float,
    ) -> az.InferenceData:
        """Generate synthetic dataset and sample posterior predictive based on allocation.

        Parameters
        ----------
        allocation_strategy : DataArray or dict[str, float]
            The allocation strategy for the channels.
        time_granularity : Literal["daily", "weekly", "monthly", "quarterly", "yearly"]
            The granularity of the time units (e.g., 'daily', 'weekly', 'monthly').
        num_periods : int
            The number of time periods for prediction.
        noise_level : float
            The level of noise to add to the synthetic data.

        Returns
        -------
        az.InferenceData
            The posterior predictive samples based on the synthetic dataset.
        """
        if isinstance(allocation_strategy, dict):
            # For backward compatibility
            allocation_strategy = DataArray(
                pd.Series(allocation_strategy), dims=("channel",)
            )

        synth_dataset = self._create_synth_dataset(
            df=self.X,
            date_column=self.date_column,
            allocation_strategy=allocation_strategy,
            channels=self.channel_columns,
            controls=self.control_columns,
            target_col=self.output_var,
            time_granularity=time_granularity,
            time_length=num_periods,
            lag=self.adstock.l_max,
            noise_level=noise_level,
        )

        constant_data = allocation_strategy.to_dataset(name="allocation")

        return self.sample_posterior_predictive(
            X=synth_dataset,
            extend_idata=False,
            include_last_observations=True,
            original_scale=False,
            var_names=["y", "channel_contributions"],
            progressbar=False,
        ).merge(constant_data)

My sense is the following:

  1. You may be using a high or default noise_level. If you set noise_level to 0.00001, both results should be close. Could you try that?
  2. What time_granularity are you passing? You may be using one that differs from your data’s granularity and, as a consequence, getting a different estimate.

If you really want to check that everything is correct, create a dataset with the private _create_synth_dataset method and see whether sharing the same dataset still gives you different outcomes. It should not.
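Something like this rough sketch would do it (argument names are taken from the method body quoted above; _create_synth_dataset is a private API, so double-check it against your installed version, and allocation_strategy is whatever you pass to sample_response_distribution):

synth = mmm._create_synth_dataset(
    df=mmm.X,
    date_column=mmm.date_column,
    allocation_strategy=allocation_strategy,
    channels=mmm.channel_columns,
    controls=mmm.control_columns,
    target_col=mmm.output_var,
    time_granularity="weekly",
    time_length=5,
    lag=mmm.adstock.l_max,
    noise_level=1e-5,
)

def total_response(dataset):
    # Posterior-predictive total response for a given scenario dataset.
    pp = mmm.sample_posterior_predictive(
        X=dataset,
        extend_idata=False,
        include_last_observations=True,
        original_scale=False,
        var_names=["y"],
        progressbar=False,
    )
    y = pp["y"] if hasattr(pp, "data_vars") else pp
    return float(y.mean())

# Sampling twice on the *same* synthetic dataset should give (almost) the
# same total, up to Monte Carlo noise.
print(total_response(synth), total_response(synth))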

I feel this could just be point 1, mentioned above. You are talking about lifts of 18.5% and 18.7%, which doesn’t sound radically different, and is plausible given the noise.


Hi, I tried your first recommendation, setting noise_level = 0.00001 in the sample_response_distribution call, but it didn’t change the result as I had hoped when the optimized allocation is constrained to the same values as the initial allocation. I am still seeing an “optimized” allocation response of $22.5M versus an initial allocation response of $19M.

My data is at weekly granularity, and I am passing 'weekly' to the time_granularity parameter of sample_response_distribution. Here’s the code I’m using as input to the allocation optimizer:

budget_bounds = optimizer_xarray_builder(
    np.array(budget_bounds_list),
    channel=media_vars,
    bound=["lower", "upper"],
)

allocation_strategy, optimization_result = mmm.optimize_budget(
    budget=time_unit_budget,
    num_periods=5,
    budget_bounds=budget_bounds,
)

response_optimized = mmm.sample_response_distribution(
    allocation_strategy=allocation_strategy,
    time_granularity="weekly",
    num_periods=campaign_period,
    noise_level=0.00001,
)

Here:
- time_unit_budget is the sum of channel spends on a weekly basis
- campaign_period is 5 weeks
- budget_bounds_list is a list of lists with values 0.01% below and above the corresponding channel spends from the initial budget, which I pass into optimizer_xarray_builder (roughly as sketched below)
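Roughly, the bounds are built like this (simplified; initial_budget_dict holds the per-channel weekly spends):

# Simplified: bounds 0.01% below and above each channel's initial weekly spend.
budget_bounds_list = [
    [initial_budget_dict[ch] * (1 - 0.0001), initial_budget_dict[ch] * (1 + 0.0001)]
    for ch in media_vars
]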

I’ve also confirmed that the initial and optimized budget allocations sum to the same value.

Here’s how I’m setting up the initial budget scenario:

last_date = mmm.X["date"].max()

# New dates starting right after the last date in the dataset
n_new = 5
new_dates = pd.date_range(start=last_date, periods=1 + n_new, freq="W-SAT")[1:]

initial_budget_scenario = pd.DataFrame({"date": new_dates})

# Same channel spends as in the initial budget for every future week
for ch in [f"channel_{i}" for i in range(1, 11)]:
    initial_budget_scenario[ch] = initial_budget_dict[ch]

# Zero out the control variables for the scenario
for var in control_vars:
    initial_budget_scenario[var] = 0
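Then, roughly, the initial-scenario response comes from sample_posterior_predictive on that DataFrame (simplified sketch; the exact aggregation and rescaling details are omitted, and dimension names may differ by version):

# Simplified: total expected response over the 5-week initial scenario,
# averaged over the posterior samples.
pp_initial = mmm.sample_posterior_predictive(
    X=initial_budget_scenario,
    extend_idata=False,
    include_last_observations=True,
    original_scale=True,
    var_names=["y"],
    progressbar=False,
)
y_initial = pp_initial["y"] if hasattr(pp_initial, "data_vars") else pp_initial
initial_response_total = float(y_initial.sum(dim="date").mean())  # sum over weeks, mean over samples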

I may also try using _create_synth_dataset to check the functionality, as you suggested.

Thanks for your help.

I see. My recommendation would be to compare the two datasets side by side. Also keep in mind that the posterior predictive call inside sample_response_distribution sets include_last_observations=True; if your regular process sets that flag to False, the responses will differ.
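For example, something along these lines (a sketch reusing the initial_budget_scenario DataFrame from your post; dimension names may differ by version) would show how much that flag matters:

def scenario_total(include_last):
    # Same scenario, only the include_last_observations flag changes.
    pp = mmm.sample_posterior_predictive(
        X=initial_budget_scenario,
        extend_idata=False,
        include_last_observations=include_last,
        original_scale=True,
        var_names=["y"],
        progressbar=False,
    )
    y = pp["y"] if hasattr(pp, "data_vars") else pp
    return float(y.sum(dim="date").mean())

# If these totals differ materially, the adstock carry-over from the last
# l_max observations of the training data explains part of the gap.
print(scenario_total(True), scenario_total(False))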

Try those and then come back :slight_smile: I feel it’s just a difference in inputs rather than an error; you just need to identify the discrepancy.

PS: I can be wrong :smile: If so, I’d be happy if you open an issue in pymc-marketing with a reproducible example based on the PyMC example code with fake data.