Implementing Geo-level Adstock Function

Hi PyMC community,

I’m implementing a Bayesian Marketing Mix Model (MMM) in PyMC based on this tutorial and drawing inspiration from the pymc-marketing framework. I’m not using the framework directly because I need my model to handle geo-level data.

Currently, I’m adapting the geometric adstock function to geo-level data and estimating alpha (assumed equal across all groups) with a PyMC model. The goal is to apply the geometric adstock transformation group-wise, based on geographical identifiers stored in MutableData, since I also need to make out-of-sample predictions.

I can’t get it to work, and I’m running into issues while debugging inside scan. I’m not sure whether the line group_x = ptt.set_subtensor(x[1], group)[0] actually selects only the x values whose group id matches the current element of the sequence. So my first question is: any tips on how to check the value of group_x?
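
(Outside of scan I can at least evaluate the indexing logic on toy inputs, something like the sketch below with made-up values, but I don’t know how to inspect group_x from inside the scan step itself.)

import numpy as np
import pytensor.tensor as ptt

# toy inputs, just to poke at the indexing logic outside of scan
x_vals = np.array([0.4, 0.4, 0.4, 0.5, 0.5, 0.5])
group_vals = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
x_per_group = ptt.as_tensor_variable(np.stack([x_vals, group_vals]))

group_x = ptt.set_subtensor(x_per_group[1], 0)[0]
print(group_x.eval())  # is this really only the values for group 0?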

Then my second, larger question. I previously wrote another function that produced the right results using for loops, but alpha didn’t seem to be sampled: it was stuck at its starting value. On top of that, that implementation broke my out-of-sample prediction function with shape mismatches, even though I use set_data to update all MutableData objects. I’m now focusing on getting the adstock function itself to work, but any other suggestions for tackling the problem are very welcome, since these issues persist even with this scan-based version.

This is my first post on this Discourse; if you need any additional information to provide an answer, please let me know.

Adstock code:

import pytensor as pt
import pytensor.tensor as ptt

def geometric_adstock(group, x, alpha, l_max):
    group_x = ptt.set_subtensor(x[1], group)[0]

    l_max = l_max.eval()

    cycles = [
        ptt.concatenate(
            [ptt.zeros(i), group_x[: group_x.shape[0] - i]]
        )
        for i in range(l_max)
    ]
    x_cycle = ptt.stack(cycles)
    w = ptt.as_tensor_variable([ptt.power(alpha, i) for i in range(l_max)])
    return ptt.dot(w, x_cycle)

def apply_groupwise_adstock(x, groups, alpha, l_max: int = 3):
    # Unique geo identifiers
    unique_groups = ptt.unique(groups.eval())

    # Stack x values and geo ids together
    x_per_group = ptt.stack([x, groups])

    outputs, _ = pt.scan(fn=geometric_adstock,
                         sequences=[unique_groups],
                         non_sequences=[x_per_group, alpha, l_max])

    return outputs[-1]

Model definition:

    def _model_definition(self, x, y):
        with pm.Model() as self.model:
            self.model.add_coord(constants.N_OBS_NAME, np.arange(len(x)), mutable=True)

            group_idx = pm.MutableData(
                name="province",
                value=self.label_encoder.fit_transform(x["province"]),
                dims="n_obs",
            )

            response_mean = []

            y = pm.MutableData(
                name=constants.Y_NAME,
                value=y,
                dims=constants.N_OBS_NAME,
            )

            for feature_name, prior_setting in self.prior_settings.items():
                x_feature = pm.MutableData(
                    name=feature_name,
                    value=x[feature_name],
                    dims=constants.N_OBS_NAME,
                )
                beta_feature = unpack_bayesian_settings(prior_setting)
                alpha_feature = pm.Beta(f"alpha_{feature_name}", alpha = 1, beta = 1)
                x_feature_adstock = apply_groupwise_adstock(x[feature_name], group_idx, alpha_feature)
                response_mean.append(pm.math.dot(x_feature_adstock.eval(), beta_feature))

            intercept = unpack_bayesian_settings(self.intercept_settings)
            response_mean.append(intercept)

            sigma = unpack_bayesian_settings(self.sigma_settings)

            pm.Normal(
                name=constants.LIKELIHOOD_NAME,
                mu=sum(response_mean),
                sigma=sigma,
                observed=y,
            )

Out of sample predictions:

    def predict(self, x: np.ndarray) -> np.ndarray:
        """
        Predict function used outside this class.

        Parameters:
        ---------------------------------
        x: np.ndarray with the data to be predicted, the data has a
        shape of (n observations, n features)

        Returns:
        __________________________________
        np.array with predictions.
        """
        with self.model:

            pm.set_data(
                {"province": self.label_encoder.fit_transform(x["province"])},
                coords={constants.N_OBS_NAME: np.arange(len(x))},
            )

            for feature_name in self.prior_settings.keys():
                pm.set_data(
                    {feature_name: x[feature_name]},
                    coords={constants.N_OBS_NAME: np.arange(len(x))},
                )

            pm.set_data(
                {constants.Y_NAME: np.zeros(x.shape[0])},
                coords={constants.N_OBS_NAME: np.arange(len(x))},
            )
            posterior_predictive = pm.sample_posterior_predictive(
                self.trace, predictions=True
            )

        predictions_per_chain = posterior_predictive.predictions[
            constants.LIKELIHOOD_NAME
        ].T.to_numpy()
        predictions = predictions_per_chain.mean(axis=1).mean(axis=1)
        return predictions

Example data:

x = pd.DataFrame(
    {
        "date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05"] * 2,
        "province": ["group1"] * 5 + ["group2"] * 5,
        "tv_grp": [0.4] * 10,
    }
)
y = pd.DataFrame({"y": [1.5] * 10})
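
For reference, with a placeholder alpha of 0.5 and l_max = 3, the group-wise adstock I would expect for each province’s tv_grp series is:

# t = 0:  0.4                               -> 0.4
# t = 1:  0.4 + 0.5 * 0.4                   -> 0.6
# t >= 2: 0.4 + 0.5 * 0.4 + 0.5**2 * 0.4    -> 0.7
expected_per_province = [0.4, 0.6, 0.7, 0.7, 0.7]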

Can you share the simple Python loop code that the scan should be equivalent to?

Otherwise, the first issue I notice is the use of .eval(). You shouldn’t do that, because it breaks the graph: the result shows up as a fixed constant. That could explain why some of your variables weren’t updating.
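
For example, roughly (just a toy sketch, not your model):

import pymc as pm

with pm.Model():
    alpha = pm.Beta("alpha", alpha=1, beta=1)
    baked_in = alpha.eval()  # a single plain number; the sampler can no longer update it
    symbolic = alpha ** 2    # still symbolic, stays tied to alpha during sampling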

Also, you may already have seen this, but if not, the scan documentation could be useful: scan – Looping in PyTensor — PyTensor dev documentation
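
For instance, a minimal loop-to-scan translation (a recursive carryover, not your exact windowed adstock) would look roughly like this:

import numpy as np
import pytensor
import pytensor.tensor as ptt

x = ptt.vector("x")
alpha = ptt.scalar("alpha")

def step(x_t, carry, alpha):
    # Python-loop equivalent: carry = x[t] + alpha * carry
    return x_t + alpha * carry

carried, _ = pytensor.scan(
    fn=step,
    sequences=[x],
    outputs_info=[ptt.zeros(())],
    non_sequences=[alpha],
)

f = pytensor.function([x, alpha], carried)
print(f(np.array([1.0, 0.0, 0.0, 2.0]), 0.5))  # [1.0, 0.5, 0.25, 2.125]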

Ah, I’d read that .eval() was bad practice, but not that it would actually break anything. I’ll try to fix that. This was my previous attempt; I’m also using .eval() here, so that might explain why it didn’t work either.

import pandas as pd
import pytensor.tensor as pt

class GeometricAdstock:
    def __init__(self):
        self.name = self.__class__.__name__

    def transform(self, x, date_column, grouped_column, alpha: float = 0.0, l_max: int = 20):
        original_df = pd.DataFrame(
            {
                'date': date_column.eval(),
                'group': grouped_column.eval(),
                'feature': x.eval()
            }
        )

        sorted_df = original_df.sort_values(by=['group', 'date'])

        w = pt.as_tensor_variable([pt.power(alpha, i) for i in range(l_max)])

        groups = sorted_df['group'].unique()

        group_tensors = []  # List to hold tensors for each group

        for g in groups:
            group_df = sorted_df[sorted_df['group'] == g]  # Filter df by group
            group_tensor = []  # List to hold tensors for current group

            for i in range(l_max):
                features = group_df['feature'].iloc[:len(group_df) - i].values

                fill_row = pt.concatenate([pt.zeros(i), features])  # Concatenate zeros and features
                group_tensor.append(fill_row)

            group_tensors.append(group_tensor)

        result_tensor = pt.concatenate(pt.concatenate([group_tensors], axis=1), axis=1)

        sorted_df['adstock'] = pt.dot(w, result_tensor).eval()

        adstocked_df = sorted_df.sort_index()

        return pt.as_tensor_variable(adstocked_df['adstock'])
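
And for completeness, the plain NumPy/pandas loop that both implementations are meant to reproduce is roughly this (a sketch with placeholder alpha and l_max values):

import numpy as np
import pandas as pd

def geometric_adstock_np(x, alpha, l_max):
    # adstock[t] = sum_{i=0}^{l_max - 1} alpha**i * x[t - i], applied per group
    out = np.zeros(len(x))
    for i in range(l_max):
        out[i:] += alpha ** i * x[: len(x) - i]
    return out

df = pd.DataFrame(
    {
        "province": ["group1"] * 5 + ["group2"] * 5,
        "tv_grp": [0.4] * 10,
    }
)
df["adstock"] = df.groupby("province")["tv_grp"].transform(
    lambda s: geometric_adstock_np(s.to_numpy(), alpha=0.5, l_max=3)
)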

Thanks! Sticking to tensors and not using .eval() at least made sampling behave as expected. However, I can’t quite get the function itself to work properly now. Do you recommend starting a new thread for this, or not?

You can stay in this thread since it’s still the same problem, just partially solved 🙂