Understanding the intuition behind the Gaussian process regression in the PyMC blog post "Modelling changes in marketing effectiveness over time"

cryingrichard · October 12, 2022, 1:46pm

After reading the post "Bayesian Media Mix Models: Modelling changes in marketing effectiveness over time" from PyMC Labs,
I got intrigued and wanted to understand how one would go about this and customize it.
Let's say we assume a data-generating process y_t = exp(gaussprocess_t) * x_t (daily data),
where we want to generalize the Gaussian process per month rather than over the whole dataset. That is, we want to fit the same Gaussian process to the daily data within each month, so the GP input is the day of the month (1 <= day <= 31), and then broadcast it to the corresponding x_t's. We would then have the same Gaussian process every month, fitted to the daily data.
How would one go about this?
Second question: does anyone have a smart example of restricting the functions generated from the Gaussian process to values in the interval [0, 1], so that we could use it as the saturation in a power-function saturation setup?

lucianopaz · October 12, 2022, 7:34pm

Hi, I’m glad you were intrigued by the post. Sorry, but I didn’t understand your first question. Do you want to train a GP with monthly aggregate data and then reuse it somehow for daily data?
If that’s the case, I’m not sure how it is done. I imagine that you can work out a way to rewrite the GP amplitude in terms of the amplitude of the GP on the aggregate data, as long as you assume that the length scale is the same and you know on which day of the month the aggregate is placed.
About the second question, you can pass the GP through a sigmoid link function, similar to what you did with the exponential. That will give you an output on the [0, 1] interval, but setting priors on the GP amplitude and mean will be much harder.
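
To make the sigmoid suggestion concrete, here is a minimal NumPy-only sketch that draws a few functions from a GP prior and squashes them into the unit interval. The squared-exponential kernel, the hyperparameter values (`ls=10.0`, `eta=1.5`) and the helper names `expquad` and `sigmoid` are assumptions made for illustration; they are not taken from the blog post. Inside a PyMC model the same idea would presumably be written with `pm.math.sigmoid` applied to the latent GP variable.

```python
import numpy as np

def expquad(x, ls, eta):
    """Squared-exponential covariance matrix for 1-D inputs (illustrative helper)."""
    d = x[:, None] - x[None, :]
    return eta**2 * np.exp(-0.5 * (d / ls) ** 2)

def sigmoid(z):
    """Logistic function, maps the real line to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
t = np.arange(60, dtype=float)  # 60 daily time points

# Small jitter on the diagonal keeps the Cholesky factorization stable.
K = expquad(t, ls=10.0, eta=1.5) + 1e-6 * np.eye(t.size)
L = np.linalg.cholesky(K)

f = L @ rng.standard_normal((t.size, 5))  # 5 unconstrained GP prior draws
saturation = sigmoid(f)                   # same draws, now bounded in (0, 1)

print(saturation.min(), saturation.max())
```

The difficulty with priors shows up directly here: the larger the amplitude `eta`, the more the transformed draws pile up near 0 and 1, so a prior that looks harmless on the unconstrained scale can be quite informative on the [0, 1] scale.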

cryingrichard · October 12, 2022, 11:03pm

I am really looking forward to seeing the follow-up to that post.
What I meant really was unclear, so I will try to clarify.
Given daily data that spans several months, I would like to capture the day-in-month effects by using GPs as parameters/latent variables. E.g. in a sales = coeff_1 * spend^coeff_2 model, this would be the coeff_2 or coeff_1 parameter, which is now time-varying w.r.t. the day in the month.
I don't really want to fit the GP over the whole dataset, since I want to generalize the day-in-month effect over all months and use it as a forecast for the upcoming month.
Are there any practical examples in PyMC where people have used GPs in such models? I have only seen them used when regressing on time as the sole regressor. Are there any examples of them being used as latent variables/parameters?

As for your answer to the second question: brilliant, makes sense.

Thanks for taking the time.

Kind regards

lucianopaz · October 13, 2022, 7:13am

Oh, in that case you can do something like this:

Take the whole time series and split it into day and year_month.
Factorize day into the unique days (let’s call that array uday) and an indexing array (day_idx). The unique days array will have at most 31 entries, while the indexing array will have as many entries as there are individual observations.
Do the same factorization for the year_month array. Let’s call the indexing array year_month_idx, and the unique year-month pairs X_month.
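
The factorization steps above can be sketched with plain NumPy as follows. The date range is made up for illustration, and encoding the months as 0, 1, 2, ... in a single column is an assumption about how one might feed them to a GP; `np.unique(..., return_inverse=True)` returns the sorted unique values together with the indexing array.

```python
import numpy as np

# Hypothetical daily series: 2022-01-01 through 2022-03-31 (90 days).
dates = np.arange("2022-01-01", "2022-04-01", dtype="datetime64[D]")

# Split each date into its month and its day of the month.
month_start = dates.astype("datetime64[M]")
day = (dates - month_start).astype(int) + 1  # 1..31

# Factorize: unique values plus an index back into them for every observation.
uday, day_idx = np.unique(day, return_inverse=True)
year_month, year_month_idx = np.unique(month_start, return_inverse=True)

# Numeric GP input, one row per unique year-month (assumed encoding: 0, 1, 2, ...).
X_month = np.arange(year_month.size, dtype=float)[:, None]

print(uday.size, day_idx.shape, X_month.shape)
```

Indexing the unique arrays with the index arrays recovers the original columns, for example `uday[day_idx]` equals `day`. `pandas.factorize` does a similar job, but returns the unique values in order of first appearance rather than sorted.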

Then you can create a GP using X_month and a separate RV for the days in the month using uday, index them with the indexing arrays, and sum them together. The pseudocode-ish version of what I’m saying would look like this:

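import pymc as pm

# uday, day_idx, X_month and year_month_idx come from the factorization described above.
# X_month should be a numeric 2D column of shape (n_months, 1) to serve as GP input.
with pm.Model(coords={"uday": uday}):  # register the "uday" dimension used below
    latent = pm.gp.Latent(...)  # Use the cov and mean that you want
    gp_month = latent.prior("gp_month", X_month)  # one latent value per year-month
    day_rv = SomeDistribution("day_rv", ..., dims="uday")  # placeholder distribution, one effect per unique day
    mu = gp_month[year_month_idx] + day_rv[day_idx]  # broadcast both effects to every observation
    # Use mu for whatever else you need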
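
To build some intuition for the indexing in the pseudocode above before fitting anything, here is a NumPy-only prior simulation of the same additive structure. It is not a PyMC model: the kernel, the hyperparameters, the month lengths and the normal stand-in for `day_rv` are all assumptions chosen for illustration, and `x` is a made-up spend series. The last line plugs `mu` into the y_t = exp(...) * x_t process from the first post.

```python
import numpy as np

rng = np.random.default_rng(1)

# Three months of daily data with hypothetical lengths 31, 28 and 31.
month_lengths = [31, 28, 31]
year_month_idx = np.repeat(np.arange(3), month_lengths)
day_idx = np.concatenate([np.arange(n) for n in month_lengths])  # 0-based position in uday
uday = np.arange(1, 32)

# One GP prior draw over the months (squared-exponential kernel, made-up hyperparameters).
X_month = np.arange(3, dtype=float)
d = X_month[:, None] - X_month[None, :]
K = 0.5**2 * np.exp(-0.5 * (d / 2.0) ** 2) + 1e-8 * np.eye(3)
gp_month = np.linalg.cholesky(K) @ rng.standard_normal(3)

# Stand-in for day_rv: one effect per unique day of the month, shared by all months.
day_rv = rng.normal(0.0, 0.3, size=uday.size)

# Broadcast both effects to the 90 daily observations and add them.
mu = gp_month[year_month_idx] + day_rv[day_idx]

x = rng.uniform(1.0, 10.0, size=mu.size)  # made-up daily spend
y = np.exp(mu) * x                        # y_t = exp(mu_t) * x_t

print(mu.shape, y.shape)
```

Because `day_rv` is indexed only by day of the month, the same 31 values are reused in every month, which is what lets the day-in-month effect carry over to an upcoming month. Two observations on the same day of different months differ in `mu` only by the difference of their month effects.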
