Different distribution families for different channels / controls

Hi!

I have been playing around with PyMC Marketing to build an MMM. The predefined functionality, the plots etc, are very useful!

I know the model_config allows passing different parameters to adjust the priors for the different channels. I wonder, however, whether it is also possible to select different distribution families for the different channels, say, a HalfNormal for one channel and a Normal for another.

The same question applies to the control variables, perhaps even more strongly, since in many cases I expect, if anything, a positive influence from one control and a negative one from another.

To illustrate what I mean, here’s how I would expect that part of the model config to be structured, if it were possible:

"beta_channel": {
    "dist": ["HalfNormal", "LogNormal"],
    "kwargs": [
        {"sigma": [2, 1]},
        {"mu": [2, 1], "sigma": [2, 4]},
    ],
},

This does not work for the proximate reason that _get_distribution requires dist to contain a string, not a list. And of course for many more downstream reasons.

So I thought that I might be able to customize the build_model method, but at first and second glance it seems very difficult to support different types of distributions, because “channel” and “control” are used as dims to create many random variables in one go, and my initial impression is that the rest of the code strongly depends on that structure. Being new to PyMC, I suppose that since dims is an argument on the distribution itself, there is no easy way of constructing the same collection of random variables with each RV drawn from a different type of distribution, such that the object “looks the same” to the downstream code and I only need to change how it is constructed. Or maybe there is, given that it’s a TensorVariable?

I am now torn between using pymc-marketing and customizing both the model class and all the code that depends on it, or building my own model right away with PyMC and replicating all the useful functionality in the utility methods with custom code independent of pymc-marketing…

Before I choose either route: Any advice on this? Am I missing something obvious? Any hints on the feasibility of the different options are also highly appreciated!

Best Regards
Jonas

Two ideas:

You can achieve a lot with TruncatedNormal by playing with lower and upper. If you set lower=[-np.inf, 0] and upper=[0, np.inf], you get a negative prior for the first channel and a positive one for the second.

Otherwise, you can try stacking two variables and using the coords for that Deterministic:

# inside the model context, where "channel" is a coord of length 2
x = pm.Normal("x")
y = pm.LogNormal("y")
beta = pm.Deterministic("beta", pm.math.stack([x, y]), dims="channel")

Hopefully no part of the model class cares whether beta is a Deterministic or a pure RV.

Hi Ricardo!

Thank you very much, great suggestions! I will try with TruncatedNormal first as a quick fix.

Best
Jonas