Add uncertainty about receiving treatment in linear regression, random variables without posterior?

Hey :wave:,

I am pretty new to pymc and think I am missing something obvious. I have an experiment to measure the impact of push notifications on completing a task. The treatment is a binary 1 to receive notifications and 0 to not.

I can get the treatment effect with the slope form a simple linear regression:

Normal('slope' ....)
Normal('intercept' ....)
Normal('y', slope * data.treatment + intercept, observed=data.y)

In reality though about 60% of users have notification turned off, so even if they are assigned to the notification group they do not get any. I was thinking prob programming could be really cool here if I modeled the uncertainty with a Bernoulli distribution like:

prob_of_notifications = Normal('prob_of_notifications', .6, .1)
b = Bernoulli('b', p=prob_of_notifications, size=len(data))
real_treatment = Deterministic('real_treatment', pm.math.switch(data.treatment, b, data.treatement))
pm.Normal('y', slope * real_treatment + intercept, observed=data.y)

When I do this though its becomes way too slow. I tried replacing with at.random.normal etc for those but would not let me do that. Does this type of model make any sense and is there any way to make it efficient if I don’t care about generating a posterior for the real_treatment?

(also is this an appropriate place to ask this type of question? sorry new here!)


I think your idea makes sense in principle. You can make it faster by thinking through the implications of your model a bit.

You want to model the treatment effect for individuals who are 1) in the treatment group, and 2) really have their notifications on. So the variable real_treatment can be written with the & operator: real_treatment = data.treatment & b, which should be faster than using a switch.

You are also currently drawing values of b for individuals in both the control and treatment group. If you’re not able to model prob_of_notifications using some other available data (does anything about a person’s behavior/demographics give insight into whether he turns on notifications?), you might be able to save some cycles by drawing only data.treatment.sum() values for b, then concatenating it with a vector a zeros for the controls (sort your data by treatment in this case).

You should also use the logit_p parameter of Bernoulli rather than p, since your prob_of_notifications isn’t restricted to the unit interval.


Thank you for the quick response! Good to hear it makes some sense and those changes helped speed it up.