Calculating the difference between control and experiment in our warehouse

Hello! We have run an experiment at our warehouse on the rate at which we can fill boxes with goods: we applied a new algorithm to one of our box-filling lanes, and we want to see whether there is a statistical improvement in that rate with the new algorithm. We're three data scientists following the Statistical Rethinking course, and we're trying to apply what we're learning using PyMC.

The experiment
We have two box-filling lanes. For one week, we recorded the time required to fill each box in each lane, using the old algorithm. For the second week, we switched one lane to the new algorithm.

We want to avoid the following two biases:

  • Box-filling rate might be faster/slower on the second week than the first week
  • The people working on one lane might be faster/slower than on the other lane
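This setup is the classic difference-in-differences design: the week-to-week difference on the control lane estimates the pure week effect, and subtracting it from the experiment lane's week-to-week difference removes both biases at once. A toy arithmetic sketch (all the effect sizes here are made-up numbers, not our data):

```python
# Hypothetical group means built from three additive effects.
week_effect = 0.05   # everyone is slower in week 1, on both lanes
lane_effect = 0.02   # the crew on the experiment lane differs a bit
algo_effect = -0.10  # the quantity we actually want to recover

# Mean time_norm per (lane, week) cell.
control = {0: 0.0, 1: week_effect}
experiment = {0: lane_effect, 1: lane_effect + week_effect + algo_effect}

# Difference-in-differences: lane and week effects both cancel.
did = (experiment[1] - experiment[0]) - (control[1] - control[0])
print(did)  # equals algo_effect (up to floating-point noise)
```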

The model
The data consist of over 200,000 filled boxes, each with a 0/1 is_experiment flag, a 0/1 week_nr, and a time_norm: the standardised (subtract mean, divide by std) time required to fill that box.

We’ve tried to construct a hierarchical model that fits the data, calculates the difference between week 1 and week 0 for each lane, and then uses those differences to calculate the difference between experiment and control. We do the latter using a Deterministic “distribution” in order to keep track of it in the trace.

The following solution seems to work, but it works without us really understanding why, and we’re trying to confirm that it is correct. (I realise now that I should also try it on some dummy data where I know the answer beforehand.)
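For the dummy-data idea, something like the following would exercise the model with a known answer (the effect sizes, sample size, and variable names here are assumptions chosen to match the model code below):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical ground truth: a +0.05 week effect on both lanes, plus an
# extra -0.10 on the experiment lane in week 1 (the "algorithm" effect).
n = 10_000
is_experiment_data = rng.integers(0, 2, size=n)
week_nr = rng.integers(0, 2, size=n)
true_mu = 0.05 * week_nr - 0.10 * is_experiment_data * week_nr
picking_times = rng.normal(true_mu, 0.1)
```

Feeding these arrays into the model, the posterior of diff_between_experiment_control should concentrate near -0.10.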

import pymc as pm

with pm.Model(coords={"is_experiment": [0, 1], "week_nr": [0, 1]}) as model:
    # One mean filling time per (lane, week) cell: a 2x2 parameter.
    mu_time = pm.Normal(
        "mu_time", mu=0, sigma=0.2, dims=("is_experiment", "week_nr"),
    )
    # Likelihood: indexing with the per-box 0/1 arrays picks out each
    # observation's cell mean.
    time_norm = pm.Normal(
        "time_norm", mu=mu_time[is_experiment_data, week_nr], sigma=0.1, observed=picking_times,
    )

    # Week 1 minus week 0, separately for each lane ...
    diff_weeks = pm.Deterministic("diff_between_weeks", mu_time[:, 1] - mu_time[:, 0])
    # ... and the difference-in-differences: experiment minus control.
    diff_experiment = pm.Deterministic("diff_between_experiment_control", diff_weeks[1] - diff_weeks[0])
    trace = pm.sample(draws=1000)


There are two main points of confusion:

  1. Why are we using two Normal distributions, when it feels like one should suffice? When we try to use only one, we don’t know how to implement the indexing (mu_time[is_experiment_data, week_nr]). I think the answer is something like “the second Normal is the likelihood used when fitting the data, and the first one is the prior describing the mean and sigma of the parameter itself”, but I don’t have an intuitive sense for this.
  2. We are using mu_time in our Deterministic code. I felt like I should have been using time_norm, but if I do so, I get a shape error, since time_norm is one-dimensional and we would be indexing it with e.g. [:, 1].
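On point 2, mu_time is the right variable: it is the 2×2 (lane, week) parameter whose posterior you want to compare, while time_norm is pinned to the one-dimensional observed data. The Deterministic is just per-draw arithmetic on mu_time, which you could also reproduce after sampling. A numpy sketch with made-up posterior draws (the (4000, 2, 2) array and its cell means are assumptions, not your trace):

```python
import numpy as np

# Fake posterior draws of mu_time, shape (n_draws, is_experiment, week_nr):
# control lane improves +0.05 in week 1, experiment lane -0.05.
rng = np.random.default_rng(1)
mu_time_draws = rng.normal(
    loc=[[0.00, 0.05], [0.00, -0.05]], scale=0.01, size=(4000, 2, 2)
)

# The same arithmetic as the two Deterministic lines, done per draw.
diff_weeks = mu_time_draws[:, :, 1] - mu_time_draws[:, :, 0]  # per lane
diff_experiment = diff_weeks[:, 1] - diff_weeks[:, 0]         # exp - control

print(diff_experiment.mean())  # close to -0.10 by construction
```

Trying the same [:, 1] indexing on a one-dimensional array of observations would fail with exactly the shape error you saw, which is why the Deterministic has to operate on the parameter rather than the data.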

The following is the result of fitting the model, and it seems sensible enough (a slight reduction in box-filling time with the new algorithm).