How to set up sigma for multiple batches of data?

Hi,

I have 6 batches of data, each batch contains concentrations of 6 species at 11 time points. My model currently is based on the assumption that the likelihood for each species is normally distributed around the predicted value (the ODE solution) with a different sigma for each species but the same across all time points per species.

sigmas = pm.HalfNormal("sigmas", sigma=0.1, shape=(6,1)) 
...
#likelihood
pm.Normal("C_obs", mu=ode_solution, sigma=sigmas, observed=observed_exp)

where my ode_solution and observed_exp are both [6,66] arrays. batch 1 is [6,0:11], batch 2 is the [6, 12:22]… and so on. sigmas has the shape=(6,1).

The code above works. Now, I want to create a model with a different likelihood. I want to set the sigma to be different for each species and also different across different time points so I should have 66 sigmas. How do I do this? Is it as simple as setting the sigmas shape to (6,11)? Does it involve separating the likelihood for each batch? If so, how do I do it?

Any help is greatly appreciated. Thanks

1 Like

The easiest is to reshape the data/prediction to have 3d shape batch x species x time and the you can define sigma per species and broadcast across batch/time.

Alternatively you can try to use pt.tile or pt.repeat to replicate along the common dimension

2 Likes

Hi! Thank you for the quick reply.

I am trying out the first option. I have reshaped my data/prediction with the shape [6,6,11] for 6 batches, 6 species and 11 timepoints.

Now, I don’t know exactly how to define sigma.

I have tried defining the shape to (6,11):

sigmas = pm.HalfNormal("sigmas", sigma=0.1, shape=(6,11))
    
# Likelihood
pm.Normal("Y_obs", mu=ode_solution, sigma=sigmas, observed=observed_exp_3d)

and I got a “Segmentation fault” error.

If I understand your advice, should I define sigma per species, say, sigmas1 for species1 with a shape of (1,11) and set up my likelihood as

sigmas1 = pm.HalfNormal("sigmas1", sigma=0.1, shape=(1,11))
sigmas2 = pm.HalfNormal("sigmas2", sigma=0.1, shape=(1,11))
sigmas3 = pm.HalfNormal("sigmas3", sigma=0.1, shape=(1,11))
sigmas4 = pm.HalfNormal("sigmas4", sigma=0.1, shape=(1,11))
sigmas5 = pm.HalfNormal("sigmas5", sigma=0.1, shape=(1,11))
sigmas6 = pm.HalfNormal("sigmas6", sigma=0.1, shape=(1,11))
        
# Likelihood
pm.Normal("Y_obs", mu=ode_solution, sigma=[sigmas1,sigmas2,sigmas3,sigmas4,sigmas5,sigmas6], observed=observed_exp_3d)

I tried this but it also doesn’t seem to work. I think the error has to do with my sigma definition.

I want the sigma to be the same across all batches per species and per time point, so I should have a total sigmas of 6 species x 11 timepoints.

Thanks.

Segmentation error is wild. Can you provide a fully reproducible snippet (as simplified as possible) to play with?

oh. I figured out the error was in the line:

@as_op(itypes=[pt.dvector], otypes=[pt.dmatrix])

I used otypes=[pt.dtensor3] and it seems to be working now. I’m using the shape=(6,11) for sigmas.

Thank you very much.

1 Like