Beta and Dirichlet regression for continuous proportion data

Hello,

I was hoping to get some advice on how to fit continuous proportion data using PyMC/Bambi. Essentially, I have multiple measurements for a given mixture, and this mixture can be a binary or ternary composition. For each sample, the proportion is calculated from the measured mass. Therefore, for a binary mixture:

p_{A} = \frac{M_A}{M_A + M_B}
p_{B} = \frac{M_B}{M_A + M_B} = 1-p_A

And for a ternary mixture:
p_{A} = \frac{M_A}{M_A + M_B + M_C}
p_{B} = \frac{M_B}{M_A + M_B + M_C}
p_{C} = \frac{M_C}{M_A + M_B + M_C} = 1 - p_A - p_B
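
A minimal sketch of this calculation in NumPy, using made-up masses rather than the real measurements (the array name `masses` and its values are purely illustrative):

```python
import numpy as np

# Made-up masses for illustration only: rows are samples, columns are A, B, C
masses = np.array([
    [2.0, 1.0, 1.0],
    [1.0, 3.0, 4.0],
])

# Divide each row by its total mass so the proportions in a row sum to 1
proportions = masses / masses.sum(axis=1, keepdims=True)
```

Each row of `proportions` then lies on the simplex, which is the kind of data a Dirichlet likelihood expects.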

For the binary mixture case, I believe all I need is a beta regression, which is straightforward in Bambi. Here is an example of the dataframe:

Using the example in Beta Regression — Bambi 0.8.0 documentation, I’m able to fit this binary data quite well while also taking into account the study level:

import bambi as bmb
# Beta regression with a varying intercept for each study
model = bmb.Model("p_A ~ 1 + (1|study)", data, family="beta")
fitted = model.fit()

However, I run into issues when attempting to fit a ternary mixture where the data looks like this:

This isn’t quite a Dirichlet-multinomial problem because I’m dealing with continuous proportions, not count data. Does Bambi have analogous capabilities for a Dirichlet regression where n>2? If not, can I get some help setting up the model with pm.Dirichlet, where p is observed directly rather than inferred from measured counts?

Thank you in advance for the help.

I will let one of the more Bambi-savvy folks answer your actual question. However, I will note that there are several examples of this type of model built in PyMC (rather than Bambi): see this thread, Alex’s reply there with a pointer to this notebook, and the discussion on this old issue. If you’re looking to get your hands dirty a bit, those may help you get going (and of course you can ask here if they aren’t exactly what you need).

@zult unfortunately, Bambi does not support anything related to the Dirichlet distribution yet.