Best way to make point estimate predictions (using inference data)

mihagazvoda · April 5, 2023, 5:00pm

Hi!
Let’s say I want to create a simple binomial model per x:

import pandas as pd
import bambi as bmb

df_simple = pd.DataFrame({
    'x': ['A', 'B', 'C'],
    'y': [10, 20, 30],
    'n': [100, 100, 100]
})

m = bmb.Model('p(y, n) ~ x', data=df_simple, family='binomial')
idata = m.fit()

# adds the 'p(y, n)_mean' variable to idata.posterior
m.predict(idata)

I want a point estimate (median, but mean is fine too) of the probability of success per x. What’s the fastest or recommended way to get this? I tried:

idata.posterior['p(y, n)_mean'].mean(dim='p(y, n)_obs')

But this gives me a mean per chain and draw rather than per x. Can you also show me the best way to add this as a column back to the original dataframe df_simple? Thanks!

OriolAbril · April 5, 2023, 5:40pm

dim behaves like axis in NumPy, that is, the dimensions/axes you pass are the ones that get reduced. Therefore, what you want is:

p_mean = idata.posterior['p(y, n)_mean'].mean(dim=("chain", "draw"))

which will return a 1D DataArray of length 3. I haven’t checked, but you should be able to add this as a new column with:

df_simple["p_mean"] = p_mean
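To make the dim behaviour concrete, here is a minimal sketch that does not need a fitted model. The array below is a synthetic stand-in for idata.posterior['p(y, n)_mean'], with an assumed shape of 4 chains, 500 draws and 3 observations; the values are made up for illustration only.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior['p(y, n)_mean'] (assumed shape:
# 4 chains x 500 draws x 3 observations). Values are illustrative only.
rng = np.random.default_rng(0)
samples = rng.normal(loc=[0.1, 0.2, 0.3], scale=0.01, size=(4, 500, 3))
p = xr.DataArray(
    samples,
    dims=("chain", "draw", "p(y, n)_obs"),
    coords={
        "chain": np.arange(4),
        "draw": np.arange(500),
        "p(y, n)_obs": np.arange(3),
    },
)

# Reducing over the observation dimension leaves chain and draw,
# which is what the original attempt did.
per_sample = p.mean(dim="p(y, n)_obs")
print(per_sample.dims)  # ('chain', 'draw')

# Reducing over chain and draw leaves one value per observation.
p_mean = p.mean(dim=("chain", "draw"))
p_median = p.median(dim=("chain", "draw"))
print(p_mean.dims, p_mean.shape)
```

The median works the same way as the mean here, so either can be used as the point estimate.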

mihagazvoda · April 6, 2023, 4:20am

I think it should be like this:

df_simple["p_mean"] = p_mean.values

to get a NumPy array out of the DataArray.

Works great, thank you very much!
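For anyone finding this later, here is a small self-contained sketch of the two ways to attach the estimates to the dataframe. The p_mean values are hypothetical placeholders, not model output, and the sketch assumes the observation coordinate runs 0..n-1 in the same order as the rows of df_simple.

```python
import numpy as np
import pandas as pd
import xarray as xr

df_simple = pd.DataFrame({
    "x": ["A", "B", "C"],
    "y": [10, 20, 30],
    "n": [100, 100, 100],
})

# Hypothetical point estimates, one per observation (placeholder values).
p_mean = xr.DataArray(
    np.array([0.1, 0.2, 0.3]),
    dims="p(y, n)_obs",
    coords={"p(y, n)_obs": np.arange(3)},
    name="p_mean",
)

# Positional assignment: relies on the rows being in the same order
# as the observation dimension.
df_simple["p_mean"] = p_mean.values

# Index-aligned assignment: to_series() keeps the coordinate as the index,
# and pandas aligns it with the dataframe index (assumed here to be 0..n-1).
df_simple["p_mean_aligned"] = p_mean.to_series()

print(df_simple)
```

The positional version is the one from this thread; the aligned version is only an alternative if the dataframe index matches the observation coordinate.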
