Selecting proirs based on group

Nachshon_Korem · September 29, 2022, 4:28pm

Dear PYMC community,

I am building a simulation of how people rate images.
I am using a 5 parameter logistic regression model to estimate response.
The model works fine.
The problem starts when I try to compare groups to see how they differ in-terms of the logistic regression parameters.
I currently use 2 likelihood functions, but then I cannot compare models.
I would like to create a single likelihood function but different priors to the different groups, with some shared hyper priors.
I tried using aesara switch, with an index for group on top of the index for subject but it won’t run. I think the logic behind it wasn’t that good.

I create the data using this method

def add_sub(trials = 100, min_r = 0, max_r = 4, noise = .1, 
            slope=10, a1 = 1, a2 = 20, c_mu = 2, d1 = 10, d2 = 10, g_mu = 1,
            axis = False):
    
    x = np.arange(min_r, max_r, max_r/trials)
    a = np.random.beta(a1, a2, 1)[0] # close to 0 but not 0.
    b = np.random.normal(slope, 2, 1)[0] # this is where the magic happens
    c = np.random.normal(c_mu, .5, 1)[0] # kind of average
    d = np.random.beta(d1, d2, 1)[0] # close to 0.5
    g = np.random.normal(g_mu, .2, 1)[0] # almost symetrical
    
    # the 5PL formula
    y = d + (a-d)/(1+(x/c)**b)**g + np.random.normal(0, noise, trials)
    
    # replace negative values with 0 (you cannot rate lower than 0)
    y = [0 if i < 0 else i for i in y]   

    df = pd.DataFrame({'x': x, 'y': y})
    if axis:
        return(x)
    return(df)

and then try to extract the data using this method:

def pl5_multi_groups(df1, df2, idx1, idx2, n_subs1, n_subs2 = 0):
    
    if n_subs2 == 0:
        n_subs2 = n_subs1

    with pm.Model() as five_PL_groups:


        # hyper prior per group
        mu_b = pm.Normal('mu_b', 6,    3, shape = 2)
        mu_c = pm.Normal('mu_c', 3.5,  1, shape = 2)
        mu_g = pm.TruncatedNormal('mu_g', 1.1, .2, lower = .4, shape = 2)

        a1 = pm.TruncatedNormal('a1', 1, .3, lower = 0, shape = 2)
        a2 = pm.Normal('a2', 20, 3, shape = 2)

        d1 = pm.TruncatedNormal('d1', 12, 2, lower = 0, shape = 2)
        d2 = pm.TruncatedNormal('d2', 8,  2, lower = 0, shape = 2)

        # priors

        a_1 = pm.Beta('a_1',   a1[0],  a2[0], shape = n_subs1)
        b_1 = pm.Normal('b_1', mu_b[0], 2, shape = n_subs1)
        c_1 = pm.Normal('c_1', mu_c[0], 2, shape = n_subs1)
        d_1 = pm.Beta('d_1',   d1[0],  d2[0], shape = n_subs1)
        g_1 = pm.Normal('g_1', mu_g[0],.2, shape = n_subs1)

        a_2 = pm.Beta('a_2',   a1[1],  a2[1], shape = n_subs2)
        b_2 = pm.Normal('b_2', mu_b[1], 2, shape = n_subs2)
        c_2 = pm.Normal('c_2', mu_c[1], 2, shape = n_subs2)
        d_2 = pm.Beta('d_2',   d1[1],  d2[1], shape = n_subs2)
        g_2 = pm.Normal('g_2', mu_g[1],.2, shape = n_subs2)


        eps = pm.Exponential('eps', 5, shape = 2)

        # model prediction using the 5PL regression    
        y_hat1 = d_1[idx1] + (a_1[idx1]-d_1[idx1])/(1+(df1['x'].values/c_1[idx1])**b_1[idx1])**g_1[idx1]
        y_hat2 = d_2[idx2] + (a_2[idx2]-d_2[idx2])/(1+(df2['x'].values/c_2[idx2])**b_2[idx2])**g_2[idx2]

        # Likelihood (sampling distribution) of observations
        rating1 = pm.Normal('rating1',y_hat1, eps[0], observed=df1.y)
        rating2 = pm.Normal('rating2',y_hat2, eps[1], observed=df2.y)

        trace = jx.sample_numpyro_nuts(tune = 3000, draws = 1000, target_accept=0.95)
        
    return trace

my question is how to send the function which group each participant is and then to have the function select the priors based on the group.

PYMC 4.1.7
Arviz 0.12.1
JAX 0.3.17
Aesara 2.8.2

complete code

github.com

KoremNSN/EmotionalNumbing/blob/master/EN_Simulator.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "2acd7028",
   "metadata": {},
   "source": [
    "# Emotional numbing in PTSD a look into a paradox\n",
    "\n",
    "PTSD symptoms include (not only) symptoms of hyper arousal, anger, nightmares, problems with emotion regulation and... emotional numbing. </br>\n",
    "While most symptoms point to over expression of emotions, emotional numbing states that there is low emotionality.\n",
    "\n",
    "In a not really direct way, here, we will continue to explore the idea we proposed in the SIA paper.</br>\n",
    "https://www.nature.com/articles/s41386-022-01405-2</br>\n",
    "Mainly, reduced response to low valance/arousing stimuli and high response to higher valance stimuli.\n",
    "\n",
    "In this experiment we will explore this by asking people with or without diagnosis of PTSD to rate pictures from known datasets (IAPS/NAPS).</br> \n",
    "Using the dataset reported valence rating as the X axis and participants rating as the y-axis, we will fit a 5 parameter logistic regression to the curve of participants and compare the parameters between the groups.\n",
    "\n",
    "The formula we will use for the 5PL regression:\n",

This file has been truncated. show original

junpenglao · September 30, 2022, 10:05am

The best way to think about this is that your model building function should always take 1 data frame as input, where the group information is coded as one of the colume:

df_all = pd.concat([cont_df, ptsd_df], keys=['control', 'ptsd'], names=['group']).reset_index()
df_all['group_id'] = df_all['group'].astype('category').cat.codes
df_all['sub_id'] = df_all['sub'] + df_all['group_id'] * 10
group_sub_id = np.unique(df_all[['group_id', 'sub_id']].values, axis=0)
idx = df_all['sub_id'].values

with pm.Model() as five_PL_groups:

    # hyper prior per group
    ... this part stays the same
    # priors
    # Note that since we are broadcasting the input, we dont need to specify shape here
    a = pm.Beta('a',   a1[group_sub_id[:, 0]],    a2[group_sub_id[:, 0]])
    b = pm.Normal('b', mu_b[group_sub_id[:, 0]],   2)
    c = pm.Normal('c', mu_c[group_sub_id[:, 0]],   1)
    d = pm.Beta('d',   d1[group_sub_id[:, 0]],    d2[group_sub_id[:, 0]])
    g = pm.Normal('g', mu_g[group_sub_id[:, 0]], .15)
    y_hat = d[idx] + (a[idx]-d[idx])/(1+(df_all['x'].values/c[idx])**b[idx])**g[idx]

    eps = pm.Exponential('eps', 5, shape = 2)
    # Likelihood (sampling distribution) of observations
    rating = pm.Normal('rating',y_hat, eps[df_all['group_id'].values], observed=df_all.y)

Nachshon_Korem · October 3, 2022, 6:21pm

Thank you so much @junpenglao!

Topic		Replies	Views
[Beginner level question on modeling] Bayesian analysis of F1 scores from two ML models v5 modeling	5	410	January 24, 2023
Beginner question - Comparing two posterior predictive distributions with different number of observed data v5	8	670	July 12, 2023
New To PyMC3 \| Logistic Regression - Bug Questions bug	4	783	November 4, 2020
Optimizing PyMC Code for Efficiency for many runs	2	518	January 12, 2024
Unexpected results from logistic regression model version agnostic	8	545	February 15, 2023

Selecting proirs based on group

Related topics