Extracting peak locations from Bayesian Polynomial Regression in a pupil-performance study

Hi everyone,

We are working on a Bayesian hierarchical probit regression to analyze the relationship between pupil size and performance in an auditory discrimination task. The key assumption is that this relationship follows the Yerkes-Dodson law, meaning an inverted U-shape. We model this using a second-order polynomial, fitting both a linear and a quadratic term for pupil size.

The task consists of two conditions:

  • An easy condition, where performance is around 90%.
  • A difficult condition, where performance is around 70%.

Our hypothesis is that the relationship between pupil size and performance differs between these conditions, specifically that the peak of the inverted U-shape shifts.

We are using a hierarchical Bayesian probit regression model (DeCarlo, 1998) that includes interaction terms between task condition and the pupil terms. The following group-level posteriors are estimated:

  • β_lin: Describes the linear relationship between pupil size and performance.
  • β_quad: Describes the quadratic relationship between pupil size and performance.
  • β_linXcond: Describes the modulation of task condition on the linear term.
  • β_quadXcond: Describes the modulation of task condition on the quadratic term.
  • β_cond: Main effect of task condition (not directly relevant to this post).

To obtain the condition-specific coefficients, we compute:

  • Linear term: (β_lin + β_linXcond) for one condition, (β_lin - β_linXcond) for the other.
  • Quadratic term: (β_quad + β_quadXcond) for one condition, (β_quad - β_quadXcond) for the other.

Using these condition-specific coefficients, we calculate the peak location of the inverted U-curve in each condition as:

\text{peak} = -\frac{\beta_{\text{lin}}}{2\,\beta_{\text{quad}}}
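In code, the per-draw computation looks like the following sketch (NumPy; the variable names and the simulated draws are placeholders standing in for our actual posterior samples):

```python
import numpy as np

# Placeholder posterior draws for the group-level coefficients
# (simulated here; in practice these come from the trace)
rng = np.random.default_rng(0)
beta_lin = rng.normal(0.3, 0.05, size=4000)      # β_lin draws
beta_quad = rng.normal(-0.2, 0.03, size=4000)    # β_quad draws
beta_lin_x = rng.normal(0.05, 0.02, size=4000)   # β_linXcond draws
beta_quad_x = rng.normal(0.02, 0.01, size=4000)  # β_quadXcond draws

# Condition-specific coefficients (task coded -1 / +1)
lin_easy = beta_lin + beta_lin_x
quad_easy = beta_quad + beta_quad_x
lin_hard = beta_lin - beta_lin_x
quad_hard = beta_quad - beta_quad_x

# Peak of the parabola, computed per posterior draw
peak_easy = -lin_easy / (2 * quad_easy)
peak_hard = -lin_hard / (2 * quad_hard)
peak_diff = peak_easy - peak_hard
```

Because the transformation is applied draw by draw, each resulting array has one peak estimate per posterior sample.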

Questions:

  1. Is this approach valid for estimating the peak of a second-order polynomial using Bayesian posteriors?
  2. If so, does the distribution of peak locations (calculated from all posterior samples) form a posterior that can be analyzed in a standard Bayesian way (e.g., computing credible intervals)?
  3. If this is not valid, would a different approach be more appropriate?
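For concreteness, the summaries referred to in question 2 would be computed directly on the per-draw peak samples, e.g. (placeholder samples here, not our actual posterior):

```python
import numpy as np

# Placeholder vector of per-draw peak locations
rng = np.random.default_rng(1)
peak_samples = rng.normal(0.4, 0.1, size=4000)

# Equal-tailed 95% credible interval from the sample quantiles
lo, hi = np.quantile(peak_samples, [0.025, 0.975])

# Posterior probability that the peak lies above zero
p_positive = (peak_samples > 0).mean()
```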

Diagnostics and Plots:
We checked convergence using trace plots and R-hat statistics, which indicate good mixing. Below are:

  • Trace plots for the β coefficients (1st)
  • Group level relationship between pupil size and performance (2nd)
  • Posterior distributions of peak locations for each condition, plus the posterior of the difference in pupil peak (3rd)
  • The model code (attached at the end of the post)

Would love to hear your thoughts on whether this approach is reasonable or if there are alternative methods for analyzing peak shifts in a Bayesian framework.

Thanks a lot in advance!

Figure 1:
RAW model traces.pdf (1.2 MB)

Figure 2:
RAW model comparison 3.pdf (27.8 KB)

Figure 3:
RAW model peaks 5.pdf (202.8 KB)

Code:

    with pm.Model() as pupil_model:
        # Probit regression model, estimating behavioral responses (0,1) based on stimuli (-1, 1). See Lawrence DeCarlo, 1998.
        # The model further incorporates task (-1, 1) and raw baseline pupil size (z-scored) 
        # as regressors.

        # Beta0:    intercept (criterion)
        # Beta1:    stimulus (dprime)
        # Beta2:    pupil linear (criterion)
        # Beta3:    pupil quadratic (criterion)
        # Beta4:    main effect task (criterion)
        # Beta5:    pupil linear (d')
        # Beta6:    pupil quadratic (d')
        # Beta7:    main effect task (d')
        # Beta8:    task x pupil linear (criterion)
        # Beta9:    task x pupil quadratic (criterion)
        # Beta10:   task x pupil linear (d')
        # Beta11:   task x pupil quadratic (d')

        # Hyperpriors
        mu_beta0 = pm.Normal('mu_beta0', mu=0., sigma=1)
        sigma_beta0 = pm.HalfCauchy('sigma_beta0', beta=0.5)
        mu_beta1 = pm.Normal('mu_beta1', mu=1.5, sigma=1)
        sigma_beta1 = pm.HalfCauchy('sigma_beta1', beta=0.5)

        # Now hyperpriors for pupil and task effects
        mu_beta2 = pm.Normal('mu_beta2', mu=0., sigma=1)        # Linear pupil criterion
        sigma_beta2 = pm.HalfCauchy('sigma_beta2', beta=0.5)    # Linear pupil criterion
        mu_beta3 = pm.Normal('mu_beta3', mu=0., sigma=1)        # Quadratic pupil criterion
        sigma_beta3 = pm.HalfCauchy('sigma_beta3', beta=0.5)    # Quadratic pupil criterion
        mu_beta4 = pm.Normal('mu_beta4', mu=0., sigma=1)        # Task effect criterion
        sigma_beta4 = pm.HalfCauchy('sigma_beta4', beta=0.5)    # Task effect criterion

        mu_beta5 = pm.Normal('mu_beta5', mu=0., sigma=1)        # Linear pupil d'
        sigma_beta5 = pm.HalfCauchy('sigma_beta5', beta=0.5)    # Linear pupil d'
        mu_beta6 = pm.Normal('mu_beta6', mu=0., sigma=1)        # Quadratic pupil d'   
        sigma_beta6 = pm.HalfCauchy('sigma_beta6', beta=0.5)    # Quadratic pupil d'
        mu_beta7 = pm.Normal('mu_beta7', mu=0., sigma=1)        # Task effect d'
        sigma_beta7 = pm.HalfCauchy('sigma_beta7', beta=0.5)    # Task effect d'

        # Finally add hyperpriors for  interaction effects between pupil and task. 
        mu_beta8 = pm.Normal('mu_beta8', mu=0., sigma=1)        # task x linear (criterion)
        sigma_beta8 = pm.HalfCauchy('sigma_beta8', beta=0.5)    # task x linear (criterion)
        mu_beta9 = pm.Normal('mu_beta9', mu=0., sigma=1)        # task x quadratic (criterion)
        sigma_beta9 = pm.HalfCauchy('sigma_beta9', beta=0.5)    # task x quadratic (criterion)
        mu_beta10 = pm.Normal('mu_beta10', mu=0., sigma=1)      # task x linear (d')
        sigma_beta10 = pm.HalfCauchy('sigma_beta10', beta=0.5)  # task x linear (d')
        mu_beta11 = pm.Normal('mu_beta11', mu=0., sigma=1)      # task x quadratic pupil (d')
        sigma_beta11 = pm.HalfCauchy('sigma_beta11', beta=0.5)  # task x quadratic (d')

        # Subject-level coefficients, distributed around the group-level means
        n_subjects = len(h_data.subject.unique())
        β0 = pm.Normal('beta0', mu=mu_beta0, sigma=sigma_beta0, shape=n_subjects)
        β1 = pm.Normal('beta1', mu=mu_beta1, sigma=sigma_beta1, shape=n_subjects)
        β2 = pm.Normal('beta2', mu=mu_beta2, sigma=sigma_beta2, shape=n_subjects)
        β3 = pm.Normal('beta3', mu=mu_beta3, sigma=sigma_beta3, shape=n_subjects)
        β4 = pm.Normal('beta4', mu=mu_beta4, sigma=sigma_beta4, shape=n_subjects)
        β5 = pm.Normal('beta5', mu=mu_beta5, sigma=sigma_beta5, shape=n_subjects)
        β6 = pm.Normal('beta6', mu=mu_beta6, sigma=sigma_beta6, shape=n_subjects)
        β7 = pm.Normal('beta7', mu=mu_beta7, sigma=sigma_beta7, shape=n_subjects)
        β8 = pm.Normal('beta8', mu=mu_beta8, sigma=sigma_beta8, shape=n_subjects)
        β9 = pm.Normal('beta9', mu=mu_beta9, sigma=sigma_beta9, shape=n_subjects)
        β10 = pm.Normal('beta10', mu=mu_beta10, sigma=sigma_beta10, shape=n_subjects)
        β11 = pm.Normal('beta11', mu=mu_beta11, sigma=sigma_beta11, shape=n_subjects)

        # add data vectors
        x1 = pm.Data("stimulus", h_data["stimulus"])
        x2 = pm.Data("tonic", h_data["tonic"])       
        x3 = pm.Data("tonic_sq", h_data["tonic_sq"]) 
        x4 = pm.Data("task", h_data["task"])

        # Note: d' effects are interactions with β1 (stimulus); criterion effects
        # interact with the intercept (i.e., any term that does not include β1).
        μ =(β0[subject_idx] +                       # criterion (intercept)
            β1[subject_idx] * x1 +                  # d' (intercept)
            β2[subject_idx] * x2 +                  # linear pupil (criterion)
            β3[subject_idx] * x3 +                  # quadratic pupil (criterion)
            β4[subject_idx] * x4 +                  # task (criterion)
            β5[subject_idx] * x1 * x2 +             # linear pupil (d')
            β6[subject_idx] * x1 * x3 +             # quadratic pupil (d')     
            β7[subject_idx] * x1 * x4 +             # task (d')
            β8[subject_idx] * x2 * x4 +             # task x linear (criterion)
            β9[subject_idx] * x3 * x4 +             # task x quadratic (criterion)
            β10[subject_idx] * x1 * x2 * x4 +       # task x linear (d')
            β11[subject_idx] * x1 * x3 * x4         # task x quadratic (d')
            )        

        θ = pm.Deterministic('θ', pm.math.invprobit(μ))

        # Bernoulli likelihood with success probability given by the
        # probit link applied to μ, with the observed responses as data
        y = pm.Bernoulli(name='probit', p=θ, observed=h_data.response.values)

    with pupil_model:
        pupil_traces = pm.sample(init='adapt_diag', draws=samples, tune=burn,
                                 chains=chains, cores=chains, progressbar=True,
                                 idata_kwargs={"log_likelihood": True},
                                 target_accept=0.95)