Issues Converting Stan's `Phi_approx` to PyMC

Hi everyone,

I’m working on converting some Stan code into PyMC and came across a function called `Phi_approx`. As I understand it, it’s a fast approximation of the standard normal cumulative distribution function; you can find more information about it in the Stan Functions Reference.

However, I couldn’t find a direct equivalent function in PyMC, so I wrote one myself. But when I compare the model outputs between PyMC and Stan, there are significant differences. I suspect the issue lies in my converted `Phi_approx` function, but I’m not entirely sure what’s wrong.

Here’s the original Stan code:

ndt_i[i,1] = Phi_approx(ndt_mean[1] + ndt_sd[1] * ndt_i_pr[i,1]) * min(RT_min[i,,1]);
// (where i represents the participant number)

And here’s my PyMC translation:

with pm.Model(coords=coords) as model_3:
    ...
    ndt_mean = pm.Normal('ndt_mean', mu=0, sigma=1, dims="time")
    ndt_sigma = pm.Normal('ndt_sigma', mu=0, sigma=1, dims="time")
    ndt_i_pr = pm.Normal('ndt_i_pr', mu=0, sigma=1, dims=["subj","time"])
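    # standard normal CDF (Phi), computed as exp(logcdf)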
    rv = pm.Normal.dist(0,1)
    phi = pm.logcdf(rv, ndt_mean + ndt_sigma * ndt_i_pr)
    phi = pm.math.exp(phi)
    ndt_i = pm.Deterministic('ndt_i', phi * rt_min, dims=["subj","time"])
    ...
# (Note: rt_min is an array with a shape of (47, 2))

I’m wondering if there’s a misunderstanding on my part regarding the `Phi_approx` function or if my definition in PyMC is incorrect. Any advice or suggestions would be greatly appreciated!

Thanks in advance for your help!

This looks good. Does it give the right answer?

Our computational backend automatically optimizes models to use the best available algorithms, so we try to avoid exposing functions like `fast_Phi` to users. I’m not sure if we do something smart for probit. Run your model and let me know how the timings look.
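That said, if you need to match Stan’s output exactly: the Stan Functions Reference defines `Phi_approx(x)` as the logistic approximation inv_logit(0.07056 * x^3 + 1.5976 * x), so you can write it yourself. An untested sketch:

def phi_approx(x):
    # logistic approximation to the standard normal CDF,
    # as Stan's Phi_approx is defined
    return pm.math.invlogit(0.07056 * x**3 + 1.5976 * x)

Note that your exp(logcdf) version computes the exact CDF, which is mathematically fine but will differ slightly from Stan’s approximation.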

Thanks! I’d also like to provide some clarification about my model in case it helps.

I’m building a shifted lognormal model, where the ndt_i above is used as a shift (non-decision time) parameter for each participant in each time condition. This shift parameter is multiplied by the participant’s minimum RT.

The model runs well, but the parameter estimates are not what I expected. As you can see in the code snippet below, my parameters of interest are rho_R_mu and rho_R_sigma, the correlation parameters (with LKJ priors) that feed into the multivariate normal related to mu. In other models that don’t include this shift parameter, the estimates match the Stan output exactly. This leads me to believe that the problem arises from my definition or usage of the shift parameter, like:

observed = df_flanker.RT - ndt_i[subj_idx, time_idx].eval()

Since the model is somewhat complex, I’ve included only the key lines here:

coords = {"cond": df_flanker['Condition'].unique(),
          "time": df_flanker['Time'].unique(),
          "subj": df_flanker['subj_num'].unique(),
          "trial": df_flanker['trial_idx']}

with pm.Model(coords=coords) as model_3:
    # Define correlation matrices
    rho_R_mu = pm.LKJCorr('rho_R_mu', n=2, eta=1)
    rho_R_sigma = pm.LKJCorr('rho_R_sigma', n=2, eta=1)
    triu_idx = np.triu_indices(2, k=1)
    mu_corr_upper = pt.set_subtensor(pt.zeros((2, 2))[triu_idx], rho_R_mu)
    sigma_corr_upper = pt.set_subtensor(pt.zeros((2, 2))[triu_idx], rho_R_sigma)
    R_mu = pt.eye(2) + mu_corr_upper + mu_corr_upper.T
    R_sigma = pt.eye(2) + sigma_corr_upper + sigma_corr_upper.T
    L_R_mu = pt.linalg.cholesky(R_mu)
    L_R_sigma = pt.linalg.cholesky(R_sigma)

    ...
    ndt_mean = pm.Normal('ndt_mean', mu=0, sigma=1, dims="time")
    ndt_sigma = pm.Normal('ndt_sigma', mu=0, sigma=1, dims="time")
    ndt_i_pr = pm.Normal('ndt_i_pr', mu=0, sigma=1, dims=["subj","time"])
    rv = pm.Normal.dist(0,1)
    phi = pm.logcdf(rv, ndt_mean + ndt_sigma * ndt_i_pr)
    phi = pm.math.exp(phi)
    ndt_i = pm.Deterministic('ndt_i', phi * rt_min, dims=["subj","time"])

    ...

    time_idx = pm.MutableData("time_idx", df_flanker.Time, dims="trial")
    subj_idx = pm.MutableData("subj_idx", df_flanker.subj_num, dims="trial")
    cond_idx = pm.MutableData("cond_idx", df_flanker.Condition, dims="trial")
    
    likelihood = pm.LogNormal("likelihood",
                                mu=mu[cond_idx, subj_idx, time_idx], 
                                sigma=sigma[cond_idx, subj_idx, time_idx],
                                observed=df_flanker.RT - ndt_i[subj_idx, time_idx].eval(),
                                dims="trial")

I just ran the model, and here’s the timing message (though I’m not sure if this is exactly what you were asking for :sweat_smile:)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (3 chains in 3 jobs)
NUTS: [L_R_mu, L_R_sigma, mu_mean_base, mu_sd_base, sigma_mean_base, sigma_sd_base, mu_mean_delta, mu_sd_delta, sigma_mean_delta, sigma_sd_delta, mu_i_base_pr, sigma_i_base_pr, mu_i_delta_pr, sigma_i_delta_pr, ndt_mean, ndt_sigma, ndt_i_pr]
Sampling 3 chains for 1_000 tune and 3_000 draw iterations (3_000 + 9_000 draws total) took 2521 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics

This won’t be random; it will fix a single value of ndt_i and fit the model conditional on whatever the random draw happens to be. Is this your intention?

No, ndt_i should be random, so I think I should remove the .eval()? However, when I do that, I get an error:

ValueError: setting an array element with a sequence.

I guess the subtraction isn’t being handled correctly, but I’m not sure how to fix it. Do you have any advice?

You can’t have observed data that changes with every draw (it’s not observed in that case!), so you need to adjust your model to get ndt_i into the expected value. Basically it is acting like a location parameter, as the documentation for scipy.stats.lognorm notes:

The probability density above is defined in the “standardized” form. To shift and/or scale the distribution use the loc and scale parameters. Specifically, lognorm.pdf(x, s, loc, scale) is identically equivalent to lognorm.pdf(y, s) / scale with y = (x - loc) / scale. Note that shifting the location of a distribution does not make it a “noncentral” distribution; noncentral generalizations of some distributions are available in separate classes.

We don’t offer that parameterization directly, but you can easily implement it using pm.CustomDist. Here’s some untested code to get you started:

def shifted_lognormal(mu, sigma, loc, size=None):
    # generative graph: a lognormal shifted by loc
    return loc + pm.LogNormal.dist(mu=mu, sigma=sigma, size=size)

with pm.Model(coords=coords) as model_3:
    # your model...

    obs = pm.CustomDist("obs",
                        mu=mu[cond_idx, subj_idx, time_idx],
                        sigma=sigma[cond_idx, subj_idx, time_idx],
                        loc=ndt_i[subj_idx, time_idx],
                        dist=shifted_lognormal,
                        observed=df_flanker.RT,
                        dims="trial")

Thanks a lot! pm.CustomDist solved the problem perfectly. I made a few modifications to your code and am sharing the final working version below as a reference; maybe this will help someone else as well :smile:

def shifted_lognormal(mu, sigma, loc, size=None):
    return loc + pm.LogNormal.dist(mu=mu, sigma=sigma, size=size)

with pm.Model(coords=coords) as model_3:
    # your model...
    obs = pm.CustomDist("obs",
                        mu[cond_idx, subj_idx, time_idx],
                        sigma[cond_idx, subj_idx, time_idx],
                        ndt_i[subj_idx, time_idx],
                        dist=shifted_lognormal,
                        observed=df_flanker.RT,
                        dims="trial")