NaN occurred in optimization at first iteration with ADVI

I am getting NaN while doing optimization with ADVI.

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as irt_model:
    # Item parameters: location (difficulty), scale (discrimination), guessing
    loc = pm.Normal('loc', mu=0., sd=1., shape=N_QUESTIONS)
    scale = pm.TruncatedNormal('scale', mu=1., sd=0.1, lower=0., shape=N_QUESTIONS)
    guessing = pm.Beta('guessing', alpha=1., beta=5., shape=N_QUESTIONS)

    # Person ability: age effect plus individual noise
    d1 = pm.Normal('d1', mu=0., sd=100., shape=N_PERSONS)
    eps = pm.Normal('eps', mu=0., sd=1., shape=N_PERSONS)
    theta = pm.Deterministic('theta', tt.dot(d1, PERSON_AGE_INDEX["age"]) + eps)

    # Three-parameter logistic response probability
    eta = scale[QUESTION_INDEX] * (theta[PERSON_INDEX] - loc[QUESTION_INDEX])
    p = pm.Deterministic('p', guessing[QUESTION_INDEX] + (1 - guessing[QUESTION_INDEX]) * pm.math.sigmoid(eta))

    y = pm.Bernoulli('y', p=p, observed=Y_OBS, shape=(N_PERSONS * N_QUESTIONS))

with irt_model:
    inference = pm.ADVI()
    mean_field = inference.fit(
        2500,
        obj_optimizer=pm.adamax(learning_rate=0.1),
    )
    advi_trace = mean_field.sample(1000)

irt_model.check_test_point()
loc -77.19
scale_lowerbound__ 116.23
guessing_logodds__ -91.89
d1 -30935.01
eps -5146.06
y -6222.86
Name: Log-probability of test_point, dtype: float64

I get the error at:
----> 5 obj_optimizer = pm.adamax(learning_rate=0.1)
raise FloatingPointError('\n'.join(errmsg))

I would appreciate it if someone could help me resolve this issue.

The model works with MCMC, but not with VI.

Could you try lowering the initial learning rate?

I tried lower initial learning rates, from 0.1 down to 0.0001 and 2e-4, but I get the same error. Following your previous posts, I tried:
mu = inference.approx.params[0]
rho = inference.approx.params[1]
mu.eval()  # resulted in: array([nan, nan, nan, ..., nan, nan, nan])
Setting mu and rho to zeros and sampling again results in the same NaN errors:
mu.set_value(np.zeros(mu.eval().shape))
rho.set_value(np.zeros(rho.eval().shape))
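
To make the check above a bit more systematic, here is a minimal sketch that counts NaN and infinite entries per parameter. The helper name summarize_nonfinite and the example values are hypothetical, and it assumes each parameter is a flat sequence of floats (for example a 1-D array such as the result of mu.eval()).

```python
import math


def summarize_nonfinite(named_values):
    """Return {name: (n_nan, n_inf, n_total)} for each flat sequence of floats."""
    report = {}
    for name, values in named_values.items():
        values = [float(v) for v in values]
        n_nan = sum(math.isnan(v) for v in values)
        n_inf = sum(math.isinf(v) for v in values)
        report[name] = (n_nan, n_inf, len(values))
    return report


# Hypothetical stand-ins for mu.eval() and rho.eval(); not values from this thread.
example = {
    "mu": [float("nan"), 0.1, float("nan")],
    "rho": [0.0, float("inf"), -0.5],
}

report = summarize_nonfinite(example)
for name, (n_nan, n_inf, n_total) in report.items():
    print(f"{name}: {n_nan} NaN, {n_inf} inf out of {n_total}")
```

With the real approximation, this could be called as summarize_nonfinite({"mu": mu.eval(), "rho": rho.eval()}) before and after a single fit step, to see whether every entry turns NaN at once or only some of them.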

The model looks fine to me - maybe try changing the TruncatedNormal to HalfNormal?
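
One more thing that may be worth checking, as an assumption that is not confirmed anywhere in this thread: whether p can become exactly 0 or 1 for parameter values away from the test point. ADVI evaluates the model at random draws rather than at the test point, so a very large theta (possible in principle given the sd=100. prior on d1, depending on the values in PERSON_AGE_INDEX["age"], which are not shown) would saturate the sigmoid. The sketch below uses plain Python with illustrative numbers only; sigmoid, irt_p and bernoulli_logp are hypothetical helpers that mirror the formula for p in the model.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def irt_p(theta, loc, scale, guessing):
    """Response probability with a guessing floor, same form as p in the model."""
    eta = scale * (theta - loc)
    return guessing + (1.0 - guessing) * sigmoid(eta)


def bernoulli_logp(y, p):
    """Bernoulli log-likelihood; -inf when the observed outcome has probability 0."""
    q = p if y == 1 else 1.0 - p
    return math.log(q) if q > 0.0 else float("-inf")


# Illustrative values only; none of these come from the thread's data.
moderate = irt_p(theta=1.0, loc=0.0, scale=1.0, guessing=0.25)
saturated = irt_p(theta=1500.0, loc=0.0, scale=1.0, guessing=0.25)

print(moderate, bernoulli_logp(0, moderate))    # finite log-likelihood
print(saturated, bernoulli_logp(0, saturated))  # p rounds to 1.0, so y=0 gives -inf
```

If draws like the second one do occur, the log-likelihood of an observed 0 is -inf, and gradients through it are typically NaN no matter how small the learning rate is. In that case, rescaling the age covariate, tightening the prior on d1, or keeping p slightly away from 0 and 1 might be worth trying.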