Error with Model: Initial evaluation of model at starting point failed!

Hi PyMC3 team,

I am currently trying to implement the following model of causal inference that was originally demonstrated in Stan here.

Below is the following code I have run. First, I simulate the dataset:

N = 500 # Total sample size
alpha = 1.0 # Intercept in the y-model
tau = 0.25 # Treatment effect

# Assignment mechanism
N_t = 200
W = np.random.choice(2, N) # Randomly assign treatment

# Simulate the outcomes
mu_c = alpha # Mean for control
sd_c = 1.0 # Standard deviation for control
mu_t = alpha + tau # Mean for treatment
sd_t = 1.0 # Standard deviation for control
means = [mu_c, mu_t]

rho = 0.0 # Correlation
cov_mat = np.array([[sd_c**2.0, rho * sd_c * sd_t], [rho * sd_c * sd_t, sd_t**2.0]]) # Covariant matrix

outcomes = np.random.multivariate_normal(means, cov_mat, size=N) # Generate via multivariate normal
Y0 = outcomes[:, 0]        # potential outcome if W = 1
Y1 = outcomes[:, 1]        # potential outcome if W = 0
tau_unit = Y1 - Y0

# The realization of potential outcomes by the assignment mechanism
Y_obs = Y0 * (1.0 - W) + Y1 * W # Observed outcomes
Y_mis = Y0 * W + Y1 * (1.0 - W) # Missing outcomes

And this is the code to create the model in PyMC3:

# Now let's try and build this model in PyMC3
with pm.Model() as model:

    # Define the priors
    alpha = pm.Normal('alpha', mu=0.0, sigma=5.0)
    tau = pm.Normal('tau', mu=0.0, sigma=5.0)
    sigma_c = pm.Normal('sigma_c', mu=0.0, sigma=5.0)
    sigma_t = pm.Normal('sigma_t', mu=0.0, sigma=5.0)
    mu = alpha + tau * W
    sigma = sigma_t * W + sigma_c * (1.0 - W)

    # Define the likelihood
    y_obs = pm.Normal('Y_obs', mu=mu, sigma=sigma, observed=Y_obs)
    
    # Sample from the posterior
    trace = pm.sample(2000, tune=1000)

Which produces the following error:

---------------------------------------------------------------------------
SamplingError                             Traceback (most recent call last)
<ipython-input-414-f57626c5907e> in <module>
     13 
     14     # Sample from the posterior
---> 15     trace = pm.sample(2000, tune=1000)

/usr/local/lib/python3.8/site-packages/pymc3/sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, pickle_backend, **kwargs)
    426     start = deepcopy(start)
    427     if start is None:
--> 428         check_start_vals(model.test_point, model)
    429     else:
    430         if isinstance(start, dict):

/usr/local/lib/python3.8/site-packages/pymc3/util.py in check_start_vals(start, model)
    235 
    236         if not np.all(np.isfinite(initial_eval)):
--> 237             raise SamplingError(
    238                 "Initial evaluation of model at starting point failed!\n"
    239                 "Starting values:\n{}\n\n"

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'alpha': array(0.), 'tau': array(0.), 'sigma_c': array(0.), 'sigma_t': array(0.)}

Initial evaluation results:
alpha     -2.53
tau       -2.53
sigma_c   -2.53
sigma_t   -2.53
Y_obs      -inf
Name: Log-probability of test_point, dtype: float64

Is there any way I can change the initial starting point so that I don’t encounter this error? Or am I doing something else incorrectly? Any advice on this would be greatly appreciated! Thank you!

Best wishes,
Axel.

3 Likes

Very beginner user here, so consider all I write with rightful suspicion.
The -inf in the evaluation suggests you have some strange parametrization going on, but I honestly could not find anything wrong with your model – but wait for more expert user to chime in.
Something you could try is to get rid of the sigmas parameter completely. Just fix the sigma of y_obs to 1 and see if you still have an error – since the mu parameter is not constrained it should not be the one causing the error.
However, you should be able to tell the sampler where to start. You may rewrite your sampling line as:

trace = pm.sample(2000, tune=1000, start={'alpha': np.array([0.]), 'tau': np.array([0.]), 'sigma_c': np.array([0.]), 'sigma_t': np.array([0.])})

Naturally, you are free to change the starting values as you please – withing the boundaries of acceptability for each parameter.

5 Likes

Hi @non87,

Thanks very much for your suggestions! I’m pleased to say both worked. It should have occurred to me that this parametrisation of sigma means that, for these choice of sigma values, sigma = 1.0, given that the treatment vector W values were only 0 or 1.
The other solution also worked really well; I didn’t realise that there was a start option when running pm.sample, but I’m glad I know now!

I’ll leave the thread open for a little longer to see if there’s any other ways to fix this other than just kick the starting values in the right direction, but thank you again!

1 Like

Hello. Another newbie here, but I think I found the issue and an arguably nicer way to fix it.

Your problem is reproduced by this simple code…

with pm.Model() as model:
    sd = pm.Normal('sd', mu=0, sigma=1)
    y = pm.Normal('y', mu=0, sigma=sd)
    pm.sample()

…which produces this error…

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{‘sd’: array(0.), ‘y’: array(0.)}

Initial evaluation results:
sd -0.92
y -inf
Name: Log-probability of test_point, dtype: float64

I think the problem here is related to the fact that you’re modelling sigma with a normal distribution where values can be negative. Instead you could use a HalfNormal distribution, which is just the positive side of a normal distribution. I think that’s what you intended.

with pm.Model() as model:
    sd = pm.HalfNormal('sd', sigma=1) # <-- yay fix!
    y = pm.Normal('y', mu=0, sigma=sd)
    pm.sample()
3 Likes

Hi! I have a similar issue.

        with pm.Model(coords={"test_event_dim": range(len(used_data)), "test_support_dim": range(2)}) as model:
            w = pm.Dirichlet('w', a=np.ones(n_components))  
            p1 = pm.HalfNormal("p1",sigma=3, dims=("test_support_dim",)).exp()
            p2 = pm.HalfNormal("p2",sigma=3, dims=("test_support_dim",)).exp()
            p3 = pm.HalfNormal("p3",sigma=3, dims=("test_support_dim",)).exp()
            p4 = pm.HalfNormal("p4",sigma=3, dims=("test_support_dim",)).exp()
            y = DirichletMixture('y', w=w, p=[p1, p2, p3, p4], observed=np.array(used_data), comp_shape=(n_components, 2))
            # model.debug(verbose=True)
            # if method == 'mcmc':
            step = pm.Metropolis()
            trace = pm.sample(10000, cores=2, step=step)
            dataframe = az.summary(trace)

The code raise error:
SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{‘w_simplex__’: array([0., 0., 0.]), ‘p1_log__’: array([1.09861229, 1.09861229]), ‘p2_log__’: array([1.09861229, 1.09861229]), ‘p3_log__’: array([1.09861229, 1.09861229]), ‘p4_log__’: array([1.09861229, 1.09861229])}

Logp initial evaluation results:
{‘w’: -2.37, ‘p1’: -1.45, ‘p2’: -1.45, ‘p3’: -1.45, ‘p4’: -1.45, ‘y’: -inf}
You can call model.debug() for more details.

Welcome.

You might try opening a new thread as this one is quite old and will tend to get less attention.

What does your observed data (used_data) look like?

I would also avoid setting the step argument of pm.sample() unless you have very good reason to do so.