Hi all,
I’ve been using v4.0.0b6 to fit a simple multivariate regression model to some data. To make the output deterministic, there is a call to np.random.seed(1234) before sampling. Relevant parts of the code are posted below.
With this approach, everything initially seemed fine: running pm.sample() repeatedly on the same machine appeared to converge to the same solution each time, checked by inspecting trace.model_logp and the sampled model parameters. (Aside: is this a valid way to check?)
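For concreteness, the check looks roughly like this, where trace_a and trace_b are just illustrative names for the traces returned by two separate runs of the fit:

import numpy as np

# trace_a, trace_b: MultiTrace objects from two separate runs of the same fit
print(np.allclose(trace_a.model_logp, trace_b.model_logp))  # per-draw model log-probability
print(np.allclose(trace_a.beta, trace_b.beta))              # sampled regression coefficients
print(np.allclose(trace_a.intercept, trace_b.intercept))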
This was verified on the same machine running (a) in a Poetry environment and (b) inside a Docker container; both environments were set up with the same Poetry commands, and in both cases the fitted model appeared identical under the inspection approach described above.
The same Docker container was then used in CI to run the same model-fitting process, and there it does not converge to the same solution. However, running it multiple times on the same CI machine gives the same converged solution every time, just a different one from the original machine.
A colleague then ran it on a third machine and saw the same behaviour: results differ from machines 1 and 2, but are consistent across repeated runs on that machine.
Does anyone have any idea what might be causing this or any thoughts on where to look?
One potentially relevant warning we are seeing is "aesaraf.py:1005: UserWarning: The parameter 'updates' of aesara.function() expects an OrderedDict, got <class 'dict'>. Using a standard dictionary here results in non-deterministic behavior. You should use an OrderedDict if you are using Python 2.7 (collections.OrderedDict for older python), or use a list of (shared, update) pairs. Do not just convert your dictionary to this type before the call as the conversion will still be non-deterministic."
However, a bit of searching suggests that this warning can be ignored in our case (?)
Many thanks for any suggestions.
Code snippets below. If more is needed please say.
# Synthetic data (x1, x2, x3, noise are random arrays; c0..c4 are constants)
y = c0 + (c1 * x1) + (c2 * x2) + (c3 * x3) + (c4 * noise)
# The DataFrame df used below is constructed from the following dict:
data = {'x1': x1, 'x2': x2, 'x3': x3, 'y': y}
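For completeness, a self-contained version of the data generation looks something like the following; the array length and the constant values here are placeholders for illustration, not our real values.

import numpy as np
import pandas as pd

np.random.seed(1234)

# Placeholder size and coefficients -- illustrative only
n = 1000
c0, c1, c2, c3, c4 = 2.0, 1.5, -0.8, 0.3, 0.5

x1 = np.random.normal(size=n)
x2 = np.random.normal(size=n)
x3 = np.random.normal(size=n)
noise = np.random.normal(size=n)

y = c0 + (c1 * x1) + (c2 * x2) + (c3 * x3) + (c4 * noise)
df = pd.DataFrame({'x1': x1, 'x2': x2, 'x3': x3, 'y': y})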
# Fit model
np.random.seed(self.seed)
...
with pm.Model() as model:
    # Priors
    sigma = pm.HalfNormal('sigma', sigma=2)
    intercept = pm.Normal('intercept', mu=20, sigma=200)
    beta = pm.Normal('beta', mu=15, sigma=20, shape=3)
    # Linear predictor
    mu = intercept + pm.math.dot(df[['x1', 'x2', 'x3']], beta)
    # Likelihood
    pm.Normal('likelihood', mu=mu, sigma=sigma, observed=df['y'])
    # Draw posterior samples
    trace = pm.sample(draws=1000, tune=1000, chains=4, random_seed=self.seed,
                      return_inferencedata=False,
                      compute_convergence_checks=True)
# Inspection, e.g.:
trace.model_logp[0:10]
trace.beta[0:5]
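If it's useful, a more compact fingerprint for comparing runs across machines might be a posterior summary via ArviZ; we don't currently do this, so treat the following as a sketch rather than part of our pipeline.

import arviz as az
import pymc as pm

# Convert the MultiTrace to InferenceData and summarise; the mean/sd/r_hat
# columns give a compact view of the fit to compare between machines.
idata = pm.to_inference_data(trace, model=model)
print(az.summary(idata, var_names=['intercept', 'beta', 'sigma']))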