I was following the Bayesian Weibull tutorial, "Bayesian Parametric Survival Analysis with PyMC3" (PyMC3 3.11.4 documentation). However, in my application I need to wrap the whole training code in a custom Python function. I also need to generate some random points in the same function before training the model:
import numpy as np
import pymc3 as pm
from theano import shared, tensor as tt

def gumbel_sf(y, μ, σ):
    # Gumbel survival function, as defined in the tutorial
    return 1.0 - tt.exp(-tt.exp(-(y - μ) / σ))

def generate_dist():
    # Random 0/1 array drawn from NumPy's global random state
    return np.random.choice([0, 1], size=500)

def tutorial_model(X, y_std, cens):
    dist = generate_dist()
    VAGUE_PRIOR_SD = 5.0
    with pm.Model() as weibull_model:
        β = pm.Normal("β", 0.0, VAGUE_PRIOR_SD, shape=2)
    X_ = shared(X)
    with weibull_model:
        η = β.dot(X_.T)
    with weibull_model:
        s = pm.HalfNormal("s", 5.0)
    # y = np.log(df.time.values)  # unused here; y_std and cens are passed in
    # y_std = (y - y.mean()) / y.std()
    # cens = df.event.values == 0.0
    cens_ = shared(cens)
    with weibull_model:
        y_obs = pm.Gumbel("y_obs", η[~cens_], s, observed=y_std[~cens])
    with weibull_model:
        y_cens = pm.Potential("y_cens", gumbel_sf(y_std[cens], η[cens_], s))
    SEED = 845199  # from random.org, for reproducibility
    SAMPLE_KWARGS = {"chains": 3, "tune": 100, "random_seed": [SEED, SEED + 1, SEED + 2]}
    with weibull_model:
        weibull_trace = pm.sample(**SAMPLE_KWARGS)
    return weibull_model, weibull_trace, dist
The problem is that the randomly generated array, dist, stops changing after two function calls: from the third call onward it has exactly the same values as in the second call. For example, if I call the function five times, the dist values from the first two calls differ, but the third, fourth and fifth calls all return exactly the same dist array as the second call. What is even stranger is that this happens only when the PyMC3 model is trained to generate the trace. I checked this by commenting out all the PyMC3-related code, to make sure I was not misusing the NumPy random function.
I have been stuck on this problem for almost two weeks now; any help would be appreciated.
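The symptom described above is consistent with NumPy's global random state being reseeded with a fixed value during training. The following is a minimal NumPy-only sketch of that effect, not a confirmed diagnosis. `fake_training` is a hypothetical stand-in that assumes `pm.sample(random_seed=...)` ends up calling `np.random.seed` with a fixed seed, and `run_global` and `run_local` are illustrative names that do not appear in the code above.

```python
import numpy as np

SEED = 845199  # same fixed seed as in the question


def fake_training():
    # ASSUMPTION: stands in for pm.sample(random_seed=...), which is assumed
    # here to reseed NumPy's global random state with a fixed value.
    np.random.seed(SEED)
    np.random.normal(size=10)  # the "sampler" consumes some random numbers


def run_global():
    # Mirrors tutorial_model: draw from the global state, then "train".
    dist = np.random.choice([0, 1], size=500)
    fake_training()
    return dist


# A dedicated generator that reseeding the global state cannot touch.
_rng = np.random.default_rng()


def run_local():
    dist = _rng.choice([0, 1], size=500)
    fake_training()
    return dist


global_draws = [run_global() for _ in range(5)]
local_draws = [run_local() for _ in range(5)]

# With the global state, calls 2 to 5 start from the same reseeded state.
print([np.array_equal(global_draws[1], d) for d in global_draws[2:]])
# With the dedicated generator, consecutive calls keep differing.
print([np.array_equal(local_draws[1], d) for d in local_draws[2:]])
```

If the assumption holds, drawing `dist` in `generate_dist` from a dedicated `Generator` rather than from `np.random.choice` would be one way to keep it independent of the sampler's seed.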