Many thanks, @cluhmann and @ricardoV94!
Here’s the corrected code (which I can’t copy-paste directly into this editor from my notebook - long story):
import numpy as np
import pymc3 as pm

# `data` is assumed to be a pandas DataFrame with a 'late_days' column
lower_bound = 0
upper_bound = np.max(data['late_days'])

with pm.Model() as threshold_model:
    # Uniform prior on the location, spanning the observed range
    late_days = pm.Uniform(
        'payment_behavior',
        lower = lower_bound,
        upper = upper_bound
    )
    degrees_freedom = pm.Exponential('degrees_freedom', lam = 1)
    thresholds = pm.StudentT(
        'thresholds',
        nu = degrees_freedom,
        mu = late_days,
        sigma = 1,
        observed = data['late_days'].values
    )
    trace = pm.sample(5_000, tune = 5_000) # based on your input re samples
Regarding the other points:
- I’m running the project on pymc3 3.11.4 (macOS Ventura).
- The complete warning stated:
/opt/homebrew/anaconda3/envs/pymc_framework/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:624: RuntimeWarning: overflow encountered in _beta_ppf
return _boost._beta_ppf(q, a, b)
Sampling 4 chains for 1
Then the kernel crashed and sampling was interrupted.
However, I have now rerun the code successfully, and I conclude that my mistake was requesting too many samples.
As a follow-up question: what criteria should I take into account when choosing values for the draws and tune parameters?
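To make the question concrete: one criterion I have seen mentioned is the R-hat convergence diagnostic, which compares between-chain and within-chain variance and should be close to 1 when the chains agree. Below is a minimal NumPy sketch of the basic (not rank-normalized) split R-hat, just to show the idea. The function name `split_rhat` and the simulated draws are my own assumptions for illustration, not part of PyMC3, and libraries such as ArviZ provide more refined versions.

```python
import numpy as np

def split_rhat(draws):
    """Basic split R-hat for one scalar parameter.

    `draws` is assumed to have shape (n_chains, n_draws).
    """
    draws = np.asarray(draws, dtype=float)
    half = draws.shape[1] // 2
    # Split every chain in half so that drift within a chain is also detected.
    halves = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = halves.shape[1]
    between = n * halves.mean(axis=1).var(ddof=1)  # between-chain variance B
    within = halves.var(axis=1, ddof=1).mean()     # within-chain variance W
    var_hat = (n - 1) / n * within + between / n   # pooled variance estimate
    return float(np.sqrt(var_hat / within))

# Simulated draws (hypothetical, not from my model): 4 chains x 1000 draws.
rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))           # chains sample the same target
stuck = mixed + 3.0 * np.arange(4)[:, None]  # chains sit at different locations

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
print(rhat_mixed, rhat_stuck)
```

If this is the right way to think about it, I would increase tune or draws only while a diagnostic like this stays clearly above 1, but I would appreciate confirmation.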
Thank you again!