I wrote a model structured like a hierarchical linear regression.
My input data is large (8000 records) but is sufficiently normalized, I can assure you, for the model to converge. Even so, can my model produce good posterior distributions if I get these warnings in the trace while sampling?
I know this is a bit of a silly question, but I would like to know whether a model can work correctly even with these possible errors, and if not, what I can do to improve the model.
Thank you, from a PyMC3 beginner!
"if a model can work correctly even having these possible errors"
It might be heresy, but in my experience a few divergences don’t immediately disqualify a model from being useful…
However, a few thoughts:
- In general, it looks like your sampler was running for quite a long time for what appears to be a simple model and a smallish dataset. Assuming that you're using a reasonably capable machine, that run time and the divergences suggest to me that the posterior parameter space is hard to navigate. Per the error messages, you could try to reparameterise.
- Because it's a hierarchical model, you probably ought to reparameterise (and non-centre the hierarchical params) anyway; see this long-form example notebook for an explanation of why this is a good idea: https://docs.pymc.io/notebooks/Diagnosing_biased_Inference_with_Divergences.html.
- You might also consider using a different distribution for your sigma_slope. An Exponential has support at zero, and any samples at or very near zero will kill your variance and give the sampler a hard time (see the sketch just below this list).
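To make those last two points concrete, here's a minimal sketch of a non-centred varying-slope term, with an InverseGamma on the group-level sigma so there's essentially no mass at zero. All the names here (n_groups, group_idx, x, y) and the prior values are hypothetical placeholders, not taken from your model:

import pymc3 as pm

with pm.Model() as model:
    ## hyperpriors for the group-level slopes
    mu_slope = pm.Normal('mu_slope', mu=0., sigma=1.)
    ## InverseGamma density vanishes at zero, unlike an Exponential's
    sigma_slope = pm.InverseGamma('sigma_slope', alpha=11., beta=10.)
    ## non-centred: sample unit-scale offsets, then shift and scale them
    slope_offset = pm.Normal('slope_offset', mu=0., sigma=1., shape=n_groups)
    slope = pm.Deterministic('slope', mu_slope + slope_offset * sigma_slope)
    ## linear predictor indexed by each observation's group membership
    mu = slope[group_idx] * x
    sigma_y = pm.HalfNormal('sigma_y', sigma=1.)
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma_y, observed=y)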
Also, in general you probably want to include prior and posterior predictive checks in your workflow, e.g. https://docs.pymc.io/notebooks/posterior_predictive.html
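As a rough sketch of how those checks slot into the workflow (assuming model is your PyMC3 model and you have ArviZ installed):

import arviz as az
import pymc3 as pm

with model:
    ## sample outcomes implied by the priors alone: do they look plausible?
    prior_pred = pm.sample_prior_predictive(samples=500)
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
    ## replicate the data from the fitted posterior for comparison
    post_pred = pm.sample_posterior_predictive(trace)

idata = az.from_pymc3(trace=trace, prior=prior_pred,
                      posterior_predictive=post_pred)
az.plot_ppc(idata)  ## overlay replicated draws on the observed data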
Thank you so much, @jonsedar!
I can see how big this community is. Recently, while looking for references to write my model, which is for university work in Spain, I watched your YouTube video and studied this case:
https://www.youtube.com/watch?v=Jb9eklfbDyg
Do you have similar examples covering GLM-Logistic?
I will look at the documentation and the advice you sent me, and I will tell you in another message if I have any doubts. Thank you very much.
Haha, that was quite a while ago! I'm pretty sure general practice (and certainly my own experience) has moved on quite a lot since then, so please don't use my old models for anything! They're not non-centred, for a start…
Do you mean this one? https://docs.pymc.io/notebooks/GLM-logistic.html
Haha, don't be so hard on yourself! It helped me to get an idea.
Yes, that could be a good one!
I will briefly explain why I use GLM-Logistic in my model, and if you can tell me, from your experience, whether my reasoning in building the PyMC3 model is correct, that would be great; it would help me a lot to hit the target:
- I have to predict which types of transactions are most likely to be cases of money laundering in the accounts of a bank (likelihood → Bernoulli: 0 → normal behavior; 1 → criminal behavior).
- My synthetic dataset contains 8000 example transactions of 8 different types (supermarket spending, gas station spending, and so on up to 8), and within those 8 types, 3 characteristics of each type of transaction (day of the month on which the transaction is executed, number of times it is performed in a month, and individual amount of each transaction).
That is how I have created my pymc3 model, as you can see again here. Roughly speaking, is GLM-Logistic a good choice?
Thank you for your time, @jonsedar. Greetings from Spain!
Sure, a Bernoulli likelihood doesn't sound unreasonable (and greetings from Korea, btw).
As it happens, I have a linear submodel very much like that as part of a bigger model I'm building right now. The following probably doesn't mean much without a lengthy explanation, but if you're thinking about this already, I would hope it helps more than hinders.
The linear model comprises 21 'insured states' (US states) that I've expressed as a non-centred, part-pooled hierarchy, plus about 8 other features (all pooled); this is transformed through an invlogit to become psi, which informs the Bernoulli likelihood pi.
import pymc3 as pm
import theano.tensor as tt

## (inside a pm.Model() context; x_insured_state_psi, x_psi and y_pi
##  are data arrays defined elsewhere in the bigger model)

## hierarchical intercept on insured state, non-centered param
psi_insured_state_mu = pm.Normal('psi_insured_state_mu', mu=0., sigma=1.)
psi_insured_state_sigma = pm.InverseGamma('psi_insured_state_sigma',
                                          alpha=11., beta=10.)
psi_insured_state_offset = pm.Normal('psi_insured_state_offset', mu=0., sigma=1.,
                                     dims='names_insured_state_topn')
psi_insured_state = (psi_insured_state_mu +
                     psi_insured_state_offset * psi_insured_state_sigma)

## the rest of the features, pooled linear model
psi_b = pm.Normal('psi_b', mu=0., sigma=1., dims='names_j_psi')
psi_bx = psi_insured_state[x_insured_state_psi] + tt.dot(psi_b, x_psi.T)
psi = pm.math.invlogit(psi_bx)

## bernoulli likelihood
pi = pm.Bernoulli('pi', p=psi, observed=y_pi, dims='obs_id_psi')
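One note on the dims= arguments above: they assume named coordinates were registered when the model was created (available in PyMC3 >= 3.9). Something like the following, where the label arrays are placeholders:

import numpy as np
import pymc3 as pm

coords = {'names_insured_state_topn': state_names,  ## 21 state labels
          'names_j_psi': feature_names,             ## pooled feature names
          'obs_id_psi': np.arange(len(y_pi))}       ## one id per observation
with pm.Model(coords=coords) as model:
    ...  ## the snippet above goes inside this context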
Thank you very much for showing me this example; it helps me see my case from another angle.
I forgot to show you before, and it is another doubt I have: I checked my model with check_test_point(), but I don't really know how to interpret the output with respect to my model. Does it give reasonable values? It's my last question, hehe.
Well, there are no -infs! I don't think this function gives you much introspection into the model architecture, though; you probably want to look at the prior and posterior predictive checks, e.g. https://docs.pymc.io/notebooks/posterior_predictive.html
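For what it's worth, as I understand it, check_test_point() just evaluates each free variable's log-probability at the model's initial test point, so it's mainly a smoke test for bad starting values:

## one log-probability per free variable, evaluated at the model's
## starting test point; a -inf or nan here flags an invalid start
print(model.check_test_point())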
Cheers, Jon