Hello all,
I bought a new PC with a good CPU so that I could run my pymc3 models quickly. At first it was incredibly fast: NUTS initialization seemed nearly instantaneous. However, after repeatedly running pymc3 scripts for a day, it now takes about 10 seconds for NUTS initialization to complete and sampling to start. I’m not familiar enough with the package to troubleshoot this. Is this a Theano-related problem? I feel like I must be doing something wrong, given how quickly it worked when the PC was new a few days ago. Let me know if you have any suggestions.
I get the following warning, which may be related to the slowdown:
WARNING (theano.link.c.cmodule): Deleting (broken cache directory [EOF]): ./.theano/compiledir_linux-5.15–generic-x86_64-with-glibc2.35 and so on
The cache is the only thing I can think of that might “accumulate” over successive runs of a given pymc script. You might check out what all is in that directory.
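If it helps, here is a rough sketch for checking how much has piled up in there. It only uses the standard library and takes the directory as an argument; `summarize_cache` is a name I made up for illustration, and counting entries whose names start with "tmp" is just a guess at what is worth looking for, based on the path in the warning.

```python
import os
from pathlib import Path


def summarize_cache(cache_dir):
    """Return (entry_count, total_bytes, tmp_count) for a compile cache directory.

    Hypothetical helper: entry_count is the number of top-level entries,
    total_bytes sums the sizes of all regular files below the directory, and
    tmp_count is the number of top-level entries whose name starts with "tmp".
    """
    root = Path(cache_dir).expanduser()
    entries = list(root.iterdir())

    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            # Skip symlinks so a dangling link does not raise an error.
            if not os.path.islink(file_path):
                total_bytes += os.path.getsize(file_path)

    tmp_count = sum(1 for entry in entries if entry.name.startswith("tmp"))
    return len(entries), total_bytes, tmp_count
```

You would pass it the directory named in the warning, or `theano.config.compiledir` if you have theano imported. Comparing the numbers right after a fresh start and again once things feel slow should show whether the cache is really what grows.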
I’d also recommend switching to v4. Just my solo experience, but this kind of thing has been less of a headache for me compared to v3.
That’s fair, I will do that next. I’ve always put it off because I don’t want to deal with syntax changes and such. I looked around for a documented list of changes for this purpose but couldn’t find one. Do you or anyone else know if this exists?
Never mind, found something!
Hi, I’ve been getting this message throughout my use of PyMC (probably getting close to 2 years now) and I was wondering when you get it.
For me, I run models on WSL 2 and Ubuntu inside a tmux terminal. What I find is that over time, similar models or reruns of the same models gradually get slower and slower until I get fed up and restart my computer. Afterwards, on the first run I get this warning, but my sampling speed is back to normal. It’s an issue I’ve just lived with because I haven’t found a solution, but I was wondering if you had experienced the same thing.
I’m using v5 now, but it is still the same warning, just with pytensor instead of theano.
Thanks in advance to you or anyone else that can help.
What warning are you getting? And can you share a small reproducible example?
Hi Ricardo,
I get a warning like this:
WARNING (pytensor.link.c.cmodule): Deleting (broken cache directory [EOF]): ./.pytensor/compiledir_linux-5.15–generic-x86_64-with-glibc2.35/tmp…
I can’t really provide a reproducible example. All I know is that over time the sample rate (using DEMetropolisZ at the moment) gradually gets slower if I’m running scripts repeatedly. I usually just solve it by rebooting my computer. When I run the first model after a reboot I usually get the above warning, but my sample rates have returned to normal.
I’ve only just realised that there may be a link between the sample rate slowing down and having crashes, or maybe cancelling a run with Ctrl + C.
What I’m thinking of trying is deleting all the ‘tmp’ files in
./.pytensor/compiledir_linux-5.15–generic-x86_64-with-glibc2.35
rather than rebooting and see if that works. Is that safe to do?
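Roughly what I have in mind is the sketch below. `remove_tmp_dirs` is just a name I made up, and it assumes the leftovers are top-level directories whose names start with "tmp", which is what the warning suggests but which I haven't verified. By default it is a dry run that only lists what it would delete.

```python
import shutil
from pathlib import Path


def remove_tmp_dirs(compiledir, dry_run=True):
    """List (and optionally delete) top-level 'tmp*' directories in compiledir.

    Hypothetical helper. With dry_run=True nothing is deleted; the matching
    directories are only returned so they can be inspected first.
    """
    root = Path(compiledir).expanduser()
    targets = sorted(p for p in root.glob("tmp*") if p.is_dir())

    if not dry_run:
        for target in targets:
            # ignore_errors avoids a crash if a directory vanishes mid-run.
            shutil.rmtree(target, ignore_errors=True)

    return targets
```

I would call it once with the default `dry_run=True` to see the list, and only pass `dry_run=False` while no model is compiling or sampling.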
You can run pytensor-cache purge
but it sounds like there may be a bug, which would be great to report if you could provide more info. Do you notice memory or CPU activity increasing over runs?
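If you want something concrete to attach to a report, a minimal sketch like the one below could record wall time and peak memory for every run of a script. `log_run` and the CSV layout are made up for illustration, and `run_once` stands for whatever function builds and samples your model. It relies on the standard-library `resource` module, so it assumes Linux (WSL should be fine), where `ru_maxrss` is reported in kilobytes.

```python
import csv
import resource
import time
from datetime import datetime
from pathlib import Path


def log_run(run_once, log_path):
    """Time one call of run_once and append the result to a CSV log.

    Hypothetical helper: run_once is any zero-argument callable (for example
    a function that builds and samples a model). Returns (seconds, peak_rss).
    """
    log_file = Path(log_path)
    is_new = not log_file.exists()

    start = time.perf_counter()
    run_once()
    seconds = time.perf_counter() - start

    # Peak resident set size of this process so far (kilobytes on Linux).
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    with log_file.open("a", newline="") as fh:
        writer = csv.writer(fh)
        if is_new:
            writer.writerow(["timestamp", "seconds", "peak_rss"])
        timestamp = datetime.now().isoformat(timespec="seconds")
        writer.writerow([timestamp, f"{seconds:.3f}", peak_rss])

    return seconds, peak_rss
```

Because the log is appended to on each invocation, it keeps accumulating across separate script runs, so a gradual slowdown between one run and the next should show up as a rising "seconds" column.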
Hi Ricardo, I finally got to the point where it happened again. I can confirm that purging the pytensor cache returns sampling speeds to their normal level. As for CPU usage and memory, they are at around 100% and 83% respectively during normal operation, though I’m not sure that helps at all.