First of all, thanks for the tip on how to have different sizes of train/test data — that works.
However, I am crying because the vector-form thing didn't actually work… I believe it only seemed to work before because I had reduced the size of `X_train` so much that the `by_hand` approach and `X @ beta` ran at about the same speed. Now, again, it takes around 120 sec for the `by_hand` approach, while `X @ beta` takes ~18 min. Maybe it is because of the BLAS thing; please see my remarks below.
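To be concrete about the comparison, something like the following plain-numpy sketch shows the kind of timing I mean (the shapes are made up, and the loop body is only my guess at what a "by hand" dot product looks like; my actual `by_hand` lives inside the PyMC model):

```python
# Minimal timing sketch: loop-based dot product vs. vectorized X @ beta.
# Shapes and the by_hand implementation are illustrative assumptions.
import numpy as np
from timeit import timeit

rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 100))   # hypothetical X_train shape
beta = rng.standard_normal(100)

def by_hand(X, beta):
    # assumed row-by-row implementation of the dot product
    out = np.empty(X.shape[0])
    for i, row in enumerate(X):
        out[i] = row @ beta
    return out

print("by_hand :", timeit(lambda: by_hand(X, beta), number=3))
print("X @ beta:", timeit(lambda: X @ beta, number=3))
```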
According to the benchmark given by `check_blas.py`, `openblas/8` on an i7 950 took 3.70 s for "10 executions of gemm in float64 with matrices of shape 2000x2000", while mine took around 10 s for 10 executions of gemm with matrices of shape 5000x5000. So the timing seems OK to me(?), given that the i7 950 is a bit slower than my AMD Ryzen 7 5800H and the matrices I tested were 2.5x bigger.
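For anyone who wants to reproduce that measurement without `check_blas.py`, a plain-numpy version of the gemm test would be roughly this (iteration count and shapes match the numbers I quoted above):

```python
# Rough reproduction of the gemm benchmark: 10 executions of a
# float64 matrix multiply, dispatched to whatever BLAS numpy links.
import time
import numpy as np

n = 5000
a = np.random.random((n, n))
b = np.random.random((n, n))

t0 = time.perf_counter()
for _ in range(10):        # "10 executions of gemm"
    c = a @ b
elapsed = time.perf_counter() - t0
print(f"10 x gemm {n}x{n} float64: {elapsed:.2f}s")
```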
I do not have this flag; the only flag I have is `-lblas`. Should I install MKL (even though my CPU is AMD)? Or should I install AML?
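In case it helps, this is the kind of check I mean when I say the flag isn't there (assuming the PyTensor config option is `blas__ldflags`; on aesara it should be the equivalent `aesara.config.blas__ldflags`):

```python
# Inspect which BLAS the stack is actually linked against.
import numpy as np
np.show_config()          # shows the BLAS/LAPACK numpy was built with

import pytensor
# Assumed config option name; prints e.g. "-lblas" vs. "-lmkl_rt"
print(pytensor.config.blas__ldflags)
```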
-------------------------------EDIT-------------------------------
I have upgraded my PyMC to `pymc>=5.0.0` and installed MKL. It looks like the vector form is now working properly, but I'll test it a few more times before saying so for sure. It is worth mentioning that running the BLAS tests with both `aesara` and `pytensor` still doesn't show the `-lmkl_rt` flag, and both tests take the same ~10 sec. But now I get a lot of different warnings when importing and using `pymc`.
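For the record (and for the new thread), something like this sketch should collect those import-time warnings in one place; it needs a fresh interpreter so the import actually happens:

```python
# Collect the warnings raised while importing pymc.
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # don't let existing filters hide anything
    import pymc as pm                 # must be the first import of pymc

for w in caught:
    print(f"{w.category.__name__}: {w.message}")
```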
Nonetheless, I believe the main question of this thread has been answered, and I'll create a new thread with my other questions regarding MKL/BLAS and all these warnings.
Thanks once again for your very helpful assistance.