Imbalanced Binary Classification With pymc3

I have a binary classification problem with around 15 features, which I selected using another model. Now I want to fit a Bayesian logistic regression on these features. My target classes are highly imbalanced (the minority class is 0.001%), and I have around 6 million records. I want to build a model that can be retrained nightly or on weekends using Bayesian logistic regression.

Currently, I have divided the data into 15 parts. I train my model on the first part and test it on the last part, then I update my priors using pymc3's Interpolated distribution and rerun the model on the second part. I check the accuracy and other metrics (ROC, F1 score) after each run.
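
For concreteness, the update step described above can be sketched roughly as follows. This is a minimal sketch modeled on the "updating priors" pattern with `pm.Interpolated`, not the exact code in use: the names `posterior_to_grid`, `fit_chunk`, and `priors` are hypothetical, the starting `Normal` priors and the padding factor are assumptions, and it assumes `pm.sample` returns a `MultiTrace` (the PyMC3 3.x default).

```python
import numpy as np
from scipy import stats


def posterior_to_grid(samples, n_points=100, pad=3.0):
    """Turn 1-D posterior samples into (x, pdf) points for an interpolated prior."""
    samples = np.asarray(samples, dtype=float)
    smin, smax = samples.min(), samples.max()
    width = smax - smin
    x = np.linspace(smin, smax, n_points)
    y = stats.gaussian_kde(samples)(x)
    # Pad the support with zero-density end points so that values somewhat
    # outside the sampled range are not ruled out entirely in the next run.
    x = np.concatenate([[x[0] - pad * width], x, [x[-1] + pad * width]])
    y = np.concatenate([[0.0], y, [0.0]])
    return x, y


def fit_chunk(X, y, priors=None, draws=1000, tune=1000):
    """Fit a logistic regression on one chunk of data.

    `priors` (hypothetical) maps a variable name to posterior samples from
    the previous chunk; with priors=None, assumed Normal priors are used.
    """
    import pymc3 as pm  # imported lazily so posterior_to_grid works without PyMC3

    names = ["intercept"] + [f"beta_{j}" for j in range(X.shape[1])]
    with pm.Model():
        if priors is None:
            params = [pm.Normal(names[0], mu=0.0, sigma=10.0)]
            params += [pm.Normal(n, mu=0.0, sigma=1.0) for n in names[1:]]
        else:
            params = [pm.Interpolated(n, *posterior_to_grid(priors[n])) for n in names]
        logit_p = params[0] + pm.math.dot(X, pm.math.stack(params[1:]))
        pm.Bernoulli("y_obs", logit_p=logit_p, observed=y)
        trace = pm.sample(draws, tune=tune)
    # Samples per variable, ready to be passed as `priors` for the next chunk.
    return {n: trace[n] for n in names}
```

A driver loop would call `fit_chunk` on the first part with `priors=None` and pass each returned dictionary as `priors` for the next part. Note that building one interpolated prior per coefficient keeps only the marginal posteriors, so any correlation between coefficients is dropped at each update.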


  1. My score is not improving.
  2. Am I using the right approach?
  3. This process is taking too much time.
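
For scale, the counts implied by the figures in this post can be worked out as below. This is only back-of-the-envelope arithmetic, assuming the 0.001% applies to all 6 million records and that the 15 parts are equal in size (the equal split is an assumption). It comes to about 60 minority-class records in total, about 4 per part, and an accuracy of about 99.999% for a model that always predicts the majority class.

```python
# Figures taken from the post; equal-sized parts are an assumption.
n_records = 6_000_000
minority_one_in = 100_000  # 0.001% is 1 record in 100,000
n_parts = 15

# Total minority-class records and the average number per part.
n_positive = n_records // minority_one_in
positives_per_part = n_positive / n_parts

# Accuracy of a model that always predicts the majority class.
majority_baseline_accuracy = 1 - n_positive / n_records

print(n_positive, positives_per_part, majority_baseline_accuracy)
```

With so few minority-class records per part, accuracy on its own says little about how well the minority class is being detected.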

If someone can guide me toward the right approach, with code snippets, it would be very helpful.

Have you solved this problem?