Hello! I’m using pymc-marketing’s CLV-related models and have some questions as a new user.
I feel kinda dumb, but there doesn’t seem to be much discussion of CLV usage online.
I’m using the Pareto NBD model rather than the BetaGeoModel ('cause I don’t want to assume customer active-ness for new users), and am trying to find faster sampling/fitting methods (that aren’t MAP). I’ve been going back and forth between fit_method="demz" and fit_method="mcmc", but I’ve run into issues with both.
For context, I’m working with about 690,000 customers in a terminal-based environment (Spyder), and I am using these versions:
- pymc - 5.20.1
- pymc_marketing - 0.11.1
- nutpie - 0.13.4
For the demz fitting, I find that I constantly need to increase the draw and tune parameters, and often a chain will get stuck and estimate 6+ hours for completion (I’ve read that this could mean my posterior geometry is problematic, but I’ve no idea how to remedy this).
For the mcmc fitting, I find that the default sampler can take a really, really long time. Before, when I was experimenting with the BetaGeoModel, I tried the nutpie sampler and found it to be wonderful! But it seems applying nutpie to the mcmc fitting still takes a really long time, and no progress bar appears (even if I declare progress_bar=True).
So, my main issues are: re-runs, runtime, progress bars, and chains getting stuck.
Also, using plot_expected_purchases_ppc for the prior takes an extremely long time (10-30 minutes, even if I reduce the samples to 50).
Any guidance would be amazing, and I appreciate in advance any time given to this.
(I reiterate “long time” a lot, by which I typically mean about 4 to 12 hours.)
(No issues with using the Gamma-Gamma model and doing the final CLV-related portions.)
-Sarah