Hi,
I am trying to build a model with PyMC-Marketing. I created the model with the following config:
from pymc_marketing import clv

# Weakly informative HalfNormal priors on all four BG/NBD parameters
model_config = {
    'a_prior': {'dist': 'HalfNormal',
                'kwargs': {'sigma': 100}},
    'b_prior': {'dist': 'HalfNormal',
                'kwargs': {'sigma': 100}},
    'alpha_prior': {'dist': 'HalfNormal',
                    'kwargs': {'sigma': 100}},
    'r_prior': {'dist': 'HalfNormal',
                'kwargs': {'sigma': 100}},
}
sample_kwargs = {
    "draws": 2_000,
    "chains": 5,
    "target_accept": 0.9,
    "random_seed": 42,
}
bgm = clv.BetaGeoModel(
    data=data_summary_rfm,
    model_config=model_config,
)
bgm.build_model()
bgm
bgm.fit(**sample_kwargs)
My data is pretty big: I have around 500,000 unique customers. I am trying to get each customer's probability of being alive over the next 90 days. Even when I run the prediction in batches, the kernel gets killed. Is there a better way to do this?
import pandas as pd
from fastprogress import progress_bar
from tqdm import tqdm

steps = 90
batch_size = 5000
sdata = data_summary_rfm.copy()
future_alive_all = []
for start in tqdm(range(0, len(sdata), batch_size)):
    end = start + batch_size
    batch = sdata.iloc[start:end]
    for t in progress_bar(range(steps)):
        future_data = batch.copy()
        future_data["T"] = future_data["T"] + t
        future_alive = bgm.expected_probability_alive(data=future_data)
        # future_alive_all.append(future_alive)
        batch_result = pd.DataFrame({
            'customer_id': batch['customer_id'].values,
            'Prob_of_alive': future_alive.mean(('chain', 'draw')).values
        })
        # Append the result to the list
        future_alive_all.append(batch_result)
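
To put rough numbers on the memory side, here is a minimal sketch. It assumes each `expected_probability_alive` call returns a dense float64 array with one value per (chain, draw, customer), which is what the `.mean(('chain', 'draw'))` call above suggests but which I have not verified. Under that assumption, 5 chains x 2,000 draws x 5,000 customers x 8 bytes is about 400 MB per call, and about 40 GB for all 500,000 customers at once. The helper names `posterior_array_bytes`, `batch_slices`, and `reduce_in_batches`, and the `compute` callable, are hypothetical and not part of PyMC-Marketing; the sketch only shows the pattern of reducing each batch before keeping anything.

```python
from typing import Callable, Iterator, List, Sequence


def posterior_array_bytes(chains: int, draws: int, customers: int,
                          itemsize: int = 8) -> int:
    """Bytes needed for a dense (chain, draw, customer) array.

    itemsize=8 assumes float64 values (an assumption, not a checked fact).
    """
    return chains * draws * customers * itemsize


def batch_slices(n_rows: int, batch_size: int) -> Iterator[slice]:
    """Yield consecutive slices covering range(n_rows) in chunks of batch_size."""
    for start in range(0, n_rows, batch_size):
        yield slice(start, min(start + batch_size, n_rows))


def reduce_in_batches(n_rows: int, batch_size: int,
                      compute: Callable[[slice], Sequence[float]]) -> List[float]:
    """Run compute on each slice and keep only its already-reduced output.

    compute is expected to return one number per row of the slice, so the
    large per-draw array can be freed as soon as each call returns.
    """
    out: List[float] = []
    for rows in batch_slices(n_rows, batch_size):
        out.extend(float(x) for x in compute(rows))
    return out


if __name__ == "__main__":
    # Sizes for the sampler settings used in this post (5 chains, 2,000 draws).
    print(posterior_array_bytes(5, 2_000, 5_000) / 1e6, "MB per batch of 5,000")
    print(posterior_array_bytes(5, 2_000, 500_000) / 1e9, "GB for all 500,000")
```

With the names from this post, `compute` could be something like `lambda rows: bgm.expected_probability_alive(data=sdata.iloc[rows]).mean(('chain', 'draw')).values`, so that only one number per customer survives each batch.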