PyMC-Marketing expected_probability_alive

Hi,

I am trying to build a model with PyMC-Marketing. I created a model with the following config:
model_config = {
    'a_prior': {'dist': 'HalfNormal', 'kwargs': {'sigma': 100}},
    'b_prior': {'dist': 'HalfNormal', 'kwargs': {'sigma': 100}},
    'alpha_prior': {'dist': 'HalfNormal', 'kwargs': {'sigma': 100}},
    'r_prior': {'dist': 'HalfNormal', 'kwargs': {'sigma': 100}},
}
sample_kwargs = {
    "draws": 2_000,
    "chains": 5,
    "target_accept": 0.9,
    "random_seed": 42,
}

from pymc_marketing import clv

bgm = clv.BetaGeoModel(
    data=data_summary_rfm,
    model_config=model_config,
)
bgm.build_model()
bgm

bgm.fit(**sample_kwargs)

My data is pretty big: I have around 500,000 unique customers. I am trying to compute the probability that each customer is alive over the next 90 days. Even when I run in batches, the kernel gets killed. Is there a better way to do this?
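A rough back-of-the-envelope estimate may explain why the kernel dies. The sketch below assumes that expected_probability_alive returns one float64 value per chain, draw, and customer, and it ignores any temporary arrays created along the way, so the real peak is probably higher. The helper name posterior_array_bytes is made up for illustration.

```python
def posterior_array_bytes(n_customers, chains, draws, bytes_per_value=8):
    """Size of one (chain, draw, customer) array, assuming float64 values."""
    return n_customers * chains * draws * bytes_per_value


GB = 1e9

# Sampler settings from the post: 5 chains x 2,000 draws.
full = posterior_array_bytes(500_000, chains=5, draws=2_000)
batch = posterior_array_bytes(5_000, chains=5, draws=2_000)

print(f"all customers at once: {full / GB:.1f} GB")   # 40.0 GB
print(f"one batch of 5,000:    {batch / GB:.1f} GB")  # 0.4 GB
```

Under these assumptions even a single batch is several hundred megabytes per time step, so cutting the number of posterior draws or the batch size should reduce the footprint proportionally.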

steps = 90
batch_size = 5000

sdata = data_summary_rfm.copy()

future_alive_all = []

for start in tqdm(range(0, len(sdata), batch_size)):
    end = start + batch_size
    batch = sdata.iloc[start:end]

    for t in progress_bar(range(steps)):
        future_data = batch.copy()
        future_data["T"] = future_data["T"] + t
        future_alive = bgm.expected_probability_alive(data=future_data)
        # future_alive_all.append(future_alive)
        # Note: batch_result is overwritten on every step, so only the
        # last step (t = steps - 1) survives for each batch.
        batch_result = pd.DataFrame({
            'customer_id': batch['customer_id'].values,
            'Prob_of_alive': future_alive.mean(('chain', 'draw')).values,
        })

    # Append the result to the list
    future_alive_all.append(batch_result)

You will probably want to thin your fit result (the posterior draws) with the thin_fit_result method, in addition to running in batches.
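To illustrate what thinning means here, a minimal pure-Python sketch: keep every k-th posterior draw and compare the summary statistic before and after. The draws below are random stand-ins, not real model output, and the thin helper is hypothetical; it only mimics the idea behind thin_fit_result, which, as far as I know, takes a keep_every argument and returns a copy of the model with fewer draws (check the docs for your installed version).

```python
import random


def thin(draws, keep_every):
    """Keep every `keep_every`-th draw, starting with the first one."""
    return draws[::keep_every]


rng = random.Random(42)

# Stand-in for one customer's 2,000 posterior draws of P(alive).
draws = [rng.random() for _ in range(2_000)]
thinned = thin(draws, keep_every=10)

full_mean = sum(draws) / len(draws)
thin_mean = sum(thinned) / len(thinned)

print(len(draws), len(thinned))  # 2000 200
print(round(full_mean, 3), round(thin_mean, 3))
```

With the real model this would presumably look like calling bgm.thin_fit_result(keep_every=10) once and then calling expected_probability_alive on the returned model inside the batch loop. Keeping every 10th draw shrinks each per-batch array by roughly a factor of 10, while the posterior mean usually changes little because neighbouring MCMC draws tend to be correlated.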