My goal is similar to this thread: Bayesian sample size estimation for given HPD.
The approach follows the one in this comment:
```python
import numpy as np
import pymc3 as pm
import theano

# define your model; shared variables let us swap in new data
# without rebuilding the model
n = theano.shared(100)    # number of trials
X = theano.shared(50.0)   # observed number of successes

with pm.Model() as model:
    p = pm.Beta('p', alpha=2, beta=2)
    y_obs = pm.Binomial('y_obs', p=p, n=n, observed=X)

sample_sizes = [10, 100, 1000, 10000]

# for loop over sample sizes
for s in sample_sizes:
    n.set_value(s)
    X.set_value(np.sum(new_observed_X))  # new_observed_X: freshly simulated data for this s
    with model:
        trace = pm.sample()
    # compute HPD from trace
```
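For the "compute HPD" step, a minimal sketch assuming ArviZ is available (the `hdi_prob` value is just an example):

```python
import arviz as az

hpd = az.hdi(trace['p'], hdi_prob=0.95)  # 95% highest-density interval for p
width = hpd[1] - hpd[0]                  # interval width, e.g. as a stopping criterion
```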
Is there a recommended approach to parallelizing the loop over increasing sample sizes? So far I have tried an EC2 instance with 96 cores, but that didn't really offer a speedup. I have also tried Python's
`concurrent.futures` library to parallelize the loop, but it doesn't offer any speedup either. The only other idea I have is to run each model on a separate instance (e.g. with Kubernetes).
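For reference, a minimal sketch of the kind of `concurrent.futures` attempt I mean, assuming each worker rebuilds the model and simulates its own data (`fit_one` and the placeholder data generation are illustrative, not my actual code):

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np
import pymc3 as pm


def fit_one(s):
    # placeholder data: s Bernoulli(0.5) draws, summed into a success count
    X = np.sum(np.random.binomial(1, 0.5, size=s))
    with pm.Model():
        p = pm.Beta('p', alpha=2, beta=2)
        pm.Binomial('y_obs', p=p, n=s, observed=X)
        # cores=1 so the worker processes don't spawn nested multiprocessing
        trace = pm.sample(cores=1, progressbar=False)
    return s, trace['p']  # return the posterior draws (a plain array pickles cleanly)


if __name__ == '__main__':
    sample_sizes = [10, 100, 1000, 10000]
    with ProcessPoolExecutor(max_workers=4) as ex:
        results = dict(ex.map(fit_one, sample_sizes))
```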
Any thoughts are greatly appreciated!