Combining multiple traces

Hi all,

I’m training on a relatively large dataset, 20000+ data points, and I’ve noticed that sampling time improves significantly if I can bring the dataset down to around 1000 points. So naturally I’m looking for ways to batch the sampling. I have two questions to this end:

  1. Assuming that the distributions of the training data in each of the batches are sufficiently similar, are there any theoretical issues with sampling in this way?

  2. On the practical side of things, I would like to combine the traces, as in append/concatenate them together successively, and I don’t see any built-in methods for doing this. Is there a recommended or easy way to do this? It seems that item assignment is prohibited.

trace['coeff',:] = np.concatenate((trace['coeff',:],trace1['coeff',:]))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
----> 1 trace['coeff',:] = np.concatenate((trace['coeff',:],trace['coeff',:]))

TypeError: 'MultiTrace' object does not support item assignment

Hi!

It has been some time since you asked this question, but I have the same doubts. Is it possible to combine several traces?

Yes. With the traces as InferenceData (you can use the return_inferencedata argument of pm.sample), use arviz.concat to combine multiple traces.
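
For anyone landing here later, a minimal sketch of what that could look like. It assumes an ArviZ release that still provides InferenceData and az.concat (the pre-1.0 API), and it uses az.from_dict with random arrays as stand-ins for real sampling results; the variable name "coeff" is borrowed from the question, while the shapes and the names idata_a / idata_b are made up for illustration.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)

# Stand-ins for two sampling runs: arrays shaped (chain, draw) for a
# variable called "coeff". With PyMC these objects would instead come
# from pm.sample(..., return_inferencedata=True).
idata_a = az.from_dict(posterior={"coeff": rng.normal(size=(2, 500))})
idata_b = az.from_dict(posterior={"coeff": rng.normal(size=(2, 500))})

# Append the draws of the second run to those of the first, chain by chain.
combined_draws = az.concat(idata_a, idata_b, dim="draw")

# Alternatively, treat the second run as additional chains.
combined_chains = az.concat(idata_a, idata_b, dim="chain")

print(combined_draws.posterior["coeff"].sizes)
print(combined_chains.posterior["coeff"].sizes)
```

Without the dim argument, az.concat merges objects that hold different groups rather than stacking samples, so dim="draw" or dim="chain" is the part that matters for this use case.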
