Run time of sample_prior_predictive

I’m trying to generate prior predictive samples from a pretty large model (267 variables). It’s causing python to eat up a full CPU and become unresponsive.

I’m not sure why this should happen, because this is effectively a Bayes network with no evidence, so it should be possible to topologically sort all the variables and then sample them in that order. So AFAICT, even a naive implementation should be at worst quadratic (n log n to sort, then linear in the number of samples).
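For concreteness, the scheme I have in mind is something like this toy sketch (the graph and the per-node samplers are made up; this is not pymc3 code):

import numpy as np
from graphlib import TopologicalSorter  # Python 3.9+

# Toy Bayes net: x depends on mu and sigma, which have no parents.
parents = {"mu": [], "sigma": [], "x": ["mu", "sigma"]}

def draw_node(name, drawn):
    # Hypothetical per-node samplers standing in for each variable's prior.
    if name == "mu":
        return np.random.normal(0.0, 1.0)
    if name == "sigma":
        return abs(np.random.normal(0.0, 1.0))
    return np.random.normal(drawn["mu"], drawn["sigma"])

# Sort once, then one linear forward pass per sample.
order = list(TopologicalSorter(parents).static_order())
sample = {}
for name in order:
    sample[name] = draw_node(name, sample)
print(sample)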

Does this mean I’m doing something horribly wrong? Any suggestions for debugging my model when this happens?

Thanks!

Hmm, maybe you can share your example with @lucianopaz and @colcarroll. I find the current implementation fairly fast (not as fast as writing it by hand in vanilla scipy random, but pretty close, and much faster than using a for loop).

The current implementation of sample_prior_predictive relies heavily on the function distributions.distribution.draw_values.

At the moment, this function does not use the Bayes network explicitly and does not do a topological sort. What it does (see the short example after this list) is:

  1. It does some type checking to see if the variables passed to draw_values are numbers, arrays, constants or shared variables, so it can return those values directly.
  2. If not, it checks whether their values are defined in the point dictionary.
  3. If not, it checks to see if the variables have a random or a distribution.random method, and if they do, calls them to get the values drawn.
  4. If not, it tries to compile the theano expression into a function and calls it.
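As a rough illustration of that dispatch order (not the exact internals, and the details may differ between pymc3 versions), you can call draw_values directly on a mix of plain values and random variables:

import numpy as np
import pymc3 as pm
from pymc3.distributions.distribution import draw_values

with pm.Model():
    mu = pm.Normal('mu', 0.0, 1.0)
    x = pm.Normal('x', mu=mu, sd=1.0)
    # The number and the array fall out at step 1 and are returned as-is;
    # mu and x reach step 3 and are drawn through their random methods,
    # which is where the recursion described below starts.
    values = draw_values([2.5, np.ones(3), mu, x])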

Crucially, point 3 enters a recursion, because the distribution’s random methods usually start with a draw_values call to get the distribution’s parameter values. Effectively, these nested calls travel through the Bayes network backwards. For a large hierarchical model, the nested calls can lead to very deep recursion, which could be the cause of the long runtime.
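Schematically, a distribution’s random method does something like this (a simplified sketch of the pattern, not the actual pymc3 source):

import numpy as np
from pymc3.distributions.distribution import draw_values

class MyNormal:
    # Simplified sketch of the pattern used by pymc3 distributions.
    def __init__(self, mu, sd):
        self.mu = mu
        self.sd = sd

    def random(self, point=None, size=None):
        # First resolve this distribution's own parameters; if they are
        # themselves random variables, draw_values recurses into their
        # random methods, walking the Bayes network backwards.
        mu, sd = draw_values([self.mu, self.sd], point=point, size=size)
        # Only then generate this variable's samples.
        return np.random.normal(loc=mu, scale=sd, size=size)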

There is also the issue that the nested calls do not propagate the distribution values outward, so in the end, each variable is sampled from its marginal distribution, or in some hierarchical models, they are sampled from a totally wrong distribution (check out this issue here and this discourse thread here). We are working on a solution to this problem that explicitly uses the Bayes network, does a topological sort and then a forward pass through the network, but it is not finished and at the moment is also slow (take a look at this PR).
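To make the first problem concrete, a minimal hierarchical model of the kind where this matters looks like the following (a toy example; whether the bug actually shows up depends on your pymc3 version):

import pymc3 as pm

with pm.Model():
    mu = pm.Normal('mu', 0.0, 10.0)
    # In a correct forward pass, every prior predictive draw of y should be
    # centred on the mu drawn in the same iteration; if the parent draw is
    # not propagated, y's samples come from the wrong distribution.
    y = pm.Normal('y', mu=mu, sd=1.0)
    prior = pm.sample_prior_predictive(samples=500)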


Quick question – would it help for me to sort the variable list when I pass it to sample_prior_predictive, as a work-around while the rewrite is being prepared?

Thanks!

What is the best way to share a model? I’m happy to share mine, but my code has a huge dependency tail, and the models are changing moment-to-moment, as I work to debug the model (this is what I am using prior sampling for).

So getting a minimal body of code to share is not feasible for me.

But if there’s some way for me to persist my model so that it can be loaded by others, I’d be happy to share it. Is pickling the way to go? I have found problems where the pickle format is brittle. Is there a more robust serialization method I should use?

Regrettably, sorting won’t help you for now, because the point dictionary is not updated as samples get drawn. The only solution for now would be to loop over the variables in the sorted order, manually call draw_values inside the loop for each variable, and update both the inputs and the givens for the next iteration (see the sketch below). This would completely bypass sample_prior_predictive, and by updating point you should only go one level deep into the recursion.
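Something along these lines should work (a rough sketch, assuming you can supply your own topologically ordered list of the model’s variables; it is not a drop-in replacement):

from pymc3.distributions.distribution import draw_values

def manual_prior_draws(sorted_vars, n_draws):
    # sorted_vars must list parents before children (your own topological
    # order). Feeding each draw back in through `point` means the next call
    # to draw_values only needs to go one level deep.
    draws = []
    for _ in range(n_draws):
        point = {}
        for var in sorted_vars:
            value = draw_values([var], point=point)[0]
            point[var.name] = value
        draws.append(point)
    return draws

# e.g. manual_prior_draws([mu, sigma, x], 500) with your own ordering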

About the model serialization, I don’t really know… pickle won’t work without the actual code of the class definitions. I don’t know of other standard alternatives. Maybe dill?

But if the object is of type pm.Model, and uses all standard random variable types, wouldn’t pickling give something reloadable? Or would there be enough differences in things like compiled theano code to break the pickle loader?

If you contact me off-list, I could try and send it to you as an experiment.

Sorry for the delay in replying.

I had interpreted from this phrase you wrote

[…] but my code has a huge dependency tail, […]

to mean that your model depended on a lot of custom code, like custom Ops or Distributions. But if that is not the case, then you should be able to share your model by just pickling it. I’m not really sure what would happen with the data of shared tensors though. You could check if it’s properly pickled too by trying to pickle.dump your model somewhere and then pickle.load it from a different location or python session, in which you only imported pymc3.
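For example, something like this (a minimal sketch, assuming the model object pickles cleanly):

import gzip
import pickle

import pymc3 as pm

with pm.Model() as model:
    mu = pm.Normal('mu', 0.0, 1.0)

# Dump the model in the session where it was built...
with gzip.open('model.pickle.gz', 'wb') as f:
    pickle.dump(model, f)

# ...then load it back in a fresh session that only imports pymc3:
with gzip.open('model.pickle.gz', 'rb') as f:
    reloaded = pickle.load(f)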

Yes, your interpretation is correct. The dependency tail is just the code needed to build the model. I have pickled this and can reload it as follows:

import pymc3
import gzip
import pickle

with gzip.GzipFile('/tmp/pickled_model.pickle.gz', 'rb') as file:
    model = pickle.load(file)

The model I have saved has 92 un-transformed variable names (184 total variable names), and it takes about 10 minutes on a new MacBook Pro to generate 500 prior samples. It took about 20 minutes to generate 500 samples (plus training and other overhead) using pm.sampling.sample.

Would it be helpful for me to make an issue about sampling time and provide this model as an attachment?

Yes, you can submit an issue on GitHub. However, given that the whole implementation of draw_values will change in the near future to solve bugs that the present implementation has, it is most likely that we won’t address the slowness problem until draw_values is finally pushed to its new form. I’m sorry but I don’t know when that will happen.


Would it be duplicating effort if I was to try to sort the variables inside sample_prior_predictive to make things more efficient there?

Do you mean that you would like to contribute a pull request that only does a topological sort of the nodes before the call to draw_values? That may be a duplicate effort but any contribution is truly welcome.
Have you tested if draw_values is faster for your model when the input list is topologically sorted?
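Something as simple as this would tell you (a toy sketch; you would substitute your own model and your own ordering):

import time

import pymc3 as pm
from pymc3.distributions.distribution import draw_values

# Tiny stand-in model; for the real test, load your pickled model instead.
with pm.Model() as model:
    mu = pm.Normal('mu', 0.0, 1.0)
    x = pm.Normal('x', mu=mu, sd=1.0)

    sorted_vars = [mu, x]  # parents before children; supply your own order

    for label, var_list in [('unsorted', list(model.free_RVs)), ('sorted', sorted_vars)]:
        start = time.time()
        draw_values(var_list)
        print(label, round(time.time() - start, 4))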

If you do want to write a PR, please open an issue on GitHub so you can link the PR to it.

OK, I’ll have a look and see what I can come up with.
This is all just in aid of debugging my model, so I have to make sure I don’t spend all my time working on that instead of getting the model turning over!