Run time of sample_prior_predictive

The current implementation of sample_prior_predictive relies heavily on the function distributions.distribution.draw_values.

At the moment, this function does not use the Bayes network explicitly, and does not do a topological sort. What it does is:

  1. It does some type checking to see whether the variables passed to draw_values are numbers, arrays, constants or shared variables, so it can return those values directly.
  2. If not, it checks whether their values are defined in the point dictionary.
  3. If not, it checks to see if the variables have a random or a distribution.random method, and if they do, calls them to get the values drawn.
  4. If not, it tries to compile the Theano expression into a function and calls it.

Crucially, step 3 is recursive, because a distribution’s random method usually starts with a draw_values call to get the distribution’s parameter values. Effectively, these nested calls travel backwards through the Bayes network. For a large hierarchical model, the nested calls could lead to very deep recursion, which may be the cause of the long runtime.
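
To make the recursion concrete, here is a minimal pure-Python sketch of the lookup order described above. It is not the PyMC3 code: ToyNormal, toy_draw_value and make_chain are hypothetical names made up for illustration, the Theano fallback of step 4 is left out, and the depth counter exists only to show how the nesting grows with the length of a chain of dependent variables.

```python
import random


class ToyNormal:
    """Hypothetical stand-in for a distribution that has a random method."""

    def __init__(self, name, mu, sigma=1.0):
        self.name, self.mu, self.sigma = name, mu, sigma

    def random(self, point, rng, depth):
        # Resolve the parameters first, which re-enters toy_draw_value.
        # This is where the recursion comes from.
        mu, mu_depth = toy_draw_value(self.mu, point, rng, depth + 1)
        sigma, sigma_depth = toy_draw_value(self.sigma, point, rng, depth + 1)
        return rng.gauss(mu, sigma), max(mu_depth, sigma_depth)


def toy_draw_value(var, point, rng, depth=0):
    """Return (value, deepest nesting level reached while resolving var)."""
    if isinstance(var, (int, float)):   # step 1: plain numbers are returned directly
        return float(var), depth
    if var.name in point:               # step 2: value already given in the point dict
        return point[var.name], depth
    if hasattr(var, "random"):          # step 3: draw, which recurses into the parents
        return var.random(point, rng, depth)
    raise TypeError("step 4 (compiling an expression) is not part of this sketch")


def make_chain(n):
    """Build x0 -> x1 -> ... -> x(n-1), each variable being the mean of the next."""
    var = ToyNormal("x0", 0.0)
    for i in range(1, n):
        var = ToyNormal(f"x{i}", var)
    return var


rng = random.Random(0)
value, depth = toy_draw_value(make_chain(50), {}, rng)
print(depth)  # 50: one nesting level per variable in the chain
```

In this toy version, passing a point such as {"x48": 3.0} stops the recursion after one level, because step 2 returns the stored value before any parent is visited.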

There is also the issue that the nested calls do not propagate the drawn values outward. In the end, each variable is sampled from its marginal distribution or, in some hierarchical models, from a totally wrong distribution (check out this issue here and this discourse thread here). We are working on a solution to this problem that explicitly uses the Bayes network: it does a topological sort and then a forward pass through the network. It is not finished, though, and at the moment it is also slow (take a look at this PR).
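
For comparison, here is a rough sketch of what a topological sort followed by a forward pass could look like on a toy model. This is only an illustration of the idea, not the code in the PR: the parents and samplers dictionaries and the forward_sample function are hypothetical, and the sort comes from the standard library graphlib module (Python 3.9 or later) rather than from anything in PyMC3.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy model: mu is the parent of both a and b.
parents = {"mu": [], "a": ["mu"], "b": ["mu"]}

# Each sampler reads its parents' values from the shared point dictionary.
samplers = {
    "mu": lambda point, rng: rng.gauss(0.0, 10.0),
    "a": lambda point, rng: rng.gauss(point["mu"], 0.01),
    "b": lambda point, rng: rng.gauss(point["mu"], 0.01),
}


def forward_sample(parents, samplers, rng):
    """Draw every variable exactly once, parents before children."""
    point = {}
    # static_order yields each node only after all of its parents.
    for name in TopologicalSorter(parents).static_order():
        point[name] = samplers[name](point, rng)
    return point


point = forward_sample(parents, samplers, random.Random(0))
print(point)
```

Because a and b both read the same stored draw of mu, they end up close to each other in every sample. That shared value is exactly what gets lost when each variable redraws its parents independently through nested calls.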
