The current implementation of `sample_prior_predictive` relies heavily on the function `distributions.distribution.draw_values`.
At the moment, this function does not use the Bayes network explicitly, and does not do a topological sort. What it does is:
- It does some type checking to see whether the variables passed to `draw_values` are numbers, arrays, constants or shared variables, so it can return those values directly.
- If not, it checks whether their values are defined in the `point` dictionary.
- If not, it checks whether the variables have a `random` or a `distribution.random` method and, if they do, calls it to draw the values.
- If not, it tries to compile the `theano` expression into a function and calls it.
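To make the fallback order above concrete, here is a minimal toy sketch of that kind of dispatch. It is not the PyMC3 code: `ToyRV` and `draw_value` are hypothetical names made up for illustration, only plain numbers are handled in the first step, and the last step just calls a Python callable where the real function would compile a `theano` expression.

```python
import random
from numbers import Number


class ToyRV:
    """Hypothetical stand-in for a random variable (not a PyMC3 class)."""

    def __init__(self, name, params, sampler):
        self.name = name
        self.params = params    # parameter name -> number or ToyRV
        self.sampler = sampler  # callable(rng, **parameter_values)

    def random(self, point=None, rng=random):
        # Resolve the parameter values first, then draw from the sampler.
        values = {k: draw_value(v, point, rng) for k, v in self.params.items()}
        return self.sampler(rng, **values)


def draw_value(var, point=None, rng=random):
    """Toy version of the four-step fallback described above."""
    point = point or {}
    # Step 1: plain numbers are returned directly.
    if isinstance(var, Number):
        return var
    # Step 2: use the value from `point` if the variable is named there.
    name = getattr(var, "name", None)
    if name in point:
        return point[name]
    # Step 3: fall back to the variable's own `random` method.
    if hasattr(var, "random"):
        return var.random(point=point, rng=rng)
    # Step 4: stand-in for compiling and calling an expression.
    if callable(var):
        return var()
    raise TypeError(f"cannot draw a value for {var!r}")


mu = ToyRV("mu", {}, lambda rng: rng.gauss(0.0, 1.0))
y = ToyRV("y", {"loc": mu}, lambda rng, loc: rng.gauss(loc, 1.0))
```

Calling `draw_value(y)` falls through to step 3, and `y.random` in turn calls `draw_value` on `mu`, which is the nesting discussed next.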
Crucially, point 3 enters a recursion, because a distribution's `random` method usually starts with a `draw_values` call to get the distribution's parameter values. Effectively, these nested calls travel through the Bayes network backwards. For a large hierarchical model, the nested calls could lead to a very deep recursion, which may be the cause of the long runtime.
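As a rough illustration of how the nesting can grow, the toy sketch below counts calls and recursion depth on a made-up chain of variables, where each node's mean is its parent. The helpers `make_chain` and `draw_recursive` are hypothetical, and drawing every node separately is an assumption of this sketch, not a measurement of what PyMC3 does.

```python
import random


def make_chain(depth):
    """Hypothetical chain x0 -> x1 -> ... -> x{depth}; maps node -> parent."""
    graph = {"x0": None}
    for i in range(1, depth + 1):
        graph[f"x{i}"] = f"x{i - 1}"
    return graph


def draw_recursive(name, graph, rng, stats, level=1):
    """Draw `name` by first recursively drawing its parent (backwards walk)."""
    stats["calls"] += 1
    stats["max_depth"] = max(stats["max_depth"], level)
    parent = graph[name]
    if parent is None:
        loc = 0.0
    else:
        loc = draw_recursive(parent, graph, rng, stats, level + 1)
    return rng.gauss(loc, 1.0)


graph = make_chain(50)
stats = {"calls": 0, "max_depth": 0}
rng = random.Random(0)
for name in graph:
    # Each node is drawn on its own, so its ancestors are redrawn every time.
    draw_recursive(name, graph, rng, stats)
```

In this toy setup the 51 nodes need 1326 calls in total and reach a recursion depth of 51, so the work grows quadratically with the length of the chain.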
There is also the issue that the nested calls do not propagate the drawn values outward, so in the end each variable is sampled from its marginal distribution or, in some hierarchical models, from a completely wrong distribution (see this issue and this discourse thread). We are working on a solution that explicitly uses the Bayes network, does a topological sort and then a forward pass through the network, but it is not finished and is currently also slow (see this PR).
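For readers unfamiliar with the idea, a topological sort followed by a forward pass can be sketched as below. This is only a toy under stated assumptions, not the code from the PR: the `model` dictionary format and `forward_sample` are hypothetical, and the ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+).

```python
import random
from graphlib import TopologicalSorter

# Hypothetical model: node -> (parent names, sampler(rng, *parent_values)).
model = {
    "mu": ((), lambda rng: rng.gauss(0.0, 1.0)),
    "a": (("mu",), lambda rng, mu: rng.gauss(mu, 0.1)),
    "b": (("mu",), lambda rng, mu: mu),  # deterministic copy of its parent
}


def forward_sample(model, rng):
    """Draw every node once, parents before children."""
    predecessors = {name: set(parents) for name, (parents, _) in model.items()}
    order = TopologicalSorter(predecessors).static_order()
    point = {}
    for name in order:
        parents, sampler = model[name]
        # Parent values were already drawn and are reused, never redrawn.
        point[name] = sampler(rng, *(point[p] for p in parents))
    return point


draw = forward_sample(model, random.Random(0))
```

Because `mu` is drawn exactly once and stored in `point`, both `a` and `b` see the same parent value, which is the consistency the nested calls fail to guarantee.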