After initialization, PyMC3 sometimes pauses for a long time for some of the model types I am using. When I kill the process, the traceback suggests it was stuck in the Theano graph. I have a lot of simulations that basically take the same form, though the number of variables can change, so I’m wrapping everything into a form notionally like the following:
# Assign the distributions to the random variables
rvs = []
for i, name in enumerate(parameters.names):
    ...  # e.g. rvs.append(pm.Normal(name, ...))

# Construct the parameter vector
params_ = tt.stack(rvs, axis=0).reshape([1, len(rvs)])

surrogates = []
for indx, surrogate in enumerate(surrogate_models):
    # Append an empty list for each surrogate model
    surrogates.append([])
    for i, hps in enumerate(surrogate['hyper_parameters']):
        # Construct the surrogate
        ...
        # Form the likelihood
        ...
Where my surrogate definition constructs a Latent Gaussian process that I’m evaluating over params_, but the hyperparameters are fixed. In other words, my surrogate model is the result of gp._build_conditional().
Is there a better way to build a model that won’t run into the graph problem?
Using a for loop to build your RandomVariables is always quite slow - is it possible to vectorize it?
Also, if your surrogate model is a GP, it doesn't scale too well with dimension, so try to either keep it small or use an approximation (e.g., a sparse GP).
@junpenglao I tried to form the rvs using
pm.Normal('rvs', ..., shape=(1, some_length))
but that didn’t speed things up for the graph. Each of my GPs has 50 points. Would that be a problem? If I have fewer surrogate models my sampling rate is just fine, but with more surrogates I can’t even start sampling.
Right, then it is likely a memory problem coming from your surrogate models. Try to either use the approximation mentioned above or reduce the number of surrogate models.
I was able to solve it by reducing the number of GPs I used in a single simulation. I ended up fitting an MvNormal to my posterior and using that as my prior in a subsequent analysis. It seems to be working okay, though I am having other problems that I’ll ask the group about.
Likely it is stuck in the Theano graph optimization. If you have many nodes, this may take a while. You can try to decrease the amount of optimization by setting your Theano flags:
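For example, from the shell (the script name `my_model.py` is a placeholder):

```shell
THEANO_FLAGS='optimizer=fast_compile' python my_model.py
```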
Or you can do it via theano.config within your script…
It’s a great idea to set the optimizer to fast_compile when you are debugging your model, but you should use the default fast_run when you are actually running it.
@hvasbath @junpenglao I do think it was getting stuck in the graph optimization, so in the future I might try changing to fast_compile just to see.