Long pause after initialization


#1

After initialization, pymc3 sometimes pauses for a long time for some of the model types I am using. When I kill the process, the traceback seems to indicate that it was stuck in the theano graph. I have many simulations that take essentially the same form, although the number of variables can change, so I wrap everything into something notionally like the following:

# Assign the distributions to the random variables
for i, name in enumerate(parameters.names):
    with model:
        rvs.append(parameters.pymc3_distributions[i](name, **parameters.pymc3_distribution_kwargs[i]))

# Construct the parameter vector
with model:
    params_ = tt.stack(rvs, axis=0).reshape([1, len(rvs)])

surrogates = []
ys = []

for indx, surrogate in enumerate(surrogate_models):

    # Append an empty list for each surrogate model
    surrogates.append([])
    ys.append([])

    for i, hps in enumerate(surrogate['hyper_parameters']):

        with model:
            # Construct the surrogate
            surrogates[indx].append(surrogate definition...)

            # Form the likelihood
            ys[indx].append(pm.Normal('s_{0}_{1}'.format(indx, i),
                                      mu=surrogates[indx][i],
                                      sd=0.1*std_amplitudes[indx][i],
                                      observed=test_values[indx][i]))

Here my surrogate definition constructs a latent Gaussian process that I evaluate at params_ with the hyperparameters held fixed. In other words, my surrogate model is the result of gp._build_conditional().
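For context, the quantity a latent-GP conditional produces is the standard GP predictive mean (and covariance). A minimal NumPy sketch of just the mean, with a hypothetical RBF kernel and stand-in data (the names `rbf`, `X`, `f`, and `X_new` are illustrative, not from the model above), looks like:

```python
import numpy as np

# Hypothetical RBF kernel with a fixed lengthscale
def rbf(a, b, ls=0.1):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

X = np.linspace(0.0, 1.0, 50)     # 50 training points, as in this thread
f = np.sin(2.0 * np.pi * X)       # stand-in latent values
X_new = np.array([0.25, 0.75])    # evaluation points (cf. params_)

# Conditional mean: K(X_new, X) @ K(X, X)^{-1} @ f
K_nn = rbf(X, X) + 1e-6 * np.eye(len(X))   # jitter for numerical stability
K_sn = rbf(X_new, X)
mu = K_sn @ np.linalg.solve(K_nn, f)
```

Each such conditional adds a dense 50x50 solve to the graph, which is why stacking many surrogates grows the graph quickly.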

Is there a better way to build a model that won’t run into the graph problem?


#2

Using a for loop to build your RandomVariables is always quite slow - is it possible to vectorize it?
Also, if your surrogate model is a GP, it doesn't scale well with dimension, so try to either keep it small or use an approximation (e.g., approx="FITC")


#3

@junpenglao I tried to form the rvs using

pm.Normal('rvs', ..., shape=(1, some_length))

but that didn't speed things up for the graph. Each of my GPs has 50 points. Could that be a problem? If I have fewer surrogate models my sampling rate is fine, but with more surrogates I can't even start sampling.


#4

Right, then it is likely a memory problem from your surrogate models - try either the approximation mentioned above, or lowering the number of surrogate models.


#5

I was able to solve it by reducing the number of GPs I used in a single simulation. I ended up having to fit an MvNormal to my posterior and used that as my prior in a subsequent analysis. It seems to be working okay, though I am having other problems that I'll ask the group about.


#6

Likely it is stuck in the theano graph optimization. If you have many nodes this may take a while. You can try to decrease the amount of optimization by setting your theano flags:

THEANO_FLAGS=optimizer=fast_compile python your_script.py

Or you set theano.config.optimizer within your script…
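A persistent alternative (assuming theano picks up a `.theanorc` in your home directory, as it does by default) is the config-file entry:

```ini
[global]
optimizer = fast_compile
```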


#7

It's a great idea to set fast_compile when you are debugging your model - but you should use the default fast_run when you are actually running it :slight_smile:


#8

@hvasbath @junpenglao I do think it was getting stuck in the graph optimization, so in the future I might try changing to fast_compile just to see.