Saving intermediate traces

Hi all,
I was wondering whether it is possible to save intermediate values of my trace while sampling. I’m asking because I’m planning a run that will last more than a day, with each sample taking several seconds to compute (it’s quite a complicated function for which I don’t have gradients yet, so right now I’m using Metropolis sampling). I’d like to check some of the intermediate results along the way to see what is happening.

Below is the code that I tried to run to achieve what I want:

import os
import pickle
import pymc3 as pm

# J_mcmc_num, J_mcmc_weight, the log10_* data, outDir and traceFile are defined elsewhere

with pm.Model() as model:

    BoundedNormal = pm.Bound(pm.Normal, lower=0.0)
    params = BoundedNormal('parameters', mu=0.5, sd=0.2, shape=(10,))

    likelihood_num = pm.Normal('y_num', mu=J_mcmc_num(params), sd=log10_std_num,
                               observed=log10_measured_num)
    likelihood_weight = pm.Normal('y_weight', mu=J_mcmc_weight(params), sd=log10_std_weight,
                                  observed=log10_measured_weight)

    step = pm.Metropolis()

    # save some intermediate traces
    for i1 in range(5):

        if i1 == 0:
            trace = pm.sample(1000, tune=1000, step=step, chains=2, cores=2,
                              start=None, trace=None, discard_tuned_samples=False)
        else:
            trace = pm.sample(1000, tune=0, step=step, chains=2, cores=2, trace=trace)

        with open(os.path.join(outDir, traceFile), 'wb') as buff:
            pickle.dump({'model': model, 'trace': trace}, buff)

As you can see in this traceplot, however, at 2000 samples the step doesn’t seem to be tuned anymore:


I don’t think I am able to use the live_plot argument in pm.sample since I am running my script on a cluster.
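
The idea would instead be to copy the pickled trace off the cluster every so often and inspect it locally; a minimal sketch of what I have in mind (the file name is just a placeholder):

import pickle
import pymc3 as pm

# load a partial dump copied off the cluster (placeholder path)
with open('trace.pkl', 'rb') as buff:
    data = pickle.load(buff)

model, trace = data['model'], data['trace']

# quick look at what the chains have done so far
# (unpickling the model needs the same custom functions importable locally)
pm.traceplot(trace)
print(pm.summary(trace))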

Any help would be much appreciated!

Maybe something like:

for i in range(...):
    with model:
        trace = pm.sample(...)
    # save trace

Also, try DEMetropolis, which might fit your use case here.
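
Putting the two together, something roughly like this (untested sketch, with a toy model standing in for yours and placeholder file names):

import os
import pickle
import numpy as np
import pymc3 as pm

# toy model in place of the real one, just to show the pattern
observed = np.random.normal(1.0, 0.5, size=50)

with pm.Model() as model:
    mu = pm.Normal('mu', mu=0.0, sd=1.0)
    pm.Normal('y', mu=mu, sd=0.5, observed=observed)
    step = pm.Metropolis()  # or pm.DEMetropolis(), which wants several chains

os.makedirs('out', exist_ok=True)
trace = None
for i in range(5):
    with model:  # re-enter the same model context every round
        trace = pm.sample(
            500,
            tune=500 if i == 0 else 0,  # only tune in the first round
            step=step,
            chains=1,
            cores=1,
            trace=trace,  # continue from the previous draws
            discard_tuned_samples=False,
        )
    # dump the partial trace after every round
    with open(os.path.join('out', 'trace_{}.pkl'.format(i)), 'wb') as buff:
        pickle.dump({'model': model, 'trace': trace}, buff)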

Thanks for the suggestion; I tried:

# same imports, data and helper functions as above
for i1 in range(5):
    with pm.Model() as model:

        BoundedNormal = pm.Bound(pm.Normal, lower=0.0)
        params = BoundedNormal('parameters', mu=0.5, sd=0.2, shape=(10,))

        likelihood_num = pm.Normal('y_num', mu=J_mcmc_num(params), sd=log10_std_num,
                                   observed=log10_measured_num)
        likelihood_weight = pm.Normal('y_weight', mu=J_mcmc_weight(params), sd=log10_std_weight,
                                      observed=log10_measured_weight)

        step = pm.Metropolis()

        if i1 == 0:
            trace = pm.sample(500, tune=500, step=step, chains=1, cores=1,
                              start=None, trace=None, discard_tuned_samples=False)
        else:
            trace = pm.sample(500, tune=0, step=step, chains=1, cores=1,
                              trace=trace, discard_tuned_samples=False)

        with open(os.path.join(outDir, traceFile), 'wb') as buff:
            pickle.dump({'model': model, 'trace': trace}, buff)

It still doesn’t do what I want, however; as you can see, the trace gets stuck after 1000 iterations:

I guess my question is related to the one in this thread?
https://discourse.pymc.io/t/pause-and-resume-training-a-model-that-can-change-during-training/3196/2

And thanks for suggesting the DEMetropolis step; I’ll definitely give it a try!

Update as of 2022: one should look into mcbackend for live-streaming MCMC draws to a database, from which they can be downloaded in real time.
