Stochastic volatility model estimation with MCMC doesnt terminate

Does someone know why the code below doesn’t terminate at:

model = make_stochastic_volatility_model(sp500_df)

This line of code is from the following in which I try to estimate the parameters of a stochastic volatility model:

image

import os as os
import sys as sys 
path = os.path.dirname(os.path.realpath('__file__'))
#import pandas_market_calendars as mcal
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc3 as pm
from pymc3.distributions.timeseries import GaussianRandomWalk
import pandas_datareader as pdr
import datetime
import pprint
import seaborn as sns
sns.set(palette = 'bright',color_codes=True)


# obtain sp500 log returns (scaled by 100)
start_date = datetime.datetime(1987, 12, 31)
end_date = datetime.datetime(2000, 12, 31)
sp500_df = pdr.DataReader('^GSPC', 'yahoo',start_date,end_date)
sp500_df["returns"] = sp500_df["Close"]/sp500_df["Close"].shift(1)
sp500_df.dropna(inplace=True)
sp500_df["log_returns"] = 100*np.log(sp500_df["returns"])

# plot sp500 log returns
fig, ax = plt.subplots(figsize=(16, 8))
sp500_df['log_returns'].plot(y="change", label='S&P 500', ax=ax)  
ax.set(xlabel='time', ylabel='returns')

def make_stochastic_volatility_model(data):
    with pm.Model() as model:
        obs = data.shape[0]
        
        # prior distribution of parameters
        mu = pm.Normal(name='mu',mu=0,sigma=1)
        epsilon = pm.Normal(name='epsilon',mu=0,sigma=1,shape=obs)
        eta = pm.Normal(name='eta',mu=0,sigma=1,shape=obs)
        
        beta_help = pm.Beta('beta_help',alpha=20,beta=1.5) #testval=0.5125
        beta = 2*beta_help-1
        alpha_help = pm.Normal('alpha_help',mu=0,sigma=5) 
        alpha = alpha_help*(1-beta) #HalfNormal?
        varH = pm.InverseGamma('varH',alpha=2.5,beta=0.025) #testval=0.025
        rho = pm.Uniform('rho',-1,1) #testval=-0.4
        
        r_list = []
        ln_h_list = []
        
        for t in range(obs):
            if t == 0:
                h_current = pm.HalfNormal(f'h_{t}',sigma=1.0)
                ln_h_current = pm.Deterministic(f'ln_h_{t}',np.log(h_current))
            else:
                ln_h_current = pm.Normal(f'ln_h_{t}',
                                         mu=alpha+(beta)*ln_h_current,
                                         sigma=np.sqrt(varH))
                h_current = pm.Deterministic(f'h_{t}',np.exp(ln_h_current))

            ln_h_list.append(ln_h_current)
            r_current = pm.Normal(f'r_{t}',mu=mu,
                                  sigma=np.sqrt(h_current),
                                  observed=data['log_returns'][t])
            r_list.append(r_current)
    return model


model = make_stochastic_volatility_model(sp500_df)

# Check model
# pm.model_to_graphviz(model)

with model:
    # Use Maximum A Posteriori (MAP) optimisation 
    # as initial value for MCMC
    # start = pm.find_MAP()

    # Use the No-U-Turn Sampler
    step = pm.NUTS()

    # Calculate the trace
    trace = pm.sample(2000, step,  
                      random_seed=42, progressbar=True, cores=1, tune=1000)

Hi Dennis,
What do you mean by “the code doesn’t terminate”? Doesn’t it error out? If yes, what’s the error message?

Hi there,

this is what I got by running in colab:

INFO (theano.gof.compilelock): Refreshing lock /root/.theano/compiledir_Linux-4.19.104±x86_64-with-Ubuntu-18.04-bionic-x86_64-3.6.9-64/lock_dir/lock

This is what I got by running in spyder:

Mmmh, doesn’t ring any bell for me :thinking:
Maybe try clearing out the Theano cache?

I tried that as well but then the compiler keeps busy for hours, though I’m used to seeing the chains within 10 minutes:

How unfortunate, really liked the setup of pymc3, but I think I ve to shift gears to some old school winbugs in R.

I appreciate the help and will update if I have found a solution to it!

Yeah sorry I can’t help you more here. It feels to me like it might be related to Google Colab :man_shrugging:
Maybe someone with experience in that can help you out!

Looks like you are building a large theano graph with the for loop implementation (which cause the model to hang as it is usually very slow) - in general, you should try vectorizing it or write it as a theano scan

1 Like