Stochastic volatility jumps

Hi. I’m trying to enhance a stochastic volatility model to account for discontinuous price jumps.

I am currently using an SDE implementation that relies on EulerMaruyama to solve for mean-reverting log-volatility. It seems fast and accurate as far as modeling day-to-day volatility but can’t accommodate for those idiosyncratic “jump” events.

The SDE I am using is of the form:

def sde(x, theta, mu, sigma):
    return (theta * (mu - x), sigma)`

, where the first part of the returned tuple is the mean reversion term and the second is Brownian noise.

Is there a way I can modify this SDE to incorporate the jump term? Or should I add the jump on top of the SDE? In either case, what would the term be?

I played around with a Bernoulli process to try to simulate a low probability event. I also see these types of processes being referred to as a compounded Poisson, or hyper-exponential in the academic literature. But for now, I am just trying to understand what the naive/straightforward approach could be, from the pymc perspective.

Any advice would be greatly appreciated! Thank you!

1 Like

I found some models in academic literature, as well as a Stan implementation here: 2.5 Stochastic Volatility Models | Stan User’s Guide. The Stan model does not actually implement the volatility jumps but I think it can be extended to that, provided it can be ported to pymc.

The issue is that the model uses different SDEs for the first log-volatility observation, h(t=1), and the subsequent observations h(t>1), as described in the above link and the original publication by Kim et al (1998).

I took a first stab at pymc implementation, subclassing EulerMaruyama and overwriting the logp method to handle the first and subsequent observations differently. Unfortunately, I think I did something wrong, as it is now very sensitive to the log-vol mu prior, and if mu is significantly negative (e.g., -1) it fails with a SamplingError: Initial evaluation of model at starting point failed!. That is a problem because I think the true log-vol mean is somewhere in the -1 to -5 range, and the model won’t converge otherwise.

I would appreciate any feedback, ideas on what I might have done wrong, or alternative ways to implement. Here’s the code below. Thank you, and Happy New Year!

import theano
import theano.tensor as at
import pymc3 as pm
from pymc3.distributions.timeseries import EulerMaruyama

class EulerMaruyama_SVJ(EulerMaruyama):

    def logp(self, x):

        # at t1
        f, g = self.h1_sde(*self.sde_pars)
        mu1 = (x[0] + self.dt * f).reshape(-1, 1)
        sd1 = (at.sqrt(self.dt) * g).reshape(-1, 1)
        
        # at t > 1
        x_gte1 = x[1:-1]
        theano.config.compute_test_value = 'ignore'
        f, g = self.h_gt1_sde(x_gte1, *self.sde_pars)
        mu_gt1 = (x_gte1 + self.dt * f).reshape(-1, 1)
        sd_gt1 = (at.sqrt(self.dt) * g).reshape(-1, 1)
        
        # concatenate and call logp
        mu = at.concatenate([mu1, mu_gt1], axis=0)
        sd = at.concatenate([mu1, mu_gt1], axis=0)
        return at.sum(pm.Normal.dist(mu=mu, sd=sd).logp(x[1:]))

    def h1_sde(self, mu, phi, sigma):
        return (mu, sigma / at.sqrt(1 - phi * phi))

    def h_gt1_sde(self, x, mu, phi, sigma):
        return (mu + phi * (x - mu), sigma)

def make_SVJ_model(data):
    """
    See:
    https://mc-stan.org/docs/2_21/stan-users-guide/stochastic-volatility-models.html
    """
    with pm.Model(coords={'time': data.index.values}) as model:

        # SDE params
        mu = pm.Cachy('mu', 0, 10) # mean log-volatility
        phi = pm.Uniform('phi', -1, 1) # volatility persistence
        sigma = pm.Normal('sigma', 0, 5)

        h = EulerMaruyama_SVJ('h',
                              dt=1.0,
                              sde_fn=None, # SDEs are defined inside the class
                              sde_pars=(mu, phi, sigma),
                              dims='time',
                              testval=np.ones_like(data),
                              )

        # likelihood
        y = pm.Normal('obs', 0, pm.math.exp(h/2), observed=data, dims='time')

    return model

So when I call:

import numpy as np
import yfinance as yf

data = yf.download('SPY')
returns = np.log(data['Adj Close']).diff().dropna()

m = make_SVJ_model(returns.iloc[-500:])
with m:
    trace = pm.sample(4000, tune=1000, return_inferencedata=True)

It fails with a:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'mu': array(-1.), 'phi': array(0.95), 'sigma': array(0.25), 'h': array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
[...]
Initial evaluation results:
mu         3.69
phi        3.69
sigma      3.69
h          -inf
obs     -709.49
Name: Log-probability of test_point, dtype: float64

I am not familiar with this SDE implementation, is there a paper you can link to? The pymc3 docs have a stochastic volatility model - perhaps you could change the log vol to a StudentT random walk instead of a gaussian?

1 Like

Yes, you could create a RW like this: pm.StudentT("x").cumsum().

Note that time series support will be much improved in PyMC 4.0 so that you can write your RV as a scan-for-loop.

Hey. Thanks so much for your reply. I actually found an error in my code, and after correcting I can get some samples, but the result I get for h(t=1) is visually off on the volatility graph, so I still must have misspecified something there. As you can see in the graph below, there is no reason for the posterior predictive spike (blue) to be there at t=1, as the observed returns (orange) are pretty flat in that region (you can also see the corresponding volatility spike in the second graph as well):

I am going off this publication by Alexopoulos et al (2019): https://arxiv.org/pdf/1904.05312.pdf. So, these would be equations 1 and 2 there. It builds on the original work by Kim et al (1998): http://apps.olin.wustl.edu/faculty/chib/papers/KimShephardChib98.pdf, which shows essentially the same series of equations (Eq. 1) and is also described in Stan’s docs I referenced above.

I might try to implement without the Euler method, as I think I may be doing something wrong there (although I believe it samples much faster with it).

I tried the implementation in the docs initially, but thought it was slow and didn’t model big jumps well. But I will certainly revisit it now and check what kind of results I get. Thank you, and please let me know if you have any additional thoughts.

Could you please elaborate on the meaning of .cumsum(). Isn’t that the CDF function? Why does it make sense to use one here? Thank you.