I’m trying to implement a multivariate stochastic volatility model in PyMC3. The model consists of two correlated SDEs: the return process (observed) and the volatility process (unobserved). I basically need to extend either the EulerMaruyama timeseries class to handle a bivariate SDE, or, the MvGaussianRandomWalk timeseries class to handle the drift coefficients. It’s unclear to me how I would pass in the observed data into MvGaussianRandomWalk when only one of the two processes is observed. Any help on this? Thanks!!
You could define the observed (return) process as an EulerMaruyama SDE and pass to it as parameter the volatility process. In a separate GaussianRandomWalk or EulerMaruyama SDE you can then define the volatility process itself.
Example:
volatility = pm.GaussianRandomWalk('volatility', sd=.5)
sde = lambda x, drift, volatility: (drift * x, volatility)
y = ts.EulerMaruyama('y', 1.0, sde, [drift, volatility], shape=N, testval=np.zeros(shape=N), observed=Y)
…or something to that effect. The volatility can be another EulerMaruyama SDE. I don’t know if that answers the question: do you mean that you expect the return process to influence the volatility process as well?
Thanks for the response. You are correct in that the return process depends on the volatility process, and your solution would work if the two processes were independent. The key challenge here is that my return process and volatility process are not independent; their brownian motions are correlated. I think what I need to do is create a new class of EulerMaruyama extended to support a multivariate SDE, similar to how MVGuassianRandowmWalk extends GuassianRandomWalk. I’m just not sure how to code that up
Ah, I see. Maybe you can “cheat” by explicitly adding a common innovation factor in the SDE.
Wouldn’t that be equivalent to having a common factor (as in a hierarchical model) for your brownian motion, shared between the volatility and return processes (and thus correlation)?
Example:
common_innovation = pm.Normal('common_factor', mu=0.0, sd=1.)
sde_volatility = lambda x, drift, innovation: (drift * x + innovation, sigma)
volatility = ts.EulerMaruyama('volatility', 1.0, sde_volatility, [drift, common_innovation], shape=N, testval=np.zeros(shape=N))
sde = lambda x, drift, innovation: (drift * x + innovation, volatility)
y = ts.EulerMaruyama('y', 1.0, sde, [drift, common_innovation], shape=N, testval=np.zeros(shape=N), observed=Y)
…something like that. Just a thought.
Thanks again for replying. Your solution now assumes perfect correlation. I need a multivariate posterior with some non-zero correlation
You’re right, my example code is nonsense: the sde adds innovation and then once again it adds the volatility output, so y actually depends twice on the innovation… The concept I was trying to propose was one where you have 3 innovations: a common one, and then 2 specific ones (for the return and volatility) drawn from a normal distribution whose mean is the common one. Thus you would have correlation (and not perfect one).
However I’m having a hard time coming up with a sensible code example because I’m having a hard time visualizing what the underlying equation you’re trying to reproduce is. Perhaps posting that would help people better answer your question.
In this model, the stock price S_t is driven by the stochastic volatility model:
dS_t = S_tr_tdt + S_t\sqrt{V_t}dW_t^s
dV_t =\kappa_v (\theta_v-V_t)dt+\sigma_v\sqrt{V_t}dW_t^v
where the Brownian motions have correlation \rho
Discretizing the SDE and redefining some parameters, we have
Y_t=\sqrt{V_{t-1}}\epsilon_t^s
V_t=\alpha_v+\beta_vV_{t-1}+\sigma_v\sqrt{V_{t-1}}\epsilon_t^v
Where Y_t are excess returns and V_t is the latent volatility process
Ok, to be able to frame this in the context of 2 SDEs using PyMC3’s EulerMaruyama method, I would tentatively propose the following (and I’m not a mathematician so mistakes and massive abuse of notation are very likely):
- to fit the Euler-Maruyama formulation of:
Y_{n+1} = Y_{n} + a(Yn)dt + b(Yn)dW_t,
we set:
Y_t = Y_{t-1} + S_tr_tdt + S_t\sqrt{V_t}dW_t^s
V_t = V_{t-1} + \kappa_v (\theta_v-V_t)dt+\sigma_v\sqrt{V_t}dW_t^v
so for Y_t we have:
a(x) = S_tr_t
b(x) = S_t\sqrt{V_t}
and for V_t we have:
a(x) = \kappa_v (\theta_v-V_t)
b(x) = \sigma_v\sqrt{V_t}
- as you said already, this doesn’t solve your problem because dW_t^s and dW_t^v are correlated by some non-zero \rho. So to solve this, let’s come up witih a hierarchical approach to dW_t like this:
dW_t^v = dW_t^C + dW_t^{v'}
dW_t^s = dW_t^C + dW_t^{s'}
where dW_t^{v'} and dW_t^{s'} are unique to the volatility and return processes respectively. Would you agree in this case that dW_t^v and dW_t^s are correlated gaussian innovations with some \rho that is proportional to the ratio between the standard deviation of dW_t^C and the idiosyncratic innovations?
- If that is agreed, then what we have left to do is to re-formulate the stochastic processes in light of the above concept (substituting the Brownian innovation for the common + idiosyncratic ones), and assuming dt = 1 for simplicity:
Y_t = Y_{t-1} + S_tr_t + S_t\sqrt{V_t}dW_t^C + S_t\sqrt{V_t}dW_t^{s'}
V_t = V_{t-1} + \kappa_v (\theta_v-V_t) + \sigma_v\sqrt{V_t}dW_t^{C}+\sigma_v\sqrt{V_t}dW_t^{v'}
and now for Y_t we have:
a(x) = S_tr_t + S_t\sqrt{V_t}dW_t^C
b(x) = S_t\sqrt{V_t}
and for V_t we have:
a(x) = \kappa_v (\theta_v-V_t) + \sigma_v\sqrt{V_t}dW_t^{C}
b(x) = \sigma_v\sqrt{V_t}
Would that work, at least conceptually?
There’s a straightforward way to reformulate the process in terms of two independent Brownian motions. I considered this approach but didn’t think it could be implemented with the existing EulerMaruyama class
Y_t=\sqrt{V_{t-1}}\epsilon_t^s
V_t=\alpha_v+\beta_vV_{t-1}+\sigma_v\sqrt{V_{t-1}}[\rho\epsilon_t^s+\sqrt{1-\rho^2}\epsilon_t^v]
Here, the two epsilons are independent, and we’ve specified the correlation \rho
If there’s a way to implement this in the existing classes, great!
I think it can. I’d try something along those lines:
error_s = pm.Normal('error_s', mu=0.0, sd=1.)
sde = lambda x, sigma_v, rho, Vt, beta_v, alpha_v, error_s: (alpha_v + beta_v * Vt + sigma_v * np.sqrt(Vt) * rho * error_s,
sigma_v * np.sqrt(Vt) * np.sqrt(1-rho*rho))
Vt = ts.EulerMaruyama('Vt', 1.0, sde, [sigma_v, rho, Vt, alpha_v, beta_v, error_s], ...)
expected = pm.Deterministic('expected', np.sqrt(Vt) * error_s)
Yt = pm.Normal('Yt', mu=expected, sd=1., observed=Y)
Thanks! This looks promising but you’ve introduced an addition, third Brownian motion with the command:
Yt = pm.Normal(‘Yt’, mu=expected, sd=1., observed=Y)
Since we’ve already specified error_s, the return is just deterministic:
Yt = pm.Deterministic('Yt', np.sqrt(Vt) * error_s, observed=returns)
But, Pymc3 won’t let me pass ‘observed’ with Deterministic()
TypeError: Deterministic() got an unexpected keyword argument ‘observed’
Yes, unfortunately I’m not sure why PyMC3 won’t allow that, and I’m not aware of a workaround (without introducing superfluous noise). That’s why I added the third Brownian motion. Perhaps giving it a really low standard deviation is enough? But I’ll let others who are more knowledgeable about PyMC3 answer that question.
Ok I tried implementing this solution but am getting an error
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
from pymc3.distributions.timeseries import EulerMaruyama
returns = np.genfromtxt(pm.get_data("SP500.csv"))
def lin_sde(Vt, sigma_v, rho, beta_v, alpha_v, error_s):
return alpha_v + beta_v * Vt + sigma_v * np.sqrt(Vt) * rho * error_s, sigma_v * np.sqrt(Vt) * np.sqrt(1-rho*rho)
with pm.Model() as sp500_model:
alpha_v=pm.Normal('alpha',0,sd=100)
beta_v=pm.Normal('beta',0,sd=10)
sigma_v=pm.InverseGamma('sigmaV',2.5,0.1)
rho=pm.Uniform('rho',-.9,.9)
error_s = pm.Normal('error_s', mu=0.0, sd=1.)
Vt = EulerMaruyama('Vt', 1.0, lin_sde, [sigma_v, rho, alpha_v, beta_v, error_s], shape=len(returns),testval=np.repeat(.01,len(returns)))
expected = pm.Deterministic('expected', np.sqrt(Vt) * error_s)
Yt = pm.Normal('Yt', mu=expected, sd=0.0001, observed=returns)
with sp500_model:
trace = pm.sample(2000,chains=1,cores=1)
Here is the traceback:
Traceback (most recent call last):
File "<ipython-input-34-58522298a257>", line 21, in <module>
trace = pm.sample(2000,chains=1,cores=1)
File "C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py", line 469, in sample
trace = _sample_many(**sample_args)
File "C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py", line 515, in _sample_many
step=step, random_seed=random_seed[i], **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py", line 559, in _sample
for it, strace in enumerate(sampling):
File "C:\Users\X197713\AppData\Roaming\Python\Python36\site-packages\tqdm\_tqdm.py", line 927, in __iter__
for obj in iterable:
File "C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py", line 655, in _iter_sample
point, states = step.step(point)
File "C:\ProgramData\Anaconda3\lib\site-packages\pymc3\step_methods\arraystep.py", line 247, in step
apoint, stats = self.astep(array)
File "C:\ProgramData\Anaconda3\lib\site-packages\pymc3\step_methods\hmc\base_hmc.py", line 117, in astep
'might be misspecified.' % start.energy)
ValueError: Bad initial energy: nan. The model might be misspecified.
I think that’s a consequence of the very low sd value, actually… try sd=0.1 for example?
error_s should be a random walk, not a scalar. with that change I’m getting a different error about Input dimension mismatch.
def lin_sde(Vt, sigma_v, rho, beta_v, alpha_v, error_s):
return alpha_v + beta_v * Vt + sigma_v * np.sqrt(Vt) * rho * error_s, sigma_v * np.sqrt(Vt) * np.sqrt(1-rho*rho)
with pm.Model() as sp500_model:
alpha_v=pm.Normal('alpha',0,sd=100)
beta_v=pm.Normal('beta',0,sd=10)
sigma_v=pm.InverseGamma('sigmaV',2.5,0.1)
rho=pm.Uniform('rho',-.9,.9)
error_s = pm.GaussianRandomWalk('error_s', shape=len(returns))
Vt = EulerMaruyama('Vt', 1.0, lin_sde, [sigma_v, rho, alpha_v, beta_v, error_s], shape=len(returns),testval=np.repeat(.01,len(returns)))
expected = pm.Deterministic('expected', np.sqrt(Vt) * error_s)
Yt = pm.Normal('Yt', mu=expected, sd=0.1, observed=returns)
with sp500_model:
trace = pm.sample(2000,chains=1,cores=1)
ValueError: Input dimension mis-match. (input[0].shape[0] = 3812, input[1].shape[0] = 3813)
Though you haven’t shown me the sde function itself, if it’s similar to the one I wrote, error_s should be a normal distribution, not a random walk. That is, within the context of the SDE, you draw one innovation per time step, but it doesn’t drift through time: only Vt does.
In the equations you described:
Y_t=\sqrt{V_{t-1}}\epsilon_t^s
V_t=\alpha_v+\beta_vV_{t-1}+\sigma_v\sqrt{V_{t-1}}[\rho\epsilon_t^s+\sqrt{1-\rho^2}\epsilon_t^v]
\epsilon_t^s ~ Normal(0, \sigma_s), no?
or you would say that \epsilon_t^s = \epsilon_{t-1}^s + \epsilon_t^* ? (in which case, yes it would be a random walk)
I think we had it right the first time. error_s is a just a vector of i.i.d standard normals
error_s = pm.Normal('error_s',mu=0.0,sd=1, shape=len(returns))
Though I’m still getting an error about input dimension
ValueError: Input dimension mis-match. (input[0].shape[0] = 3812, input[1].shape[0] = 3813)
I got it to run but the results don’t make sense. I don’t think this is the right way to solve the problem. The additional normal random required to pass in the observed returns is undesirable. There’s got to be a way to write a multivariate class to support this
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
from pymc3.distributions.timeseries import EulerMaruyama
returns = np.genfromtxt(pm.get_data("SP500.csv"))
def lin_sde(Vt, sigma_v, rho, beta_v, alpha_v, error_s):
return alpha_v + beta_v * Vt + sigma_v * np.sqrt(Vt) * rho * error_s, sigma_v * np.sqrt(Vt) * np.sqrt(1-rho*rho)
with pm.Model() as sp500_model:
alpha_v=pm.Normal('alpha',0,sd=100)
beta_v=pm.Normal('beta',0,sd=10)
sigma_v=pm.InverseGamma('sigmaV',2.5,0.1)
rho=pm.Uniform('rho',-.9,.9)
error_s = pm.Normal('error_s',mu=0.0,sd=1)
Vt = EulerMaruyama('Vt', 1.0, lin_sde, [sigma_v, rho, alpha_v, beta_v, error_s], shape=len(returns),testval=np.repeat(.01,len(returns)))
expected = pm.Deterministic('expected', np.sqrt(Vt) * error_s)
Yt = pm.Normal('Yt', mu=expected, sd=0.01, observed=returns)
with sp500_model:
trace = pm.sample(2000,chains=1,cores=1,init='adapt_diag')
pm.traceplot(trace, [alpha_v,beta_v,sigma_v,rho]);
fig, ax = plt.subplots(figsize=(14, 8))
ax.plot(returns)
ax.plot(trace['Vt',::5].T, 'r', alpha=.03);
ax.set(xlabel='time', ylabel='returns')
ax.legend(['S&P500', 'stoch vol']);
Adding additional noise is the workaround, so your solution makes sense to me.
Going way back to your original question, this is possible by
1, Cast your observed into a theano shared variable.
2, Concatenate the observed with the latent process.
And then pass it to the observed in MvGaussianRandomWalk
.