Estimate parameters of two correlated processes

Let’s assume we have two instantaneously correlated processes: S_1, S_2 both driven by Geometric Brownian Motion such that
\left( {\begin{array}{*{20}{c}} {d{S_1}(t)}\\ {d{S_2}(t)} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {{\mu _1}{S_1}(t)dt}\\ {{\mu _2}{S_2}(t)dt} \end{array}} \right) + \left( {\begin{array}{*{20}{c}} {{\sigma _1}{S_1}(t)d{W_1}(t)}\\ {{\sigma _2}{S_2}(t)\left[ {\rho d{W_1}(t) + \sqrt {1 - {\rho ^2}} d{W_2}(t)} \right]} \end{array}} \right)
This can alternatively be written as a vector-valued process
\begin{equation} d{\bf{S}}(t) = \mu (t,{\bf{S}}(t))dt + \sigma (t,{\bf{S}}(t))dW(t) \end{equation}
and we can easily see that
\begin{equation} d{\bf{S}}(t)|{\bf{S}}(t)\sim N\left( {\left( {\begin{array}{*{20}{c}} {{\mu _1}{S_1}(t)dt}\\ {{\mu _2}{S_2}(t)dt} \end{array}} \right),\left( {\begin{array}{*{20}{c}} {S_1^2(t)\sigma _1^2dt}&{\rho {S_1}(t){S_2}(t){\sigma _1}{\sigma _2}dt}\\ {\rho {S_1}(t){S_2}(t){\sigma _1}{\sigma _2}dt}&{S_2^2(t)\sigma _2^2dt} \end{array}} \right)} \right) \end{equation}
If we observe daily values for S_1,S_2 (and their increments dS_1, dS_2), then I can easily use pymc to estimate their parameters (\mu_1, \sigma_1) and (\mu_2, \sigma_2), e.g

dt = 1/252
with pm.Model() as model:
    mu_param = pm.Uniform('mu_param', -0.5, 0.5)
    sigma_param = pm.Uniform('sigma_param', 0.0001, 0.3)
    dS = pm.Normal('dS', mu = (data['seriesA'].values*mu_param)*dt,
                  sigma =  data['seriesA'].values*sigma_param*np.sqrt(dt), 
                  observed = data['dS_seriesA'].values)
    trace = pm.sample(50000)

However, I’d like to know how to use pymc to estimate the correlation parameter \rho from the joint dynamics. I’ve explored MvNormal but I can’t find the right way how to set up the estimation task.

My working ipynb+data is on Github:

Can someone pls give me a hint how to estimate the parameters in the multidimensional model?

Maybe this doc is what you are looking for:

MvGaussianRandomWalk would work but it assumes that the covariance matrix is constant.
Thus, doing some tweaks, I’d have to rewrite the above as
\begin{equation} \left[ {\begin{array}{*{20}{c}} {\ln {S_1}(t)}\\ {\ln {S_2}(t)} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\ln {S_1}(t - \Delta t)}\\ {\ln {S_2}(t - \Delta t)} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {{\mu _1} - \frac{1}{2}\sigma _1^2}\\ {{\mu _2} - \frac{1}{2}\sigma _2^2} \end{array}} \right]\Delta t + \left[ {\begin{array}{*{20}{c}} {{\rho _1}}&0\\ {{\sigma _2}\rho }&{{\sigma _2}\sqrt {1 - {\rho ^2}} } \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {\Delta {W_1}(t)}\\ {\Delta {W_2}(t)} \end{array}} \right] \end{equation}
which means we work with a two-dimensional normal distribution with constant elements in the covariance matrix:
\begin{equation} \left[ {\begin{array}{*{20}{c}} {\ln {S_1}(t)}\\ {\ln {S_2}(t)} \end{array}} \right]{\sim}N\left( {\left[ {\begin{array}{*{20}{c}} {\ln {S_1}(t - \Delta t)}\\ {\ln {S_2}(t - \Delta t)} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {{\mu _1} - \frac{1}{2}\sigma _1^2}\\ {{\mu _2} - \frac{1}{2}\sigma _2^2} \end{array}} \right]\Delta t,\left[ {\begin{array}{*{20}{c}} {\Delta t\rho _1^2}&{\Delta t{\sigma _1}{\sigma _2}\rho }\\ {\Delta t{\sigma _1}{\sigma _2}\rho }&{\Delta t\sigma _2^2} \end{array}} \right]} \right) \end{equation}
Not every time this transformation is possible (when I define a different dynamics of S_1, S_2).

So my question is - can I somehow make the covariance matrix depend on the previous values of the ‘gaussian random walk’ in pymc?

Sure, you can write it using a theano.scan - whether you can inference the parameter or not might depends on the dependency.
Also, note that if you have a covariance matrix that depends on previous values, you might be able to express it as a Gaussian Process, which has a pretty good support in PyMC3

Hi Marek,

I’ve had similar challenges implementing a multivariate SDE, where the two processes are correlated. See my latest post on this thread below. I ended up using theano.scan to update the volatility process (which depends on previous values) , but I think in general having a Multivariate SDE class would handle these types of problems would be incredibly valuable.

Marek, it looks like you are estimating two correlated geometric Brownian motions (Black model). I’m very interested in your solution as I am trying to solve a very similar problem