GaussianRandomWalk predictions

I want to do time-series forecasting with PyMC3. I am open to suggestions, but so far I have tried modeling the forecast with a GaussianRandomWalk, appending NaNs to the dataset before sampling. PyMC3 then treats the NaNs as missing data, so they get sampled and hence predicted by the model. My questions are:

  1. Is this the best approach for this?
  2. Can someone explain why the forecasted values diverge so rapidly when the observed data looks much more stable? See the figure.
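For context on the NaN approach: PyMC3 detects NaNs in an `observed` array by converting it to a NumPy masked array, then adds a free variable for the masked entries (named after the observed variable, e.g. `r_missing` here) that gets sampled alongside the rest of the model. A minimal sketch of that masking step (names are illustrative, not the library internals verbatim):

```python
import numpy as np

# Toy data standing in for returns['change']; the last entries are the
# "future" points we want PyMC3 to impute.
vals = np.array([0.1, -0.2, np.nan, 0.05, np.nan])

# This is essentially what PyMC3 does to an observed array containing
# NaNs before creating the *_missing free variable.
masked = np.ma.masked_invalid(vals)
print(masked.mask)  # True marks the entries that will be sampled
```
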

The dataset is from the classic stochastic volatility example. The code is below:

import pandas as pd 
import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np

returns = pd.read_csv(pm.get_data('SP500.csv'), parse_dates=True, index_col=0)

# Append 500 NaN rows in one go; PyMC3 will treat them as missing data
nan_rows = pd.DataFrame({'change': np.full(500, np.nan)})
returns = returns.append(nan_rows, ignore_index=True)


with pm.Model() as sp500_model:
    nu = pm.Exponential('nu', 1/10., testval=5.)         # Student-t degrees of freedom
    sigma = pm.Exponential('sigma', 1/0.02, testval=.1)  # step size of the random walk
    # Latent log-volatility follows a Gaussian random walk
    s = pm.GaussianRandomWalk('s', sigma=sigma, shape=len(returns))
    volatility_process = pm.Deterministic('volatility_process', pm.math.exp(-2*s)**0.5)
    # The NaNs in returns['change'] become free 'r_missing' variables and are sampled
    r = pm.StudentT('r', nu=nu, sigma=volatility_process, observed=returns['change'])
    trace = pm.sample(2000)
    pm.traceplot(trace, varnames=['nu', 'sigma'])

    fig, ax = plt.subplots(figsize=(15, 8))
    # Thin the trace (every 5th draw) and overlay the posterior volatility paths
    ax.plot(returns.index, 1/np.exp(trace['s', ::5].T), 'C3', alpha=.03)
    ax.set(title='volatility_process', xlabel='time', ylabel='volatility')
    ax.legend(['stochastic volatility process'])
    plt.ylim([-0.1, 0.15])
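Regarding question 2, my current understanding (which I'd like confirmed): past the last real observation there is no likelihood term to constrain `s`, so its posterior over the NaN region is just the random-walk prior, whose variance grows linearly with the number of steps. A quick NumPy simulation of that fan-out, assuming a fixed step size `sigma` for illustration:

```python
import numpy as np

# Simulate 1000 unconstrained Gaussian random walks for 500 future steps.
rng = np.random.default_rng(0)
sigma, steps = 0.1, 500
walks = np.cumsum(rng.normal(0., sigma, size=(1000, steps)), axis=1)

# The cross-walk standard deviation after k steps is about sigma * sqrt(k),
# which is why the forecast envelope widens so quickly.
spread = walks.std(axis=0)
print(spread[-1], sigma * np.sqrt(steps))  # these two should be close
```

So the divergence in the figure would just be the prior uncertainty accumulating, not a bug in the model, if this reading is right.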