Prediction using sampling

Hey!
I am trying to predict the out-of-sample returns of an algorithmic trading strategy using a Student's t distribution, following the approach in pyfolio, which uses pymc3 for Bayesian prediction. Some of the prediction code in pyfolio seems a bit outdated, though. I would like to know how prediction using sampling should be done, or specifically what I am doing wrong in my current approach, since the predictions seem to be off. The model I am fitting is as follows:

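import numpy as np
import pymc3 as pm

def fit_returns_model(data, samples, progressbar=True):
    with pm.Model() as model:
        # Priors on the mean, scale, and degrees of freedom of daily returns.
        mu = pm.Normal('mean returns', mu=0, sd=.01, testval=data.mean())
        sigma = pm.HalfCauchy('volatility', beta=1, testval=data.std())
        nu = pm.Exponential('nu_minus_two', 1. / 10., testval=3.)

        # Student's t likelihood; nu + 2 keeps the variance finite.
        returns = pm.StudentT('returns', nu=nu + 2, mu=mu, sd=sigma,
                              observed=data)

        # Annualized volatility and Sharpe ratio, assuming 252 trading days.
        pm.Deterministic('annual volatility',
                         returns.distribution.variance**.5 * np.sqrt(252))
        pm.Deterministic('sharpe', returns.distribution.mean /
                         returns.distribution.variance**.5 *
                         np.sqrt(252))

        trace = pm.sample(samples, progressbar=progressbar)
    return model, trace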

As I understand it, the above samples the posterior of the model parameters (mean, volatility, and degrees of freedom) given the observed data. I then use the returned model and trace as follows:

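def predict_returns(model, trace, returns_test, samples, progressbar=True):
    with model:
        # Look up the fitted parameters by name so the new node shares them.
        mu, sigma, nu = model['mean returns'], model['volatility'], model['nu_minus_two']
        # Unobserved Student's t node covering the out-of-sample period.
        pm.StudentT('returns_test', nu=nu + 2, mu=mu, sd=sigma,
                    shape=(len(returns_test), 1))
        ppc_samples = pm.sample_posterior_predictive(
            trace, samples=samples, model=model,
            var_names=['returns_test'], progressbar=progressbar)
    return trace, ppc_samples['returns_test']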
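
For reference, here is a minimal NumPy sketch of what posterior predictive sampling for this model amounts to conceptually: for each posterior draw of the parameters, simulate a path of Student's t returns. It is not pymc3's implementation. The function name `simulate_returns` and its arguments are hypothetical, and it assumes `sd` in `pm.StudentT` acts as the scale of the distribution, so a draw is `mu + sigma * t`.

```python
import numpy as np

def simulate_returns(mu_draws, sigma_draws, dof_draws, n_days, seed=0):
    """Simulate one return path per posterior draw (hypothetical helper).

    mu_draws, sigma_draws, dof_draws: 1-D arrays of equal length, one entry
    per posterior draw. dof_draws holds the degrees of freedom themselves
    (that is, nu_minus_two + 2 in the model above).
    Returns an array of shape (n_draws, n_days).
    """
    rng = np.random.default_rng(seed)
    # Column vectors so each draw's parameters broadcast across its path.
    mu = np.asarray(mu_draws, dtype=float)[:, None]
    sigma = np.asarray(sigma_draws, dtype=float)[:, None]
    dof = np.asarray(dof_draws, dtype=float)[:, None]
    # Standard Student's t noise, one row per posterior draw.
    noise = rng.standard_t(dof, size=(mu.shape[0], n_days))
    # Shift and scale to get simulated daily returns.
    return mu + sigma * noise
```

With a pymc3 trace, the inputs would presumably be `trace['mean returns']`, `trace['volatility']`, and `trace['nu_minus_two'] + 2`. Comparing such a hand-rolled simulation against the `ppc_samples` array is one way to check that both have the expected shape and scale.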

So basically I use the trace (posterior samples of the parameters fitted on the training set) to predict on the test set. This assumes that returns are identically distributed in sample and out of sample (OOS), which is reasonable for the purpose of analyzing a strategy’s performance. The ppc samples are then used to generate percentile cones, to see which percentile zone the strategy's realized OOS performance falls in. The problem is that the cones generated this way are far too ‘conservative’: they place the strategy in the 25th to 50th percentile range, even though it is performing noticeably worse than it did in sample. Monte Carlo simulations place it at the 5th percentile or below, which seems more realistic considering its performance.
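
To make the cone step concrete, here is a minimal sketch of how percentile cones and the realized path's percentile could be computed from simulated returns. The function name `cone_and_score` is hypothetical. It assumes the simulated returns have already been de-normalized to simple daily returns and reshaped to `(n_draws, n_days)`; with `shape=(len(returns_test), 1)` the ppc array would need its trailing axis squeezed first.

```python
import numpy as np

def cone_and_score(sim_returns, realized_returns, levels=(5, 25, 50, 75, 95)):
    """Percentile cone of cumulative returns plus the realized path's score.

    sim_returns: (n_draws, n_days) simulated simple daily returns.
    realized_returns: (n_days,) realized out-of-sample simple daily returns.
    Returns (cone, score): cone has shape (len(levels), n_days); score[d] is
    the percentage of simulated paths at or below the realized path on day d.
    """
    sim_returns = np.asarray(sim_returns, dtype=float)
    realized_returns = np.asarray(realized_returns, dtype=float)
    # Compound daily returns into cumulative return paths.
    sim_cum = np.cumprod(1.0 + sim_returns, axis=1) - 1.0
    real_cum = np.cumprod(1.0 + realized_returns) - 1.0
    # Cone bounds: percentiles across simulated paths for each day.
    cone = np.percentile(sim_cum, levels, axis=0)
    # Share of simulated cumulative paths not above the realized one.
    score = 100.0 * (sim_cum <= real_cum).mean(axis=0)
    return cone, score
```

Note that the comparison here is done on cumulative paths rather than on individual daily returns; whether the cones are built from cumulative or daily values changes which percentile a given strategy lands in.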

Am I using the correct method for prediction? If yes, can anyone help me understand why the ‘predictions’ come out so conservative? If not, what is the correct method for performing predictions using sampling?

NOTE: The returns are a pandas Series of the strategy's daily returns (NOT in percentages). I normalize them (z-score normalization) before fitting the model and sampling the trace, and ‘de-normalize’ the ppc samples after they are generated.
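
The normalization round trip described in the note can be sketched as follows. The helper names are hypothetical, and the sketch assumes the mean and standard deviation are computed once on the in-sample returns and then reused unchanged to de-normalize the simulated values.

```python
import numpy as np

def zscore_fit(returns):
    """Return the (mean, std) used for normalization (hypothetical helper)."""
    returns = np.asarray(returns, dtype=float)
    return returns.mean(), returns.std()

def normalize(returns, mean, std):
    """Map raw daily returns to z-scores."""
    return (np.asarray(returns, dtype=float) - mean) / std

def denormalize(z, mean, std):
    """Map z-scores (for example simulated ppc values) back to raw returns."""
    return np.asarray(z, dtype=float) * std + mean
```

One thing worth checking in such a setup is that the statistics used in `denormalize` are exactly those used in `normalize`. If the out-of-sample data were normalized with its own mean and standard deviation instead, a shift in performance relative to the in-sample period would be removed before any comparison takes place.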

P.S. Sorry for the long question; I wanted to be as clear as possible about the problem I am solving and the difficulty I am facing. Any help would be appreciated.
Thanks in advance!