I am trying to understand what happens when the Gaussian Process predict() method is called with a trace as the point. The trace object contains samples of several random variables, but I only get a single mean and variance back. How do I designate which variable is being predicted, and why do I not get back a vector with one entry for each value in x_pred?
Eg:
trace = pm.sample(draws=1000)
mu, var = gp.predict(x_pred, point=trace, diag=True)
Result: mu = [0.83], var= [0.03]
trace: <MultiTrace: 4 chains, 1000 iterations, 5 variables>
x_pred: <ndarray, shape (1,50)>
Thanks!
Do you get the same result each time you call gp.predict? If your point is your trace and it has multiple iterations, I think it goes through them one by one. If you use a MAP estimate instead, you will get the same mu and var every time. If you want the predictive mean and var (or sd), I would use the conditional method and then compute the predictive mean from that.
It gives different values each time.
It is pretty ambiguous which value it is even predicting, though, since there are 5 variables in the trace and I only get one mean out of the function (instead of, say, a vector of means, one per variable).
I think I switched away from find_MAP() because the warnings stated it was a poor way to do this.
Yes, find_MAP is poor. Are you trying to get the statistics for your variables? Use pm.summary(trace) for that.
All I really want is the predicted values of f at x_pred
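One thing that may explain the single value: x_pred is printed above with shape (1, 50). As far as I know, the PyMC3 GP methods treat each row of the input as one point and each column as one input dimension, so a (1, 50) array would be read as a single point rather than 50 points. A minimal NumPy sketch of the reshape, assuming the 50 values are meant to be 50 one-dimensional inputs (the values themselves are made up):

```python
import numpy as np

# Hypothetical stand-in for the x_pred in the question: 50 values in a single row.
x_pred = np.linspace(0.0, 10.0, 50).reshape(1, 50)

# Reinterpret as 50 points with 1 input dimension each.
x_pred_col = x_pred.reshape(-1, 1)

print(x_pred.shape)      # (1, 50)
print(x_pred_col.shape)  # (50, 1)
```

If that is indeed the cause, passing x_pred_col instead of x_pred should give one mean and one variance per point.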
Could you post your entire model? If you want your predicted values (I am assuming the predictive mean and standard deviation), you can first use the conditional method to define the GP at new points and then sample from it:
with model:
    f_pred = gp.conditional("f_pred", X_new)
with model:
    pred_samples = pm.sample_posterior_predictive(trace, vars=[f_pred], samples=1000)
where X_new could cover the same range as your original data. Then you can loop over your predictive samples to get their mean and standard deviation at each of the X_new points:
mu = np.zeros(len(X_new))
sd = np.zeros(len(X_new))
for i in range(len(X_new)):
    mu[i] = np.mean(pred_samples["f_pred"][:, i])
    sd[i] = np.std(pred_samples["f_pred"][:, i])
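The same per-point statistics can also be computed without an explicit loop by reducing over the draws axis. A small sketch, using a made-up array in place of pred_samples["f_pred"] (the shape of 1000 draws by 50 points is an assumption for illustration):

```python
import numpy as np

# Hypothetical stand-in for pred_samples["f_pred"]: 1000 draws at 50 points.
rng = np.random.default_rng(0)
f_samples = rng.normal(loc=1.0, scale=0.2, size=(1000, 50))

# Mean and standard deviation across draws (axis 0): one value per point.
mu = f_samples.mean(axis=0)
sd = f_samples.std(axis=0)

# Loop version for comparison; it gives the same numbers.
mu_loop = np.array([np.mean(f_samples[:, i]) for i in range(f_samples.shape[1])])
```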
If you only want the values at your data points, you could pass your original X into the conditional instead of X_new.
In this case, sample_posterior_predictive will randomly draw 1000 parameter samples from the trace. Then, for each sample, it will draw one realization of f_pred at the X_new points from the conditional distribution defined by the parameters in that sample. If you do not want the predictive mean but all the predictive samples instead, you can skip taking the mean as I did above.
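To make the two-stage sampling concrete, here is a rough NumPy-only sketch of the idea. It is not PyMC3's implementation: it assumes a zero-mean GP with a squared-exponential kernel and Gaussian noise, which may differ from the model in this thread, and the data, the "trace" arrays, and the helper exp_quad are all made up.

```python
import numpy as np

def exp_quad(xa, xb, lengthscale):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(1)

# Made-up training data and prediction points.
X = np.linspace(0.0, 5.0, 20)
y = np.sin(X) + 0.1 * rng.normal(size=X.size)
X_new = np.linspace(0.0, 5.0, 30)

# Hypothetical posterior draws of (lengthscale, noise sd), standing in for a trace.
trace_ls = rng.uniform(0.8, 1.2, size=200)
trace_sigma = rng.uniform(0.08, 0.12, size=200)

n_draws = 100
f_pred = np.empty((n_draws, X_new.size))
for k in range(n_draws):
    # Stage 1: pick one parameter sample from the trace at random.
    j = rng.integers(trace_ls.size)
    ls, sigma = trace_ls[j], trace_sigma[j]

    # Stage 2: GP conditional at X_new given (X, y) under those parameters.
    K = exp_quad(X, X, ls) + sigma**2 * np.eye(X.size)
    Ks = exp_quad(X, X_new, ls)
    Kss = exp_quad(X_new, X_new, ls)
    mean = Ks.T @ np.linalg.solve(K, y)
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    cov += 1e-6 * np.eye(X_new.size)  # small jitter for numerical stability

    # One joint draw of f at all X_new points.
    f_pred[k] = rng.multivariate_normal(mean, cov)

# Summaries across draws: one mean and one sd per X_new point.
mu = f_pred.mean(axis=0)
sd = f_pred.std(axis=0)
```

Each row of f_pred is one predictive sample over all of X_new, so keeping the rows gives the full set of samples, while reducing over axis 0 gives the predictive mean and standard deviation.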
Let’s say I have 2 predictor variables. How can I define X_new in order to get a point prediction, i.e., the mean?
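With two predictors, X_new would normally be a 2-D array with one row per point and one column per predictor, in the same column order as the training X. A minimal NumPy sketch, assuming the covariance function was built with input_dim=2 (the numbers are placeholders):

```python
import numpy as np

# A single point with two predictors: one row, two columns.
X_single = np.array([[1.5, 0.3]])

# Several points on a grid: one row per point, one column per predictor.
x1 = np.linspace(0.0, 1.0, 5)
x2 = np.linspace(10.0, 20.0, 4)
g1, g2 = np.meshgrid(x1, x2)
X_grid = np.column_stack([g1.ravel(), g2.ravel()])

print(X_single.shape)  # (1, 2)
print(X_grid.shape)    # (20, 2)
```

Either array could then be passed as X_new to gp.conditional, and averaging the resulting predictive samples over the draws would give the point prediction at each row.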