Use saved gaussian model from sci-kit in pymc3?

naitikshukla · November 15, 2017, 9:10am

I am very new to pymc3 and python itself so please bear with me if I am writing something wrong.

I have a Gaussian process regression model (gp) from scikit-learn with 6 inputs, saved using pickle. I also have a file that contains 3 of those inputs (x) and the corresponding output (y) for the gp.

My question is how to calibrate the remaining 3 inputs (6 - 3) so that, combined with the 3 known inputs, the gp yields approximately the known output (y), or at least matches it with minimal error. Currently I am thinking of something like this:

import joblib
import numpy as np
import pymc3 as pm
import theano.tensor as tt

gp = joblib.load('finalized_model_gp.pkl')                # load the saved scikit-learn GP model
x3, y1, num_records = load_file_data('DATAFIELD.csv', 3)  # my own helper: returns the 3 known inputs (x3 = x1, x2, x3) and the output y

def predict_y(x1, x2, x3, Q1, Q2, Q3):
    # gp.predict expects one 2D array of shape (n_samples, 6), not six separate arguments
    n = len(x1)
    X = np.column_stack([x1, x2, x3, np.full(n, Q1), np.full(n, Q2), np.full(n, Q3)])
    return gp.predict(X)

with pm.Model() as model:
    Q1 = pm.Uniform('Q1', 0, 1)  # input 4; the remaining 3 inputs need to be calibrated for the gp, with some known prior
    Q2 = pm.Uniform('Q2', 0, 1)  # input 5
    Q3 = pm.Uniform('Q3', 0, 1)  # input 6

    # Maybe the gp should be used here, via predict_y(), to calibrate Q1, Q2, Q3 against x1, x2, x3 and y1.
    # mu is not defined yet: it should come from the gp prediction, which is the open question.
    y = pm.MvNormal('likelihood', observed=y1, mu=mu, cov=tt.eye(num_records), shape=(num_records))
    trace_ = pm.sample(500, progressbar=True, discard_tuned_samples=False)  # some sampling

I am not sure whether this can be done using PyMC3, or whether I should look into some other approach.
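For reference, scikit-learn's `GaussianProcessRegressor.predict` expects a single 2D array of shape `(n_samples, n_features)`, so the three known inputs and the three candidate Q values have to be stacked into one matrix before predicting. The sketch below illustrates this under loud assumptions: the GP is a stand-in trained on synthetic data (the real one would come from `joblib.load`), the data file is replaced by random numbers, and the helpers `predict_y` and `sum_squared_error` are hypothetical names used only for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

rng = np.random.RandomState(0)

# Stand-in for the pickled model: a GP with 6 inputs fitted on synthetic data.
X_train = rng.uniform(0, 1, size=(60, 6))
y_train = X_train.sum(axis=1)
gp = GaussianProcessRegressor(
    kernel=ConstantKernel(1.0) * RBF(length_scale=1.0),
    alpha=1e-6,
    normalize_y=True,
).fit(X_train, y_train)

# Stand-in for the data file: 3 known inputs per record plus the observed output.
num_records = 10
x3 = rng.uniform(0, 1, size=(num_records, 3))
true_Q = np.array([0.2, 0.5, 0.8])          # assumed "true" values, for the demo only
y1 = x3.sum(axis=1) + true_Q.sum()

def predict_y(x3, Q):
    """Predict y for every record, using the same Q1, Q2, Q3 for all records."""
    Q_block = np.tile(np.asarray(Q, dtype=float), (x3.shape[0], 1))  # (num_records, 3)
    X_full = np.column_stack([x3, Q_block])                           # (num_records, 6)
    return gp.predict(X_full)

def sum_squared_error(Q):
    """Misfit between the GP prediction and the observed output for a candidate Q."""
    return float(np.sum((predict_y(x3, Q) - y1) ** 2))

sse_true = sum_squared_error(true_Q)            # candidate equal to the assumed truth
sse_other = sum_squared_error([0.9, 0.9, 0.9])  # a candidate far from it
```

A misfit function like this is what any calibration scheme, Bayesian or not, ends up evaluating: candidates close to the right Q values should give a smaller error than candidates far from them.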

junpenglao · November 15, 2017, 9:26am

Hi @naitikshukla,
(This is the same question as "Finding posterior for calibration using saved Gaussian model in pymc3", right? Sorry about the non-response.)

What you want to do can surely be done in PyMC3. However, I would suggest building the GP in PyMC3 and calibrating it there instead, so you can perform the inference in one coherent framework.

Otherwise, if you still want to use the fitted GP from scikit-learn, you can isolate the parameters from the fitted GP, namely the mean and standard deviation (or standard error) of Q1, Q2, Q3. Then, instead of a Uniform, use a Normal distribution to define each of them in the pm.Model:

with pm.Model() as model:
    Q1 = pm.Normal('Q1', mu_q1, sd_q1)
    Q2 = pm.Normal('Q2', mu_q2, sd_q2)
    Q3 = pm.Normal('Q3', mu_q3, sd_q3)
    gp = ... # use Q1, Q2, Q3 to build a GP in pymc3, and calibrate it using the new observation
             # more details in http://docs.pymc.io/notebooks/GP-Marginal.html

I guess Q1, Q2, Q3 are the parameters of the kernel function, so similar to

with pm.Model() as gp:
    ℓ = pm.Gamma("ℓ", alpha=2, beta=1)
    η = pm.HalfCauchy("η", beta=5)
    cov = η**2 * pm.gp.cov.Matern52(1, ℓ)
    gp = pm.gp.Marginal(cov_func=cov)

    σ = pm.HalfCauchy("σ", beta=5)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=σ)

in your case it will go like this:

with model: # the model you define above with the Q1, Q2, Q3
    cov = Q1**2 * pm.gp.cov.Matern52(1, Q2)
    gp = pm.gp.Marginal(cov_func=cov)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=Q3)

Of course, you need to make sure that Q1, Q2, Q3 in this case correspond to the right input parameters for pm.gp.
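To make those roles concrete: in the snippet above, Q1 plays the part of the amplitude η, Q2 the lengthscale ℓ, and Q3 the observation noise σ. The NumPy sketch below writes out the covariance that, as far as I understand it, `Q1**2 * pm.gp.cov.Matern52(1, Q2)` together with `noise=Q3` represents. The function name `matern52_cov` and the numeric values are made up for illustration and are not part of PyMC3.

```python
import numpy as np

def matern52_cov(X, eta, ell):
    """eta**2 * Matern-5/2 kernel evaluated on all pairs of rows in X (shape (n, d))."""
    diff = X[:, None, :] - X[None, :, :]
    r = np.sqrt(np.sum(diff ** 2, axis=-1)) / ell   # pairwise distance scaled by the lengthscale
    s = np.sqrt(5.0) * r
    return eta ** 2 * (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

X = np.linspace(0, 1, 5)[:, None]   # 5 points, 1 input dimension
Q1, Q2, Q3 = 2.0, 0.5, 0.1          # amplitude, lengthscale, noise sd (illustrative values)

K = matern52_cov(X, eta=Q1, ell=Q2)        # covariance of the latent function
K_noisy = K + Q3 ** 2 * np.eye(len(X))     # covariance of the noisy observations y
```

Note that the amplitude and the noise enter squared, so a variance taken from another library has to be converted to a standard deviation before it is used as Q1 or Q3.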

Let me know if anything is unclear.

naitikshukla · November 30, 2017, 6:37am

Thank you for writing in such detail, but I am still confused about where my pretrained scikit-learn model is used in the approach you shared.

I can get the mean and sd for Q1, Q2 and Q3, but do I have to build the GP again in PyMC3 for sampling? In other words, can I only use the PyMC3 GP for this?
Thanks in advance.

junpenglao · November 30, 2017, 7:09am

You will have to build the GP again in PyMC3, but you can set the parameters using the fitted parameters from scikit-learn (or build the priors accordingly using these fitted parameters). That’s what I meant:

with model: # the model you define above with the Q1, Q2, Q3
    cov = Q1**2 * pm.gp.cov.Matern52(1, Q2)
    gp = pm.gp.Marginal(cov_func=cov)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=Q3)

Q1, Q2, Q3 are the parameters from the pretrained scikit-learn GP.
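As a sketch of how those fitted values might be read off a scikit-learn GP: the example assumes the model was trained with a kernel of the form `ConstantKernel * Matern(nu=2.5) + WhiteKernel`, which may not match the actual pickled model. The attribute path (`kernel_.k1.k1` and so on) depends on how the kernel was composed, and the synthetic data stands in for the real training set.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern, WhiteKernel

rng = np.random.RandomState(1)

# Synthetic training data, standing in for whatever the real GP was fitted on.
X = rng.uniform(0, 5, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.randn(50)

# Assumed kernel structure: amplitude * Matern-5/2 + white noise.
kernel = ConstantKernel(1.0) * Matern(length_scale=1.0, nu=2.5) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(X, y)

# kernel_ holds the optimized kernel; k1 and k2 are the left and right operands of + and *.
fitted = gp.kernel_
eta = np.sqrt(fitted.k1.k1.constant_value)   # amplitude, candidate value for Q1
ell = fitted.k1.k2.length_scale              # lengthscale, candidate value for Q2
sigma = np.sqrt(fitted.k2.noise_level)       # noise sd, candidate value for Q3

print(f"eta={eta:.3f}, ell={ell:.3f}, sigma={sigma:.3f}")
```

These numbers could then be plugged in as fixed values for Q1, Q2, Q3, or used as the means of the Normal priors suggested earlier in the thread.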
