I have 2 files for sensors data like:
- Simulated-data-file : (input:xc1,xc2,…,parameters:Q1,Q2,…,output:yc)
- Field Data File: (input:xf1,xf2,…,output:yf)
I have problem for simulation of Bayesian Calibration of sensor’s data.
Now what i want is to train/find values for Calibration parameter(Q) which when used with xf, will yield yf’ which should be ~yf
There are approaches which suggest using gaussian process being called for every run of MCMC sampling for calibration. They combine both Simulation as well as Field data for this purpose.
That solution is fully Bayesian approach as learning from previous step will be used for next run using gaussian process as simulator.
But now challenge for me is that i want Bayesian approach,but now i have pre-trained Gaussian Regressor model{n(xc,Q)} using (xc,Q) from simulated data, and now i am not merging simulated and field data.
Now i left with xf,yf and n(xc,Q){pretrained model}.
I know the prior for all Q i.e Uniform(0,1)
- Now i stuck with no idea of how to use pretrained Gaussian model
to find calibrated Q, which will be best to describe yf. - Currently i am looking for pymc3 for this approach , but not finding
enough support in documents for this kind of approach.
Below are the sample field-data:
yf,xf1,xf2,xf3
1562,1.69166666666667,37.9301075268817,204.021505376344
1395,0.0610119047619034,51.1919642857143,181.846726190476
1279,4.35067204301076,50.6075268817204,219.407258064516
Link for trained Gaussian model from simulated-data: https://github.com/naitikshukla/BEEHub/tree/master/initial/finalized_model_gp.pkl
Currently trying below code in pymc3:
if __name__ == '__main__':
#Load Model
gp = joblib.load('beehub/testdataset/finalized_model_gp.pkl')
#Load Data
xf,yf,n = load_obs_data('DATAFIELD.csv',3) #UDF provide filename and number of xf
with pm.Model() as model:
yf_flt = yf.values.reshape(1,12)
Q1 = pm.Uniform('Q1',0,1, observed = yf_flt)
Q2 = pm.Uniform('Q2',0,1, observed = yf_flt)
Q3 = pm.Uniform('Q3',0,1, observed = yf_flt)
lambda_e = pm.Gamma('Noise',10,0.03)
y = pm.MvNormal('likelihood', observed=yf_flt, mu={use gp here like gp.predict(xf,Q1,Q2,Q3)}?, cov=lambda_e*tt.eye(n), shape=(n))
#running MCMC sampling
start = pm.find_MAP()
step = pm.NUTS()
trace_ = pm.sample(500, step,progressbar=True,discard_tuned_samples=False)
trace = trace_[500:]
#plot Noise and delta graph
pm.traceplot(trace,varnames=['Q1','Q2','Q3'])
# Draw samples from posterior
ppc1 = pm.sample_ppc(trace, samples=500, model=model)
[Note]: Script to train GP model also given in github link (train_gp.py), trained using numpy arrays in data.
!! Sorry for long post , but I am not sure what to ask exactly so tried to write as much i can understand. Please correct me if i am wrong anywhere in my assumption, but basic thing is simple to :
use pretrained model to find best Q parameter for field data yf. !!