I have been looking into data fusion methods, and what caught my eye is the Kalman filter: it tracks the mean and variance of a Gaussian distribution and runs a prediction/correction cycle that fuses a weaker sensor with a stronger/more accurate one. I looked into the following GitHub links to get a better understanding of fusion techniques:
In the first link they discuss the discrete Bayesian filter, but they don't mention the continuous Bayesian filter. So I thought I would follow the same steps as the Kalman filter to implement a continuous Bayesian filter with the help of the PyMC3 package. The steps involved can be found in the second link, and the code is below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# mean0, var0, varMove, varSensor, x, distances, positions come from my sensor setup (not shown)
mean = mean0
var = var0

def predict(var, mean, varMove, meanMove):
    # Prediction step: variances add, means add
    new_var = var + varMove
    new_mean = mean + meanMove
    return new_var, new_mean

def correct(var, mean, varSensor, meanSensor):
    # Correction step: inverse-variance weighted fusion of state and measurement
    new_mean = (varSensor * mean + var * meanSensor) / (var + varSensor)
    new_var = 1 / (1 / var + 1 / varSensor)
    return new_var, new_mean

for m in range(len(distances)):
    # Predict
    var, mean = predict(var, mean, varMove, distances[m])
    # print('mean: %.2f\tvar: %.2f' % (mean, var))
    # norm.pdf replaces the removed matplotlib.mlab.normpdf; note it takes the standard deviation, not the variance
    plt.plot(x, norm.pdf(x, mean, np.sqrt(var)), label='%i. step (Prediction)' % (m + 1))
    # Correct
    var, mean = correct(var, mean, varSensor, positions[m])
    print('After correction: mean= %.2f\tvar= %.2f' % (mean, var))
    plt.plot(x, norm.pdf(x, mean, np.sqrt(var)), label='%i. step (Correction)' % (m + 1))
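To convince myself that the two functions do what I expect, I ran a one-step toy example (the numbers here are my own, picked for illustration):

var, mean = 4.0, 0.0                                      # prior state: N(0, 4)
var, mean = predict(var, mean, varMove=2.0, meanMove=1.0)
print(mean, var)                                          # -> 1.0, 6.0: the move shifts the mean, uncertainty grows
var, mean = correct(var, mean, varSensor=1.0, meanSensor=2.0)
print(mean, var)                                          # -> ~1.857, ~0.857: inverse-variance weighting trusts the better sensor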
For the data that I am using, I followed the approach in the following link:
Fitting empirical distribution to theoretical ones with Scipy
From the distribution list in that answer, the hypersecant distribution fits my data best. The data fusion idea is then: fit an initial model, feed its parameters into the prediction model, and feed the prediction model's parameters into the correction model. The first thing I had to do was create a custom Hypersecant distribution, which can be found in the following question:
Creating a Custom Hypersecant distribution
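For reference, this is a minimal sketch of the kind of class I mean, using scipy's hypsecant parameterization (pdf(z) = sech(z)/pi with z = (x - mu)/sigma); the exact class I use is the one from the linked question:

import numpy as np
import scipy.stats as st
import theano.tensor as tt
import pymc3 as pm
from pymc3.distributions.distribution import draw_values, generate_samples

class Hypsecant(pm.Continuous):
    """Hyperbolic secant distribution, scipy parameterization (minimal sketch)."""
    def __init__(self, mu=0.0, sigma=1.0, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.mu = tt.as_tensor_variable(mu)
        self.sigma = tt.as_tensor_variable(sigma)
        self.mean = self.mu  # the hypersecant mean equals its location

    def random(self, point=None, size=None):
        # Sampling path used by pm.sample_posterior_predictive
        mu, sigma = draw_values([self.mu, self.sigma], point=point, size=size)
        return generate_samples(st.hypsecant.rvs, loc=mu, scale=sigma,
                                dist_shape=self.shape, size=size)

    def logp(self, value):
        # log pdf: -log(pi * sigma) - log(cosh(z)); sigma > 0 is ensured by the HalfNormal prior
        # (tt.log(tt.cosh(z)) can overflow for large |z|; a log-sum-exp trick would be safer)
        z = (value - self.mu) / self.sigma
        return -tt.log(np.pi * self.sigma) - tt.log(tt.cosh(z))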
After creating the custom distribution, the Bayesian fusion can be done. Here is my approach to the fusion (I apologise for the lengthy code):
def S2_0_Bayesian_Interface(data):
    ########################################################################################################################
    with pm.Model() as model_0:
        # Prior distributions for the unknown model parameters:
        sigma_0 = pm.HalfNormal('sigma_0', sigma=5)
        mu_0 = pm.Normal('mu_0', mu=0, sigma=5)
        # Likelihood (sampling distribution) of the observations:
        observed_data_0 = Hypsecant('observed_data_0', mu=mu_0, sigma=sigma_0, observed=data)
        # Draw 1000 posterior samples per chain (3 chains):
        trace_0 = pm.sample(draws=1000, tune=500, target_accept=0.9, chains=3, cores=1, progressbar=True)
        # Posterior predictive sampling:
        post_pred_0 = pm.sample_posterior_predictive(trace_0, samples=1000)
        print(post_pred_0['observed_data_0'].shape)
        print('\nSummary: ')
        print(pm.stats.summary(data=trace_0))
        print(pm.stats.summary(data=post_pred_0))
    ########################################################################################################################
    return trace_0, post_pred_0
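Between stages I could also check that the sampler converged before feeding the posterior forward, e.g. (assuming the arviz-backed summary that includes an r_hat column):

pm.traceplot(trace_0)                        # visual check: chains should overlap and look stationary
print(pm.stats.summary(trace_0)['r_hat'])    # values close to 1.0 indicate convergence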
def S2_P_Bayesian_Interface(mu, sigma, data):
    ########################################################################################################################
    with pm.Model() as model_P:
        # Priors for the unknown parameters, built from the previous stage's posterior samples:
        sigma_P = pm.HalfNormal('sigma_P', sigma=np.std(sigma))
        mu_P = pm.Normal('mu_P', mu=np.mean(mu), sigma=np.std(mu))
        # Likelihood (sampling distribution) of the observations:
        observed_data_P = Hypsecant('observed_data_P', mu=mu_P, sigma=sigma_P, observed=data)
        # Draw 1000 posterior samples per chain (3 chains):
        trace_P = pm.sample(draws=1000, tune=500, target_accept=0.9, chains=3, cores=1, progressbar=True)
        # Posterior predictive sampling:
        post_pred_P = pm.sample_posterior_predictive(trace_P, samples=1000)
        print(post_pred_P['observed_data_P'].shape)
        print('\nSummary: ')
        print(pm.stats.summary(data=trace_P))
        print(pm.stats.summary(data=post_pred_P))
    ########################################################################################################################
    return trace_P, post_pred_P
def S1_C_Bayesian_Interface(mu, sigma, data):
    ########################################################################################################################
    with pm.Model() as model_C:
        # Priors for the unknown parameters, built from the prediction stage's posterior samples:
        sigma_C = pm.HalfNormal('sigma_C', sigma=np.std(sigma))
        mu_C = pm.Normal('mu_C', mu=np.mean(mu), sigma=np.std(mu))
        # Likelihood (sampling distribution) of the observations:
        observed_data_C = Hypsecant('observed_data_C', mu=mu_C, sigma=sigma_C, observed=data)
        # Draw 1000 posterior samples per chain (3 chains):
        trace_C = pm.sample(draws=1000, tune=500, target_accept=0.9, chains=3, cores=1, progressbar=True)
        # Posterior predictive sampling:
        post_pred_C = pm.sample_posterior_predictive(trace_C, samples=1000)
        print(post_pred_C['observed_data_C'].shape)
        print('\nSummary: ')
        print(pm.stats.summary(data=trace_C))
        print(pm.stats.summary(data=post_pred_C))
    ########################################################################################################################
    return trace_C, post_pred_C
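As an aside, the three functions above are identical up to variable names, so when cleaning this up I considered collapsing the prediction and correction stages into one parameterized helper, something like this sketch (`label` is just a hypothetical tag for the variable names; the initial stage keeps its fixed hyperpriors):

def bayesian_stage(data, mu_samples, sigma_samples, label):
    # One predict/correct stage: priors are built from the previous stage's posterior samples
    with pm.Model():
        sigma = pm.HalfNormal('sigma_' + label, sigma=np.std(sigma_samples))
        mu = pm.Normal('mu_' + label, mu=np.mean(mu_samples), sigma=np.std(mu_samples))
        Hypsecant('observed_data_' + label, mu=mu, sigma=sigma, observed=data)
        trace = pm.sample(draws=1000, tune=500, target_accept=0.9, chains=3, cores=1)
        post_pred = pm.sample_posterior_predictive(trace, samples=1000)
    return trace, post_pred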
# Initializing the PDF distribution:
print('#######################################################################################################')
print('Designing the Bayesian PDF for initial Sensor 2:')
trace_S2_0, post_pred_S2_0 = S2_0_Bayesian_Interface(data=S2_data_list[0])
best_dist = getattr(st, 'hypsecant')
param_S2_0 = [np.mean(trace_S2_0['mu_0']), np.mean(trace_S2_0['sigma_0'])]
print('Parameter values = ', param_S2_0)
pdf_S2_0, B_x_axis_pdf_S2_0, y_axis_pdf_S2_0 = make_pdf(data=S2_data_list[0], dist=best_dist, params=param_S2_0, size=len(trace_S2_0))
print('#######################################################################################################')
plt.plot(B_x_axis_pdf_S2_0, y_axis_pdf_S2_0, label='Initial[S2]')

corr_pdf_S1_list = []
x_corr_pdf_list = []
for m in range(len(S2_data_list)):
    print('\n')
    print('Accessing list number: ', m)
    # Predict:
    print('#######################################################################################################')
    print('Designing the Bayesian PDF for predictive Sensor 2:')
    trace_S2_P, post_pred_S2_P = S2_P_Bayesian_Interface(mu=trace_S2_0['mu_0'], sigma=trace_S2_0['sigma_0'], data=S2_data_list[m])
    param_S2_P = [np.mean(trace_S2_P['mu_P']), np.mean(trace_S2_P['sigma_P'])]
    print('Parameter values = ', param_S2_P)
    pdf_S2_P, B_x_axis_pdf_S2_P, y_axis_pdf_S2_P = make_pdf(data=S2_data_list[m], dist=best_dist, params=param_S2_P, size=len(trace_S2_P))
    print('#######################################################################################################')
    plt.plot(B_x_axis_pdf_S2_P, y_axis_pdf_S2_P, label='%i. step (Prediction[S2])' % (m + 1))
    # Correct:
    print('#######################################################################################################')
    print('Designing the Bayesian PDF for correction Sensor 1:')
    trace_S1_C, post_pred_S1_C = S1_C_Bayesian_Interface(mu=trace_S2_P['mu_P'], sigma=trace_S2_P['sigma_P'], data=S1_data_list[m])
    param_S1_C = [np.mean(trace_S1_C['mu_C']), np.mean(trace_S1_C['sigma_C'])]
    print('Parameter values = ', param_S1_C)
    pdf_S1_C, B_x_axis_pdf_S1_C, y_axis_pdf_S1_C = make_pdf(data=S1_data_list[m], dist=best_dist, params=param_S1_C, size=len(trace_S1_C))
    print('#######################################################################################################')
    plt.plot(B_x_axis_pdf_S1_C, y_axis_pdf_S1_C, label='%i. step (Correction[S1])' % (m + 1))
    corr_pdf_S1 = y_axis_pdf_S1_C
    corr_pdf_S1_list.append(corr_pdf_S1)
    print('Corrected PDF: ', corr_pdf_S1)
    print('Length of Corrected PDF list = ', len(corr_pdf_S1_list))
    x_corr_pdf_list.append(B_x_axis_pdf_S1_C)
However, I am not sure if I am doing the steps correctly. Here are my questions:
- Am I approaching the idea correctly? If not, how can I modify the code? Also, if possible, could you share a tutorial explaining the idea of Bayesian fusion/Bayesian updating?
- Every time I re-run the code, I get a different distribution when fusing the data together. Could the problem lie in the `def random` section of my custom distribution? Or could it be that every run produces different values for `mu` and `sigma`, which would explain the different fused distributions?
- Could it be because of the prior line, e.g. `mu=np.mean(mu), sigma=np.std(mu)`, where it should be `mu=np.mean(mu), sigma=np.std(sigma)`?
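For what it's worth, one thing I plan to try for the second question is pinning all the seeds, to see whether the run-to-run differences are just sampler and posterior-predictive noise rather than a bug in my code (a sketch; `random_seed` accepts one seed per chain):

np.random.seed(0)  # seeds NumPy, which scipy's rvs (used in my custom random method) relies on
with model_0:
    trace_0 = pm.sample(draws=1000, tune=500, target_accept=0.9,
                        chains=3, cores=1, random_seed=[0, 1, 2])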
P.S.: the `make_pdf` function can be found in the "Fitting empirical distribution to theoretical ones with Scipy" link above.