Neural Nets for solving Inverse Problems using PyMC

I’m trying to update a method used in high-energy particle physics to solve inverse problems on data recorded at a particle detector, such as those at the Large Hadron Collider. The aim is to recover the distribution that generated the observed data, which has been distorted, or ‘smeared’, by the effects of the detector.

I have adapted a method that is used on a toy dataset: a Gaussian distribution with a noise component. The method uses two neural nets to iteratively combine a prior sample with artificial detector effects and generate per-event weights, which are then applied to try to replicate the unknown distribution behind the observed data.

It has been suggested to me that I could transition to a Bayesian neural net, with a distribution over the weights of each layer, using PyMC, but I am at a loss as to how to implement this. Ultimately I would like a probabilistic estimate for the unfolded distribution, rather than a point estimate.

If anyone is able to advise on how I might go about tackling this challenge, or has some ideas to try, I would be very grateful!

Thanks for any feedback!
Steve

import numpy as np
from matplotlib import pyplot as plt
from matplotlib.pyplot import figure
from keras.layers import Dense, Input
from keras.models import Model
from sklearn.model_selection import train_test_split
np.random.seed(666)
# This would be the data if there were no detector effects
gen_x = np.random.normal(0.3, 2.5, 1000)
# Adding random noise to each point 'smears' the data; this is what would be recorded at the detector
sim_y = [event + np.random.normal(-2.5, .2) for event in gen_x]
# This is the unknown distribution we are trying to replicate
truth_x = np.random.normal(1.2, 3.,1000)
# This represents the observed data recorded at the detector that has been 'smeared'.
data_y = [event+np.random.normal(-2.5,.2) for event in truth_x]
# Plot showing the binned data. The shaded histograms are the target distributions, the lined data is the smeared data that would be observed at the detector
variable_min = np.percentile(gen_x, 0.5)
variable_max = np.percentile(gen_x, 99.5)

figure(figsize=(12, 8))

_,_,_=plt.hist(gen_x,bins=np.linspace(variable_min, variable_max,16),color='blue',alpha=0.5,label="Gen")
_,_,_=plt.hist(sim_y,bins=np.linspace(variable_min, variable_max,16),histtype="step",color='blue',label="Sim")
_,_,_=plt.hist(truth_x,bins=np.linspace(variable_min, variable_max,16),color='orange',alpha=0.5,label="Truth")
_,_,_=plt.hist(data_y,bins=np.linspace(variable_min, variable_max,16),histtype="step",color='orange',label="Data")
plt.xlabel("x")
plt.ylabel("events")
plt.legend(frameon=False)
# Simple model used in the unfolding
inputs = Input((1, ))
hidden_layer_1 = Dense(50, activation='relu')(inputs)
hidden_layer_2 = Dense(50, activation='relu')(hidden_layer_1)
hidden_layer_3 = Dense(50, activation='relu')(hidden_layer_2)
outputs = Dense(1, activation='sigmoid')(hidden_layer_3)
model = Model(inputs=inputs, outputs=outputs)
# function to reweight the data via the classifier output
def reweight(events, model, batch_size=100):
    # the classifier output f approximates p(data | x), so
    # f / (1 - f) estimates the density ratio data / sim
    f = model.predict(np.asarray(events), batch_size=batch_size)
    weights = f / (1. - f)
    return np.squeeze(np.nan_to_num(weights))
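
A quick aside on why f / (1 - f) works (this sanity check is mine, not part of the original method): the Bayes-optimal classifier outputs f = p1 / (p0 + p1) for class densities p0 and p1, so f / (1 - f) recovers the density ratio p1 / p0 exactly.

from scipy.stats import norm
xs = np.linspace(-5., 5., 11)
p0 = norm.pdf(xs, 0.3, 2.5)        # 'simulation' density
p1 = norm.pdf(xs, 1.2, 3.0)        # 'data' density
f_ideal = p1 / (p0 + p1)           # ideal classifier output
assert np.allclose(f_ideal / (1. - f_ideal), p1 / p0)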

# function to unfold the data sets
def unfold(gen_x, sim_y ,data_y, iterations, model, verbose=0):

    weights = np.empty(shape=(iterations, 2, len(gen_x)))
    
    labels0 = np.zeros(len(gen_x))
    labels_unknown = np.ones(len(data_y))
    
    xvals_1 = np.concatenate((sim_y, data_y))
    yvals_1 = np.concatenate((labels0, labels_unknown))

    xvals_2 = np.concatenate((gen_x, gen_x))
    yvals_2 = np.concatenate((labels0, labels_unknown))

    # initial iterative weights are ones
    weights_pull = np.ones(len(sim_y))
    weights_push = np.ones(len(sim_y))
    
    for i in range(iterations):
        print("\nITERATION: {}\n".format(i + 1))
        
        # STEP 1: classify Sim. (which is reweighted by weights_push) to Data
        # weights reweighted Sim. --> Data
            
        weights_1 = np.concatenate((weights_push, np.ones(len(data_y))))

        X_train_1, X_test_1, Y_train_1, Y_test_1, w_train_1, w_test_1 = train_test_split(xvals_1, yvals_1, weights_1)  

        model.compile(loss='binary_crossentropy',
                      optimizer='Adam',
                      metrics=['accuracy'])
        # the iterative weights must enter the fit as per-event sample weights
        model.fit(X_train_1,
                  Y_train_1,
                  sample_weight=w_train_1,
                  epochs=100,
                  batch_size=100,
                  validation_data=(X_test_1, Y_test_1, w_test_1),
                  verbose=0)

        weights_pull = weights_push * reweight(sim_y, model)
        weights[i, 0, :] = weights_pull

        # STEP 2: classify Gen. to reweighted Gen. (which is reweighted by weights_pull)
        # weights Gen. --> reweighted Gen.

        weights_2 = np.concatenate((np.ones(len(gen_x)), weights_pull))
        # ones for Gen. (not MC weights), actual weights for (reweighted) Gen.

        X_train_2, X_test_2, Y_train_2, Y_test_2, w_train_2, w_test_2 = train_test_split(xvals_2, yvals_2, weights_2)   
        
        model.compile(loss='binary_crossentropy',
                      optimizer='Adam',
                      metrics=['accuracy'])
        # again, pass the per-event weights to the fit
        model.fit(X_train_2,
                  Y_train_2,
                  sample_weight=w_train_2,
                  epochs=50,
                  batch_size=200,
                  validation_data=(X_test_2, Y_test_2, w_test_2),
                  verbose=0)

        weights_push = reweight(gen_x, model)
        weights[i, 1, :] = weights_push
        
    return weights
iterations = 6
myweights = unfold(gen_x, sim_y, data_y, iterations, model)
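
One optional diagnostic (my own suggestion, not part of the original method): plot the pull weights from each iteration to check that the reweighted Gen. stabilises as the iterations converge.

figure(figsize=(12, 8))
bins = np.linspace(variable_min, variable_max, 16)
for i in range(iterations):
    _ = plt.hist(gen_x, weights=myweights[i, 0, :], bins=bins,
                 histtype="step", label="Iteration {}".format(i + 1))
plt.xlabel("x")
plt.ylabel("events")
plt.legend(frameon=False)
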
figure(figsize=(12, 8))

_,_,_=plt.hist(data_y,bins=np.linspace(variable_min, variable_max,16),color='blue',alpha=0.5,label="Data")
_,_,_=plt.hist(truth_x,bins=np.linspace(variable_min, variable_max,16),color='orange',alpha=0.5,label="Truth")
_,_,_=plt.hist(gen_x, weights=myweights[-1, 0, :], bins=np.linspace(variable_min, variable_max,16), color='grey', histtype="step", label="Unfolded", lw=2)

plt.xlabel("X")
plt.ylabel("Events")
plt.legend()

I don’t have a lot of experience working with Bayesian neural nets, but I’ve read that trying to do full MCMC on them is very difficult, so ADVI is the way to go. There’s an example of a Bayesian neural network using ADVI in the examples gallery here. The exact same model as in the example can be sampled however you wish, though.

The example uses tanh activations, but aesara has the full suite of neural network stuff in aesara.tensor, for example aesara.tensor.nnet.relu and aesara.tensor.sigmoid (or pm.invlogit, which is equivalent). Check the aesara docs for more info.
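
To make that concrete, here is a minimal, untested sketch of Step 1 of the unfolding with a PyMC Bayesian network in place of the Keras classifier. The layer size, priors, and variable names here are my own choices, not from the gallery example: every weight gets a Normal prior, ADVI fits an approximate posterior, and each posterior draw then yields one set of event weights, so the unfolded distribution comes with uncertainty rather than a point estimate.

import numpy as np
import pymc as pm

# two-class training set, as in Step 1: label 0 = Sim., label 1 = Data
X = np.concatenate((sim_y, data_y))[:, None]
y = np.concatenate((np.zeros(len(sim_y)), np.ones(len(data_y))))

n_hidden = 16
with pm.Model() as bnn:
    # distributional weights: a standard-normal prior on every weight and bias
    w_in = pm.Normal("w_in", 0., 1., shape=(1, n_hidden))
    b_in = pm.Normal("b_in", 0., 1., shape=(n_hidden,))
    w_out = pm.Normal("w_out", 0., 1., shape=(n_hidden,))
    b_out = pm.Normal("b_out", 0., 1.)

    hidden = pm.math.tanh(pm.math.dot(X, w_in) + b_in)
    f = pm.Deterministic("f", pm.math.sigmoid(pm.math.dot(hidden, w_out) + b_out))
    pm.Bernoulli("obs", p=f, observed=y)

    approx = pm.fit(n=30_000, method="advi")   # or pm.sample(...) for full MCMC
    idata = approx.sample(1_000)

# each posterior draw gives one set of event weights for the Sim. sample,
# i.e. a distribution of unfolding weights instead of a single point estimate
f_draws = idata.posterior["f"].values[0]       # (draws, n_events)
f_sim = f_draws[:, :len(sim_y)]
weight_draws = f_sim / (1. - f_sim)

The activation here is tanh, as in the gallery example; swapping in aesara.tensor.nnet.relu should work the same way.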


Thank you Jesse! I’ll have a look at ADVI and see if I can carry over anything from the example using aesara.

Really appreciate your input!
