Fitting a spectra of gaussians

TFSM · February 1, 2018, 8:42pm

Ultimately I would like to do model selection to find the number of atomic emission spectral lines in a measured spectrum. I would first like to be able to specify the number of gaussians to fit to a spectrum. I see there is a normal mixture and maybe thats what I should be using but I can’t make sense of it. I have been able to fit data with two peaks with the but with the code below I would have to make a function every time I want to specify a different number of gaussians. What I want is a way to have a way to specify an arbitrary number of gaussians.

def gaussian(x,amp,cent,sig):
    return amp*np.exp( -1*(x-cent)**2/2/sig**2)

amp = 1
cent = 0
sig = 0.1
x = np.linspace(-2,2,100)
f_true = gaussian(x,amp,cent,sig) + gaussian(x,1,-1,sig)
noise = np.random.normal(size=len(x))*.002
f = f_true + noise

with Model() as gaussian_model:

    #Parameters with priors
    amp = Uniform('amplitude',lower=0,upper=2)
    cent= Uniform('centroid',lower=-2,upper=2)
    sig = Uniform('sigma',lower=0, upper=4)
    amp2 = Uniform('amplitude2',lower=0,upper=2)
    cent2= Uniform('centroid2',lower=-2,upper=2)
    sig2 = Uniform('sigma2',lower=0, upper=4)

    sigma = Uniform('sigmat',lower=0,upper=1)

    gauss = Deterministic('gauss',amp*np.exp( -1*(cent -x)**2/2/sig**2)+amp2*np.exp( -1*(cent2 -x)**2/2/sig2**2))
    likeli = Normal('y', mu = gauss, sd=sigma,observed = f)

    start=find_MAP()
    step=NUTS()
    trace=sample(1000,step,start=start)

I tried to do something like the snippet below but this caused a huge slow down as well as a terrible fit.

def multi_gaussian(x,amp,cent,sig):
    function=0
    for i in range(0,len(amp)):
        function = function + amp[i]*np.exp(-1*(x-cent[i])**2/2/sig[i]**2)

junpenglao · February 2, 2018, 7:48am

You shouldn’t be using NormalMixture for this kind of data. In short, if you plot the histogram of the data and it looks like a mixture of gaussian, then you should use NormalMixture. Otherwise, what you are trying to do is estimating some parameter of a nonlinear function.

What you are doing is on the right track, trying to implement a better mixture gaussian function should help:

from pymc3.math import logsumexp
import theano.tensor as tt
def mixture_density(w, mu, sd, x):
    logp = tt.log(w) + pm.Normal.dist(mu, sd).logp(x)
    return tt.sum(tt.exp(logp))

w, mu, sd should be theano vector. So you will need to do for example: w = tt.stack([amp, amp2])

TFSM · February 2, 2018, 5:51pm

Thanks, I think I should be able to work with this. Did I just miss documentation on this? I searched for a few days and came up with nothing.

junpenglao · February 2, 2018, 9:40pm

Not for this specific usage. We have a few examples on mixture models but that’s more to model the distribution (historgram) of data.

TFSM · February 7, 2018, 9:35pm

I am confused what is going on with the theano mixture model. How does it know the grid to put the function on? It looks like ‘x’ in your mixture model is not the same x array that I have in the code above?

Are there any examples of fitting non linear functions? I cannot find any after a pretty extensive search.

junpenglao · February 7, 2018, 11:05pm

It is determined by the input x, as the function evaluates on x_1, x_2,…, x_n

You are right there is no good collection of examples dedicated to fitting non-linear functions. I put up a small Gist you might find helpful: Nonlinear_funcs.ipynb · GitHub

StanczakDominik · January 10, 2020, 11:14am

Interestingly, rerunning the last cell in the gist with sample(random_seed=1), it appears that sometimes the two chains disagree on which Gaussian is which:

>>> values = trace.get_values('mu', combine=False)
>>> values[0][-1], values[1][-1]
(array([-1.05009326,  2.59436901]), array([ 2.59135272, -1.01800517]))

This results in r_hat values on each of the n_shape=2 parameters (mu, w, sd) of 1.83 and warnings. Is there anything that can be done to avoid that?

Topic		Replies	Views
Modeling a Spectrum of Gaussians with SMC v5 modeling , smc	11	164	August 5, 2024
Line fitting for synthetic spectra using pymc3 Questions	1	891	January 7, 2020
Fitting Gaussian line using posterior trace samples to double peak gaussian data graph v5 gaussian_process , modeling	10	285	April 3, 2024
Switching between class distributions Questions	2	774	January 12, 2020
Fitting a histogram Questions	8	1604	November 27, 2020

Fitting a spectra of gaussians

Related topics