Doubts on linear model definition for curve fitting

I am using PyMC3 to fit the spectra (light histograms) of galaxies. Basically, you have a linear system:

A = \sum_{i} w_{i} \cdot B_{i}

where A is your observed histogram, B is a user-declared matrix of stellar histograms (one component per row), and w_i is the weight of the i-th component: these are the variables you want to solve for. Graphically, in the plot below you have the purple line and you want to measure how much each of the other curves has contributed to it.
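To make the notation concrete, here is a minimal numpy toy with made-up numbers (two components, four wavelength bins; the real matrix is loaded from file further down):

import numpy as np

B = np.array([[1.0, 2.0, 1.0, 0.5],    # component 0
              [0.5, 0.5, 2.0, 1.0]])   # component 1
w = np.array([0.75, 1.10])             # hypothetical weights

A = w.dot(B)    # observed spectrum = weighted sum of the component curves
print(A)        # [1.3   2.05  2.95  1.475]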


Actually, @jrmullaney posted a similar problem whose issues I share: the sampling becomes much slower over time and as the number of components grows, although I still get very good results.

The answers there explain possible causes for this behaviour, and I am trying to find the culprit.

I would like to reparameterise the model, but I am not sure which direction is right. Since I have very good estimates for the prior mean values, I could centre them at zero, but this causes trouble because physically the weights cannot be negative. Would the scale of the input data matter (see the sketch below for what I mean by scaling)? Maybe the fact that some of the components are similar to each other is hurting the speed, but in that case I do not understand why the sampling gets faster when I increase the likelihood standard deviation.
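For concreteness, this is the kind of input scaling I have in mind (a rough sketch; the max-normalisation is just one illustrative choice, and the variable names refer to the code below):

curve_norms = np.abs(y_CurvesMatrix).max(axis=1, keepdims=True)
y_CurvesScaled = y_CurvesMatrix / curve_norms    # each curve now peaks at |1|
# ... fit w_scaled against y_CurvesScaled instead of y_CurvesMatrix ...
# w_physical = w_scaled / curve_norms.ravel()    # undo the scaling afterwards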

Any advice is most welcome.

(I have included a working code sample below):

from os import environ
environ["MKL_THREADING_LAYER"] = "GNU"
import theano
import theano.tensor as tt
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt

# Define model data
x_array, y_CurvesMatrix = np.arange(4700, 6200), np.loadtxt('stellar_spectra.txt')
weightsTrue = np.array([0.75, 1.1, 0.5, 0.2])       # Declare each curve weight
spectrumTrue = weightsTrue.dot(y_CurvesMatrix)      # Generate synthetic observation

# Add some noise
error_Pixel = 0.02
obsNoise = np.random.normal(0, error_Pixel, size=x_array.size)

# Observed spectrum curve
obsSpectrum = spectrumTrue + obsNoise
# Data for the priors
weights_priors = np.array([0.735, 1.15, 0.525, 0.192]) # Give distribution centers some offset
weights_std = weightsTrue * 0.10
n_weights = weightsTrue.size

# Std for the likelihood
curveUncertainty_array = error_Pixel * np.ones(x_array.size)

# Declare theano variables
curvesMatrix_tt = theano.shared(y_CurvesMatrix)

# Pymc3 model
with pm.Model() as model:

    # Weights prior
    w_i = pm.Normal('w_i', mu=weights_priors, sd=weights_std, shape=n_weights)

    # Synthetic emission of the model
    spectrum = w_i.dot(curvesMatrix_tt)

    # Likelihood
    Y = pm.Normal('Y', mu=spectrum, sd=curveUncertainty_array, observed=obsSpectrum)

    # Sanity-check the starting log-probabilities, then launch the sampler
    print(model.check_test_point())
    trace_nuts = pm.sample(1000, tune=500, step=pm.NUTS())


# Plot model output data
weightsFit_nuts = trace_nuts['w_i'].mean(axis=0)    # Posterior mean of the fitted weights
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
for i in np.arange(y_CurvesMatrix.shape[0]):
    ax.plot(x_array, y_CurvesMatrix[i,:] * weightsTrue[i], label=r'$Curve_{{\,{}}}\cdot w_{{{}}}$'.format(i,i), linestyle=':')
ax.plot(x_array, obsSpectrum, label='Observed spectrum')
ax.plot(x_array, weightsFit_nuts.dot(y_CurvesMatrix), label='Fit spectrum NUTS', linestyle='-')
ax.set(xlabel=r'Wavelength ($\AA$)', ylabel='Flux (normalised)')
ax.legend()
plt.show()
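To check how well the sampler recovers the true weights, I look at the standard PyMC3 diagnostics after sampling:

# Posterior summary and trace diagnostics for the fitted weights
print(pm.summary(trace_nuts))
pm.traceplot(trace_nuts)
plt.show()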

Here is the data: stellar_spectra.txt (146.5 KB)

Scaling the predictor would likely help. Also, you don't need to get too hung up on the weights being negative and thus physically impossible. If you think of the parameter space as being defined on some coordinate system, then as long as the rule of the transformation is clear, it does not matter much that the values are negative in some coordinate system. At the end of the model fitting, you would most likely do inference on, and interpret, the parameters after transforming them back, which satisfies the physical requirement.
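For example, something along these lines (a minimal sketch reusing the variable names from the model above, with the 10% prior width moved to log space): sample a zero-centred, unconstrained offset and map it back to strictly positive weights with a deterministic transform:

with pm.Model() as model_reparam:
    # Zero-centred, unconstrained offsets (~10% prior width in log space)
    log_offset = pm.Normal('log_offset', mu=0.0, sd=0.1, shape=n_weights)
    # Deterministic transform back to the physical, strictly positive weights
    w_i = pm.Deterministic('w_i', weights_priors * tt.exp(log_offset))
    spectrum = w_i.dot(curvesMatrix_tt)
    Y = pm.Normal('Y', mu=spectrum, sd=curveUncertainty_array, observed=obsSpectrum)
    trace_reparam = pm.sample(1000, tune=500)

The sampler then works on a zero-centred, unconstrained space, while the reported w_i always satisfies the positivity constraint.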