Hello everybody, I need a bit of help with a project that uses Bayesian modeling to infer fluorescence-decay parameters of excited fluorophores for use in FRET modeling.
(I am running PyMC 5.16.2 with NumPy 1.26 on an Apple Silicon MacBook, in case that matters.)
My current model code is as follows:
```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt


def fluorescenceModel(r, t, R_0, f_D, data, numGauss, numComponents) -> pm.Model:
    deltaR = r[1] - r[0]
    forsterAcceleration = 1 + np.power(R_0 / r, 6)
    T = t[:, np.newaxis]
    with pm.Model() as fluorescentModel:
        # renamed from 'rateFluorescence' to match the variable (it is a lifetime, K = 1/tau)
        tauFluorescence = pm.TruncatedNormal('tauFluorescence', mu=70, sigma=8, lower=0.0, shape=numComponents)
        alphaFluorescence = pm.Dirichlet('alphaF', a=np.ones(numComponents), shape=numComponents)
        alphaDistance = pm.Dirichlet('alphaD', a=np.ones(numGauss), shape=numGauss)
        muDistance = pm.TruncatedNormal('r_0D', mu=50, sigma=60, lower=0.0, upper=120, shape=numGauss)
        sigmaDistance = pm.TruncatedNormal('sigmaD', mu=10, sigma=13, lower=0.0, upper=30, shape=numGauss)

        K_Transpose = (1 / tauFluorescence)[np.newaxis, :]  # 1 x m vector/matrix
        Alpha = alphaFluorescence[:, np.newaxis]            # m x 1
        lambdaArg = -T @ K_Transpose                        # n x m
        Lambda = pt.exp(lambdaArg)                          # pt.exp keeps the graph symbolic
        I_D = Lambda @ Alpha                                # n x 1, where n = t.size

        AcceleratedTimeMatrix = T @ forsterAcceleration[np.newaxis, :]  # n x k
        Epsilon = pt.zeros((t.size, r.size))                            # n x k
        for j in range(t.size):  # iterate over the n rows (time points), not the k columns
            Epsilon = pt.set_subtensor(
                Epsilon[j, :],
                expData(AcceleratedTimeMatrix[j, :], tauFluorescence, alphaFluorescence))

        P = pt.as_tensor_variable(distDistribution(r, alphaDistance, muDistance, sigmaDistance, numGauss) * deltaR)
        I_DA = Epsilon @ P
        I_F = f_D * I_D + (1 - f_D) * I_DA
        likelihood = pm.Poisson('likelihood', mu=I_F, observed=data[:, np.newaxis])
    return fluorescentModel
```
with some helper functions I wrote being
```python
def distDistribution(r, alpha, mu, sigma, numComp):
    # Weighted mixture of numComp Gaussians evaluated on the distance grid r
    P = 0
    for i in range(numComp):
        P += (alpha[i] / sigma[i]) * np.exp(-0.5 * np.square((r[:, np.newaxis] - mu[i]) / sigma[i]))
    return P / np.sqrt(2 * np.pi)
```
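(As a quick sanity check, in plain NumPy outside the model, this mixture integrates to roughly 1 once the grid covers the support; the weights, means, and widths below are invented purely for illustration:)

```python
import numpy as np

def distDistribution(r, alpha, mu, sigma, numComp):
    # Weighted mixture of numComp Gaussians evaluated on the distance grid r
    P = 0
    for i in range(numComp):
        P += (alpha[i] / sigma[i]) * np.exp(-0.5 * np.square((r[:, np.newaxis] - mu[i]) / sigma[i]))
    return P / np.sqrt(2 * np.pi)

r = np.linspace(0.0, 120.0, 1201)   # illustrative distance grid
deltaR = r[1] - r[0]
P = distDistribution(r, np.array([0.3, 0.7]), np.array([40.0, 60.0]),
                     np.array([5.0, 8.0]), 2)
print(P.shape)             # (1201, 1): a column vector over the r grid
print((P * deltaR).sum())  # close to 1.0 when the grid covers both Gaussians
```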
and
```python
def expData(
        domain: np.ndarray,
        tau: np.ndarray,
        alphas: np.ndarray) -> np.ndarray:
    arg = domain[:, np.newaxis] / tau[np.newaxis, :]  # n x m array
    arg = np.exp(-arg)                                # n x m array
    ans = arg @ alphas[:, np.newaxis]                 # n x m @ m x 1 = n x 1 2D array
    return ans.flatten()
```
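(For reference, with plain NumPy inputs the helper behaves like this; the lifetimes and weights here are arbitrary illustration values, not my actual priors:)

```python
import numpy as np

def expData(domain, tau, alphas):
    arg = domain[:, np.newaxis] / tau[np.newaxis, :]  # n x m
    arg = np.exp(-arg)
    ans = arg @ alphas[:, np.newaxis]                 # n x 1
    return ans.flatten()

t = np.linspace(0.0, 200.0, 6)
tau = np.array([60.0, 80.0])   # illustrative lifetimes
alphas = np.array([0.4, 0.6])  # weights summing to 1
decay = expData(t, tau, alphas)
print(decay.shape)  # (6,)
print(decay[0])     # 1.0 at t = 0, since the weights sum to 1
```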
The model works for minuscule ranges of inputs, such as when the `r` and `t` parameters are `np.ndarray`s of size 5 and 6, respectively. However, when I increase the size of these parameters (I obtain them using `np.linspace`), the sampling time blows up massively, to the point where I haven't managed to obtain samples even after waiting for ~1 hour. I have verified with rudimentary `print` statements that the graph computations for the matrices finish quickly (< 0.01 s) after the function is called with the appropriate arguments.
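(For concreteness, the timing check is nothing fancier than the sketch below, with a hypothetical stand-in for the model constructor:)

```python
import time

def build_model():
    # Hypothetical stand-in for fluorescenceModel(...); in the real code this
    # step only builds the PyTensor graph, and no sampling happens here.
    time.sleep(0.001)

start = time.perf_counter()
build_model()
elapsed = time.perf_counter() - start
print(f"graph construction took {elapsed:.4f} s")
```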
Does anyone have recommendations on how I could optimize my model? Also, if anything seems glaringly wrong with the implementation, please do let me know.
(Briefly edited: a function definition was cut off in the original post.)