I am trying to fit a model according to some data.
The data I have are supposed to obey the following formula:
\mu_{g, k} = (\mu_0 + \frac{\alpha_g}{\delta_g})\exp(-\delta_g t_k) + \frac{\alpha_g}{\delta_g} + \beta_g \exp(-\delta_g t_k) \sum_{i=0}^{k-1} \exp(\delta_g t_i) \frac{\eta_i}{\gamma_g+\eta_i} (t_{i+1}-t_i)
The parameters are as follows: (\alpha_g, \beta_g, \gamma_g, \delta_g) for g \in \{1, \cdots, 12\} and \eta_k for k \in \{0, \cdots, 9\}. The t_k for k \in \{0, \cdots, 10\} are given (time points where the data was collected).
I tried to implement the procedure in Python using the pymc
library. I’m totally new to the subject. I read a good part of the documentation, and totally the get started page.
I came up with the following code
# Estimate
# 1. for each of the 12 genes under consideration the 4 parameters encoding their expression level
# 2. and for the (putative) transcription factor the 10 values describing its activity over time
# using the find_MAP procedure of the pymc3 library.
from io import BytesIO
import matplotlib.pyplot as plt
plt.style.use('seaborn-darkgrid')
import numpy as np
import pymc3 as pm
print('Running on NumPy v{}'.format(np.__version__))
print('Running on PyMC3 v{}'.format(pm.__version__))
Binary = open('data.txt').read().replace(',','.').encode()
expr = np.genfromtxt(BytesIO(Binary), missing_values = 'NA', filling_values = np.nan, skip_header = 1)
C, R, T, G = 2, 4, 11, 12 # condition, replicat, time, gene
exprcrtg = expr[ : , 3 : ].reshape(C, R, T, G)
time = np.array([0., 0.5, 1., 3.5, 6.5, 9.25, 12.5, 18.5, 22.5, 26.5, 30.5])
with pm.Model() as Basic_model:
# Priors for unknown model parameters
alpha = pm.HalfNormal('alpha', sd = 10, shape = G)
beta = pm.HalfNormal('beta' , sd = 10, shape = G)
gamma = pm.HalfNormal('gamma', sd = 10, shape = G)
delta = pm.HalfNormal('delta', sd = 10, shape = G)
eta = pm.HalfNormal('eta' , sd = 10, shape = T-1)
sigma = pm.HalfNormal('sigma', sd = 10, shape = G)
# Expected value of outcome
alpha = alpha[np.newaxis, : ]
beta = beta [np.newaxis, : ]
gamma = gamma[np.newaxis, : ]
delta = delta[np.newaxis, : ]
eta = eta [ : ,np.newaxis]
time = time [ : ,np.newaxis]
mu0 = np.nanmean(exprcrtg[ : , : , 0, : ], axis = (0, 1))
integral = np.cumsum(np.exp(delta*time[:-1]) * eta / (gamma + eta) * (time[1:] - time[:-1]))
mu = (mu0 - alpha/delta)*np.exp(-delta*time[1:]) + alpha/delta + beta*np.exp(-delta*time[1:])*integral
# Likelihood (sampling distribution) of observations
Y_obs = pm.Normal('Y_obs', mu = mu, sd = sigma, observed = exprcrtg[ : , : , 1: , : ])
# Maximum a posteriori estimate
map_estimate = pm.find_MAP(model = Basic_model)
print(map_estimate)
I pasted the data here.
I get the following error:
ValueError: setting an array element with a sequence.
To eliminate the possibility of a trival error, I’ve tried this bit of code outside of the pm.Model()
and it shows no error.
So my guess is that I am making a wrong use of pm.Model()
but I have no clue as to what it could be.
I am very willing to share more information about the model if it can help to solve my issue.