I am trying to fit a series of parameter inputs to a function where the inputs are able to change over time (and eventually space).
The items follow a trajectory which can be approximated by:
Q is a quantity that varies for each item over time.
are assumed constant for each item.
Assuming each item i is pulled sequentially in time and some form of a gaussian process sets the
q_i, d_i, b_i for the item then:
where t_i is the time that sample i was drawn.
As an example the following code creates a data set
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
rng = np.random.default_rng(seed=0)
n = 30 # number of samples
t_i = np.linspace(0,100, n)[:, None]
# the samples are pulled at even intervals of time between t=0 and t=100
# create a generic function w.r.t. time
ℓ_true = 10.0
η_true = 3.0
cov_func = (η_true ** 2) * pm.gp.cov.Matern52(1, ℓ_true)
mean_func = pm.gp.mean.Zero()
f_true = rng.multivariate_normal(
mean_func(t_i).eval(),
cov_func(t_i).eval() + 1e-8*np.eye(n),
1
).flatten()
# create the individual samples scaling the inputs as appropriate for my case
# give each parameter a little noise as well
q_is = 2000+f_true*100+rng.standard_t(10,size=n)*50
d_is = .03-f_true/500+rng.standard_t(10,size=n)*.002
b_is = .95+f_true/20+rng.standard_t(10, size=n)*.01
# the range of Q that samples are monitored through
Q = np.linspace(0,100000, 100)
def equation(Q, qi, di, bi):
return (qi**(-bi)*(Q*di*(bi-1)+qi))**(1/(1-bi))
# Create simulation where each row is a sample of qi, di, bi
# and each column is the value of the equation at the respective Q
sample_data = equation(
np.tile(Q,(len(q_is),1)),
np.tile(q_is,(len(Q),1)).T,
np.tile(d_is, (len(Q),1)).T,
np.tile(b_is, (len(Q), 1)).T
)
# separate the feature data (time in this case) for each item in a dataframe
feature_data = pd.DataFrame(
np.hstack([np.arange(n)[:, None],t_i]),
columns = ['sample_id', 'sample_init_time']
)
# create a dataframe identifying sample_id with the equation output and respective Q
# first flatten and expand the arrays appropriately
state = sample_data.ravel()
sample_id =np.tile(np.arange(n), (len(Q),1)).T.ravel()
Qs = np.tile(Q, (n,1)).ravel()
state_data = pd.DataFrame(
np.array([sample_id, state, Qs]).T,
columns = ['sample_id', 'state', 'Q']
)
# add noise to the true state
state_data["measured_values"] = state_data['state']+100*rng.standard_t(3, size=len(state_data))
# clip negative values because in this case they would not be possible
state_data["measured_values"] = state_data['measured_values']*(state_data['measured_values']>0)
state_data['sample_id'] = state_data['sample_id'].astype(int)
Not sure if that is clear but hopefully you can see that we take a sample and we watch the decay. Some time later we take a different sample and that decay profile might be similar to the previous decay profile or a least we would expect it to be.
Given that this process happens what are the best estimates of qi, di, bi for each sample. Now the model
with pm.Model() as model:
# I start off with a lot of the variables appropriately scaled
# so that the sampler is likely drawing from random variables that are within
# a couple of orders of magnitude between the largest and smallest variables.
# I have read that can help
q_scaled = pm.HalfCauchy("qi_scaled", 2)
d_scaled = pm.HalfCauchy("d_scaled", 2)
b_scaled = pm.Beta("b_scaled", alpha=1, beta=1, testval=.65)
# have to set a test value different than mean because function is undefined at b=1
# when this is rescaled by multiplying by 2 the mean would be 1 without that
ls = pm.InverseGamma("ls", 1, 1)
q_η = pm.HalfCauchy("q_η", 1)
d_η = pm.HalfCauchy("d_η", 1)
b_η = pm.HalfCauchy("b_η", 1)
q_cov = (q_η**2)*pm.gp.cov.ExpQuad(1, ls)
q_gp = pm.gp.Latent(cov_func=q_cov)
q_t = q_gp.prior("q_t", X= feature_data.sample_init_time.values[:,None])
d_cov = (d_η**2)*pm.gp.cov.ExpQuad(1, ls)
d_gp = pm.gp.Latent(cov_func=d_cov)
d_t = d_gp.prior("d_t", X= feature_data.sample_init_time.values[:,None])
b_cov = (b_η**2)*pm.gp.cov.ExpQuad(1, ls)
b_gp = pm.gp.Latent(cov_func=b_cov)
b_t = b_gp.prior("b_t",X= feature_data.sample_init_time.values[:,None])
qi = pm.Deterministic("qi", (q_scaled+q_t)*1000)
di = pm.Deterministic("di", (d_scaled+d_t)/100)
bi = pm.Deterministic("bi", (b_scaled+b_t)*2)
q = qi[state_data.sample_id.values]
d = di[state_data.sample_id.values]
b = bi[state_data.sample_id.values]
base = q**(-b)*(state_data.Q.values*d*(b-1)+q)
exponent = (1/(1-b))
# add potential where base goes negative because base**exponent is likely to be undefined
undefined = pm.Potential("undefined", pm.Exponential.dist(1).logp(-base*(base<0)))
base_ = base*(base>0)
mu = base_**exponent
ν = pm.InverseGamma("measurement_nu", 1,1)
σ = pm.HalfCauchy("measurement_sigma_scaled", 1)
obs = pm.StudentT(
"observed",
mu=mu,
sigma=σ*100,
nu=ν,
observed =state_data.measured_values.values
)
and then trying to sample
with model:
trace=pm.sample(tune=1000, draws=1000, return_inferencedata=True)
I get the failure
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [measurement_sigma_scaled, measurement_nu, b_t_rotated_, d_t_rotated_, q_t_rotated_, b_η, d_η, q_η, ls, b_scaled, d_scaled, qi_scaled]
0.00% [0/8000 00:00<00:00 Sampling 4 chains, 0 divergences]
---------------------------------------------------------------------------
RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\compile\function\types.py", line 974, in __call__
self.fn()
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\graph\op.py", line 476, in rval
r = p(n, [x[0] for x in i], o)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\tensor\slinalg.py", line 253, in perform
rval = scipy.linalg.solve_triangular(A, b, lower=False)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\scipy\linalg\basic.py", line 334, in solve_triangular
b1 = _asarray_validated(b, check_finite=check_finite)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\scipy\_lib\_util.py", line 262, in _asarray_validated
a = toarray(a)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\numpy\lib\function_base.py", line 488, in asarray_chkfinite
raise ValueError(
ValueError: array must not contain infs or NaNs
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\pymc3\parallel_sampling.py", line 137, in run
self._start_loop()
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\pymc3\parallel_sampling.py", line 191, in _start_loop
point, stats = self._compute_point()
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\pymc3\parallel_sampling.py", line 216, in _compute_point
point, stats = self._step_method.step(self._point)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\pymc3\step_methods\arraystep.py", line 276, in step
apoint, stats = self.astep(array)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\pymc3\step_methods\hmc\base_hmc.py", line 139, in astep
start = self.integrator.compute_state(q0, p0)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\pymc3\step_methods\hmc\integration.py", line 44, in compute_state
logp, dlogp = self._logp_dlogp_func(q)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\pymc3\model.py", line 734, in __call__
output = self._theano_function(array)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\compile\function\types.py", line 987, in __call__
raise_with_op(
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\link\utils.py", line 508, in raise_with_op
raise exc_value.with_traceback(exc_trace)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\compile\function\types.py", line 974, in __call__
self.fn()
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\graph\op.py", line 476, in rval
r = p(n, [x[0] for x in i], o)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\tensor\slinalg.py", line 253, in perform
rval = scipy.linalg.solve_triangular(A, b, lower=False)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\scipy\linalg\basic.py", line 334, in solve_triangular
b1 = _asarray_validated(b, check_finite=check_finite)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\scipy\_lib\_util.py", line 262, in _asarray_validated
a = toarray(a)
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\numpy\lib\function_base.py", line 488, in asarray_chkfinite
raise ValueError(
ValueError: array must not contain infs or NaNs
Apply node that caused the error: Solve{A_structure='upper_triangular', lower=False, overwrite_A=False, overwrite_b=False}(InplaceDimShuffle{1,0}.0, Elemwise{Composite{((i0 * i1) - i2)}}[(0, 0)].0)
Toposort index: 237
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(30, 30), (30, 30)]
Inputs strides: [(240, 8), (240, 8)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[InplaceDimShuffle{1,0}(Solve{A_structure='upper_triangular', lower=False, overwrite_A=False, overwrite_b=False}.0)]]
Backtrace when the node is created(use Theano flag traceback__limit=N to make it longer):
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1393, in access_grad_cache
term = access_term_cache(node)[idx]
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1061, in access_term_cache
output_grads = [access_grad_cache(var) for var in node.outputs]
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1061, in <listcomp>
output_grads = [access_grad_cache(var) for var in node.outputs]
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1393, in access_grad_cache
term = access_term_cache(node)[idx]
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1061, in access_term_cache
output_grads = [access_grad_cache(var) for var in node.outputs]
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1061, in <listcomp>
output_grads = [access_grad_cache(var) for var in node.outputs]
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1393, in access_grad_cache
term = access_term_cache(node)[idx]
File "C:\Users\rlarson\Anaconda3\envs\complex_prod\lib\site-packages\theano\gradient.py", line 1220, in access_term_cache
input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
"""
I have tried this with a linear model and do not get this type of error. There is a little bug where b could be close to 1 causing \frac{1}{1-b} to be undefined but without any samples it is hard to tell if this is the spot that is causing the issue. If it is are there any ideas on how best to smooth over that spot?
Update: I used a cubic spline function to smooth over the discontinuity. The spline allows for both a smooth function and a smooth gradient with respect to b. I chose to use the spline for values of b
The error still happens. On the other hand if I initialize with find_MAP it runs.
Thanks in advance.