Indexing using a free random variable

charlotte.burki · March 13, 2018, 10:26am

Hi,

I’m trying to use Pymc3 to find the appropriate number of componements used to predict a time serie of concentrations called aCH and aCOH. These are two distincts PLS models. Each model can use between 1 and 20 number of components. The predictions are stored in an ndarray with the time series prediction. aCH_[0,1] would be the predictions using 2 components. Such as:

This is similar for aCOH

The array is then shared with theano such as:

aCH_=theano.shared(aCH_)
aCOH_=theano.shared(aCOH_)

The model that I’m trying to tune is defined as:

OC=b1*aCH+aCOH

Now I tried to define my model with the number of components defined as free random variables from a discrete uniform distribution and the coefficient b1 used to multiply the of aCH prediction is defined as a uniform continuous free random variable.

from scipy import optimize
basic_model = pm.Model()

with basic_model:

    # Priors for unknown model parameters
    b1 = pm.Uniform('b1', lower=0.3, upper=0.5,testval=0.45).astype('float32')
    ncomp_aCH = pm.DiscreteUniform('ncomp_aCH', lower=0, upper=19,testval=10)
    ncomp_aCOH = pm.DiscreteUniform('ncomp_aCOH', lower=0, upper=19,testval=10)

    aCH=aCH_[0,ncomp_aCH]
    aCOH=aCOH_[0,ncomp_aCOH]


    out= b1*aCH+aCOH


    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=out, tau=sigma, observed=Y)
    step = pm.Metropolis()
    trace = pm.sample(100000, step)

where Y is a vector of true measurements and sigma a vector of my measurements uncertainties.

Now when I run my model I have a index out of bonds error. I also tried to used a Categorical distribution to generate my number of components free random variables as suggested here: Issue with DiscreteUniform distribution.
But I get the same error

Such as (just showing the beginning of the error message as it is very long:

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
“”"
Traceback (most recent call last):
File “C:\ProgramData\Anaconda3\lib\site-packages\theano\compile\function_module.py”, line 903, in call
self.fn() if output_subset is None else
IndexError: index out of bounds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “C:\ProgramData\Anaconda3\lib\site-packages\joblib_parallel_backends.py”, line 350, in call
return self.func(*args, **kwargs)
File “C:\ProgramData\Anaconda3\lib\site-packages\joblib\parallel.py”, line 131, in call
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File “C:\ProgramData\Anaconda3\lib\site-packages\joblib\parallel.py”, line 131, in
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File “C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py”, line 526, in _sample
for it, strace in enumerate(sampling):
File “C:\ProgramData\Anaconda3\lib\site-packages\tqdm_tqdm.py”, line 959, in iter
for obj in iterable:
File “C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py”, line 624, in _iter_sample
point, states = step.step(point)
File “C:\ProgramData\Anaconda3\lib\site-packages\pymc3\step_methods\compound.py”, line 24, in step
point, state = method.step(point)
File “C:\ProgramData\Anaconda3\lib\site-packages\pymc3\step_methods\arraystep.py”, line 162, in step
apoint, stats = self.astep(self.bij.map(point))
File “C:\ProgramData\Anaconda3\lib\site-packages\pymc3\step_methods\metropolis.py”, line 162, in astep
accept = self.delta_logp(q, q0)
File “C:\ProgramData\Anaconda3\lib\site-packages\theano\compile\function_module.py”, line 917, in call
storage_map=getattr(self.fn, ‘storage_map’, None))
File “C:\ProgramData\Anaconda3\lib\site-packages\theano\gof\link.py”, line 325, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File “C:\ProgramData\Anaconda3\lib\site-packages\six.py”, line 692, in reraise
raise value.with_traceback(tb)
File “C:\ProgramData\Anaconda3\lib\site-packages\theano\compile\function_module.py”, line 903, in call
self.fn() if output_subset is None else
IndexError: index out of bounds
Apply node that caused the error: Subtensor{int64, int64}(<TensorType(float64, 3D)>, Constant{0}, ScalarFromTensor.0)
Toposort index: 11
Inputs types: [TensorType(float64, 3D), Scalar(int64), Scalar(int64)]
Inputs shapes: [(1, 20, 119), (), ()]
Inputs strides: [(19040, 952, 8), (), ()]
Inputs values: [‘not shown’, 0, 21]
Outputs clients: [[Elemwise{Composite{((i0 * sqr((i1 - (i2 + i3)))) + i4)}}(TensorConstant{[-0.372381…29548011]}, TensorConstant{[ 2.449864…74623214]}, Elemwise{mul,no_inplace}.0, Subtensor{int64, int64}.0, TensorConstant{[-2.825712…05703082]})]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File “C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py”, line 235, in dispatch_shell
handler(stream, idents, msg)
File “C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py”, line 399, in execute_request
user_expressions, allow_stdin)
File “C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\ipkernel.py”, line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File “C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py”, line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File “C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py”, line 2698, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File “C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py”, line 2802, in run_ast_nodes
if self.run_code(code, result):
File “C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py”, line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File “”, line 12, in
aCOH=aCOH_[0,ncomp_aCOH]

HINT: Use the Theano flag ‘exception_verbosity=high’ for a debugprint and storage map footprint of this apply node.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py”, line 119, in worker
result = (True, func(*args, **kwds))
File “C:\ProgramData\Anaconda3\lib\site-packages\joblib_parallel_backends.py”, line 359, in call
raise TransportableException(text, e_type)
joblib.my_exceptions.TransportableException: TransportableException

IndexError Tue Mar 13 11:29:00 2018
PID: 17988 Python 3.6.3: C:\ProgramData\Anaconda3\python.exe
…
C:\ProgramData\Anaconda3\lib\site-packages\joblib\parallel.py in call(self=<joblib.parallel.BatchedCalls object>)
126 def init(self, iterator_slice):
127 self.items = list(iterator_slice)
128 self.size = len(self.items)
129
130 def call(self):
–> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
self.items = [(, (0, True, 570756469, {'b1_interval_’: array(1.09861229), ‘ncomp_aCH’: array(10, dtype=int64), ‘ncomp_aCOH’: array(10, dtype=int64)}), {‘draws’: 100500, ‘live_plot’: False, ‘live_plot_kwargs’: None, ‘model’: <pymc3.model.Model object>, ‘step’: <pymc3.step_methods.compound.CompoundStep object>, ‘trace’: None, ‘tune’: 500})]
132
133 def len(self):
134 return self._size
135

junpenglao · March 13, 2018, 11:43am

Could you share your data as well?

charlotte.burki · March 13, 2018, 12:08pm

Yes sure, here are the data aCOH_, aCH_ , Y and sigma. These are numpy objects.

aCH.npy (18.7 KB)

aCOH.npy (18.7 KB)

charlotte.burki · March 13, 2018, 12:08pm

In two different posts because I’m not allowed to put more than two attachements.

sigma.npy (1.1 KB)

Y.npy (1.1 KB)

junpenglao · March 13, 2018, 12:34pm

Yes using Metropolis could produce invalid proposal as described in the other post you mentioned, but using Categorical distribution seems to work:

n = aCH_.eval().shape[1]
with pm.Model() as basic_model:
    # Priors for unknown model parameters
    b1 = pm.Uniform('b1', lower=0.3, upper=0.5, testval=0.45)
    ncomp_aCH = pm.Categorical('ncomp_aCH', p=np.ones(n)/n)
    ncomp_aCOH = pm.Categorical('ncomp_aCOH', p=np.ones(n)/n)

    aCH=aCH_[0, ncomp_aCH]
    aCOH=aCOH_[0, ncomp_aCOH]

    out= b1*aCH+aCOH

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=out, tau=sigma, observed=Y)

    trace = pm.sample(10000)

charlotte.burki · March 13, 2018, 12:38pm

Thanks, yes it works!

But there is no way to make it work using Metropolis?

junpenglao · March 13, 2018, 12:44pm

Currently, Metropolis still suffer the same problem as explained in Issue with DiscreteUniform distribution…

charlotte.burki · March 13, 2018, 12:45pm

Alright thanks!

charlotte.burki · March 22, 2018, 11:31am

I have run this model with 8 millions iterations with 1 millions tuning steps and still have a Gelman Rubin above 1.4 and a lot of divergences. Do you know have any idea of what causes that problem? Increasing target_accept doesn’t help either.

junpenglao · March 22, 2018, 12:11pm

More iterations doesn’t help when there are problems in your model. In this case, a few suggestion:

Increase target_accept leads to smaller step size, but it only helps when you are seeing <10 divergences. When there are lots of divergent, it means something is wrong with the model
looking at the output warning, one chain output 49 divergences, while the other one output 5e6 divergence(!). It seems the chain is stuck in some high curvature area. A better starting point could help
However, the sampling is likely still problematic as you are mixing CategoricalGibbsSampler with NUTS. Mixing discrete parameters with continuous using NUTS is not well tested. Reading from your code it seems you are trying to learn a sparse weight coefficient in

    aCH=aCH_[0, ncomp_aCH]
    aCOH=aCOH_[0, ncomp_aCOH]
    out= b1*aCH+aCOH

A better way to do is model b1 as a vector using a regulated prior to impose sparseness, eg regularized horseshoe, and do matrix multiplication.

charlotte.burki · March 22, 2018, 1:19pm

Thanks for the advices! I’m going to try to mix Categorical GibbsSampler with Metropolis instead of Nuts as these sampling methods are closer. Concerning the b1 coefficient, it really should be a scaler and not a vector. It is a physical property that should be the same for each sample.

charlotte.burki · March 23, 2018, 8:53am

If I use Metropolis, then the variables updated by Categorical Gibbs Sampler are not updated at all. This is very strange. Do you have any idea why?

junpenglao · March 23, 2018, 1:32pm

Try different start values.

charlotte.burki · March 23, 2018, 1:54pm

I tried. Doesn’t move from the starting values

junpenglao · March 23, 2018, 10:18pm

I had another look at the model, with the following code:

with pm.Model() as basic_model:
    # Priors for unknown model parameters
    b1 = pm.Uniform('b1', lower=0.3, upper=0.5, testval=0.45)
    ncomp_aCH = pm.Categorical('ncomp_aCH', p=np.ones(n)/n)
    ncomp_aCOH = pm.Categorical('ncomp_aCOH', p=np.ones(n)/n)

    aCH=aCH_[0, ncomp_aCH]
    aCOH=aCOH_[0, ncomp_aCOH]

    out= b1*aCH+aCOH

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=out, tau=sigma, observed=Y)
    trace = pm.sample(discard_tuned_samples=False)

which gives me a trace:

So looking at the trace, The problem seems to be that ncomp_aCH is stuck after a few step. Either it is highly likely (think of the posterior being almost like a delta function) or there is some problem with the sampler.

I verify the logp of ncomp_aCH using the following:

point = trace[-1]
point
# {'b1': 0.4980582094475776,
#  'b1_interval__': 4.624950462026797,
#  'ncomp_aCH': 14,
#  'ncomp_aCOH': 17}
for i in range(n):
    point['ncomp_aCH']=i
    print(point['ncomp_aCH'], basic_model.logp(point))

0 -1110.9799301118057
1 -1118.7101039762993
2 -1128.3039141889512
3 -1125.591418583891
4 -1133.5723103615805
5 -1134.252739856408
6 -1131.6183448546024
7 -1132.7872247881817
8 -1111.4298570895123
9 -1134.4647708083562
10 -1144.1176821443162
11 -1141.1537636543512
12 -1097.8710713143184
13 -1086.384805832985
14 -1038.9504119754274
15 -1053.7949113856769
16 -1102.021203844887
17 -1102.374012701112
18 -1104.585462412219
19 -1148.0232245001032

Looking from the output, ncomp_aCH=14 does have the lowest logp. In fact, the dlogp between ncomp_aCH=14 and ncomp_aCH=15 is np.exp(1038.9504119754274-1053.7949113856769) = 3.573681299085053e-07 (the acceptance proability), which is why all the proposal value are rejected.
I think you should double check your model, as the output from the sampler seems to be correct.

junpenglao · March 29, 2018, 5:22am

I think there is a way to marginalized instead of indexing. The thought behind it is that when you computed the posterior expectation of ncomp_aCH or ncomp_aCOH, it’s a weight vector with n elements that summed to 1. So instead of restricting the vector to take value only 0 or 1 (ie an index vector), let it take values between 0 and 1:

with pm.Model() as basic_model:
    # Priors for unknown model parameters
    b1 = pm.Uniform('b1', lower=0.3, upper=0.5)
    ncomp_aCH = pm.Dirichlet('ncomp_aCH', a=np.ones(n)/n)
    ncomp_aCOH = pm.Dirichlet('ncomp_aCOH', a=np.ones(n)/n)

    aCH = aCH_.dot(ncomp_aCH)
    aCOH = aCOH_.dot(ncomp_aCOH)

    out = b1 * aCH + aCOH

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=out, tau=sigma, observed=Y)
    trace = pm.sample(1000, tune=2000)

By using a Dirichlet with alpha < 1., the resulting weight will be sparse (with the extreme being a index vector).
However, the above model is difficult to sample (lots of divergences), with some reparameterization:

import theano.tensor as tt
Dirprior = 1./n
with pm.Model() as basic_model:
    # Priors for unknown model parameters
    b1 = pm.Uniform('b1', lower=0.3, upper=0.5)

    beta1 = pm.Gamma('beta1', alpha=Dirprior, beta=1., shape=n)
    ncomp_aCH = pm.Deterministic('ncomp_aCH', beta1/tt.sum(beta1))
    beta2 = pm.Gamma('beta2', alpha=Dirprior, beta=1., shape=n)
    ncomp_aCOH = pm.Deterministic('ncomp_aCOH', beta2/tt.sum(beta2))

    aCH = aCH_.dot(ncomp_aCH)
    aCOH = aCOH_.dot(ncomp_aCOH)

    out = b1 * aCH + aCOH

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=out, tau=sigma, observed=Y)
    trace = pm.sample(1000, tune=2000)

pm.forestplot(trace,varnames=['b1', 'ncomp_aCH', 'ncomp_aCOH']);

The result is quite similar with using indexing but IMO this model is much better.

charlotte.burki · March 29, 2018, 7:39am

This sounds like a very intersting option. I will give it a try, will come back to you. Thank you!

charlotte.burki · March 29, 2018, 9:16am

Could you precise how you modified aCH_ and aCOH_ because how I send it to you it doesn’t have the right dimension. It has the dimension:

shapes (20,92)
So I tried to transpose it using numpy.transpose but it doesn’ work.

Also you set n=20 right?

junpenglao · March 29, 2018, 1:07pm

I put everything in a gist: https://gist.github.com/junpenglao/fde55c4ac0f4607e47ea98305c8ffa8c

charlotte.burki · March 29, 2018, 1:49pm

That’s great! Thank you very much for all the time you spend playing with the model!

Topic		Replies	Views
Indexes of RVs in modeling? Development theano	0	487	November 20, 2020
RV-Dependent Observations from Irregularly-Shaped Dataset Questions	8	949	July 26, 2017
Bayesian multilevel model indexing problems Questions	1	509	June 4, 2021
Model doesn't contain any free variables error Questions	8	1904	December 19, 2020
Multiple covariance structures v3 covariance	0	385	March 19, 2022

Indexing using a free random variable

Related topics