Trouble with Initilization and Posterior Sampling

theano

#1

Hello, I am a newbie at PyMC and probabilistic computing and created a model using a custom likelihood. When I try and implement the model using NUTS, however, I get an error of bad init energy: inf. I’ve gone through the debugging of theano and can’t understand why this error is being created. Any help would be much appreciated.

Additionally, I am trying to sample from the posterior predictive of this model. However, I get an error that no random attribute was found. How does one create a “random attribute”?

The custom class is defined as

class univautoreg(Continuous):
    def __init__(self,theta0=None, v=None, *args, **kwargs):
        super(univautoreg, self).__init__(*args, **kwargs)
        self.theta0 = theta0
        self.mean = tt.as_tensor_variable(0.)
        self.v=v
    def logp(self,value):
        theta0=self.theta0
        v=self.v
        theta=theta0.flatten()
        valuenew=value.flatten()
        #consider doing tensor version of this to speed up
        def calc_next_err(value,last_value,theta_t,theta_last,vin): #calculates next perior error epsilon_t, nu_t
            this_period=pm.math.stack([value,theta_t],axis=0)
            meanval=theta_t*last_value
            last_period=pm.math.stack([meanval,theta_last],axis=0)
            err=this_period-last_period
            errnew=err.reshape([2,1])
            lik= MvNormal.dist(mu=np.zeros(2), cov=vin).logp(errnew) #likelihood of each is from multivariate normal #check if possible misspecification here
            return lik
        err_like, _  = scan(fn=calc_next_err,
                         sequences=[dict(input=valuenew, taps=[0,-1]),dict(input=theta, taps=[0,-1])],
                         outputs_info=[None],
                         non_sequences=[v]
                        )
        liketotal=tt.sum(err_like)
        return (liketotal)

and the model is

theta_bar = .5
P_bar = 2
P_bar_sd = np.sqrt(P_bar)
with pm.Model() as model1:
    #Variance function
    BoundedNorm = pm.Bound(pm.Normal,lower=-1.0,upper=1.0) #defines bounded normal
    BoundedGRW = pm.Bound(pm.GaussianRandomWalk, lower=-1.0, upper=1.0) #defines bounded gaussian random walk
    packed_cov = pm.LKJCholeskyCov('chol_cov', n=2, eta=2., sd_dist=pm.HalfCauchy.dist(beta=1.5)) #equivalent of p(V), gives prior on cholesky of COV
    chol = pm.expand_packed_triangular(2,packed_cov) #generates cholesky decomp. from packed triangular matrix in previous problem
    v=pm.Deterministic('v',tt.dot(chol,chol.T)) #generates cov matrix from cholesky decomposition
    sdtemp=tt.sqrt(tt.diag(v)) #takes sqrt of each variance to get sd
    sd = sdtemp[1] #recovers sd
    #Theta prior
    thetainit=BoundedNorm.dist(mu=theta_bar,sd=P_bar_sd)
    #Theta Updating Process
    theta=BoundedGRW('theta', sd=sd, init=thetainit, shape=len(y_obs)) #gives prior on sequence of thetas
    #Likelihood function
    like = univautoreg('like', theta0=theta, v=v, observed=y_obs) #specifies model from class
    step =pm.NUTS() #sets type of model
    start= pm.find_MAP() #see if I can figure out why it is generating NA or inf here.
    tracemodel1 = pm.sample(tune=2000,cores=4,start=start, samples=200,step=step) #runs sample on posterior

The output is

logp = 3,221.6, ||grad|| = 1,441.4: 100%|██████████| 119/119 [00:02<00:00, 51.18it/s] 
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta_interval__, chol_cov_cholesky_cov_packed__]
  0%|          | 0/2500 [00:00<?, ?it/s]

A partial summary of the error is:

raceback (most recent call last):
  File "~/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 350, in __call__
    return self.func(*args, **kwargs)
  File "~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 131, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py", line 554, in _sample
    for it, strace in enumerate(sampling):
  File "~/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 930, in __iter__
    for obj in iterable:
  File "~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py", line 652, in _iter_sample
    point, states = step.step(point)
  File "~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py", line 222, in step
    apoint, stats = self.astep(array)
  File "~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 117, in astep
    'might be misspecified.' % start.energy)
ValueError: Bad initial energy: inf. The model might be misspecified.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "~/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 359, in __call__
    raise TransportableException(text, e_type)
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
ValueError                                         Tue Jun 12 16:38:52 2018
PID: 86414       Python 3.6.3: ~/anaconda3/bin/python
...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _sample>, (0, True, 102429697, {'chol_cov': array([  1.06861937,   0.41486711, -35.3894836 ]), 'chol_cov_cholesky_cov_packed__': array([  1.06861937,   0.41486711, -35.3894836 ]), 'theta': array([ 0.16569299, -0.25800513, -0.83105768, -0...6, -0.9114371 ,  0.87463396,
       -0.27778238]), 'theta_interval__': array([ 0.33446958, -0.52794019, -2.38309172, -2...3, -3.07189759,  2.70493092,
       -0.57055482]), 'v': array([[8.47600078, 1.20782636],
       [1.20782636, 0.17211472]])}), {'draws': 2500, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'samples': 200, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 2000})]
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _sample>
        args = (0, True, 102429697, {'chol_cov': array([  1.06861937,   0.41486711, -35.3894836 ]), 'chol_cov_cholesky_cov_packed__': array([  1.06861937,   0.41486711, -35.3894836 ]), 'theta': array([ 0.16569299, -0.25800513, -0.83105768, -0...6, -0.9114371 ,  0.87463396,
       -0.27778238]), 'theta_interval__': array([ 0.33446958, -0.52794019, -2.38309172, -2...3, -3.07189759,  2.70493092,
       -0.57055482]), 'v': array([[8.47600078, 1.20782636],
       [1.20782636, 0.17211472]])})
        kwargs = {'draws': 2500, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'samples': 200, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 2000}
    132 
    133     def __len__(self):
    134         return self._size
    135 
...
p(self=<pymc3.step_methods.hmc.nuts.NUTS object>, q0=array([ 3.34469580e-01, -5.27940186e-01, -2.3830...37e+00,
        4.14867113e-01, -3.53894836e+01]))
    112         start = self.integrator.compute_state(q0, p0)
    113 
    114         if not np.isfinite(start.energy):
    115             self.potential.raise_ok()
    116             raise ValueError('Bad initial energy: %s. The model '
--> 117                              'might be misspecified.' % start.energy)
        start.energy = inf
    118 
    119         adapt_step = self.tune and self.adapt_step_size
    120         step_size = self.step_adapt.current(adapt_step)
    121         self.step_size = step_size

ValueError: Bad initial energy: inf. The model might be misspecified.

I can add the full traceback if needed. I am using generated data according to the model here. For completeness here is the code for generating the data:

T=50
theta_true=np.empty([T+1,1])
y_obs=np.empty([T+1,1])
theta_true[0]=st.norm.rvs(loc=.3,scale=.2)
while (theta_true[0]>1 or theta_true[0]<-1):
    theta_true[0]=st.norm.rvs(loc=.3,scale=.2)
errs=st.multivariate_normal.rvs(mean=[0,0], cov=[[0.4, 0.3], [0.3, 0.5]], size=T+1)
nu=errs[:,0]
err=errs[:,1]
y_obs[0]=err[0]
for t in range(1,T+1):
    theta_true[t]=theta_true[t-1]+nu[t]
    while (theta_true[t]>1 or theta_true[t]<-1):
        theta_true[t]=st.norm.rvs(loc=theta_true[t-1], scale=np.sqrt(.5))
    y_obs[t]= theta_true[t]*y_obs[t-1]+err[t]

Thank you again for any help in advance!


#2

The find_MAP() runs fine which means the logp evaluation is ok, the gradient problem is possible related to the fact that gradient is usually very small near the MAP. Have you try just using the default sampling: tracemodel1 = pm.sample(1000, tune=1000)?


#3

Hi Junpenglao. Thank you so much for the tip. I am running it and once it finishes running I will let you know if it works. It is at least running, however. (It is very slow at the moment… it takes about 16 seconds per iteration on a CPU, which I assume is due to the scan function, rather than some sort of tensor operator implementation. If you have any suggestions as to how to speed it up, it would be much appreciated.


#4

Yes the scan is likely very slow - this is an auto regressive model right? have you try the AR(n) in timeseries distribution?


#5

I looked into it. Unfortunately, it’s not applicable due to the weird construction of the time series here.
The model is
y_t= \theta_t*y_{t-1} + \varepsilon_t
with
\theta_t = \theta_{t-1} + \nu_t,
with \nu_t and \varepsilon_t jointly distributed as a m.v.normal ( mean 0 and cov V, with a Inverse-Wishart prior, although I changed to LKJCholesky on Covariance that is not necessarily uncorrelated) so I need a custom function on the ys I think. The assumption on updating is that p(\theta_t|V,y^T) \propto \mathbb{1}_{|\theta|<1}p(\theta_{t-1}|V,y^T) where V is the covariance matrix. I used a modified gaussian random walk on the thetas, however (I can’t use AR1 for them because theta_0 has a prior that is a bounded normal.)


#6

So about 30% of the way through it failed. The error message was:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta_interval__, chol_cov_cholesky_cov_packed__]
 35%|███▌      | 709/2000 [5:00:17<9:06:48, 25.41s/it]

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
...
  File "~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 115, in astep
    self.potential.raise_ok()
  File "~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/quadpotential.py", line 190, in raise_ok
    raise ValueError('Mass matrix contains zeros on the diagonal. '
ValueError: Mass matrix contains zeros on the diagonal. Some derivatives might always be zero.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "~/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 359, in __call__
    raise TransportableException(text, e_type)
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
ValueError                                         Wed Jun 13 16:39:41 2018
PID: 99318       Python 3.6.3: ~/anaconda3/bin/python
...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _sample>, (0, True, 496596156, {'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}), {'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 1000})]
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _sample>
        args = (0, True, 496596156, {'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])})
        kwargs = {'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 1000}
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in _sample(chain=0, progressbar=True, random_seed=496596156, start={'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}, draws=2000, step=<pymc3.step_methods.hmc.nuts.NUTS object>, trace=None, tune=1000, model=<pymc3.model.Model object>, live_plot=False, live_plot_kwargs=None, **kwargs={})
    549                             tune, model, random_seed)
    550     if progressbar:
    551         sampling = tqdm(sampling, total=draws)
    552     try:
    553         strace = None
--> 554         for it, strace in enumerate(sampling):
        it = 708
        strace = <pymc3.backends.ndarray.NDArray object>
        sampling =  35%|███▌      | 709/2000 [5:00:17<9:06:48, 25.41s/it]
    555             if live_plot:
    556                 if live_plot_kwargs is None:
    557                     live_plot_kwargs = {}
    558                 if it >= skip_first:

...........................................................................
~/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py in __iter__(self= 35%|███▌      | 709/2000 [5:00:17<9:06:48, 25.41s/it])
    925             except AttributeError:
    926                 raise TqdmDeprecationWarning("""\
    927 Please use `tqdm_gui(...)` instead of `tqdm(..., gui=True)`
    928 """, fp_write=getattr(self.fp, 'write', sys.stderr.write))
    929 
--> 930             for obj in iterable:
        obj = <pymc3.backends.ndarray.NDArray object>
        iterable = <generator object _iter_sample>
    931                 yield obj
    932                 # Update and possibly print the progressbar.
    933                 # Note: does not call self.update(1) for speed optimisation.
    934                 n += 1

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in _iter_sample(draws=2000, step=<pymc3.step_methods.hmc.nuts.NUTS object>, start={'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}, trace=None, chain=0, tune=1000, model=<pymc3.model.Model object>, random_seed=496596156)
    647         step.tune = bool(tune)
    648         for i in range(draws):
    649             if i == tune:
    650                 step = stop_tuning(step)
    651             if step.generates_stats:
--> 652                 point, states = step.step(point)
        point = {'chol_cov_cholesky_cov_packed__': array([ -2.08944751,   0.12375549, -18.61648659]), 'theta_interval__': array([-0.15800638, -0.1441874 , -0.19114327, -0...6, -0.12067608, -0.07600247,
       -0.05434058])}
        states = [{'depth': 10, 'diverging': False, 'energy': -3479.6746766908395, 'energy_error': -0.09381699821506118, 'max_energy_error': 3.0620532972125147, 'mean_tree_accept': 0.8530624568360251, 'step_size': 1.4004626207707957e-06, 'step_size_bar': 5.23527895218224e-07, 'tree_size': 1023, 'tune': True}]
        step.step = <bound method GradientSharedStep.step of <pymc3.step_methods.hmc.nuts.NUTS object>>
    653                 if strace.supports_sampler_stats:
    654                     strace.record(point, states)
    655                 else:
    656                     strace.record(point)

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py in step(self=<pymc3.step_methods.hmc.nuts.NUTS object>, point={'chol_cov_cholesky_cov_packed__': array([ -2.08944751,   0.12375549, -18.61648659]), 'theta_interval__': array([-0.15800638, -0.1441874 , -0.19114327, -0...6, -0.12067608, -0.07600247,
       -0.05434058])})
    217     def step(self, point):
    218         self._logp_dlogp_func.set_extra_values(point)
    219         array = self._logp_dlogp_func.dict_to_array(point)
    220 
    221         if self.generates_stats:
--> 222             apoint, stats = self.astep(array)
        apoint = undefined
        stats = undefined
        self.astep = <bound method BaseHMC.astep of <pymc3.step_methods.hmc.nuts.NUTS object>>
        array = array([-1.58006383e-01, -1.44187398e-01, -1.9114...51e+00,
        1.23755491e-01, -1.86164866e+01])
    223             point = self._logp_dlogp_func.array_to_full_dict(apoint)
    224             return point, stats
    225         else:
    226             apoint = self.astep(array)

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py in astep(self=<pymc3.step_methods.hmc.nuts.NUTS object>, q0=array([-1.58006383e-01, -1.44187398e-01, -1.9114...51e+00,
        1.23755491e-01, -1.86164866e+01]))
    110         """Perform a single HMC iteration."""
    111         p0 = self.potential.random()
    112         start = self.integrator.compute_state(q0, p0)
    113 
    114         if not np.isfinite(start.energy):
--> 115             self.potential.raise_ok()
        self.potential.raise_ok = <bound method QuadPotentialDiagAdapt.raise_ok of...hmc.quadpotential.QuadPotentialDiagAdapt object>>
    116             raise ValueError('Bad initial energy: %s. The model '
    117                              'might be misspecified.' % start.energy)
    118 
    119         adapt_step = self.tune and self.adapt_step_size

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/quadpotential.py in raise_ok(self=<pymc3.step_methods.hmc.quadpotential.QuadPotentialDiagAdapt object>)
    185 
    186         self._n_samples += 1
    187 
    188     def raise_ok(self):
    189         if np.any(self._stds == 0):
--> 190             raise ValueError('Mass matrix contains zeros on the diagonal. '
    191                              'Some derivatives might always be zero.')
    192         if np.any(self._stds < 0):
    193             raise ValueError('Mass matrix contains negative values on the '
    194                              'diagonal.')

ValueError: Mass matrix contains zeros on the diagonal. Some derivatives might always be zero.
___________________________________________________________________________
"""

The above exception was the direct cause of the following exception:

TransportableException                    Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in retrieve(self)
    698                 if getattr(self._backend, 'supports_timeout', False):
--> 699                     self._output.extend(job.get(timeout=self.timeout))
    700                 else:

~/anaconda3/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
    643         else:
--> 644             raise self._value
    645 

TransportableException: TransportableException
___________________________________________________________________________
ValueError                                         Wed Jun 13 16:39:41 2018
PID: 99318       Python 3.6.3: ~/anaconda3/bin/python
...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _sample>, (0, True, 496596156, {'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}), {'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 1000})]
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _sample>
        args = (0, True, 496596156, {'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])})
        kwargs = {'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 1000}
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in _sample(chain=0, progressbar=True, random_seed=496596156, start={'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}, draws=2000, step=<pymc3.step_methods.hmc.nuts.NUTS object>, trace=None, tune=1000, model=<pymc3.model.Model object>, live_plot=False, live_plot_kwargs=None, **kwargs={})
    549                             tune, model, random_seed)
    550     if progressbar:
    551         sampling = tqdm(sampling, total=draws)
    552     try:
    553         strace = None
--> 554         for it, strace in enumerate(sampling):
        it = 708
        strace = <pymc3.backends.ndarray.NDArray object>
        sampling =  35%|███▌      | 709/2000 [5:00:17<9:06:48, 25.41s/it]
    555             if live_plot:
    556                 if live_plot_kwargs is None:
    557                     live_plot_kwargs = {}
    558                 if it >= skip_first:
...........................................................................
~/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py in __iter__(self= 35%|███▌      | 709/2000 [5:00:17<9:06:48, 25.41s/it])
    925             except AttributeError:
    926                 raise TqdmDeprecationWarning("""\
    927 Please use `tqdm_gui(...)` instead of `tqdm(..., gui=True)`
    928 """, fp_write=getattr(self.fp, 'write', sys.stderr.write))
    929 
--> 930             for obj in iterable:
        obj = <pymc3.backends.ndarray.NDArray object>
        iterable = <generator object _iter_sample>
    931                 yield obj
    932                 # Update and possibly print the progressbar.
    933                 # Note: does not call self.update(1) for speed optimisation.
    934                 n += 1

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in _iter_sample(draws=2000, step=<pymc3.step_methods.hmc.nuts.NUTS object>, start={'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}, trace=None, chain=0, tune=1000, model=<pymc3.model.Model object>, random_seed=496596156)
    647         step.tune = bool(tune)
    648         for i in range(draws):
    649             if i == tune:
    650                 step = stop_tuning(step)
    651             if step.generates_stats:
--> 652                 point, states = step.step(point)
        point = {'chol_cov_cholesky_cov_packed__': array([ -2.08944751,   0.12375549, -18.61648659]), 'theta_interval__': array([-0.15800638, -0.1441874 , -0.19114327, -0...6, -0.12067608, -0.07600247,
       -0.05434058])}
        states = [{'depth': 10, 'diverging': False, 'energy': -3479.6746766908395, 'energy_error': -0.09381699821506118, 'max_energy_error': 3.0620532972125147, 'mean_tree_accept': 0.8530624568360251, 'step_size': 1.4004626207707957e-06, 'step_size_bar': 5.23527895218224e-07, 'tree_size': 1023, 'tune': True}]
        step.step = <bound method GradientSharedStep.step of <pymc3.step_methods.hmc.nuts.NUTS object>>
    653                 if strace.supports_sampler_stats:
    654                     strace.record(point, states)
    655                 else:
    656                     strace.record(point)

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py in step(self=<pymc3.step_methods.hmc.nuts.NUTS object>, point={'chol_cov_cholesky_cov_packed__': array([ -2.08944751,   0.12375549, -18.61648659]), 'theta_interval__': array([-0.15800638, -0.1441874 , -0.19114327, -0...6, -0.12067608, -0.07600247,
       -0.05434058])})
    217     def step(self, point):
    218         self._logp_dlogp_func.set_extra_values(point)
    219         array = self._logp_dlogp_func.dict_to_array(point)
    220 
    221         if self.generates_stats:
--> 222             apoint, stats = self.astep(array)
        apoint = undefined
        stats = undefined
        self.astep = <bound method BaseHMC.astep of <pymc3.step_methods.hmc.nuts.NUTS object>>
        array = array([-1.58006383e-01, -1.44187398e-01, -1.9114...51e+00,
        1.23755491e-01, -1.86164866e+01])
    223             point = self._logp_dlogp_func.array_to_full_dict(apoint)
    224             return point, stats
    225         else:
    226             apoint = self.astep(array)

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py in astep(self=<pymc3.step_methods.hmc.nuts.NUTS object>, q0=array([-1.58006383e-01, -1.44187398e-01, -1.9114...51e+00,
        1.23755491e-01, -1.86164866e+01]))
    110         """Perform a single HMC iteration."""
    111         p0 = self.potential.random()
    112         start = self.integrator.compute_state(q0, p0)
    113 
    114         if not np.isfinite(start.energy):
--> 115             self.potential.raise_ok()
        self.potential.raise_ok = <bound method QuadPotentialDiagAdapt.raise_ok of...hmc.quadpotential.QuadPotentialDiagAdapt object>>
    116             raise ValueError('Bad initial energy: %s. The model '
    117                              'might be misspecified.' % start.energy)
    118 
    119         adapt_step = self.tune and self.adapt_step_size

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/quadpotential.py in raise_ok(self=<pymc3.step_methods.hmc.quadpotential.QuadPotentialDiagAdapt object>)
    185 
    186         self._n_samples += 1
    187 
    188     def raise_ok(self):
    189         if np.any(self._stds == 0):
--> 190             raise ValueError('Mass matrix contains zeros on the diagonal. '
    191                              'Some derivatives might always be zero.')
    192         if np.any(self._stds < 0):
    193             raise ValueError('Mass matrix contains negative values on the '
    194                              'diagonal.')

ValueError: Mass matrix contains zeros on the diagonal. Some derivatives might always be zero.
___________________________________________________________________________

...........................................................................
~/<ipython-input-5-650b8faacecc> in <module>()
     14     thetainit=BoundedNorm.dist(mu=theta_bar,sd=P_bar_sd)
     15     #Theta Updating Process
     16     theta=BoundedGRW('theta', sd=sd, init=thetainit, shape=len(y_obs)) #gives prior on sequence of thetas
     17     #Likelihood function
     18     like = univautoreg('like', theta0=theta, v=v, observed=y_obs) #specifies model from class
---> 19     tracemodel1 = pm.sample(1000,tune=1000,cores=4) #runs sample on posterior using NUTS

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in sample(draws=2000, step=<pymc3.step_methods.hmc.nuts.NUTS object>, init='auto', n_init=200000, start=[{'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}, {'chol_cov_cholesky_cov_packed__': array([ 0.82467436, -0.80963915,  0.55627807]), 'theta_interval__': array([ 0.29077273,  0.18677657, -0.23636995,  0...1,  0.82364536,  0.82978351,
       -0.22230971])}, {'chol_cov_cholesky_cov_packed__': array([ 0.23220033, -0.8543097 ,  0.08794877]), 'theta_interval__': array([ 0.67284462,  0.2879578 ,  0.23400036,  0...8, -0.62044113,  0.77093992,
        0.81221703])}, {'chol_cov_cholesky_cov_packed__': array([-0.20368703,  0.41536114,  0.06717203]), 'theta_interval__': array([-0.52843847, -0.23372156,  0.03127953,  0...1, -0.76796809, -0.96247491,
       -0.92644218])}], trace=None, chain_idx=0, chains=4, cores=4, tune=1000, nuts_kwargs=None, step_kwargs=None, progressbar=True, model=<pymc3.model.Model object>, random_seed=[496596156, 290684706, 677620699, 399237012], live_plot=False, discard_tuned_samples=True, live_plot_kwargs=None, compute_convergence_checks=True, use_mmap=False, **kwargs={})
    437     parallel = cores > 1 and chains > 1 and not has_population_samplers
    438     if parallel:
    439         _log.info('Multiprocess sampling ({} chains in {} jobs)'.format(chains, cores))
    440         _print_step_hierarchy(step)
    441         try:
--> 442             trace = _mp_sample(**sample_args)
        trace = None
        sample_args = {'chain': 0, 'chains': 4, 'cores': 4, 'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'progressbar': True, 'random_seed': [496596156, 290684706, 677620699, 399237012], 'start': [{'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}, {'chol_cov_cholesky_cov_packed__': array([ 0.82467436, -0.80963915,  0.55627807]), 'theta_interval__': array([ 0.29077273,  0.18677657, -0.23636995,  0...1,  0.82364536,  0.82978351,
       -0.22230971])}, {'chol_cov_cholesky_cov_packed__': array([ 0.23220033, -0.8543097 ,  0.08794877]), 'theta_interval__': array([ 0.67284462,  0.2879578 ,  0.23400036,  0...8, -0.62044113,  0.77093992,
        0.81221703])}, {'chol_cov_cholesky_cov_packed__': array([-0.20368703,  0.41536114,  0.06717203]), 'theta_interval__': array([-0.52843847, -0.23372156,  0.03127953,  0...1, -0.76796809, -0.96247491,
       -0.92644218])}], ...}
    443         except pickle.PickleError:
    444             _log.warning("Could not pickle model, sampling singlethreaded.")
    445             _log.debug('Pickling error:', exec_info=True)
    446             parallel = False

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in _mp_sample(**kwargs={'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 1000})
    977             for args in zip(chain_nums, pbars, rseed, start))
    978 
    979     if use_mmap:
    980         traces = Parallel(n_jobs=cores)(jobs)
    981     else:
--> 982         traces = Parallel(n_jobs=cores, mmap_mode=None)(jobs)
        traces = undefined
        cores = 4
        jobs = <generator object _mp_sample.<locals>.<genexpr>>
    983 
    984     return MultiTrace(traces)
    985 
    986 
...
---------------------------------------------------------------------------
Sub-process traceback:
---------------------------------------------------------------------------
ValueError                                         Wed Jun 13 16:39:41 2018
PID: 99318       Python 3.6.3: ~/anaconda3/bin/python
...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _sample>, (0, True, 496596156, {'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}), {'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 1000})]
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _sample>
        args = (0, True, 496596156, {'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])})
        kwargs = {'draws': 2000, 'live_plot': False, 'live_plot_kwargs': None, 'model': <pymc3.model.Model object>, 'step': <pymc3.step_methods.hmc.nuts.NUTS object>, 'trace': None, 'tune': 1000}
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
...
...........................................................................
~/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py in __iter__(self= 35%|███▌      | 709/2000 [5:00:17<9:06:48, 25.41s/it])
    925             except AttributeError:
    926                 raise TqdmDeprecationWarning("""\
    927 Please use `tqdm_gui(...)` instead of `tqdm(..., gui=True)`
    928 """, fp_write=getattr(self.fp, 'write', sys.stderr.write))
    929 
--> 930             for obj in iterable:
        obj = <pymc3.backends.ndarray.NDArray object>
        iterable = <generator object _iter_sample>
    931                 yield obj
    932                 # Update and possibly print the progressbar.
    933                 # Note: does not call self.update(1) for speed optimisation.
    934                 n += 1

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in _iter_sample(draws=2000, step=<pymc3.step_methods.hmc.nuts.NUTS object>, start={'chol_cov_cholesky_cov_packed__': array([-0.86697565,  0.06281649,  0.47892356]), 'theta_interval__': array([-0.35244586, -0.86961885,  0.43604781, -0...5, -0.42408685, -0.62923437,
        0.10574297])}, trace=None, chain=0, tune=1000, model=<pymc3.model.Model object>, random_seed=496596156)
    647         step.tune = bool(tune)
    648         for i in range(draws):
    649             if i == tune:
    650                 step = stop_tuning(step)
    651             if step.generates_stats:
--> 652                 point, states = step.step(point)
        point = {'chol_cov_cholesky_cov_packed__': array([ -2.08944751,   0.12375549, -18.61648659]), 'theta_interval__': array([-0.15800638, -0.1441874 , -0.19114327, -0...6, -0.12067608, -0.07600247,
       -0.05434058])}
        states = [{'depth': 10, 'diverging': False, 'energy': -3479.6746766908395, 'energy_error': -0.09381699821506118, 'max_energy_error': 3.0620532972125147, 'mean_tree_accept': 0.8530624568360251, 'step_size': 1.4004626207707957e-06, 'step_size_bar': 5.23527895218224e-07, 'tree_size': 1023, 'tune': True}]
        step.step = <bound method GradientSharedStep.step of <pymc3.step_methods.hmc.nuts.NUTS object>>
    653                 if strace.supports_sampler_stats:
    654                     strace.record(point, states)
    655                 else:
    656                     strace.record(point)

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py in step(self=<pymc3.step_methods.hmc.nuts.NUTS object>, point={'chol_cov_cholesky_cov_packed__': array([ -2.08944751,   0.12375549, -18.61648659]), 'theta_interval__': array([-0.15800638, -0.1441874 , -0.19114327, -0...6, -0.12067608, -0.07600247,
       -0.05434058])})
    217     def step(self, point):
    218         self._logp_dlogp_func.set_extra_values(point)
    219         array = self._logp_dlogp_func.dict_to_array(point)
    220 
    221         if self.generates_stats:
--> 222             apoint, stats = self.astep(array)
        apoint = undefined
        stats = undefined
        self.astep = <bound method BaseHMC.astep of <pymc3.step_methods.hmc.nuts.NUTS object>>
        array = array([-1.58006383e-01, -1.44187398e-01, -1.9114...51e+00,
        1.23755491e-01, -1.86164866e+01])
    223             point = self._logp_dlogp_func.array_to_full_dict(apoint)
    224             return point, stats
    225         else:
    226             apoint = self.astep(array)

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py in astep(self=<pymc3.step_methods.hmc.nuts.NUTS object>, q0=array([-1.58006383e-01, -1.44187398e-01, -1.9114...51e+00,
        1.23755491e-01, -1.86164866e+01]))
    110         """Perform a single HMC iteration."""
    111         p0 = self.potential.random()
    112         start = self.integrator.compute_state(q0, p0)
    113 
    114         if not np.isfinite(start.energy):
--> 115             self.potential.raise_ok()
        self.potential.raise_ok = <bound method QuadPotentialDiagAdapt.raise_ok of...hmc.quadpotential.QuadPotentialDiagAdapt object>>
    116             raise ValueError('Bad initial energy: %s. The model '
    117                              'might be misspecified.' % start.energy)
    118 
    119         adapt_step = self.tune and self.adapt_step_size

...........................................................................
~/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/quadpotential.py in raise_ok(self=<pymc3.step_methods.hmc.quadpotential.QuadPotentialDiagAdapt object>)
    185 
    186         self._n_samples += 1
    187 
    188     def raise_ok(self):
    189         if np.any(self._stds == 0):
--> 190             raise ValueError('Mass matrix contains zeros on the diagonal. '
    191                              'Some derivatives might always be zero.')
    192         if np.any(self._stds < 0):
    193             raise ValueError('Mass matrix contains negative values on the '
    194                              'diagonal.')

ValueError: Mass matrix contains zeros on the diagonal. Some derivatives might always be zero.

Leading up to this each iteration was going slower and slower. Any ideas what problem might be?


#7

This is an indication that the sampler is in a region that has small gradient, making the leapfrog step very small. Thus the sampler need to take a lot of steps until it hit a u turn and terminates, which make it really slow.

If you upgrade to the master, it should print out which element is causing the problem. But since the sampling is so slow I strongly suggest you to reparameterize your model first.


#8

Sorry for my ignorance, but what do you mean upgrade to the master? And by reparameterize I assume you mean find some way to get rid of the scan, right?


#9

Something like pip3 install git+https://github.com/pymc-devs/pymc3@master

Yes. Or use some mathematics to simplify the computation of the logp.


#10

Thank you so much! It runs much faster now–about 30 iterations per second when tuning and nearly 80 when not. It’s still giving the wrong parameter estimates but I’m going to see if running it for longer solves the problem. I’ll let you know shortly.


#11

Thank you so much! Almost everything is working now! The sole remaining problem is that the covariance matrix is static in all but the first entry. See photo:

The new code is as follows.

#define the likelihood function used
class univautoreg(Continuous):
    def __init__(self,theta0=None, v=None, *args, **kwargs):
        super(univautoreg, self).__init__(*args, **kwargs)
        self.theta0 = theta0
        self.mean = tt.as_tensor_variable(0.)
        self.v=v
    def logp(self,value):
        theta0=self.theta0
        v=self.v
        theta=theta0.flatten()
        valuenew=value.flatten()
        #consider doing tensor version of this to speed up
        this_period=pm.math.stack([valuenew[1:],theta[1:]],axis=0)
        meanval=theta[1:]*valuenew[:-1]
        last_period=pm.math.stack([meanval,theta[:-1]],axis=0)
        err=this_period-last_period
        errnew=err.reshape([valuenew.length()-1,2])
        errinit=pm.math.stack([valuenew[0],theta[0]],axis=0)
        errinitreshape=errinit.reshape([1,2])
        lik= MvNormal.dist(mu=np.zeros(2), cov=v).logp(errnew) #likelihood of each is from multivariate normal #check if possible misspecification here
        liketotal=tt.sum(lik)+MvNormal.dist(mu=np.zeros(2),cov=v).logp(errinitreshape)
        return (liketotal)
theta_bar = .5
P_bar = 2
P_bar_sd = np.sqrt(P_bar)
with pm.Model() as model1:
    #Variance function
    BoundedNorm = pm.Bound(pm.Normal,lower=-1.0,upper=1.0) #defines bounded normal
    BoundedGRW = pm.Bound(pm.GaussianRandomWalk, lower=-1.0, upper=1.0) #defines bounded gaussian random walk
    packed_cov = pm.LKJCholeskyCov('chol_cov', n=2, eta=2., sd_dist=pm.HalfCauchy.dist(beta=1.5,shape=2)) #equivalent of p(V), gives prior on cholesky of COV
    chol = pm.expand_packed_triangular(2,packed_cov, lower=True) #generates cholesky decomp. from packed triangular matrix in previous problem
    v=pm.Deterministic('v',tt.dot(chol,chol.T)) #generates cov matrix from cholesky decomposition
    sdtemp=tt.sqrt(tt.diag(v)) #takes sqrt of each variance to get sd
    sd = sdtemp[1] #recovers sd
    #Theta prior
    thetainit=BoundedNorm.dist(mu=theta_bar,sd=P_bar_sd)
    #Theta Updating Process
    theta=BoundedGRW('theta',mu=0., sd=sd, init=thetainit, shape=len(y_obs)) #gives prior on sequence of thetas
    #Likelihood function
    like = univautoreg('like', theta0=theta, v=v, observed=y_obs) #specifies model from class
    tracemodel1 = pm.sample(4000,tune=4000,cores=4) #runs sample on posterior using NUTS

The key output is

mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
chol_cov__0 0.770837 0.077756 0.000556 0.627246 0.927341 22431.480286 0.999966
chol_cov__1 0.000107 0.002010 0.000013 -0.003785 0.004091 24579.876758 0.999880
chol_cov__2 0.097927 0.001380 0.000009 0.095237 0.100570 21664.973782 0.999887
v__0_0 0.600236 0.123428 0.000904 0.393437 0.859962 20639.333696 0.999978
v__0_1 0.000084 0.001558 0.000010 -0.002938 0.003193 24361.466741 0.999876
v__1_0 0.000084 0.001558 0.000010 -0.002938 0.003193 24361.466741 0.999876
v__1_1 0.009596 0.000271 0.000002 0.009070 0.010115 21603.117935 0.999886

while the true covariance matrix is [[0.4,0.3],[0.3,.5]]. Any suggestions as to what the problem may be? Again thank you so much for all of your help!


#12

Not completely sure, but I would suggest try a lighter tail prior in the cov matrix

sd_dist=pm.HalfNormal.dist(1.) # or even an exponential

#13

Can you elaborate a bit on what univautoreg does? My guess would be that there is a problem in the logp somewhere, since the only entry of the matrix that you use outside is v[0,0], and that is also the one that shows values >> 0.
About the bound variables: You might want to model this on R, and then transform it to (-1, 1) using arctan. Something like this:

init_theta = pm.Normal('init_theta', sd=??)
theta_raw = pm.GaussianRandomWalk(
    'theta_raw', mu=0, sd=sd, init=init_theta, shape=len(y_obs))
theta = 2 / np.pi * tt.arctan(theta_raw)
pm.Deterministic('theta', theta)

Also, you can pass the chol to MvNormal, that way we don’t have to do the cholesky decomposition in each step again.


#14

@junpenglao thanks for the tip… I’ll try that.

@aseyboldt univautoreg takes a vector of random theta_t that are assumed to follow a bounded gaussian random walk and values y_t.

The time series I am estimating is
y_t=y_{t-1}\theta_t + \varepsilon_t
\theta_t=\theta_{t-1}+\nu_t with (\varepsilon_t,\nu_t)\sim \text{MvNormal}(0,V)

The bounded normal and gaussian random walk are to achieve stationarity by ensuring theta always less than 1 in abs value. I am looking for the joint posterior V,\theta^T (sequence of \theta s from 0 to T) conditional on y observed and values used to generate my prior.

Univautoreg tries to simplify the likelihood by taking a random sequence of thetas, a random V and observed y^T and calculating (y_t-\theta_t y_{t-1},\theta_t-\theta_{t-1})=(\hat{\varepsilon}_t,\hat{\nu}_t) and taking the likelihood of this sequence of vectors with respect to its distribution MvNormal(0,V), where V should also be a random variable (e.g. the likelihood is conditional on V.) It then sums over the vector of likelihoods (one per period t) for all t to give the likelihood of that sequence being observed. I think this should give my posterior of interest.

In practice, for some reason I think V is being treated as non-random inside unitautoreg, and so only the first sample is used. As you said, I need the s.d. of \nu_t to generate the sequence of thetas, so I extract that outside the univautoreg, and that seems to be interpreted properly as a random variable. But the other entries seem to be treated deterministically. Any idea why that happens? I know V itself is deterministic, but its parents are random so it should be as well…