Hello.
I'm trying to predict values using pymc3.sample_ppc, for which I need shared variables, according to the notebook.
But if I try to run something like

    DISTANCE = theano.shared(BR['mil_km'])

where BR is a pandas DataFrame, I get the following error during model specification:

    Traceback (most recent call last):
      File "rail_0.py", line 25, in <module>
        theta = (A + BD * DISTANCE + BY * YEAR)
    TypeError: unsupported operand type(s) for *: 'FreeRV' and 'SharedVariable'

If I run print(DISTANCE), the output is simply <Generic> instead of a column of data, so I assume that is the problem. What is the correct way to convert a pandas DataFrame column into a theano.shared variable? Most of the examples I've seen simply generate initial data to use as shared predictors.
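A minimal sketch of one possible fix, assuming the cause is that theano.shared only builds a tensor-valued shared variable from a NumPy array and falls back to a generic container for other objects such as a pandas Series. The helper names column_as_array and make_shared are hypothetical, and the Theano import is deferred so that the conversion step can be checked on its own:

```python
import numpy as np


def column_as_array(column):
    """Return a float64 NumPy array built from a pandas Series, list, or array."""
    return np.asarray(column, dtype="float64")


def make_shared(column):
    """Wrap a data column in a tensor-valued Theano shared variable."""
    import theano  # deferred import; assumes Theano is installed

    return theano.shared(column_as_array(column))


# Intended usage with the DataFrame from the question (not run here):
#   DISTANCE = make_shared(BR['mil_km'])
#   DISTANCE.set_value(column_as_array(ABR['mil_km']))  # swap in new data
```

Theano's method for replacing the stored data is set_value (singular). Passing it an array, as in the commented usage, mirrors how the variable was created.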
Here's the problem I'm trying to solve, in case it helps:
I have a dataset with 2 periods. It contains year, distance and result. I want to train the model on the years and distance from the first period and then predict values for the second period.
Here’s the way I’m trying to implement that:

    import pymc3 as pm
    import pandas as pd
    import numpy as np
    import theano

    RAILS = pd.read_csv('./brit_rail_acc.csv')
    BR = RAILS[21:47]  # Data prior to privatisation, but after the steam engine
    ABR = RAILS[48:]  # Data after privatisation

    DISTANCE = theano.shared(BR['mil_km'])
    YEAR = theano.shared(BR['year'])

    with pm.Model() as MODEL_BR_0:
        A = pm.Normal('alpha', mu=0, sd=100)
        BD = pm.Normal('distance', mu=0, sd=10)
        BY = pm.Normal('year', mu=0, sd=10)
        theta = (A + BD * DISTANCE + BY * YEAR)
        y = pm.Poisson('accidents', mu=np.exp(theta), observed=BR['cdo_acc'].values)
        trace = pm.sample(5000, n_init=5000)

    DISTANCE.set_values(ABR['mil_km'])
    YEAR.set_values(ABR['year'])
    ppc = pm.sample_ppc(trace, model=MODEL_BR_0)

On a semi-related note, should I be worried that about 50% of the time I get the following error when I run the basic model without the shared variables?

    WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
    Auto-assigning NUTS sampler...
    Initializing NUTS using jitter+adapt_diag...
    0%| | 0/5500 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "rail_0_nt.py", line 24, in <module>
        trace = pm.sample(5000, n_init=5000)
      File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/sampling.py", line 285, in sample
        return sample_func(**sample_args)[discard:]
      File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/sampling.py", line 332, in _sample
        for it, strace in enumerate(sampling):
      File "/home/eichhorn/.local/lib/python3.6/site-packages/tqdm/_tqdm.py", line 955, in __iter__
        for obj in iterable:
      File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/sampling.py", line 430, in _iter_sample
        point, states = step.step(point)
      File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py", line 175, in step
        apoint, stats = self.astep(array)
      File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/step_methods/hmc/nuts.py", line 182, in astep
        'might be misspecified.' % start.energy)
    ValueError: Bad initial energy: nan. The model might be misspecified.
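
One possible contributor to the nan energy, offered as an assumption rather than a diagnosis: with raw predictors, the linear predictor theta can be far too large for exp. A year near 1970 multiplied by a coefficient of 0.5 already exceeds the largest argument exp can take in double precision (about 709.78), and the jitter+adapt_diag initialization may well start a coefficient at that magnitude. The sketch below uses made-up numbers, not the rail dataset, and hypothetical helper names to show how standardizing the predictors keeps theta in a safe range:

```python
import math
import sys
from statistics import mean, pstdev

# Largest x for which exp(x) is still finite in float64 (about 709.78).
MAX_EXP_ARG = math.log(sys.float_info.max)


def standardize(values):
    """Centre values on their mean and scale by their population std dev."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def linear_predictor(a, bd, by, distance, year):
    """theta = a + bd * distance + by * year, element by element."""
    return [a + bd * d + by * y for d, y in zip(distance, year)]


def exp_overflows(theta):
    """True if exp() of any element would overflow float64."""
    return any(t > MAX_EXP_ARG for t in theta)


# Made-up illustrative data (NOT taken from brit_rail_acc.csv).
year = [1970, 1971, 1972, 1973, 1974, 1975]
distance = [400.0, 410.0, 420.0, 430.0, 440.0, 450.0]

# Hypothetical starting coefficients of modest size.
a, bd, by = 0.0, 0.5, 0.5

raw_theta = linear_predictor(a, bd, by, distance, year)
std_theta = linear_predictor(a, bd, by, standardize(distance), standardize(year))

print(exp_overflows(raw_theta))  # check with raw predictors
print(exp_overflows(std_theta))  # check with standardized predictors
```

If this is indeed the issue, note that after standardizing, each coefficient is interpreted per standard deviation of its predictor rather than per raw unit.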