WAIC fails with AttributeError


#1

Good day,

I’m trying to compare some simple binomial models, but Python 3 raises the following error:

Traceback (most recent call last):
  File "liability_compare.py", line 79, in <module>
    (trace_0, trace_1, trace_2, trace_3))
  File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/stats.py", line 430, in compare
    if len(set([len(m.observed_RVs) for m in models])) != 1:
  File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/stats.py", line 430, in <listcomp>
    if len(set([len(m.observed_RVs) for m in models])) != 1:
  File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/backends/base.py", line 324, in __getattr__
    type(self).__name__, name))
AttributeError: 'MultiTrace' object has no attribute 'observed_RVs'

This happens even if I try to construct the WAIC for a single binomial model.

The code is as follows:

import pymc3 as pm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-darkgrid')

PLA = pd.read_csv('./productliability_award.csv')
DUMMIES = pd.get_dummies(PLA['jury'], prefix='group')
PLA_DUMMIES = pd.concat([PLA, DUMMIES], axis=1)
PLA_T = pd.read_csv('./productliability_award_LD.csv')

# Model with interactions and betas
with pm.Model() as TH_MODEL_0:
    A = pm.Normal('alpha', mu=0, sd=10)
    BL = pm.Normal('betaL', mu=0, sd=10)
    BN = pm.Normal('betaN', mu=0, sd=10)
    BO = pm.Normal('betaO', mu=0, sd=10)
    BLO = pm.Normal('betaLO', mu=0, sd=10)
    BNO = pm.Normal('betaNO', mu=0, sd=10)

    P = pm.math.invlogit(A + BL * PLA['liability'] + BN * PLA['negligence']
            + BO * PLA['oralArg'] + BLO * PLA['liability'] * PLA['oralArg']
            + BNO * PLA['negligence'] * PLA['oralArg'])
    AWARD = pm.Binomial('awards', n=1, p=P, observed=PLA['award'])

    start = pm.find_MAP()
    step = pm.Slice()
    trace_0 = pm.sample(5000, step=step, start=start)


# Model without interactions
with pm.Model() as TH_MODEL_1:
    A = pm.Normal('alpha', mu=0, sd=10)
    BL = pm.Normal('betaL', mu=0, sd=10)
    BN = pm.Normal('betaN', mu=0, sd=10)
    BO = pm.Normal('betaO', mu=0, sd=10)

    P = pm.math.invlogit(A + BL * PLA['liability'] + BN * PLA['negligence']
             + BO * PLA['oralArg'])
    AWARD = pm.Binomial('awards', n=1, p=P, observed=PLA['award'])

    start = pm.find_MAP()
    step = pm.Slice()
    trace_1 = pm.sample(5000, step=step, start=start)

# Model with all the dummies and no intercept
with pm.Model() as TH_MODEL_2:
    B1 = pm.Normal('dummy_1', mu=0, sd=10)
    B2 = pm.Normal('dummy_2', mu=0, sd=10)
    B3 = pm.Normal('dummy_3', mu=0, sd=10)
    B4 = pm.Normal('dummy_4', mu=0, sd=10)
    B5 = pm.Normal('dummy_5', mu=0, sd=10)

    P = pm.math.invlogit(B1 * PLA_DUMMIES['group_1'] + B2 *
            PLA_DUMMIES['group_2'] + B3 * PLA_DUMMIES['group_3'] + B4 *
            PLA_DUMMIES['group_4'] + B5 * PLA_DUMMIES['group_5'])
    AWARD = pm.Binomial('awards', n=1, p=P, observed=PLA_DUMMIES['award'])

    start = pm.find_MAP()
    step = pm.Slice()
    trace_2 = pm.sample(5000, step=step, start=start)

# Model with reduced dummies
with pm.Model() as TH_MODEL_3:
    B1 = pm.Normal('dummy_L', mu=0, sd=10)
    B2 = pm.Normal('dummy_N', mu=0, sd=10)
    B3 = pm.Normal('dummy_0', mu=0, sd=10)

    P = pm.math.invlogit(B1 * PLA_T['group_L']
            + B2 * PLA_T['group_N'] + B3 * PLA_T['group_0'])
    AWARD = pm.Binomial('awards', n=1, p=P, observed=PLA_T['award'])

    start = pm.find_MAP()
    step = pm.Slice()
    trace_3 = pm.sample(5000, step=step, start=start)

# Comparing all the models
WAIC = pm.compare((TH_MODEL_0, TH_MODEL_1, TH_MODEL_2, TH_MODEL_3),
        (trace_0, trace_1, trace_2, trace_3))
print(WAIC)
pm.compareplot(WAIC)
plt.show()

I’m very inexperienced with Python, so I’m not sure whether the problem is in my code or whether this is some sort of bug.


#2

Hmm, what is your PyMC3 version?


#3

3.2, if pip is to be believed.

➤ pip3 show pymc3
Name: pymc3
Version: 3.2
Summary: PyMC3
Home-page: http://github.com/pymc-devs/pymc3
Author: Thomas Wiecki
Author-email: thomas.wiecki@gmail.com
License: Apache License, Version 2.0
Location: /home/eichhorn/.local/lib/python3.6/site-packages
Requires: pandas, six, joblib, patsy, h5py, theano, tqdm

#4

Hmm, check your input matrix: is there any missing data?


#5

If you mean the dataset, then it seems to be good. I’m using this one. There are no missing entries or anything suspicious I can see.

The regressions themselves gave no errors and seemed to be behaving as they should. Here are the results of pm.traceplot.

I have tried running WAIC with the model that operates only on the original .CSV file, without my manipulations, but the result was the same. It’s very possible that I’m just making some obvious mistake.

Perhaps I’m missing some Python package?

The code I tried to run:

import pymc3 as pm
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('seaborn-darkgrid')

PLA = pd.read_csv('./productliability_award.csv')

with pm.Model() as TH_MODEL:
    A = pm.Normal('alpha', mu=0, sd=10)
    BL = pm.Normal('betaL', mu=0, sd=10)
    BN = pm.Normal('betaN', mu=0, sd=10)
    BO = pm.Normal('betaO', mu=0, sd=10)

    P = pm.math.invlogit(A + BL * PLA['liability'] + BN * PLA['negligence']
            + BO * PLA['oralArg'])
    AWARD = pm.Binomial('awards', n=1, p=P, observed=PLA['award'])

    start = pm.find_MAP()
    step = pm.Slice()
    trace = pm.sample(5000, step=step, start=start)

TH_MODEL_waic = pm.waic(TH_MODEL, trace)
print(TH_MODEL_waic)

And I get exactly the same error:

Traceback (most recent call last):
  File "liability_compare_1.py", line 22, in <module>
    TH_MODEL_waic = pm.waic(TH_MODEL, trace)
  File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/stats.py", line 209, in waic
    log_py = _log_post_trace(trace, model, progressbar=progressbar)
  File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/stats.py", line 150, in _log_post_trace
    cached = [(var, var.logp_elemwise) for var in model.observed_RVs]
  File "/home/eichhorn/.local/lib/python3.6/site-packages/pymc3/backends/base.py", line 324, in __getattr__
    type(self).__name__, name))
AttributeError: 'MultiTrace' object has no attribute 'observed_RVs'

#6

Oh, I found it: pm.waic() takes the trace first and then the model as input:
pm.waic(trace, model)
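
To illustrate why the swapped order produces that particular AttributeError, here is a minimal sketch that does not use PyMC3 at all. FakeModel, FakeTrace and waic_like are hypothetical stand-ins invented for this example; they only mimic the two behaviours visible in the tracebacks above (the function reads model.observed_RVs, and the trace object raises on unknown attributes).

```python
class FakeModel:
    """Hypothetical stand-in for a model: it exposes observed_RVs."""

    def __init__(self):
        self.observed_RVs = ["awards"]


class FakeTrace:
    """Hypothetical stand-in for a trace: unknown attributes raise."""

    def __getattr__(self, name):
        # Same message format as the last line of the tracebacks above.
        raise AttributeError("'{}' object has no attribute '{}'".format(
            type(self).__name__, name))


def waic_like(trace, model):
    """Hypothetical stand-in for a function that takes (trace, model)."""
    # The tracebacks show the real code reading model.observed_RVs.
    return len(model.observed_RVs)


model, trace = FakeModel(), FakeTrace()

# Correct order: trace first, then model.
print(waic_like(trace, model))

# Swapped order: the trace lands in the `model` slot, so the
# attribute lookup on it fails.
try:
    waic_like(model, trace)
except AttributeError as err:
    swapped_error = str(err)
    print(swapped_error)
```

In the scripts from this thread, the corresponding fix would be pm.waic(trace, TH_MODEL). The first traceback suggests that pm.compare also treats its second argument as the models, so, assuming the same order applies there, the call would become pm.compare((trace_0, trace_1, trace_2, trace_3), (TH_MODEL_0, TH_MODEL_1, TH_MODEL_2, TH_MODEL_3)).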


#7

Thank you immensely! And sorry for taking your time with such a stupid mistake.

On an unrelated note, can you recommend a forum where I could ask about Poisson modelling in Python? I need to figure out how to predict values for a number of years based on some time series. I have an example in R but don’t quite grasp it.


#8

You are welcome!
You can have a look at https://github.com/RJT1990/pyflux, it’s quite a nice package for time series analysis. If you want to build a probabilistic model, feel free to open a new post.