Variable name restrictions

I discovered de facto restrictions on variable names, and I’m wondering 1) whether they are intended and 2) whether a topic about them might help someone, because I couldn’t find any mention of them in the docs, here, or on GitHub.

import numpy as np
import pymc3 as pm
debug_y = np.random.normal(loc=1, scale=1, size=200)

with pm.Model() as DebugModel:
    # '_a__' is one of the names that fails; swap in any name from the lists below
    _a__ = pm.Normal('_a__', mu=0, sigma=1)
    foo = pm.Normal('foo', mu=_a__, sigma=1, observed=debug_y)
    debug_trace = pm.sample()

I found several cases that fail, with the “NUTS” message indicating that the names were rewritten internally (each entry shows the variable name, then the message):

  • a___ “NUTS: [a]”
  • a_a__ “NUTS: [a]”
  • a____ “NUTS: [a_]”
  • __a__ “NUTS: [_]”
  • _a__ “NUTS: []”
  • _aa__ “NUTS: []”

The following somewhat similar cases don’t fail:

  • a__
  • a_a_
  • __a
  • aa__
  • __a_
  • _a__a

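These results are consistent with re.sub(r"_[^_]*?__$", "", original_var_name) being applied to variable names at some point. In other words, a trailing segment made of one underscore, any run of non-underscore characters, and two final underscores appears to get stripped.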
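To check that hypothesis against the names listed above, here is a minimal sketch using only the standard library. The pattern is the one guessed in this post, not something taken from the PyMC3 source, and strip_suffix is a hypothetical helper name.

```python
import re

# Pattern hypothesised in this post; an assumption, not PyMC3's actual code.
PATTERN = re.compile(r"_[^_]*?__$")

def strip_suffix(name: str) -> str:
    """Apply the hypothesised substitution to a variable name."""
    return PATTERN.sub("", name)

# Failing names mapped to the name reported in the "NUTS: [...]" message.
failing = {
    "a___": "a",
    "a_a__": "a",
    "a____": "a_",
    "__a__": "_",
    "_a__": "",
    "_aa__": "",
}

# Names that sampled without problems; the pattern should leave them untouched.
passing = ["a__", "a_a_", "__a", "aa__", "__a_", "_a__a"]

for name, reported in failing.items():
    assert strip_suffix(name) == reported, (name, strip_suffix(name))

for name in passing:
    assert strip_suffix(name) == name, (name, strip_suffix(name))

print("hypothesised pattern reproduces all twelve observations")
```

All twelve cases agree with the pattern, but that only shows consistency with these observations. The real logic may be written differently.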

The single-variable example above already goes wrong at the sampling stage (“No posterior samples. Unable to run convergence checks”). However, I first ran into the problem in a model with a mix of valid and invalid variable names, where sampling completed and the failure only surfaced afterwards, when ArviZ raised a KeyError about missing var names.

Why did I have such odd variable names?
They started out as patsy-style terms containing function calls, which I converted to variable names by replacing brackets and commas with underscores.
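For illustration, here is a rough sketch of how that kind of conversion can produce a problematic name, along with one possible workaround. Both naive_sanitize and safer_sanitize are hypothetical helpers written for this post (not my original code and not part of patsy or PyMC3), and the pattern is the same guessed one as above.

```python
import re

# Same hypothesised pattern as above; an assumption, not PyMC3's actual code.
PATTERN = re.compile(r"_[^_]*?__$")

def naive_sanitize(term: str) -> str:
    """Replace brackets and commas with underscores (hypothetical helper)."""
    return re.sub(r"[()\[\],]", "_", term)

def safer_sanitize(term: str) -> str:
    """Collapse runs of underscores and trim them from both ends, so the
    result can never end in a double underscore (hypothetical helper)."""
    return re.sub(r"_+", "_", naive_sanitize(term)).strip("_")

term = "f(g(x))"  # nested calls leave two closing brackets at the end

bad = naive_sanitize(term)
good = safer_sanitize(term)

print(bad, bool(PATTERN.search(bad)))    # name that the pattern would truncate
print(good, bool(PATTERN.search(good)))  # name that the pattern leaves alone
```

One caveat with the second helper: collapsing underscores means distinct terms can map to the same name, so it would need a uniqueness check before being used for real.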