Using T.repeat(x, <pattern>) instead of T.repeat(x, <int>) means no NUTS

I’m working on a model where I need to do some irregular repetition of a model parameter, and it seems that switching from T.repeat(x, <int>) to T.repeat(x, <pattern>) makes NUTS fail. Below is a reproducible example.
I know this example is trivially fixed by using states_repeat = T.repeat(states, 3)[:-1], but the non-toy model I’m working on can’t be fixed so easily. That said, dirty hacks built from slicing and concatenating are still able to use NUTS, so the problem seems to be specific to T.repeat(x, <pattern>).
Is this a known issue? Is there a workaround cleaner than slicing and concatenating and slicing and concatenating…?
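
For concreteness, the kind of hack I mean looks roughly like this (a sketch for the repeat pattern in the failing example further down, using only integer repeats):

# repeat the first 99 states 3 times each and the last state twice,
# so every T.repeat call gets an integer and NUTS stays happy
states_repeat = T.concatenate([T.repeat(states[:-1], 3),
                               T.repeat(states[-1:], 2)])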

import numpy as np
import pymc3 as pm
import pandas as pd
from theano import tensor as T

# create a random walk
innovation_sd = 1
n_states = 100
innovations = np.random.normal(scale=innovation_sd, size=n_states - 1)
states = np.concatenate([[0], np.cumsum(innovations)])
    
# repeat each step in the walk 3 times, and add some noisy observations
states_rp = np.repeat(states, 3)
df = pd.DataFrame({'states': states_rp})
observation_sd = .1
df['obs'] = np.random.normal(states_rp, observation_sd)
df.plot()

with pm.Model() as model:
    states = pm.GaussianRandomWalk('states', mu=0, sd=innovation_sd, shape=n_states)
    # since we have 3 observations per state, we can just use T.repeat(states, <integer>)
    states_repeat = T.repeat(states, 3)
    observation_sd = pm.HalfCauchy('observation_sd', 2)
    obs = pm.Normal('obs', mu=states_repeat, sd=observation_sd, observed=df.obs.values)

with model:
    # sampling uses NUTS and is fast
    trace = pm.sample()

Output:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [observation_sd_log__, states]
100%|██████████| 1000/1000 [00:04<00:00, 201.66it/s]

Versus:

# now cut off the last observation, so that we might think to 
# use a repeat pattern instead of a constant repeat
df = df[:-1]

repeat_pattern = np.concatenate([np.repeat(3, 99), [2]])

with pm.Model() as model:
    states = pm.GaussianRandomWalk('states', mu=0, sd=innovation_sd, shape=n_states)
    # NOTE: repeat_pattern instead of 3
    states_repeat = T.repeat(states, repeat_pattern)
    observation_sd = pm.HalfCauchy('observation_sd', 2)
    obs = pm.Normal('obs', mu=states_repeat, sd=observation_sd, observed=df.obs.values)

with model:
    trace = pm.sample()

Output:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Initializing NUTS failed. Falling back to elementwise auto-assignment.
Multiprocess sampling (4 chains in 4 jobs)
CompoundStep
>Slice: [states]
>NUTS: [observation_sd_log__]
100%|██████████| 1000/1000 [01:20<00:00, 12.36it/s]

Hmm, I can’t think of a good way to handle this besides slicing and concatenating. Alternatively, you could rewrite the likelihood function, but I’m not sure whether that is easier in your use case. If I remember right, the underlying cause is that Theano’s Repeat op only defines a gradient when repeats is a scalar, so PyMC3 can’t get a gradient for states and falls back to Slice sampling for it.
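
One option that might be cleaner: since the repeat pattern is fixed data, you could precompute an integer index array in NumPy and use advanced indexing instead of T.repeat. Indexing into a Theano tensor is differentiable, so NUTS should still be assignable. A sketch against the toy model above (I haven’t tested this on a non-toy model):

# indices [0, 0, 0, 1, 1, 1, ..., 99, 99]: each state index appears as
# many times as repeat_pattern says
idx = np.repeat(np.arange(n_states), repeat_pattern)

with pm.Model() as model:
    states = pm.GaussianRandomWalk('states', mu=0, sd=innovation_sd, shape=n_states)
    # same values as T.repeat(states, repeat_pattern), built via indexing
    states_repeat = states[idx]
    observation_sd = pm.HalfCauchy('observation_sd', 2)
    obs = pm.Normal('obs', mu=states_repeat, sd=observation_sd, observed=df.obs.values)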