# Creating a chain of dependent variables without for loop

I want to create a series of normally distributed X random variables such that:

X_0 \sim \mathcal{N}(\mu_0, \sigma)
X_{i} \sim \mathcal{N}(X_{i-1}, \sigma)

This can be implemented with a for loop:

def create_dependent_series(n: int, initial_mu=0, sigma=1):
# Assumes n >= 1 and function is called inside a model context
variables = [pm.Normal('series_0', mu=initial_mu, sigma=sigma)]
for i in range(1, n):
var = pm.Normal(f'series_{i}', mu=variables[i-1], sigma=sigma)
variables.append(var)
return variables


Though I am aware that it is better practice to define collections using the shape parameter, but that creates collections of independent variables. Is there anything I am missing to create a structure similar to this, without the for loop?

In this particular case I think it is equivalent to a gaussian random walk which is at the same time equivalent to a cumulative sum of independent gaussians.

To use the built in GRW you can take a look at Timeseries — PyMC3 3.11.2 documentation and Bayesian Survival Analysis — PyMC3 3.10.0 documentation

1 Like

I am a little unsure on what ‘innovation drift’ means for the mu parameter, i initially thought it would be the initial mean for the GRW, but as this also accepts a vector with the length of the walk n, then it seems like mu is involved at each step of the walk?

Is the innovation drift a deterministic constant that is assumed to be added to the cumulative sum at each step of the walk?

If so to make the code equivalent to the above would this be correct:

def create_dependent_series(initial_mu=0, sigma=1):
return pm.distributions.timeseries.GaussianRandomWalk('series', sigma=sigma) + initial_mu