Beta random walk

I’m attempting to implement a beta random walk for a parameter in a larger model that I expect to vary over time, and I’m confused about whether I’m taking the right approach. I’ve tried copying the design of the GaussianRandomWalk class and implementing a logp function, and also explicitly creating each distribution in a loop over the number of steps. They both “work”, but I get different results from each. If my logic is completely wrong, or I’m missing something obvious, please let me know! I’m still learning about all of this. Otherwise, if someone can explain the discrepancy between the approaches, I’d appreciate it.

For testing I just have a dummy dataset of binomial samples with a varying probability of success at each timepoint (I expect autocorrelation, and the real parameter will be part of a larger model, so I don’t want a beta-binomial distribution).

Beta random walk:

import numpy as np
import pymc3 as pm
import theano.tensor as tt


class BetaRandomWalk(pm.distributions.continuous.UnitContinuous):
    def __init__(self, delta=0.8, theta=20, *args, **kwargs):
        kwargs.setdefault("shape", 1)
        super().__init__(*args, **kwargs)
        
        self.delta = tt.as_tensor_variable(delta)
        self.theta = tt.as_tensor_variable(theta)
        
        self.mode = 0.5
        
    def logp(self, x):
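        # score each step x[i] under a Beta whose parameters depend on x[i-1];
        # note there is no separate term here for the starting value x[0]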
        x_im1 = x[:-1]
        x_i = x[1:]
        
        alpha = tt.maximum(self.theta * x_im1 + self.delta, 1.0)
        beta = tt.maximum(self.theta * (1 - x_im1) + self.delta, 1.0)
        
        innov_like = pm.Beta.dist(alpha=alpha, beta=beta).logp(x_i)
        
        return tt.sum(innov_like)

The models:

delta = 0.8
theta = 20

with pm.Model() as brw_model:
    p = BetaRandomWalk("p", delta=delta, theta=theta, shape=num_steps)
    likelihood = pm.Binomial("likelihood", p=p, n=50, observed=dummy_df["nHit"].values)


with pm.Model() as bloop_model:
    arr_p = []
    arr_p.append(pm.Beta("arr_p0", alpha=2, beta=2))
    for s in np.arange(1, num_steps):
        a = pm.math.maximum(theta * arr_p[s-1] + delta, 1)
        b = pm.math.maximum(theta * (1 - arr_p[s-1]) + delta, 1)
        arr_p.append(pm.Beta(f"arr_p{s}", alpha=a, beta=b))
        
    p = pm.Deterministic("p", tt.as_tensor_variable(arr_p))
    
    likelihood = pm.Binomial("likelihood", p=p, n=50, observed=dummy_df["nHit"].values)
    

True p is:

[0.32, 0.32, 0.32, 0.32, 0.32, 0.32, 0.32, 0.32, 0.32, 0.32, 
 0.35, 0.38, 0.42, 0.45, 0.48, 0.51, 0.54, 0.58, 0.61, 0.64]

So they aren’t crazy off, but I don’t know if I’ve done either one correctly.

If anyone can help me understand, or point out how wrong I am about everything, I’d really appreciate it!

I didn’t look into your code, but one possibly easier implementation is to use the built-in GaussianRandomWalk and then wrap it inside tt.nnet.sigmoid() to squash it into (0, 1).
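
Something along these lines (an untested sketch; the sigma value is just a placeholder you would tune or put a prior on):

with pm.Model() as grw_model:
    # latent walk on the unconstrained (logit) scale
    raw_p = pm.GaussianRandomWalk("raw_p", sigma=0.1, shape=num_steps)
    # squash into (0, 1) so it can be used as a probability
    p = pm.Deterministic("p", tt.nnet.sigmoid(raw_p))
    likelihood = pm.Binomial("likelihood", p=p, n=50, observed=dummy_df["nHit"].values)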

Usage of maximum is discouraged, as it makes the gradient zero below the cutoff, which causes problems for gradient-based samplers like NUTS.


You’re right, that does appear to work quite well and matches up nicely with the loop model results. I checked what happens with more extreme values to see whether the nonlinearity in step size would be an issue at the tails; it doesn’t look like it, but it did highlight that my BetaRandomWalk class is definitely implemented incorrectly.

Would this be a general way to implement ARs for non-Gaussian distributions? I would also like to include an exponential AR(1), and it would certainly be simpler to wrap a GRW with pm.math.exp and feed that into pm.Exponential. I haven’t been able to find much information on estimating non-normal random walk parameters that I really understand!

In the case of maximum, would there be the same problem using tt.switch(a > 1, a, 1)? Maximum was just simpler to write, but obviously isn’t necessary.

It’s perfectly valid to do a GRW and then wrap it inside pm.math.exp().
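
For the exponential AR you mention, that might look roughly like this (again just a sketch, not tested; the sigma value and the observed array y are placeholders):

with pm.Model() as exp_model:
    # random walk on the log scale keeps the rate positive
    log_lam = pm.GaussianRandomWalk("log_lam", sigma=0.1, shape=num_steps)
    lam = pm.Deterministic("lam", pm.math.exp(log_lam))
    obs = pm.Exponential("obs", lam=lam, observed=y)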

tt.switch is very similar to maximum, so it too will have a zero gradient below 1. The issue is that if a is at -5 or 0.8, the model cannot “see” how close it is to being activated. One alternative approach is to use a soft rather than a hard boundary.

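For example, in your loop model you could replace the pm.math.maximum lines with a scaled softplus (a sketch; k is a made-up constant that controls how sharp the corner is):

# a "soft" stand-in for maximum(x, 1): smooth everywhere, so the gradient
# never vanishes below the cutoff; larger k gives a sharper corner
k = 10.0
a = 1 + tt.nnet.softplus(k * (theta * arr_p[s-1] + delta - 1)) / k
b = 1 + tt.nnet.softplus(k * (theta * (1 - arr_p[s-1]) + delta - 1)) / k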


Another approach is to do pm.Beta('x', ...).cumsum(); the only issue is that the innovation is also a beta, but you can append this random vector to a pm.Flat to get the same thing.


Any chance you could write a quick example model? I’m not understanding how to implement this, and I keep getting dimension errors trying to append to pm.Flat.