Adding constraints with pm.Potential

knoahcud · January 16, 2024, 11:00am

Hi,
I’m quite new to pymc and want to add strict constraints to a linear regression model.

import pymc as pm
import pandas as pd
import numpy as np
import pytensor.tensor as at

if __name__ == "__main__":

    df = pd.read_csv("BLR_lib_GU48-1234.csv", header=None, index_col=False)

    X = df.iloc[:, :-1]
    # Standardize the features
    stds = X.std().values
    X -= X.mean()
    X /= X.std()

    y_obs = df.iloc[:, -1]

    with pm.Model(coords={"predictors": X.columns.values}) as model:
        # Prior on error SD
        sigma = pm.HalfNormal("sigma", sigma=0.1)

        # coefficients
        betas = pm.Normal("beta", sigma=1, dims="predictors")
        expected_value = at.dot(X.values, betas) 

        # constraints
        con_1 = pm.math.ge(betas[0]/stds[0] + betas[1]/stds[1], 0.0)
        con_2 = pm.math.ge(betas[2]/stds[2], 0.0)
        con_3 = pm.math.le(betas[4]/stds[4], 0.0)
        con_4 = pm.math.ge(betas[2]/stds[2] + betas[3]/stds[3] + betas[4]/stds[4], 0.0)
        con_5 = pm.math.ge(betas[5]/stds[5], 0.0)
        con_6 = pm.math.le(betas[8]/stds[8], 0.0)
        con_7 = pm.math.ge(betas[5]/stds[5] + betas[6]/stds[6] + betas[7]/stds[7] + betas[8]/stds[8], 0.0)

        potential_1 = pm.Potential("con_1", pm.math.log(pm.math.switch(con_1, 1, 0.0)))
        potential_2 = pm.Potential("con_2", pm.math.log(pm.math.switch(con_2, 1, 0.0)))
        potential_3 = pm.Potential("con_3", pm.math.log(pm.math.switch(con_3, 1, 0.0)))
        potential_4 = pm.Potential("con_4", pm.math.log(pm.math.switch(con_4, 1, 0.0)))
        potential_5 = pm.Potential("con_5", pm.math.log(pm.math.switch(con_5, 1, 0.0)))
        potential_6 = pm.Potential("con_6", pm.math.log(pm.math.switch(con_6, 1, 0.0)))
        potential_7 = pm.Potential("con_7", pm.math.log(pm.math.switch(con_7, 1, 0.0)))

        y = pm.Normal('y', mu=expected_value, sigma=sigma, observed=y_obs.values)

        start = {"sigma": 1.0, "beta": np.zeros(len(X.columns))}
        idata = pm.sample(draws=2000, tune=1000, chains=4, step=[pm.NUTS(target_accept=0.99, max_treedepth=30)])

The regression is performed with standardized input data. The restrictions are specified for the unstandardized coefficients, which is why they need to be transformed. During execution, when strict constraints are applied (using pm.math.log(pm.math.switch(constraint, 1, 0.0))), I receive the error message:

pymc.exceptions.SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'sigma_log__': array(-2.67328751), 'beta': array([ 0.95591867,  0.33871937,  0.00713485,  0.73745527,  0.74412467,
       -0.82814385,  0.69393661,  0.62493708,  0.95804281, -0.80145671,
       -0.51079383,  0.11153434, -0.66721516])}

This error disappears when I use a small positive number instead of 0.0 for the false branch of pm.math.switch. Does anyone know why this is happening, and whether the constraints have actually been integrated into the model?

cluhmann · January 16, 2024, 11:18pm

Welcome!

I cannot run your code because it relies on the external csv file. But I assume that the starting point[s] is[are] running afoul of your constraints. If the sampler cannot find valid points to begin its initialization, it may well give up with the error you’re seeing.
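To make this concrete, here is a minimal sketch in plain Python (no PyMC) that applies three of the sign constraints to the starting values printed in the error message. The helper indicator_logp is a hypothetical stand-in for pm.math.log(pm.math.switch(constraint, 1, false_value)), with log(0) written out as -inf. It assumes, as noted later in the thread, that the standard deviations are positive, so dividing by them does not change a coefficient's sign.

```python
import math

# Starting values for beta[4], beta[5], beta[8], copied from the error message.
start_beta = {4: 0.74412467, 5: -0.82814385, 8: 0.95804281}

# Sign required by con_3 (<= 0), con_5 (>= 0) and con_6 (<= 0).
required_sign = {4: -1, 5: +1, 8: -1}


def indicator_logp(satisfied, false_value=0.0):
    """Stand-in for log(switch(satisfied, 1, false_value)); log(0) is -inf."""
    value = 1.0 if satisfied else false_value
    return math.log(value) if value > 0 else -math.inf


# A constraint holds when the coefficient has the required sign (or is zero).
satisfied = {i: start_beta[i] * required_sign[i] >= 0 for i in start_beta}

# Potentials are added to the model logp, so the terms are summed.
hard_total = sum(indicator_logp(ok) for ok in satisfied.values())
soft_total = sum(indicator_logp(ok, false_value=1e-10) for ok in satisfied.values())

print(satisfied)   # which constraints hold at the starting point
print(hard_total)  # contribution with a false branch of 0.0
print(soft_total)  # contribution with a false branch of 1e-10
```

With a false branch of 0.0, a single violated constraint is enough to make the total log-probability -inf, which is what the sampler reports as a failed initial evaluation. With a small positive false branch the total stays finite, so sampling can start, but the constraint is then only a fixed penalty rather than a strict bound.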

ricardoV94 · January 17, 2024, 8:21am

Hard constraints like that are unlikely to play well with NUTS, even if it finds a valid initial point. It’s better if you can softly penalize them instead.

Some of those also seem avoidable altogether by using positive priors.
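As a rough illustration of the positive-prior idea, the sketch below draws the four single-coefficient constraints (con_2, con_3, con_5, con_6) from half-normal distributions, negated where the coefficient must be non-positive. It uses only the standard library; the helper half_normal and the dictionary keys are hypothetical. In a PyMC model the analogous choice would presumably be pm.HalfNormal for the non-negative coefficients and the negative of a pm.HalfNormal for the non-positive ones, which is an assumption and not something spelled out in this thread.

```python
import random

random.seed(0)


def half_normal(sigma=1.0):
    """Draw from a half-normal by folding a zero-mean normal at zero."""
    return abs(random.gauss(0.0, sigma))


# Hypothetical prior draws for the four sign-constrained coefficients.
draws = [
    {
        "beta_2": half_normal(),   # con_2: beta[2] >= 0
        "beta_4": -half_normal(),  # con_3: beta[4] <= 0
        "beta_5": half_normal(),   # con_5: beta[5] >= 0
        "beta_8": -half_normal(),  # con_6: beta[8] <= 0
    }
    for _ in range(1000)
]

# Every draw satisfies the sign constraints by construction.
all_valid = all(
    d["beta_2"] >= 0 and d["beta_4"] <= 0 and d["beta_5"] >= 0 and d["beta_8"] <= 0
    for d in draws
)
print(all_valid)
```

Because the sign is built into the prior's support, no Potential is needed for these four constraints. Only the constraints on sums of coefficients (con_1, con_4, con_7) would remain.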

jessegrabowski · January 17, 2024, 10:02am

A ZeroSumNormal prior might be useful, combined with something like pm.Potential('positive_penalty', -(betas[4] + betas[8])) to punish draws with positive values in the positions of the beta vector you want to be negative.

I don’t think dividing by the standard deviation is relevant here, because it’s always strictly positive.

knoahcud · January 18, 2024, 8:31am

Thank you for all of your advice!

I have tried using the initval argument in pm.sample and pm.Normal. Neither seems to work. When I check the initial point with model.initial_point() before pm.sample, it shows the initial values I set. Once pm.sample() starts, other values are used. What could be the problem here?

I changed some priors, which leaves only 3 constraints. Is there a threshold below which you would say that hard constraints are still acceptable with NUTS, or should I just soften the remaining 3? When softening those constraints, how small a number would be sufficient to emulate a hard constraint?

I don’t understand how ZeroSumNormal can be used here. The constraints are all inequalities (less-than-or-equal or greater-than-or-equal). Can you please explain the basic idea a bit further?
The constraints are given in the unstandardized formulation to ensure numerical stability in the resulting equation.

ricardoV94 · January 18, 2024, 8:55am

NUTS will never behave well with hard constraints if the posterior is close enough to “violating them”. By soft constraints I actually mean a continuous (differentiable) function that increasingly penalizes the posterior as it violates the constraints more and more. The shape and strength of this penalty will be something you have to investigate, as it depends on the model/data.

The typical example from ML is to penalize very large coefficients with something like sum(beta ** 2) * penalty_multiplier.

In a Bayesian setting, the ideal is to define constraints as part of the prior. But that is sometimes not possible or too cumbersome, in which case a Potential is a nice tool to just throw at the problem.
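One way to see the link between a penalty and a prior: entered into the log-probability with a negative sign, the quadratic penalty above has the same shape as a zero-mean normal prior with standard deviation 1 / sqrt(2 * penalty_multiplier). The sketch below checks this numerically in plain Python. The function names and the test vectors are hypothetical, chosen only for illustration.

```python
import math


def ridge_logp(betas, penalty_multiplier):
    """Quadratic penalty written as a (negative) log-probability term."""
    return -penalty_multiplier * sum(b ** 2 for b in betas)


def normal_logpdf(x, sigma):
    """Log-density of a zero-mean normal with standard deviation sigma."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


penalty_multiplier = 2.0
sigma = 1.0 / math.sqrt(2.0 * penalty_multiplier)  # prior SD matching the penalty

# Two arbitrary coefficient vectors to compare.
a = [0.3, -1.2]
b = [0.0, 0.5]

# Differences in log-probability are what the sampler reacts to;
# the normalizing constant of the normal density cancels out.
penalty_diff = ridge_logp(a, penalty_multiplier) - ridge_logp(b, penalty_multiplier)
prior_diff = sum(normal_logpdf(x, sigma) for x in a) - sum(normal_logpdf(x, sigma) for x in b)

print(penalty_diff, prior_diff)
```

The two differences agree up to floating-point error, so this particular penalty could equally be expressed as a prior. A Potential becomes the more convenient option when the penalty applies to a combination of parameters, such as a sum of coefficients, for which no standard prior is readily available.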

knoahcud · January 18, 2024, 10:10am

So, for example, con_1 = pm.math.ge(betas[0]/stds[0] + betas[1]/stds[1], 0.0) could be penalized with pm.Potential("con_1", pm.math.switch(con_1, 0, -((betas[0]/stds[0] + betas[1]/stds[1])**2 * penalty_multiplier))) instead of a fixed penalty as done beforehand.

ricardoV94 · January 18, 2024, 10:34am

Something like that, but probably the negative of the expression, as the Potential is added to the model logp. You want the logp to be lower the more the constraint is violated.

knoahcud · January 18, 2024, 12:03pm

Isn’t the negative in the switch of pm.Potential("con_1", pm.math.switch(con_1, 0, -((betas[0]/stds[0] + betas[1]/stds[1])**2 * penalty_multiplier))) enough, or am I missing something?

jessegrabowski · January 18, 2024, 12:45pm

My logic with ZeroSum was that you want some of the coefficients to be negative and some to be positive. If they all sum to zero, satisfying one set of constraints (say the negative ones) would automatically push the other parameter values towards satisfying the other constraints.

ricardoV94 · January 18, 2024, 12:48pm

Sorry, I didn’t see the minus before. That should be enough.
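For reference, the behaviour of the penalty agreed on above can be sketched in plain Python. Here g stands for the constrained quantity betas[0]/stds[0] + betas[1]/stds[1], which should be non-negative. The function name, the value of penalty_multiplier, and the sample values of g are hypothetical, and a suitable multiplier will depend on the model and data.

```python
def soft_constraint_logp(g, penalty_multiplier):
    """Log-probability term for the constraint g >= 0.

    Zero when the constraint holds, -penalty_multiplier * g**2 when it is
    violated, so the term is always <= 0 and decreases with the violation.
    """
    return 0.0 if g >= 0 else -penalty_multiplier * g ** 2


penalty_multiplier = 100.0  # hypothetical strength; needs tuning per model

# Evaluate the term for satisfied, boundary, and increasingly violated cases.
for g in (0.5, 0.0, -0.01, -0.1, -1.0):
    print(g, soft_constraint_logp(g, penalty_multiplier))
```

Unlike the log of a 0/1 indicator, this term is finite everywhere and both its value and its slope go to zero at the boundary g = 0, which is the kind of continuous, differentiable penalty described earlier in the thread. A larger penalty_multiplier pushes it closer to a hard constraint at the cost of a steeper log-probability surface for the sampler.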
