I believe the budget optimizer is failing to create sensible budgets due to extremely small gradients at the initial conditions. I've tried a couple of methods such as SLSQP, trust-constr, and others from this page, along with different iteration limits, step sizes, and gradient tolerances. The optimizer reports success, but with SLSQP the gradients are about -1e-10 within 1-2 iterations. Here is an example of a minimize_kwargs option I used:
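Something along these lines (the exact values here are illustrative rather than the precise ones from my runs; they are forwarded to scipy.optimize.minimize, so method and options follow its signature):

# Illustrative minimize_kwargs (hypothetical values), passed through to
# scipy.optimize.minimize.
minimize_kwargs = {
    "method": "SLSQP",
    "options": {"maxiter": 1_000, "ftol": 1e-9},
}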
I tried setting various initial conditions with the x0 param, such as giving channel A 30-50% of the budget, channel B 30-50% of the budget, etc., and I still converge in 1-2 iterations with vastly different budget allocations each time. There doesn't seem to be any meaningful direction in the gradients.
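For instance, a starting allocation along these lines (channel names and numbers purely illustrative):

import numpy as np

# Hypothetical starting point: split a toy total budget across three channels.
total_budget = 100_000.0
x0 = np.array([0.40, 0.35, 0.25]) * total_budget  # channels A, B, C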
If I don’t supply budget bounds it gives uniform spends:
These values look more reasonable at first glance. Why are there scaling issues at play here? I thought that the MMM class internally scales the channels and target variables when fitting?
Hey, indeed, the scalers are handled under the hood. Nevertheless, in your case the optimizer's inputs and outputs move in the original scale, meaning the optimizer sees the information on the original scale.
Because your input is probably too far in scale from your output, this can happen. Your modification makes sense, but the easier way would be to do something like:
import pytensor.tensor as pt
# Assumed import path; adjust to wherever _check_samples_dimensionality lives in your version.
from pymc_marketing.mmm.utility import _check_samples_dimensionality


def average_response(
    samples: pt.TensorVariable, budgets: pt.TensorVariable
) -> pt.TensorVariable:
    """Average response of the posterior predictive, rescaled to the budget scale."""
    # Normalize the mean response by its maximum (roughly O(1)), then multiply
    # by the total budget so the objective is on the same scale as the spends.
    return (
        pt.mean(_check_samples_dimensionality(samples)) / pt.max(samples)
    ) * pt.sum(budgets)
No need to go directly into the gradients; just adjust your response function so it is on the same scale as your budgets. One way to do it is shown above: this way the gradient SLSQP sees is not so tiny that it falls under ftol after one iteration.
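To see the effect concretely, here is a small standalone check with made-up numbers (not real posterior samples), reproducing just the arithmetic of the adjusted utility:

import numpy as np
import pytensor.tensor as pt

# Toy values: responses on the original (large) scale, spends on the budget scale.
samples_val = np.array([2.0e6, 2.5e6, 3.0e6])
budgets_val = np.array([40_000.0, 60_000.0])

samples = pt.vector("samples")
budgets = pt.vector("budgets")
objective = (pt.mean(samples) / pt.max(samples)) * pt.sum(budgets)

# ~83_333: the objective is now on the same order of magnitude as the budgets,
# so the gradients are no longer vanishingly small relative to ftol.
print(objective.eval({samples: samples_val, budgets: budgets_val}))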
Note: scaling a model's inputs or outputs when they are multi-dimensional inside an optimization is not easy, given the internal iterations the model performs in each call. If any scaling is to be applied because the magnitudes differ, it is up to the user to decide on the scaling process, since a general rule for all cases can be complicated and not beneficial for certain types of optimization problems.