Help with Out of Sample Predictions

I am defining the model using MutableData and setting the dimensions and coords within each data container. I was trying to follow the post "PyMC 4.0 with labeled coords and dims" from Oriol unraveled.

with pm.Model() as base_model:

    # --- data containers ---
    fs_ = pm.MutableData(name="fs", value=fm.values, dims=("date", "fourier_mode"), coords={"date": mmm_dat['date'].values, "fourier_mode": np.arange(2 * n_order)})
    adstock_sat_media_ = pm.MutableData(name="asm", value=media_scaled, dims=("date", "channel"), coords={"date": mmm_dat['date'].values, "channel": np.arange(len(media_variables))})
    target_ = pm.MutableData(name="target", value=target_scaled, dims="date", coords={"date": mmm_dat['date'].values})
    t_ = pm.MutableData(name="t", value=t_scaled, dims="date", coords={"date": mmm_dat['date'].values})

When I make out-of-sample predictions using data of the same size (i.e. the exact training data), sample_posterior_predictive works as expected. However, when I attempt to change the input variables to a different size (fewer rows: 20 instead of 104), as follows:

test_coords = {"date": new_mmm_dat['date'], "fourier_mode": np.arange(2 * n_order), "channel": np.arange(4)}

pm.set_data({"asm": new_media_scaled, "t": new_t_scaled, "fs": new_fm.values}, model=base_model, coords=test_coords)
pred_oos = pm.sample_posterior_predictive(trace=base_model_trace, model=base_model, predictions=True, extend_inferencedata=False, random_seed=rng)

I get the following error:

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along 
dimension 1, the array at index 0 has size 20 and the array at index 1 has size 21

All the new input data have 20 rows. I also see that when sample_posterior_predictive is called, instead of only the likelihood being sampled, another coefficient from the model is sampled as well:

Sampling: [b_fourier, likelihood]

Questions:

  1. How can I produce out-of-sample predictions?
  2. Can sample_posterior_predictive return not just the likelihood, but also the intermediate (deterministic) values from the model, generated with the new data?

I tried removing the dims and coords from the data containers completely. So in the model:


with pm.Model(coords=coords) as base_model:

    fs_ = pm.MutableData(name="fs", value=fm.values)
    adstock_sat_media_ = pm.MutableData(name="asm", value=media_scaled)
    target_ = pm.MutableData(name="target", value=target_scaled)
    t_ = pm.MutableData(name="t", value=t_scaled)

and then for the out-of-sample step:

pm.set_data({"asm": new_media_scaled, "t": new_t_scaled, "fs": new_fm.values}, model=base_model)
pred_oos = pm.sample_posterior_predictive(trace=base_model_trace, model=base_model, predictions=True, extend_inferencedata=False, random_seed=rng)

The same error occurs.

I have looked at the input shapes in the original model and the data passed to set_data, and they seem fine:

media_scaled: (104, 4)
new_media_scaled: (20, 4)

t_scaled: (104,)
new_t_scaled: (20,)

fm.values: (104, 4)
new_fm.values: (20, 4)

I did find that the answer to question 2 above is to use var_names.
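Something along these lines works (a sketch; effect_x1 stands in for one of the effect_* deterministics defined in my model):

pred_oos = pm.sample_posterior_predictive(
    trace=base_model_trace,
    model=base_model,
    # deterministics to return alongside the likelihood
    var_names=["likelihood", "effect_x1"],
    predictions=True,
    random_seed=rng,
)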

The error is coming from a concatenation Op. Where does that happen in your model? The data is being set correctly, but something that depends on the data is not being updated.

Hey @jessegrabowski ,

The only apparent place I have a concat operation is here, in a transformation used to generate a deterministic variable in the model. ‘x’ is one of the input vectors of scaled media being passed into the model; alpha and theta are learned parameters (they would be taken from the trace), and the other arguments are fixed in the code. There is a stack operation as well…

def pt_adstock(x, alpha, l_max, normalize=True, delayed=False, theta=None):
    L_media = x.shape.eval()[0]

    newx = pt.concatenate([pt.zeros(l_max - 1), x])  # prepend l_max - 1 zeros to the series
    L_newx = newx.shape.eval()[0]

    if delayed:
        w = pt.as_tensor_variable([pt.power(alpha, (i - theta) ** 2) for i in range(l_max)])  # weights
    else:
        w = pt.as_tensor_variable([pt.power(alpha, i) for i in range(l_max)])  # weights

    w = w / pt.sum(w) if normalize else w

    # stack the rolling-window arrays
    xmat = pt.stack([newx[(L_newx - i) - L_media : (L_newx - i)] for i in range(l_max)])

    adstockx = pt.dot(w, xmat)

    return adstockx

Having eval inside your model is definitely going to be a source of errors, because it won’t dynamically update when the inputs change. But that doesn’t seem to be the problem here. Where does l_max come from?

l_max comes from a constant elsewhere in the notebook and doesn’t change between model training and testing.

Is it possible to avoid eval-ing the shape in this code?

With only a partial error message and partial code, my suspicion is that the error is coming from the newx line. I’d add some debug prints to your code to show shape information, to check whether that is indeed true. pytensor.printing.Print is handy for this; see here.
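For example, something like this inside pt_adstock (a sketch; newx is your variable):

import pytensor

# wrap the variable in a Print Op; its shape is printed every time the graph is evaluated
newx = pytensor.printing.Print("newx", attrs=["shape"])(newx)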

To avoid the evals, I think you will have to use a scan Op to construct xmat.
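If you haven’t used scan before, it is PyTensor’s symbolic loop construct. A minimal sketch (a toy example that just squares each element):

import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
# scan loops over the left-most axis of each sequence and stacks the per-step results
result, _ = pytensor.scan(lambda xi: xi ** 2, sequences=[x])
print(result.eval({x: [1.0, 2.0, 3.0]}))  # [1. 4. 9.]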

I extracted more of the error; it appears to be the stack step:

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 20 and the array at index 1 has size 21
Apply node that caused the error: Join(0, ExpandDims{axis=0}.0, ExpandDims{axis=0}.0, ExpandDims{axis=0}.0, ExpandDims{axis=0}.0, ExpandDims{axis=0}.0, ExpandDims{axis=0}.0, ExpandDims{axis=0}.0, ExpandDims{axis=0}.0)
Toposort index: 171
Inputs types: [TensorType(int8, shape=()), TensorType(float64, shape=(1, None)), TensorType(float64, shape=(1, None)), TensorType(float64, shape=(1, None)), TensorType(float64, shape=(1, None)), TensorType(float64, shape=(1, None)), TensorType(float64, shape=(1, None)), TensorType(float64, shape=(1, None)), TensorType(float64, shape=(1, None))]
Inputs shapes: [(), (1, 20), (1, 21), (1, 22), (1, 23), (1, 24), (1, 25), (1, 26), (1, 27)]
Inputs strides: [(), (160, 8), (168, 8), (176, 8), (184, 8), (192, 8), (200, 8), (208, 8), (216, 8)]
Inputs values: [array(0, dtype=int8), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown']
Outputs clients: [[Transpose{axes=[1, 0]}(Join.0)]]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/home/ec2-user/SageMaker/custom-miniconda/miniconda/envs/pymc_env/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3009, in run_cell
    result = self._run_cell(
  File "/home/ec2-user/SageMaker/custom-miniconda/miniconda/envs/pymc_env/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3064, in _run_cell
    result = runner(coro)
  File "/home/ec2-user/SageMaker/custom-miniconda/miniconda/envs/pymc_env/lib/python3.11/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "/home/ec2-user/SageMaker/custom-miniconda/miniconda/envs/pymc_env/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3269, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/home/ec2-user/SageMaker/custom-miniconda/miniconda/envs/pymc_env/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3448, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/home/ec2-user/SageMaker/custom-miniconda/miniconda/envs/pymc_env/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_32323/1104102667.py", line 53, in <module>
    var_effect = pm.Deterministic(name=f'effect_{var}', var=beta * logistic_saturation_(pt_adstock(x, alpha, l_max, True, True, theta), mu_log_sat))
  File "/tmp/ipykernel_32323/2543389627.py", line 24, in pt_adstock
    xmat = pt.stack([newx[(L_newx -i) -L_media : (L_newx -i)] for i in range(l_max)])

I’m still trying to get a feel for how to print shape information throughout the model, but so far I’m not sure; I’m very new to PyMC and PyTensor…

A version without eval would look like this:

import pytensor
import pytensor.tensor as pt

def pt_adstock(x, alpha, l_max, normalize=True, delayed=False, theta=None):
    if delayed:
        w = pt.power(alpha, (pt.arange(l_max) - theta) ** 2)
    else:
        w = pt.power(alpha, pt.arange(l_max))

    w = w / pt.sum(w) if normalize else w

    # each scan step extracts one rolling window of the zero-padded series
    def rolling_window_stack_step(i, x, L_newx, L_media):
        start = L_newx - i - L_media
        stop = L_newx - i
        return x[start:stop]

    newx = pt.concatenate([pt.zeros(l_max - 1), x])  # prepend l_max - 1 zeros to the series
    # scan stacks the windows along the left-most axis, so xmat has shape (l_max, len(x))
    xmat, _ = pytensor.scan(rolling_window_stack_step,
                            sequences=pt.arange(l_max),
                            non_sequences=[newx, newx.shape[0], x.shape[0]],
                            strict=True)

    adstockx = pt.dot(w, xmat)

    return adstockx

I also removed the list comprehension used to compute w, because pt.power is happy to do the broadcasting for you.
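As a quick sanity check (illustrative values only; the alpha and l_max here are made up), the scan version now evaluates for any input length:

import numpy as np

x_test = pt.as_tensor_variable(np.arange(10.0))
adstocked = pt_adstock(x_test, alpha=0.5, l_max=4)
# eval here is just for testing outside a model
print(adstocked.eval().shape)  # (10,) -- same length as the input series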


This is so great, thank you! It works now!

Curious: does scan always return tensors stacked vertically like this?
Do you have any examples of how to add a print inside scan, to see how that is done and get a better feel for what exactly happens there?

Scan always “loops” over the left-most dimension, and the returns are also stacked that way.

Here’s an example that simulates a trajectory from an AR2 model, using the print statements:

import pytensor
import pytensor.tensor as pt
import pymc as pm
from pymc.pytensorf import collect_default_updates

# AR(2) in companion form: x_t = A @ x_{t-1} + R @ innovation_t
A = pt.as_tensor_variable([[0.99, -0.5],
                           [1.0,  0.00]])
x0 = pt.zeros((2, 1))
R = pt.as_tensor_variable([[1.0], [0.0]])
n_steps = pt.iscalar('n_steps')

def ar2_step(x, A, R):
    innovation = pm.Normal.dist()
    innovation = pytensor.printing.Print('innov_draw')(innovation)
    next_x = A @ x + R @ innovation
    next_x = pytensor.printing.Print('next_x')(next_x)
    # collect_default_updates tells scan how to update the RNG behind pm.Normal.dist()
    return next_x, collect_default_updates(outputs=[next_x], inputs=[x, A, R])

output, updates = pytensor.scan(ar2_step,
                                outputs_info=[x0],
                                non_sequences=[A, R],
                                n_steps=n_steps)

f = pytensor.function([n_steps], output, updates=updates)

Things to note:

  • The input shapes are: A.shape = (2, 2), x0.shape = (2,1), and R.shape = (2, 1).

  • You can see that I just over-wrote the variables with printing versions of themselves. You can also give them new names, but then you need to use the printing version in subsequent computations.

  • This function doesn’t have a sequences argument, so you have to tell it how many steps to loop for. I made this a variable. This is not going to be the usual use case for you as a PyMC user.

  • I compiled the function in the end with pytensor.function. You never want to do this as a PyMC user. PyMC will automatically compile your model when you call pm.sample. It’s for illustration purposes only.

So if we call f, the compiled function, we get the following output:

y = f(3)
>>>Out: 
innov_draw __str__ = 0.9230099481704481
next_x __str__ = [[0.92300995]
 [0.        ]]
innov_draw __str__ = 0.3474986043040847
next_x __str__ = [[1.26127845]
 [0.92300995]]
innov_draw __str__ = 0.3165087583179684
next_x __str__ = [[1.10366945]
 [1.26127845]]

So you can see it gives us debug output for each node as it computes. Checking the shape of y confirms that scan always stacks on the left-most axis:

y.shape
>>> (3, 2, 1)

The (2, 1) comes from the initial value x0, and the 3 comes from n_steps. If we scan for 100 steps:

y = f(100)
y.shape
>>> (100, 2, 1)