# What is the correct format/shape of the initial values for Sequential Monte Carlo (SMC)?

How do I properly format the initial values passed to the parameter `start` in the function `pymc.smc.sample_smc` for Sequential Monte Carlo (SMC)?

To give a concrete example, let me consider the following code for Bayesian linear regression:

```python
import pymc as pm
import numpy as np


def basic_model(observed_data):
    array_sizes = np.array([size for (size, _) in observed_data])
    array_costs = np.array([cost for (_, cost) in observed_data])
    coefficient_sigma = 5

    with pm.Model() as model:
        coefficient0 = pm.HalfNormal(
            "coefficient0", sigma=coefficient_sigma)
        coefficient1 = pm.HalfNormal(
            "coefficient1", sigma=coefficient_sigma)

        predicted_bounds = coefficient0 + coefficient1 * array_sizes
        observed_costs = pm.Normal("observed_costs", mu=predicted_bounds,
                                   sigma=10, observed=array_costs)
    return model


observed_data = [[1, 1], [2, 2], [4, 3], [8, 4], [
    16, 7], [32, 10], [64, 13], [128, 17], [256, 18]]

init_nuts = {"coefficient0": 10,
             "coefficient1": 10}

num_draws = 1000
num_chains = 4

init_smc = {"coefficient0_log__": np.full((num_draws, num_chains), 10),
            "coefficient1_log__": np.full((num_draws, num_chains), 10)}

with basic_model(observed_data):
    idata = pm.sample_smc(num_draws, start=init_smc,
                          chains=num_chains, random_seed=42)
```

When I run this code with PyMC v5.13.1, I get a warning `UserWarning: More chains (4) than draws (1). Passed array should have shape (chains, draws, *shape)`. I believe the warning stems from the improper shape/format of the initial values `init_smc` I pass to the function `pm.sample_smc`. Here, `init_smc` is a dictionary mapping variable names (i.e., `coefficient0_log__` and `coefficient1_log__`) to numpy arrays of shape `(num_draws, num_chains)`.
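A side note on scale, separate from the shape question: the `_log__` suffix is the name PyMC gives to the log-transformed value of a positive variable such as `HalfNormal`, so the arrays in `init_smc` are read on the log scale. Filling them with 10 therefore corresponds to a coefficient of about `exp(10)`, not 10. The NumPy-only sketch below assumes the intent was to start at the same constrained value of 10 used in `init_nuts`; the name `init_smc_log_scale` is made up for illustration, and nothing here is claimed to fix the warning.

```python
import numpy as np

num_draws = 1000
num_chains = 4

# Value on the constrained (positive) scale, as used in init_nuts.
constrained_start = 10.0

# A "_log__" variable stores log(value), so a stored 10 means exp(10).
implied_by_log_10 = np.exp(10.0)

# To start at 10 on the constrained scale, store log(10) instead.
log_scale_start = np.log(constrained_start)

# Same array layout as in the question; only the fill value changes.
init_smc_log_scale = {
    "coefficient0_log__": np.full((num_draws, num_chains), log_scale_start),
    "coefficient1_log__": np.full((num_draws, num_chains), log_scale_start),
}
```

Whether this layout is the one `sample_smc` expects is still the open question of this thread.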

I couldn’t find documentation on what the initial values’ format should be. Does anyone know by any chance how to resolve this issue? Thanks a lot in advance!

For some reason, if we replace

```python
init_smc = {"coefficient0_log__": np.full((num_draws, num_chains), 10),
            "coefficient1_log__": np.full((num_draws, num_chains), 10)}
```

with randomized initial values

```python
init_smc = {"coefficient0_log__": np.random.normal(size=(num_draws, num_chains)),
            "coefficient1_log__": np.random.normal(size=(num_draws, num_chains))}
```

the warning is suppressed, even though the shape of the numpy arrays inside `init_smc` remains the same.

Shouldn’t it be chains x draws instead of the other way around?

@ricardoV94 That’s what I initially thought as well. I read somewhere that the shape should be `(num_chains, num_draws)`, although I don’t recall where exactly I saw it. But when I tried the shape `(num_chains, num_draws)` in my code, it crashed. So I instead had to use the shape `(num_draws, num_chains)`. It would be great if someone could revise the documentation of `sample_smc` to clarify this point.
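To make the two layouts being compared concrete, here is a small NumPy-only sketch. `make_start` is a hypothetical helper written for this post, not part of PyMC, and it does not settle which layout `sample_smc` actually expects; it only builds both candidates so they can be tried side by side.

```python
import numpy as np


def make_start(var_names, num_draws, num_chains, draws_first=True, fill=0.0):
    """Build a start dict in one of the two layouts discussed in this thread."""
    if draws_first:
        shape = (num_draws, num_chains)
    else:
        shape = (num_chains, num_draws)
    return {name: np.full(shape, fill, dtype=float) for name in var_names}


names = ["coefficient0_log__", "coefficient1_log__"]

# Layout that ran (with a warning) in the question.
draws_by_chains = make_start(names, 1000, 4, draws_first=True)

# Layout that reportedly crashed.
chains_by_draws = make_start(names, 1000, 4, draws_first=False)

for name in names:
    print(name, draws_by_chains[name].shape, chains_by_draws[name].shape)
```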

That may actually be a bug / untested feature? I see the only test we have is for a single chain: pymc/tests/smc/test_smc.py at 6761c0c73ba07cc9dc51ec3adab7f1aa5f76b23d · pymc-devs/pymc · GitHub

Would you mind opening an issue in our GitHub repository to check whether this is behaving as expected and/or to update the docs?

Sure. Let me open an issue on PyMC’s GitHub repository.

@ricardoV94 I’ve reported this issue on GitHub, asking for clarification in the PyMC documentation and an update to the unit tests.

Thanks!