Labeled coords and dims in hierarchical group setting with different sizes

Dear all,
I have read already this tutorial

and this Coordinates in PyMC & InferenceData Objects – Computational Behavior Lab – Experiments, decision-making, Python, math, etc. in addition to many block posts, but I just can’t wrap my head around the coords and shape parameters and what format the data needs to be in.

I have experimental data with two different groups (aka conditions) with a different amount of participants in each (~30, labled consicutively i.e. 1-60). Then I have a different amount of observations for each participant (~25).

I set up an hierarchical model: each participants parameter should be drawn from the respective group distribution, the problem is, that I get 60 parameters, i.e. parameters for condition 1 from a participant that has not even been in group 1. I also tried to lable the participants per condition starting from 1 but it didn’t work either. I also don’t want to assume some common distributions between participant 1 of condition 1 and participant 1 of condition 2. If I use the “old” shape parameter to force the dimensions that I want to have i.e. (two conditions,participants) I get broadcasting errors.

This is my code:

exp_idx, experiment = pd.factorize(df["exp_idx"], sort=True)
participant_idx, participant_codes = pd.factorize(df["participant"], sort=True)
coords = {"Experiment": [1,2], "obs_id": np.arange(df.shape[0]), "participants": participant_codes.tolist(),}
with pm.Model(coords=coords) as hier_model_repara:
    a = pm.Data("a", df.a.values, dims="obs_id")
    a_obs = pm.Data("a_obs", df.a_obs.values, dims="obs_id")
    b = pm.Data("b", df.b.values, dims="obs_id")
    b_obs = pm.Data("b_obs", df.b_obs.values, dims="obs_id")
    level = pm.MutableData('level', df.level_scaled.values, dims='obs_id')
    nr_o = pm.Data('nr_o', df.nr_o.values, dims="obs_id")
    nr = pm.Data('nr', df.nr_trials.values, dims="obs_id")   
    
    # experiment dimensions -> 2 
# 3 hyper priors for group level (adaptation, intercept, level)
    mu_adap = pm.Normal("mu_adap", 0.0, sigma=10.0, dims="Experiment")
    sigma_adap = pm.HalfCauchy('sigma_adap', 5, dims="Experiment")
    mu_i = pm.Normal('mu_i', mu=0, sigma=2, dims = "Experiment")
    sigma_i = pm.HalfCauchy('sigma_i', 5, dims="Experiment")
    mu_l = pm.Normal('mu_l', mu=0, sigma=2, dims = "Experiment")
    sigma_l = pm.HalfCauchy('sigma_l', 5, dims="Experiment")
    
    
# individual participant parameters
    adaptation_offset = pm.Normal('a_offset', mu=0, sigma=1,  dims=("participants","Experiment"))
    adaptation = pm.Deterministic("adaptation", mu_ adap + adaptation_offset * sigma_ adap)
    
    intercept_offset = pm.Normal('i_offset', mu=0, sigma=1, dims=("participants","Experiment"))
    intercept = pm.Deterministic("intercept", mu_i + intercept_offset *sigma_i)
    
    level_offset = pm.Normal('m_offset', mu=0, sigma=1, dims=("participants","Experiment"))
    levelscale = pm.Deterministic("levelscale", mu_l + level_offset *sigma_l)
           
    A_diff = a - a_obs
    B_diff = b_obs - b 
    diff = A_diff + B_diff 

    diff_adapt = diff * adaptation[participant_idx,exp_idx]

    p = pm.Deterministic('p', pm.math.sigmoid(intercept[participant_idx,exp_idx] + neural_response + level*levelscale[participant_idx,exp_idx]), dims='obs_id')
    n_os = pm.Binomial('respone',n=nr, p=p, observed = nr_o, dims='obs_id')

In addition to that, the hierarchical model does perform worse (tested with loo) than the simple pooled model, is that even possible? After all I read, everyone pleads to use a hierarchical model as it should explain more variance and also fits the actual design better. What am I doing wrong?
Help is highly appreciated!

Just looking through your model quickly, it looks like you have have a separate adaptation parameter for every combination of participant and experiment (i.e., dims=("participants","Experiment")). So 2 per participant. I assume this is not what you want. I would assume you want 1 per participant, but have these participant-level parameters drawn from separate hyper parameters. But you have a single list of participant IDs and no way to figure out which condition/group/Experiment each is associated with. So you can do something like this?

adaptation_offset = pm.Normal('a_offset',
                              mu=0,
                              sigma=1,
                              dims="participants"
)
adaptation = pm.Deterministic("adaptation",
                              mu_adap[exp_idx] +
                              (adaptation_offset * sigma_adap[exp_idx])
)

If I didn’t screw that up, it should create 1 adaptation_offset per participant, and then creates the per-participant adaptation by selecting the appropriate, experiment-specific value of mu_adap and sigma_adap. Similar sorts of things can be done for the other parameters.

Thank you for your response!

I tried this already before but when I adjust the code after the lines you posted like this:

diff_adapt = pm.Deterministic("adaptation_diff", adaptation_diff * adaptation[exp_idx])

    p = pm.Deterministic('p', pm.math.sigmoid(intercept[exp_idx] + diff_adapt + level*levelscale[exp_idx]))
    n_os = pm.Binomial('respone',n=nr, p=p, observed = nr_o, dims='obs_id')

the model compiles but when sampling I get the ValueError: Input dimension mismatch. One other input has shape[0] = 1650, but input[1].shape[0] = 60. (I have 1650 observations and 60 participants).
I also tried with different versions of adaptation[exp_idx] without any index and with adaptation[exp_idx,participant] but doesn’t work either.
If I don’t specify the dims of the code that follows at all I get the AssertionError: Could not broadcast dimensions even before sampling.

I’m not sure what adaptation_diff is as it wasn’t in your original model and I’m not sure what line is throwing the error without a more complete error message/traceback. Can you provide the model you are now running?

I just re-wrote this line in order that I can see the adaptation-difference values after sampling from the model. Everything else stayed the same except that I removed “experiment” as a dim for the offset parameters and adjusted the mus/sigmas with the index - as you proposed.

So when I specify the parameters intercept, adaptation and level to be participant specific, I would assume I also need to index this when using the varaiables later in the model to calculate another parameter with it, don’t I?

That line is going to break because this line

adaptation = pm.Deterministic("adaptation",
                              mu_adap[exp_idx] +
                              (adaptation_offset * sigma_adap[exp_idx])
)

creates 1 value of adaptation per participant. Asking for adaptation[participant_idx,exp_idx] still assumes you have 2 values per participant.

I agree to what you say, but also with diff_adapt = diff * adaptation[exp_idx] I get the error about non-matching shapes.
(of course also changing [exp_idx] for intercept and level)

It’s difficult to do much without example code and/or full error messages.

Ok sorry, I didn’t want to spam. Here is everything:

The coords:

participant_idx,participant_codes = pd.factorize(df["participant"])
exp_idx,exp_codes = pd.factorize(df["exp_idx"])
coords = {'participants': participant_codes.tolist(),
          'experiment': [1,2],
          'obs_id': np.arange(df['nr_o'].shape[0])
         }

The model:

with pm.Model(coords=coords) as hier_model_repara:
    a= pm.Data("a", df.a.values, dims="obs_id")
    a_obs= pm.Data("stimulus_f0", df.a_obs.values, dims="obs_id")
    b= pm.Data("b", df.b.values, dims="obs_id")
    b_obs = pm.Data("b_obs", df.b_obs.values, dims="obs_id")
    level= pm.MutableData('level', df.level_scaled.values, dims='obs_id')
    nr_o = pm.Data('nr_o', df.nr_o.values, dims="obs_id")
    nr = pm.Data('nr', df.nr_trials.values, dims="obs_id")   
    
    # experiment dimensions -> 2 
    mu_a = pm.Normal("mu_a", 0.0, sigma=10.0, dims="experiment")
    sigma_a = pm.HalfCauchy('sigma_a', 5, dims="experiment")
    mu_i = pm.Normal('mu_i', mu=0, sigma=2, dims = "experiment")
    sigma_i = pm.HalfCauchy('sigma_i', 5, dims="experiment")
    mu_l = pm.Normal('mu_l', mu=0, sigma=2, dims = "experiment")
    sigma_l = pm.HalfCauchy('sigma_l', 5, dims="experiment")
    
    
    adaptation_offset = pm.Normal('a_offset', 
                                  mu=0, sigma=1,  
                                  dims=("participants"))
    adaptation = pm.Deterministic("adaptation", 
                                  mu_a[exp_idx] + (adaptation_offset * sigma_a[exp_idx]))
    
    intercept_offset = pm.Normal('i_offset', 
                                 mu=0, sigma=1, 
                                 dims=("participants"))
    intercept = pm.Deterministic("intercept", 
                                 mu_i[exp_idx] + (intercept_offset *sigma_i[exp_idx]))
    
    level_offset = pm.Normal('l_offset', 
                             mu=0, 
                             sigma=1, 
                             dims=("participants"))
    levelscale = pm.Deterministic("levelscale", 
                                  mu_l[exp_idx] + (level_offset *sigma_l[exp_idx]))
           
    a_diff = a - a_obs
    b_diff = b_obs - b
adaptation_diff = a_diff + b_diff 

    diff_adapt = pm.Deterministic("adaptation_diff", adaptation_diff * adaptation[exp_idx])

    p = pm.Deterministic('p', pm.math.sigmoid(intercept[exp_idx] + diff_adapt  + level*levelscale[exp_idx]))
    n_os = pm.Binomial('respone',n=nr, p=p, observed = nr_o, dims='obs_id')

The error:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/aesara/compile/function/types.py:975, in Function.__call__(self, *args, **kwargs)
    973 try:
    974     outputs = (
--> 975         self.vm()
    976         if output_subset is None
    977         else self.vm(output_subset=output_subset)
    978     )
    979 except Exception:

ValueError: Input dimension mismatch. One other input has shape[0] = 1650, but input[1].shape[0] = 60.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Input In [645], in <cell line: 1>()
      1 with pooled_model_experiment_repara:
----> 2     adaptation_pooled_model_experiment_repara = pm.sample(draws=4000, chains=4, tune=1000, cores=16, return_inferencedata=True)

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/pymc/sampling.py:540, in sample(draws, step, init, n_init, initvals, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, **kwargs)
    538         [kwargs.setdefault(k, v) for k, v in nuts_kwargs.items()]
    539     _log.info("Auto-assigning NUTS sampler...")
--> 540     initial_points, step = init_nuts(
    541         init=init,
    542         chains=chains,
    543         n_init=n_init,
    544         model=model,
    545         random_seed=random_seed_list,
    546         progressbar=progressbar,
    547         jitter_max_retries=jitter_max_retries,
    548         tune=tune,
    549         initvals=initvals,
    550         **kwargs,
    551     )
    553 if initial_points is None:
    554     # Time to draw/evaluate numeric start points for each chain.
    555     ipfns = make_initial_point_fns_per_chain(
    556         model=model,
    557         overrides=initvals,
    558         jitter_rvs=filter_rvs_to_jitter(step),
    559         chains=chains,
    560     )

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/pymc/sampling.py:2494, in init_nuts(init, chains, n_init, model, random_seed, progressbar, jitter_max_retries, tune, initvals, **kwargs)
   2487 _log.info(f"Initializing NUTS using {init}...")
   2489 cb = [
   2490     pm.callbacks.CheckParametersConvergence(tolerance=1e-2, diff="absolute"),
   2491     pm.callbacks.CheckParametersConvergence(tolerance=1e-2, diff="relative"),
 2492 ]
-> 2494 initial_points = _init_jitter(
   2495     model,
   2496     initvals,
   2497     seeds=random_seed_list,
   2498     jitter="jitter" in init,
   2499     jitter_max_retries=jitter_max_retries,
   2500 )
   2502 apoints = [DictToArrayBijection.map(point) for point in initial_points]
   2503 apoints_data = [apoint.data for apoint in apoints]

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/pymc/sampling.py:2388, in _init_jitter(model, initvals, seeds, jitter, jitter_max_retries)
   2386 if i < jitter_max_retries:
   2387     try:
-> 2388         model.check_start_vals(point)
   2389     except SamplingError:
   2390         # Retry with a new seed
   2391         seed = rng.randint(2**30, dtype=np.int64)

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/pymc/model.py:1791, in Model.check_start_vals(self, start)
   1785     valid_keys = ", ".join(self.named_vars.keys())
   1786     raise KeyError(
   1787         "Some start parameters do not appear in the model!\n"
   1788         f"Valid keys are: {valid_keys}, but {extra_keys} was supplied"
   1789     )
-> 1791 initial_eval = self.point_logps(point=elem)
   1793 if not all(np.isfinite(v) for v in initial_eval.values()):
   1794     raise SamplingError(
   1795         "Initial evaluation of model at starting point failed!\n"
   1796         f"Starting values:\n{elem}\n\n"
   1797         f"Initial evaluation results:\n{initial_eval}"
   1798     )

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/pymc/model.py:1832, in Model.point_logps(self, point, round_vals)
   1826 factors = self.basic_RVs + self.potentials
   1827 factor_logps_fn = [at.sum(factor) for factor in self.logp(factors, sum=False)]
   1828 return {
   1829     factor.name: np.round(np.asarray(factor_logp), round_vals)
   1830     for factor, factor_logp in zip(
   1831         factors,
-> 1832         self.compile_fn(factor_logps_fn)(point),
   1833     )
   1834 }

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/pymc/aesaraf.py:695, in PointFunc.__call__(self, state)
    694 def __call__(self, state):
--> 695     return self.f(**state)

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/aesara/compile/function/types.py:988, in Function.__call__(self, *args, **kwargs)
    986     if hasattr(self.vm, "thunks"):
    987         thunk = self.vm.thunks[self.vm.position_of_error]
--> 988     raise_with_op(
    989         self.maker.fgraph,
    990         node=self.vm.nodes[self.vm.position_of_error],
    991         thunk=thunk,
    992         storage_map=getattr(self.vm, "storage_map", None),
    993     )
    994 else:
    995     # old-style linkers raise their own exceptions
    996     raise

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/aesara/link/utils.py:534, in raise_with_op(fgraph, node, thunk, exc_info, storage_map)
    529     warnings.warn(
    530         f"{exc_type} error does not allow us to add an extra error message"
    531     )
    532     # Some exception need extra parameter in inputs. So forget the
    533     # extra long error message in that case.
--> 534 raise exc_value.with_traceback(exc_trace)

File /projects/crunchie/miniconda3/envs/pymc_env/lib/python3.10/site-packages/aesara/compile/function/types.py:975, in Function.__call__(self, *args, **kwargs)
    972 t0_fn = time.time()
    973 try:
    974     outputs = (
--> 975         self.vm()
    976         if output_subset is None
    977         else self.vm(output_subset=output_subset)
    978     )
    979 except Exception:
    980     restore_defaults()

ValueError: Input dimension mismatch. One other input has shape[0] = 1650, but input[1].shape[0] = 60.
Apply node that caused the error: Elemwise{Composite{(i0 + (i1 * i2))}}[(0, 0)](AdvancedSubtensor1.0, i_offset, AdvancedSubtensor1.0)
Toposort index: 29
Inputs types: [TensorType(float64, (None,)), TensorType(float64, (None,)), TensorType(float64, (None,))]
Inputs shapes: [(1650,), (60,), (1650,)]
Inputs strides: [(8,), (8,), (8,)]
Inputs values: ['not shown', 'not shown', 'not shown']
Outputs clients: [[AdvancedSubtensor1(Elemwise{Composite{(i0 + (i1 * i2))}}[(0, 0)].0, TensorConstant{[0 0 0 ... 1 1 1]})]]

HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

Your code runs for me, but I had to assume what your data looks like. So maybe provide the output of df.head()?