Naively I’d think the right shape is (1, 1000, 591k), not with the 591k dimension appearing twice. What dimensions do you have on the constant_data group?
It might help to go through “PyMC 4.0 with labeled coords and dims” on Oriol unraveled. In xarray the important thing is the dimension name, not its length.
Also, in case this wasn’t clear already: the data in your InferenceData will be stored as NumPy arrays. You will need to chunk it (i.e. convert it to Dask arrays) before computing the log likelihood. If the 591k dimension is called “obs_id” you can do:
idata = idata.chunk(obs_id=5000)
Important: the chunk size is critical for efficiency. Too small and the computation will be slow from overhead; too big and there won’t be proper parallelization (which needs several chunks loaded into RAM at once).
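Here is a small self-contained sketch of the chunk-then-reduce pattern on a toy dataset (it assumes dask is installed; the sizes and the `log_likelihood` variable name are just placeholders for your real data):

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Toy stand-in for a (chain, draw, obs_id) log-likelihood array
ds = xr.Dataset(
    {"log_likelihood": (("chain", "draw", "obs_id"), rng.normal(size=(1, 100, 10_000)))}
)

# Convert to dask arrays: computations become lazy and chunk-parallel
ds = ds.chunk({"obs_id": 5_000})

# Reductions now process a few chunks at a time instead of the whole array
total = ds["log_likelihood"].sum("obs_id").compute()
print(total.shape)  # (1, 100)
```

With your real 591k-long dimension you’d tune the chunk size so each chunk fits comfortably in memory while still leaving several chunks to process in parallel.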
Useful references: