When you execute pm.sample, you should get a logging message that tells you which sampler is being assigned to each random variable. By default, continuous variables are given the NUTS sampler, and discrete variables are given either Metropolis or BinaryMetropolis. PyMC can combine different samplers into a single compound step, so if you have a mixture of continuous and discrete variables, it will draw proposals for each variable from the appropriate sampler.
It’s true that marginalization can be a pain. I don’t know anything about your specific likelihood function, but unless it involves recursive computation or expensive matrix operations, I would guess that marginalization will end up being faster than the compound sampler, and will offer more stable and efficient sampling to boot. But I stress that this is pure speculation.