If there’s a relationship between missingness and a covariate that you are interested in, you have to model that explicitly. A model cannot give you more than the structure you put in its prior.
You could, for instance, model y and x jointly as a multivariate normal in which both y and x are partially missing. The covariance between them can then be learned from the data. There’s an old PyMC3 talk that touches on this: "Partial Missing Multivariate Observation and What to Do With Them" by Junpeng Lao.
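To make the joint-model idea concrete, here is a minimal NumPy sketch. It is not the PyMC model from the talk. It assumes a simpler case where x is fully observed and only y is missing, with missingness depending on x, and it uses the closed-form maximum likelihood estimate for a bivariate normal under that pattern instead of priors and sampling. The simulated data, the variable names and the logistic missingness rule are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated (hypothetical) data: true mean of y is 2.0, true cov(x, y) is 1.5.
x = rng.normal(0.0, 1.0, size=n)
y = 2.0 + 1.5 * x + rng.normal(0.0, 1.0, size=n)

# Assumed missingness mechanism: y is more likely to be missing when x is large.
p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))
missing = rng.random(n) < p_missing
y_obs = np.where(missing, np.nan, y)
seen = ~np.isnan(y_obs)

# Ignoring the missingness structure: mean of the observed y values only.
naive_mean_y = y_obs[seen].mean()

# Joint bivariate-normal model, factored as p(x) * p(y | x):
# the x part uses every row, the y | x part uses only rows where y is observed.
mu_x = x.mean()
var_x = x.var()
beta, alpha = np.polyfit(x[seen], y_obs[seen], 1)  # slope, intercept
resid_var = (y_obs[seen] - (alpha + beta * x[seen])).var()

# Recombine into the parameters of the joint distribution.
mu_y = alpha + beta * mu_x
cov_xy = beta * var_x
var_y = beta**2 * var_x + resid_var

print(f"naive mean of y: {naive_mean_y:.2f}")
print(f"joint-model mean of y: {mu_y:.2f}")
print(f"estimated cov(x, y): {cov_xy:.2f}")
```

The naive mean is pulled well below 2.0 because the rows with large x (and so large y) are the ones that tend to be dropped, while the joint model recovers the mean through the learned covariance with x. A Bayesian version in PyMC would follow the same structure, with priors on the mean and covariance and the missing entries treated as unknowns.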
Another example is a soft truncation process: Modeling with soft truncation