Hi everyone,
I am a Bayesian beginner, and I am having some difficulty with a high-dimensional problem.
My goal is to fit a multivariate distribution f (the r-largest order statistics, or GEVr, model) to an observational dataset drawn from this distribution plus some Gaussian random noise. The distribution depends on 3 free parameters (denoted \Theta) and takes as input a vector \mathbf{u} = (u_1, u_2, ..., u_r) of size r whose components are ordered, i.e., u_1 > u_2 > ... > u_r.
I have already managed to fit it to an idealized observational dataset (without random noise), but now I'd like to move on to a more realistic case by adding Gaussian noise. In this scenario, the likelihood is obtained by convolving the GEVr density f with a Gaussian distribution G, which is very costly and does not seem feasible in a reasonable amount of time. In that case the likelihood is given by:
\mathcal{L}(\mathbf{U}^{obs}|\Theta) = \prod_{i = 1}^{N}\int G(\mathbf{u}^{obs}_i|\mathbf{u}_i)\, f(\mathbf{u}_i|\Theta)\,\rm{d}\mathbf{u}_i
I have read that a Bayesian way to solve this problem is to avoid the integral altogether: treat each vector \mathbf{u}_i as a latent variable and sample the \mathbf{u}_i in addition to the GEVr free parameters. In that case the (unnormalized) joint posterior is just the product of the GEVr density, the Gaussian distribution, and the prior, i.e.
P(\Theta, \mathbf{u}_1, ..., \mathbf{u}_N|\mathbf{U}^{obs}) \propto \Pi(\Theta)\prod_{i = 1}^{N}G(\mathbf{u}^{obs}_i|\mathbf{u}_i)\, f(\mathbf{u}_i|\Theta)
Given that I have N = 60 measurements of size r = 5, the dimension of the problem is N \times r + 3 = 303. Can PyMC handle this dimensionality?
My other difficulty is the ordering constraint: when sampling \mathbf{u}, I must enforce u_1 > u_2 > ... > u_r. Is there an efficient way to do that with PyMC?
Thank you very much for your time.