I’m using PyMC’s Metropolis to sample from a 200-dimensional probability distribution and I want it to converge faster by setting the scale of the proposal distribution “just right”. Here’s what I did:

- After lots of trial and error, I got an estimate of the variance of the target distribution.
- I used it to derive a proposal distribution with balanced scales among dimensions (so the proposal distribution mimics the proportions of the target distribution).
- I calculated the radius of the typical set of a 200-dimensional normal pdf (it turns out to be ~14 sigmas).
- I used that radius to normalize the scale of the proposal distribution, to compensate for the curse of dimensionality pushing my drawn samples too far out, where the probability is almost zero.
- After some more trial and error, I figured that the proposal distribution’s scale should be further reduced by a factor of 2-5 to get a rejection rate of about 40-50%.
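
To make the steps above concrete, here is a rough NumPy sketch of how I think about the scaling. It checks that the radius of a 200-dimensional standard normal concentrates near sqrt(200) ≈ 14 and then builds per-dimension proposal scales. The `target_sd` values and `extra_factor` are placeholders I made up for illustration, not my real estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 200  # dimensionality of the target

# Norms of draws from a d-dimensional standard normal concentrate near sqrt(d).
draws = rng.standard_normal((10_000, d))
radii = np.linalg.norm(draws, axis=1)
typical_radius = radii.mean()
print(f"sqrt(d) = {np.sqrt(d):.2f}, empirical mean radius = {typical_radius:.2f}")

# Placeholder per-dimension standard deviations of the target (assumed values).
target_sd = rng.uniform(0.5, 2.0, size=d)

# Proposal scales: same proportions as the target, shrunk by the typical
# radius and by an extra ad hoc factor (I ended up somewhere between 2 and 5).
extra_factor = 3.0
proposal_sd = target_sd / (typical_radius * extra_factor)
```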

Here’s a plot of the traces of 7 independent chains (projected onto the first two principal components of the whole set of samples):

The plot clearly shows that convergence was not achieved, as the 7 chains have not mixed. It also seems that the sampler is exploring the space, just not very fast.
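
For anyone who wants to reproduce this kind of plot, the projection can be sketched roughly as follows. This is a minimal NumPy version, assuming the traces are stacked in an array of shape `(n_chains, n_samples, n_dims)`; the function name and the fake random-walk chains are only for illustration.

```python
import numpy as np

def project_onto_first_two_pcs(chains):
    """Project chains of shape (n_chains, n_samples, n_dims) onto the first
    two principal components of the pooled samples."""
    n_dims = chains.shape[-1]
    pooled = chains.reshape(-1, n_dims)
    mean = pooled.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(pooled - mean, full_matrices=False)
    components = vt[:2]
    return (chains - mean) @ components.T  # shape (n_chains, n_samples, 2)

# Fake data standing in for real traces: 7 slowly drifting chains in 200 dims.
rng = np.random.default_rng(0)
fake_chains = rng.standard_normal((7, 500, 200)).cumsum(axis=1)
projected = project_onto_first_two_pcs(fake_chains)
```

Each `projected[i]` can then be drawn as one line in a 2D plot, one colour per chain.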

Report:

```
number of samples = 20000
number of chains = 7
100%|██████████| 20000/20000 [55:56<00:00, 5.96it/s]
The gelman-rubin statistic is larger than 1.4 for some parameters.
The sampler did not converge.
The estimated number of effective samples is smaller than 200 for some parameters.
44% rejections.
```
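
As I understand it, the Gelman-Rubin statistic in that report compares the variance between chains with the variance within chains, so values well above 1 mean the chains disagree. Below is a minimal sketch of the classic formula for a single parameter; the exact variant PyMC computes may differ, and the toy chains are made up.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic Gelman-Rubin statistic for one parameter.

    chains: array of shape (n_chains, n_samples).
    """
    _, n = chains.shape
    chain_means = chains.mean(axis=1)
    within = chains.var(axis=1, ddof=1).mean()   # W: mean within-chain variance
    between = n * chain_means.var(ddof=1)        # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(0)
mixed = rng.standard_normal((7, 2000))       # chains sampling the same region
unmixed = mixed + np.arange(7)[:, None]      # chains stuck in different regions
rhat_mixed = gelman_rubin(mixed)
rhat_unmixed = gelman_rubin(unmixed)
print(rhat_mixed, rhat_unmixed)
```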

I don’t know if 44% rejections is too high or too low. I just have the feeling that maybe there is a way to make this converge faster.
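
To show how I am thinking about the link between proposal scale and rejection rate, here is a toy random-walk Metropolis in plain NumPy. It is not PyMC's implementation, and the target is a 200-dimensional standard normal chosen only for illustration; the function names and scales are my own assumptions.

```python
import numpy as np

def rw_metropolis_rejection_rate(log_prob, x0, proposal_sd, n_steps, rng):
    """Run random-walk Metropolis and return the fraction of rejected proposals."""
    x = np.array(x0, dtype=float)
    lp = log_prob(x)
    rejected = 0
    for _ in range(n_steps):
        proposal = x + proposal_sd * rng.standard_normal(x.shape)
        lp_new = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < lp_new - lp:
            x, lp = proposal, lp_new
        else:
            rejected += 1
    return rejected / n_steps

def std_normal_log_prob(x):
    # Unnormalized log density of a standard normal (toy target).
    return -0.5 * np.dot(x, x)

rng = np.random.default_rng(0)
d = 200
x0 = rng.standard_normal(d)  # start inside the typical set
rates = {}
for scale in (0.05, 0.1, 0.2, 0.4):
    rates[scale] = rw_metropolis_rejection_rate(
        std_normal_log_prob, x0, scale, 2000, rng
    )
    print(f"proposal sd {scale}: {rates[scale]:.0%} rejections")
```

On this toy target the rejection rate rises as the proposal scale grows, which is the trade-off I am trying to tune.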

Also, adaptive methods give weird results, so I fixed the proposal distribution and its scale.

**Here are my questions:**

**Is there a recommended rejection rate?**

What do you guys think about the sample traces? Are they acceptable (qualitatively speaking)?

PS: I ran Metropolis again with exactly the same parameters, except that I removed the ad hoc scaling factors of item 5. I get these traces:

Excuse that the axis scales can’t be compared between the two images. And this report:

```
starting metropolis
number of samples = 20000
number of chains = 7
100%|██████████| 20000/20000 [54:42<00:00, 6.09it/s]
The gelman-rubin statistic is larger than 1.4 for some parameters. The sampler did not converge.
The estimated number of effective samples is smaller than 200 for some parameters.
75% rejections.
```

So now I get a much higher rejection rate, and it seems to me that the solution space was explored more poorly than in the results above.