If the new and old values of the variable are far apart, the gradient is zero — OK. But even when the values are close enough together for the sampler to converge, the resulting mean still depends on how far apart they were. Why is there a correlation here? I am really not able to use the Theano shared variables the way I was hoping to.
With the second suggested solution, non-centered parametrization of the model, is there no way for me to introduce new shared variables for use in the sd? I was hoping to loop through a dataset of means and sds with the help of theano.shared.
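To make the question concrete, here is a minimal NumPy sketch of what I mean by the non-centered parametrization and the loop (the dataset of (mean, sd) pairs is made up for illustration). The idea is that instead of sampling y ~ Normal(mu, sd) directly, one samples a standardized offset z ~ Normal(0, 1) and computes y = mu + sd * z deterministically; in the real model mu and sd would live in theano.shared containers updated with set_value between runs:

```python
import numpy as np

# Non-centered parametrization: the sampler only ever sees the
# standardized variable z ~ Normal(0, 1); mu and sd enter through a
# deterministic transform y = mu + sd * z, so changing them (e.g. via
# a theano.shared set_value between runs) leaves the geometry of z alone.

rng = np.random.default_rng(0)

# hypothetical dataset of (mean, sd) pairs to loop over
dataset = [(0.0, 1.0), (5.0, 0.1), (-3.0, 2.0)]

for mu, sd in dataset:
    z = rng.standard_normal(10_000)  # same standardized draws regardless of (mu, sd)
    y = mu + sd * z                  # deterministic rescaling to the target scale
    print(f"mu={mu}, sd={sd}: sample mean={y.mean():.2f}, sample sd={y.std():.2f}")
```

This is only the transform itself, not a full PyMC model; the question is whether the sd in that transform can come from a shared variable the way the mean can.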