Wrong result when using a strong potential

Hi, I want to estimate a model in which I have a strong prior on the value of an internal variable, in order to remove strong noise from the data.

As a first approach, I thought of specifying a potential that measures the distance between the average value of this deterministic parameter and my prior. However, this quadratic potential seems to cause problems and makes the estimate converge to absurd values. In practice, the result seems to depend on the value of POT_STREGHT: values that are too high cause the sampler to return a wrong result.
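To make the dependence on POT_STREGHT concrete, here is a minimal sketch of the penalty term on its own. It assumes the simplified case from the code in this post, where R reduces to 1 / b; the function name `quadratic_penalty` and the grid of b values are only illustrative.

```python
def quadratic_penalty(b, prior_R=10.0, strength=1.0):
    """Log-probability term added by the potential, assuming R = 1 / b."""
    R = 1.0 / b
    return -strength * (R - prior_R) ** 2


# Evaluate the penalty for a few values of b and a few strengths.
for strength in (1.0, 100.0, 10000.0):
    values = [quadratic_penalty(b, strength=strength) for b in (0.05, 0.1, 0.2)]
    print(strength, values)
```

The penalty scales linearly with the strength and grows without bound as b approaches 0, so for large strengths the log-probability surface around b = 0.1 becomes very steep. That steepness may be part of why the sampler struggles.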

Am I doing something wrong here?
How can I set the POT_STREGHT parameter correctly?
Any suggestions on how to reparametrize the model to make it converge better?

import numpy as np
import pandas as pd
import pymc3 as pm
import theano

x = np.random.randn(100)

y = 0.1 * x

#add strong noise which will confuse the estimator
y += np.random.randn(y.shape[0])*2

df = pd.DataFrame(np.vstack([x, y]).T, columns=[ 'b', 'y'])

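# R should be more or less this value
prior_R = 10.0

# Strength of the quadratic potential. The value used in the original runs is
# not given; 1.0 is only a placeholder so that the script is runnable.
POT_STREGHT = 1.0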

model = pm.Model()


def some_complex_nonlinear_function(x):
    # The example seems silly, as we could impose a prior directly on b if this is linear,
    # but I needed to simplify the example; in reality, this is a complex parametric function.
    return x

with model:

    b = pm.Normal('beta',0,1)

    #This is the definition for R
    R = theano.tensor.mean(1/some_complex_nonlinear_function(x)*x)/b
    #quadratic potential
    pm.Potential('error',-(R - prior_R)**2*POT_STREGHT)

    y = pm.Normal('y',mu=b*x,sd=1,observed=y)

    trace = pm.sample(1000, cores=2, tune=500)

# I would expect b to be very close to 0.1 despite the noise with such a high potential,
# but this does not happen.

If I read it correctly, your model has only one parameter, b. In that case, regularization is better applied by changing the prior on b.
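As a rough illustration of how a tighter prior on b pulls the estimate, here is a sketch that uses the closed-form posterior mean for a Normal prior on b and a Normal likelihood with known noise. It is not the PyMC3 model itself. The helper `posterior_mean_b`, the seed, and the prior settings (mean 0.1, i.e. 1 / prior_R in the simplified case, and sd 0.01) are assumptions made for the example.

```python
import random


def posterior_mean_b(x, y, prior_mean, prior_sd, noise_sd):
    """Posterior mean of b for y ~ Normal(b * x, noise_sd), b ~ Normal(prior_mean, prior_sd)."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = sum(xi * xi for xi in x) / noise_sd ** 2
    weighted = prior_mean * prior_prec + sum(xi * yi for xi, yi in zip(x, y)) / noise_sd ** 2
    return weighted / (prior_prec + data_prec)


# Synthetic data in the spirit of the example: true slope 0.1, strong noise.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(100)]
y = [0.1 * xi + random.gauss(0, 2) for xi in x]

weak = posterior_mean_b(x, y, prior_mean=0.0, prior_sd=1.0, noise_sd=2.0)
strong = posterior_mean_b(x, y, prior_mean=0.1, prior_sd=0.01, noise_sd=2.0)
print("weak prior:", weak)
print("strong prior:", strong)
```

In the PyMC3 model, the analogous change would be something like pm.Normal('beta', mu=0.1, sd=0.01) in place of pm.Normal('beta', 0, 1), again assuming that R = 1 / b holds.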
The model you wrote above should work in principle as well. Maybe you can try replacing the Potential with pm.Normal('error', mu=R, tau=POT_STREGHT, observed=prior_R), which is equivalent to an L2 regularization.
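The equivalence can be checked numerically without running the sampler. The sketch below writes out the Normal log-density by hand (parametrized by the precision tau) and compares it with the quadratic potential. The function names and the test values of R are illustrative.

```python
import math


def normal_logpdf(value, mu, tau):
    """Log-density of a Normal distribution parametrized by precision tau."""
    return 0.5 * math.log(tau / (2.0 * math.pi)) - 0.5 * tau * (value - mu) ** 2


def quadratic_potential(R, prior_R, strength):
    """The quadratic potential from the question: -(R - prior_R)**2 * strength."""
    return -strength * (R - prior_R) ** 2


prior_R, tau = 10.0, 4.0
# Difference between the two terms for several values of R.
diffs = [
    normal_logpdf(prior_R, mu=R, tau=tau) - quadratic_potential(R, prior_R, strength=tau / 2.0)
    for R in (2.0, 9.5, 10.0, 25.0)
]
print(diffs)
```

The differences are the same for every R, so the two formulations differ only by an additive constant, which does not affect sampling. Note the factor of two: a Normal with precision tau matches a potential of strength tau / 2, so tau=POT_STREGHT corresponds to half the original penalty.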