Theano shared variable error

BjornHartmann · April 22, 2020, 6:04pm

Hi
I have a problem that has been totally eating me up all day. I believe there is a simple solution to the question…

If I specify a theano shared variable for use in a pymc3 model so that I can loop without re-specifying the model, it seems the initial value of the shared variable affects the result. If the initial value is too far from the new value, I get a “The derivative of RV test1.ravel()[0] is zero.” error. Why is this??

See my code below. Changing the nf value to 10, or changing the new value to 100000, fixes the error…

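import matplotlib.pyplot as plt
import pymc3 as pm
from theano import shared

def test():
    nf = 10000000000000000000  # initial value of the shared variable
    sharednf = shared(nf)

    with pm.Model() as model:
        # note: 1**100 evaluates to 1, so sd is 1 here
        test = pm.Normal("test" + str(1), mu=sharednf, sd=1**100)
        sum = test + 1
        pm.Deterministic("sum" + str(1), sum)

    for i in range(10):
        # update the shared value without re-specifying the model
        sharednf.set_value(float(1))
        with model:
            trace = pm.sample(5000)
            trace1 = trace
            print(pm.summary(trace))
            pm.traceplot(trace)
            plt.show()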

OriolAbril · April 22, 2020, 7:28pm

It looks like this GitHub issue. I don’t know the reason behind it, but it looks like modifying the priors or using init="adapt_diag" have solved the issue in some similar situations.

Another option would be to stop using GitHub master and install version 3.8 instead, if you do not need any recently added feature.

BjornHartmann · April 22, 2020, 9:14pm

Thank you for getting back to me! Sadly, these modifications did not make a difference… I am already using pymc3 3.8. Any other suggestions?

junpenglao · April 23, 2020, 8:38am

PyMC3 assigns a default value to each random variable; for a Normal distribution it is the mean. So in this case the underlying default value is sharednf. When it is extremely large and the value passed to set_value is too far away from it, the gradient is essentially 0.
There are a few ways to fix it:
1. Provide a starting value to pm.sample:

pm.sample(..., start={"test"+str(1): float(1)})

2. Use a non-centered parameterization:

with pm.Model() as model:
    test = pm.Normal("test"+str(1), mu=0., sd=1**100)
    sum = test + 1 + sharednf
    pm.Deterministic("sum"+str(1), sum)

BjornHartmann · April 23, 2020, 12:00pm

Hi! Thank you so much for getting back to me! I see the root cause of the problem now. However, setting the start value to the new shared variable value is not doing the trick. Am I misunderstanding something?

junpenglao · April 23, 2020, 12:39pm

Maybe there is an error caused by caching? Try clearing out your theano cache…

BjornHartmann · April 23, 2020, 1:42pm

I tried that without success. This is what I did:
trace = pm.sample(5000, start={"start"+str(1): float(sharednf.get_value())})

BjornHartmann · April 23, 2020, 2:07pm

If the new and old values of the variable are far apart, the gradient is zero. OK. But if the values are close enough for the sampler to converge, the resulting mean still depends on how far apart they were. Why is there a correlation here? I am really not able to use the theano shared variables the way I was hoping to.

With the second suggested solution, the non-centered parameterization, do I have no way of introducing new shared variables for use in the sd? I was hoping to loop through a dataset of means and sds with the help of theano shared.

junpenglao · April 23, 2020, 4:50pm

start needs to be a dictionary-like object in which each str key corresponds to a random variable name, i.e. pm.sample(..., start={"test1": float(1)})
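To make the key mismatch concrete, here is a minimal sketch in plain Python. The helper unknown_start_keys is hypothetical (it is not part of PyMC3); it only compares the keys of a start dictionary against the variable names used in the model above.

```python
def unknown_start_keys(start, rv_names):
    """Return the keys of `start` that do not match any variable name.

    Hypothetical helper for illustration only; not a PyMC3 function.
    """
    return sorted(set(start) - set(rv_names))


# The model above names its variable "test" + str(1), i.e. "test1".
rv_names = ["test" + str(1)]

bad_start = {"start" + str(1): 1.0}   # key is "start1": no such variable
good_start = {"test" + str(1): 1.0}   # key is "test1": matches the variable

print(unknown_start_keys(bad_start, rv_names))   # ['start1']
print(unknown_start_keys(good_start, rv_names))  # []
```

A check like this before calling pm.sample would flag a key that does not correspond to any variable in the model.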

junpenglao · April 23, 2020, 4:54pm

Likely not enough burn-in/tuning; increasing the tuning should help.

All location-scale distributions can be reparameterized like rv_you_want = unit_rv*sd + mu. A bit more information can be found at: https://docs.scipy.org/doc/scipy/reference/tutorial/stats.html
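The identity rv_you_want = unit_rv*sd + mu can be checked numerically. The sketch below uses only the Python standard library rather than PyMC3; the function name shifted_draws and the values mu=5.0, sd=2.0 are assumptions chosen for illustration. In a model, mu and sd would be the quantities held in shared variables.

```python
import random
import statistics


def shifted_draws(mu, sd, n, seed=0):
    """Draw unit-normal samples, then scale by sd and shift by mu."""
    rng = random.Random(seed)
    unit = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [u * sd + mu for u in unit]


draws = shifted_draws(mu=5.0, sd=2.0, n=20000)

# The sample mean and standard deviation should be close to mu and sd.
print(statistics.mean(draws), statistics.stdev(draws))
```

Because the random part is always a unit normal, both the mean and the sd can change between iterations without changing what is sampled.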

BjornHartmann · April 23, 2020, 5:52pm

awesome! thank you so much for getting back to me!!

BjornHartmann · April 24, 2020, 7:01pm

I have a quick followup question. Can one use theano shared variables to change a distribution?

From what I have understood, theano shared can only be used with np arrays.
Is there a way to do something like:
distList = [pm.Normal, pm.Lognormal, …]
sharedVar = theano.shared(0)
with pm.Model() as model:
    distList[sharedVar]("name", mu=0, sd=1)

for i in range(10):
    sharedVar.set_value(i)
    with model:
        pm.sample()

This would allow the change of what distribution is used inside the “loop”

junpenglao · April 26, 2020, 5:33am

I don't think so: we cannot index a list with theano variables.
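A rough illustration of why this fails, using a stand-in class rather than Theano itself: a Python list can only be indexed by an integer-like object, and an object that merely wraps a value is not one. SymbolicValue below is hypothetical. The workaround shown, choosing with a plain int while building each model, is one possible approach and was not suggested in this thread.

```python
class SymbolicValue:
    """Hypothetical stand-in for a symbolic variable: it wraps a value but is not an int."""

    def __init__(self, value):
        self.value = value


dist_names = ["Normal", "Lognormal"]

# Indexing a list with a non-integer object raises TypeError.
try:
    dist_names[SymbolicValue(0)]
    indexed = True
except TypeError:
    indexed = False

print(indexed)  # False

# Possible workaround: pick the entry with an ordinary int, one choice per loop
# iteration, and build a separate model for each choice.
chosen = [dist_names[i] for i in range(len(dist_names))]
print(chosen)  # ['Normal', 'Lognormal']
```

The cost of this approach is that the model is rebuilt for each distribution instead of being reused across iterations.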

BjornHartmann · April 26, 2020, 3:50pm

OK, thank you. I have now implemented the non-centered parameterization as suggested. I am just curious: why is it that doing
…
pm.Normal("test", mu=1e-10, sd=1e-20)
…
gives a “derivative is zero” error during sampling, while shifting the distribution as below works? As far as I can tell, the resulting distributions are equal.

....
test = pm.Normal("test", mu=0, sd=1)
shift = test*1e-20 + 1e-10
shifted = pm.Deterministic("shifted", shift)
....

AlexAndorra · April 27, 2020, 8:12am

Yeah, the reparametrization trick feels like magic
Basically, it’s because it presents HMC with a much better geometry to sample from. Here is a blog post where it’s explained in detail.
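One way to see the difference in geometry is to compare the gradient of the Normal log-density in the two parameterizations. This is a plain-Python sketch, not a reproduction of PyMC3's internal check; normal_dlogp and the offset of 1.0 are assumptions for illustration.

```python
def normal_dlogp(x, mu, sd):
    """Derivative of the Normal log-density with respect to x: -(x - mu) / sd**2."""
    return -(x - mu) / sd**2


offset = 1.0  # illustrative distance of a starting point from the mean

# Direct parameterization: the sampler works on the scale of sd=1e-20.
grad_direct = normal_dlogp(1e-10 + offset, mu=1e-10, sd=1e-20)

# Shifted parameterization: the sampler only ever sees a unit normal.
grad_unit = normal_dlogp(0.0 + offset, mu=0.0, sd=1.0)

print(grad_direct, grad_unit)
```

The same displacement gives a gradient of magnitude around 1e40 in the direct form and exactly 1 in the unit form, so the sampler works with well-scaled numbers only in the shifted version.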
