Optimization failure when running model

byblian · January 31, 2019, 4:16pm

I am trying to implement the following model, which is intended to represent an election cycle. The Dirichlet models the expected support of the various parties on election day, and the Gaussian random walk models the changes during the cycle, counting back from election day.

But I am getting an optimization failure:

ERROR (theano.gof.opt): Optimization failure due to: graph_merge_softmax_with_crossentropy_softmax
ERROR (theano.gof.opt): node: SoftmaxWithBias(w, v1)
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "C:\Users\yitzhak.sapir\AppData\Local\Continuum\miniconda3\envs\pymc3-theano\lib\site-packages\theano\gof\opt.py", line 2034, in process_node
    replacements = lopt.transform(node)
  File "C:\Users\yitzhak.sapir\AppData\Local\Continuum\miniconda3\envs\pymc3-theano\lib\site-packages\theano\tensor\nnet\nnet.py", line 1937, in graph_merge_softmax_with_crossentropy_softmax
    if x_client[0].op == crossentropy_softmax_argmax_1hot_with_bias:
AttributeError: 'str' object has no attribute 'op'

The following is a minimal example that reproduces the issue, with comments explaining the purpose of each step:

import numpy as np
import pymc3 as pm
import theano.tensor as T

with pm.Model() as poll_model:
  # dirichlet model of support
  v  = pm.Dirichlet('v', np.ones(3), shape=3, testval=[0.2, 0.3, 0.7])
  # transform to log-ratio (relative to the last party)
  v1 = pm.Deterministic('v1', T.log(v[0:-1]/v[-1]))
  # model walk in log-ratio space
  lkj = pm.LKJCholeskyCov('lkj', eta=50, n=2, sd_dist=pm.HalfCauchy.dist(2.5))
  chol = pm.expand_packed_triangular(2, lkj)
  w_ = pm.MvGaussianRandomWalk('w_', chol=chol, shape=[4, 2])
  # shift the walk so that it starts at exactly 0
  w = pm.Deterministic('w', w_ - w_[0]) #.reshape((1, w_.shape[1])).repeat(4, axis=0))
  # recover supports including walk
  ea = pm.Deterministic('ea', T.exp(v1 + w))
  # normalize (intentionally leaving out the base used for log-ratio)
  mu = pm.Deterministic('mu', ea / ea.sum(axis=1).reshape((ea.shape[0],1)))
  # model observed polls
  x = pm.MvNormal('p', chol=chol, mu=mu, observed=[
                    [ 0.1, 0.9 ],
                    [ 0.2, 0.8 ],
                    [ 0.4, 0.6 ],
                    [ 0.8, 0.2 ]
                ])

  samples = pm.sample(10, n_init=8, njobs=1, init='advi')

I thought the subtraction w_ - w_[0] might be the cause, but I get the failure even without it. In any case, it is important to me that the Gaussian walk start at exactly 0.
My model also runs very slowly on my machine, but first I’d like to complete the model before tackling that.
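
On the subtraction w_ - w_[0] mentioned above, here is a minimal NumPy sketch of what it does. A plain cumulative sum stands in for MvGaussianRandomWalk, and the seed, shapes and step scale are made up for illustration. Subtracting the first row pins the first step at exactly 0 and leaves the increments between steps unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only

# Stand-in for the random walk: 4 time steps, 2 log-ratio dimensions.
steps = rng.normal(scale=0.1, size=(4, 2))
walk = np.cumsum(steps, axis=0)

# Same operation as `w_ - w_[0]`: the first row is broadcast over all rows.
centered = walk - walk[0]

print(centered[0])  # the first row of the shifted walk
# The step-to-step increments are the same before and after the shift.
print(np.allclose(np.diff(centered, axis=0), np.diff(walk, axis=0)))
```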

junpenglao · January 31, 2019, 9:11pm

Hmmm, there might be a Theano optimization error here. Separately, rescaling and centering a latent variable usually makes inference very difficult, because the model becomes unidentifiable.

Here the problem is

mu = pm.Deterministic('mu', ea / ea.sum(axis=1).reshape((ea.shape[0],1)))

I haven't found a way to fix it yet, though.

byblian · February 1, 2019, 2:19pm

Thank you for the response.

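I managed to resolve it by building the matrix of row sums explicitly with repeat and reshape, instead of reshaping the sums to a column and relying on broadcasting (num_days and num_parties are the two dimensions of ea, 4 and 2 in the example above):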

        ea_sum = T.repeat(ea.sum(axis=1), num_parties, axis=0).reshape([num_days,num_parties])
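
A quick NumPy check suggests that the two ways of building the denominator give the same numbers, so the workaround should only change the graph Theano sees, not the model. This is a sketch, not Theano code: it assumes np.repeat and reshape behave like their Theano counterparts here, that ea_sum is then used as mu = ea / ea_sum, and the input values are arbitrary.

```python
import numpy as np

num_days, num_parties = 4, 2  # assumed to match the shape of `ea`

# Arbitrary positive values standing in for exp(v1 + w).
ea = np.exp(np.arange(num_days * num_parties, dtype=float)
            .reshape(num_days, num_parties) / 10)

# Original formulation: a column of row sums, broadcast across the columns.
mu_broadcast = ea / ea.sum(axis=1).reshape((ea.shape[0], 1))

# Workaround: repeat each row sum num_parties times, then reshape to ea's shape.
ea_sum = np.repeat(ea.sum(axis=1), num_parties, axis=0).reshape([num_days, num_parties])
mu_repeat = ea / ea_sum

print(np.allclose(mu_broadcast, mu_repeat))  # compare the two results
print(mu_repeat.sum(axis=1))                 # row sums after normalizing
```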

I’m guessing the “str” in the error message may refer to the “x” in a dimshuffle argument.
As for the centering/rescaling, if that refers to the division by the sum, I was following the log-ratio transform suggested here (with code in Stan):
http://www.marcel-neunhoeffer.com/publication/pa_forecast-multiparty/
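
For readers unfamiliar with it, the log-ratio step corresponds to T.log(v[0:-1]/v[-1]) in the model above. Below is a minimal pure-Python sketch of that transform and its usual inverse. The function names alr and alr_inverse and the example shares are made up for illustration and are not taken from the linked Stan code. Note that this inverse includes the reference category, whereas the model above intentionally leaves it out of the normalization.

```python
import math

def alr(p):
    """Additive log-ratio: log of each share relative to the last one."""
    return [math.log(x / p[-1]) for x in p[:-1]]

def alr_inverse(y):
    """Map log-ratios back to shares; the reference share has log-ratio 0."""
    expd = [math.exp(v) for v in y] + [1.0]
    total = sum(expd)
    return [e / total for e in expd]

shares = [0.2, 0.3, 0.5]  # made-up support for three parties
y = alr(shares)           # two log-ratios, relative to the third party
recovered = alr_inverse(y)
print(y)
print(recovered)
```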

I’m now considering other ways to represent progress through the election cycle.
