Key error when using MCMC

sindy · September 21, 2021, 10:27pm

Hi experts!

I’m new to pymc3 and I’m trying to use MCMC to estimate two distributions: s_t (a list of pm.Dirichlet objects) and s, which I then use to calculate another distribution.

However, I’m getting an error:

Traceback (most recent call last):
  File "/home/home4/.local/lib/python3.6/site-packages/xarray/core/dataset.py", line 1259, in _construct_dataarray
    variable = self._variables[name]
KeyError: 'w2'

for this line in the code block below: w = idata.posterior["w"+str(j1)].sel(chain=0).data[i].

I’m not sure why this error is happening since I did define a pm object called ‘w2’ in the loop on the line s_t.append(pm.Dirichlet('w'+str(j), a=w)).

This error doesn’t happen when I run the code in a Jupyter notebook, but when I run it with Python from the terminal I get the KeyError, so I’m not sure whether this is an issue with the environment rather than with the way I’m using the package.

Are there any suggestions on how to debug this, or what might be going wrong? Thanks in advance!

p_mat = np.ones((len(all_docs), K))

#calculate estimates of components of posterior
for i, doc in enumerate(all_docs):
    basic_model = pm.Model()
    with basic_model:
        s_t = [] #supertopic to topic dist
        for j in range(0,K,int(K/K_s)):
            w=0.001*np.ones(K)
            w[j]=10
            w[j+1]=10
            s_t.append(pm.Dirichlet('w'+str(j), a=w))

        s = pm.Dirichlet('s', a=(1/K_s)*np.ones(K_s)) #supertopic dist
        Y_obs=pm.Categorical("likelihood",p=(s.dot(s_t)).dot(topics), observed=doc)
        idata = pm.sample(3, tune=3,chains=1, return_inferencedata=True,target_accept=0.95)
    

    #use mcmc components to calculate posterior
    p = np.zeros(20)
    s2=idata.posterior["s"].sel(chain=0).data
    for i in range(len(s2)):
        s1 = idata.posterior["w0"].sel(chain=0).data[i]
        for j1 in range(2,K,int(K/K_s)):
            w = idata.posterior["w"+str(j1)].sel(chain=0).data[i] #error is occurring here
            s1 = np.vstack((s1,w))
        p+=np.array(s2[i].dot(s1))
    p_mat[i,:] = p/len(s2)

cluhmann · September 21, 2021, 10:49pm

You can call az.summary() and pass it your trace (idata in your code) to get summary statistics for all your parameters. You could also print idata.posterior.data_vars, or the full idata.posterior, to see which variables are actually in there. Hopefully that helps.
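
One possible cause, offered as a guess rather than a diagnosis: the model loop names its variables from range(0, K, int(K/K_s)), while the post-processing loop reads them back from range(2, K, int(K/K_s)). Those two sequences only line up when the step int(K/K_s) is exactly 2. If K or K_s has a different value in the terminal run than in the notebook session, 'w2' is never created. The sketch below uses plain Python, no PyMC3 required, to compare the two sets of names. The helper functions and the example values of K and K_s are hypothetical and not taken from the original post.

```python
def created_names(K, K_s):
    """Names registered by the model loop: range(0, K, int(K / K_s))."""
    step = int(K / K_s)
    return ["w" + str(j) for j in range(0, K, step)]


def looked_up_names(K, K_s):
    """Names read back after sampling: "w0" plus range(2, K, int(K / K_s))."""
    step = int(K / K_s)
    return ["w0"] + ["w" + str(j1) for j1 in range(2, K, step)]


def missing_names(K, K_s):
    """Names the post-processing loop asks for that the model never created."""
    created = set(created_names(K, K_s))
    return [name for name in looked_up_names(K, K_s) if name not in created]


# Hypothetical values: K = 20 matches np.zeros(20) in the post, K_s is a guess.
print(missing_names(20, 10))  # step is 2, so the two loops agree
print(missing_names(20, 5))   # step is 4, so the lookup loop asks for w2 first
```

If the second case is what happens in the terminal, reading the variables back with the same range(0, K, int(K/K_s)) used to create them (or looping over the names in idata.posterior.data_vars) would keep the two loops in sync whatever the step is.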
