Key error when using MCMC

Hi experts!

I’m new to PyMC3 and I’m trying to use MCMC to estimate two distributions: s_t, which is a list of pm.Dirichlet variables, and s. I then use both to calculate another distribution.

However, I’m getting an error:

Traceback (most recent call last):
  File "/home/home4/.local/lib/python3.6/site-packages/xarray/core/dataset.py", line 1259, in _construct_dataarray
    variable = self._variables[name]
KeyError: 'w2'

for this line in the code block below: w = idata.posterior["w"+str(j1)].sel(chain=0).data[i].

I’m not sure why this error is happening since I did define a pm object called ‘w2’ in the loop on the line s_t.append(pm.Dirichlet('w'+str(j), a=w)).

This error doesn’t happen when I run the code in a Jupyter notebook, but when I run it with Python from the terminal I get the KeyError, so I’m not sure whether this is an issue with the environment rather than with the way I’m using the package.

Are there any suggestions on how to debug this, or what might be going wrong? Thanks in advance!

p_mat = np.ones((len(all_docs), K))

#calculate estimates of components of posterior
for i, doc in enumerate(all_docs):
    basic_model = pm.Model()
    with basic_model:
        s_t = [] #supertopic to topic dist
        for j in range(0,K,int(K/K_s)):
            w=0.001*np.ones(K)
            w[j]=10
            w[j+1]=10
            s_t.append(pm.Dirichlet('w'+str(j), a=w))

        s = pm.Dirichlet('s', a=(1/K_s)*np.ones(K_s)) #supertopic dist
        Y_obs=pm.Categorical("likelihood",p=(s.dot(s_t)).dot(topics), observed=doc)
        idata = pm.sample(3, tune=3,chains=1, return_inferencedata=True,target_accept=0.95)
    

    #use mcmc components to calculate posterior
    p = np.zeros(20)
    s2=idata.posterior["s"].sel(chain=0).data
    for i in range(len(s2)):
        s1 = idata.posterior["w0"].sel(chain=0).data[i]
        for j1 in range(2,K,int(K/K_s)):
            w = idata.posterior["w"+str(j1)].sel(chain=0).data[i] #error is occurring here
            s1 = np.vstack((s1,w))
        p+=np.array(s2[i].dot(s1))
    p_mat[i,:] = p/len(s2)

You can call az.summary() and pass it your trace (idata in your code) to get summary statistics for all your parameters. You could also just print trace['posterior'].data_vars, or the full trace['posterior'], to see which variables are actually in there. Hopefully that helps.
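
One thing that may be worth checking alongside the variable names in the posterior: the model loop names its variables with range(0, K, int(K/K_s)), while the read-back loop starts at 2 with the same step. Those two only line up when int(K/K_s) is 2. Below is a rough sketch in plain Python (no sampling) that lists which names each loop would use. The helper functions and the K, K_s values are made up for illustration, since the post does not say what K and K_s are in either environment.

```python
def created_names(K, K_s):
    """Names registered by the model loop: range(0, K, int(K / K_s))."""
    step = int(K / K_s)
    return ["w" + str(j) for j in range(0, K, step)]


def requested_names(K, K_s):
    """Names read back after sampling: "w0" plus range(2, K, int(K / K_s))."""
    step = int(K / K_s)
    return ["w0"] + ["w" + str(j1) for j1 in range(2, K, step)]


def missing_names(K, K_s):
    """Requested names that the model loop never created."""
    created = set(created_names(K, K_s))
    return [name for name in requested_names(K, K_s) if name not in created]


# Illustrative values only; the original post does not state K or K_s.
for K, K_s in [(20, 10), (20, 5)]:
    print("K =", K, "K_s =", K_s)
    print("  created:", created_names(K, K_s))
    print("  missing:", missing_names(K, K_s))
```

If K or K_s differs between the notebook session and the script, a step other than 2 would mean 'w2' never gets created, which would match the KeyError. Reading the names back with the same range(0, K, int(K/K_s)) used to create them would avoid the mismatch, but that is only a guess until the printed data_vars confirm it.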
