# Model doesn’t contain any free variables error in 2D - GMM

I am adapting and old two dimensional GMM to the new `PyMC5` syntax, and I found an strange sampling error similar to this. Without really understanding what the error is about, I try the solution proposed there (to eliminate the shape argument that was redundant) but the error remains. Here goes the code snippet,

data generation,

``````import pymc as pm
import numpy as np
import matplotlib.pyplot as plt
import arviz as az
from pytensor.tensor.nlinalg import matrix_inverse

## Generate some data for three groups.
# Means, variances, covariances, and proportions

mus = np.array([[-4,2],[0,1],[6,-2]])
variance1 = [1,.4,1.5]
variance2 = [1,.8,5 ]
covariances = [.5,0, -1]
ps = np.array([0.2, 0.5, 0.3])
D = mus[0].shape[0]

# Total amount of data
N = 1000

# Number of groups
K = 3

# Form covariance matrix for each group
sigmas = [np.array([[variance1[i],covariances[i]],[covariances[i],variance2[i]]]) for i in range(K)]
# Form group assignments
zs = np.array([np.random.multinomial(1, ps) for _ in range(N)]).T
xs = [z[:, np.newaxis] * np.random.multivariate_normal(m, s, size=N) for z, m, s in zip(zs, mus, sigmas)]

# Stack data into single array
data = np.sum(np.dstack(xs), axis=2)
``````

model construction,

``````testvals = [[-2,-2],[0,0],[2,2]]

# Model structure
with pm.Model() as mvgmm:
# Prior over component weights
p = pm.Dirichlet('p', a=np.array([1.]*K))

# Prior over component means
mus = [pm.MvNormal('mu_%d' % i, mu=pm.floatX(np.zeros(D)), tau=pm.floatX(0.1 * np.eye(D)), shape=(D,), initval=pm.floatX(testvals[i])) for i in range(K)]

# Cholesky decomposed LKJ prior over component covariance matrices

# chol, corr, stds = [pm.LKJCholeskyCov('packed_L_%d' % i, n=D, eta=2., sd_dist=pm.HalfCauchy.dist(1)) for i in range(K)]
L = [pm.LKJCholeskyCov('packed_L_%d' % i, n=D, eta=2., sd_dist=pm.HalfCauchy.dist(1))[0] for i in range(K)]

# Unpack packed_L into full array
# L = [pm.expand_packed_triangular(D, packed_L[i]) for i in range(K)]

# Convert L to sigma and tau for convenience
sigma = [pm.Deterministic('sigma_%d' % i ,L[i].dot(L[i].T)) for i in range(K)]
tau = [pm.Deterministic('tau_%d' % i,matrix_inverse(sigma[i])) for i in range(K)]

# Specify the likelihood
mvnl = [pm.MvNormal.dist(mu=mus[i],chol=L[i]) for i in range(K)]
Y_obs = pm.Mixture('Y_obs',w=p, comp_dists=mvnl, observed=data)
``````

And this is the sampling,

``````with pm.Model() as mvgmm:
trace = pm.sample(chains=8)
``````

And the error is

`SamplingError: Cannot sample from the model, since the model does not contain any free variables.`

You are overriding the original model here. You should change to `with mvgmm:`

Ahhh what an idiot. I was looking in the most implausible places.
Thank you.

By the way, which is the recommended way to write these models? because it is easy to confuse dimensions and components. I was reading how to better label coordinates and dimensions here, but is really different to a list of random variables?

1 Like

The main difference is that using dimensions/coordinates allows you to slice and dice the resulting inferenceData object using the features available in xarray. If you instead instantiate N unrelated parameters, it is much more difficult to select, filter, and otherwise work with the results. But if it’s easier for you to work with what you have, that’s ok!

1 Like

Thank you!