# Simple linear model fails

I’m trying to run a simple linear model on the famous “diamonds” dataset, but I’m having trouble with HMC sampling. What am I doing wrong?

``````python
import pymc3
import numpy as np
import pandas as pd

)
diamonds = diamonds.sample(n=1_000_000, replace=True)

print(diamonds)

#        carat        cut color clarity  depth  table  price     x     y     z
# 38867   0.40    Premium     F     VS1   61.4   58.0   1050  4.75  4.73  2.91
# 18067   2.01       Fair     F      I1   58.7   66.0   7294  8.30  8.19  4.84
# 1507    0.71  Very Good     F     VS2   59.6   56.0   2994  5.84  5.88  3.49
# 6618    0.90    Premium     H     VS2   60.7   58.0   4082  6.21  6.17  3.76
# 39269   0.38    Premium     G     VS1   61.9   58.0   1069  4.66  4.62  2.87
# ...      ...        ...   ...     ...    ...    ...    ...   ...   ...   ...
# 5749    0.98  Very Good     E     SI2   61.1   60.0   3895  6.31  6.36  3.87
# 20508   1.70    Premium     I     VS1   61.5   58.0   8840  7.74  7.64  4.73
# 52531   0.72      Ideal     I    VVS2   61.7   55.0   2530  5.71  5.76  3.54
# 12146   1.00       Good     E     SI1   63.7   60.0   5174  6.29  6.24  3.99
# 23398   0.36      Ideal     E     SI1   62.0   57.0    631  4.53  4.57  2.82

model = pymc3.glm.GLM.from_formula(
"price ~ C(cut) + C(color) + C(clarity) + carat + depth + table + x + z",
data=diamonds,
)
fit = pymc3.sample(model=model, tune=20005)

# ValueError: Mass matrix contains zeros on the diagonal.
# The derivative of RV `z`.ravel() is zero.
# The derivative of RV `x`.ravel() is zero.
# The derivative of RV `table`.ravel() is zero.
# The derivative of RV `depth`.ravel() is zero.
# The derivative of RV `carat`.ravel() is zero.
# The derivative of RV `C(clarity)[T.VVS2]`.ravel() is zero.
# The derivative of RV `C(clarity)[T.VVS1]`.ravel() is zero.
# The derivative of RV `C(clarity)[T.VS2]`.ravel() is zero.
# The derivative of RV `C(clarity)[T.VS1]`.ravel() is zero.
# The derivative of RV `C(clarity)[T.SI2]`.ravel() is zero.
# The derivative of RV `C(clarity)[T.SI1]`.ravel() is zero.
# The derivative of RV `C(clarity)[T.IF]`.ravel() is zero.
# The derivative of RV `C(color)[T.J]`.ravel() is zero.
# The derivative of RV `C(color)[T.I]`.ravel() is zero.
# The derivative of RV `C(color)[T.H]`.ravel() is zero.
# The derivative of RV `C(color)[T.G]`.ravel() is zero.
# The derivative of RV `C(color)[T.F]`.ravel() is zero.
# The derivative of RV `C(color)[T.E]`.ravel() is zero.
# The derivative of RV `C(cut)[T.Very Good]`.ravel() is zero.
# The derivative of RV `C(cut)[T.Premium]`.ravel() is zero.
# The derivative of RV `C(cut)[T.Ideal]`.ravel() is zero.
# The derivative of RV `C(cut)[T.Good]`.ravel() is zero.
# The derivative of RV `Intercept`.ravel() is zero.
``````

Hi Mina,
Is your venv running PyMC master? If so, your issue is probably related to this one, and you should find your answer there.
PS: I think the GLM module uses flat priors, which can also be a cause of the error you’re experiencing.
Hope this helps

Changing the prior fixes the issue. Thanks.
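
For anyone landing here later, this is a minimal sketch of what changing the prior might look like. It is not necessarily the exact change made above. It assumes `GLM.from_formula` accepts a `priors` dictionary in which the key `"Intercept"` sets the intercept prior and `"Regressor"` sets the fallback prior for all other coefficients, which is how I remember PyMC3's GLM module working. The helper name `fit_with_normal_priors` and the prior scales are hypothetical and chosen only for illustration.

```python
# Illustrative prior scales on the price scale; these numbers are assumptions.
PRIOR_SCALES = {"Intercept": 10_000.0, "Regressor": 10_000.0}


def fit_with_normal_priors(formula, data, tune=2000):
    """Fit the GLM with explicit Normal priors instead of the defaults."""
    # Imported here so the snippet can be loaded without PyMC3 installed.
    import pymc3 as pm

    # "Intercept" sets the intercept prior; "Regressor" is assumed to be the
    # fallback prior for every coefficient that is not named explicitly.
    priors = {
        name: pm.Normal.dist(mu=0.0, sigma=scale)
        for name, scale in PRIOR_SCALES.items()
    }
    with pm.Model() as model:
        pm.glm.GLM.from_formula(formula, data=data, priors=priors)
        trace = pm.sample(tune=tune)
    return model, trace
```

It would be called as `model, fit = fit_with_normal_priors("price ~ C(cut) + C(color) + C(clarity) + carat + depth + table + x + z", diamonds)`. On older PyMC3 releases the keyword is `sd=` rather than `sigma=`.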
