ValueError: Mass matrix contains zeros on the diagonal.
The derivative of RV TV_SYFY_log__.ravel()[0] is zero.
The derivative of RV TV_SMITHSONIANNETWORK_log__.ravel()[0] is zero.
The derivative of RV TV_LIFETIME_log__.ravel()[0] is zero.
The derivative of RV TV_IFC_log__.ravel()[0] is zero.
The derivative of RV TV_BETHER_log__.ravel()[0] is zero.
The derivative of RV TV_ANE_log__.ravel()[0] is zero.

Do I need to change the constraints on my priors? If so, how? Thank you.

Often, you are going to want to use a HalfStudentT (or a HalfNormal or HalfCauchy) when you want to constrain a distribution away from a boundary. This boundary is often zero. So all of these “half” distributions are specified such that their “full” counterparts would have a location (e.g., mean) of zero.

So I assume you are dealing with one of two situations. First, maybe you are using a Student’s t distribution to allow for observations far from the central “pile” of observations and you also want to constrain the likelihood to be positive (e.g., if negative observations are impossible). Second, maybe you are confident that your data is distributed according to a half-Student’s t, but aren’t sure where the boundary is. I assume the former, because the latter seems implausible to me (correct me if I am mistaken).

If it is indeed the former, then I would suggest that a HalfStudentT isn’t what you are looking for. Maybe something more like Gamma? Or even exGaussian? Those both get you positive support and some control over the location. There are others as well.
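To make the contrast concrete, here is a rough sketch using scipy.stats rather than PyMC3. The values mu=5, sigma=2, and df=3 are made up for illustration and do not come from your model. It shows that a half-Student's t always peaks at the boundary, whereas a Gamma specified by its mean and SD can be centered away from zero while staying strictly positive. (PyMC3's Gamma accepts mu and sigma directly, so the manual conversion below is only needed for scipy.)

```python
import numpy as np
from scipy import stats

# Illustrative values only (not taken from the thread).
mu, sigma = 5.0, 2.0

# Convert mean/SD to the shape/rate parameterization of a Gamma.
alpha = mu**2 / sigma**2
beta = mu / sigma**2
gamma_dist = stats.gamma(a=alpha, scale=1.0 / beta)

x = np.linspace(0.0, 15.0, 1501)

# Half-Student's t density: a zero-centered t folded onto x >= 0.
half_t_pdf = 2.0 * stats.t.pdf(x, df=3, scale=sigma)
gamma_pdf = gamma_dist.pdf(x)

# Locate the peak of each density on the grid.
half_t_mode = x[np.argmax(half_t_pdf)]
gamma_mode = x[np.argmax(gamma_pdf)]

print("half-t mode:", half_t_mode)
print("gamma mean, sd:", gamma_dist.mean(), gamma_dist.std())
print("gamma mode:", gamma_mode)
```

The half-t mode sits at the boundary no matter what scale you pick, which is why it is a poor fit when you want control over where the bulk of the distribution lies.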

Thank you cluhmann!
I updated my question just now while you were replying.
So I’m using the GLM module and specifying HalfStudentT.
Now it gives me this:

ValueError: Mass matrix contains zeros on the diagonal.
The derivative of RV TV_SYFY_log__.ravel()[0] is zero.
The derivative of RV TV_SMITHSONIANNETWORK_log__.ravel()[0] is zero.
The derivative of RV TV_BETHER_log__.ravel()[0] is zero.

Yes. I am using HalfStudentT, but the derivatives of the feature RVs are reported as zero:

The derivative of RV TV_WETV_log__.ravel()[0] is zero.
The derivative of RV TV_TRUTV_log__.ravel()[0] is zero.
The derivative of RV TV_SYFY_log__.ravel()[0] is zero.
The derivative of RV TV_SMITHSONIANNETWORK_log__.ravel()[0] is zero.
The derivative of RV TV_POP_log__.ravel()[0] is zero.
The derivative of RV TV_OPRAHWINFREYNETWORK_log__.ravel()[0] is zero.
The derivative of RV TV_LIFETIME_log__.ravel()[0] is zero.
The derivative of RV TV_IFC_log__.ravel()[0] is zero.
The derivative of RV TV_GAMESHOWNETWORK_log__.ravel()[0] is zero.
The derivative of RV TV_COMEDYCENTRAL_log__.ravel()[0] is zero.
The derivative of RV TV_BRAVO_log__.ravel()[0] is zero.
The derivative of RV TV_BETHER_log__.ravel()[0] is zero.
The derivative of RV TV_BBCAMERICA_log__.ravel()[0] is zero.
The derivative of RV TV_AMC_log__.ravel()[0] is zero.
The derivative of RV TV_ANE_log__.ravel()[0] is zero.
The derivative of RV Intercept.ravel()[0] is zero

Ok, then I am confused. Now you expect that all of your predictors/features will have positive marginal associations? And your outcome variable can take on any value, including negative values?

I would sort out what you are trying to accomplish first and only then start building your model (and do it bit by bit). Dumping everything you have into a model that hasn’t been well-designed isn’t going to get you very far.

The basic problem is that the Gamma distribution only has support on $x \in (0, \infty)$, so your likelihood isn’t going to be able to accommodate observed values of zero. Your intercept also has a substantial prior probability of being negative (mean=1, but SD=10, so roughly 46% of the prior mass lies below zero), which could (depending on the values of your predictors and coefficients) yield negative (i.e., invalid) means for your likelihood.
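Both points can be checked numerically. This is a sketch with scipy.stats, not PyMC3. The Gamma shape of 2 and the linear-predictor values are hypothetical, and the log link at the end is one common remedy that was not proposed earlier in this thread.

```python
import numpy as np
from scipy import stats

# Prior from the post: intercept ~ Normal(mean=1, sd=10).
p_negative = stats.norm.cdf(0.0, loc=1.0, scale=10.0)
print(f"P(intercept < 0) = {p_negative:.3f}")

# Log-density of a Gamma (hypothetical shape=2) at an observation of exactly 0.
# May emit a divide-by-zero RuntimeWarning; the result is -inf.
logp_at_zero = stats.gamma.logpdf(0.0, a=2.0, scale=1.0)
print("Gamma log-density at 0:", logp_at_zero)

# Hypothetical linear-predictor values, including a negative one.
eta = np.array([-3.0, 0.0, 2.5])
mu_identity = eta           # identity link: the mean can go negative
mu_log_link = np.exp(eta)   # log link: the mean is always positive
print("identity-link means:", mu_identity)
print("log-link means:", mu_log_link)
```

A log-density of -inf for even one observation makes the whole model log-probability -inf, and a negative mean makes the likelihood undefined, so either problem can stall the sampler before it starts.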

As I stated here, outcomes that are counts are commonly modeled as Poisson, zero-inflated Poisson (ZIP), or negative binomial.
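A quick way to get a feel for which of these might suit your counts is to compare the variance with the mean and to compare the share of zeros with what a Poisson would predict. The sketch below runs on simulated data (a hypothetical stand-in; substitute your own outcome column) and is only a rough heuristic, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for a count outcome (hypothetical data).
y = rng.negative_binomial(n=2, p=0.4, size=5000)

mean = y.mean()
var = y.var(ddof=1)

# Roughly 1 for Poisson-like data; clearly above 1 suggests overdispersion,
# which a negative binomial can absorb.
dispersion = var / mean

# Observed share of zeros vs. the share a Poisson with the same mean implies.
# A large excess of zeros points toward a zero-inflated model.
observed_zero_frac = np.mean(y == 0)
poisson_zero_frac = np.exp(-mean)

print(f"mean={mean:.2f}, variance={var:.2f}, dispersion={dispersion:.2f}")
print(f"zeros: observed={observed_zero_frac:.3f}, Poisson={poisson_zero_frac:.3f}")
```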

Then I looked at the r2 score:
az.r2_score(data.TOTAL_CONV.values, ppc_poisson['conversion'])

r2 0.101319
r2_std 0.023821
dtype: float64

My r2 is very low. Is that normal? What does it mean? Why did this happen when the PPC mean aligns with the observed values, as I can see from the graph?
Please advise if there are any other metrics for model evaluation. Thank you.

The topic of model checking can be a bit involved because it is often very dependent on your goals (which I am still a bit confused about here). I would check the Diagnostics and Model Criticism notebooks available here. You can also check out several talks from last year’s PyMCon, like this one and this one. For a deeper dive, you can check out Aki’s keynote.
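On the low r2 specifically: a predictive mean that matches the data and a low r2 are not contradictory when the outcome is a noisy count. The simulation below is a sketch under made-up assumptions (rates drawn uniformly between 0.5 and 3) and uses the classical R² rather than the Bayesian version that az.r2_score computes. It scores the true Poisson rates as predictions, which is the best any model could do.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" Poisson rate for each observation.
true_rate = rng.uniform(0.5, 3.0, size=10_000)
y = rng.poisson(true_rate)

# Classical R^2 when the predictions are the true rates themselves.
ss_res = np.sum((y - true_rate) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_true_model = 1.0 - ss_res / ss_tot

print(f"mean of y = {y.mean():.3f}, mean of predictions = {true_rate.mean():.3f}")
print(f"R^2 of the true model = {r2_true_model:.3f}")
```

Even with the exactly correct rates, most of the variance in y is Poisson noise that no predictor can explain, so R² stays low while the means line up. A low value on its own therefore does not tell you the model is wrong.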

My target variable ranges from 0 to 20, but the predictions from the posterior predictive check range from 0 to 60, as shown in the graph below:

I found the following in the pymc3 documentation:

Bounds cannot be given to variables that are observed. To model truncated data, use a Potential() in combination with a cumulative probability function. See this example notebook.

The link to the example doesn’t work. How can I use Potential() to set a limit on my likelihood or predictions?
Thank you.
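
For what it's worth, the idea behind the documentation's advice can be sketched outside PyMC3. Truncating a Poisson at an upper bound means renormalizing by the probability of falling at or below that bound, i.e. subtracting the log-CDF at the bound from each observation's log-probability. That subtracted term is the kind of quantity a Potential() would add to the model's log-probability. The code below uses scipy.stats only; the rate mu=6 and the helper name truncated_poisson_logpmf are hypothetical, and the PyMC3 wiring itself is not shown.

```python
import numpy as np
from scipy import stats

upper = 20   # largest possible outcome, as stated in the post
mu = 6.0     # hypothetical Poisson rate


def truncated_poisson_logpmf(y, mu, upper):
    """Log-pmf of a Poisson(mu) conditioned on y <= upper (illustrative helper)."""
    y = np.asarray(y)
    # Renormalize: log p(y) - log P(Y <= upper).
    logp = stats.poisson.logpmf(y, mu) - stats.poisson.logcdf(upper, mu)
    # Values above the bound are impossible under truncation.
    return np.where(y <= upper, logp, -np.inf)


# The truncated pmf should sum to 1 over its support 0..upper.
support = np.arange(0, upper + 1)
total = np.exp(truncated_poisson_logpmf(support, mu, upper)).sum()
print("sum over support:", total)
print("log-pmf at 21:", truncated_poisson_logpmf(21, mu, upper))
```

For n observations sharing the same rate, the total correction is -n times the log-CDF at the bound; with observation-specific rates it is the sum of the per-observation terms.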