I think the reason that wasn’t working was because the value supplied for the shape argument is (3,1) while you supplied mu and sd values with shape of (3). Try the following:
mu = np.asarray([[0.], [10.], [0.]])
sd = np.asarray([[10.], [100.], [10.]])
with pm.Model() as model:
β = pm.Normal("β", mu=mu, sd=sd, shape=(3,1))
A bit late to the topic, but what is you experience with this approach in pymc? For me sampling slows down a lot compared to just defining separate priors for β1-3, and not using pm.math.dot… As I see it pm.math.dot slows sampling a lot compared multplying the predictors and the parameters individually… thanks
@jjarato could you share some details about your example? A couple of things could be happening
Objects don’t have appropriate shape. For example, X must be of shape nxp and beta must be of shape px1. Result must be of shape nx1.
Your X contains a lot of dummy (or one hot encoded) variables. Then you’re making lots of unnecessary multiplications that result in 0.
In my experience, both approaches have similar timings unless you’re under one of the two scenarios I mentioned above.
If you share an example I could understand better what’s going on.
thanks for the response!
it is true that I have dummy coded predictors, but I think something else is off as well…
#sampling about 7 seconds/chain:
with pm.Model() as Model1:
Intercept=pm.Normal('Intercept',3,1)
B1=pm.Normal('B1',0,.1)
B2=pm.Normal('B2',0,.1)
B3=pm.Normal('B3',0,.1)
Pred=Intercept+B1*xx[:,1]+B2*xx[:,2]+B3*xx[:,3]
sigma = pm.HalfNormal("sigma",1)
LL=pm.Normal('LL',mu=Pred,sigma=sigma,observed=yy)
trace=pm.sample(1000,cores=1)
# this one is really slow for some reason:
mu = np.asarray([[3],[0],[0],[0]])
sd = np.asarray([[1],[.1],[.1],[.1]])
with pm.Model() as Model2:
Means=pm.Normal('Means',mu=mu,sd=sd,shape=(4,1))
Pred=pm.math.dot(xx,Means)
sigma = pm.HalfNormal("sigma", 1)
LL=pm.Normal('LL',mu=Pred,sigma=sigma,observed=yy)
trace2=pm.sample(1000,cores=1)
This is sampling slowly because your shapes are all messed up. I recommend two things. First, always do a prior predictive check before you run pm.sample, so you can spot shape problems. In this case, LL is Draws x Obs x Obs, rather than Draws x Obs. Second, use coordinates instead of passing shapes. PyMC has a lot of very nice internal machinery that can handle quite hairy shapes, so just let it.
Here’s how I set up the coordinates for your Model2:
N = 1000
k = 4
var_names = ['x{i}' for i in range(k)]
xx = pd.DataFrame(np.random.normal(size=(N, k)), columns=var_names)
yy = np.random.normal(size=(N))
mu = np.array([3, 0, 0, 0])
sd = np.array([1, .1, .1, .1])
coords = {'covariates':var_names}
with pm.Model(coords=coords) as Model2:
Means=pm.Normal('Means',mu=mu,sd=sd, dims=['covariates'])
Pred=pm.math.dot(xx,Means)
sigma = pm.HalfNormal("sigma", 1)
LL=pm.Normal('LL',mu=Pred,sigma=sigma,observed=yy)
prior = pm.sample_prior_predictive(var_names=['Means', 'LL'])
Thanks to the prior check, we can now verify that the loc and scale parameters are set properly:
You could also add a coordinate for the index and give it to LL, this is nice if you have missing data to interpolate, for example. This model now samples very quickly.
In general, there is a first rule of PyMC problems I saw from a YouTube vid once: 99% of problems come from shapes. Always check them!
No, his code is fine. But you have to be careful when you declare singleton dimensions – (4,1) instead of (4,). If you check Pred.tag.test_value.shape in your original code, the shape is (N, 1), which is correct. But I suspect that your yy variable is not a column vector, so numpy broadcasts shape (N, ) yy and shape (N, 1) Pred together into (N, N). You could also change your original code to add an additional dimension to yy and get a correct result:
mu = np.asarray([[3],[0],[0],[0]])
sd = np.asarray([[1],[.1],[.1],[.1]])
with pm.Model() as Model2:
Means=pm.Normal('Means',mu=mu,sd=sd, shape=(4,1))
Pred=pm.math.dot(xx,Means)
sigma = pm.HalfNormal("sigma", 1)
LL=pm.Normal('LL',mu=Pred,sigma=sigma,
observed=yy[:, None] #None creates a new dimension, so yy is now (N, 1))
This code samples just fine.
But again, I recommend using coords and dims instead of shape because you get nicer arviz output, the dimensions are easier to reason about, and your code is more readable. As an added bonus, it prevents you from passing around these singleton dimensions when you don’t need them.