Integrating a new distribution (wrapped C++ functions) - problems with broadcasting and shapes

Dear all,
I’m still working on making “Biased Urn” distributions (like the hypergeometric, but with biases toward certain colors) available for parameter estimation in PyMC3. Thanks to @lucianopaz I made some progress and can run PyMC3’s samplers to estimate parameters, but only for single observations, and not with any parameter shapes other than a single scalar.

Right now I’m stuck with shapes and broadcasting not working. I’m new to working with theano on that level and any help would be really appreciated. The situation is now as follows:

  • I have wrapped the relevant functions from a C++ library with xtensor-python bindings. Hence, I have logp, random and mode available as functions in Python. On the C++ side, xtensor takes care of vectorization, so they all exhibit broadcasting as expected.

  • To use them for a pymc3 distribution (a Discrete), I wrapped the functions mentioned above as theano ops using as_op. If everything is of unitary shape, everything works and I can sample.
    (NUTS reports that initialization fails, probably because the ops lack gradients? But sampling goes fine anyway.)

  • What is not working: when I set up the distribution with non-scalar shapes, or connect RVs of different shapes to it in a model, I get type errors from theano. That makes sense, because I specified the op’s itypes and otypes as scalars. But if I specify them as rows, vectors, or matrices instead, the scalar-shaped cases raise type errors, too.
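To make the broadcasting point concrete, here is a numpy sketch with a made-up element-wise logp (the real function is the wrapped C++ one, but xtensor gives it the same ufunc-like shape behavior):

```python
import numpy as np

# Stand-in for the wrapped C++ logp: any element-wise function.
# The real BiasedUrn logp broadcasts the same way thanks to xtensor.
def logp(x, odds):
    return -np.log1p(np.abs(x - odds))  # made-up math, just to show shapes

print(np.shape(logp(3, 1.5)))             # () - scalar in, scalar out
print(np.shape(logp(np.arange(4), 1.5)))  # (4,) - vector x, scalar odds
print(np.shape(logp(np.arange(4)[:, None],
                    np.array([0.5, 1.5, 2.5]))))  # (4, 3) - broadcast to a matrix
```

This is exactly the behavior that gets lost once the functions pass through as_op with fixed scalar itypes/otypes.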

Here are some links to the problematic code:

This is the new pymc3 distribution and the ops for the wrapped functions:

This (maybe less of interest) is the xtensor-python-based wrapper:
(The C++ implementation of the distribution is from

I’m not really sure how to go on, but here are a few questions:

  • Could an explicitly implemented theano op (not using as_op) help with the shape problem? Are there any examples?

  • I understand that C++ code can be associated directly with theano ops. Could this also be linked with an external library? Then I could maybe skip the xtensor-python wrapping …

  • While I find it very intuitive to implement pymc3 models with broadcasting and automatic propagation of shapes, it is still somewhat nebulous to me what is expected from a distribution’s functions in that regard. When I look at existing implementations, the magic appears to happen mostly automatically through theano or numpy behavior. Any hints or documentation on this would also be very helpful.
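To show what I mean by the magic happening automatically, here is a numpy sketch (pymc3’s real logp methods operate on theano tensors, but the broadcasting rules are the same; I’m using the geometric distribution only because its logp is a one-liner):

```python
import numpy as np

# logp of a geometric distribution (failures before the first success),
# written element-wise; numpy broadcasting handles all shape combinations.
def geometric_logp(value, p):
    return np.log(p) + value * np.log1p(-p)

value = np.array([0, 1, 2, 3])            # observations, shape (4,)
print(geometric_logp(value, 0.25).shape)  # (4,): scalar p broadcasts over values

p = np.array([[0.1], [0.5]])              # two parameter settings, shape (2, 1)
print(geometric_logp(value, p).shape)     # (2, 4): the size-1 axis expands
```

Nothing in the function mentions shapes at all, which seems to be the pattern the existing distributions rely on.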

Many thanks in advance


Yes and yes. You should check out the extending theano sections of the docs, and in particular the one on scalar ops.

You asked a very good question, and it deserves a longer answer. When I get some time, if you haven’t figured out how to write the custom op yourself by then, I’ll try to write up an example distribution.


Thanks a lot. That’s very kind.

I already made some progress (by modifying one of the C op examples). After adding all my external includes to c_headers (and setting the CPLUS_INCLUDE_PATH environment variable), I was able to call the desired functions from c_code.

Now the shape stuff gives me headaches again. I have a rough idea of how to work on the arrays inside the C code, but I’m still confused about what is expected of the op so that it will work with a PyMC3 distribution.

Here are a few (vague) questions (numbered for convenience):

  1. Should the op deal with broadcasting and shape handling at all? Or should the op just work on scalars, with some wrapper doing the vectorization?

  2. If I handle shapes in the C code, do I also have to take care of broadcasting? Or are size-1 dimensions expanded at some other stage?

  3. For my logp op, can the distribution parameters have arbitrary shapes? And can the “value” (whose logp is evaluated), and consequently the output, have either the same shape or one extra dimension when several values are requested at once?
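To be concrete about the size-1 expansion I mean in (2), in numpy terms (theano and xtensor follow the same rules):

```python
import numpy as np

a = np.ones((3, 1))  # size-1 second axis
b = np.ones((1, 4))  # size-1 first axis
print((a + b).shape)             # (3, 4): each size-1 axis is stretched to match
print(np.broadcast(a, b).shape)  # (3, 4): the shape numpy broadcasts to
```

My question is whether the op’s C code needs to reproduce this expansion itself, or whether it happens earlier in the graph.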

Many thanks