Dear all,
I’m still working on making “Biased Urn” distributions (like the hypergeometric, but with biases towards certain colors) available for parameter estimation in PyMC3. Thanks to @lucianopaz I have made some progress and can run PyMC3’s samplers to estimate parameters, but only for single observations and only with scalar parameters.
Right now I’m stuck because shapes and broadcasting are not working. I’m new to working with theano at that level, and any help would be really appreciated. The situation is as follows:
- I have wrapped the relevant functions from a C++ library using xtensor and its Python bindings, so logp, random, and mode are available as Python functions. On the C++ side, xtensor takes care of vectorization, so they all broadcast as expected.
- To use them in a PyMC3 distribution (a Discrete), I wrapped the functions mentioned above as theano ops using as_op. As long as everything has scalar shape, it all works and I can sample. (NUTS reports that initialization fails, probably because the ops lack gradients, but sampling proceeds fine anyway.)
- What is not working: when I set up the distribution with non-scalar shapes, or connect RVs of different shapes to it in a model, I get type errors from theano. That makes sense, because I specified the op’s itypes and otypes as scalars; but if I specify them as rows, vectors, or matrices instead, the scalar-shaped cases raise type errors in turn.
Here are some links to the problematic code:
This is the new pymc3 distribution and the ops for the wrapped functions:
https://paste.ofcode.org/vbEVMMacRCZudFTJj5NSNk
This (maybe of less interest) is the xtensor-python-based wrapper:
https://paste.ofcode.org/qRPbRTgdeP3nEAgYLPHrAD
(The c++ implementation of the distribution is from https://www.agner.org/random/stocc.zip)
I’m not really sure how to proceed, but here are a few questions I have:
- Could an explicitly implemented theano op (subclassing Op instead of using as_op) help with the shape problem? Are there any examples?
- I understand that C++ code can be attached directly to theano ops. Could such an op also link against an external library? Then I could perhaps skip the xtensor-python wrapping entirely.
- While I find it very intuitive to build PyMC3 models with broadcasting and automatic shape propagation, it is still somewhat nebulous to me what a distribution’s functions are expected to do in that regard. Looking at existing distributions, the magic seems to happen mostly automatically through theano or numpy behavior. Any hints or documentation on this would also be very helpful.
Many thanks in advance
Jan