Reconciling two datasets using PyMC

This is a specific example of a more general question: how to reconcile data from different datasets.

I have two datasets related to fishing. One tells me how many hours each vessel fished in each area, and the other tells me how much fish each vessel caught in each area. I also know the gear type and size of most of these vessels.

Both datasets are dirty, with various types of bias and error. For example, I know that some types of area and vessel should be trusted less than others.

I would like to reconcile these two datasets to estimate how much fish each vessel caught and in how many hours, and then compute how much each vessel can catch in one hour, which depends on gear and size.

I have a feeling I am in the right place with PyMC, but I have very little experience. I looked for tutorials on your website, and I am fairly sure that "A Primer on Bayesian Methods for Multilevel Modeling" in the PyMC example gallery is relevant for me.

But I need a nudge in the right direction. Many thanks!

It sounds like you should start with the binomial regression example. If you need more advice, I would recommend posting the header of your data, or a description of it if sharing is off the table.
