Hi,
This is a real question from a person who had a meeting with producers of feminine hygiene products (FHP). We answered her question using count data from beach-litter surveys conducted on Lake Geneva, Switzerland, from 2015 to 2021. There were 247 beach-litter surveys at 38 different locations. In total, 78,104 objects were removed and identified, of which 358 were FHP. There are three sampling periods.
We are keeping this initial model simple because we fully intend to develop it further, for things like map overlays and recommendations.
The model looks like this:
import arviz as az
import numpy as np
import pymc3 as pm

def model_a_sampling_period(observed, FHP, pk, names=["pLake", "thetaLake"], model_name=["yeartwo"], draws=5000):
    with pm.Model() as a:
        # Beta prior on the probability of finding at least one FHP at a survey
        plake = pm.Beta(names[0], alpha=FHP, beta=pk)
        # Binomial with n=1: each survey is a single found / not found trial
        thetaLake = pm.Binomial(names[1], n=1, observed=observed, p=plake)
        trace = pm.sample(draws=draws, cores=4, tune=1000, return_inferencedata=False, progressbar=False, start={names[1]: [np.mean(observed)]})
        postpred = pm.sample_posterior_predictive(trace, progressbar=False, var_names=names)
        prior = pm.sample_prior_predictive()
        # Bundle the trace, posterior predictive and prior samples for ArviZ
        priordata = az.from_pymc3(
            trace=trace,
            posterior_predictive=postpred,
            prior=prior,
        )
    return {model_name[0]: priordata, "model": a}
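Because a Binomial likelihood with n=1 is a Bernoulli trial and the prior is a Beta, the posterior for pLake also has a closed form, which can serve as a quick check on the sampler output. A minimal sketch using only the standard library; the function names and the numbers are made up for illustration and are not taken from the survey data:

```python
def beta_bernoulli_posterior(observed, alpha, beta):
    """Return (alpha, beta) of the Beta posterior after observing 0/1 outcomes."""
    successes = sum(observed)
    failures = len(observed) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical inputs: 8 surveys, 5 of which found at least one object,
# and a hypothetical Beta(3, 7) prior.
observed = [1, 0, 0, 1, 1, 0, 1, 1]
post_alpha, post_beta = beta_bernoulli_posterior(observed, alpha=3, beta=7)
print(post_alpha, post_beta, round(beta_mean(post_alpha, post_beta), 3))
# prints: 8 10 0.444
```

If the posterior mean of the sampled pLake is far from this value for the same inputs, the priors or the observed array passed to the model are worth a second look.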
This gives the difference between five locations that are in the same city.
In this case we are only testing the incidence of finding at least one FHP object at a survey.
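Reducing each survey to "found at least one" and comparing two locations could be sketched as follows. This is an illustration only: the counts, the location names and the flat Beta(1, 1) prior are hypothetical, and the comparison draws from the closed-form Beta posterior rather than running the PyMC model.

```python
import random


def to_incidence(counts):
    """1 if at least one object was found at the survey, else 0."""
    return [1 if c > 0 else 0 for c in counts]


def posterior_draws(observed, alpha=1.0, beta=1.0, draws=5000, seed=0):
    """Draw from the Beta posterior of the incidence probability."""
    rng = random.Random(seed)
    k = sum(observed)   # surveys with at least one object
    n = len(observed)   # total surveys
    return [rng.betavariate(alpha + k, beta + n - k) for _ in range(draws)]


# Hypothetical FHP counts per survey at two locations
counts = {
    "location_a": [0, 2, 1, 0, 4, 1],
    "location_b": [0, 0, 1, 0, 0, 0],
}

samples = {
    name: posterior_draws(to_incidence(c), seed=i)
    for i, (name, c) in enumerate(counts.items())
}

# Difference in incidence probability, draw by draw
diff = [a - b for a, b in zip(samples["location_a"], samples["location_b"])]
prob_a_higher = sum(d > 0 for d in diff) / len(diff)
print(f"P(incidence at a > incidence at b) = {prob_a_higher:.2f}")
```

Note that turning counts into 0/1 indicators discards how many objects were found, which is the information a count model would use.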
This work is being done here:
The data is from a national survey, soon to be officially released:
Anybody interested in marine litter and observational data may enjoy the challenge of making predictions. We are looking to integrate the process into our new app and the next report (using the negative binomial instead).
at your service
roger at hammerdirt