Thanks for the answers, yes i guess the map is doing multi stage optimization. Optically it should be a fairly simple problem, that should be convex in the only free variable p, so i was hoping find_map would just be iterating over various p values every time evaluating the potential on a sample.
The more generic problem is the following: how can we solve a bayesian network parameter inference problem (structure is known), where you can draw N samples but you can only observe a deterministic function of the samples?
In my case the network is just one Bernouilli and the deterministic function is the sum of outcomes.
I have seen variants of this problem asked on the forum and i understand sampling cannot really work for non trivial deterministic functions, but i was wondering if pymc3 has other (non sampling) optimization tools that might help