This is very interesting.
It’s clear to me why solution #1 produces OK predictions but poor parameter estimates. Essentially, there is too much prior density on low values of b (perhaps the prior was a half-normal or something similar). When b is close to 0, k1 has to end up close to the value k2 should have.
Could someone provide an intuitive explanation for why solution #2 works better?