Many Divergences for a simple model

How you code your outcome variable is up to you, but quite a lot hinges on that decision (so it’s not a choice to be taken lightly). When you have an outcome that is a proportion, the binomial is pretty standard (e.g., it’s the basis of logistic and probit regressions), but some people use standard linear regression to tackle proportions (e.g., using a normal distribution for the likelihood). The latter may not have all the desirable properties one might wish for (e.g., predictions may fall outside the valid [0, 1] range of proportions), but may reflect the most natural interpretation of a particular application. I have seen linear regression suggested in cases where the proportions cannot be thought of as a set of discrete dichotomous outcomes (e.g., the proportion of some total amount of liquid). A rough sketch of the binomial approach is below.
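To make the binomial option concrete, here is a minimal sketch (not your model; the data, predictor, and variable names are made up for illustration) of a binomial likelihood with a logit link in PyMC3:

```python
import numpy as np
import pymc3 as pm

# Hypothetical example data: n_trials[i] attempts and successes[i] successes per observation
n_trials = np.array([10, 12, 8, 15, 9])
successes = np.array([3, 7, 2, 11, 4])
x = np.array([0.1, 0.5, -0.3, 0.9, 0.0])  # a made-up predictor

with pm.Model() as model:
    intercept = pm.Normal("intercept", mu=0.0, sigma=1.0)
    slope = pm.Normal("slope", mu=0.0, sigma=1.0)
    # The logit link keeps the predicted proportion inside [0, 1]
    p = pm.math.invlogit(intercept + slope * x)
    pm.Binomial("obs", n=n_trials, p=p, observed=successes)
    trace = pm.sample(1000, tune=1000)
```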

Regardless, if you don’t have a whole lot of confidence about this central aspect of your model, I would suggest backing up and thinking through these issues. They are pretty fundamental to the entire process.

Once I have this, how can I use the MAP to predict on test data?

This one is easy: don’t. MAP estimates and modes ignore all that sweet, sweet uncertainty you worked so hard to get a handle on in the first place, and so they are pretty unnatural within a Bayesian approach. MAPs, like MLEs, are convenient because dealing with scalars is always easier than dealing with full distributions. But distributions are just what happens when you acknowledge uncertainty. When asking questions of models, expect to get answers in the form of distributions. What is the MSE of my model? The answer will be a distribution. What is the R^2 of my model? The answer will be a distribution. Taking the mean/median/mode of a distribution and pretending it represents an exhaustive summary of the full distribution will likely get you into trouble.
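For example, rather than computing a single MSE from a point estimate, you can compute one MSE per posterior predictive draw and keep the whole distribution. A minimal sketch, continuing the hypothetical `model`, `trace`, `successes`, and `n_trials` from the sketch above:

```python
with model:
    # Draw posterior predictive samples of "obs" for each posterior draw
    ppc = pm.sample_posterior_predictive(trace)

observed_prop = successes / n_trials
predicted_prop = ppc["obs"] / n_trials  # shape: (n_draws, n_observations)

# One MSE per draw -> a distribution of MSEs, not a single number
mse_draws = ((predicted_prop - observed_prop) ** 2).mean(axis=1)
print(np.percentile(mse_draws, [2.5, 50, 97.5]))  # summarize, don't collapse
```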

With Bayesian methods, how do people generally assess predictive ability?

This one has many answers. AIC is not generally used, as it quantifies model complexity in ways that aren’t a natural fit for Bayesian models, but there are information criteria that are used in Bayesian contexts (see the citation below). Some people calculate quantities such as MSE. Some use cross-validation (e.g., leave-one-out). Here again, I would strongly suggest that you think through what you want before figuring out how to get it.
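If you do end up wanting WAIC or (approximate) leave-one-out cross-validation, ArviZ can compute both from a fitted PyMC3 model. A sketch, assuming the hypothetical `model` and `trace` from above:

```python
import arviz as az

with model:
    # Convert the trace to InferenceData (includes pointwise log-likelihood by default)
    idata = az.from_pymc3(trace=trace)

print(az.waic(idata))  # WAIC
print(az.loo(idata))   # Pareto-smoothed importance sampling LOO-CV
```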

In the meantime, you can check out a couple of the presentations from PyMCon, which took place a couple of weeks ago:

Posterior Predictive Sampling in PyMC3 by Luciano Paz

A Tour of Model Checking techniques by Rob Zinkov

For more information about WAIC/AIC/DIC/deviance, I would suggest this paper:
Gelman, A., Hwang, J., & Vehtari, A. (2014). Understanding predictive information criteria for Bayesian models. Statistics and Computing, 24(6), 997-1016.