Modelling sales for two cities

Just eyeballing it, the model's errors look serially correlated: it produces “runs” of under-estimates and over-estimates. Another way to think about it is that the exogenous “innovations” driving the shared common behavior are persistent over time. You could test this more formally by looking at the autocorrelations and partial autocorrelations of the model residuals.
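Here is a rough sketch of that residual check in plain NumPy. The residuals are simulated AR(1) noise standing in for (observed sales minus fitted mu) for one city, and the function names `acf` and `pacf` are just illustrative helpers, not anything from your model. The partial autocorrelation at lag k is taken as the last coefficient of an AR(k) least-squares fit, which is one common way to estimate it.

```python
import numpy as np

def acf(resid, max_lag):
    """Sample autocorrelations of a residual series for lags 1..max_lag."""
    r = np.asarray(resid, dtype=float)
    r = r - r.mean()
    denom = np.dot(r, r)
    return np.array([np.dot(r[k:], r[:-k]) / denom for k in range(1, max_lag + 1)])

def pacf(resid, max_lag):
    """Partial autocorrelations: last OLS coefficient of an AR(k) fit, k = 1..max_lag."""
    r = np.asarray(resid, dtype=float)
    r = r - r.mean()
    out = []
    for k in range(1, max_lag + 1):
        # Columns are lag 1, ..., lag k of the series, aligned with r[k:].
        X = np.column_stack([r[k - j:len(r) - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(X, r[k:], rcond=None)
        out.append(coef[-1])
    return np.array(out)

# Hypothetical residuals: simulated AR(1) noise with persistence 0.7.
# In practice, replace this with your model's residuals for one city.
rng = np.random.default_rng(0)
n = 500
resid = np.zeros(n)
for t in range(1, n):
    resid[t] = 0.7 * resid[t - 1] + rng.normal()

band = 1.96 / np.sqrt(n)  # rough 95% band under the no-autocorrelation null
rho = acf(resid, 5)
phi = pacf(resid, 5)
print("ACF :", np.round(rho, 2))
print("PACF:", np.round(phi, 2))
print("band: +/-", round(band, 3))
```

Values well outside the band at low lags point to serial correlation. A PACF that is large at lag 1 or 2 and small afterwards suggests that one or two AR terms would probably be enough.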

This is a long-winded way of saying that you should consider an AR component in your model. The simplest way to do it is to include one or two lags of each city's sales as regression terms in mu.
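To make the lag idea concrete, here is a minimal sketch that uses ordinary least squares as a stand-in for whatever model you are actually fitting. The data are simulated, and the names (`add_lags`, `city_a`, `city_b`) are made up for illustration. The only point is how the lagged-sales columns get built and entered into the mean.

```python
import numpy as np

def add_lags(y, n_lags):
    """Return (target, lag_matrix); lag_matrix[:, j-1] holds y lagged by j periods."""
    y = np.asarray(y, dtype=float)
    lags = np.column_stack([y[n_lags - j:len(y) - j] for j in range(1, n_lags + 1)])
    return y[n_lags:], lags

# Hypothetical data: two cities' sales driven by a persistent shared component.
rng = np.random.default_rng(1)
T = 400
common = np.zeros(T)
for t in range(1, T):
    common[t] = 0.8 * common[t - 1] + rng.normal()
sales = {
    "city_a": 10.0 + common + 0.3 * rng.normal(size=T),
    "city_b": 20.0 + 1.5 * common + 0.3 * rng.normal(size=T),
}

fits = {}
for city, y in sales.items():
    target, lags = add_lags(y, n_lags=2)
    # mu_t = intercept + b1 * sales_{t-1} + b2 * sales_{t-2}
    X = np.column_stack([np.ones(len(target)), lags])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    # Lag-1 autocorrelation of the residuals left over after the AR terms.
    rc = resid - resid.mean()
    lag1 = np.dot(rc[1:], rc[:-1]) / np.dot(rc, rc)
    fits[city] = {"beta": beta, "resid_lag1_acf": lag1}
    print(city, np.round(beta, 2), round(lag1, 3))
```

In your own model, the same lag columns would go in as extra predictors in mu alongside whatever is already there. Note that the first `n_lags` observations drop out of the likelihood, because they have no lagged values to condition on.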