I’ve been fiddling with the source code of “smc.py” recently and found something that bugs me. When the summary statistic is set with “sum_stat=sort”, np.sort is applied to observations and to sim_data independently, as follows:
self.observations = self.sum_stat(observations)
(...)
elemwise = self.distance(self.epsilon, self.observations, self.sum_stat(sim_data))
Shouldn’t both arrays be sorted by the ordering of one of them (observations)? In many cases this makes no difference, but suppose that for observations = [3.04, 2.99, 1.5] the corresponding sim_data is [3.02, 3.05, 1.56]. Sorting each array on its own gives [1.5, 2.99, 3.04] and [1.56, 3.02, 3.05], so 2.99 is compared with 3.02 and 3.04 with 3.05, whereas the original pairs were 3.04 with 3.02 and 2.99 with 3.05. The distance function then does not compute the distance between the correct pairs of values.
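To make the difference concrete, here is a minimal sketch in plain NumPy. It is not the actual smc.py code: the names order, obs_paired, sim_paired and the two abs_diff variables are only for illustration, and the elementwise absolute difference is an assumed stand-in for whatever self.distance computes.

```python
import numpy as np

observations = np.array([3.04, 2.99, 1.5])
sim_data = np.array([3.02, 3.05, 1.56])

# What sum_stat=sort appears to do: sort each array on its own.
obs_sorted = np.sort(observations)   # [1.5, 2.99, 3.04]
sim_sorted = np.sort(sim_data)       # [1.56, 3.02, 3.05]

# What I would have expected: reorder both arrays by the ordering
# of observations, so that the original pairs stay together.
order = np.argsort(observations)     # indices [2, 1, 0]
obs_paired = observations[order]     # [1.5, 2.99, 3.04]
sim_paired = sim_data[order]         # [1.56, 3.05, 3.02]

# Assumed stand-in for the distance: elementwise absolute difference.
abs_diff_independent = np.abs(obs_sorted - sim_sorted)
abs_diff_paired = np.abs(obs_paired - sim_paired)

print("independent:", abs_diff_independent)
print("paired:     ", abs_diff_paired)
```

With independent sorting the differences are roughly 0.06, 0.03 and 0.01, while keeping the original pairs gives roughly 0.06, 0.06 and 0.02, so the two approaches really do feed different values to the distance.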
I would greatly appreciate it if someone could explain whether this sorting procedure is intended to work the way it does or whether it is an error in the source code.