1 Introduction
The Clustered Traveling Salesman Problem (CTSP), originally proposed by Chisman [7], is an extension of the classic Traveling Salesman Problem (TSP) where the cities are grouped into clusters and each cluster of cities must be visited contiguously. Formally, the problem is defined on a symmetric complete weighted graph with a set of vertices and a set of edges . The vertex set is partitioned into disjoint clusters . Let be an symmetric distance matrix such that represents the travel cost between two corresponding vertices and , and satisfies the triangle inequality rule. The objective of the CTSP is to find a minimum cost Hamiltonian circuit over all the vertices, where the vertices of each cluster must be visited consecutively.
Fig. 1 shows a feasible solution for a CTSP instance, where the solution corresponds to a Hamiltonian cycle such that the vertices of each cluster are visited contiguously.
The CTSP can be formally modelled as the following integer programming model [7], where without loss of generality, the salesman is assumed to leave origin city and return to .
(1) 
subject to
(2) 
(3) 
(4) 
(5) 
(6) 
(7) 
where if city is visited immediately after city ; otherwise.
Objective function (1) seeks to minimize the total distance traveled by the salesman. Constraints (2) and (3) ensure that each city is visited exactly once. Constraints (4) eliminate subtours, while constraints (5) guarantee that the cities of each cluster are visited contiguously. The remaining constraints are related to the decision variables.
One notices that the CTSP is equivalent to the TSP when there is a single cluster or when each cluster contains exactly one vertex. Therefore, the CTSP is NPhard, and thus computationally challenging in the general case. From a practical perspective, the CTSP is a versatile modeling tool for several operational research applications arising in a wide variety of areas, including automated warehouse routing [7], emergency vehicle dispatching [33], production planning [22], disk defragmentation [19], and commercial transactions with supermarkets, shops and grocery suppliers [11]. As a result, effective solution methods for the CTSP can help to solve these practical problems. In fact, the computational challenge and wide range of applications of the problem have motivated a variety of approaches that are reviewed in Section 2. However, unlike the classic TSP problem for which many powerful methods have been introduced in the past decades, studies on the CTSP are still quite limited.
In this work, we investigate a problem transformation approach mentioned in [7] (1975), which converts the CTSP to the TSP and assess the interest of popular modern TSP solvers for solving the converted instances. To our knowledge, this is the first large computational study testing modern TSP solvers on solving the CTSP. The work is motivated by the following considerations. First, this transformation was tested in [18] (1985) and [22] (1979). Many powerful modern TSP solvers have not been tested for solving the CTSP. Second, intensive researches on the TSP have led to the development of very powerful solvers. Thus, it is interesting to know whether we can take advantage of these solvers to effectively solve the CTSP. Third, the TSP instances converted from the CTSP are characterized by their cluster structures. These instances constitute interesting test cases for existing TSP solvers. This work aims thus to answer the following questions.

How do stateoftheart exact TSP solvers perform on clustered instances converted from the CTSP?

How do stateoftheart inexact (heuristic) TSP solvers perform on clustered instances converted from the CTSP?

Do stateoftheart TSP solvers compete well with the best performing methods specifically designed for the CTSP?
Answering these questions helps to enrich the stateoftheart of solving the CTSP and gain novel knowledge on using powerful TSP methods to solve new problems.
The remainder of this paper is organized as follows. Section 2 reviews existing solution methods for the CTSP. Section 3 presents the transformation of the CTSP to the TSP and three popular TSP methods (solvers). Section 4 shows computational studies of the TSP solvers applied to the clustered instances and comparisons with existing algorithms dedicated to the CTSP. Finally, concluding remarks are provided in Section 5.
2 Literature review on existing solution methods
There are several dedicated solution algorithms for solving the CTSP that are based on exact, approximation, and metaheuristic approaches.
Along with the introduction of the CTSP, Chisman [7] proposed a branchandbound algorithm to solve the integer programming model presented in Section 1. Jongens and Volgenant [18] developed an algorithm based on the 1tree relaxation to provide lower bounds as well as a heuristic to find satisfactory upper bounds. Mestria et al. [23] used the mathematical formulation of [7] and IBM Parallel CPLEX solver (version 11.2) to obtain lower bounds for medium CTSP instances ().
Various approximation algorithms [5, 10, 12] have been developed for the CTSP. These approximation algorithms require either the starting and ending vertices in each cluster or a prespecified order of visiting the clusters in the tour as inputs, and solve the intercluster and intracluster problems independently. Recently, Bao and Liu [6] presented a new approximation algorithm where no starting and ending vertices were specified.
Given that the CTSP is a NPhard problem, a number of heuristic and metaheuristic algorithms have also been investigated, which aim to provide highquality solutions in acceptable computation time, but without provable optimal guarantee of the attained solutions. For example, Laporte et al. [20] presented a tabu search algorithm to solve a particular case of the CTSP, where the clusters are visited in a prespecified order. Potvin and Guertin [30]
developed a genetic algorithm for the CTSP that finds intercluster paths and then intracluster paths. Later, Ding et al.
[8] proposed a twolevel genetic algorithm for the CTSP. In the first level, a genetic algorithm is used to find the shortest Hamiltonian cycle for each cluster. In the second level, a modified genetic algorithm is applied to merge the Hamiltonian cycles of all the clusters into a complete tour.In addition to these early heuristic algorithms, Mestria et al. [23] investigated GRASP (Greedy Randomized Adaptive Search Procedure) with pathrelinking. Among the six proposed heuristics, one heuristic corresponds to the traditional GRASP procedure whereas the other heuristics include different path relinking procedures. In [24], Mestria studied a hybrid heuristic, which is based on a combination of GRASP, Iterated Local Search (ILS) and Variable Neighborhood Descent (VND). Recently, Mestria [25] presented another complex hybrid algorithm (VNRDGILS) which mixes GRASP, ILS, and Variable Neighborhood Random Descent to explore several neighborhoods. According to the computational results reported in [23, 24, 25], these GRASPbased algorithms are among the best performing heuristics specially designed for the CTSP currently available in the literature.
In this work, we explore the uncharted problem transformation approach that converts the CTSP to the conventional TSP and employs popular (exact and inexact) TSP solvers to solve the TSP instances converted from the CTSP benchmark instances.
3 Solving the CTSP via TSP methods
3.1 Transformation of the CTSP to the TSP
The basic idea of this transformation of the CTSP to the TSP is to add a large artificial cost to all intercluster edges in order to force the salesman to visit all the cities within each cluster before leaving it.
Given a CTSP instance with distance matrix , we define a TSP instance with distance matrix as follow.

Define and .

Define the travel distance in by
Obviously, if the value of is sufficiently large, then the best Hamiltonian cycle in is a feasible CTSP solution in , in which the vertices of each cluster are visited contiguously.
Property. An optimal solution to the TSP instance corresponds to an optimal solution to the original CTSP instance.
Proof. Let and be the optimal solutions of the TSP instance and the original CTSP instance , respectively. Let be the number of clusters of . To minimize the total travel cost, there are only intercluster edges in . Therefore, is a feasible CTSP solution for and satisfies the following relation:
Obviously, corresponds to by subtracting the constant .
3.2 Solution methods for the TSP
There are numerous solution methods for the TSP [3]. In this work, we adopt three very popular TSP solvers whose codes are publicly available, including one exact solver (Concorde [1]) and two inexact (heuristic) solvers (LHK2 [14] and GAEAX [29]).
Notice that the TSP instance converted from a CTSP instance has a particular feature that the vertices are grouped into clusters and the distance between each pair of vertices within a same cluster is in general small, while this distance is large for two vertices from different clusters. Along with the presentation of the TSP solvers, we discuss their suitability for solving such clustered instances each time this is appropriate.
3.2.1 Exact Concorde solver
Concorde is an advanced exact TSP solver for the symmetric TSP based on BranchandBound and problem specific cutting plane methods [1]
. It makes use of a specifically designed QSopt linear programming solver. According to
[16], Concorde is the best performing exact algorithm for the TSP. As shown in [3], Concorde can solve benchmark instances from TSPLIB with up to 1000 vertices to optimality within a reasonable computation time and it also solves large TSP instances at the cost of a long computation time.The run time behavior of Concorde has been investigated essentially on random uniform instances. For instance, in [3], Applegate et al. investigated the run time required by Concorde for solving random uniform instances and indicated that the run time increases as an exponential function of instance size . In [16], Hoos and Stützle further demonstrated that the median run time required by Concorde scales with instance size of the form () on the widely studied class of uniform random TSP instances. To our knowledge, no study has been reported concerning the behavior of Concorde on sharply clustered instances. As a result, the current study will provide useful information on this issue.
3.2.2 LinKernighan based heuristic solver
According to the TSP literature, a majority of the best performing TSP heuristic algorithms is based on the LinKernighan (LK) heuristic [21] and its extensions. The LK heuristic is a variabledepth opt local search procedure, where the opt neighborhood is partially searched with a smart pruning strategy. LK explores the most promising neighbors within the opt neighborhood, that is, the set of feasible tours obtained by removing edges and adding other edges such that the resulting tour is feasible. Several improved versions of the basic LK heuristic have been introduced within the iterated local search framework (e.g., [4, 13, 14, 26]).
Among these iterated LK algorithms, Helsgaun’s LKH [13, 14] is the uncontested stateoftheart heuristic TSP solver. In [13], Helsgaun developed an iterated version of LK together with an efficient implementation of the LK algorithm, known as the LinKernighanHelsgaun (LKH1) heuristic, where a 5opt move is used as the basic move to broaden the search and an measure method based on sensitivity analysis of minimum spanning trees is used to restrict the search to relative few of the nearest neighbors of a vertex to speed up the search process. Later, in [14], Helsgaun further extended LKH1 by developing a highly effective implementation of the opt procedure (called LKH2), which eliminated many of the limitations and shortcomings of LKH1. Furthermore, LKH2 specially extended the data structures of LKH1 to solve very large TSP instances. The main features of LKH2 include (1) using sequential and nonsequential opt moves, (2) using several partitioning procedures to partition a large TSP instance into smaller subproblems, (3) using a tour merging procedure to generate a better solution from two or more local optimum solutions, and (4) applying a backboneguided search to guide the local search to make biased local perturbations. LKH2 is considered to be one of most effective heuristic methods for finding very highquality solutions for various large TSP instances.
However, the LK algorithm and any LKbased algorithms are unsuitable for clustered instances of the TSP because they require much longer running times on such instances than on uniformly distributed instances
[28]. The main reason why the LK heuristic stumbles on clustered instances is that relatively large intercluster edges serve as bait edges. When removing such a bait edge, the LK heuristic is tricked into long and often fruitless searches. More precisely, each time an edge bridging two clusters is removed, the cumulative gain rises enormously, and the procedure is encouraged to perform very deep searches. To alleviate the problem, a cluster compensation technique was proposed in [28] for the LinKernighan heuristic to limit its performance degradation. In [14], Helsgaun showed that the LKH2 algorithm performs significantly worse on sharply clustered instances than on uniform random instances. However, no effective method was proposed in [14] to remedy this difficulty.3.2.3 Edge assembly crossover based genetic algorithm
Populationbased evolutionary algorithms are another wellknown approach for the TSP. A popular example is the powerful genetic algorithm introduced by Nagata and Kobayashi in
[29]. This algorithm (called GAEAX, see Algorithm 1) is characterized by its powerful edge assembly crossover (EAX) operator introduced in [27] with an efficient implementation and a costeffective selection strategy for maintaining population diversity.The key EAX operator generates, from two highquality tours (parents), one offspring tour by first inheriting the edges from the parents to construct disjoint subtours and then connecting the subtours with new edges in a greedy fashion (similar to building a minimal spanning tree). Let and be the parents, EAX operates as follows (see Fig. 2 for an example):

Generate an undirected multigraph defined as , where and are the sets of edges of parents and , respectively.

Extract all ABcycles from . An ABcycle is defined as a cycle in , such that edges of and edges of are alternately linked.

Construct an Eset by selecting ABcycles according to a given selection strategy (e.g., single, kmultiple, block and block2 [29]), where an Eset is a set of ABcycles.

Copy parent to an intermediate solution . Then, remove the edges of in the Eset from and add those of in the Eset to . This leads to an intermediate solution with one or more subtours.

Connect all the subtours in with new short edges to generate a complete tour (a feasible offspring solution) by using a greedy heuristic.
Note that different versions of EAX can be developed by using different selection strategies of ABcycles for constructing Esets. The GAEAX algorithm employs the single and block2 strategies to generates offspring solutions from parent solutions. To maintain a healthy population diversity, GAEAX also uses an edge entropy measure to select the solution to be used to replace a parent in the population.
Other studies (e.g., [15]) also indicated the usefulness of edgeassemblylike crossovers for solving clustered instances of the TSP. As shown in the next section, the EAXbased genetic algorithm performs remarkably well on all the clustered instances transformed from the CTSP.
4 Computational experiments
In this section, we evaluate the capacity of the TSP solvers presented in Section 3.2 to solve the CTSP via its transformation to the TSP. For this purpose, we examine their qualitative performances and run time efficiencies on various CTSP benchmark instances and make comparisons with the best dedicated CTSP algorithms in the literature.
4.1 Benchmark instances
Our computational assessments are based on three sets of 45 CTSP benchmark instances with 101 to 5000 vertices. Sets 1 and 2 include 20 medium instances () and 15 large instances (), which are classical and widely used in the CTSP literature (e.g., [23, 24, 25]). Set 3 is a new set of 10 very large instances with 3000 and 5000 vertices.
Sets 1 and 2 (35 instances): These instances belong to the following six types: (1) instances taken from the TSPLIB [32]
where the clusters are generated by using a kmeans clustering algorithm; (2) instances created from a selection of classic TSP instances
[17], where the clusters are created by grouping the vertices in geometric centers; (3) instances generated by using the Concorde interface [2]; (4) instances generated using the layout proposed in [19]; (5) instances similar to type 2, but generated with different parameters; (6) instances adapted from the TSPLIB [32], where the rectangular floor plan is divided into several quadrilaterals and each quadrilateral corresponds to a cluster.Set 3 (10 instances): These instances were created from 10 very large TSP instances [17] with 3000 and 5000 vertices. Following [24], for these instances, geometric centers are selected and the clusters are created by grouping the vertices in the geometric centers, where the coordinates of geometric centers are selected uniformly in the interval [0,1000) and is the number of clusters.
All these instances are available at https://github.com/lyldft/ctsp.
4.2 TSP solvers and experimental protocol
For our study, we employed three popular TSP solvers presented in Section 3.2, which are among the most powerful methods for the TSP in the literature.

Exact Concorde TSP solver^{1}^{1}1http://www.math.uwaterloo.ca/tsp/concorde/index.html: We used version Concorde03.12.19 and ran the solver with its default parameter setting with a cutoff time of 24 CPU hours per instance.

Inexact LKH2 TSP solver^{2}^{2}2http://akira.ruc.dk/~keld/research/LKH/: LKH2 is an iterated local search procedure and typically terminates after a fixed number of iterations (default is ). We observed that LKH2 with this default stopping condition becomes too time consuming on our clustered instances (see discussion in Section 3.2.2). In our experiment, we used a shorter number of iterations of 0.1* and 0.2* while using the default values for the other parameters of LKH2.

Inexact GAEAX TSP solver^{3}^{3}3https://github.com/sugia/GAforTSP: We used GAEAX with its default parameter setting given in [29]: , and GAEAX terminates if the difference between the average tour length and the shortest tour length in the population is less than .
The experiments were carried out on a computer running Windows 7 with an Intel Core i74790 processor (3.60 GHz and 8 GB of RAM). Given the stochastic nature of LKH2 and GAEAX, we ran each algorithm 10 times for each instances while the deterministic Concorde TSP solver was run one time to solve each instance.
4.3 Computational results and comparison of popular TSP solvers
Our computational studies aim to answer the following questions: How do stateoftheart exact TSP solvers perform on clustered instances converted from the CTSP? How do stateoftheart inexact (heuristic) TSP solvers perform on clustered instances converted from the CTSP?
The results of the three TSP solvers (Concorde, LKH2, GAEAX) on the 20 medium and 15 large CTSP benchmark instances are summarized in Tables 1 and 2. Columns 1 to 3 show the basic information of each instance: the instance name (Instance), the number of vertices () and the number of clusters (). Column 4 gives the optimal objective value reported by the exact Concorde TSP solver, followed by the required run time in seconds. For both the LKH2 and GAEAX solvers, we show the best (BErr) and average (AErr) results over 10 independent runs in the form of the percentage gap to the optimal solution, as well as the average run time in seconds. If the best solution over 10 independent runs equals the optimal solution obtained with the exact Concorde TSP solver, the corresponding cell in column BErr shows ‘=’ along with the number of runs that succeeded in finding the optimal solution. Finally, row ‘Avg.’ provides the average run time in seconds for each approach, and the average gap between the average objective values obtained with LKH2/GAEAX and the optimal values obtained with the Concorde TSP solver.
Exact TSP solver  LKH2(10 runs)  GAEAX(10 runs)  

Concorde  No. of iterations=0.1*  No. of iterations=0.2*  Default configuration  
Instance  Opt.  Time  BErr  AErr  Time  BErr  AErr  Time  BErr  AErr  Time  
i50gil262  262  50  135431  2.4  =(10)  0.0000  1.1  =(10)  0.0000  1.8  =(10)  0.0000  1.8 
10lin318  318  10  529584  2.6  =(1)  0.0020  13.1  =(1)  0.0020  22.4  =(10)  0.0000  1.9 
10pcb442  442  10  537419  28.9  =(8)  0.0032  65.2  =(8)  0.0014  112.0  =(10)  0.0000  6.4 
C1k.0  1000  10  132521027  34.0  =(3)  0.0063  78.7  =(1)  0.0056  168.5  =(10)  0.0000  15.2 
C1k.1  1000  10  129128125  27.5  0.0002  0.0013  82.6  =(3)  0.0007  148.9  =(10)  0.0000  13.7 
C1k.2  1000  10  142784000  122.7  0.0009  0.0092  221.4  0.0009  0.0092  371.2  =(10)  0.0000  16.7 
3006  300  6  8934  4.4  =(10)  0.0000  25.9  =(10)  0.0000  52.6  =(10)  0.0000  1.9 
4006  400  6  9045  6.8  =(10)  0.0000  47.6  =(10)  0.0000  102.9  =(10)  0.0000  4.5 
70020  700  20  41425  19.5  =(4)  0.0092  289.4  =(9)  0.0017  590.8  =(10)  0.0000  10.6 
2004h  200  4  62777  0.7  =(10)  0.0000  4.0  =(10)  0.0000  6.5  =(10)  0.0000  1.0 
2004x1  200  4  60574  1.7  =(10)  0.0000  4.2  =(10)  0.0000  5.9  =(10)  0.0000  1.0 
6008z  600  8  128891  5.8  =(8)  0.0063  48.0  =(6)  0.0049  84.9  =(10)  0.0000  5.6 
6008x2  600  8  128891  4.7  =(8)  0.0063  48.1  =(6)  0.0049  84.8  =(10)  0.0000  5.6 
3005108  300  5  67760  1.7  =(10)  0.0000  15.2  =(10)  0.0000  23.5  =(10)  0.0000  2.3 
30020111  300  20  309739  2.4  =(9)  0.0002  12.4  =(10)  0.0000  19.9  =(10)  0.0000  2.0 
50015306  500  15  194818  3.6  =(10)  0.0000  18.9  =(10)  0.0000  32.8  =(10)  0.0000  4.6 
50025308  500  25  365447  9.1  =(2)  0.0112  11.7  =(2)  0.0098  19.8  =(9)  0.0001  5.4 
25eil101  101  25  23671  0.3  =(10)  0.0000  0.4  =(10)  0.0000  0.6  =(10)  0.0000  0.8 
42a280  280  42  129645  2.2  =(4)  0.0079  1.9  =(10)  0.0000  3.7  =(10)  0.0000  1.9 
144rat783  783  144  914228  287.4  =(3)  0.0011  8.3  =(4)  0.0003  13.1  =(10)  0.0000  9.5 
Avg.  28.4  0.0032  49.9  0.0020  93.3  0.0000  5.6 
Exact TSP solver  LKH2 (10 runs)  GAEAX (10 runs)  

Concorde  No. of iterations=0.1*  No. of iterations=0.2*  Default configuration  
Instance  Opt.  Time  BErr  AErr  Time  BErr  AErr  Time  BErr  AErr  Time  
49pcb1173  1173  49  61600  9663.2  =(1)  0.1432  140.1  =(1)  0.0994  206.8  =(6)  0.0031  28.7 
100pcb1173  1173  100  63382  835.9  0.0063  0.1426  50.5  0.0063  0.1186  77.8  =(10)  0.0000  27.4 
144pcb1173  1173  144  62142  67.0  0.0016  0.2092  19.2  0.0016  0.2079  38.5  =(10)  0.0000  16.4 
10nrw1379  1379  10  58783  958.4  =(3)  0.0129  336.0  =(7)  0.0043  420.9  =(4)  0.0061  23.2 
12nrw1379  1379  12  59129  112.3  =(2)  0.0027  33.5  =(5)  0.0008  90.7  =(10)  0.0000  22.8 
150010503  1500  10  11116  80.4  0.0540  0.1367  447.5  =(1)  0.0387  897.6  =(10)  0.0000  21.6 
150020504  1500  20  15698  75.5  =(2)  0.0803  319.7  =(4)  0.0032  569.4  =(4)  0.0344  29.3 
150050505  1500  50  22900  66.9  =(1)  0.2000  150.5  =(1)  0.1865  299.1  =(7)  0.0013  30.9 
1500100506  1500  100  29799  127.8  =(2)  0.0718  49.5  =(3)  0.0302  87.8  =(9)  0.0010  36.6 
1500150507  1500  150  34068  137.6  =(1)  0.0361  42.5  =(1)  0.0302  78.2  =(10)  0.0000  28.9 
200010a  2000  10  105368  7957.5  0.0294  0.0793  2139.4  0.0019  0.0789  4071.9  =(1)  0.0877  45.5 
200010h  2000  10  33708  1283.1  0.0059  0.0311  700.7  0.0059  0.0311  1314.9  =(9)  0.0012  33.9 
200010z  2000  10  33509  296.4  0.0030  0.0406  279.6  =(1)  0.0164  430.0  =(9)  0.0003  36.0 
200010x1  2000  10  33792  1412.7  0.0355  0.0462  581.8  0.0296  0.0394  1387.4  =(8)  0.0059  36.0 
200010x2  2000  10  33509  226.0  0.0030  0.0406  283.5  =(1)  0.0164  430.0  =(9)  0.0003  36.7 
Avg.  1553.4  0.0849  371.6  0.0601  693.4  0.0094  30.3 
First, the exact Concorde TSP solver performs very well for these 35 instances and is able to solve all of them exactly. Specifically, the 20 medium instances can be solved easily in a short run time (an average of about 30 seconds). The 15 large instances are more difficult and the run time needed to solve these instances increases considerably (an average of 1553 seconds, reaching 9663 seconds for the most difficult instance).
Second, the inexact LKH2 TSP solver does not performs as well as Concorde. With the stopping condition of 0.1* iterations, LKH2 misses respectively 2 and 8 optimal solutions for the medium and large instances with an average run time of 49.9 and 371.6 seconds. LKH2 obtains improved results (optimal solution for one more medium instance and 3 large instances) with the relaxed condition of 0.2* iterations. However, in this case, LKH2 requires roughly doubled its run time.
Third, the GAEAX solver performs remarkably well by attaining the optimal values for all 35 instances. For the 20 medium instances, GAEAX consistently hits the optimal solutions for each of its 10 run (except for one instance for which it has a hit of 9 out of 10). For the 15 large instances, except 3 cases, GAEAX hits the optimum of each instance at least 6 times out of 10 runs. The average run time is only 5.6 seconds for the medium instances and 30.3 seconds for the large instances. Compared to the Concorde TSP solver and the LKH2 TSP solver, the GAEAX algorithm is thus extremely time efficient. Moreover, contrary to the Concorde and LKH2 solvers, the computation time required by GAEAX remains very stable across the instances of the same set, indicating a high robustness and scalability of this solver.
Exact TSP solver  LKH2 (10 runs)  GAEAX (10 runs)  

Concorde  No. of iterations=0.1*  No. of iterations=0.2*  Default configuration  
Instance  Opt.  Time  BErr  AErr  Time  BErr  AErr  Time  BErr  AErr  Time  
i1  3000  50  23571  1110.2  =(1)  0.2719  1738.4  =(1)  0.2232  2652.0  =(2)  0.1786  94.6 
i2  3000  100  32750  9899.2  0.0641  0.1658  640.5  0.0641  0.1463  1212.1  =(3)  0.0134  100.2 
i3  3000  150  36898  1367.0  =(3)  0.0740  269.5  =(4)  0.0279  468.6  =(9)  0.0003  107.5 
i4  3000  200  41825  1557.4  =(1)  0.0904  256.7  =(3)  0.0550  439.5  =(3)  0.0033  105.2 
i5  5000  150  (50587)  24h  0.6306  0.5905  1421.6  0.6306  0.6134  2617.1  0.6306  0.6292  216.9 
i6  5000  200  57185  5943.4  =(1)  0.0371  965.1  =(1)  0.0344  1858.3  =(3)  0.0044  242.8 
i7  5000  250  62464  2912.2  =(3)  0.0155  575.8  =(5)  0.0120  1261.9  =(8)  0.0005  278.2 
i8  5000  300  (65990)  24h  0.0788  0.0339  751.3  0.0833  0.0511  1305.7  0.0955  0.0899  294.0 
i9  5000  350  (70794)  24h  1.0425  0.9841  613.4  1.0594  1.0033  1419.7  1.0580  1.0519  308.3 
i10  5000  400  74459  25639.4  0.0013  0.0283  420.6  =(1)  0.0200  691.6  =(1)  0.0056  259.7 
Avg.  761.3  1392.7  200.7 
Table 3 presents the results of the three TSP solvers on the 10 new very large CTSP instances of Set 3. Notice that if an instance cannot be solved exactly by the Concorde TSP solver, the percentage gaps (BErr and AErr) are calculated using the Concorde’s best upper bound. In this case, column ‘Opt.’ corresponds to the best upper bound from Concorde, and a negative (positive) gap indicates a better (worse) result compared to this bound.
From Table 3, we can make the following observations. First, Concorde manages to optimally solve 7 out of these 10 very large instances with a run time ranging from 1100 seconds to more than 25000 seconds. For these 7 instances, LKH2 attains the optimal solutions for 6 instances while GAEAX reaches all optimal solutions. Second, for the three instances that cannot be solved exactly by Concorde, both LKH2 and GAEAX report better results than the best upper bounds of Concorde. However, LKH2 has a worse performance both in terms of solution quality and computation time compared with GAEAX. Third, GAEAX has an excellent time efficiency across the instances of this set and scales very well with the increase of instance sizes. These observations are consistent with those from Tables 12.
To sum, the exact Concorde TSP solver is very efficient for the CTSP instances with up to 1000 vertices and becomes time consuming for larger instances. The inexact LKH2 TSP solver has troubles to solve these clustered instances, which is consistent with previous studies such as [14, 28]. The EAXbased genetic algorithm performs remarkably well both in terms of solution quality and computational efficiency and scales well with the instance sizes.
To deepen our computational study, we call upon to the performance profile, a analytic tool for evaluating the performances of multiple compared optimization algorithms [9]
. The performance profile uses a cumulative distribution function for a performance metric, such as run time, objective function values, number of iterations, and so on. For a given metric, the performance profile associated to an algorithm
indicates the probability
that the algorithm attains results which are within a factor of the best result attained by all compared algorithms over a set of problem instances. A higher probability indicates a better algorithmic performance under the given metric. The value of is the probability that the algorithm will win over the rest of the compared algorithms.To make a fair and meaningful comparison with this tool, we focus on the two inexact solvers LKH2 and GAEAX and run each solver 10 times on each of the 45 instances. We use the software ‘perprofpy’ [31] to draw the performance profiles (see Figure 3) where the quality of the solution is measured by the average objective value and average run time. These performance profiles shows a clear dominance of GAEAX over LKH2 both in terms of solution quality and run time efficiency.
4.4 TSP solvers v.s. stateoftheart CTSP heuristics
In Section 4.3, we identified GAEAX as the most suitable method for solving clustered instances converted from the CTSP. We now answer the following question: Does GAEAX compete well with stateoftheart CTSP heuristics specially designed for the problem?
Algorithm name  Reference  Search strategy 

VNRDGILS  [25](2018)  A hybrid heuristic based on GRASP, ILS and VNRD 
HHGILS  [24](2016)  A hybrid heuristic based on GRASP, ILS and VND 
GPR1R2  [23](2013)  A GRASP with Path Relinking PR1 and PR2 
GPR1  [23](2013)  A GRASP with Path Relinking PR1 
GPR2  [23](2013)  A GRASP with Path Relinking PR2 
GPR3  [23](2013)  A GRASP with Path Relinking PR3 
GPR4  [23](2013)  A GRASP with Path Relinking PR4 
GRASP  [23](2013)  A traditional GRASP heuristic 
TLGA  [8](2007)  A twolevel genetic algorithm 
GAEAX  VNRDGILS  HHGILS  GPR1R2  

Instance  
i50gil262  262  50  135431  135431.0  1.8  135483  135510.2  720.0  135510  135578  720.0       
10lin318  318  10  529584  529584.0  1.9  530604  530871.4  720.0  530283  530817.9  720.0  530443  532697.9  720.0 
10pcb442  442  10  537419  537419.0  6.4  538309  538903.4  720.0  538958  539988.3  720.0  540043  543104.2  720.0 
C1k.0  1000  10  132521027  132521027.0  15.2  133260549  133490775.9  720.0  133287594  133776274.1  720.0  133490776  133708187.6  720.0 
C1k.1  1000  10  129128125  129128125.0  13.7  129877874  130035540.2  720.0  129825403  130206778.3  720.0  130193590  130391693.5  720.0 
C1k.2  1000  10  142784000  142784000.0  16.7  143321630  143481489.6  720.0  143278093  143525149.6  720.0       
3006  300  6  8934  8934.0  1.9  8935  8941.1  720.0  8934  8942.9  720.0  8959  8985.3  720.0 
4006  400  6  9045  9045.0  4.5  9053  9062.3  720.0  9051  9063.2  720.0       
70020  700  20  41425  41425.0  10.6  41456  41489.7  720.0  41452  41485.6  720.0  41540  41573.3  720.0 
2004h  200  4  62777  62777.0  1.0  62867  63058.3  720.0  62804  63058.3  720.0  62994  63710.2  720.0 
2004x1  200  4  60574  60574.0  1.0  60637  60796.2  720.0  60931  61378.5  720.0       
6008z  600  8  128891  128891.0  5.6  129468  129862.7  720.0  129416  129928.6  720.0  130459  131235.1  720.0 
6008x2  600  8  128891  128891.0  5.6  129246  129533.9  720.0  129246  129691.5  720.0       
3005108  300  5  67760  67760.0  2.3  67766  67868.7  720.0  67814  67930.5  720.0       
30020111  300  20  309739  309739.0  2.0  310146  310270.9  720.0  310209  310427  720.0  309928  310551.9  720.0 
50015306  500  15  194818  194818.0  4.6  194946  195201.5  720.0  195202  195438.1  720.0       
50025308  500  25  365447  365447.2  5.4  365717  365937.8  720.0  365828  366085  720.0  366232  366785.7  720.0 
25eil101  101  25  23671  23671.0  0.8  23673  23685.2  720.0  23678  23690  720.0  23676  23711.3  720.0 
42a280  280  42  129645  129645.0  1.9  129729  129755.2  720.0  129716  129833.2  720.0       
144rat783  783  144  914228  914228.0  9.5  915088  915179.8  720.0  915180  915363.2  720.0  915547  915913.7  720.0 
49pcb1173  1173  49  61600  61601.9  28.7  65750  66487.7  1080.0  67043  68260.7  1080.0  70651  73311.9  1080.0 
100pcb1173  1173  100  63382  63382  27.4  68708  69383.2  1080.0  68786  70640.8  1080.0  72512  74871.7  1080.0 
144pcb1173  1173  144  62142  62142  16.4  68414  68941.4  1080.0  66830  69084.3  1080.0  72889  74621.6  1080.0 
10nrw1379  1379  10  58783  58786.6  23.2  63951  64895.9  1080.0  63620  64643.9  1080.0  66747  68955.8  1080.0 
12nrw1379  1379  12  59129  59129  22.8  62893  63532.3  1080.0  63558  64741.6  1080.0  66444  69141.2  1080.0 
150010503  1500  10  11116  11116  21.6  11969  12103.0  1080.0  11986  12109.5  1080.0  12278  12531.4  1080.0 
150020504  1500  20  15698  15703.4  29.3  16678  16867.4  1080.0  17107  17315.7  1080.0  17252  17589.1  1080.0 
150050505  1500  50  22900  22900.3  30.9  24631  24803.6  1080.0  25264  25558.9  1080.0  25124  25761.5  1080.0 
1500100506  1500  100  29799  29799.3  36.6  32474  32616.8  1080.0  32260  33760.6  1080.0  33110  33692.7  1080.0 
1500150507  1500  150  34068  34068  28.9  37357  38251.1  1080.0  37658  38433.1  1080.0  38767  39478.0  1080.0 
200010a  2000  10  105368  105460.4  45.5  115779  116897.3  1080.0  116254  116881.4  1080.0  116473  118297.5  1080.0 
200010h  2000  10  33708  33708.4  33.9  36806  38351.8  1080.0  36447  37305.1  1080.0  37529  38861.8  1080.0 
200010z  2000  10  33509  33509.1  36.0  36815  38035.7  1080.0  37059  37443.7  1080.0  37440  38765.9  1080.0 
200010x1  2000  10  33792  33794  36.0  36783  37488.6  1080.0  36752  37704.0  1080.0  37262  39253.1  1080.0 
200010x2  2000  10  33509  33509.1  36.7  37132  38240.6  1080.0  36660  37117.1  1080.0  37704  38699.5  1080.0 
pvalue  2.477e7  2.477e7  3.651e7  2.477e7 
For this purpose, we adopt three best performing CTSP heuristics: VNRDGILS [25], HHGILS [24], and GPR1R2 [23]. Indeed, according to the experimental studies reported in [23, 24, 25], these three heuristics perform the best among the recent CTSP heuristics available in the literature (see Table 4). This study is based on the 35 medium and large instances of Sets 1 and 2 (no results for the three CTSP heuristics are available on the 10 very large instances of Set 3).
Table 5 provides a summary of the results of the GAEAX TSP solver along with the results reported by the three CTSP algorithms on the medium and large instances. For each instance and algorithm, columns ‘’, ‘’ and ‘’ show respectively the best objective value over 10 independent runs, the average objective value and the average run time in seconds. To determine whether there exists a statistically significant difference in performance between the GAEAX TSP solver and each CTSP algorithm in terms of best and average results, the values from the Wilcoxon signedrank tests are given in the last row of the tables. Entries with “” mean that the corresponding results are not available in the literature. The best objective values obtained by the compared algorithms are indicated in bold if they attain the optimal solution. Notice that the results of the CTSP algorithms (VNRDGILS, HHGILS and GPR1R2) correspond to 10 executions per instance on a computer with 2.83 GHz Intel Core 2 CPU and 8 GB RAM and the time limit per run was set to 720 seconds for medium instances and 1080 seconds for large instances.
From Table 5, we observe that compared to the three CTSP algorithms, the GAEAX TSP solver attains consistently the optimal solutions for all 35 medium and large CTSP instances. However, among the three CTSP algorithms, the optimal result is obtained only for 1 instance by HHGILS. Although the experimental platforms are different, we further observe that the GAEAX TSP solver is more than an order of magnitude faster than the CTSP algorithms while reporting much better results.
Figure 4 provides boxplot graphs to compare the distribution and range of the average results for each compared algorithm, except GPR1R2 for the medium instances since its results on several medium instances are not available. In this figure, the average objective value of a given algorithm is normalized according to the relation , where is the optimal value. The plots in Figure 4 show clear differences in the distributions of the average results between GAEAX and each compared CTSP heuristic, which further confirms the efficiency of the GAEAX TSP solver with respect to these dedicated CTSP heuristics.
GAEAX  VNRDGILS  HHGILS  GPR1R2  

Medium instance set  Optimal solutions  20/20  0/20  1/20  0/20 
Average /(%)  0.00/0.00  0.18/0.30  0.21/0.40  0.39/0.73  
Average time (s)  5.6  720.0  720.0  720.0  
Large instance set  Optimal solutions  15/15  0/15  0/15  0/15 
Average /(%)  0.00/0.01  8.61/10.39  11.04/15.51  12.24/15.49  
Average time (s)  30.3  1080.0  1080.0  1080.0 
Table 6 summarizes the statistical results for each compared algorithm on the two sets of medium and large instances. The first row indicates the number of optimal solutions found by each approach. The average percentage gap of the best/average result from the optimal result is provided in row ‘Average /’. Finally, row ‘Average time (s)’ provides the average run time in seconds for each algorithm. From Table 6, we observe that the GAEAX solver significantly outperforms the three CTSP algorithms on the medium and large instances in terms of both the best and the average results. For the large instance set, the improvement gaps between the results of GAEAX and those of the CTSP methods are very high, ranging from 10.39% to 15.49%. Furthermore, in terms of the average run time, GAEAX is about 30 to 130 times faster than the CTSP algorithms. The above results thus indicate that the GAEAX TSP solver has a strong dominance over current best performing CTSP approaches in the literature. Finally, the results of the Concorde TSP solver and the LKH2 solver reported in Section 4.3 indicate that these TSP solvers also dominate the current best CTSP algorithms in the literature.
5 Conclusions
This paper presents the first large computational study on testing modern TSP solvers for solving the CTSP. According to the computational results from the exact Concorde TSP solver and the inexact LKH2 and GAEAX TSP solvers on two sets of medium and large CTSP benchmark instances available in the literature (with up to 2000 vertices) and a new set of very large CTSP instances (with up to 5000 vertices), we can make the following conclusions.

The exact Concorde TSP solver can optimally solve all medium and large CTSP instances, but fails to solve three very large instances with 5000 vertices in 24 hours. Its solution time increases considerably with the instance sizes.

Due to the clustering nature of the transformed instances, the powerful inexact LKH2 TSP solver does not perform well. LKH2 reports a worse performance both in terms of solution quality and computation time, compared with GAEAX.

The GAEAX solver performs remarkably well both in terms of solution quality and computational efficiency, with a very high scalability. It can stably attain the optimal solutions for all medium and large CTSP instances available in the literature with a short time.

The TSP solvers all dominate the current best performing CTSP heuristics specially designed for the problem. This is particular true for the GAEAX solver, which is 30 to 130 times faster than the stateoftheart CTSP heuristics to find much better results.
Finally, this study also indicates that the existing CTSP benchmark instances in the literature are not challenging for modern TSP solvers even if they remain difficult for the current CTSP algorithms.
Acknowledgment
This work is partially supported by the National Natural Science Foundation Program of China [Grant No. 71401059, 71771099] and the China Postdoctoral Science Foundation [Grant No. 2019M662649].
References
 [1] Applegate, D., Bixby, R., Chvatal, V., & Cook, W. (2001). Concorde TSP solver http://www.math.uwaterloo.ca/tsp/concorde/index.html.
 [2] Applegate, D., Bixby, R., & Chvatal, V. (2007). Concorde TSP Solver. William Cook, School of Industrial and Systems Engineering, Georgia Tech.
 [3] Applegate, D., Bixby, R., Chvatal, V., & Cook, W. (2006). The traveling salesman problem: a computational study. Princeton University Press.
 [4] Applegate, D., Cook, W., & Rohe, A. (2003). Chained LinKernighan for large traveling salesman problems. INFORMS Journal on Computing, 15(1), 8292.
 [5] Anily, S., Bramel, J., & Hertz, A. (1999). A 53approximation algorithm for the clustered traveling salesman tour and path problems. Operations Research Letters, 24(12), 2935.
 [6] Bao, X., & Liu, Z. (2012). An improved approximation algorithm for the clustered traveling salesman problem. Information Processing Letters, 112(23), 908910.
 [7] Chisman, J. A. (1975). The clustered traveling salesman problem. Computers & Operations Research, 2(2), 115119.
 [8] Ding, C., Cheng, Y., & He, M. (2007). Twolevel genetic algorithm for clustered traveling salesman problem with application in largescale TSPs. Tsinghua Science and Technology, 12(4), 459465.
 [9] Dolan, E. D., & Moré, J. J. (2002). Benchmarking optimization software with performance profiles. Mathematical Programming, 91(2), 201213.
 [10] Gendreau, M., Laporte, G., & Hertz, A. (1997). An approximation algorithm for the traveling salesman problem with backhauls. Operations Research, 45(4), 639641.

[11]
Ghaziri, H., & Osman, I. H. (2003). A neural network algorithm for the traveling salesman problem with backhauls. Computers & Industrial Engineering, 44(2), 267281.
 [12] GuttmannBeck, N., Hassin, R., Khuller, S., & Raghavachari, B. (2000). Approximation algorithms with bounded performance guarantees for the clustered traveling salesman problem. Algorithmica, 28(4), 422437.
 [13] Helsgaun, K. (2000). An effective implementation of the LinKernighan traveling salesman heuristic. European Journal of Operational Research, 126(1), 106130.
 [14] Helsgaun, K. (2009). General kopt submoves for the LinKernighan TSP heuristic. Mathematical Programming Computation, 1(23), 119163.
 [15] Hains, D., Whitley, D., & Howe, A. (2012). Improving LinKernighanHelsgaun with crossover on clustered instances of the TSP. In International Conference on Parallel Problem Solving from Nature (pp. 388397). Springer, Berlin, Heidelberg.
 [16] Hoos, H. H., & Stützle, T. (2014). On the empirical scaling of runtime for finding optimal solutions to the travelling salesman problem. European Journal of Operational Research, 238(1), 8794.
 [17] Johnson, D. S., & McGeoch, L. A. (2007). Experimental analysis of heuristics for the STSP. In The Traveling Salesman Problem and its Variations (pp. 369443). Springer, Boston, HEA.
 [18] Jongens, K., & Volgenant, T. (1985). The symmetric clustered traveling salesman problem. European Journal of Operational Research, 19(1), 6875.
 [19] Laporte, G., & Palekar, U. (2002). Some applications of the clustered travelling salesman problem. Journal of the Operational Research Society, 53(9), 972976.
 [20] Laporte, G., Potvin, J. Y., & Quilleret, F. (1997). A tabu search heuristic using genetic diversification for the clustered traveling salesman problem. Journal of Heuristics, 2(3), 187200.
 [21] Lin, S., & Kernighan, B. W. (1973). An effective heuristic algorithm for the travelingsalesman problem. Operations Research, 21(2), 498516.
 [22] Lokin, F. C. J. (1979). Procedures for travelling salesman problems with additional constraints. European Journal of Operational Research, 3(2), 135141.
 [23] Mestria, M., Ochi, L. S., & de Lima Martins, S. (2013). GRASP with path relinking for the symmetric Euclidean clustered traveling salesman problem. Computers & Operations Research, 40(12), 32183229.
 [24] Mestria, M. (2016). A hybrid heuristic algorithm for the clustered traveling salesman problem. Pesquisa Operacional, 36(1), 113132.
 [25] Mestria, M. (2018). New hybrid heuristic algorithm for the clustered traveling salesman problem. Computers & Industrial Engineering, 116, 112.

[26]
Martin, O., Otto, S. W., & Felten, E. W. (1991). Largestep Markov chains for the traveling salesman problem. Oregon Graduate Institute of Science and Technology, Department of Computer Science and Engineering.
 [27] Nagata, Y.,& Kobayashi, S. (1997) Edge assembly crossover: A highpower genetic algorithm for the traveling salesman problem. Proc. 7th Internat. Conf. Genetic Algorithms (Morgan Kaufmann,San Francisco), 450457.
 [28] Neto, D. (1999). Efficient Cluster Compensation For LinKernighan Heuristics. PhD thesis, University of Toronto.
 [29] Nagata, Y., & Kobayashi, S. (2013). A powerful genetic algorithm using edge assembly crossover for the traveling salesman problem. INFORMS Journal on Computing, 25(2), 346363.
 [30] Potvin, J. Y., & Guertin, F. (1996). The clustered traveling salesman problem: A genetic approach. In MetaHeuristics (pp. 619631). Springer, Boston, HEA.

[31]
Siqueira, A. S., da Silva, R. C., & Santos, L. R. (2016). Perprofpy: A python package for performance profile of mathematical optimization software. Journal of Open Research Software, 4(1).
 [32] Reinelt, G. (1991). TSPLIBA traveling salesman problem library. ORSA Journal on Computing, 3(4), 376384.
 [33] Weintraub, A., Aboud, J., Fernandez, C., Laporte, G., & Ramirez, E. (1999). An emergency vehicle dispatching system for an electric utility in Chile. Journal of the Operational Research Society, 50(7), 690696.
 [34] Whitley, D., Hains, D., & Howe, A. (2010). A hybrid genetic algorithm for the traveling salesman problem using generalized partition crossover. In International Conference on Parallel Problem Solving from Nature (pp. 566575). Springer, Berlin, Heidelberg.
 [35] Watson, J., Ross, C., Eisele, V., Denton, J., Bins, J., Guerra, C. D. W. A. H., & Howe, A. (1998). The traveling salesman problem, edge assembly crossover, and 2opt. In International Conference on Parallel Problem Solving from Nature (pp. 823832). Springer, Berlin, Heidelberg.
Comments
There are no comments yet.