Skip to content

Benchmarks

Julien Coupey edited this page Jun 15, 2021 · 28 revisions

All computations done on a machine using an Intel Xeon E5-1620 CPU @ 3.50GHz, 4c/8t.

VRPTW

Solomon

The Solomon benchmark is used as a reference in most research papers including time window constraints.

All required instances and scripts are available to reproduce results.

Benchmark description

56 instances with 100 jobs, grouped in classes C (clustered), R (random), RC (mix of both), and 1 or 2 for short or long planning horizon.

Global indicators

Using v1.9.0 with -x 5 and comparing to best known solutions targeting cumulated travel time (more on objectives).

Computing times (ms)
Average 425
Median 484
Longest 652
Quality
Best known solutions 14 (25.0%)
Minimum gap +0.00%
Median gap +1.11%
Average gap +1.75%
Worst gap +8.61%

Results by problem class

Comparison to best known solutions (BKS) targeting cumulated travel time (CTT). Cumulated number of vehicles (CNV) is only reported for the record.

Class C1 C2 R1 R2 RC1 RC2 Total
BKS CNV 10.0 3.0 13.25 5.36 12.88 6.25 485
BKS CTT 828.38 589.86 1175.75 878.41 1340.02 1004.21 54,699
VROOM CNV 10.0 3.0 13.33 4.36 13.12 5.25 469
VROOM CTT 828.38 589.86 1190.03 911.56 1359.48 1038.89 55,668
CTT gap +0.00% +0.00% +1.21% +3.77% +1.45% +3.45% +1.77

MDVRP

Cordeau

All required instances and scripts are available to reproduce results on the Cordeau et al. (1997) dataset.

Benchmark description

33 instances with capacity constraints and a number of jobs ranging from 48 to 360 (average 176), with vehicles starting from 2 to 9 depots.

Results

Using v1.10.0 with -x 5.

Computing times (ms)
Average 1310
Median 423
Longest 5564
Quality
Best known solutions 5 (15.2%)
Minimum gap +0.00%
Median gap +1.67%
Average gap +1.82%
Worst gap +5.02%

PDPTW

Li & Lim

The Li & Lim 100 benchmark is derived from the Solomon instances with the addition of pickup and delivery constraints.

All required instances and scripts are available to reproduce results.

Benchmark description

56 instances with around 100 jobs, grouped in classes C (clustered), R (random), RC (mix of both), and 1 or 2 for short or long planning horizon.

Global indicators

Using v1.9.0 with -x 5.

Computing times

Computing times (ms)
Average 140
Median 130
Longest 283

Quality

We target cumulated travel time (CTT), but best known results are only reported for approaches using the number of vehicles as the first optimization objective. For a meaningful comparison we rule out instances lc103, lc104, lc109 andlr211 for which we provide solutions that are better in term of CTT but use one more vehicle than the reported "best known solution".

CTT comparison
Best known solutions 24 (46.2%)
Minimum gap +0.00%
Median gap +0.09%
Average gap +1.22%
Worst gap +9.86%

CVRP

CVRPLIB

CVRPLIB is a set of benchmark instances for single-depot CVRP. We use all instances described using euclidean distance (EDGE_WEIGHT_TYPE = EUC_2D), from classes A, B, E, F, M, P and X.

All required instances and scripts are available to reproduce results.

Benchmark description

  • 189 instances
  • Sizes ranging from 15 to 1,000 jobs
  • Average size around 240 jobs
  • Number of vehicles ranging from 2 to 207
  • Average capacity tightness (total job demand / total vehicle capacity): 0.95

Global indicators

Using v1.9.0.

Exploration level 0 (fastest) 5 (best)
Median computing time 41ms 707ms
Longest computing time 2.4s 132.5s
Jobs assigned 99.6% 99.9%
Solutions with all jobs 158 (83.6%) 170 (90.0%)
Best known solutions 13 (6.9%) 42 (22.2%)

Gaps to best known solutions

Only reported for instances with all jobs assigned, so cost comparison makes sense.

Exploration level 0 (fastest) 5 (best)
Minimum gap +0.00% +0.00%
Median gap +2.84% +0.99%
Average gap +3.25% +1.73%
Worst gap +16.41% +11.00%

TSP

TSPLIB

TSPLIB is the reference benchmark for TSP. We use all instances described using euclidean distance (EDGE_WEIGHT_TYPE = EUC_2D).

All required instances and scripts are available to reproduce results.

Benchmark description

  • 78 instances
  • Sizes ranging from 50 to 18,511 points
  • Average size around 1,170 points

Global indicators

Using v1.8.0.

  • Median computing time: 35 ms
  • Average gap to optimal solution: +2.69%
  • Worst gap to optimal solution: +6.21%

Examples

Instance Size Computing time Gap
kroB100 100 5ms +0.26%
a280 280 18ms +1.12%
rat575 575 61ms +3.35%
u1060 1,060 294ms +3.04%
u2152 2,152 1824ms +4.77%
rl5915 5,915 25s +1.92%
usa13509 13,509 14m18s +2.81%
d18512 18,512 35m37s +2.99%
Clone this wiki locally