The Benchmarks

This section presents a very high-level overview of all the benchmark test suites I have considered, together with an equally high-level analysis of the results obtained with the different solvers across all benchmarks.

The conclusions in this section are quantitative enough to give a feeling for a specific solver's performance, but given the amount of information condensed into a few tables and figures, the analysis presented here has to be complemented by the suite-by-suite examination of the results.

One thing is definitely sure: the No Free Lunch (NFL) theorem strikes again.

In this page, the results in tables and pictures are presented by benchmark test suite in alphabetical order, although the order of the HTML web pages containing the description of the test suites is different: I have sorted them by their widespread usage in the optimization community or by the level of interest/novelty they inspired in me. So, for example, the first HTML page after this section contains the SciPy Extended benchmark test suite, followed by GKLS and then GlobOpt, and so on.

Stopping Conditions and Tolerances

As mentioned in the main page, the test functions in almost all benchmark suites have known global optimum values: in the few cases where the global optimum value is unknown (or I was unable to calculate it analytically), its value has been estimated by enormous grid searches plus repeated minimizations. I then considered an algorithm to be successful if:

\lvert F_{known \hspace{3pt} minimum} - F_{algorithm \hspace{3pt} minimum} \rvert \leq 10^{-6}
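
In code terms, the success check is nothing more than the following (a minimal sketch, where fglob is the known optimum and fbest is the best objective value returned by the solver):

 def solved(fglob, fbest, tolf=1.0e-6):
     # An algorithm is "successful" if it gets within tolf of the known optimum
     return abs(fglob - fbest) <= tolf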

All the other checks for tolerances - be those on the relative change of the objective function from one iteration to the next, or on too-small variations of the variables in the function domain - have been disabled or mercilessly and brutally commented out in the code for each solver.

For example, for NLopt algorithms like CRS2 or DIRECT, a way to ignore all other tolerances and only use the formula above may be:

 import nlopt

 # objective_function (with the NLopt signature f(x, grad)), N (problem
 # dimensionality), lb/ub (bound arrays), fglob (known global optimum value),
 # tolf (the 1e-6 tolerance) and maxfun (evaluation budget) are assumed to be
 # defined elsewhere
 algorithm = nlopt.GN_DIRECT_L

 problem = nlopt.opt(algorithm, N)
 problem.set_min_objective(objective_function)
 problem.set_lower_bounds(lb)
 problem.set_upper_bounds(ub)

 # stop only when the known global optimum is reached (within tolf) or when
 # the budget of function evaluations is exhausted...
 problem.set_stopval(fglob + tolf)
 problem.set_maxeval(maxfun)

 # ... and disable every other convergence test
 problem.set_ftol_abs(-1)
 problem.set_ftol_rel(-1)
 problem.set_xtol_abs(-1)
 problem.set_xtol_rel(-1)
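
For solvers that do not expose an equivalent of set_stopval, a generic workaround is to wrap the objective function and abort the run once the success criterion is met or the budget is exhausted. The sketch below is only illustrative (the class name and details are mine, not the exact code used in the benchmarks):

 class StopOptimization(Exception):
     """Raised to abort a solver once the stopping criterion is met."""

 class ObjectiveWrapper:
     # Illustrative helper: stops the run as soon as the best value found is
     # within tolf of the known optimum fglob, or after maxfun evaluations
     def __init__(self, func, fglob, tolf=1.0e-6, maxfun=2000):
         self.func, self.fglob = func, fglob
         self.tolf, self.maxfun = tolf, maxfun
         self.nfev, self.best = 0, float("inf")

     def __call__(self, x):
         self.nfev += 1
         f = self.func(x)
         self.best = min(self.best, f)
         if abs(self.best - self.fglob) <= self.tolf or self.nfev >= self.maxfun:
             raise StopOptimization
         return f

The exception is then caught around the solver call, and the best value seen so far is compared against the formula above.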

High-Level Description of the Test Suites

This whole exercise is made up of 6,825 test problems divided across 16 different test suites: most of these problems are of low dimensionality (2 to 6 variables) with a few benchmarks extending to 9+ variables. With all the sensitivities performed during this exercise on those benchmarks, the overall grand total number of function evaluations stands at 3,859,786,025 - close to 4 billion. Not bad.

Each test suite has a separate section in the next pages, describing in a bit more detail what I have managed to understand of the approach used to create each benchmark suite - which in many cases is not much. I hope I will be able to give a general idea of the main characteristics of each test suite, with a bunch of cool-looking pictures too...

So, in order of implementation, these are the benchmark suites:

  1. SciPy Extended: 235 multivariate problems (where the number of independent variables ranges from 2 to 17), with multiple local/global minima.

    I have added about 40 new functions to the standard SciPy benchmarks and fixed a few bugs in the existing benchmark models in the SciPy repository.

  2. GKLS: 1,500 test functions, with dimensionality varying from 2 to 6, generated with the super famous GKLS Test Functions Generator. I have taken the original C code (available at http://netlib.org/toms/) and converted it to Python.

  3. GlobOpt: 288 tough problems, with dimensionality varying from 2 to 5, created with another test function generator which I arbitrarily named “GlobOpt”: https://www.researchgate.net/publication/225566516_A_new_class_of_test_functions_for_global_optimization . The original code is in C++ and I have bridged it to Python using Cython.

    Many thanks go to Professor Marco Locatelli for providing an updated copy of the C++ source code.

  4. MMTFG: sort-of an acronym for “Multi-Modal Test Function with multiple Global minima”, this test suite implements the work of Jani Ronkkonen: https://www.researchgate.net/publication/220265526_A_Generator_for_Multimodal_Test_Functions_with_Multiple_Global_Optima . It contains 981 test problems with dimensionality varying from 2 to 4. The original code is in C and I have bridged it to Python using Cython.

  5. GOTPY: a generator of benchmark functions using the Bocharov-Feldbaum “Method-Min”, containing 400 test problems with dimensionality varying from 2 to 5. I have taken the Python implementation from https://github.com/redb0/gotpy and improved it in terms of runtime.

    Original paper from http://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=at&paperid=11985&option_lang=eng .

  6. Huygens: this benchmark suite is very different from the rest, as it uses a “fractal” approach to generate test functions. It is based on the work of Cara MacNish on Fractal Functions. The original code is in Java, and at the beginning I just converted it to Python: given it was slow as a turtle, I have re-implemented it in Fortran and wrapped it using f2py, then generating 600 2-dimensional test problems out of it.

  7. LGMVG: not sure about the meaning of the acronym, but the implementation follows the “Max-Set of Gaussians Landscape Generator” described in http://boyuan.global-optimization.com/LGMVG/index.htm . Source code is given in Matlab, but it’s fairly easy to convert it to Python. This test suite contains 304 problems with dimensionality varying from 2 to 5.

  8. NgLi: Stemming from the work of Chi-Kong Ng and Duan Li, this is a test problem generator for unconstrained optimization, but it’s fairly easy to assign bound constraints to it. The methodology is described in https://www.sciencedirect.com/science/article/pii/S0305054814001774 , while the Matlab source code can be found in http://www1.se.cuhk.edu.hk/~ckng/generator/ . I have used the Matlab script to generate 240 problems with dimensionality varying from 2 to 5 by outputting the generator parameters in text files, then used Python to create the objective functions based on those parameters and the benchmark methodology.

  9. MPM2: this suite implements the “Multiple Peaks Model 2”, a test problem generator also used in the smoof library; a Python implementation is available at https://github.com/jakobbossek/smoof/blob/master/inst/mpm2.py . I have taken the code almost as is and generated 480 benchmark functions with dimensionality varying from 2 to 5.

  10. RandomFields: as described in https://www.researchgate.net/publication/301940420_Global_optimization_test_problems_based_on_random_field_composition , it generates benchmark functions by “smoothing” one or more multidimensional discrete random fields and composing them. No source code is given, but the implementation is fairly straightforward from the article itself.

  11. NIST: not exactly the realm of Global Optimization solvers, but the NIST StRD dataset can be used to generate a single objective function as “sum of squares”. I have used the NIST dataset as implemented in lmfit, thus creating 27 test problems with dimensionality ranging from 2 to 9.

  12. GlobalLib: Arnold Neumaier maintains a suite of test problems termed “COCONUT Benchmark”, and Sahinidis has converted the GlobalLib and PrincetonLib AMPL/GAMS datasets into C/Fortran code (http://archimedes.cheme.cmu.edu/?q=dfocomp ). I have used a simple C parser to convert the benchmarks from C to Python.

    The global minima are taken from Sahinidis or from Neumaier or refined using the NEOS server when the accuracy of the reported minima is too low. The suite contains 181 test functions with dimensionality varying between 2 and 9.

  13. CVMG: another “landscape generator”, I had to dig it out using the Wayback Machine at http://web.archive.org/web/20100612044104/https://www.cs.uwyo.edu/~wspears/multi.kennedy.html , the acronym stands for “Continuous Valued Multimodality Generator”. Source code is in C++ but it’s fairly easy to port it to Python. In addition to the original implementation (that uses the Sigmoid as a softmax/transformation function) I have added a few others to create varied landscapes. 360 test problems have been generated, with dimensionality ranging from 2 to 5.

  14. NLSE: again, not really the realm of Global Optimization solvers, but Nonlinear Systems of Equations can be transformed into single objective functions to optimize (a minimal sketch of this transformation is shown right after this list). I have drawn from many different sources (publications, ALIAS/COPRIN and many others) to create 44 systems of nonlinear equations with dimensionality ranging from 2 to 8.

  15. Schoen: based on the early work of Fabio Schoen and his short note on a simple but interesting idea on a test function generator, I have taken the C code in the note and converted it into Python, thus creating 285 benchmark functions with dimensionality ranging from 2 to 6.

    Many thanks go to Professor Fabio Schoen for providing an updated copy of the source code and for the email communications.

  16. Robust: the last benchmark test suite for this exercise, it is actually composed of 5 different kind-of analytical test function generators, containing deceptive, multimodal or flat functions depending on the settings. Matlab source code is available at http://www.alimirjalili.com/RO.html ; I simply converted it to Python and created 420 benchmark functions with dimensionality ranging from 2 to 6.
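
As mentioned for the NIST and NLSE suites above, a set of residuals (or a system of nonlinear equations) can be recast as a single objective function by taking the sum of squares of the residuals. A minimal sketch, using a made-up two-equation system purely for illustration:

 import numpy as np

 def residuals(x):
     # Hypothetical 2-equation system: x0^2 + x1^2 = 1 and x0 = x1
     return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

 def objective(x):
     # Sum-of-squares transformation: the global minimum is 0 at any root
     r = residuals(x)
     return float(np.dot(r, r))

The same idea applies to the NIST regression problems, where the residuals are the differences between the observed data and the model predictions.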

As a high-level summary, Table 0.1 below reports for each test suite the number of problems as a function of dimensionality, together with their totals.


Table 0.1: Dimensionality of the test functions for all the benchmark suites
Test Suite N = 2 N = 3 N = 4 N = 5 N = 6+ Total
CVMG 90 90 90 90 0 360
GKLS 300 300 300 300 300 1,500
GlobalLib 76 23 26 9 47 181
GlobOpt 72 72 72 72 0 288
GOTPY 100 100 100 100 0 400
Huygens 600 0 0 0 0 600
LGMVG 76 76 76 76 0 304
MMTFG 312 330 339 0 0 981
MPM2 120 120 120 120 0 480
NgLi 60 60 60 60 0 240
NIST 6 7 3 2 9 27
NLSE 15 15 3 1 10 44
RandomFields 120 120 120 120 0 480
Robust 84 84 84 84 84 420
Schoen 57 57 57 57 57 285
SciPy Extended 193 13 13 6 10 235
Total 2,281 1,467 1,463 1,097 517 6,825

For pretty much all the test suites the value of the objective function at the global optimum (or optima) is known, and for many of them the position of the global optimum in the search space is also known. Since I only care about the value of the objective function at the global optimum, it didn't bother me that for some test suites there is no way to recover the position of the global optimum in the search space.

Table 0.2 gives an indication of whether, for each benchmark suite, the optimal value of the objective function (FOPT) and the position(s) of the global optima (XOPT) in the search space are known beforehand. This is my own analysis, and it may easily be that I have missed a few technical details by classifying a test suite one way or another, i.e. maybe the position/value of the optima is known but I have failed to recognize/derive it.

In addition, Table 0.2 gives an indication of the “hardness” of a benchmark test suite, based on my results from applying all the optimizers under consideration with a given budget of function evaluations, and specifically a maximum of NF = 2,000. The criteria for this classification are completely arbitrary, but I chose them to be:

  1. Maximum allowed number of function evaluations NF = 2,000

  2. Conditions for the classification (also expressed as a small code sketch right after this list):

    \begin{cases}
\textrm{Easy}   & \textrm{if } success \geq 75 \%       \\
\textrm{Medium} & \textrm{if } 50 \% < success < 75 \%    \\
\textrm{Hard}   & \textrm{if } success \leq 50 \%      \\
\end{cases}
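
Expressed in code, the classification rule is simply (a sketch, with success given as the percentage of problems solved at NF = 2,000):

 def difficulty(success):
     # Classify a benchmark suite based on its overall success rate (in %)
     if success >= 75.0:
         return "Easy"
     elif success > 50.0:
         return "Medium"
     return "Hard"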


The details are summarized in Table 0.2.

Table 0.2: Global optimum value and position assessment
Test Suite Fopt Known Xopt Known Difficulty
CVMG Yes No Medium
GKLS Yes Yes Hard
GlobalLib Yes Yes Easy
GlobOpt Yes No Hard
GOTPY Yes Yes Hard
Huygens No No Medium
LGMVG Yes No Easy
MMTFG Yes Yes Easy
MPM2 Yes Yes Medium
NgLi Yes Yes Easy
NIST Yes Yes Medium
NLSE Yes Yes Easy
RandomFields No No Hard
Robust Yes Yes Medium
Schoen Yes Yes Easy
SciPy Extended Yes Yes Easy

Aggregated, High-Level Results

This section provides some very high-level results stemming from the application of all nonlinear solvers to all the benchmark test suites. It is by no means a complete analysis, but it does offer an insight into the various solvers' performance, in general, as a function of the dimensionality of the problem, and as a function of the type of benchmark. Some test suites are much harder than others, and I have also tried to categorize the benchmark suites in terms of their “difficulty”, if that even makes sense.

I am going to present results in tables and graphs, normally side-by-side or one after the other. I prefer the visuals, but I appreciate that quite a few people want to see the raw numbers. Since the amount of results available is kind of humongous, I have decided to present the high-level results for all solvers at two cutoffs in terms of function evaluation limits, and specifically NF = 2,000 and NF = 10,000 maximum evaluations.

General Results

Now, stopping the chit chat, Table 0.3 and Figure 0.1 below show the overall high-level results in terms of percentage of problems solved for each Global Optimization solver and each benchmark suite, at a 2,000 limit on function evaluations:

Table 0.3: Percentage of problems solved, per solver and benchmark suite at NF = 2,000
Test Suite AMPGO BasinHopping BiteOpt CMA-ES CRS2 DE DIRECT DualAnnealing LeapFrog MCS PSWARM SCE SHGO Best Solver
CVMG 53.9% 55.0% 47.2% 50.8% 40.6% 48.3% 55.6% 54.2% 42.5% 56.4% 41.9% 50.6% 54.7% MCS
GKLS 14.1% 15.5% 18.6% 7.1% 12.1% 13.7% 21.3% 24.0% 7.5% 28.5% 9.9% 12.1% 23.3% MCS
GlobalLib 60.8% 53.0% 72.4% 43.1% 43.6% 39.8% 45.9% 59.7% 49.2% 80.7% 19.3% 56.4% 60.2% MCS
GlobOpt 1.4% 10.4% 6.2% 0.3% 13.9% 4.5% 3.8% 5.6% 1.4% 6.9% 0.0% 0.3% 0.7% CRS2
GOTPY 7.5% 3.2% 11.8% 3.0% 9.2% 6.2% 23.0% 8.8% 6.5% 17.5% 4.2% 7.8% 18.0% DIRECT
Huygens 13.0% 2.0% 22.5% 8.0% 22.7% 36.5% 67.0% 2.8% 6.2% 11.5% 1.8% 14.5% 3.0% DIRECT
LGMVG 46.4% 13.2% 43.8% 28.3% 50.7% 42.1% 65.8% 75.3% 32.6% 70.1% 20.4% 48.4% 77.6% SHGO
MMTFG 60.2% 72.1% 72.3% 52.2% 49.8% 32.9% 59.4% 67.0% 43.9% 77.1% 19.6% 24.6% 69.2% MCS
MPM2 36.0% 68.1% 35.0% 39.6% 35.0% 32.9% 59.8% 50.6% 23.8% 65.6% 21.2% 40.8% 65.2% BasinHopping
NgLi 59.6% 32.5% 51.7% 56.2% 61.7% 43.3% 59.6% 68.8% 42.9% 86.7% 2.5% 77.9% 75.8% MCS
NIST 33.3% 11.1% 48.1% 14.8% 25.9% 7.4% 3.7% 11.1% 18.5% 63.0% 0.0% 29.6% 7.4% MCS
NLSE 63.6% 63.6% 75.0% 52.3% 38.6% 45.5% 47.7% 56.8% 54.5% 79.5% 27.3% 61.4% 65.9% MCS
RandomFields 10.6% 19.4% 24.8% 9.2% 13.1% 19.8% 34.4% 34.2% 8.3% 29.4% 14.4% 17.1% 34.0% DIRECT
Robust 52.1% 36.2% 62.4% 33.6% 40.5% 27.9% 45.7% 51.4% 22.9% 45.5% 28.1% 23.1% 36.4% BiteOpt
Schoen 53.3% 90.2% 70.5% 71.2% 67.4% 52.3% 75.4% 85.6% 52.6% 91.2% 58.2% 68.4% 95.4% SHGO
SciPy Extended 69.5% 64.7% 83.4% 58.8% 73.4% 75.4% 72.5% 72.4% 61.8% 81.9% 21.2% 71.2% 69.7% BiteOpt
Weighted Average 39.1% 41.0% 46.1% 33.7% 37.8% 33.3% 51.8% 46.6% 28.4% 53.2% 18.6% 32.8% 48.6% MCS

Figure 0.1 shows some sort of “heatmap”: the greener a cell, the better the performance of that specific solver on the corresponding benchmark. I have highlighted the “best” solver for each test suite in the picture with a black square.


Figure 0.1: Percentage of problems solved, per solver and benchmark suite at NF = 2,000


Similarly, Table 0.4 and Figure 0.2 below show the percentage of problems solved for each Global Optimization solver and each benchmark suite, at a 10,000 limit on function evaluations (which is the maximum I have allowed in any benchmark test suite in this exercise):

Table 0.4: Percentage of problems solved, per solver and benchmark suite at NF = 10,000
Test Suite AMPGO BasinHopping BiteOpt CMA-ES CRS2 DE DIRECT DualAnnealing LeapFrog MCS PSWARM SCE SHGO Best Solver
CVMG 58.3% 55.8% 49.2% 53.3% 40.6% 53.1% 56.1% 54.7% 42.5% 56.9% 48.9% 53.6% 56.1% AMPGO
GKLS 26.6% 19.4% 22.8% 12.5% 13.7% 19.4% 28.6% 30.1% 7.5% 39.3% 19.5% 13.8% 29.5% MCS
GlobalLib 71.8% 53.6% 79.0% 54.1% 67.4% 65.2% 48.6% 63.0% 54.7% 85.1% 44.8% 66.3% 61.9% MCS
GlobOpt 5.9% 20.8% 10.8% 3.1% 18.4% 10.8% 10.4% 7.3% 1.4% 24.0% 7.3% 3.8% 3.8% MCS
GOTPY 25.5% 5.0% 14.8% 6.8% 14.2% 14.2% 26.8% 10.5% 6.5% 43.5% 10.8% 10.0% 26.5% MCS
Huygens 40.2% 4.5% 34.0% 22.3% 23.2% 36.8% 74.7% 9.0% 6.2% 27.7% 37.5% 35.8% 6.0% DIRECT
LGMVG 73.0% 11.8% 49.7% 33.2% 57.6% 69.1% 73.7% 79.6% 32.6% 86.2% 57.9% 53.6% 95.1% SHGO
MMTFG 75.8% 77.7% 78.9% 66.8% 61.1% 72.0% 64.4% 71.3% 43.9% 85.5% 67.7% 35.5% 78.7% MCS
MPM2 59.2% 88.3% 39.0% 52.3% 37.1% 47.1% 68.8% 57.3% 23.8% 87.1% 47.1% 43.1% 80.0% BasinHopping
NgLi 85.4% 33.3% 54.6% 82.1% 79.2% 82.5% 76.2% 72.9% 42.9% 96.2% 75.4% 79.6% 84.6% MCS
NIST 51.9% 14.8% 70.4% 44.4% 70.4% 40.7% 3.7% 14.8% 22.2% 74.1% 7.4% 37.0% 7.4% MCS
NLSE 75.0% 68.2% 90.9% 59.1% 52.3% 65.9% 52.3% 61.4% 65.9% 95.5% 54.5% 68.2% 65.9% MCS
RandomFields 27.5% 28.5% 31.0% 16.2% 14.0% 34.0% 44.0% 41.5% 8.3% 49.4% 24.2% 26.7% 58.1% SHGO
Robust 76.0% 41.7% 73.1% 58.1% 51.4% 59.0% 49.0% 64.8% 23.8% 59.5% 51.2% 31.2% 44.3% AMPGO
Schoen 90.5% 95.1% 74.4% 78.9% 69.8% 74.4% 83.5% 87.7% 52.6% 96.8% 71.9% 69.5% 96.1% MCS
SciPy Extended 80.7% 69.3% 89.8% 61.8% 79.8% 84.3% 77.9% 77.2% 62.7% 88.7% 75.9% 80.0% 77.1% BiteOpt
Weighted Average 58.1% 46.6% 52.4% 44.9% 44.4% 52.8% 58.2% 51.6% 28.8% 66.6% 47.5% 40.7% 57.5% MCS

And the corresponding image in Figure 0.2:


Figure 0.2: Percentage of problems solved, per solver and benchmark suite at NF = 10,000


There are some interesting conclusions to be drawn already: considering the test suites only for the moment, irrespective of the budget of function evaluations available, there seem to be some very tough benchmark suites. The ones that stand out the most are GlobOpt, GOTPY, GKLS and Huygens. Even at a very high limit on the number of function evaluations, none of the solvers manages to solve more than 50% of the problems in the three toughest of those (GlobOpt, GOTPY and GKLS).

Secondly, from the solvers point of view, it seems like MCS is doing quite a good job overall, followed by SHGO, DIRECT and BiteOpt. For a moderate budget (at NF = 2,000), MCS is the best solver for 8 of the 16 benchmark suites. At the maximum level of NF = 10,000, MCS is the best solver for 10 of the 16 benchmark suites.

However, in order to draw any meaningful, very high-level conclusion on the solvers' performance, we have to deepen our analysis and consider the problem dimensionality and the variations in budget in terms of maximum number of function evaluations allowed.

Dimensionality Effects

In this respect, we can first consider what fraction of problems each solver solved as a function of the benchmark dimensionality, and this is what the following two figures show (the first one at NF = 2,000 and the second one at NF = 10,000).


Figure 0.3: Percentage of problems solved as a function of problem dimension, per solver at NF = 2,000



Figure 0.4: Percentage of problems solved as a function of problem dimension, per solver at NF = 10,000


What is interesting to see from these graphs is that MCS is always a very good contender, irrespective of the problem dimensionality, followed by SHGO, DualAnnealing and DIRECT, although the results for DIRECT are heavily skewed by its good performance on the Huygens benchmark, which is composed exclusively of 2D functions (and very many of them, 600).

Expensive Objective Functions

If we consider, for a moment, a relatively tight budget in terms of available number of function evaluations, we could draw a potential conclusion on which solvers are the most effective when resources are scarce. Figure 0.5 shows exactly that, by displaying the percentage of problems solved when the budget is relatively limited (200 \leq NF \leq 500).


Figure 0.5: Percentage of problems solved for low budget function evaluations


There are a few high-level conclusions we can draw from Figure 0.5:

  1. MCS appears to be the winner in this case too, even though it is by a very narrow margin.
  2. The SciPy global solvers SHGO and DualAnnealing are among the best when the number of functions evaluations is limited - this is relatively common in real-life problems where the objective function may take several hours to compute.
  3. AMPGO is unfortunately not as powerful anymore - this new set of benchmarks is much harder than my original tests in 2013.
  4. Other solvers - such as BiteOpt, BasinHopping and AMPGO - begin to catch up with the leaders only when the budget expands reasonably (i.e., NF = 500).

Simulation Time

While I don’t generally care about the performance of an optimization algorithm in terms of elapsed time, for this set of benchmarks I actually recorded the elapsed time for each solver - and specifically the average time spent by each solver in its own internal meanderings between each function evaluation.

Of course, each benchmark test suite is written in its own language - whether Python, C or Fortran - so the time spent evaluating a test function varies greatly between benchmarks, but it is the same for all solvers for any specific test suite. In that sense, we can attempt some sort of “CPU” timing to see which solvers are the fastest or slowest in terms of time spent per function evaluation.
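
Purely as an illustration (this reflects my understanding of the measurement, not the exact code used in the benchmarks), the time per function evaluation can be estimated by wrapping the objective to count evaluations and timing the overall solver run:

 import time

 def time_per_evaluation(run_solver, objective):
     # Illustrative helper: run_solver is a closure that calls a solver on the
     # wrapped objective; returns the average wall-clock time per evaluation
     # in milliseconds (subtracting the time spent inside the objective itself
     # would isolate the pure solver overhead)
     counter = {"nfev": 0}

     def wrapped(x):
         counter["nfev"] += 1
         return objective(x)

     start = time.perf_counter()
     run_solver(wrapped)
     elapsed = time.perf_counter() - start
     return 1000.0 * elapsed / max(counter["nfev"], 1)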

In Table 0.5 you can see the performance in terms of CPU time for each solver and each benchmark.

Table 0.5: Time spent per function evaluation in ms, per solver and benchmark suite at NF = 2,000
Test Suite AMPGO BasinHopping BiteOpt CMA-ES CRS2 DE DIRECT DualAnnealing LeapFrog MCS PSWARM SCE SHGO Slowest Solver
CVMG 0.066 0.054 0.094 0.219 0.039 0.099 0.041 0.085 0.079 0.825 0.071 0.163 0.565 MCS
GKLS 0.067 0.050 0.094 0.152 0.050 0.096 0.051 0.084 0.085 1.098 0.046 0.166 1.023 MCS
GlobalLib 0.095 0.097 0.125 0.301 0.078 0.129 0.079 0.111 0.113 0.416 0.103 0.197 1.711 SHGO
GlobOpt 0.030 0.014 0.055 0.168 0.004 0.063 0.004 0.055 0.040 0.308 0.031 0.107 0.577 SHGO
GOTPY 0.072 0.062 0.109 0.215 0.054 0.115 0.056 0.093 0.094 0.364 0.082 0.171 0.273 MCS
Huygens 0.080 0.059 0.090 0.205 0.044 0.105 0.047 0.092 0.072 0.288 0.072 0.130 0.189 MCS
LGMVG 0.061 0.058 0.096 0.198 0.043 0.105 0.045 0.087 0.085 0.483 0.072 0.162 0.449 MCS
MMTFG 0.028 0.018 0.050 0.233 0.005 0.061 0.005 0.044 0.036 0.278 0.032 0.102 0.223 MCS
MPM2 0.434 0.415 0.458 0.580 0.403 0.467 0.405 0.451 0.442 0.828 0.440 0.531 0.764 MCS
NgLi 0.019 0.015 0.052 0.177 0.004 0.059 0.005 0.039 0.040 0.399 0.020 0.129 0.390 MCS
NIST 0.043 0.034 0.079 0.291 0.027 0.087 0.029 0.066 0.069 0.316 0.056 0.166 1.404 SHGO
NLSE 0.046 0.045 0.080 0.327 0.025 0.085 0.028 0.057 0.065 0.301 0.058 0.156 1.772 SHGO
RandomFields 1.302 1.293 1.317 1.583 1.256 1.342 1.255 1.328 1.291 1.636 1.284 1.360 1.854 SHGO
Robust 0.056 0.042 0.091 0.204 0.031 0.094 0.032 0.079 0.075 0.564 0.062 0.148 0.954 SHGO
Schoen 0.060 0.050 0.096 0.237 0.035 0.105 0.035 0.067 0.078 0.585 0.070 0.183 0.389 MCS
SciPy Extended 0.061 0.055 0.084 0.248 0.040 0.096 0.041 0.075 0.071 0.349 0.068 0.134 0.236 MCS
Weighted Average 0.171 0.159 0.197 0.326 0.147 0.205 0.148 0.189 0.183 0.679 0.169 0.259 0.707 SHGO

The table reflects exactly my feelings while I was running the different benchmark test suites. Test suites written in Python (and especially the ones re-coded by me) such as GKLS, RandomFields and MPM2 are not particularly performant. Other test suites - either originally coded in C and wrapped with Cython or translated by me to Fortran, such as MMTFG and GlobOpt - are relatively fast overall.

The table also perfectly represents my impatience while waiting for MCS and SHGO to finish their optimizations, especially when varying the maximum number of function evaluations allowed from 100 to 10,000... it took forever for those two solvers to finish, and the higher the dimensionality, the worse the problem becomes.

This can actually be seen in Figure 0.6, which shows the average CPU time spent by each solver per function evaluation as a function of the problem dimension in the GKLS benchmark. Pretty much all the other solvers are stable around a relatively low runtime (which of course depends on how busy my machine was - those numbers are also influenced by the fact that the benchmarks are run in parallel, with one CPU core per solver), except for MCS and SHGO.


Figure 0.6: CPU runtime, per solver, for the GKLS test suite as a function of problem dimension


This behaviour is somehow to be expected: SHGO is explicitly targeted at low-dimensional problems, as stated in the “Notes” section of https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.shgo.html , and MCS is my translation (pretty much line by line) of an already inefficient Matlab code into an even less efficient language, Python.

Solvers - Example

To illustrate and contrast the search strategies that are employed by different algorithms, Figure 0.7 shows how the different solvers progress for the objective function SixHumpCamel. This two-dimensional test problem, referred to as the “six-hump camel back function”, exhibits six local minima, two of which are global minima. In the graphs of Figure 0.7, red and blue are used to represent high and low objective function values, respectively. Global minima are located at [−0.0898, 0.7126] and [0.0898, −0.7126] and are marked with magenta circles.
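
For reference, here is a minimal Python definition of this test function (the standard form; the commonly quoted bounds are x1 in [-3, 3] and x2 in [-2, 2]):

 def six_hump_camel(x):
     # Six-hump camel back function: six local minima, two of which are global,
     # located at about (-0.0898, 0.7126) and (0.0898, -0.7126), f ~= -1.0316
     x1, x2 = x
     return ((4.0 - 2.1 * x1**2 + x1**4 / 3.0) * x1**2
             + x1 * x2
             + (-4.0 + 4.0 * x2**2) * x2**2)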

Each solver was given a limit of 2,500 function evaluations and the points evaluated are marked with white crosses. Solvers that require a starting point were given the same starting point. Starting points are marked with a green circle. The trajectory of the progress of the best point is marked with an orange line, and the final solution is marked with a yellow circle. As illustrated by these plots, solvers like CMA-ES, SHGO, BasinHopping spend some time looking around local optima. Stochastic solvers like PSWARM perform a rather large number of function evaluations that cover the entire search space. Partitioning-based solvers seem to strike a balance by evaluating more points than local search algorithms but fewer than stochastic search approaches.


Figure 0.7: Solver search progress for test problem Six Humps Camel Back function