.. include:: headings.inc

.. role:: red
.. role:: green
.. role:: backred
.. role:: backyellow
.. role:: backgreen

.. _the benchmarks:

|infinity| The Benchmarks
=========================

This section presents a very high-level overview of all the benchmark test suites I have considered, plus an extremely high-level analysis of the results obtained with the different solvers across all benchmarks.

The conclusions in this section are quantitative enough to give a feeling for a specific solver's performance, but given the amount of information condensed into a few tables and figures, the analysis presented here has to be complemented by the suite-by-suite examination of the results. One thing is definitely sure: the *No Free Lunch Theorem* strikes again (`NFL `_).

In this page, the results in tables and pictures are presented by benchmark test suite in **alphabetical** order, although the order of the HTML web pages containing the description of the test suites is different: I have sorted them by their widespread usage in the optimization community or by the level of interest/novelty they inspired in me. So, for example, the first HTML page after this section contains the :ref:`SciPy Extended` benchmark test suite, followed by :ref:`GKLS` and then :ref:`GlobOpt` and so on.

|stop| Stopping Conditions and Tolerances
------------------------------------------

As mentioned in the main page, the test functions in almost all benchmark suites have known global optimum values: in the few cases where the global optimum value is unknown (or I was unable to calculate it via analysis), its value has been estimated through enormous grid searches plus repeated minimizations. I then considered an algorithm to be successful if:

.. math::

    \lvert F_{known \hspace{3pt} minimum} - F_{algorithm \hspace{3pt} minimum} \rvert \leq 10^{-6}

All the other checks for tolerances - be those on the relative change of the objective function from one iteration to the next, or on too-small variations of the variables in the function domain - have been disabled or mercilessly and brutally commented out in the code of each solver. For example, for `nlopt `_ algorithms like **CRS2** or **DIRECT**, a way to ignore other tolerances and only use the formula above may be:

.. code-block:: python
    :emphasize-lines: 10,11,12,13,14,15,16

    import nlopt

    algorithm = nlopt.GN_DIRECT_L
    problem = nlopt.opt(algorithm, N)

    problem.set_min_objective(objective_function)
    problem.set_lower_bounds(lb)
    problem.set_upper_bounds(ub)

    problem.set_stopval(fglob + tolf)
    problem.set_maxeval(maxfun)

    problem.set_ftol_abs(-1)
    problem.set_ftol_rel(-1)
    problem.set_xtol_abs(-1)
    problem.set_xtol_rel(-1)
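Once a solver returns, checking for success then boils down to a direct comparison against the known optimum. The helper below is a minimal sketch of that test; the names ``fbest`` and ``fglob`` are illustrative, not taken from the actual benchmark harness:

.. code-block:: python

    def solved(fbest, fglob, tolf=1e-6):
        # A run counts as successful when the best objective value found
        # by the solver lies within ``tolf`` of the known (or estimated)
        # global optimum.
        return abs(fglob - fbest) <= tolf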
|description| High-Level Description of the Test Suites
---------------------------------------------------------

This whole exercise is made up of **6,825** test problems divided across **16** different test suites: most of these problems are of low dimensionality (2 to 6 variables), with a few benchmarks extending to 9+ variables. With all the sensitivities performed during this exercise on those benchmarks, the overall grand total number of function evaluations stands at **3,859,786,025** - close to **4 billion**. Not bad.

Each test suite has a separate section in the next pages, describing in a bit more detail what I have managed to understand of the approach used to create the benchmark suite - which in many cases is not much, but I hope I will be able to give a general idea of the main characteristics of each test suite, with a bunch of cool-looking pictures too...

So, in order of implementation, these are the benchmark suites:

1. :ref:`SciPy Extended`: 235 multivariate problems (where the number of independent variables ranges from 2 to 17), again with multiple local/global minima. I have added about 40 new functions to the standard `SciPy benchmarks `_ and fixed a few bugs in the existing benchmark models in the SciPy repository.

2. :ref:`GKLS`: 1,500 test functions, with dimensionality varying from 2 to 6, generated with the super famous `GKLS Test Functions Generator `_. I have taken the original C code (available at http://netlib.org/toms/) and converted it to Python.

3. :ref:`GlobOpt`: 288 tough problems, with dimensionality varying from 2 to 5, created with another test function generator which I arbitrarily named "GlobOpt": https://www.researchgate.net/publication/225566516_A_new_class_of_test_functions_for_global_optimization . The original code is in C++ and I have bridged it to Python using Cython. **Many thanks** go to Professor Marco Locatelli for providing an updated copy of the C++ source code.

4. :ref:`MMTFG`: sort-of an acronym for "Multi-Modal Test Function with multiple Global minima", this test suite implements the work of Jani Ronkkonen: https://www.researchgate.net/publication/220265526_A_Generator_for_Multimodal_Test_Functions_with_Multiple_Global_Optima . It contains 981 test problems with dimensionality varying from 2 to 4. The original code is in C and I have bridged it to Python using Cython.

5. :ref:`GOTPY`: a generator of benchmark functions using the Bocharov-Feldbaum "Method-Min", containing 400 test problems with dimensionality varying from 2 to 5. I have taken the Python implementation from https://github.com/redb0/gotpy and improved it in terms of runtime. The original paper is at http://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=at&paperid=11985&option_lang=eng .

6. :ref:`Huygens`: this benchmark suite is very different from the rest, as it uses a "fractal" approach to generate test functions. It is based on the work of Cara MacNish on `Fractal Functions `_. The original code is in Java, and at the beginning I just converted it to Python: given it was slow as a turtle, I have re-implemented it in Fortran and wrapped it using `f2py `_, then generated 600 2-dimensional test problems out of it.

7. :ref:`LGMVG`: not sure about the meaning of the acronym, but the implementation follows the "Max-Set of Gaussians Landscape Generator" described in http://boyuan.global-optimization.com/LGMVG/index.htm . Source code is given in Matlab, but it's fairly easy to convert it to Python. This test suite contains 304 problems with dimensionality varying from 2 to 5.

8. :ref:`NgLi`: stemming from the work of Chi-Kong Ng and Duan Li, this is a test problem generator for unconstrained optimization, but it's fairly easy to assign bound constraints to it. The methodology is described in https://www.sciencedirect.com/science/article/pii/S0305054814001774 , while the Matlab source code can be found at http://www1.se.cuhk.edu.hk/~ckng/generator/ . I have used the Matlab script to generate 240 problems with dimensionality varying from 2 to 5 by outputting the generator parameters in text files, then used Python to create the objective functions based on those parameters and the benchmark methodology.

9. :ref:`MPM2`: implementing the "Multiple Peaks Model 2", there is a Python implementation at https://github.com/jakobbossek/smoof/blob/master/inst/mpm2.py . This is a test problem generator also used in the `smoof `_ library; I have taken the code almost as is and generated 480 benchmark functions with dimensionality varying from 2 to 5.
10. :ref:`RandomFields`: as described in https://www.researchgate.net/publication/301940420_Global_optimization_test_problems_based_on_random_field_composition , it generates benchmark functions by "smoothing" one or more multidimensional discrete random fields and composing them. No source code is given, but the implementation is fairly straightforward from the article itself.

11. :ref:`NIST`: not exactly the realm of Global Optimization solvers, but the `NIST StRD `_ dataset can be used to generate a single objective function as a "sum of squares". I have used the NIST dataset as implemented in `lmfit `_, thus creating 27 test problems with dimensionality ranging from 2 to 9.

12. :ref:`GlobalLib`: Arnold Neumaier `maintains `_ a suite of test problems termed "COCONUT Benchmark", and Sahinidis has converted the GlobalLib and PrincetonLib AMPL/GAMS datasets into C/Fortran code (http://archimedes.cheme.cmu.edu/?q=dfocomp ). I have used a simple C parser to convert the benchmarks from C to Python. The global minima are taken from `Sahinidis `_ or from Neumaier, or refined using the `NEOS server `_ when the accuracy of the reported minima is too low. The suite contains 181 test functions with dimensionality varying between 2 and 9.

13. :ref:`CVMG`: another "landscape generator", I had to dig it out using the Wayback Machine at http://web.archive.org/web/20100612044104/https://www.cs.uwyo.edu/~wspears/multi.kennedy.html ; the acronym stands for "Continuous Valued Multimodality Generator". Source code is in C++, but it's fairly easy to port it to Python. In addition to the original implementation (that uses the `Sigmoid `_ as a softmax/transformation function) I have added a few others to create varied landscapes. 360 test problems have been generated, with dimensionality ranging from 2 to 5.

14. :ref:`NLSE`: again, not really the realm of Global Optimization solvers, but Nonlinear Systems of Equations can be transformed into single objective functions to optimize. I have drawn from many different sources (`Publications `_, `ALIAS/COPRIN `_ and many others) to create 44 systems of nonlinear equations with dimensionality ranging from 2 to 8.

15. :ref:`Schoen`: based on the early work of Fabio Schoen and his `short note `_ on a simple but interesting idea for a test function generator, I have taken the C code in the note and converted it into Python, thus creating 285 benchmark functions with dimensionality ranging from 2 to 6. **Many thanks** go to Professor Fabio Schoen for providing an updated copy of the source code and for the email communications.

16. :ref:`Robust`: the last benchmark test suite for this exercise, it is actually composed of 5 different kinds of analytical test function generators, producing deceptive, multimodal or flat functions depending on the settings. Matlab source code is available at http://www.alimirjalili.com/RO.html ; I simply converted it to Python and created 420 benchmark functions with dimensionality ranging from 2 to 6.

As a high-level summary, `Table 0.1`_ below reports, for each test suite, the number of problems as a function of dimensionality, together with their totals.

|

.. _Table 0.1:

.. cssclass:: pretty-table benchmark_dimensionality_table
.. table:: **Table 0.1**: Dimensionality of the test functions for all the benchmark suites

    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | **Test Suite**        | **N = 2**     | **N = 3**     | **N = 4**     | **N = 5**     | **N = 6+**     | **Total** |
    +=======================+===============+===============+===============+===============+================+===========+
    | :ref:`CVMG`           | 90            | 90            | 90            | 90            | 0              | 360       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`GKLS`           | 300           | 300           | 300           | 300           | 300            | 1,500     |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`GlobalLib`      | 76            | 23            | 26            | 9             | 47             | 181       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`GlobOpt`        | 72            | 72            | 72            | 72            | 0              | 288       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`GOTPY`          | 100           | 100           | 100           | 100           | 0              | 400       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`Huygens`        | 600           | 0             | 0             | 0             | 0              | 600       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`LGMVG`          | 76            | 76            | 76            | 76            | 0              | 304       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`MMTFG`          | 312           | 330           | 339           | 0             | 0              | 981       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`MPM2`           | 120           | 120           | 120           | 120           | 0              | 480       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`NgLi`           | 60            | 60            | 60            | 60            | 0              | 240       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`NIST`           | 6             | 7             | 3             | 2             | 9              | 27        |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`NLSE`           | 15            | 15            | 3             | 1             | 10             | 44        |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`RandomFields`   | 120           | 120           | 120           | 120           | 0              | 480       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`Robust`         | 84            | 84            | 84            | 84            | 84             | 420       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`Schoen`         | 57            | 57            | 57            | 57            | 57             | 285       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | :ref:`SciPy Extended` | 193           | 13            | 13            | 6             | 10             | 235       |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+
    | **Total**             | **2,281**     | **1,467**     | **1,463**     | **1,097**     | **517**        | **6,825** |
    +-----------------------+---------------+---------------+---------------+---------------+----------------+-----------+

|

For pretty much all the test suites the value of the objective function at the global optimum (or optima) is known, and for many of them the *position* of the global optimum (or optima) is also known.
Since I only care about the value of the objective function at the global optimum (or optima), it didn't bother me that for some test suites there is no way to know the position of the global optimum (or optima) in the search space.

`Table 0.2`_ gives an indication of whether, for each benchmark suite, the optimal value of the objective function (**FOPT**) and the position(s) of the global optima (**XOPT**) in the search space are known beforehand. This is my own analysis, and it may easily be that I have missed a few technical details by classifying a test suite one way or another, i.e. maybe the position/value of the optima is known but I have failed to recognize/derive it.

In addition, `Table 0.2`_ gives an indication of the "hardness" of a benchmark test suite, based on my results from applying all the optimizers under consideration with a given budget of function evaluations, and specifically a maximum of :math:`NF = 2,000`. The criteria for this classification are completely arbitrary, but I chose them to be:

1. Maximum allowed number of function evaluations :math:`NF = 2,000`

2. Conditions for the classification:

   .. math::

       \begin{cases}
       \textrm{Easy}   & \textrm{if } success \geq 75 \% \\
       \textrm{Medium} & \textrm{if } 50 \% < success < 75 \% \\
       \textrm{Hard}   & \textrm{if } success \leq 50 \% \\
       \end{cases}

|

The details are summarized in `Table 0.2`_.

.. _Table 0.2:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 0.2**: Global optimum value and position assessment

    ======================= =================== ==================== ====================
    **Test Suite**          F\ :sub:`opt` Known X\ :sub:`opt` Known  Difficulty
    ======================= =================== ==================== ====================
    :ref:`CVMG`             :green:`Yes`        :red:`No`            :backyellow:`Medium`
    :ref:`GKLS`             :green:`Yes`        :green:`Yes`         :backred:`Hard`
    :ref:`GlobalLib`        :green:`Yes`        :green:`Yes`         :backgreen:`Easy`
    :ref:`GlobOpt`          :green:`Yes`        :red:`No`            :backred:`Hard`
    :ref:`GOTPY`            :green:`Yes`        :green:`Yes`         :backred:`Hard`
    :ref:`Huygens`          :red:`No`           :red:`No`            :backyellow:`Medium`
    :ref:`LGMVG`            :green:`Yes`        :red:`No`            :backgreen:`Easy`
    :ref:`MMTFG`            :green:`Yes`        :green:`Yes`         :backgreen:`Easy`
    :ref:`MPM2`             :green:`Yes`        :green:`Yes`         :backyellow:`Medium`
    :ref:`NgLi`             :green:`Yes`        :green:`Yes`         :backgreen:`Easy`
    :ref:`NIST`             :green:`Yes`        :green:`Yes`         :backyellow:`Medium`
    :ref:`NLSE`             :green:`Yes`        :green:`Yes`         :backgreen:`Easy`
    :ref:`RandomFields`     :red:`No`           :red:`No`            :backred:`Hard`
    :ref:`Robust`           :green:`Yes`        :green:`Yes`         :backyellow:`Medium`
    :ref:`Schoen`           :green:`Yes`        :green:`Yes`         :backgreen:`Easy`
    :ref:`SciPy Extended`   :green:`Yes`        :green:`Yes`         :backgreen:`Easy`
    ======================= =================== ==================== ====================

|
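The difficulty labels in `Table 0.2`_ follow mechanically from the aggregated success rates; the snippet below is a minimal sketch of the classification rule, where the ``success`` argument is assumed to be the overall percentage of problems solved in a suite at :math:`NF = 2,000`:

.. code-block:: python

    def difficulty(success):
        # ``success`` is the percentage of problems solved in a suite,
        # aggregated across all solvers, with a budget of NF = 2,000.
        if success >= 75.0:
            return "Easy"
        elif success > 50.0:
            return "Medium"
        return "Hard"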
|results| Aggregated, High-Level Results
-----------------------------------------

This section provides some very high-level results stemming from the application of all nonlinear solvers to all the benchmark test suites. It is by no means a complete analysis, but it does offer an insight into the various solvers' performances, both in general and as a function of the problem dimensionality and of the benchmark type. Some test suites are much harder than others, and I have also tried to categorize the benchmark suites in terms of their "difficulty", if that even makes sense.

I am going to present results in tables and graphs, normally side-by-side or one after the other. I prefer the visuals, but I appreciate that quite a few people want to see the raw numbers. Since the amount of results available is kind of humongous, I have decided to present the high-level results for all solvers at two cutoffs in terms of function evaluation limits, and specifically :math:`NF = 2,000` and :math:`NF = 10,000` maximum evaluations.

|info| General Results
++++++++++++++++++++++

Now, stopping the chit chat, `Table 0.3`_ and `Figure 0.1`_ below show the overall high-level results in terms of percentage of problems solved for each Global Optimization solver and each benchmark suite, at a **2,000** limit on function evaluations:

.. _Table 0.3:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 0.3**: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 2,000`

    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | Test Suite            | AMPGO  | BasinHopping | BiteOpt   | CMA-ES | CRS2      | DE     | DIRECT    | DualAnnealing | LeapFrog | MCS       | PSWARM | SCE    | SHGO      | **Best Solver**  |
    +=======================+========+==============+===========+========+===========+========+===========+===============+==========+===========+========+========+===========+==================+
    | :ref:`CVMG`           | 53.9%  | 55.0%        | 47.2%     | 50.8%  | 40.6%     | 48.3%  | 55.6%     | 54.2%         | 42.5%    | **56.4%** | 41.9%  | 50.6%  | 54.7%     | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GKLS`           | 14.1%  | 15.5%        | 18.6%     | 7.1%   | 12.1%     | 13.7%  | 21.3%     | 24.0%         | 7.5%     | **28.5%** | 9.9%   | 12.1%  | 23.3%     | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GlobalLib`      | 60.8%  | 53.0%        | 72.4%     | 43.1%  | 43.6%     | 39.8%  | 45.9%     | 59.7%         | 49.2%    | **80.7%** | 19.3%  | 56.4%  | 60.2%     | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GlobOpt`        | 1.4%   | 10.4%        | 6.2%      | 0.3%   | **13.9%** | 4.5%   | 3.8%      | 5.6%          | 1.4%     | 6.9%      | 0.0%   | 0.3%   | 0.7%      | **CRS2**         |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GOTPY`          | 7.5%   | 3.2%         | 11.8%     | 3.0%   | 9.2%      | 6.2%   | **23.0%** | 8.8%          | 6.5%     | 17.5%     | 4.2%   | 7.8%   | 18.0%     | **DIRECT**       |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`Huygens`        | 13.0%  | 2.0%         | 22.5%     | 8.0%   | 22.7%     | 36.5%  | **67.0%** | 2.8%          | 6.2%     | 11.5%     | 1.8%   | 14.5%  | 3.0%      | **DIRECT**       |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`LGMVG`          | 46.4%  | 13.2%        | 43.8%     | 28.3%  | 50.7%     | 42.1%  | 65.8%     | 75.3%         | 32.6%    | 70.1%     | 20.4%  | 48.4%  | **77.6%** | **SHGO**         |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`MMTFG`          | 60.2%  | 72.1%        | 72.3%     | 52.2%  | 49.8%     | 32.9%  | 59.4%     | 67.0%         | 43.9%    | **77.1%** | 19.6%  | 24.6%  | 69.2%     | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`MPM2`           | 36.0%  | **68.1%**    | 35.0%     | 39.6%  | 35.0%     | 32.9%  | 59.8%     | 50.6%         | 23.8%    | 65.6%     | 21.2%  | 40.8%  | 65.2%     | **BasinHopping** |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`NgLi`           | 59.6%  | 32.5%        | 51.7%     | 56.2%  | 61.7%     | 43.3%  | 59.6%     | 68.8%         | 42.9%    | **86.7%** | 2.5%   | 77.9%  | 75.8%     | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`NIST`           | 33.3%  | 11.1%        | 48.1%     | 14.8%  | 25.9%     | 7.4%   | 3.7%      | 11.1%         | 18.5%    | **63.0%** | 0.0%   | 29.6%  | 7.4%      | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`NLSE`           | 63.6%  | 63.6%        | 75.0%     | 52.3%  | 38.6%     | 45.5%  | 47.7%     | 56.8%         | 54.5%    | **79.5%** | 27.3%  | 61.4%  | 65.9%     | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`RandomFields`   | 10.6%  | 19.4%        | 24.8%     | 9.2%   | 13.1%     | 19.8%  | **34.4%** | 34.2%         | 8.3%     | 29.4%     | 14.4%  | 17.1%  | 34.0%     | **DIRECT**       |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`Robust`         | 52.1%  | 36.2%        | **62.4%** | 33.6%  | 40.5%     | 27.9%  | 45.7%     | 51.4%         | 22.9%    | 45.5%     | 28.1%  | 23.1%  | 36.4%     | **BiteOpt**      |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`Schoen`         | 53.3%  | 90.2%        | 70.5%     | 71.2%  | 67.4%     | 52.3%  | 75.4%     | 85.6%         | 52.6%    | 91.2%     | 58.2%  | 68.4%  | **95.4%** | **SHGO**         |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`SciPy Extended` | 69.5%  | 64.7%        | **83.4%** | 58.8%  | 73.4%     | 75.4%  | 72.5%     | 72.4%         | 61.8%    | 81.9%     | 21.2%  | 71.2%  | 69.7%     | **BiteOpt**      |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | **Weighted Average**  | 39.1%  | 41.0%        | 46.1%     | 33.7%  | 37.8%     | 33.3%  | 51.8%     | 46.6%         | 28.4%    | 53.2%     | 18.6%  | 32.8%  | 48.6%     | **MCS**          |
    +-----------------------+--------+--------------+-----------+--------+-----------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+

|
`Figure 0.1`_ shows some sort of "heatmap": the greener the color of a cell, the better the performance of that specific solver on the corresponding benchmark. I have highlighted the "best" solver for each test suite in the picture with a black square.
.. _Figure 0.1:

.. figure:: figures/high_level/high_level_heatmap_2000.png
    :alt: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 2,000`
    :align: center

    **Figure 0.1**: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 2,000`

|

Similarly, `Table 0.4`_ below shows the percentage of problems solved for each Global Optimization solver and each benchmark suite at a **10,000** limit on function evaluations (which is the maximum I have allowed for any benchmark test suite in this exercise):

.. _Table 0.4:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 0.4**: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 10,000`

    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | Test Suite            | AMPGO     | BasinHopping | BiteOpt   | CMA-ES | CRS2   | DE     | DIRECT    | DualAnnealing | LeapFrog | MCS       | PSWARM | SCE    | SHGO      | **Best Solver**  |
    +=======================+===========+==============+===========+========+========+========+===========+===============+==========+===========+========+========+===========+==================+
    | :ref:`CVMG`           | **58.3%** | 55.8%        | 49.2%     | 53.3%  | 40.6%  | 53.1%  | 56.1%     | 54.7%         | 42.5%    | 56.9%     | 48.9%  | 53.6%  | 56.1%     | **AMPGO**        |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GKLS`           | 26.6%     | 19.4%        | 22.8%     | 12.5%  | 13.7%  | 19.4%  | 28.6%     | 30.1%         | 7.5%     | **39.3%** | 19.5%  | 13.8%  | 29.5%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GlobalLib`      | 71.8%     | 53.6%        | 79.0%     | 54.1%  | 67.4%  | 65.2%  | 48.6%     | 63.0%         | 54.7%    | **85.1%** | 44.8%  | 66.3%  | 61.9%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GlobOpt`        | 5.9%      | 20.8%        | 10.8%     | 3.1%   | 18.4%  | 10.8%  | 10.4%     | 7.3%          | 1.4%     | **24.0%** | 7.3%   | 3.8%   | 3.8%      | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`GOTPY`          | 25.5%     | 5.0%         | 14.8%     | 6.8%   | 14.2%  | 14.2%  | 26.8%     | 10.5%         | 6.5%     | **43.5%** | 10.8%  | 10.0%  | 26.5%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`Huygens`        | 40.2%     | 4.5%         | 34.0%     | 22.3%  | 23.2%  | 36.8%  | **74.7%** | 9.0%          | 6.2%     | 27.7%     | 37.5%  | 35.8%  | 6.0%      | **DIRECT**       |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`LGMVG`          | 73.0%     | 11.8%        | 49.7%     | 33.2%  | 57.6%  | 69.1%  | 73.7%     | 79.6%         | 32.6%    | 86.2%     | 57.9%  | 53.6%  | **95.1%** | **SHGO**         |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`MMTFG`          | 75.8%     | 77.7%        | 78.9%     | 66.8%  | 61.1%  | 72.0%  | 64.4%     | 71.3%         | 43.9%    | **85.5%** | 67.7%  | 35.5%  | 78.7%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`MPM2`           | 59.2%     | **88.3%**    | 39.0%     | 52.3%  | 37.1%  | 47.1%  | 68.8%     | 57.3%         | 23.8%    | 87.1%     | 47.1%  | 43.1%  | 80.0%     | **BasinHopping** |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`NgLi`           | 85.4%     | 33.3%        | 54.6%     | 82.1%  | 79.2%  | 82.5%  | 76.2%     | 72.9%         | 42.9%    | **96.2%** | 75.4%  | 79.6%  | 84.6%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`NIST`           | 51.9%     | 14.8%        | 70.4%     | 44.4%  | 70.4%  | 40.7%  | 3.7%      | 14.8%         | 22.2%    | **74.1%** | 7.4%   | 37.0%  | 7.4%      | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`NLSE`           | 75.0%     | 68.2%        | 90.9%     | 59.1%  | 52.3%  | 65.9%  | 52.3%     | 61.4%         | 65.9%    | **95.5%** | 54.5%  | 68.2%  | 65.9%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`RandomFields`   | 27.5%     | 28.5%        | 31.0%     | 16.2%  | 14.0%  | 34.0%  | 44.0%     | 41.5%         | 8.3%     | 49.4%     | 24.2%  | 26.7%  | **58.1%** | **SHGO**         |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`Robust`         | **76.0%** | 41.7%        | 73.1%     | 58.1%  | 51.4%  | 59.0%  | 49.0%     | 64.8%         | 23.8%    | 59.5%     | 51.2%  | 31.2%  | 44.3%     | **AMPGO**        |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`Schoen`         | 90.5%     | 95.1%        | 74.4%     | 78.9%  | 69.8%  | 74.4%  | 83.5%     | 87.7%         | 52.6%    | **96.8%** | 71.9%  | 69.5%  | 96.1%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | :ref:`SciPy Extended` | 80.7%     | 69.3%        | **89.8%** | 61.8%  | 79.8%  | 84.3%  | 77.9%     | 77.2%         | 62.7%    | 88.7%     | 75.9%  | 80.0%  | 77.1%     | **BiteOpt**      |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
    | **Weighted Average**  | 58.1%     | 46.6%        | 52.4%     | 44.9%  | 44.4%  | 52.8%  | 58.2%     | 51.6%         | 28.8%    | 66.6%     | 47.5%  | 40.7%  | 57.5%     | **MCS**          |
    +-----------------------+-----------+--------------+-----------+--------+--------+--------+-----------+---------------+----------+-----------+--------+--------+-----------+------------------+
|

And the corresponding image in `Figure 0.2`_:
.. _Figure 0.2:

.. figure:: figures/high_level/high_level_heatmap_10000.png
    :alt: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 10,000`
    :align: center

    **Figure 0.2**: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 10,000`

|

There are some interesting conclusions to be drawn already. Considering the test suites only for the moment, irrespective of the budget of function evaluations available, there seem to be some very tough benchmark problems: the ones that stand out the most are :ref:`GlobOpt`, :ref:`GOTPY`, :ref:`GKLS` and :ref:`Huygens`. Even at a very high limit on the number of function evaluations, none of the solvers manages to solve more than 50% of the problems in the toughest three suites above.

Secondly, from the solvers' point of view, it seems like :ref:`MCS` is doing quite a good job overall, followed by **SHGO**, **DIRECT** and **BiteOpt**. For a moderate budget (at :math:`NF = 2,000`), MCS is the best solver for **8** of the **16** benchmark suites. At the maximum level of :math:`NF = 10,000`, MCS is the best solver for **10** of the **16** benchmark suites.

However, in order to draw any meaningful, very high-level conclusion on the solvers' performances, we have to deepen our analysis and consider the problem dimensionality and the variations in budget in terms of the maximum number of function evaluations allowed.

|size| Dimensionality Effects
+++++++++++++++++++++++++++++

In this respect, we can first consider what fraction of problems each solver solved as a function of the benchmark dimensionality, and this is what the following two figures show (the first one at :math:`NF = 2,000` and the second one at :math:`NF = 10,000`).

.. _Figure 0.3:

.. figure:: figures/high_level/high_level_dimens_2000.png
    :alt: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 2,000`
    :align: center

    **Figure 0.3**: Percentage of problems solved as a function of problem dimension, per solver at :math:`NF = 2,000`

|

.. _Figure 0.4:

.. figure:: figures/high_level/high_level_dimens_10000.png
    :alt: Percentage of problems solved, per solver and benchmark suite at :math:`NF = 10,000`
    :align: center

    **Figure 0.4**: Percentage of problems solved as a function of problem dimension, per solver at :math:`NF = 10,000`

|

What is interesting to see from these graphs is that :ref:`MCS` is always a very good contender, irrespective of the problem dimensionality, followed by **SHGO**, **DualAnnealing** and **DIRECT** - although the results for **DIRECT** are heavily skewed by its good performance on the :ref:`Huygens` benchmark, which is composed exclusively of 2-dimensional functions (and very many of them: 600).

|test_functions| Expensive Objective Functions
++++++++++++++++++++++++++++++++++++++++++++++

If we consider, for a moment, a relatively tight budget in terms of the available number of function evaluations, we can draw a potential conclusion on which solvers are the most effective when resources are scarce. `Figure 0.5`_ shows exactly that, by displaying the percentage of problems solved when the budget is relatively limited (:math:`200 \leq NF \leq 500`).

.. _Figure 0.5:

.. figure:: figures/high_level/high_level_budget.png
    :alt: Percentage of problems solved, per solver, at limited budget
    :align: center

    **Figure 0.5**: Percentage of problems solved for low budget function evaluations

|

There are a few high-level conclusions we can draw from `Figure 0.5`_:
1. :ref:`MCS` appears to be the winner in this case too, even if by a very narrow margin.

2. The SciPy global solvers **SHGO** and **DualAnnealing** are among the best when the number of function evaluations is limited - a relatively common situation in real-life problems, where the objective function may take several hours to compute.

3. **AMPGO** is unfortunately not as powerful anymore - this new set of benchmarks is much harder than my original tests in 2013.

4. Other solvers - such as **BiteOpt**, **BasinHopping** and **AMPGO** - begin to catch up with the leaders only when the budget expands appreciably (i.e., :math:`NF = 500`).

|time| Simulation Time
++++++++++++++++++++++

While I don't generally care about the performance of an optimization algorithm in terms of elapsed time, for this set of benchmarks I actually recorded the elapsed time for each solver - and specifically the average time spent by each solver in its own internal meanderings between each function evaluation. Of course, each benchmark test suite is written in its own language - whether Python, C or Fortran - so the time spent evaluating a test function varies greatly *between* benchmarks, but it is the same for all solvers on any specific test suite. In that sense, we can attempt some sort of "CPU" timings to see which solvers are the fastest or slowest in terms of **time spent per function evaluation**.
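Such a per-evaluation figure can be obtained by wrapping the objective function to count calls and dividing the total wall-clock time of a run by that count. The snippet below is only a rough sketch of this idea - ``objective_function`` and ``run_solver`` are hypothetical stand-ins, not the actual harness used for these benchmarks:

.. code-block:: python

    import time

    def counted(func):
        # Wrap an objective function so that the number of evaluations
        # can be recovered once the solver has finished.
        def wrapper(x):
            wrapper.nfev += 1
            return func(x)
        wrapper.nfev = 0
        return wrapper

    def objective_function(x):
        # Hypothetical test function (sum of squares).
        return sum(xi**2 for xi in x)

    def run_solver(f):
        # Hypothetical stand-in for a real solver driving ``f``.
        for i in range(2000):
            f([i * 1e-3, -i * 1e-3])

    f = counted(objective_function)
    start = time.perf_counter()
    run_solver(f)
    elapsed = time.perf_counter() - start

    # Average time in milliseconds spent per function evaluation,
    # solver overhead included.
    ms_per_eval = 1000.0 * elapsed / f.nfev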
In `Table 0.5`_ you can see the performance in terms of CPU time for each solver and each benchmark.

.. _Table 0.5:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 0.5**: Time spent per function evaluation in ms, per solver and benchmark suite at :math:`NF = 2,000`

    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | Test Suite            | AMPGO  | BasinHopping | BiteOpt | CMA-ES | CRS2   | DE     | DIRECT | DualAnnealing | LeapFrog | MCS       | PSWARM | SCE    | SHGO      | Slowest Solver |
    +=======================+========+==============+=========+========+========+========+========+===============+==========+===========+========+========+===========+================+
    | :ref:`CVMG`           | 0.066  | 0.054        | 0.094   | 0.219  | 0.039  | 0.099  | 0.041  | 0.085         | 0.079    | **0.825** | 0.071  | 0.163  | 0.565     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`GKLS`           | 0.067  | 0.050        | 0.094   | 0.152  | 0.050  | 0.096  | 0.051  | 0.084         | 0.085    | **1.098** | 0.046  | 0.166  | 1.023     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`GlobalLib`      | 0.095  | 0.097        | 0.125   | 0.301  | 0.078  | 0.129  | 0.079  | 0.111         | 0.113    | 0.416     | 0.103  | 0.197  | **1.711** | **SHGO**       |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`GlobOpt`        | 0.030  | 0.014        | 0.055   | 0.168  | 0.004  | 0.063  | 0.004  | 0.055         | 0.040    | 0.308     | 0.031  | 0.107  | **0.577** | **SHGO**       |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`GOTPY`          | 0.072  | 0.062        | 0.109   | 0.215  | 0.054  | 0.115  | 0.056  | 0.093         | 0.094    | **0.364** | 0.082  | 0.171  | 0.273     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`Huygens`        | 0.080  | 0.059        | 0.090   | 0.205  | 0.044  | 0.105  | 0.047  | 0.092         | 0.072    | **0.288** | 0.072  | 0.130  | 0.189     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`LGMVG`          | 0.061  | 0.058        | 0.096   | 0.198  | 0.043  | 0.105  | 0.045  | 0.087         | 0.085    | **0.483** | 0.072  | 0.162  | 0.449     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`MMTFG`          | 0.028  | 0.018        | 0.050   | 0.233  | 0.005  | 0.061  | 0.005  | 0.044         | 0.036    | **0.278** | 0.032  | 0.102  | 0.223     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`MPM2`           | 0.434  | 0.415        | 0.458   | 0.580  | 0.403  | 0.467  | 0.405  | 0.451         | 0.442    | **0.828** | 0.440  | 0.531  | 0.764     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`NgLi`           | 0.019  | 0.015        | 0.052   | 0.177  | 0.004  | 0.059  | 0.005  | 0.039         | 0.040    | **0.399** | 0.020  | 0.129  | 0.390     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`NIST`           | 0.043  | 0.034        | 0.079   | 0.291  | 0.027  | 0.087  | 0.029  | 0.066         | 0.069    | 0.316     | 0.056  | 0.166  | **1.404** | **SHGO**       |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`NLSE`           | 0.046  | 0.045        | 0.080   | 0.327  | 0.025  | 0.085  | 0.028  | 0.057         | 0.065    | 0.301     | 0.058  | 0.156  | **1.772** | **SHGO**       |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`RandomFields`   | 1.302  | 1.293        | 1.317   | 1.583  | 1.256  | 1.342  | 1.255  | 1.328         | 1.291    | 1.636     | 1.284  | 1.360  | **1.854** | **SHGO**       |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`Robust`         | 0.056  | 0.042        | 0.091   | 0.204  | 0.031  | 0.094  | 0.032  | 0.079         | 0.075    | 0.564     | 0.062  | 0.148  | **0.954** | **SHGO**       |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`Schoen`         | 0.060  | 0.050        | 0.096   | 0.237  | 0.035  | 0.105  | 0.035  | 0.067         | 0.078    | **0.585** | 0.070  | 0.183  | 0.389     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | :ref:`SciPy Extended` | 0.061  | 0.055        | 0.084   | 0.248  | 0.040  | 0.096  | 0.041  | 0.075         | 0.071    | **0.349** | 0.068  | 0.134  | 0.236     | **MCS**        |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
    | **Weighted Average**  | 0.171  | 0.159        | 0.197   | 0.326  | 0.147  | 0.205  | 0.148  | 0.189         | 0.183    | 0.679     | 0.169  | 0.259  | 0.707     | **SHGO**       |
    +-----------------------+--------+--------------+---------+--------+--------+--------+--------+---------------+----------+-----------+--------+--------+-----------+----------------+
|

The table reflects exactly my feelings while I was running the different benchmark test suites. Test suites written in Python (and especially the ones re-coded by me), such as :ref:`GKLS`, :ref:`RandomFields` and :ref:`MPM2`, are not particularly performant. Other test suites - either originally coded in C and wrapped with Cython, or the ones I have translated to Fortran - such as :ref:`MMTFG` and :ref:`GlobOpt`, are relatively fast overall.

The table also perfectly represents me impatiently waiting for :ref:`MCS` and **SHGO** to finish their optimizations, especially when varying the maximum number of function evaluations allowed from 100 to 10,000... it took forever for those two solvers to finish, and the higher the dimensionality, the worse the problem becomes. This can actually be seen in `Figure 0.6`_, which shows the average CPU time spent by each solver per function evaluation as a function of the problem dimension in the :ref:`GKLS` benchmark. Pretty much all the other solvers are stable around a relatively low runtime (which of course depends on how busy my machine was - those numbers are also influenced by the fact that the benchmarks are run in parallel, with one CPU core per solver), except for :ref:`MCS` and **SHGO**.

.. _Figure 0.6:

.. figure:: figures/high_level/high_level_cpu_2000.png
    :alt: CPU runtime, per solver, for the GKLS test suite
    :align: center

    **Figure 0.6**: CPU runtime, per solver, for the GKLS test suite as a function of problem dimension

|

This behaviour is somewhat to be expected: **SHGO** is explicitly targeted at low-dimensional problems, as stated in the "Notes" section of https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.shgo.html , and :ref:`MCS` represents my translation (pretty much line by line) of an already inefficient Matlab code into an even less efficient language, Python.

.. toctree::
    :maxdepth: 2
    :hidden:

    scipy_extended
    gkls
    globopt
    mmtfg
    gotpy
    huygens
    lgmvg
    ngli
    mpm2
    randomfields
    nist
    globallib
    cvmg
    nlse
    schoen
    robust