.. include:: headings.inc

.. role:: boldred
.. role:: boldgreen
.. role:: backred
.. role:: backyellow
.. role:: backgreen
.. role:: red
.. role:: green

.. _Robust:

|infinity| Robust
=================

|

.. table::

   =================== =================== ====================
   F\ :sub:`opt` Known X\ :sub:`opt` Known Difficulty
   =================== =================== ====================
   :green:`Yes`        :red:`Yes`          :backyellow:`Medium`
   =================== =================== ====================

|

This is the last benchmark of the series, and you will forgive me if I stop after this... The
Robust test suite is actually composed of 5 different kinds of analytical test function
generators, producing deceptive, multimodal or flat functions depending on the settings.

The theory behind this test suite is described in the paper
https://www.sciencedirect.com/science/article/pii/S0020025515006301 . Matlab source code is
available at http://www.alimirjalili.com/RO.html ; I simply converted it to Python.

|


|methodology| Methodology
-------------------------

The actual objective functions created by this test function generator depend on which generator
is chosen - 5 of them are available.

1. For generator I:

   .. math::

      f(x) = \frac{1}{\sqrt {2 \pi}}e^{-0.5 \left ( \frac{ \sum_{i=1}^{n} (x_i - 1.5)^2}{0.5} \right )^2} + \frac{2}{\sqrt {2 \pi}}e^{-0.5 \left ( \frac{\sum_{i=1}^{n} (x_i - 0.5)^2}{\alpha} \right )^2}

   Where :math:`\alpha` controls the robustness of the global optimum by narrowing or widening
   it: the robustness of the global optimum decreases proportionally to the value of the
   parameter :math:`\alpha`. A minimal Python sketch of this generator is shown after the list.

2. For generator II:

   .. math::

      f(x) = -G(x) \cdot H(x_1) \cdot H(x_2) + \omega

   Where:

   .. math::

      H(x) = \frac{e^{-2x^2} \sin \left [ 2 \pi \lambda \left (x + \frac{\pi}{4 \lambda} \right ) \right ] - x^{\beta}}{3} + 0.5

   .. math::

      G(x) = 1 + 10 \frac{\sum_{i=3}^{N} x_i}{N}

   This benchmark generator creates :math:`(\lambda + 1)^2` local optima across the search
   space, which becomes more challenging as :math:`\lambda` increases.

3. For generator III:

   .. math::

      f(x) = (H(x_1) + H(x_2)) \cdot G(x) - 1

   Where:

   .. math::

      H(x) = \frac{1}{2} - 0.3 e^{ -\left ( \frac{x-0.2}{0.004}\right )^2} - 0.5 e^{ -\left ( \frac{x-0.5}{0.05}\right )^2} - 0.3 e^{ -\left ( \frac{x-0.8}{0.004}\right )^2} + \sin(\pi x)

   .. math::

      G(x) = \sum_{i=3}^{N} 50 x_i^2 + 1

   This benchmark generator produces deceptive test functions.

4. For generator IV:

   .. math::

      f(x) = (H(x_1) + H(x_2)) \cdot G(x) - 2

   Where:

   .. math::

      H(x) = 1.2 - 0.2 e^{ -\left ( \frac{x-0.95}{0.03}\right )^2} - 0.2 e^{ -\left ( \frac{x-0.05}{0.01}\right )^2}

   .. math::

      G(x) = \sum_{i=3}^{N} 50 x_i^2

   This benchmark generator produces flat test functions.
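As anticipated above, here is a minimal Python sketch of generator I as defined by its formula.
The function name, the default value of :math:`\alpha` and the example point are my own choices
for illustration; they are not part of the original Matlab code.

.. code-block:: python

   import numpy as np

   def generator_i(x, alpha=0.5):
       """Generator I: two Gaussian-shaped peaks; ``alpha`` widens or narrows
       the second peak, thereby controlling the robustness of its optimum."""
       x = np.asarray(x, dtype=float)
       s1 = np.sum((x - 1.5) ** 2)   # squared distance from the first peak
       s2 = np.sum((x - 0.5) ** 2)   # squared distance from the second peak
       term1 = 1.0 / np.sqrt(2.0 * np.pi) * np.exp(-0.5 * (s1 / 0.5) ** 2)
       term2 = 2.0 / np.sqrt(2.0 * np.pi) * np.exp(-0.5 * (s2 / alpha) ** 2)
       return term1 + term2

   # The second peak sits at x = (0.5, ..., 0.5) in any dimension:
   print(generator_i([0.5, 0.5]))   # ~0.798, dominated by the second peak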
For the Robust benchmark, I have created 420 test functions with dimensionality ranging from 2
to 6. A few examples of 2D benchmark functions created with the Robust generator can be seen in
`Figure 16.1`_.

.. _Figure 16.1:

+-------------------------------------------------------------+-------------------------------------------------------------+-------------------------------------------------------------+
| .. figure:: ../benchmarks/Robust/figures/docs/Robust_1.png  | .. figure:: ../benchmarks/Robust/figures/docs/Robust_2.png  | .. figure:: ../benchmarks/Robust/figures/docs/Robust_3.png  |
|    :align: center                                           |    :align: center                                           |    :align: center                                           |
|                                                             |                                                             |                                                             |
|    **Robust Function 2**                                    |    **Robust Function 10**                                   |    **Robust Function 65**                                   |
+-------------------------------------------------------------+-------------------------------------------------------------+-------------------------------------------------------------+
| .. figure:: ../benchmarks/Robust/figures/docs/Robust_4.png  | .. figure:: ../benchmarks/Robust/figures/docs/Robust_5.png  | .. figure:: ../benchmarks/Robust/figures/docs/Robust_6.png  |
|    :align: center                                           |    :align: center                                           |    :align: center                                           |
|                                                             |                                                             |                                                             |
|    **Robust Function 173**                                  |    **Robust Function 253**                                  |    **Robust Function 327**                                  |
+-------------------------------------------------------------+-------------------------------------------------------------+-------------------------------------------------------------+

|


|test_functions| General Solvers Performances
---------------------------------------------

`Table 16.1`_ below shows the overall success of all Global Optimization algorithms, considering
every benchmark function, for a maximum allowable budget of :math:`NF = 2,000`. The Robust
benchmark suite is a medium-difficulty test suite: the best solver for :math:`NF = 2,000` is
**BiteOpt**, with a success rate of 62.4%, followed at a distance by **AMPGO** and
**DualAnnealing**.

.. note::

   The reported number of function evaluations refers to **successful optimizations only**.

|

.. _Table 16.1:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 16.1**: Solvers performances on the Robust benchmark suite at NF = 2,000

   +---------------------+---------------------+-----------------------+
   | Optimization Method | Overall Success (%) | Function Evaluations  |
   +=====================+=====================+=======================+
   | AMPGO               | 52.14%              | 583                   |
   +---------------------+---------------------+-----------------------+
   | BasinHopping        | 36.19%              | 651                   |
   +---------------------+---------------------+-----------------------+
   | BiteOpt             | 62.38%              | 383                   |
   +---------------------+---------------------+-----------------------+
   | CMA-ES              | 33.57%              | 768                   |
   +---------------------+---------------------+-----------------------+
   | CRS2                | 40.48%              | 892                   |
   +---------------------+---------------------+-----------------------+
   | DE                  | 27.86%              | 1,146                 |
   +---------------------+---------------------+-----------------------+
   | DIRECT              | 45.71%              | 600                   |
   +---------------------+---------------------+-----------------------+
   | DualAnnealing       | 51.43%              | 569                   |
   +---------------------+---------------------+-----------------------+
   | LeapFrog            | 22.86%              | 177                   |
   +---------------------+---------------------+-----------------------+
   | MCS                 | 45.48%              | 268                   |
   +---------------------+---------------------+-----------------------+
   | PSWARM              | 28.10%              | 1,457                 |
   +---------------------+---------------------+-----------------------+
   | SCE                 | 23.10%              | 710                   |
   +---------------------+---------------------+-----------------------+
   | SHGO                | 36.43%              | 368                   |
   +---------------------+---------------------+-----------------------+

These results are also depicted in `Figure 16.2`_, which shows that **BiteOpt** is the
best-performing optimization algorithm, followed at a distance by **AMPGO** and
**DualAnnealing**.

.. _Figure 16.2:

.. figure:: figures/Robust/performances_Robust_2000.png
   :alt: Optimization algorithms performances on the Robust test suite at NF = 2,000
   :align: center

   **Figure 16.2**: Optimization algorithms performances on the Robust test suite at :math:`NF = 2,000`
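For concreteness, the sketch below shows what a single budget-capped run could look like with
SciPy's ``dual_annealing`` (the **DualAnnealing** entry in the tables). The objective reuses
generator I from the Methodology section; the search bounds and the tolerance used to declare a
run successful are assumptions made for this example, not necessarily the exact settings used to
produce the tables.

.. code-block:: python

   import numpy as np
   from scipy.optimize import dual_annealing

   def objective(x, alpha=0.5):
       # Generator I, negated so that the global maximum at (0.5, ..., 0.5)
       # becomes a minimum for the solver.
       s1 = np.sum((x - 1.5) ** 2)
       s2 = np.sum((x - 0.5) ** 2)
       return -(np.exp(-0.5 * (s1 / 0.5) ** 2) / np.sqrt(2 * np.pi)
                + 2 * np.exp(-0.5 * (s2 / alpha) ** 2) / np.sqrt(2 * np.pi))

   bounds = [(-1.0, 2.0)] * 2               # assumed 2D search domain
   result = dual_annealing(objective, bounds, maxfun=2000, seed=42)

   # Declare success if we got close enough to the known optimum; the
   # tolerance here is an illustrative choice.
   f_opt = objective(np.full(2, 0.5))
   print(result.x, result.fun, abs(result.fun - f_opt) < 1e-6)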
Pushing the available budget to a very generous :math:`NF = 10,000`, the results show **AMPGO**
snatching the top spot from **BiteOpt**, although the two algorithms are close. **DualAnnealing**
and :ref:`MCS` trail behind at a large distance. The results are also shown visually in
`Figure 16.3`_.

.. _Table 16.2:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 16.2**: Solvers performances on the Robust benchmark suite at NF = 10,000

   +---------------------+---------------------+-----------------------+
   | Optimization Method | Overall Success (%) | Function Evaluations  |
   +=====================+=====================+=======================+
   | AMPGO               | 75.95%              | 1,897                 |
   +---------------------+---------------------+-----------------------+
   | BasinHopping        | 41.67%              | 1,100                 |
   +---------------------+---------------------+-----------------------+
   | BiteOpt             | 73.10%              | 1,272                 |
   +---------------------+---------------------+-----------------------+
   | CMA-ES              | 58.10%              | 2,330                 |
   +---------------------+---------------------+-----------------------+
   | CRS2                | 51.43%              | 1,463                 |
   +---------------------+---------------------+-----------------------+
   | DE                  | 59.05%              | 3,614                 |
   +---------------------+---------------------+-----------------------+
   | DIRECT              | 49.05%              | 771                   |
   +---------------------+---------------------+-----------------------+
   | DualAnnealing       | 64.76%              | 1,317                 |
   +---------------------+---------------------+-----------------------+
   | LeapFrog            | 23.81%              | 182                   |
   +---------------------+---------------------+-----------------------+
   | MCS                 | 59.52%              | 1,268                 |
   +---------------------+---------------------+-----------------------+
   | PSWARM              | 51.19%              | 1,967                 |
   +---------------------+---------------------+-----------------------+
   | SCE                 | 31.19%              | 1,726                 |
   +---------------------+---------------------+-----------------------+
   | SHGO                | 44.29%              | 1,630                 |
   +---------------------+---------------------+-----------------------+

.. _Figure 16.3:

.. figure:: figures/Robust/performances_Robust_10000.png
   :alt: Optimization algorithms performances on the Robust test suite at NF = 10,000
   :align: center

   **Figure 16.3**: Optimization algorithms performances on the Robust test suite at :math:`NF = 10,000`

|


|results| Sensitivities on Function Evaluations Budget
------------------------------------------------------

It is also interesting to analyze the success of an optimization algorithm based on the fraction
(or percentage) of problems solved given a fixed number of allowed function evaluations, let's
say 100, 200, 300, ..., 2000, 5000, 10000.
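The bookkeeping behind these percentages is straightforward. Here is a minimal sketch, assuming
for each solver a list holding the number of function evaluations needed on every test function
(``None`` for failed runs); this data layout is invented for the example.

.. code-block:: python

   # Hypothetical layout: for each solver, one entry per test function with
   # the evaluations needed to converge, or None if the run failed.
   results = {
       "BiteOpt": [120, 450, None, 1800, 95],
       "AMPGO":   [300, None, 2500, 900, 150],
   }
   budgets = [100, 200, 300, 500, 1000, 2000, 5000, 10000]

   def fraction_solved(evals, budget):
       """Fraction of problems solved within ``budget`` function evaluations."""
       solved = sum(1 for nf in evals if nf is not None and nf <= budget)
       return solved / len(evals)

   for solver, evals in results.items():
       profile = [fraction_solved(evals, b) for b in budgets]
       print(solver, [f"{p:.0%}" for p in profile])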
To present these results visually, we can use two different types of visualizations. The first
one is a sort of "small multiples", in which each solver gets an individual subplot showing the
improvement in the number of solved problems as a function of the available number of function
evaluations - on top of a background set of grey, semi-transparent lines showing all the other
solvers' performances. This visual gives an indication of how good or bad a solver is compared to
all the others as a function of the available budget. Results are shown in `Figure 16.4`_.

.. _Figure 16.4:

.. figure:: figures/Robust/sm_maxfun_Robust.png
   :alt: Percentage of problems solved given a fixed number of function evaluations on the Robust test suite
   :align: center

   **Figure 16.4**: Percentage of problems solved given a fixed number of function evaluations on the Robust test suite

|

The second type of visualization is sometimes referred to as a "Slopegraph", and there are many
variants on the plot layout and appearance that we can implement. The version shown in
`Figure 16.5`_ aggregates all the solvers together, so it is easier to spot when a solver
overtakes another, or how the overall performance of an algorithm changes as the available budget
of function evaluations varies.

.. _Figure 16.5:

.. figure:: figures/Robust/sg_maxfun_Robust.png
   :alt: Percentage of problems solved given a fixed number of function evaluations on the Robust test suite
   :align: center

   **Figure 16.5**: Percentage of problems solved given a fixed number of function evaluations on the Robust test suite

|

A few obvious conclusions we can draw from these pictures are:

1. For this specific benchmark test suite, for very small budgets (i.e., :math:`NF < 500`)
   :ref:`MCS` appears to be the most successful solver.

2. After that threshold, **BiteOpt** becomes vastly superior to all other solvers.

3. **AMPGO** manages to reach the top spot only at very large budgets.


|size| Dimensionality Effects
-----------------------------

Since I used the Robust test suite to generate test functions with dimensionality ranging from 2
to 6, it is interesting to take a look at the solvers' performances as a function of the problem
dimensionality. Of course, in general it is to be expected that fewer problems will be solved at
larger dimensions - although this is not always the case, as it also depends on the function
being generated. Results are shown in `Table 16.3`_, after a short sketch of how such a
per-dimension breakdown can be computed.
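The sketch groups per-run outcomes by problem dimension using pandas, on a made-up set of
records; the column names are illustrative and not those of the actual benchmark code.

.. code-block:: python

   import pandas as pd

   # Hypothetical per-run records: solver, problem dimension, success flag.
   runs = pd.DataFrame([
       {"solver": "BiteOpt", "dim": 2, "solved": True},
       {"solver": "BiteOpt", "dim": 3, "solved": False},
       {"solver": "AMPGO",   "dim": 2, "solved": True},
       {"solver": "AMPGO",   "dim": 3, "solved": True},
   ])

   # Success rate (%) per solver and dimension, plus an overall column.
   table = runs.pivot_table(index="solver", columns="dim",
                            values="solved", aggfunc="mean") * 100
   table["Overall"] = runs.groupby("solver")["solved"].mean() * 100
   print(table.round(1))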
.. _Table 16.3:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 16.3**: Dimensionality effects on the Robust benchmark suite at NF = 2,000

   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | **Solver**           | **N = 2**         | **N = 3**         | **N = 4**         | **N = 5**         | **N = 6**         | **Overall**       |
   +======================+===================+===================+===================+===================+===================+===================+
   | AMPGO                | 72.6              | 57.1              | 45.2              | 48.8              | 36.9              | 52.1              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | BasinHopping         | 58.3              | 40.5              | 34.5              | 23.8              | 23.8              | 36.2              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | :boldgreen:`BiteOpt` | :boldgreen:`91.7` | 54.8              | :boldgreen:`57.1` | :boldgreen:`54.8` | :boldgreen:`53.6` | :boldgreen:`62.4` |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | CMA-ES               | 56.0              | 46.4              | 29.8              | 23.8              | 11.9              | 33.6              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | CRS2                 | 63.1              | 52.4              | 40.5              | 32.1              | 14.3              | 40.5              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | DE                   | 81.0              | 46.4              | :boldred:`8.3`    | :boldred:`3.6`    | :boldred:`0.0`    | 27.9              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | DIRECT               | 71.4              | 51.2              | 50.0              | 44.0              | 11.9              | 45.7              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | DualAnnealing        | 79.8              | 50.0              | 52.4              | 40.5              | 34.5              | 51.4              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | LeapFrog             | :boldred:`45.2`   | :boldred:`27.4`   | 10.7              | 15.5              | 15.5              | 22.9              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | MCS                  | 67.9              | :boldgreen:`59.5` | 35.7              | 33.3              | 31.0              | 45.5              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | PSWARM               | 48.8              | 28.6              | 28.6              | 19.0              | 15.5              | 28.1              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | SCE                  | 59.5              | 31.0              | 21.4              | :boldred:`3.6`    | :boldred:`0.0`    | 23.1              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | SHGO                 | 89.3              | 28.6              | 27.4              | 17.9              | 19.0              | 36.4              |
   +----------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+

|

`Figure 16.6`_ shows the same results in a visual way.

.. _Figure 16.6:

.. figure:: figures/Robust/Robust_high_level_dimens_2000.png
   :alt: Percentage of problems solved as a function of problem dimension at NF = 2,000
   :align: center

   **Figure 16.6**: Percentage of problems solved as a function of problem dimension for the Robust test suite at :math:`NF = 2,000`
|

What we can infer from the table and the figure is that, for pretty much all dimensionalities,
**BiteOpt** displays much better convergence rates than the other solvers, being only slightly
beaten for :math:`N = 3` by :ref:`MCS`.

Pushing the available budget to a very generous :math:`NF = 10,000` shows **SHGO** scoring a
perfect 100% success rate on low-dimensionality problems (:math:`N = 2`), while **AMPGO** shows
extremely high rates of success at higher dimensionalities. The results for the benchmarks at
:math:`NF = 10,000` are displayed in `Table 16.4`_ and `Figure 16.7`_.

.. _Table 16.4:

.. cssclass:: pretty-table benchmark_dimensionality_table

.. table:: **Table 16.4**: Dimensionality effects on the Robust benchmark suite at NF = 10,000

   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | **Solver**         | **N = 2**          | **N = 3**         | **N = 4**         | **N = 5**         | **N = 6**         | **Overall**       |
   +====================+====================+===================+===================+===================+===================+===================+
   | :boldgreen:`AMPGO` | 86.9               | :boldgreen:`82.1` | 75.0              | :boldgreen:`69.0` | :boldgreen:`66.7` | :boldgreen:`76.0` |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | BasinHopping       | 63.1               | 39.3              | 48.8              | 29.8              | 27.4              | 41.7              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | BiteOpt            | 94.0               | 72.6              | :boldgreen:`81.0` | 61.9              | 56.0              | 73.1              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | CMA-ES             | 64.3               | 63.1              | 60.7              | 52.4              | 50.0              | 58.1              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | CRS2               | 61.9               | 56.0              | 58.3              | 46.4              | 34.5              | 51.4              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | DE                 | 83.3               | 81.0              | 54.8              | 46.4              | 29.8              | 59.0              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | DIRECT             | 72.6               | 51.2              | 50.0              | 48.8              | 22.6              | 49.0              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | DualAnnealing      | 85.7               | 61.9              | 63.1              | 59.5              | 53.6              | 64.8              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | LeapFrog           | :boldred:`48.8`    | :boldred:`27.4`   | :boldred:`10.7`   | 16.7              | 15.5              | 23.8              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | MCS                | 85.7               | 79.8              | 51.2              | 41.7              | 39.3              | 59.5              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | PSWARM             | 70.2               | 66.7              | 59.5              | 33.3              | 26.2              | 51.2              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | SCE                | 69.0               | 41.7              | 39.3              | :boldred:`4.8`    | :boldred:`1.2`    | 31.2              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
   | SHGO               | :boldgreen:`100.0` | 53.6              | 21.4              | 25.0              | 21.4              | 44.3              |
   +--------------------+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|

.. _Figure 16.7:

.. figure:: figures/Robust/Robust_high_level_dimens_10000.png
   :alt: Percentage of problems solved as a function of problem dimension at NF = 10,000
   :align: center

   **Figure 16.7**: Percentage of problems solved as a function of problem dimension for the Robust test suite at :math:`NF = 10,000`