Key Insights

This platform contains the results of benchmarking 5 optimization solvers on 120 problems arising from energy system models. For each benchmark run, we measure the solver's runtime and memory consumption, along with other metrics that let us check solution quality across solvers.

Note that we run all solvers with their default options, with some exceptions – see full details on our Methodology page. We also gather the number of variables and constraints for each problem instance, along with information about the scenario being modelled by each problem. This, together with download links to each problem, can be found on our Benchmark Set page.

This page presents the main takeaways from our benchmark platform in an introductory and accessible manner. Advanced users, and those wishing to dig into more detail, can visit the full results in our interactive dashboards.

How good is each solver, and for what cases?

The overall summary of our results is shown in the plot below, which shows the runtime of each solver, relative to the fastest solver, on each subset of our benchmark set. A problem on which a solver timed out or errored is assumed to have a runtime equal to the timeout with which it was run. (More details, and other ways of handling timeouts and errors, can be found on our main dashboard.) We split our set of problems by problem size, and also categorize certain problems as realistic if they arise from, or share model features with, models used in real-world energy planning studies. Hovering over any bar in the plot below shows the average runtime of that solver on that subset of benchmarks, along with the percentage of benchmarks it could solve within the time limit.

[Plot: Runtime relative to fastest solver]
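As a concrete illustration, here is a minimal sketch of how such a relative-runtime metric can be computed, with timeouts and errors counted at the full timeout value. The solver names, problems, and data layout are made up; this is not the platform's actual code.

```python
# Minimal sketch: runtime relative to the fastest solver, with timeouts
# and errors counted as the full timeout (illustrative data, not ours).

TIMEOUT = 3600.0  # 1 hour for small/medium problems (10 hours for large)

# runtimes[solver][problem]: seconds, or None if the solver timed out/errored
runtimes = {
    "solver_a": {"p1": 12.0, "p2": None},
    "solver_b": {"p1": 30.0, "p2": 540.0},
}

def effective_runtime(t):
    """Count a timeout or error as if the solver ran for the full timeout."""
    return TIMEOUT if t is None else t

for problem in ("p1", "p2"):
    fastest = min(effective_runtime(runtimes[s][problem]) for s in runtimes)
    for solver in runtimes:
        rel = effective_runtime(runtimes[solver][problem]) / fastest
        print(f"{solver} on {problem}: {rel:.1f}x the fastest")
```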

The next plot shows the concrete performance of each solver on a few representative realistic problems drawn from several modelling frameworks in our benchmark set. Hover over a problem name to see more details about the benchmark's features and why we consider it representative of that modelling framework. Solvers that timed out or errored on a particular problem are indicated by red text above the corresponding bar. Four of the seven problems can be solved by at least one open source solver, with different solvers (HiGHS or SCIP) providing the best performance on different problems.

[Plot: Runtime relative to fastest solver]

Note: As with all benchmarks, our results provide only an indication of which solvers might be good for your problems. Our benchmark set is not yet as diverse and comprehensive as we would like; see the What benchmark problems do we have section below for the gaps in our set. We encourage users to run our scripts to benchmark solvers on their own problems before picking a solver, and we encourage modellers to contribute problems that can make our benchmark set more representative and diverse. Reach out to us if you'd like to contribute!

How are solvers evolving over time?

The plot below shows the average runtime of the last solver version released each year, relative to that year's fastest solver, over all small (S) and medium (M) benchmarks in our set. This shows how the performance of the solvers has evolved relative to one another.

[Plot: SGM Runtime (Relative to Best per Year); solver selection dropdown for the plot below]
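Here, SGM stands for the shifted geometric mean, a standard aggregate in solver benchmarking that damps the influence of very small runtimes. Below is a minimal sketch of how the per-year relative values could be computed; the shift of 10 seconds and the runtimes are assumptions for illustration.

```python
import math

def shifted_geometric_mean(times, shift=10.0):
    """exp(mean(log(t + shift))) - shift: tiny runtimes distort this
    less than they would a plain geometric mean."""
    return math.exp(sum(math.log(t + shift) for t in times) / len(times)) - shift

# Each solver's runtimes for one year, with timeouts already clamped
# to the timeout value (illustrative numbers):
times_by_solver = {
    "solver_a": [5.0, 80.0, 3600.0],
    "solver_b": [9.0, 40.0, 700.0],
}
sgms = {s: shifted_geometric_mean(t) for s, t in times_by_solver.items()}
best = min(sgms.values())
for solver, sgm in sgms.items():
    print(f"{solver}: {sgm / best:.2f}x the year's best")
```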

The plot below shows the performance evolution of the selected solver on its own, relative to the first version of that solver that we benchmarked. The bars denote the number of problems in our benchmark set that each version leaves unsolved, so the fewer the better. The red line shows the reduction in average runtime over the set relative to the first version (i.e. the speedup factor).
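As a sketch of the two series in that plot (with made-up numbers): the bar for each version counts the problems it leaves unsolved, and the red line divides the first version's average runtime by the current version's.

```python
# Illustrative sketch of the per-version series: unsolved counts (bars)
# and speedup vs. the first benchmarked version (red line).
TIMEOUT = 3600.0
versions = {  # None means unsolved within the timeout
    "v1.0": [60.0, None, 300.0, None],
    "v2.0": [20.0, 3000.0, 150.0, None],
}

def avg_runtime(ts):
    return sum(TIMEOUT if t is None else t for t in ts) / len(ts)

baseline = avg_runtime(versions["v1.0"])
for name, ts in versions.items():
    unsolved = sum(t is None for t in ts)   # bar height
    speedup = baseline / avg_runtime(ts)    # red line
    print(f"{name}: {unsolved} unsolved, {speedup:.2f}x vs first version")
```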

More detailed statistics regarding performance evolution of solvers can be seen in our Performance History dashboard, which also allows calculating performance statistics on any subset of benchmarks that are of interest.

What is feasible for open source solvers?

Here are the largest LP and MILP problems that open source solvers can solve, from each modelling framework in our set. Please note that we did not generate or collect benchmark problems with the intention of finding the largest ones solvable by open solvers, so there could be larger problems solvable by open solvers than those in our set. This section can still be used to get an idea of the kinds of spatial and temporal resolutions that open source solvers can handle in reasonable time, and we encourage the community to contribute more benchmark problems so we can more accurately identify the boundary of feasibility.

Clicking on any benchmark problem name takes you to its benchmark details page, which contains more information on the model scenario, the various size instances, full results on that problem, and download links to the problem's LP/MPS file, solver logs, and solution files.

[Table: largest LP benchmark solved by an open source solver, per modelling framework. Columns: Model Framework, LP Benchmark, Num. variables, Num. constraints, Spatial resolution, Temporal resolution, Solver, Runtime]
Note: There are some important caveats to keep in mind when comparing spatial and temporal resolutions across different modelling frameworks. Concerning spatial resolution, regions and nodes do not represent the same entity and cannot be compared directly. In general, models use nodes to disaggregate the spatial scale when greater detail is needed for power sector analysis and/or for the feedback of other sectors on the electrical grid, while regions are used to ease data aggregation from energy use statistics and suit analyses with a broader focus on the system itself rather than on the physical structure of the (electricity, but not only) network. Furthermore, some nodal models use a hybrid approach and also aggregate nodes to reflect a regional focus. Regarding temporal resolution, time slices are aggregations of time frames with similar energy production/consumption features; to give an idea of the correspondence with a 1-hour-resolution model, a model adopting time slices would need 8760 time slices per year (365 days × 24 hours), with different data (and thus results) associated with each of them.

Given the limitations of our benchmark set, the strongest observable influence on runtime is model size, in terms of the number of variables/constraints (see What factors affect solver performance below). This is despite the fact that the problems above share few features and are built with different spatial/temporal resolutions and time horizons. It is also interesting that a realistic TEMOA-based problem like temoa-US_9R_TS_SP (9-12) does not have a runtime similar to the largest solved TIMES-based model, Times-Ireland-noco2-counties (26-1ts), despite both having > 1e6 variables.

[Table: largest MILP benchmark solved by an open source solver, per modelling framework. Columns: Model Framework, MILP Benchmark, Num. variables, Num. constraints, Spatial resolution, Temporal resolution, Solver, Runtime]

We note that we do not yet have large problem instances from some modelling frameworks in our benchmark set. We welcome contributions to fill these gaps!

What factors affect solver performance?

The most obvious driver of solver performance is the number of variables in the LP/MILP problem, and the plot below shows how runtime correlates with the number of variables. The toggle lets you select open source solvers only, or all solvers. Each green dot represents the runtime of the fastest (open source) solver on a problem with a given number of variables, and a red X denotes that no (open source) solver could solve the problem within the timeout (1 hr for small and medium problems, 10 hrs for large problems). The plot gives an indication of the order of magnitude at which solvers start to hit the timeout.

[Plot: runtime vs. number of variables; toggle: Open solvers only / All solvers; legend: fastest solver (green dot), all solvers timed out (red X)]
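For illustration, here is a sketch of how such a scatter can be assembled with matplotlib; the (variables, runtime) pairs are made up, with None standing for a problem that no solver finished within the timeout.

```python
import matplotlib.pyplot as plt

TIMEOUT = 3600.0
# (number of variables, fastest runtime in seconds, or None if all timed out)
results = [(1e4, 3.0), (1e5, 45.0), (1e6, 900.0), (5e6, None)]

solved = [(n, t) for n, t in results if t is not None]
unsolved = [(n, TIMEOUT) for n, t in results if t is None]  # drawn at the timeout

plt.scatter(*zip(*solved), color="green", label="fastest solver")
plt.scatter(*zip(*unsolved), color="red", marker="x", label="all solvers timed out")
plt.xscale("log")
plt.yscale("log")
plt.xlabel("Number of variables")
plt.ylabel("Runtime (s)")
plt.legend()
plt.show()
```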

The rest of this section examines the effect of different model features on solver performance. Use the toggles to switch between open source solvers only and all solvers. Hovering over a model name shows the details of the model scenario, including application type, constraints, LP/MILP, etc.

Effect of increasing spatial and temporal resolutions on PyPSA models

This is a series of different-size instances of a PyPSA-Eur sector-coupled model, in which the spatial and temporal resolutions are varied to create increasingly large LP problems. One can see the runtime of the solvers increasing as either resolution is made more fine-grained.

[Plot: Runtime of fastest solver; toggle: Open solvers only / All solvers]
Effect of unit commitment (UC) on GenX models

genx-10_IEEE_9_bus_DC_OPF-9-1h is an MILP problem that adds UC as an extra model constraint to the power sector model genx-10_IEEE_9_bus_DC_OPF-no_uc-9-1h (an LP problem). Open source solvers are slower on the MILP problem with UC, whereas commercial solvers actually perform better on the MILP in this case. Recall that we run all solvers with their default options except for setting a relative duality (MIP) gap tolerance; in particular, this means that some solvers may choose to run crossover and others not, which could affect performance on the UC case.

[Plot: Runtime of fastest solver; toggle: Open solvers only / All solvers]
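For reference, here is a hedged sketch of how a relative MIP gap tolerance can be set programmatically in two of the solvers' Python APIs; the 1e-4 value is an assumption for illustration, not necessarily the tolerance we use.

```python
# Assumed gap value (1e-4) for illustration only, not the platform's setting.
import highspy    # HiGHS Python bindings
import gurobipy   # Gurobi Python bindings (requires a license)

h = highspy.Highs()
h.setOptionValue("mip_rel_gap", 1e-4)  # HiGHS: relative MIP gap option

m = gurobipy.Model()
m.Params.MIPGap = 1e-4                 # Gurobi: relative MIP gap parameter
```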
Effect of unit commitment (UC) on PyPSA models

pypsa-eur-elec-op-ucconv-2-3h is an MILP problem that adds UC as an extra model constraint to the power sector operational model pypsa-eur-elec-op-2-1h (an LP problem); note that the two instances have different temporal resolutions (1h vs. 3h). Open source solvers are slower on the MILP problem with UC, whereas commercial solvers show good MILP performance; for them, the gap between the LP and MILP runtimes is probably due to the LP problem's higher temporal resolution. Recall that we run all solvers with their default options except for setting a relative duality (MIP) gap tolerance; in particular, this means that some solvers may choose to run crossover and others not, which could affect performance on the UC case.

[Plot: Runtime of fastest solver; toggle: Open solvers only / All solvers]
Effect of UC, transmission expansion, and CO2 constraints on GenX models

The set of GenX benchmarks below compares solver performance on 1) a case with optimal transmission expansion, 2) a case with both optimal transmission expansion and a CO2 constraint, 3) a case with transmission expansion and UC, and 4) a case with CO2 emission constraints. All the benchmarks share the same spatial and temporal resolution except for genx-elec_trex_uc-15-24h (the corresponding 168h instance fails due to memory issues).

[Plot: Runtime of fastest solver; toggle: Open solvers only / All solvers]

Stacking transmission expansion and the CO2 constraint leads to almost 2x the solution time of the cases that include only one of the two features. As in the PyPSA-Eur case above, the effect of UC on Gurobi's solution time looks negligible relative to the effect of the different time resolution, though a proper comparison would require a UC case with the same time resolution as the other benchmarks listed here.

Effect of increasingly stringent CO2 constraints on TEMOA models

In this set of TEMOA models, the 1st case has no CO2 constraints, the 2nd considers US Nationally Determined Contributions (NDCs), and the 3rd enforces net-zero emissions by 2050. Unsurprisingly, with Gurobi, increasingly stringent CO2 constraints increase runtime relative to the base case. The HiGHS results need further investigation, however: HiGHS cannot solve the 2nd case despite its less stringent CO2 constraint, yet it can solve the 3rd.

[Plot: Runtime of fastest solver; toggle: Open solvers only / All solvers]
Effect of time horizons on TIMES models

The comparison of two eTIMES-EU benchmarks highlights how adding multi-stage analysis (in this case, 8 optimization periods) with perfect foresight has a large impact on runtime in Gurobi. While one might expect runtime to grow roughly in proportion to the number of solution stages, proprietary solvers take approximately 180x longer on the multi-stage problem.

[Plot: Runtime of fastest solver; toggle: Open solvers only / All solvers]

Benchmark problems corresponding to representative model use-cases

All our technical dashboards can be filtered down to the application domain or problem type of interest, and all plots and results are generated on the fly when you select a filter option. Since this may be overwhelming for some users, the table below highlights some particular filter combinations that correspond to representative problems arising from common use-cases of each modelling framework. Click any benchmark problem name to see more details about it and to view its results.

Framework | Problem Class | Application | Time Horizon | MILP Features | Realistic | Example
GenX | LP | Infrastructure & Capacity Expansion | Single Period | None | Realistic | -
GenX | MILP | Infrastructure & Capacity Expansion | Single Period | Unit commitment | Realistic | -
PyPSA | LP | Infrastructure & Capacity Expansion | Single Period | None | Realistic | -
TEMOA | LP | Infrastructure & Capacity Expansion | Multi Period | None | Realistic | -
TIMES | LP | Infrastructure & Capacity Expansion | Multi Period | None | Realistic | -

What benchmark problems do we have (and which are missing)?

This section breaks down our current benchmark set by modelling framework, problem type, application domain, and model features. It highlights the kinds of energy models we test solvers on, but also serves as a useful warning of the gaps in our collection.

[Table: coverage of the benchmark set by modelling framework (DCOPF, GenX, PowerModels, PyPSA, Sienna, TEMOA, TIMES, Tulipa) across:
Problem Classes: LP, MILP
Applications: DC Optimal Power Flow, Resource Adequacy, Infrastructure & Capacity Expansion, Operational, Steady-state Optimal Power Flow, Production cost modelling
Time Horizons: Single Period, Multi Period
MILP Features: None, Unit commitment, Piecewise fuel usage, Transmission switching, Modularity
Realistic: Realistic, Other]

* N.A.: the modelling framework does not cover this kind of analysis

For version 2 of our platform, we plan to have a public call for benchmarks to address the gaps above. In particular, we welcome benchmark problem contributions that cover:

  • “Application”: DC optimal power flow, Operational, and Production cost modelling analyses for most modelling frameworks.
  • “Time horizon”: Multi Period analyses; these are particularly important because problems with multiple time horizons are more challenging to solve.
  • “MILP features”: Unit commitment for TIMES and TEMOA (though these frameworks do not focus primarily on power sector modelling); other MILP features, such as Piecewise fuel usage, Transmission switching, and Modularity, are missing for most frameworks.
  • “Realistic”: Realistic problems are missing for PowerModels and Sienna.
  • Large problem instances are also missing for many model frameworks; see the section What is feasible for open source solvers above.

Reach out to us if you'd like to contribute any benchmark problems that can fill the above gaps!