Key Insights
This platform contains the results of benchmarking 5 optimization solvers on 120 problems arising from energy system models. For each benchmark run, we measure the runtime and memory consumption of the solver, along with other metrics that let us check solution quality across solvers.
Note that we run all solvers with their default options, with some exceptions – see full details on our Methodology page. We also gather information such as the number of variables and constraints for each problem instance, along with details about the scenario modelled by each problem; this information, together with download links for each problem, can be found on our Benchmark Set page.
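To make this concrete, here is a minimal sketch of the kind of measurement involved, not our actual harness (see the Methodology page for that); the solver command and problem file below are placeholders.

```python
import resource
import subprocess
import time

# Placeholder solver invocation; substitute the real solver CLI and problem file.
cmd = ["highs", "problem.mps"]

start = time.perf_counter()
result = subprocess.run(cmd, capture_output=True, text=True)
runtime_s = time.perf_counter() - start

# Peak resident set size of finished child processes
# (on Linux, ru_maxrss is reported in kilobytes; resource is a Unix-only module).
peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss

print(f"runtime: {runtime_s:.1f} s, "
      f"peak memory: {peak_kb / 1024:.0f} MB, "
      f"exit code: {result.returncode}")
```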
This page presents the main takeaways from our benchmark platform in an introductory and accessible manner. Advanced users, and those wishing to dig into more detail, can explore the full results in our interactive dashboards.
How good is each solver, and for what cases?
The overall summary of our results is shown in the plot below, which shows the runtime of each solver, relative to the fastest solver, on each subset of our benchmark set. A problem on which a solver timed out or errored is assumed to have a runtime equal to the timeout with which it was run. (More details, and other ways to handle timeouts and errors, can be found on our main dashboard.) We split our set of problems by problem size, and also categorize certain problems as realistic if they arise from, or have similar model features as, models used in real-world energy planning studies. Hovering over any bar on the plot will show you the average runtime of that solver on that subset of benchmarks, along with the percentage of benchmarks it could solve within the time limit.
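For readers who want the metric spelled out, here is a minimal sketch of how such a relative-runtime measure can be computed from raw results; the column names and the small pandas table are illustrative, not our exact pipeline.

```python
import pandas as pd

# Illustrative raw results: one row per (solver, problem) run.
df = pd.DataFrame({
    "solver":  ["highs", "scip", "highs", "scip"],
    "problem": ["p1", "p1", "p2", "p2"],
    "runtime": [10.0, 25.0, 3600.0, 40.0],  # seconds
    "status":  ["ok", "ok", "timeout", "ok"],
})
TIMEOUT = 3600.0  # 1 hr, the timeout used for small/medium problems

# Timed-out or errored runs are charged the full timeout.
df.loc[df["status"] != "ok", "runtime"] = TIMEOUT

# Runtime relative to the fastest solver on each problem, then averaged.
df["relative"] = df["runtime"] / df.groupby("problem")["runtime"].transform("min")
print(df.groupby("solver")["relative"].mean())
```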
The next plot shows the concrete performance of each solver on a few representative realistic problems from the modelling frameworks in our benchmark set. Hover over a problem name to see more details about the benchmark's features and why we consider it representative of that modelling framework. Solvers that timed out or errored on a particular problem are indicated by red text above the corresponding bar. Four of the seven problems can be solved by at least one open source solver, with different solvers (HiGHS or SCIP) providing the best performance on different problems.
How are solvers evolving over time?
This plot shows the average runtime of the final solver version released in each year, relative to that year's fastest solver, over all S and M size benchmarks in our set. This shows the performance evolution of the solvers relative to one another.
The plot below shows the performance evolution of the selected solver individually, relative to the first version of that solver that we benchmarked. The bars denote the number of unsolved problems in our benchmark set (the fewer the better), and the red line shows the reduction in average runtime over the set relative to the first version (i.e., the speedup factor).
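A small sketch of the speedup-factor calculation, assuming we already have the mean runtime of each benchmarked version over a fixed problem set (the version labels and numbers below are made up):

```python
# Mean runtime (seconds) of each benchmarked version of a hypothetical solver
# over a fixed problem set, listed oldest first (made-up numbers).
mean_runtime = {"v1.0": 120.0, "v1.4": 95.0, "v1.7": 60.0, "v2.0": 45.0}

baseline = next(iter(mean_runtime.values()))  # first benchmarked version
for version, t in mean_runtime.items():
    print(f"{version}: {baseline / t:.2f}x speedup over the first version")
```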
More detailed statistics regarding performance evolution of solvers can be seen in our Performance History dashboard, which also allows calculating performance statistics on any subset of benchmarks that are of interest.
What is feasible for open source solvers?
Here are the largest LP and MILP problems that open source solvers can solve, from each modelling framework in our set. Please note that we did not generate or collect benchmark problems with the intention of finding the largest ones solvable by open solvers, so there could be larger solvable problems than those in our set. This section can still be used to get an idea of the kinds of spatial and temporal resolutions that open source solvers can handle in reasonable time, and we encourage the community to contribute more benchmark problems so we can more accurately identify the boundary of feasibility.
Clicking on any benchmark problem name takes you to its benchmark details page, which contains more information on the model scenario, the various size instances, full results for that problem, and download links to the problem LP/MPS file, solver logs, and solution files.
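As an example of what you can do with these files, a downloaded MPS file can be re-solved locally with an open source solver; here is a minimal sketch using the highspy package (the file path is a placeholder):

```python
import highspy

h = highspy.Highs()
h.readModel("downloaded-benchmark.mps")  # placeholder path to a downloaded file
h.run()

print("status:", h.getModelStatus())
print("objective:", h.getObjectiveValue())
```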
| Model Framework | LP Benchmark | Num. variables | Num. constraints | Spatial resolution | Temporal resolution | Solver | Runtime |
|---|---|---|---|---|---|---|---|
Given the limitations of our benchmark set, the strongest observable influence on runtime is model size, measured by the number of variables and constraints (see What factors affect solver performance below). This is despite the fact that the above problems do not share many features and are built with different spatial/temporal resolutions and time horizons. It is also interesting that a realistic TEMOA-based problem like temoa-US_9R_TS_SP (9-12) does not have a runtime similar to the largest solved TIMES-based model, Times-Ireland-noco2-counties (26-1ts), despite both having more than 1e6 variables.
| Model Framework | MILP Benchmark | Num. variables | Num. constraints | Spatial resolution | Temporal resolution | Solver | Runtime |
|---|---|---|---|---|---|---|---|
We note that we do not yet have large problem instances from some modelling frameworks in our benchmark set. We welcome contributions to fill these gaps!
What factors affect solver performance?
The most obvious driver of solver performance is the number of variables in the LP/MILP problem, and the plot below shows the correlation between runtime and number of variables. The toggle allows you to select open source solvers only, or all solvers. Each green dot represents the runtime of the fastest (open source) solver on a problem with a given number of variables, and a red X denotes that no (open source) solver could solve the problem within the timeout (1 hr for small and medium problems, and 10 hrs for large problems). The plot gives an indication of the order of magnitude at which solvers start to hit the timeout.
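A rough sketch of how a plot of this kind can be reproduced from per-problem summary data (the column names and values are illustrative):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative per-problem summary: the fastest runtime across the chosen
# solvers, or NaN if no solver finished within the timeout.
df = pd.DataFrame({
    "num_vars": [1e4, 1e5, 1e6, 5e6, 1e7],
    "best_runtime": [2.0, 30.0, 900.0, None, None],
})
TIMEOUT = 3600.0  # timeout used for small/medium problems

solved = df["best_runtime"].notna()
plt.scatter(df.loc[solved, "num_vars"], df.loc[solved, "best_runtime"],
            color="green", label="solved")
plt.scatter(df.loc[~solved, "num_vars"], [TIMEOUT] * int((~solved).sum()),
            color="red", marker="x", label="timeout / error")
plt.xscale("log")
plt.yscale("log")
plt.xlabel("number of variables")
plt.ylabel("runtime of fastest solver (s)")
plt.legend()
plt.show()
```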
The rest of this section examines the effect of different model features on solver performance. You can use the toggles to select between open source solvers only, or all solvers. Hovering over a model name shows you the details of the model scenario, including application type, constraints, LP/MILP, etc.
Effect of increasing spatial and temporal resolutions on PyPSA models
This is a series of different-size instances of a PyPSA-Eur sector-coupled model, where the spatial and temporal resolutions are varied to create increasingly large LP problems. One can see solver runtimes increasing as either resolution is made finer grained.
Effect of unit commitment (UC) on GenX models
genx-10_IEEE_9_bus_DC_OPF-9-1h is an MILP problem that adds UC as an extra model constraint to the power sector model genx-10_IEEE_9_bus_DC_OPF-no_uc-9-1h (an LP problem). Open source solvers are slower on the MILP problem with UC, whereas commercial solvers actually perform better on the MILP in this case. Recall that we run all solvers with their default options except for setting a relative duality (MIP) gap tolerance; in particular, this means that some solvers may choose to run crossover and others not, which could affect performance on the UC case.
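To illustrate that exception, here is how a relative MIP gap tolerance can be set programmatically for two solvers; the 1e-4 value below is illustrative, see the Methodology page for the tolerance we actually use:

```python
import highspy

# HiGHS: the relative MIP gap is the "mip_rel_gap" option.
h = highspy.Highs()
h.setOptionValue("mip_rel_gap", 1e-4)  # illustrative tolerance

# Gurobi (if installed): the equivalent parameter is MIPGap, e.g.
#   import gurobipy as gp
#   m = gp.Model()
#   m.Params.MIPGap = 1e-4
```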
Effect of unit commitment (UC) on PyPSA models
pypsa-eur-elec-op-ucconv-2-3h is an MILP problem that adds UC as an extra model constraint to the power sector operational model pypsa-eur-elec-op-2-1h (an LP problem). Note that the two problems have different temporal resolutions (1h vs 3h). Open source solvers are slower on the MILP problem with UC, whereas commercial solvers show good MILP performance; their gap between the LP and the MILP is probably due to the LP problem's higher temporal resolution. Recall that we run all solvers with their default options except for setting a relative duality (MIP) gap tolerance; in particular, this means that some solvers may choose to run crossover and others not, which could affect performance on the UC case.
Effect of UC, transmission expansion, and CO2 constraints on GenX models
The set of GenX benchmarks below compares solver performance on 1) a case with optimal transmission expansion, 2) a case with both optimal transmission expansion and a CO2 constraint, 3) a case with transmission expansion and UC, and 4) a case with CO2 emission constraints. All the benchmarks share the same spatial and temporal resolution, except for genx-elec_trex_uc-15-24h (the corresponding 168h instance fails due to memory issues).
Stacking transmission expansion optimality and the CO2 constraint leads to almost 2x the solution time of the cases that take into account only one of the two features. As in the PyPSA-Eur case above, the effect of UC on Gurobi's solution time looks negligible compared to the effect of the different time resolution, though a better comparison would require a UC case with the same time resolution as the other benchmarks listed here.
Effect of increasingly stringent CO2 constraints on TEMOA models
In this set of TEMOA models, the 1st case has no CO2 constraints, the 2nd considers US Nationally Determined Contributions (NDCs), and the 3rd enforces net-zero emissions by 2050. It is not surprising to see that, with Gurobi, increasingly stringent CO2 constraints add runtime with respect to the base case. The HiGHS results, however, need further investigation: the 2nd case cannot be solved despite having a less stringent CO2 constraint than the 3rd case, which can.
Effect of time horizons on TIMES models
The comparison of two eTIMES-EU benchmarks highlights how the addition of multi-stage analysis (in this case, 8 optimization periods) with perfect foresight has a large impact on runtime in Gurobi. Though one might expect an increase in runtime comparable to the increase in the number of solution stages, proprietary solvers take approximately 180x more time on the multi-stage problem.
Benchmark problems corresponding to representative model use-cases
All our technical dashboards can be filtered to focus on the application domain or problem type of interest, and all plots and results are generated on the fly when you select a filter option. Since this may be overwhelming for some users, the table below highlights some particular filter combinations that correspond to representative problems arising from common use-cases of each modelling framework. Click any benchmark problem name to see more details about it and to view its results.
| Framework | Problem Class | Application | Time Horizon | MILP Features | Realistic | Example |
|---|---|---|---|---|---|---|
| GenX | LP | Infrastructure & Capacity Expansion | Single Period | None | Realistic | |
| GenX | MILP | Infrastructure & Capacity Expansion | Single Period | Unit commitment | Realistic | |
| PyPSA | LP | Infrastructure & Capacity Expansion | Single Period | None | Realistic | |
| TEMOA | LP | Infrastructure & Capacity Expansion | Multi Period | None | Realistic | |
| TIMES | LP | Infrastructure & Capacity Expansion | Multi Period | None | Realistic | |
What benchmark problems do we have (and which are missing)?
This section breaks down our current benchmark set according to modelling framework, problem type, application domain, and model features. This highlights the kinds of energy models that we test solvers on, but is also a useful warning of the gaps in our collection.
| | DCOPF | GenX | PowerModels | PyPSA | Sienna | TEMOA | TIMES | Tulipa |
|---|---|---|---|---|---|---|---|---|
| Problem Classes | | | | | | | | |
| LP | ✕ | ✕ | | | | | | |
| MILP | ✕ | ✕ | | | | | | |
| Applications | | | | | | | | |
| DC Optimal Power Flow | ✕ | ✕ | ✕ | ✕ | N.A. | N.A. | ✕ | |
| Resource Adequacy | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | |
| Infrastructure & Capacity Expansion | N.A. | ✕ | ✕ | ✕ | ✕ | ✕ | | |
| Operational | ✕ | ✕ | N.A. | ✕ | N.A. | N.A. | ✕ | |
| Steady-state Optimal Power Flow | N.A. | ✕ | ✕ | N.A. | N.A. | N.A. | N.A. | |
| Production cost modelling | ✕ | ✕ | N.A. | ✕ | ✕ | ✕ | ✕ | ✕ |
| Time Horizons | | | | | | | | |
| Single Period | ✕ | | | | | | | |
| Multi Period | ✕ | ✕ | ✕ | ✕ | ✕ | | | |
| MILP Features | | | | | | | | |
| None | ✕ | ✕ | ✕ | | | | | |
| Unit commitment | ✕ | ✕ | | | | | | |
| Piecewise fuel usage | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | |
| Transmission switching | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | |
| Modularity | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | | |
| Realistic | | | | | | | | |
| Realistic | ✕ | | | | | | | |
| Other | ✕ | ✕ | | | | | | |
* N.A.: the modelling framework does not cover this kind of analysis.
For version 2 of our platform, we plan to have a public call for benchmarks to address the gaps above. In particular, we welcome benchmark problem contributions that cover:
- “Application”: DC optimal power flow, Operational, and Production cost modelling analyses for most modelling frameworks.
- “Time horizon”: Multi Period analyses; these are particularly important, as problems with multiple time horizons are more challenging to solve.
- “MILP features”: Unit commitment for TIMES and TEMOA (though these frameworks do not focus particularly on power sector modelling); other MILP features, such as Piecewise fuel usage, Transmission switching, and Modularity, are missing for most frameworks.
- “Realistic”: Realistic problems are missing for PowerModels and Sienna.
- Large problem instances are also missing for many model frameworks; see the section What is feasible for open source solvers above.
Reach out to us if you'd like to contribute any benchmark problems that can fill the above gaps!