Key Insights
This platform contains the results of benchmarking a range of optimization solvers on problems arising from energy system models. For each benchmark run, we measure the solver's runtime and memory consumption, along with other metrics that ensure solution quality is comparable across solvers.
Note that we run all solvers with their default options, with some exceptions – see full details on our Methodology page. For each problem, we also gather information such as the number of variables and constraints, along with information about the scenario being modelled. This information, along with download links to each problem, can be found on our Benchmark Set page.
This page presents the main takeaways from our benchmark platform in an introductory and accessible manner. Advanced users and those wishing to dig into more details can visit the full results in our interactive dashboards.
To find out how good each solver is overall, we plot below the average (SGM) runtime of each solver, relative to the fastest solver, across all the LP and MILP problems in our benchmark set. A problem on which a solver timed out or errored is assumed to have a runtime equal to the timeout with which it was run. (More details, and other ways of handling timeouts and errors, can be found on our main dashboard.) We group our problems by size, and also categorize certain problems as realistic if they arise from, or share model features with, models used in real-world energy planning studies. Hovering over any bar in the plot shows the average runtime of that solver on the subset of benchmarks, along with the percentage of benchmarks it could solve within the time limit.
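The shifted geometric mean (SGM) used for this aggregation can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the shift of 10 seconds is a common benchmarking convention that we assume here, and the runtimes are made up.

```python
import math

def shifted_geometric_mean(runtimes, shift=10.0):
    """Shifted geometric mean of runtimes in seconds.

    Computes exp(mean(log(t + shift))) - shift; the shift damps the
    influence of very small runtimes on the aggregate.
    """
    logs = [math.log(t + shift) for t in runtimes]
    return math.exp(sum(logs) / len(logs)) - shift

# Hypothetical runtimes of two solvers on the same three problems.
# A timed-out or errored run is counted at the full timeout (3600 s here).
timeout = 3600.0
solver_a = [12.0, 95.0, timeout]
solver_b = [30.0, 60.0, 400.0]

sgm_a = shifted_geometric_mean(solver_a)
sgm_b = shifted_geometric_mean(solver_b)

# Relative SGM: each solver's SGM divided by the fastest solver's SGM,
# so the fastest solver always scores exactly 1.0.
fastest = min(sgm_a, sgm_b)
relative = {"solver_a": sgm_a / fastest, "solver_b": sgm_b / fastest}
```

The relative scores are what the bars in the plot encode: a value of 2.0 means a solver was, on (shifted geometric) average, twice as slow as the fastest solver on that subset.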
The next plot shows the concrete performance of each solver on a few representative realistic problems from a few modelling frameworks in our benchmark set. Hover over a problem name to see more details about the benchmark's features and why we consider it representative for that modelling framework. Solvers that timed out or errored on a particular problem are indicated by red text above the corresponding bar.
3 out of the 5 problems can be solved by at least one open source solver, with different solvers providing the best performance on different problems.
This plot shows the average runtime of each year’s final-released solver version, relative to the best solver ever measured, over all S and M size benchmarks in our set. This shows the performance evolution of solvers, relative to one another.
The plot below shows the performance evolution of the selected solver individually, relative to the first version of that solver that we have benchmarked. The bars denote the number of unsolved problems in our benchmark set, so the fewer the better. The red line shows the reduction in average runtime over the set relative to the first version (i.e. speedup factor).
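The speedup factor shown by the red line can be sketched as below. The version labels and runtimes are hypothetical; the idea is simply that each version's SGM runtime over a fixed benchmark subset is divided into the first benchmarked version's SGM.

```python
import math

def sgm(runtimes, shift=10.0):
    """Shifted geometric mean of runtimes in seconds (shift assumed to be 10)."""
    return math.exp(sum(math.log(t + shift) for t in runtimes) / len(runtimes)) - shift

# Hypothetical runtimes (seconds) of three versions of one solver on the
# same three benchmarks; later versions solve the set faster.
runs = {
    "v1": [800.0, 1200.0, 2000.0],
    "v2": [400.0, 500.0, 900.0],
    "v3": [150.0, 300.0, 390.0],
}

# Speedup factor of each version relative to the first benchmarked version.
baseline = sgm(runs["v1"])
speedups = {version: baseline / sgm(times) for version, times in runs.items()}
# speedups["v1"] is 1.0 by construction; larger values mean faster versions.
```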
More detailed statistics regarding performance evolution of solvers can be seen in our Performance History dashboard, which also allows calculating performance statistics on any subset of benchmarks that are of interest.
Here are the largest LP and MILP problems that open source solvers can solve, from each modelling framework in our set. Please note that we did not generate or collect benchmark problems with the intention of finding the largest ones solvable by open solvers – so there could be larger problems solvable by open solvers than those in our set. This section can still be used to get an idea of the kinds of spatial and temporal resolutions that open source solvers can handle in reasonable time, and we encourage the community to contribute more benchmark problems so we can more accurately identify the boundary of feasibility.
Clicking on any benchmark problem name takes you to the benchmark details page that contains more information on the model scenario, various size instances, full results on that problem, and download links to the problem LP/MPS file and solver logs and solution files.
| Model Framework | LP Benchmark | Num. variables | Num. constraints | Spatial resolution | Temporal resolution | Solver | Runtime |
|---|---|---|---|---|---|---|---|
Given the limitations of our benchmark set, the strongest observable influence on runtime is model size, in terms of the number of variables/constraints (see more details in What factors affect solver performance below). This is despite the fact that the above problems do not share many features and are built with different spatial/temporal resolutions and time horizons. It is also interesting that the realistic TEMOA-based problem temoa-US_9R_TS_SP (9-12) has a very different runtime from the largest solved TIMES-based model, times-ireland-noco2-counties (26-1ts), despite both having more than 1e6 variables.
| Model Framework | MILP Benchmark | Num. variables | Num. constraints | Spatial resolution | Temporal resolution | Solver | Runtime |
|---|---|---|---|---|---|---|---|
We note that we do not yet have large problem instances from some modelling frameworks in our benchmark set. We welcome contributions to fill these gaps!
What factors affect solver performance?
The most obvious driver of solver performance is the number of variables in the LP/MILP problem, and the plot below shows the correlation between runtime and number of variables. The toggle allows you to select open source solvers only, or all solvers. Each green dot represents the runtime of the fastest (open source) solver on a problem with a given number of variables, and a red X denotes that no (open source) solver could solve the problem within the timeout (1 hr for small and medium problems, and 24 hrs for large problems). The plot gives an indication of the order of magnitude at which solvers start to hit the timeout.
The rest of this section examines the effect of different model features on solver performance. You can use the toggles to select between open source solvers only, or all solvers. Hovering over a model name shows you the details of the model scenario, including application type, constraints, LP/MILP, etc.
Effect of increasing spatial resolutions on PyPSA models
Among open solvers, only HiGHS is able to solve the smallest instance (10-1h); at higher node counts, Gurobi's runtimes illustrate the expected nonlinear growth in computational effort as model size and complexity increase.
Effect of increasing temporal resolutions on PyPSA models
Again, only HiGHS among open solvers can solve the smallest instance (50-168h). As temporal resolution increases (from 168h to 24h and below), Gurobi's runtime escalates dramatically: while the weekly aggregation solves in seconds, the daily resolution already requires nearly an hour, and finer resolutions hit the time limit (1 hour).
Effect of unit commitment (UC) on GenX models
genx-10_IEEE_9_bus_DC_OPF (9-1h) is an MILP problem that adds UC as an extra model constraint to the power sector model genx-10_IEEE_9_bus_DC_OPF-no_uc (9-1h) (LP problem). Adding unit commitment (UC) transforms the LP DC-OPF into an MILP and fundamentally changes solver performance. In the LP case, runtimes are in the order of seconds with HiGHS (which also outperforms Gurobi in this case), while the MILP formulation introduces a dramatic increase in computational effort. In this benchmark, Gurobi is the fastest solver for the UC case (28 seconds), whereas the fastest open-source solver (SCIP) requires around 40 minutes, illustrating the substantial performance gap that can emerge once integer variables are introduced. All solvers are run with default settings except for a fixed relative MIP gap tolerance.
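The fixed relative MIP gap mentioned above is the termination criterion for the branch-and-bound search. A minimal sketch of the idea follows; note that solvers differ slightly in the exact denominator they use, and the objective values and tolerance here are illustrative assumptions.

```python
def relative_mip_gap(incumbent, best_bound):
    """Relative MIP gap: |incumbent - best_bound| / |incumbent|.

    The incumbent is the best integer-feasible objective found so far;
    the best bound is the proven bound from the branch-and-bound tree.
    """
    if incumbent == 0:
        return 0.0 if best_bound == incumbent else float("inf")
    return abs(incumbent - best_bound) / abs(incumbent)

# A UC run is accepted once the gap falls below the fixed tolerance.
tolerance = 1e-4  # illustrative value, not the platform's actual setting
gap = relative_mip_gap(incumbent=1_000_000.0, best_bound=999_950.0)
converged = gap <= tolerance
```

Fixing the same tolerance across solvers matters for fair comparison: a solver stopping at a looser gap would report a shorter runtime for a lower-quality solution.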
Effect of unit commitment (UC) on PyPSA models
pypsa-power+ely-ucgas (1-1h) is an MILP problem that adds UC as an extra model constraint to the power-only model pypsa-power+ely (1-1h) (LP problem). The LP version solves in a few seconds with both Gurobi and HiGHS, while the MILP version requires significantly more time. Gurobi maintains relatively strong performance in the UC case, whereas open-source solvers exhibit a more pronounced slowdown. All solvers are run with default settings except for a fixed relative MIP gap tolerance.
Effect of transmission expansion and CO2 constraints on GenX models
Under open-source solvers, all three GenX variants hit the time limit, providing no indication of the effect of transmission expansion optimization and CO2 constraints. With Gurobi included, the models become solvable within reasonable time, but runtimes vary significantly: adding both transmission expansion and CO2 constraints leads to the longest solve time (2h 45min), while models with only one of the features solve faster (around 1h 30min). This highlights how stacking structural constraints can materially increase computational complexity, even when the formulation remains linear.
Effect of increasingly stringent CO2 constraints on TEMOA models
Increasing the stringency of CO2 constraints affects solver performance differently across solver families. Under open-source solvers, runtime increases when moving from the base case to constrained scenarios, with the NDC case being particularly challenging. When including all solvers, the models solve in a few minutes and runtimes increase only moderately as constraints become more stringent.
Benchmark problems corresponding to representative model use-cases
All our technical dashboards can be filtered to the application domain or problem type of interest, and all plots and results are generated on the fly when you select a filter option. Since this may be overwhelming for some users, we highlight in the table below some particular filter combinations that correspond to representative problems arising from common use-cases of each modelling framework. Click any benchmark problem name to see more details about it, and to view its results.
| Framework | Problem Class | Application | MILP Features | Realistic | Example |
|---|---|---|---|---|---|
| GenX | LP | Infrastructure & Capacity Expansion | None | Yes | |
| GenX | MILP | Infrastructure & Capacity Expansion | Unit commitment | Yes | |
| PyPSA | LP | Infrastructure & Capacity Expansion | None | Yes | |
| TEMOA | LP | Infrastructure & Capacity Expansion | None | Yes | |
| TIMES | LP | Infrastructure & Capacity Expansion | None | Yes | |
| Switch | LP | Infrastructure & Capacity Expansion | None | Yes | |
What benchmark problems do we have (and what is missing)?
This section breaks down our current benchmark set according to modelling framework, problem type, application domain, and model features. This highlights the kinds of energy models that we test solvers on, but is also a useful warning of the gaps in our collection.
| Category | Values |
|---|---|
| Problem Classes | LP; MILP |
| Applications | Infrastructure & Capacity Expansion; Operational; Production cost modelling; DC Optimal Power Flow; Steady-state Optimal Power Flow; Resource Adequacy |
| MILP Features | None; Unit commitment; Transmission switching; Modularity; Binary transmission investment decisions; Piecewise fuel usage; Piecewise-linear part-load efficiency modeling; Piecewise efficiency; Modelling of fixed costs; NonConvex operation |
| Realistic | Realistic |
For version 2 of our platform, we plan to have a public call for benchmarks to address the gaps above. In particular, we welcome benchmark problem contributions that cover:
- “Application”: DC optimal power flow, Operational, and Production cost modelling analyses for most modelling frameworks.
- “Time horizon”: Multi-period analyses; these are particularly important since problems with multiple time horizons are more challenging to solve.
- “MILP features”: Unit commitment for TIMES and TEMOA (though, admittedly, these frameworks do not focus particularly on power sector modelling); other MILP features, such as Piecewise fuel usage, Transmission switching, and Modularity, are missing for most frameworks.
- “Realistic”: Realistic problems are missing for PowerModels and Sienna.
- Large problem instances are also missing for many model frameworks; see the section What is feasible for open solvers above.
Reach out to us if you'd like to contribute any benchmark problems that can fill the above gaps!