This platform contains the results of benchmarking a range of open source and commercial optimization solvers on a collection of problems arising from energy system models. For each benchmark run, we measure the solver's runtime and memory consumption, along with other metrics that check solution quality across solvers.

Note that we run all solvers with their default options, with some exceptions – see full details on our Methodology page. For each problem, we also gather information such as the number of variables and constraints, along with information about the scenario being modelled. This information, along with download links to each problem, can be found on our Benchmark Set page.

This page presents the main takeaways from our benchmark platform in an introductory and accessible manner. Advanced users and those wishing to dig into more details can visit the full results in our interactive dashboards.

How good is each solver, and for what cases?

To find out how good each solver is overall, we plot below the average (SGM) runtime of each solver, relative to the fastest solver, on all the LP and MILP problems in our benchmark set. A problem on which a solver timed out or errored is assumed to have a runtime equal to the timeout with which it was run. (More details, and other ways of handling timeouts and errors, can be found on our main dashboard.) We group our problems by size, and we also categorize certain problems as realistic if they arise from, or share model features with, models used in real-world energy planning studies. Hovering over any bar in the plots below shows the average runtime of that solver on the corresponding subset of benchmarks, along with the percentage of benchmarks it solved within the time limit.
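As a concrete illustration, the SGM with timeout substitution described above can be sketched as follows. The shift value of 10 seconds and the function names are our assumptions for illustration; the platform's exact formula may differ (see the Methodology page).

```python
import math

def sgm_runtime(times, statuses, timeout, shift=10.0):
    """Shifted geometric mean of runtimes. Runs that timed out or
    errored are charged the full timeout; shift=10 s is a common
    convention that damps the influence of very short runtimes."""
    charged = [t if s == "ok" else timeout for t, s in zip(times, statuses)]
    log_sum = sum(math.log(t + shift) for t in charged)
    return math.exp(log_sum / len(charged)) - shift

def relative_sgm(sgms):
    """Each solver's SGM divided by the SGM of the fastest solver."""
    best = min(sgms.values())
    return {solver: sgm / best for solver, sgm in sgms.items()}
```

A solver with a relative SGM of 2.0 is, on average, twice as slow as the fastest solver on that subset of benchmarks.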

Runtime relative to fastest solver - LP
Runtime relative to fastest solver - MILP

The next plot shows the concrete performance of each solver on a few representative realistic problems from several modelling frameworks in our benchmark set. Hover over a problem name to see more details about the benchmark's features and why we consider it representative of that modelling framework. Solvers that timed out or errored on a particular problem are indicated by red text above the corresponding bar.

Runtime relative to fastest solver

3 out of the 5 problems can be solved by at least one open source solver, with different solvers providing the best performance on different problems.

Note: As with all benchmarks, our results provide only an indication of which solvers might be good for your problems. Our benchmark set is not yet as diverse and comprehensive as we would like; see the What benchmark problems do we have section below for the gaps in our set. We encourage users to run our scripts to benchmark solvers on their own problems before picking a solver, and we encourage modellers to contribute problems that make our benchmark set more representative and diverse. Reach out to us if you'd like to contribute!

How are solvers evolving over time?

This plot shows the average runtime of each year's final released solver version, relative to the best solver version ever measured, over all S- and M-size benchmarks in our set. This shows how the solvers' performance has evolved relative to one another.

SGM Runtime (Relative to Best Ever Measured)
Solver:

The plot below shows the performance evolution of the selected solver individually, relative to the first version of that solver that we benchmarked. The bars denote the number of unsolved problems in our benchmark set (the fewer, the better). The red line shows the reduction in average runtime over the set relative to the first version, i.e. the speedup factor.
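The speedup factor shown by the red line can be sketched as follows; the function and input names are illustrative assumptions, and the dashboard computes this from the full benchmark results.

```python
def speedup_vs_first(sgm_by_version):
    """Speedup of each solver version relative to the first benchmarked
    version: SGM(first) / SGM(version). Expects versions in release
    order (dicts preserve insertion order in Python 3.7+)."""
    first = next(iter(sgm_by_version.values()))
    return {version: first / sgm for version, sgm in sgm_by_version.items()}
```

A value of 4.0 for the latest version means the average runtime over the set has dropped to a quarter of what the first benchmarked version needed.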

More detailed statistics regarding performance evolution of solvers can be seen in our Performance History dashboard, which also allows calculating performance statistics on any subset of benchmarks that are of interest.

What is feasible for open source solvers?

Here are the largest LP and MILP problems that open source solvers can solve, from each modelling framework in our set. Please note that we did not generate or collect benchmark problems with the intention of finding the largest ones solvable by open solvers, so there could be larger solvable problems than those in our set. This section can still give an idea of the kinds of spatial and temporal resolutions that open source solvers can handle in reasonable time, and we encourage the community to contribute more benchmark problems so we can identify the boundary of feasibility more accurately.

Clicking on any benchmark problem name takes you to the benchmark details page that contains more information on the model scenario, various size instances, full results on that problem, and download links to the problem LP/MPS file and solver logs and solution files.

[Table of the largest solved LP problems per framework: Model Framework · LP Benchmark · Num. variables · Num. constraints · Spatial resolution · Temporal resolution · Solver · Runtime]
Note: There are several important caveats to consider when comparing spatial and temporal resolutions across different modelling frameworks.

Spatial resolution: regions and nodes do not represent the same concept and therefore cannot be directly compared. In general, some models use nodes to disaggregate the spatial scale when a higher level of detail is required, particularly for power sector analyses and for capturing feedbacks from other sectors on the electricity grid. By contrast, regions are often employed to facilitate data aggregation from energy-use statistics and to support analyses with a broader system-level focus, rather than a focus on the physical structure of the (electricity, and more broadly energy) network. Moreover, some nodal models adopt hybrid approaches in which nodes are partially aggregated to reflect a regional perspective.

Temporal resolution: time slices represent aggregations of time periods with similar energy production and consumption characteristics. Consequently, to be equivalent to a model operating at hourly resolution, a time-slice-based model would in principle require 8,760 time slices per year, each associated with distinct input data and, therefore, potentially different results.

Given the limitations of our benchmark set, the strongest observable influence on runtime is model size, measured by the number of variables and constraints (see What factors affect solver performance below). This holds even though the above problems share few features and are built with different spatial/temporal resolutions and time horizons. It is also interesting that a realistic TEMOA-based problem like temoa-US_9R_TS_SP (9-12) does not have a runtime similar to the largest solved TIMES-based model, times-ireland-noco2-counties (26-1ts), despite both having > 1e6 variables.

[Table of the largest solved MILP problems per framework: Model Framework · MILP Benchmark · Num. variables · Num. constraints · Spatial resolution · Temporal resolution · Solver · Runtime]

We note that we do not yet have large problem instances from some modelling frameworks in our benchmark set. We welcome contributions to fill these gaps!

What factors affect solver performance?

The most obvious driver of solver performance is the number of variables in the LP/MILP problem, and the plot below shows the correlation between runtime and number of variables. The toggle allows you to select open source solvers only, or all solvers. Each green dot represents the runtime of the fastest (open source) solver on a problem with a given number of variables, and a red X denotes that no (open source) solver could solve the problem within the timeout (1 hr for small and medium problems, and 24 hrs for large problems). The plot gives an indication of the order of magnitude at which solvers start to hit the timeout.
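One simple way to read off the order of magnitude at which solvers start to hit the timeout is to bin problems by variable count and compute the fraction solved per bin. A minimal sketch, where the input format (a list of (num_variables, solved) pairs) is an assumption for illustration:

```python
import math
from collections import defaultdict

def solve_rate_by_magnitude(problems):
    """Fraction of problems solved within the timeout, grouped by the
    order of magnitude of the variable count. `problems` is a list of
    (num_variables, solved) pairs; keys of the result are the lower
    bound of each decade (1e3, 1e4, ...)."""
    bins = defaultdict(list)
    for n_vars, solved in problems:
        bins[int(math.log10(n_vars))].append(solved)
    return {10 ** k: sum(v) / len(v) for k, v in sorted(bins.items())}
```

A sharp drop in the solve rate between two adjacent decades indicates the size range where the timeout begins to bite.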


The rest of this section examines the effect of different model features on solver performance. You can use the toggles to select between open source solvers only, or all solvers. Hovering over a model name shows you the details of the model scenario, including application type, constraints, LP/MILP, etc.

Effect of increasing spatial resolutions on PyPSA models

While HiGHS is the only open solver able to solve the smallest instance (10-1h), Gurobi's runtimes at higher node counts show the expected nonlinear growth in computational effort as model size and complexity expand.

Runtime of fastest solver
Effect of increasing temporal resolutions on PyPSA models

Again, HiGHS is the only open solver that can solve the smallest instance (50-168h). As temporal resolution increases (from 168h to 24h and finer), Gurobi's runtime escalates dramatically: the weekly aggregation solves in seconds, the daily resolution already requires nearly an hour, and finer resolutions hit the time limit (1 hour).

Runtime of fastest solver
Effect of unit commitment (UC) on GenX models

genx-10_IEEE_9_bus_DC_OPF (9-1h) is an MILP problem that adds UC as an extra constraint to the power sector model genx-10_IEEE_9_bus_DC_OPF-no_uc (9-1h), an LP problem. Adding unit commitment (UC) transforms the LP DC-OPF into an MILP and fundamentally changes solver performance. In the LP case, runtimes are on the order of seconds with HiGHS (which also outperforms Gurobi here), while the MILP formulation introduces a dramatic increase in computational effort. In this benchmark, Gurobi is the fastest solver for the UC case (28 seconds), whereas the fastest open-source solver (SCIP) requires around 40 minutes, illustrating the substantial performance gap that can emerge once integer variables are introduced. All solvers are run with default settings except for a fixed relative MIP gap tolerance.
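For reference, a relative MIP gap can be fixed across solvers via each solver's documented parameter. The option names below are the solvers' real parameter names; the gap value itself is an assumption for illustration, and the platform's actual tolerance is documented on its Methodology page.

```python
# Hedged sketch: pinning the relative MIP gap across solvers.
MIP_GAP = 1e-4  # assumed value; the platform's actual tolerance may differ

solver_options = {
    "highs":  {"mip_rel_gap": MIP_GAP},   # HiGHS option name
    "gurobi": {"MIPGap": MIP_GAP},        # Gurobi parameter name
    "scip":   {"limits/gap": MIP_GAP},    # SCIP parameter name
}
```

Fixing the gap matters for fair comparisons: otherwise a solver with a looser default gap can stop earlier with a worse incumbent and appear faster.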

Runtime of fastest solver
Effect of unit commitment (UC) on PyPSA models

pypsa-power+ely-ucgas (1-1h) is an MILP problem that adds UC as an extra model constraint to the power-only model pypsa-power+ely (1-1h) (LP problem). The LP version solves in a few seconds with both Gurobi and HiGHS, while the MILP version requires significantly more time. Gurobi maintains relatively strong performance in the UC case, whereas open-source solvers exhibit a more pronounced slowdown. All solvers are run with default settings except for a fixed relative MIP gap tolerance.

Runtime of fastest solver
Effect of transmission expansion and CO2 constraints on GenX models

With open-source solvers, all three GenX variants hit the time limit, providing no indication of the effect of transmission expansion optimization and CO2 constraints. When Gurobi is included, the models become solvable within reasonable time, but runtimes vary significantly: adding both transmission expansion and CO2 constraints leads to the longest solve time (2h 45min), while models with only one of the two features solve faster (around 1h 30min). This highlights how stacking structural constraints can materially increase computational complexity, even when the formulation remains linear.

Runtime of fastest solver
Effect of increasingly stringent CO2 constraints on TEMOA models

Increasing the stringency of CO2 constraints affects solver performance differently across solver families. Under open-source solvers, runtime increases when moving from the base case to constrained scenarios, with the NDC case being particularly challenging. When including all solvers, the models solve in a few minutes and runtimes increase only moderately as constraints become more stringent.

Runtime of fastest solver

Benchmark problems corresponding to representative model use-cases

All our technical dashboards can be filtered to the application domain or problem type of interest, and all plots and results are generated on the fly when you select a filter option. Since this may be overwhelming for some users, we highlight in the table below some filter combinations that correspond to representative problems arising from common use-cases of each modelling framework. Click any benchmark problem name to see more details about it and to view its results.

Framework | Problem Class | Application                         | MILP Features   | Realistic | Example
GenX      | LP            | Infrastructure & Capacity Expansion | None            | Yes       |
GenX      | MILP          | Infrastructure & Capacity Expansion | Unit commitment | Yes       |
PyPSA     | LP            | Infrastructure & Capacity Expansion | None            | Yes       |
TEMOA     | LP            | Infrastructure & Capacity Expansion | None            | Yes       |
TIMES     | LP            | Infrastructure & Capacity Expansion | None            | Yes       |
Switch    | LP            | Infrastructure & Capacity Expansion | None            | Yes       |

What benchmark problems do we have (and which are missing)?

This section breaks down our current benchmark set by modelling framework, problem type, application domain, and model features. It highlights the kinds of energy models that we test solvers on, but it also serves as a useful warning about the gaps in our collection.

Category | Values
Problem Classes | LP; MILP
Applications | Infrastructure & Capacity Expansion; Operational; Production cost modelling; DC Optimal Power Flow; Steady-state Optimal Power Flow; Resource Adequacy
MILP Features | None; Unit commitment; Transmission switching; Modularity; Binary transmission investment decisions; Piecewise fuel usage; Piecewise-linear part-load efficiency modeling; Piecewise efficiency; Modelling of fixed costs; NonConvex operation
Realistic | Realistic

For version 2 of our platform, we plan to have a public call for benchmarks to address the gaps above. In particular, we welcome benchmark problem contributions that cover:

  • “Application”: DC optimal power flow, operational, and production cost modelling analyses for most modelling frameworks.
  • “Time horizon”: multi-period analyses; these are particularly important, as problems with multiple time horizons are more challenging to solve.
  • “MILP features”: unit commitment for TIMES and TEMOA (though, admittedly, these frameworks do not focus particularly on power sector modelling); other MILP features, such as piecewise fuel usage, transmission switching, and modularity, are missing for most frameworks.
  • “Realistic”: realistic problems are missing for PowerModels and Sienna.
  • Large problem instances are also missing for many model frameworks; see the section What is feasible for open source solvers above.

Reach out to us if you'd like to contribute any benchmark problems that can fill the above gaps!