Stian Backe
Stian Backe

Reputation: 21

Explanation of difference in performance between CPLEX, Gurobi and FICO Xpress using interior point method (barrier) without crossover?

I am working with a very large (stochastic) LP with the barrier algorithm without crossover. My model is implemented in Pyomo, and I have tried to use CPLEX, Gurobi and FICO Xpress to solve it. The solver settings in Pyomo are as follows:

For CPLEX:

opt = SolverFactory("cplex")
opt.options["lpmethod"] = 4
opt.options["barrier crossover"] = -1
results = opt.solve(instance)

For Gurobi:

opt = SolverFactory('gurobi_persistent')
opt.set_instance(instance)
opt.options["Crossover"]=0
opt.options["Method"]=2
results = opt.solve(instance, load_solutions=True)
results = opt.load_vars()
results = opt.load_duals()

For FICO Xpress:

opt = SolverFactory("xpress")
opt.options["defaultAlg"] = 4
opt.options["crossover"] = 0
results = opt.solve(instance)

All solvers find a solution, but with (very) varying speed:

My question is: Why is there such a huge difference between the solvers (especially comparing CPLEX and Gurobi)? What is going on in the different barrier algorithms?

Summary of log for CPLEX:

IBM(R) ILOG(R) CPLEX(R) Interactive Optimizer 12.8.0.0
Read time = 52.61 sec. (3283.43 ticks)
Objective sense      : Minimize
Variables            : 17684371
Objective nonzeros   : 7976817
Linear constraints   : 26929486  [Less: 25202826,  Equal: 1726660]
  Nonzeros           : 83463204
  RHS nonzeros       :  621453
Tried aggregator 1 time.
DUAL formed by presolve
Aggregator has done 14545 substitutions...
LP Presolve eliminated 8512063 rows and 3459945 columns.
Reduced LP has 14209881 rows, 21009396 columns, and 61814653 nonzeros.
Presolve time = 148.20 sec. (209740.04 ticks)
Parallel mode: using up to 24 threads for barrier.
***NOTE: Found 243 dense columns.
Number of nonzeros in lower triangle of A*A' = 268787475
Elapsed ordering time = 17.45 sec. (10000.00 ticks)
Using Nested Dissection ordering
Total time for automatic ordering = 376.13 sec. (209058.23 ticks)
Summary statistics for Cholesky factor:
  Threads                   = 24
  Rows in Factor            = 14210124
  Integer space required    = 145889976
  Total non-zeros in factor = 12261481354
  Total FP ops to factor    = 39156639536792
Total time on 24 threads = 40177.89 sec. (62234738.71 ticks)
Barrier - Non-optimal:  Objective =  9.9825095360e+11
Solution time = 40177.90 sec.  Iterations = 258

Summary of log for Xpress:

FICO Xpress Solver 64bit v8.5.0:
Problem Statistics
    26929486 (      0 spare) rows
    17684371 (      0 spare) structural columns
    83463204 (      0 spare) non-zero elements
Presolved problem has:
  18426768 rows     14805105 cols     59881674 elements
Barrier cache sizes : L1=32K L2=20480K
Using AVX support
Cores per CPU (CORESPERCPU): 12
Barrier starts, using up to 24 threads, 12 cores
Matrix ordering - Dense cols.:   6776   NZ(L): 485925484   Flops: 273369311062
Barrier method finished in 7874 seconds
Optimal solution found
Barrier solved problem
  169 barrier iterations in 7981s
Final objective                         : 9.982464100682021e+11
Max primal violation      (abs / rel) : 1.612e-03 / 1.612e-03
Max dual violation        (abs / rel) : 1.837e+02 / 7.381e+01
Max complementarity viol. (abs / rel) : 1.837e+02 / 1.675e-07

Summary of log for Gurobi:

Gurobi 8.0.0: 
Optimize a model with 26929485 rows, 17684370 columns and 83463203 nonzeros
Coefficient statistics:
  Matrix range     [1e-05, 4e+00]
  Objective range  [2e+00, 8e+06]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e-01, 2e+08]
Presolve removed 8527789 rows and 2871939 columns
Presolve time: 53.79s
Presolved: 18401696 rows, 14812431 columns, 59808411 nonzeros
Ordering time: 29.38s
Barrier statistics:
 Dense cols : 4607
 AA' NZ     : 6.262e+07
 Factor NZ  : 5.722e+08 (roughly 18.0 GBytes of memory)
 Factor Ops : 3.292e+11 (roughly 4 seconds per iteration)
 Threads    : 12
Barrier performed 337 iterations in 4592.92 seconds
Sub-optimal termination - objective 9.98262837e+11

Upvotes: 1

Views: 2769

Answers (1)

Fred Fiand
Fred Fiand

Reputation: 21

With CPLEX the factorization is much more expensive.

#FLOPS CPLEX  ~3.92e+13
#FLOPS Xpress ~2.73e+11
#FLOPS Gurobi ~3.29e+11

I've seen such discrepancy a few times for large-scale LPs. In most of these cases CPLEX made a bad decision to pass the dual to the optimizer

DUAL formed by presolve

Setting the PreDual parameter to -1 prevents that.

Upvotes: 2

Related Questions