[Ipopt] large quadratic problem - linear solver performance

Mon Jan 17 22:13:41 EST 2022

Hi,

You might have to compare the Pardiso output for this problem with 
that of similar problems to see what is off (apart from the time).
Is the number of nonzeros in L+U (353891710) exceptionally large compared 
to the number in A (10792217)?
Maybe a different ordering strategy would help (pardiso_order)?
Also try disabling parallelization, just to see whether it has a 
negative effect.
Does setting pardiso_msglvl to a larger value give more output?
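
To put a number on the fill-in question above, here is a minimal Python sketch. The two counts are copied from the PARDISO log quoted below; the helper names and the assumption of 8 bytes per stored factor entry (double precision, values only, no index arrays) are illustrative and not something PARDISO reports.

```python
# Figures copied from the PARDISO log quoted in this thread.
NNZ_A = 10_792_217          # "number of non-zeros in A"
NNZ_FACTORS = 353_891_710   # "number of non-zeros in L+U"


def fill_in_ratio(nnz_factors: int, nnz_matrix: int) -> float:
    """Return how many times more nonzeros the factors hold than the matrix."""
    return nnz_factors / nnz_matrix


def factor_values_gib(nnz_factors: int, bytes_per_value: int = 8) -> float:
    """Rough storage for the factor values alone, in GiB.

    Assumption: one double (8 bytes) per nonzero; index arrays and
    solver workspace are not counted.
    """
    return nnz_factors * bytes_per_value / 2**30


ratio = fill_in_ratio(NNZ_FACTORS, NNZ_A)
gib = factor_values_gib(NNZ_FACTORS)

print(f"fill-in ratio (L+U vs A): {ratio:.1f}x")
print(f"factor values alone: about {gib:.2f} GiB")
```

Running the same calculation on the log of a similar problem that solves quickly would show whether a ratio of this size is unusual for this family of problems. The memory figure is only a lower bound on what the factorization needs.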

Stefan

On 1/9/22 20:09, Ivo Stefanov wrote:
> Hello,
> I am not sure if this is the right place to ask this as it seems to be purely related to the linear solver used, but let me give some background of what I am observing. I am trying to solve the following (quadratic) problem:
>
> Number of nonzeros in equality constraint Jacobian...:  8978916
> Number of nonzeros in inequality constraint Jacobian.:   378079
> Number of nonzeros in Lagrangian Hessian.............:     9206
>
> Total number of variables............................:   378696
>                      variables with only lower bounds:   338885
>                 variables with lower and upper bounds:    39177
>                      variables with only upper bounds:        0
> Total number of equality constraints.................:    39757
> Total number of inequality constraints...............:   338923
>         inequality constraints with only lower bounds:        0
>    inequality constraints with lower and upper bounds:        1
>         inequality constraints with only upper bounds:   338922
>
> It is decently large, but still a simple quadratic problem with linear constraints. I am solving many of those successfully (and fast) with the MKL PARDISO linear solver, but this is the biggest one so far. I am able to solve this in 2 ways:
>
> 1. Windows, IpOpt 3.7.0, MUMPS linear solver
>
> Number of Iterations....: 82
> Total CPU secs in IPOPT (w/o function evaluations)   =    521.732
> Total CPU secs in NLP function evaluations           =     20.381
>
>    I am normally not using this because it is a lot (sometimes orders of magnitude) slower than the PARDISO version, but this time I ran it for comparison purposes.
>    
> 2. Linux, IpOpt 3.12.12, MKL PARDISO linear solver (same laptop as above)
>
> The iterations are going the same way as in the previous setup, so I guess it will all be the same in the end, with the exception that it seems to take forever. I know it is not an apples-to-apples comparison, but so far I have never seen PARDISO being slower than MUMPS, let alone by such a margin. I ran with
> pardiso_msglvl and for the first iteration I saw that:
>
> === PARDISO: solving a symmetric indefinite system ===
> 1-based array indexing is turned ON
> PARDISO double precision computation is turned ON
> METIS algorithm at reorder step is turned ON
> Matching is turned ON
>
> Parallel Direct Factorization is running on 4 OpenMP
>
> number of equations:                  1096299
> number of non-zeros in A:             10792217
> number of non-zeros in A (%):         0.000898
> number of right-hand sides:           1
> number of columns for each panel:     112
> number of independent subgraphs:      0
> number of supernodes:                 748178
> size of largest supernode:            26102
> number of non-zeros in L:             353891709
> number of non-zeros in U:             1
> number of non-zeros in L+U:           353891710
>
> === PARDISO is running in In-Core mode, because iparam(60)=0 ===
>
> Times:
> ======
> Time spent in copying matrix to internal data structure (A to LU): 0.000001 s
> Time spent in factorization step (numfct)                        : 430.960081 s
> Time spent in allocation of internal data structures (malloc)    : 0.047239 s
> Time spent in additional calculations                            : 0.000080 s
> Total time spent                                                 : 431.007401 s
>
> So it spent nearly the same amount of time in the first factorization as the MUMPS setup spent on the whole problem, which definitely makes no sense.
> Subsequent iterations are faster, at about 300 seconds each, which is still very bad.
> I am wondering if I am missing something in the usage of PARDISO in that context, maybe an option that could help, or something that seems generally off? Does anyone have experience with a situation like that?
> Thank you very much!
>    Ivo
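
As a sanity check on the figures in the quoted message, the Python sketch below does two pieces of arithmetic. First, it checks that PARDISO's "number of equations" matches the size of the linear system under the assumption that Ipopt stacks the variables, one slack per inequality constraint, and one multiplier per equality and per inequality constraint. Second, it compares per-iteration cost, assuming the PARDISO run would need the same 82 iterations at roughly 300 s each. Both assumptions are mine and are not stated in the thread.

```python
# Problem sizes copied from the Ipopt summary quoted above.
N_VARIABLES = 378_696
N_EQUALITIES = 39_757
N_INEQUALITIES = 338_923
PARDISO_EQUATIONS = 1_096_299   # "number of equations" in the PARDISO log

# Assumed layout of the linear system: variables, one slack per inequality,
# then one multiplier per equality and per inequality constraint.
kkt_dimension = N_VARIABLES + N_INEQUALITIES + N_EQUALITIES + N_INEQUALITIES
print(f"expected system size: {kkt_dimension}, reported: {PARDISO_EQUATIONS}")

# Timing figures copied from the message.
MUMPS_TOTAL_S = 521.732         # whole solve, 82 iterations
ITERATIONS = 82
PARDISO_PER_ITERATION_S = 300.0 # "about 300 seconds each"

mumps_per_iteration_s = MUMPS_TOTAL_S / ITERATIONS
# Assumption: the PARDISO run needs the same number of iterations.
pardiso_projected_h = ITERATIONS * PARDISO_PER_ITERATION_S / 3600
slowdown = PARDISO_PER_ITERATION_S / mumps_per_iteration_s

print(f"MUMPS: about {mumps_per_iteration_s:.1f} s per iteration")
print(f"PARDISO projected total: about {pardiso_projected_h:.1f} h")
print(f"per-iteration slowdown: about {slowdown:.0f}x")
```

If the sizes agree, both solvers are most likely factorizing a system of the same dimension, which would point at ordering, fill-in, or threading rather than at the problem setup.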