[Ipopt] IPOPT performance (and impact of BLAS library)

Jon Herman jon.herman at colorado.edu
Mon Sep 8 17:24:15 EDT 2014


I've copied below the timing output from one of the moderately sized 
examples I've looked at, using ma27. I hadn't looked at this output 
before (thanks for the recommendation!), so I'll study it a little 
more, but any thoughts are welcome.
This solves in 130 iterations (142 objective/constraint evaluations, 131 
gradient evaluations), so about 0.2 CPU seconds per iteration, or roughly 
0.05 wall-clock seconds per iteration (this is running on 4 cores).
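
As a quick check of those per-iteration figures, here is a minimal sketch of the arithmetic, using the OverallAlgorithm totals from the timing output further down. The variable names are only for illustration and are not Ipopt's.

```python
# Totals taken from the OverallAlgorithm line of the timing output (seconds).
cpu_total = 26.471
wall_total = 6.861
iterations = 130

cpu_per_iter = cpu_total / iterations
wall_per_iter = wall_total / iterations
# CPU time divided by wall time: a rough indicator of how many cores were busy.
cpu_to_wall = cpu_total / wall_total

print(f"CPU s/iter:  {cpu_per_iter:.3f}")
print(f"wall s/iter: {wall_per_iter:.3f}")
print(f"CPU/wall:    {cpu_to_wall:.2f}")
```

The CPU-to-wall ratio comes out a little under 4, which is consistent with the 4 cores mentioned, although it does not by itself show where the threads are being used.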

Using metis ordering doesn't seem to significantly affect performance. I 
haven't yet tried ma86 or ma97 with OpenMP enabled; I'll give that a 
shot.
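
A minimal sketch of what that configuration could look like: build the contents of an ipopt.opt options file and set OMP_NUM_THREADS before the solve. The option names (linear_solver, ma86_order, print_timing_statistics) are standard Ipopt options. That a given PyIPOPT build picks up ipopt.opt from the working directory, and the thread count of 4, are assumptions; the helper names render_options and write_options_file are hypothetical.

```python
import os
from pathlib import Path

# Assumption: 4 threads, matching the 4 cores mentioned above.
# Set this before the OpenMP runtime used by the solver is initialized.
os.environ["OMP_NUM_THREADS"] = "4"

# Standard Ipopt option names; the values are one possible choice.
options = {
    "linear_solver": "ma86",
    "ma86_order": "metis",
    "print_timing_statistics": "yes",
}

def render_options(options):
    """Format options as 'name value' lines, the layout Ipopt option files use."""
    return "".join(f"{name} {value}\n" for name, value in options.items())

def write_options_file(path, options):
    """Write the rendered options to the given path (hypothetical helper)."""
    Path(path).write_text(render_options(options))

options_text = render_options(options)
print(options_text)
```

Calling write_options_file("ipopt.opt", options) in the directory the solve runs from would then produce the file. For ma97 the corresponding ordering option is ma97_order.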

For Tony Kelman: what do you mean by "unless my function evaluations are 
implemented inefficiently"? At this point they account for a minority of 
the run-time (about 9.3 of 26.5 CPU seconds in the output below), so 
inefficiency there does not seem to be the main problem. Or are you 
getting at something else?

Thank you for the quick responses so far!

Timing Statistics:

OverallAlgorithm....................:     26.471 (sys:      0.922 wall:      6.861)
  PrintProblemStatistics.............:      0.001 (sys:      0.000 wall:      0.000)
  InitializeIterates.................:      0.175 (sys:      0.004 wall:      0.062)
  UpdateHessian......................:      0.467 (sys:      0.013 wall:      0.120)
  OutputIteration....................:      0.005 (sys:      0.001 wall:      0.002)
  UpdateBarrierParameter.............:      8.311 (sys:      0.309 wall:      2.153)
  ComputeSearchDirection.............:      6.042 (sys:      0.191 wall:      1.557)
  ComputeAcceptableTrialPoint........:      1.658 (sys:      0.059 wall:      0.429)
  AcceptTrialPoint...................:      1.943 (sys:      0.063 wall:      0.501)
  CheckConvergence...................:      7.860 (sys:      0.282 wall:      2.034)
PDSystemSolverTotal.................:     12.647 (sys:      0.417 wall:      3.264)
  PDSystemSolverSolveOnce............:     11.446 (sys:      0.378 wall:      2.954)
  ComputeResiduals...................:      0.997 (sys:      0.030 wall:      0.257)
  StdAugSystemSolverMultiSolve.......:     10.953 (sys:      0.379 wall:      2.831)
  LinearSystemScaling................:      0.000 (sys:      0.000 wall:      0.000)
  LinearSystemSymbolicFactorization..:      0.018 (sys:      0.000 wall:      0.005)
  LinearSystemFactorization..........:      5.611 (sys:      0.195 wall:      1.451)
  LinearSystemBackSolve..............:      4.692 (sys:      0.169 wall:      1.215)
  LinearSystemStructureConverter.....:      0.000 (sys:      0.000 wall:      0.000)
   LinearSystemStructureConverterInit:      0.000 (sys:      0.000 wall:      0.000)
QualityFunctionSearch...............:      1.581 (sys:      0.077 wall:      0.414)
TryCorrector........................:      0.000 (sys:      0.000 wall:      0.000)
Task1...............................:      0.363 (sys:      0.018 wall:      0.096)
Task2...............................:      0.567 (sys:      0.022 wall:      0.147)
Task3...............................:      0.076 (sys:      0.005 wall:      0.020)
Task4...............................:      0.000 (sys:      0.000 wall:      0.000)
Task5...............................:      0.507 (sys:      0.020 wall:      0.132)
Function Evaluations................:      9.348 (sys:      0.328 wall:      2.417)
  Objective function.................:      0.240 (sys:      0.009 wall:      0.062)
  Objective function gradient........:      4.316 (sys:      0.150 wall:      1.116)
  Equality constraints...............:      0.316 (sys:      0.012 wall:      0.082)
  Inequality constraints.............:      0.000 (sys:      0.000 wall:      0.000)
  Equality constraint Jacobian.......:      4.477 (sys:      0.157 wall:      1.157)
  Inequality constraint Jacobian.....:      0.000 (sys:      0.000 wall:      0.000)
  Lagrangian Hessian.................:      0.000 (sys:      0.000 wall:      0.000)
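
To make the breakdown easier to read, here is a minimal sketch that parses lines of this timing output and reports each task's share of the OverallAlgorithm CPU time. The regular expression and the helper name parse_timing are hypothetical, not part of Ipopt; the sample reuses three lines from the output above, one of them wrapped the way a mail client might wrap it.

```python
import re

# Matches "<name>....: <cpu> (sys: <sys> wall: <wall>)".
# \s+ also matches newlines, so entries wrapped across two lines still parse.
TIMING_RE = re.compile(
    r"([A-Za-z][A-Za-z0-9 ]*?)\.*:\s+([\d.]+)\s+"
    r"\(sys:\s+([\d.]+)\s+wall:\s+([\d.]+)\)"
)

def parse_timing(text):
    """Return {task name: (cpu, sys, wall)} with times in seconds."""
    return {
        name.strip(): (float(cpu), float(sys_), float(wall))
        for name, cpu, sys_, wall in TIMING_RE.findall(text)
    }

sample = """
OverallAlgorithm....................:     26.471 (sys:      0.922 wall:      6.861)
PDSystemSolverTotal.................:     12.647 (sys:      0.417
wall:      3.264)
Function Evaluations................:      9.348 (sys:      0.328 wall:      2.417)
"""

timings = parse_timing(sample)
total_cpu = timings["OverallAlgorithm"][0]
for name, (cpu, _sys, _wall) in timings.items():
    share = 100 * cpu / total_cpu
    print(f"{name:<22s} {cpu:8.3f} s  {share:5.1f}%")
```

On these numbers the function evaluations are roughly 35% of the CPU time and PDSystemSolverTotal roughly 48%. The entries appear to overlap (the linear-solver and function-evaluation times are spent inside the algorithm steps listed first), so the shares need not sum to 100%.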



On 09/08/2014 03:02 PM, Greg Horn wrote:
> My usual answer to increasing efficiency is using HSL (ma86/ma97) with 
> metis ordering and openmp. How expensive are your function 
> evaluations? What is your normal time per iteration, and how many 
> iterations does it take to solve? What sort of problem are you solving?
>
> On Mon, Sep 8, 2014 at 10:53 PM, Jon Herman <jon.herman at colorado.edu 
> <mailto:jon.herman at colorado.edu>> wrote:
>
>     Hello,
>
>     I am working on implementing IPOPT in a piece of software that has
>     a need for very good performance. Unfortunately, it seems that
>     right now my total run-time is about 80% in IPOPT (that number
>     excludes the function evaluations, as well as any time setting up
>     the problem, etc.). For me to put IPOPT to good use, I'm hoping to
>     make it run more efficiently, and even out the workload between
>     IPOPT and the function evaluations, preferably shifting the work
>     to the function evaluations as much as possible.
>
>     Originally, I was using the BLAS/LAPACK that can be installed with
>     IPOPT. In an attempt to improve performance, I switched to
>     OpenBLAS. To my confusion, performance did not change at all. This
>     is leading me to believe that something other than the BLAS
>     library is dominating the cost. (I am certain I properly removed
>     the old libraries when switching BLAS implementation) I'm not sure
>     how to effectively narrow down where IPOPT is spending most of
>     its time, and how to subsequently improve that performance.
>
>     I've made sure to try the ma27, ma57, ma77, ma86, ma97, and mumps
>     solvers. Performance varies among them, but 80% of the time spent
>     in IPOPT is the best result I achieve (which is typically with
>     ma27 or ma57, the other solvers are closer to 90%). I've also made
>     sure to try problems as small as 500 variables and 400
>     constraints, to as large as 110 000 variables and 80 000
>     constraints (and many points in between those extremes).
>     Performance is very consistent across that range (for a given
>     solver), again regardless of the BLAS library being used. I've
>     been doing this using the quasi-Newton approximation for the
>     Hessian, which I was hoping to get away with, but I suppose this
>     may put a lot of work into IPOPT's side of the court. I'll also
>     mention that I'm calling IPOPT through the PyIPOPT module (though
>     I'm expecting this to create only a small, fixed overhead).
>
>     If you have any thoughts on why IPOPT might be hogging such a
>     large fraction of my total run-time, and/or how I could improve
>     this (or determining if this might be entirely unavoidable), I
>     would greatly appreciate it! (and of course I'd be happy to
>     provide additional information if that would be useful)
>
>     Best regards,
>
>     Jon
>
>     _______________________________________________
>     Ipopt mailing list
>     Ipopt at list.coin-or.org <mailto:Ipopt at list.coin-or.org>
>     http://list.coin-or.org/mailman/listinfo/ipopt
>
>
