[Ipopt] IPOPT performance (and impact of BLAS library)

Jonathan Hogg jonathan.hogg at stfc.ac.uk
Tue Sep 9 05:08:28 EDT 2014


As has been pointed out, your function evaluations are expensive.
The 6.8 seconds of wallclock time in Ipopt below break down roughly as:
2.4 in function evaluations
1.5 in the sparse linear factorization
1.2 in the sparse linear solve
1.7 elsewhere
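
For reference, the breakdown above can be reproduced from the wall-clock
column of the quoted timing output. This is only a sketch of the
arithmetic: the variable names are mine, and the figures are copied by
hand from the "Function Evaluations", "LinearSystemFactorization" and
"LinearSystemBackSolve" lines. With the unrounded figures the remainder
comes out nearer 1.8 than 1.7.

```python
# Wall-clock seconds copied from the quoted "Timing Statistics" output.
overall = 6.861          # OverallAlgorithm
function_evals = 2.417   # Function Evaluations
factorization = 1.451    # LinearSystemFactorization
back_solve = 1.215       # LinearSystemBackSolve

# Everything not accounted for by the three items above.
elsewhere = overall - (function_evals + factorization + back_solve)

breakdown = {
    "function evaluations": function_evals,
    "sparse factorization": factorization,
    "sparse solve": back_solve,
    "elsewhere": elsewhere,
}

# Print each item in seconds and as a share of the total wall-clock time.
for name, secs in breakdown.items():
    print(f"{name:22s} {secs:6.3f} s  {100 * secs / overall:5.1f}%")
```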

An observation is that you're spending almost as much time in the solve 
as in the factorization, and throwing more threads at that is unlikely 
to help as it's constrained by memory throughput: we've found in the 
past that a single core is often capable of saturating the available 
bandwidth, and adding more doesn't get you much. You need to tackle the 
memory usage at an algorithmic level by fiddling with the ordering (try 
both amd and metis) and the supernode amalgamation strategy (try doubling 
or halving nemin [ma97_nemin in ipopt naming I think]). If you're getting 
a lot of delayed pivots reported you can try fiddling with scaling 
strategies too.
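
One way to organise that experiment is to generate one options file per
combination and time each run. The sketch below rests on assumptions:
ma97_order is, as far as I know, the Ipopt name for the ordering
option, the base nemin of 8 is what I believe the default to be, and
the helper names are made up for illustration. Check the option names
against your Ipopt build before relying on them.

```python
from itertools import product


def ma97_option_sets(base_nemin=8):
    """Yield option dicts covering amd/metis with nemin halved, kept, doubled.

    base_nemin=8 is an assumed default; pass your own value if it differs.
    """
    orderings = ["amd", "metis"]
    nemins = [max(1, base_nemin // 2), base_nemin, base_nemin * 2]
    for order, nemin in product(orderings, nemins):
        yield {
            "linear_solver": "ma97",
            "ma97_order": order,   # assumed Ipopt option name
            "ma97_nemin": nemin,
        }


def to_options_text(options):
    """Render one option set as 'name value' lines for an options file."""
    return "".join(f"{name} {value}\n" for name, value in options.items())


if __name__ == "__main__":
    # Print every variant; save one as ipopt.opt before each timed run.
    for options in ma97_option_sets():
        print(to_options_text(options))
```

Each printed block can be saved as ipopt.opt in the working directory
before a run, and the resulting timing statistics compared.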

Solver-wise, I'd expect ma27 to win on small problems and 
ma97/metis/threading to win on big ones. If you use the 
ma97_dump_matrix option to output .rb files I'm happy to take a quick 
look at a few (for a typical sized problem, go for an iteration at the 
start, middle and end of factorization) and advise on parameters that 
might help.

Regards,

Jonathan.

On 08/09/14 22:24, Jon Herman wrote:
> I've copied below the timing output from one of the moderately sized 
> examples I've looked at, using ma27. I haven't taken a look at these 
> outputs before (thanks for the recommendation!), so I'll study this a 
> little more, but any thoughts are welcome.
> This solves in 130 iterations (142 objective/constraint evaluations, 
> 131 gradient evaluations), so about 0.2 CPU seconds per iteration 
> (this is running on 4 cores).
>
> Using metis ordering doesn't seem to significantly affect performance. 
> I haven't tried using ma86 or ma97 with OpenMP enabled, I'll go and 
> give that a shot.
>
> For Tony Kelman: what do you mean by "unless my function evaluations 
> are implemented inefficiently"? At this point they are a minority of 
> the run-time, so any efficiency there does not seem to be the problem? 
> Or are you getting at something else?
>
> Thank you for the quick responses so far!
>
> Timing Statistics:
>
> OverallAlgorithm....................:     26.471 (sys:      0.922 wall:      6.861)
>  PrintProblemStatistics.............:      0.001 (sys:      0.000 wall:      0.000)
>  InitializeIterates.................:      0.175 (sys:      0.004 wall:      0.062)
>  UpdateHessian......................:      0.467 (sys:      0.013 wall:      0.120)
>  OutputIteration....................:      0.005 (sys:      0.001 wall:      0.002)
>  UpdateBarrierParameter.............:      8.311 (sys:      0.309 wall:      2.153)
>  ComputeSearchDirection.............:      6.042 (sys:      0.191 wall:      1.557)
>  ComputeAcceptableTrialPoint........:      1.658 (sys:      0.059 wall:      0.429)
>  AcceptTrialPoint...................:      1.943 (sys:      0.063 wall:      0.501)
>  CheckConvergence...................:      7.860 (sys:      0.282 wall:      2.034)
> PDSystemSolverTotal.................:     12.647 (sys:      0.417 wall:      3.264)
>  PDSystemSolverSolveOnce............:     11.446 (sys:      0.378 wall:      2.954)
>  ComputeResiduals...................:      0.997 (sys:      0.030 wall:      0.257)
>  StdAugSystemSolverMultiSolve.......:     10.953 (sys:      0.379 wall:      2.831)
>  LinearSystemScaling................:      0.000 (sys:      0.000 wall:      0.000)
>  LinearSystemSymbolicFactorization..:      0.018 (sys:      0.000 wall:      0.005)
>  LinearSystemFactorization..........:      5.611 (sys:      0.195 wall:      1.451)
>  LinearSystemBackSolve..............:      4.692 (sys:      0.169 wall:      1.215)
>  LinearSystemStructureConverter.....:      0.000 (sys:      0.000 wall:      0.000)
>   LinearSystemStructureConverterInit:      0.000 (sys:      0.000 wall:      0.000)
> QualityFunctionSearch...............:      1.581 (sys:      0.077 wall:      0.414)
> TryCorrector........................:      0.000 (sys:      0.000 wall:      0.000)
> Task1...............................:      0.363 (sys:      0.018 wall:      0.096)
> Task2...............................:      0.567 (sys:      0.022 wall:      0.147)
> Task3...............................:      0.076 (sys:      0.005 wall:      0.020)
> Task4...............................:      0.000 (sys:      0.000 wall:      0.000)
> Task5...............................:      0.507 (sys:      0.020 wall:      0.132)
> Function Evaluations................:      9.348 (sys:      0.328 wall:      2.417)
>  Objective function.................:      0.240 (sys:      0.009 wall:      0.062)
>  Objective function gradient........:      4.316 (sys:      0.150 wall:      1.116)
>  Equality constraints...............:      0.316 (sys:      0.012 wall:      0.082)
>  Inequality constraints.............:      0.000 (sys:      0.000 wall:      0.000)
>  Equality constraint Jacobian.......:      4.477 (sys:      0.157 wall:      1.157)
>  Inequality constraint Jacobian.....:      0.000 (sys:      0.000 wall:      0.000)
>  Lagrangian Hessian.................:      0.000 (sys:      0.000 wall:      0.000)
>
>
>
> On 09/08/2014 03:02 PM, Greg Horn wrote:
>> My usual answer to increasing efficiency is using HSL (ma86/ma97) 
>> with metis ordering and openmp. How expensive are your function 
>> evaluations? What is your normal time per iteration, and how many 
>> iterations does it take to solve? What sort of problem are you solving?
>>
>> On Mon, Sep 8, 2014 at 10:53 PM, Jon Herman <jon.herman at colorado.edu>
>> wrote:
>>
>>     Hello,
>>
>>     I am working on implementing IPOPT in a piece of software that
>>     has a need for very good performance. Unfortunately, it seems
>>     that right now my total run-time is about 80% in IPOPT (that
>>     number excludes the function evaluations, as well as any time
>>     setting up the problem, etc.). For me to put IPOPT to good use,
>>     I'm hoping to make it run more efficiently, and even out the
>>     workload between IPOPT and the function evaluations, preferably
>>     shifting the work to the function evaluations as much as possible.
>>
>>     Originally, I was using the BLAS/LAPACK that can be installed
>>     with IPOPT. In an attempt to improve performance, I switched to
>>     OpenBLAS. To my confusion, performance did not change at all.
>>     This is leading me to believe that something other than the BLAS
>>     library is dominating the cost. (I am certain I properly removed
>>     the old libraries when switching BLAS implementation) I'm not
>>     sure how to effectively narrow down where IPOPT is spending most
>>     of its time, and how to subsequently improve that performance.
>>
>>     I've made sure to try the ma27, ma57, ma77, ma86, ma97, and mumps
>>     solvers. Performance varies among them, but 80% of the time spent
>>     in IPOPT is the best result I achieve (which is typically with
>>     ma27 or ma57, the other solvers are closer to 90%). I've also
>>     made sure to try problems as small as 500 variables and 400
>>     constraints, to as large as 110 000 variables and 80 000
>>     constraints (and many points in between those extremes).
>>     Performance is very consistent across that range (for a given
>>     solver), again regardless of the BLAS library being used. I've
>>     been doing this using the quasi-Newton approximation for the
>>     Hessian, which I was hoping to get away with, but I suppose this
>>     may put a lot of work into IPOPT's side of the court. I'll also
>>     mention that I'm calling IPOPT through the PyIPOPT module (though
>>     I'm expecting this to create only a small, fixed overhead).
>>
>>     If you have any thoughts on why IPOPT might be hogging such a
>>     large fraction of my total run-time, and/or how I could improve
>>     this (or determining if this might be entirely unavoidable), I
>>     would greatly appreciate it! (and of course I'd be happy to
>>     provide additional information if that would be useful)
>>
>>     Best regards,
>>
>>     Jon
>>
>>     _______________________________________________
>>     Ipopt mailing list
>>     Ipopt at list.coin-or.org
>>     http://list.coin-or.org/mailman/listinfo/ipopt
>>
>>

