[Ipopt] IPOPT performance (and impact of BLAS library)
Joel Andersson
j.a.e.andersson at gmail.com
Tue Sep 9 06:38:39 EDT 2014
If you're using IPOPT via a modelling system with good support for
algorithmic differentiation (such as our tool CasADi, available for
Python), you should be able to speed up function evaluation relative to
hand-written code (especially if you generate C code for the
Jacobian/Hessian).
Also, if you use an exact Hessian (the default in CasADi), you might see a
further speedup, thanks to fewer iterations, no BFGS update logic, and a
sparser linear system.
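The advantage of algorithmic differentiation over finite differences can be illustrated with a minimal forward-mode sketch in plain Python. This is purely illustrative and is not CasADi's API (CasADi instead builds sparse expression graphs and can generate efficient C code); the `Dual` class and `f` below are made up for the example:

```python
# Minimal forward-mode algorithmic differentiation: every value carries
# a derivative along with it, so gradients come out exact (to machine
# precision) instead of being approximated by finite differences.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val = val  # function value
        self.dot = dot  # derivative w.r.t. the seeded input

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def f(x, y):
    # example objective: f(x, y) = x*y + x*x
    return x * y + x * x

# derivative of f w.r.t. x at (3, 4): seed x with dot = 1
x = Dual(3.0, 1.0)
y = Dual(4.0, 0.0)
out = f(x, y)
print(out.val)  # 21.0
print(out.dot)  # df/dx = y + 2x = 10.0
```

The value and the derivative come out of one pass through the same code, which is why a tool that does this (and generates C for it) can beat separately hand-coded derivative routines.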
Best regards,
Joel
2014-09-09 11:08 GMT+02:00 Jonathan Hogg <jonathan.hogg at stfc.ac.uk>:
> As has been pointed out - your function evaluations are expensive.
> Of the 6.8 seconds of wallclock time in Ipopt below, the breakdown is:
> 2.4 in function evaluations
> 1.5 in sparse linear factor
> 1.2 in sparse linear solve
> 1.7 elsewhere
>
> An observation is that you're spending almost as much time in the solve as
> in the factorization, and throwing more threads at that is unlikely to help
> as it's constrained by memory throughput - we've found in the past that a
> single core is often capable of saturating the available bandwidth, and
> adding more doesn't get you much. You need to tackle the memory usage at an
> algorithmic level by fiddling with the ordering (try both amd and metis)
> and the supernode amalgamation strategy (try doubling or halving nemin
> [ma97_nemin in Ipopt naming, I think]). If you're getting a lot of delayed
> pivots reported, you can try fiddling with scaling strategies too.
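The tuning knobs mentioned above map onto an ipopt.opt fragment along these lines. The option names and values are written from memory of recent Ipopt releases; verify them against the Ipopt options documentation for your build (in particular, try both amd and metis for ma97_order, try halving and doubling ma97_nemin from its default, and experiment with ma97_scaling if many delayed pivots are reported):

```
linear_solver ma97
ma97_order metis
ma97_nemin 16
ma97_scaling dynamic
```

Ipopt reads an ipopt.opt file from the working directory at startup, so these can be changed without recompiling anything.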
>
> Solver-wise, I'd expect ma27 to win on small problems
> and ma97/metis/threading to win on big ones. If you use the
> ma97_dump_matrix option to output .rb files I'm happy to take a quick look
> at a few (for a typical sized problem, go for an iteration at the start,
> middle and end of factorization) and advise on parameters that might help.
>
> Regards,
>
> Jonathan.
>
>
> On 08/09/14 22:24, Jon Herman wrote:
>
> I've copied below the timing output from one of the moderately sized
> examples I've looked at, using ma27. I haven't taken a look at these
> outputs before (thanks for the recommendation!), so I'll study this a
> little more, but any thoughts are welcome.
> This solves in 130 iterations (142 objective/constraint evaluations, 131
> gradient evaluations), so about 0.2 CPU seconds per iteration (this is
> running on 4 cores).
>
> Using metis ordering doesn't seem to significantly affect performance. I
> haven't tried using ma86 or ma97 with OpenMP enabled, I'll go and give that
> a shot.
>
> For Tony Kelman: what do you mean by "unless my function evaluations are
> implemented inefficiently"? At this point they are a minority of the
> run-time, so any inefficiency there does not seem to be the problem? Or are
> you getting at something else?
>
> Thank you for the quick responses so far!
>
> Timing Statistics:
>
> OverallAlgorithm....................: 26.471 (sys: 0.922 wall: 6.861)
> PrintProblemStatistics.............: 0.001 (sys: 0.000 wall: 0.000)
> InitializeIterates.................: 0.175 (sys: 0.004 wall: 0.062)
> UpdateHessian......................: 0.467 (sys: 0.013 wall: 0.120)
> OutputIteration....................: 0.005 (sys: 0.001 wall: 0.002)
> UpdateBarrierParameter.............: 8.311 (sys: 0.309 wall: 2.153)
> ComputeSearchDirection.............: 6.042 (sys: 0.191 wall: 1.557)
> ComputeAcceptableTrialPoint........: 1.658 (sys: 0.059 wall: 0.429)
> AcceptTrialPoint...................: 1.943 (sys: 0.063 wall: 0.501)
> CheckConvergence...................: 7.860 (sys: 0.282 wall: 2.034)
> PDSystemSolverTotal.................: 12.647 (sys: 0.417 wall: 3.264)
> PDSystemSolverSolveOnce............: 11.446 (sys: 0.378 wall: 2.954)
> ComputeResiduals...................: 0.997 (sys: 0.030 wall: 0.257)
> StdAugSystemSolverMultiSolve.......: 10.953 (sys: 0.379 wall: 2.831)
> LinearSystemScaling................: 0.000 (sys: 0.000 wall: 0.000)
> LinearSystemSymbolicFactorization..: 0.018 (sys: 0.000 wall: 0.005)
> LinearSystemFactorization..........: 5.611 (sys: 0.195 wall: 1.451)
> LinearSystemBackSolve..............: 4.692 (sys: 0.169 wall: 1.215)
> LinearSystemStructureConverter.....: 0.000 (sys: 0.000 wall: 0.000)
> LinearSystemStructureConverterInit: 0.000 (sys: 0.000 wall: 0.000)
> QualityFunctionSearch...............: 1.581 (sys: 0.077 wall: 0.414)
> TryCorrector........................: 0.000 (sys: 0.000 wall: 0.000)
> Task1...............................: 0.363 (sys: 0.018 wall: 0.096)
> Task2...............................: 0.567 (sys: 0.022 wall: 0.147)
> Task3...............................: 0.076 (sys: 0.005 wall: 0.020)
> Task4...............................: 0.000 (sys: 0.000 wall: 0.000)
> Task5...............................: 0.507 (sys: 0.020 wall: 0.132)
> Function Evaluations................: 9.348 (sys: 0.328 wall: 2.417)
> Objective function.................: 0.240 (sys: 0.009 wall: 0.062)
> Objective function gradient........: 4.316 (sys: 0.150 wall: 1.116)
> Equality constraints...............: 0.316 (sys: 0.012 wall: 0.082)
> Inequality constraints.............: 0.000 (sys: 0.000 wall: 0.000)
> Equality constraint Jacobian.......: 4.477 (sys: 0.157 wall: 1.157)
> Inequality constraint Jacobian.....: 0.000 (sys: 0.000 wall: 0.000)
> Lagrangian Hessian.................: 0.000 (sys: 0.000 wall: 0.000)
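As a quick sanity check on the per-iteration figure quoted above, dividing the totals from the timing statistics by the 130 iterations:

```python
# Totals taken from the Ipopt timing statistics above.
cpu_total = 26.471   # user CPU seconds (summed across the 4 cores)
wall_total = 6.861   # wall-clock seconds
iterations = 130

print(round(cpu_total / iterations, 3))   # 0.204, i.e. "about 0.2 CPU s/iter"
print(round(wall_total / iterations, 3))  # 0.053 wall s/iter on 4 cores
```

The gap between the CPU and wall figures is simply the 4-core parallelism (plus some system time), which is worth keeping in mind when comparing the two columns.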
>
>
>
> On 09/08/2014 03:02 PM, Greg Horn wrote:
>
> My usual answer for increasing efficiency is to use HSL (ma86/ma97) with
> metis ordering and OpenMP. How expensive are your function evaluations?
> What is your normal time per iteration, and how many iterations does it
> take to solve? What sort of problem are you solving?
>
> On Mon, Sep 8, 2014 at 10:53 PM, Jon Herman <jon.herman at colorado.edu>
> wrote:
>
>> Hello,
>>
>> I am working on implementing IPOPT in a piece of software that has a need
>> for very good performance. Unfortunately, it seems that right now my total
>> run-time is about 80% in IPOPT (that number excludes the function
>> evaluations, as well as any time setting up the problem, etc.). For me to
>> put IPOPT to good use, I'm hoping to make it run more efficiently, and even
>> out the workload between IPOPT and the function evaluations, preferably
>> shifting the work to the function evaluations as much as possible.
>>
>> Originally, I was using the BLAS/LAPACK that can be installed with IPOPT.
>> In an attempt to improve performance, I switched to OpenBLAS. To my
>> confusion, performance did not change at all. This is leading me to believe
>> that something other than the BLAS library is dominating the cost. (I am
>> certain I properly removed the old libraries when switching BLAS
>> implementations.) I'm not sure how to effectively narrow down where IPOPT
>> is spending most of its time, and how to subsequently improve that
>> performance.
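Besides Ipopt's own timing report, one generic way to narrow this down from the Python side is to wrap each callback in a timer before handing it to the solver wrapper, so the share of time spent in user code is measured directly. A minimal sketch (the `timed` helper and the example `objective` are hypothetical, not part of PyIPOPT):

```python
import time
from collections import defaultdict

# Cumulative wall time and call counts, keyed by callback name.
timings = defaultdict(float)
calls = defaultdict(int)

def timed(name, fn):
    # Wrap a solver callback so its cumulative run time is recorded.
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            timings[name] += time.perf_counter() - t0
            calls[name] += 1
    return wrapper

# Example: wrap an objective before registering it with the solver.
def objective(x):
    return sum(xi * xi for xi in x)

objective = timed("objective", objective)
objective([1.0, 2.0, 3.0])

for name in timings:
    print(f"{name}: {calls[name]} calls, {timings[name]:.6f} s total")
```

Comparing these totals against the solver's overall wall time makes the "80% inside IPOPT" claim directly verifiable for each callback.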
>>
>> I've made sure to try the ma27, ma57, ma77, ma86, ma97, and mumps
>> solvers. Performance varies among them, but 80% of the time spent in IPOPT
>> is the best result I achieve (typically with ma27 or ma57; the other
>> solvers are closer to 90%). I've also made sure to try problems as
>> small as 500 variables and 400 constraints, to as large as 110 000
>> variables and 80 000 constraints (and many points in between those
>> extremes). Performance is very consistent across that range (for a given
>> solver), again regardless of the BLAS library being used. I've been doing
>> this using the quasi-Newton approximation for the Hessian, which I was
>> hoping to get away with, but I suppose this may put a lot of work into
>> IPOPT's side of the court. I'll also mention that I'm calling IPOPT through
>> the PyIPOPT module (though I'm expecting this to create only a small, fixed
>> overhead).
>>
>> If you have any thoughts on why IPOPT might be hogging such a large
>> fraction of my total run-time, and/or how I could improve this (or
>> determining if this might be entirely unavoidable), I would greatly
>> appreciate it! (and of course I'd be happy to provide additional
>> information if that would be useful)
>>
>> Best regards,
>>
>> Jon
>>
>> _______________________________________________
>> Ipopt mailing list
>> Ipopt at list.coin-or.org
>> http://list.coin-or.org/mailman/listinfo/ipopt
>>
>>
>
>
>
>
--
Joel Andersson, PhD
Ptge. Busquets 11-13, atico 3
E-08940 Cornella de Llobregat, Spain
Home: +34-93-6034011
Mobile: +32-486-672874 (Belgium) / +34-63-4408800 (Spain) / +46-707-360512
(Sweden)