[Coin-ipopt] Convergence and BFGS Hessian computation (fwd)

Mon Oct 9 11:19:35 EDT 2006

Forwarded on request of sender - I deleted attachment though (too large)

---------- Forwarded message ----------
Date: Sat, 07 Oct 2006 16:15:18 +0900
From: Sylvain Miossec <SylvainMiossec at yahoo.fr>
To: Carl Damon Laird <claird at andrew.cmu.edu>,
     Andreas Waechter <andreasw at watson.ibm.com>
Subject: Re: [Coin-ipopt] Convergence and BFGS Hessian computation

Hi Carl, Hi Andreas,

thank you very much for your detailed responses.

I cannot use my work e-mail adress, so I just send this mail to you. If you 
think interesting, can you forward it on the coin-ipopt mailing list ?

I hadn't time to test all you proposed me. I will not be able to continue to 
work on it next week, so I send you my current achievements.

I increased the maximum number of iterations and multiply the objective by 100, 
set the L-BFGS history to 20, reset the tolerances to their default values. I 
also looked at the values of the jacobian of the constraints. The maximum 
values were about 300, so I scaled the optimization parameters to obtain 
maximum values of about 0.3.
Those modification made the objective to go down a bit faster. The scaling of 
the optimization variables allowed to decrease the primal infeasibility to 
about 1e-3. But I still need a lot iterations (almost 2000) so that the 
criteria reaches the one obtained with the L-BFGS history of 131 after 300 
iterations.

I also set the 'recalc_y_feas_tol' to 1 and the dual infeasibility tolerance 
decrease to almost 1e-2.

I send you a run I obtained with a good result.

If I increase the stopping tolerances, I think I can get the optimization to 
finish earlier, but it would not be with the minimum criteria, since the values 
of lowest infeasibilities are not well correlated with the lowest criteria (I 
can get the lowest values of feasibilities and the criteria is later decreasing 
again).

So it seems to me the convergence when we get closer to the optimum becomes 
very slow. Perhaps my computations are not precise enough. I think I still have 
to look at the scaling of the optimization variables independently, and also of 
the constraints. And I don't know what are very nonlinear constraints. But I 
suppose I have quite nonlinear constraints.

Some more details on the way I do the motion optimization. I use a method based 
on the inverse model of the robot. That is, I parameterize the trajectory of 
the robot and compute the control needed. My criteria is the energy consumption 
of the robot. It is then an integral. I use trapezoidal method for the moment, 
so my gradient computation should not be approximated (just to the numerical 
precision). The dynamic model of the robot involve many recursive computations, 
which perhaps increases numerical errors.
I don't know a lot about AMPL and the automatic differentiation, but I think I 
cannot use it with my program as it is. I could get only analytic expressions, 
but not easily.

Best regards

Sylvain Miossec

Carl Damon Laird wrote:
> Hi Sylvain Miossec, glad to hear IPOPT has been working for you. Sorry for 
> the long response, but there are a few things to discuss here.
> 
> As for the convergence criteria, you are using:
>> tol : 1e-3
>> acceptable tol : 3e-3
>> expect_infeasible_problem : yes
>> dual_inf_tol : 1e-6
>> compl_inf_tol : 1e-6
>> acceptable_dual_inf_tol : 3e-3
>> acceptable_compl_inf_tol : 3e-3
> 
> You have correctly seen that IPOPT has two levels of convergence criteria, 
> standard and acceptable. Even if the standard tolerance criteria are not 
> satisfied, if the acceptable criteria are satisfied for a certain number of 
> iterations, then the problem converges with an "acceptable" solution.
> 
> Is your problem converging with an acceptable solution?
> 
> Also, the parameters tol and acceptable_tol are relative tolerances (they are 
> scaled as per the paper), while the dual_inf_tol, compl_inf_tol, and 
> constr_viol_tol (as well as the acceptable versions) are all absolute 
> tolerances. IPOPT only converges when both the relative and the absolute are 
> satisfied. I notice that you have actually tightened the absolute tolerances 
> (to 1e-6 from 1e-4) If you have a problem that is not ideally scaled, the 
> absolute tolerances will be difficult to achieve. In some cases, when 
> problems have had poor scaling, we have greatly loosened the absolute 
> tolerances to enable convergence. Of course, changing the problem scaling 
> would be better.
> 
> You might try increasing these absolute tolerances. Also, is there any reason 
> why you have changed the absolute tolerance for dual infeasibility and 
> complementarity, but not for constraint violation (constr_viol_tol)? I would 
> try increasing all three of these and seeing if that helps.
> 
> One final comment. If the problem is poorly scaled or nearly singular, the 
> multipliers may be converging to very large values. This will, in turn, cause 
> the dual infeasibility to be large. Fixing the scaling of the problem can 
> help greatly.
> 
> As far as the L-BFGS versus the BFGS I don't have too much to say except that 
> the C++ version of IPOPT does not have a BFGS reduced space option like the 
> Fortran version. Therefore, I think L-BFGS is your only choice with C++ IPOPT 
> right now.
> 
> Hope this helps and let us know if you have further difficulties,
> 
> Cheers,
> 
> Carl.
> 
> 
> On Thu, 5 Oct 2006, Sylvain Miossec wrote:
> 
>> Hi all,
>> 
>> I am using IPOPT for a year now and I found it quite fast for the type
>> of problems I solve (robot motion optimization). Thanks a lot for the
>> good job.
>> 
>> But I have one problem is the proper convergence of the optimization.
>> Now I used to stop the optimization by reaching the maximum number of
>> iterations. And then checking that the criteria is not changing a lot
>> during the last iterations, and that the constraints are well satisfied.
>> The motions I obtain for the robot look also optimal.
>> 
>> Does anyone would have an idea why I don't obtain proper convergence ?
>> 
>> I provide two files (in a zip compressed file) of some runs of optimization 
>> for two different history of BFGS approximation.
>> 
>> Since the Hessian computation is very complicate to program in my case,
>> I use the limited-memory BFGS approximation. I tried a history of 6 for
>> the BFGS computation and 131 (the size of my problem).  The BFGS
>> computation is slower with a 131 history but the convergence is better.
>> I have another question then : would it be better for my problem size to
>> use a BFGS approximation (not limited-memory one) ?
>> 
>> the parameters I used for IPOPT are the following :
>> tol : 1e-3
>> acceptable tol : 3e-3
>> max_iter : 300/500
>> expect_infeasible_problem : yes
>> hessian_approximation : yes
>> limited_memory_max_history : 131/6
>> dual_inf_tol : 1e-6
>> compl_inf_tol : 1e-6
>> acceptable_dual_inf_tol : 3e-3
>> acceptable_compl_inf_tol : 3e-3
>> check_derivatives_for_naninf : yes
>> derivative_test : first-order
>> alpha_for_y : full
>> 
>> During the optimization, the dual infeasibility is not really decreasing
>> (and not reaching the stopping tolerance) while the criteria seems to
>> converge. Is it normal ? Could it be due to a scaling problem (I scaled
>> the problem myself by trial and error). Could it be due to the Hessian
>> approximation ? Perhaps it is another reason for using BFGS
>> approximation instead of limited-memory one. Or perhaps I should try the
>> real hessian computation ? (but I would like to be sure it would solve
>> my problem, since it will take lots of time to implement).
>> 
>> An help on this point would be greatly appreciated.
>> 
>> Best regards
>> 
>> Sylvain Miossec
>> 
>> 
>