[ADOL-C] Evaluating Hessian of Lagrangian

Fri May 3 08:28:51 EDT 2013

So I finally got the modified version of the Hessian computation working. I tried it on a
problem with 2300 variables and 2040 nonlinear constraints using an Intel E8400. 
The problem has a sparse Hessian with 37000 non-zeros of 5600000 possible.

Turns out that, except for the first call, the new version treating the Lagrange
multipliers (and ipopt's obj_factor) as variables is much faster. The table summarizes
the situation:

                      First call  (repeat=0)       10 subsequent calls
Old                 47 sec                             480 sec  (repeat=0)
New               54 sec                              15 sec    (repeat=1)

Unfortunately Valgrind still has some complaints about the new code, but my intuition
is that the problem is not with ADOL-C and that a fix will not change the above results
significantly (the errors strangely show up in functions unrelated to the Hessian computation,
but are somehow anyway related to the new method because they only appear if I run
the new method more than once). Anyway, the code has thus far worked without crashes.

Thanks for all helpful comments!

Kind regards

Ingrid

From: ingridhallen at hotmail.com
To: kshitij at math.upb.de
Date: Fri, 3 May 2013 07:43:09 +0200
CC: adol-c at list.coin-or.org
Subject: Re: [ADOL-C] Evaluating Hessian of Lagrangian

Hi,

Thanks for the quick reply! The problem turned out to be my function for evaluating
the Lagrangian. A bit simplified, the following is the original code:

adouble L = obj_factor*obj_fun(x);
for (unsigned int i=0; i<nconstr; i++)
      L += lambda[i]*constr[i];

This code worked fine with lambda as doubles, but when using lambda as adoubles,
i.e. as variables, additional non-zeros appeared in H_{xx}. However, when changing 
the code to

adouble L = obj_factor*obj_fun(x);
for (unsigned int i=0; i<nconstr; i++)
      L = L + lambda[i]*constr[i];

everything worked fine, with all additional non-zeros ending up in columns>dim(x). 
Not sure why though, perhaps someone could explain?

Kind regards

Ingrid

> Date: Thu, 2 May 2013 15:54:28 +0200
> From: kshitij at math.upb.de
> To: ingridhallen at hotmail.com
> CC: awalther at math.uni-paderborn.de; adol-c at list.coin-or.org
> Subject: Re: [ADOL-C] Evaluating Hessian of Lagrangian
> 
> Hello,
> 
> the sequence of rows or columns in the hessian is exactly the same as of
> the execution of the <<= operators that define independents between
> trace_on() and trace_off(). So as long as the <<= operator was executed
> for all x variables before all lagrange multipliers, the H_{xl}
> components should have column indices larger than dim(x).
> 
> Sincerely
> Kshitij Kulshrestha.
> 
> As on 2013-05-02 15:28h, Ingrid Hallen did write:
> > Hi,
> > 
> > I finally got time to start implementing the alternative method for
> > computing the Hessian of
> > the Lagrangian to do some benchmarking. However, I have come across the
> > following difficulty:
> > 
> > Using also the Lagrange multipliers as variables yields an Hessian of
> > the Lagrangian in the
> > form of a block matrix:
> > 
> > H = / H_{xx}         H_{xl} \
> >        \ H_{xl}                0    /
> > 
> > Here H_{xx} is the Hessian wrt to the original variables x\inR^n that we
> > are interested in. After calling
> > sparse_hess(tag,n,repeat,x,&nnz,&rind,&cind,&values,&options) I thought
> > that it should be
> > straightforward to identify the parts of rind, cind and values
> > pertaining to H_{xx}. For instance,
> > say we want to count the non-zeros in the first row of H_{xx}, nnz_xx,
> > using the following code:
> > 
> > int k = 0
> > int nnz_xx = 0
> > while rind[k]==0
> >     if cind[k]<n
> >           nnz_xx++;
> >     end
> >     k++;
> > end
> > 
> > This of course only works provided the non-zeros of H_{xl} are found in
> > the columns>n, but that
> > appears not to be the case.
> > 
> > Am I missing something really simple here, or how can I extract H_{xx}?.
> > 
> > Kind regards
> > Ingrid
> > 
> > 
> > 
> > 
> > ------------------------------------------------------------------------
> > From: ingridhallen at hotmail.com
> > To: awalther at math.uni-paderborn.de
> > Date: Thu, 11 Apr 2013 09:26:23 +0200
> > CC: adol-c at list.coin-or.org
> > Subject: Re: [ADOL-C] Evaluating Hessian of Lagrangian
> > 
> > Thanks for your reply!
> > 
> > I think I will do some benchmarking then, to see which
> > of the two approaches works best for my problem.
> > 
> > The approach suggested by Norm is certainly interesting,
> > but I think I have to little experience with optimization
> > to implement it successfully. At least for the moment.
> > 
> > Sincerely,
> > 
> > Ingrid
> > 
> > 
> >> Date: Wed, 10 Apr 2013 22:51:22 +0200
> >> From: awalther at math.uni-paderborn.de
> >> To: ingridhallen at hotmail.com
> >> CC: normvcr at telus.net; adol-c at list.coin-or.org
> >> Subject: Re: [ADOL-C] Evaluating Hessian of Lagrangian
> >>
> >> Hi
> >>
> >> sorry for entering the discussion so late but I was on travel
> >> the last days and most of the time offline.
> >>
> >> If you can exploit the minor size of the reduced Hessian
> >> the approach proposed by Norm is certainly the best way to go
> >> since it requires the least number of Hessian vector product
> >> which determines the cost of the derivative calculation.
> >>
> >> However, I am not sure whether Ipopt can really exploit this.
> >> Regarding the other approaches discussed so far:
> >>
> >> We made the experience that it really depends on the application
> >> whether
> >>
> >> * tracing the Lagrangian once with x and lambda as inputs
> >> and evaluating only a part of the Hessian reusing the trace
> >> in all iterations
> >>
> >> or
> >>
> >> * retracing the Lagrangian with x as adoubles and lambda as doubles
> >> in each iteration and computing then the whole Hessian
> >>
> >> performs better in terms of runtime. You could give both approaches
> >> a try and see what works better for you. Both approaches have their
> >> pros and cons with respect to efficiency.
> >>
> >> Best regards
> >>
> >> Andrea
> >>
> >> --
> >> Prof. Dr. Andrea Walther
> >> Lehrstuhl fuer Mathematik und ihre Anwendungen
> >> Institut fuer Mathematik
> >> Universitaet Paderborn
> >> Warburger Str. 100
> >> 33098 Paderborn
> >>
> >> Email: andrea.walther at uni-paderborn.de
> >> Phone: ++49 5251 602721
> >> ++49 5251 602724 (sekr.)
> >> Fax: ++49 5251 603728
> >>
> >> **********
> >>
> >>
> >>
> > 
> > _______________________________________________ ADOL-C mailing list
> > ADOL-C at list.coin-or.org http://list.coin-or.org/mailman/listinfo/adol-c
> > 
> > 
> > _______________________________________________
> > ADOL-C mailing list
> > ADOL-C at list.coin-or.org
> > http://list.coin-or.org/mailman/listinfo/adol-c
> > 
> 
> -- 
> Dr. Kshitij Kulshreshtha
> 
> Institut für Mathematik,
> Universität Paderborn,
> Warburger Straße 100,
> 33098 Paderborn.
> 
> Büro: A3.235
> 
> Privatanschrift:
> Arnikaweg 62
> 33100 Paderborn.

_______________________________________________
ADOL-C mailing list
ADOL-C at list.coin-or.org
http://list.coin-or.org/mailman/listinfo/adol-c 		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.coin-or.org/pipermail/adol-c/attachments/20130503/703c0b39/attachment.html>