[Ipopt] Using IPOPT to solve a very large and very nonlinear system of equations

Mon Aug 3 20:54:16 EDT 2015

Dear All:

I am hoping to use IPOPT to solve a very large and very nonlinear system of
equations. I am hoping that you can advise me as to the best approach.
Beneath my signature, I describe the problem, describe the four approaches
that I have or am considering, and finally ask several questions. Any
advice that you can provide me would be very much appreciated.

Thanks so much,
Michael Mwangi
Assistant Professor
Department of Biochemistry and Molecular Biology
Penn State University

*THE PROBLEM*

I would like to solve a system of equations:

g1(x1, x2, ..., xn) = 0
g2(x1, x2, ..., xn) = 0
.
.
.
gm(x1, x2, ..., xn) = 0

l <= xi <= u for all i

where:

   - n ~ 1,000,000
   - m ~ 10,000
   - gi depends on only a subset of ~10,000 of the ~1,000,000 variables for
   all i so that the Jacobian of the gi has only ~m x 10^4 ~ 10^4 x 10^4 ~
   10^8 nonzero elements
   - gi is a highly nonlinear function of the ~10,000 variables so that the
   second-order partial derivatives of gi with respect to the ~10,000
   variables are mostly nonzero for all i
   - gi and gj depend on disjoint subsets of variables when |j - i| >~ 100
   so that the Jacobian of the gi has a block-like form

*APPROACH 1: SOLVE A LEAST SQUARES FORMULATION USING L-BFGS-B*

I tried to solve a least squares formulation of the problem

minimize f = (g1)^2 + (g2)^2 + ... + (gm)^2

l <= xi <= u for all i

using the L-BFGS-B solver here <http://www.chokkan.org/software/liblbfgs/>
that only permits the simple constraints l <= xi <= u.

This approach fails because it always yields a local but not global minimum
of f for which some of the (gi)^2 are unacceptably large.

*APPROACH 2: SOLVE THE LEAST SQUARES FORMULATION IN APPROACH 1 USING IPOPT*

I am thinking of using the IPOPT solver here
<http://www.coin-or.org/Ipopt/documentation/node23.html> to solve the same
exact problem as in approach 1.

*APPROACH 3: SOLVE A MORE CONSTRAINED LEAST SQUARES FORMULATION USING IPOPT*

I am thinking of using the IPOPT solver to solve a more constrained least
squares formulation of the problem:

minimize f = (g1)^2 + (g2)^2 + ... + (gm)^2

l <= xi <= u for all i

-e <= gi <= e for all i where e >= 0 is some small error tolerance

My hope is that the additional constraints -e <= gi <= e force the solver
to find a better solution.

*APPROACH 4: SOLVE THE ORIGINAL SYSTEM OF EQUATIONS USING IPOPT*

I am thinking of using the IPOPT solver to solve the original system of
equations:

minimize f = 0

l <= xi <= u for all i

0 <= gi <= 0 for all i

Thus, the objective function f is trivially zero, and the problem is
encapsulated in the constraint functions.

*COMMENTS ABOUT APPROACHES 2 - 4*

As already noted, the Jacobian of the gi has only ~10^8 nonzero elements
and has a block-like form.

As already noted, the gi are highly nonlinear functions of different
subsets of ~10,000 of the total of ~1,000,000 variables so that the
second-order partial derivatives of a gi with respect to the ~10,000
variables are mostly nonzero. This means the Hessian in Eq. (9) here
<http://www.coin-or.org/Ipopt/documentation/node22.html#eq:IpoptLAG> is
very dense with ~10^6 x 10^6 ~ 10^12 nonzero elements. Therefore, the
limited-memory quasi-Newton approximation will need to be used by
setting hessian_approximation = "limited-memory"
<http://www.coin-or.org/Ipopt/documentation/node52.html#SECTION0001113010000000000000>
.

I will be doing the computations on a workstation with 120 GB of RAM, so
memory should not be an issue.

*MY QUESTIONS FOR YOU*

I have the following questions for you:

   1. Will approach 2 be much slower than approach 1 even though the
   approaches are identical except that the IPOPT solver is being used in
   place of the L-BFGS-B solver? Since IPOPT will be using the
   same limited-memory quasi-Newton approximation as in L-BFGS-B, I imagine
   that the two running times will be similar. However, IPOPT is designed to
   solve more general problems than L-BFGS-B, so maybe there is substantial
   overhead due to extra code.
   2. Do you expect approach 3 to yield a better solution than approach 2
   [i.e. smaller (gi)^2], and do you expect approach 4 to yield a better
   solution than approaches 3 and 2? My hope is that the additional
   constraints will result in a better solution.
   3. Do you expect approach 3 to be faster than approach 2, and do you
   expect approach 4 to be faster than approaches 3 and 2? I am hoping that
   the additional constraints reduce the search space and result in quicker
   convergence and hence a reduction in the running time.
   4. Is IPOPT appropriate for the types of problems described?
   5. Any other recommendations?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.coin-or.org/pipermail/ipopt/attachments/20150803/e7bd92e7/attachment.html>