[Csdp] A mystery

Ronald Bruck bruck at usc.edu
Wed Aug 5 16:48:55 EDT 2009


Hi, all,

Here's a problem I don't understand.  Consider the sparse SDP file

7
2
3       -4
1.25    1.25    -0.5    1       1       0       1       
0       1       1       1       -1
0       1       1       2       1
1       1       1       1       -1
1       1       1       2       -1.5
1       2       1       1       1
2       1       1       1       -1
2       1       1       2       1.5
2       2       2       2       1
3       1       1       1       -1
3       2       3       3       1
4       1       1       1       1
4       2       4       4       1
5       1       2       2       1
6       1       2       3       1
7       1       3       3       1
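
(A side note, not in my original question: for anyone who wants to poke at these numbers, here is a minimal Python sketch that parses the sparse file above into dense blocks. It handles only the comment-free layout shown here — m, nBLOCK, block sizes with negative meaning a diagonal block, the objective vector, then matno/block/i/j/value entries — not full SDPA files with comments or `*` separators.)

```python
import numpy as np

# The sparse SDP file from above, embedded verbatim (whitespace normalized).
SDPA = """7
2
3 -4
1.25 1.25 -0.5 1 1 0 1
0 1 1 1 -1
0 1 1 2 1
1 1 1 1 -1
1 1 1 2 -1.5
1 2 1 1 1
2 1 1 1 -1
2 1 1 2 1.5
2 2 2 2 1
3 1 1 1 -1
3 2 3 3 1
4 1 1 1 1
4 2 4 4 1
5 1 2 2 1
6 1 2 3 1
7 1 3 3 1"""

def parse_sdpa(text):
    """Parse a comment-free sparse SDPA file into (c, F), where F[k] is a
    list of dense symmetric blocks for matrix k (k = 0 is the constant F0)."""
    lines = text.strip().splitlines()
    m = int(lines[0].split()[0])            # number of constraint matrices
    nblocks = int(lines[1].split()[0])      # number of blocks
    sizes = [abs(int(s)) for s in lines[2].split()]  # negative = diagonal block
    c = [float(s) for s in lines[3].split()]
    F = [[np.zeros((n, n)) for n in sizes] for _ in range(m + 1)]
    for line in lines[4:]:
        k, blk, i, j, v = line.split()
        k, blk, i, j = int(k), int(blk) - 1, int(i) - 1, int(j) - 1
        F[k][blk][i, j] = float(v)
        F[k][blk][j, i] = float(v)          # entries are upper-triangular; symmetrize
    return c, F

c, F = parse_sdpa(SDPA)
```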

When I run csdp, or my own multi-precision version csdpgmp, I get for
the solution file

0  2/3  1/3  0  0  0  0 
1  2  2  2  2/3 
1  2  3  3  1/3 
2  1  1  1  1/2 
2  1  1  2  7/12 
2  1  2  2  1 
2  1  3  3  1 
2  2  1  1  7/2 
2  2  4  4  1/2 

(of course, **I** have translated the numbers to fractions and rounded
the almost-zeros to zero, thereby dropping them from the sparse
format.)

Thus the X system matrix (the first one) is

0  0  0
0  0  0
0  0  0
        0  0  0  0
        0 2/3 0  0
        0  0 1/3 0
        0  0  0  0

and the Y system matrix (the second one) is

1/2  7/12   0
7/12  1     0
 0    0     1
              7/2  0   0   0
               0   0   0   0
               0   0   0   0
               0   0   0  1/2

The duality gap is 0.

Fine.  BUT THAT'S **NOT** THE OPTIMUM.   The CORRECT .sol file would be

0  2/3  1/3  0  0  0  0 
1  2  2  2  2/3
1  2  3  3  1/3
2  1  1  1  1/2 
2  1  1  2  7/12
2  1  1  3  Sqrt[23]/12
2  1  2  2  1 
2  1  3  3  1 
2  2  1  1  7/2 
2  2  4  4  1/2 

(this differs from the incorrect .sol file only by the sqrt(23)/12 line),
so that the X system matrix SHOULD be

0    0    0
0    0    0
0    0    0
             0    0    0    0
             0   2/3   0    0
             0    0   1/3   0
             0    0    0    0

(which it was!) and the Y matrix SHOULD be

1/2            7/12    sqrt(23)/12
7/12            1           0
sqrt(23)/12     0           1
                                7/2  0   0   0
                                 0   0   0   0
                                 0   0   0   0
                                 0   0   0  1/2

This differs from the returned Y only by the sqrt(23)/12 terms.
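
(A quick numpy check of this — my own sketch, not solver output — is instructive: the returned Y block is strictly positive definite, i.e. in the interior of the PSD cone, while the corrected block with the sqrt(23)/12 entries restored is exactly singular, sitting on the boundary.)

```python
import numpy as np

s = np.sqrt(23) / 12

# Second block of the X matrix (the first block is all zeros):
X2 = np.diag([0.0, 2/3, 1/3, 0.0])

# First block of Y as the solver returned it...
Y_phony = np.array([[1/2,  7/12, 0.0],
                    [7/12, 1.0,  0.0],
                    [0.0,  0.0,  1.0]])

# ...and with the missing sqrt(23)/12 entries restored:
Y_true = np.array([[1/2,  7/12, s],
                   [7/12, 1.0,  0.0],
                   [s,    0.0,  1.0]])

print(np.linalg.eigvalsh(X2))            # 0, 0, 1/3, 2/3, as claimed below
print(np.linalg.eigvalsh(Y_phony).min()) # strictly positive: interior of PSD cone
print(np.linalg.det(Y_true))             # ~0: the corrected block is singular
```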

For both the phony and true solutions we have

    2/3 F[2] + 1/3 F[3] - F[0] = X >= 0

(the eigenvalues of X are 0, 1/3, 2/3), while for the phony solution we
have

       Tr[F[0].Y] = 1/2    (wrong!)

and for the true solution

       Tr[F[0].Y] = 2/3    (larger!--and correct)

My question is:  WHY do the SDP solvers return a suboptimal solution?  I
have several proposals, below, but I am profoundly disturbed by this
behavior.

Here's where the problem comes from:  I want to find A point of the
intersection 

    (x+3/2)^2 + y^2 >= 1,
    (x-3/2)^2 + y^2 >= 1,
       x^2    + y^2 >= 1/2
       x^2    + y^2 <= 1

(in R^2) which is closest to (1,0).  (The region is non-convex, even
disconnected, so uniqueness is not guaranteed.)  Now it's easy to see
geometrically (look at a picture) that there are two such points, at x =
7/12 and y = +- sqrt(23)/12.  Mathematica solves the problem instantly
with its Minimize function.  (Well, it misses y = sqrt(23)/12.)

The way I encoded this into an SDP problem is to consider THREE UNKNOWN
POINTS, X = (x,y), (1,0) and (0,1).  How can (1,0) and (0,1) be
unknown? 
 
Because I consider them to be generic unknowns, call them e1 and e2, and
I place constraints on them:

     <e_i, e_j> = Kronecker-delta(i,j).

The system matrix is the Gramian of these three unknowns X, e1, e2.
Note that I can then recover x as <X,e_1> and y as <X,e_2>.  This is a
general-purpose way of introducing LINEAR terms into a pure-quadratic
problem.

I then seek to maximize -|X-u|^2 (for u = (1,0)) subject to the
constraints  given by the inequalities above, and the Kronecker
constraints.  The system matrix is the Gramian of X, e1, e2, and at a
solution will look like

        X      e1     e2
      ------------------
   X  | |X|^2   x      y
   e1 |  x      1      0
   e2 |  y      0      1

This is positive semi-definite iff |X|^2 >= x^2 + y^2.  
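
(That equivalence is just the Schur complement of the identity block; here is a small numerical spot-check of it, again my own sketch and not part of the formulation.)

```python
import numpy as np

def gram(n2, x, y):
    """Gram matrix of X, e1, e2 with |X|^2 = n2, <X,e1> = x, <X,e2> = y."""
    return np.array([[n2, x,   y],
                     [x,  1.0, 0.0],
                     [y,  0.0, 1.0]])

# Spot-check "PSD iff |X|^2 >= x^2 + y^2" (Schur complement of the
# identity block) on random points.
rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.uniform(-2, 2, size=2)
    n2 = rng.uniform(0.0, 8.0)
    psd = np.linalg.eigvalsh(gram(n2, x, y)).min() >= -1e-9
    assert psd == (n2 >= x**2 + y**2)
```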

Now it SHOULD be that |X|^2 = x^2 + y^2; if so, everything would be
fine.  I **expected** that sometimes |X|^2 > x^2 + y^2, and that does
happen, as in my current example.

I **expected** that this meant the solution "lives" in 3D, not 2D.  (In
other words, the disks I'm intersecting were actually spheres, and the
closest point was some overhead point whose projection into 2D was the
result of the computed solution.)  But that's not what happens.  (Maybe
it does in other problems, but not here.)

Now, it might seem that there are strictly feasible points for this
problem.  After all, the region is the intersection of one ball and the
complements  of 3 others, and has nonempty interior.  This would
guarantee a duality gap of 0 and convergence of CSDP's algorithm.

Unfortunately, the constraints on e1 and e2 are EQUALITY constraints,
and while there are lots of possible solutions, I don't think there are
any strictly feasible ones.  But that seems to be a copout; how could
you EVER have strictly feasible solutions in the presence of linear
EQUALITY constraints?  The problem CONVERGES QUICKLY and has a duality
gap of 0; the only problem is that IT PRODUCES THE WRONG ANSWER.

Note that the gradient of the objective function points along the x-axis
at any point on the x-axis.  Since I start at (1,0), on the x-axis,
perhaps the algorithm is following it ONLY ALONG the x-axis.  This may
explain why the iterates never reach the true optimum.  This phenomenon would be akin
to Newton's method not finding complex roots of a polynomial when you
start on the real line.

Can anybody give a better explanation?  BTW, the method works
beautifully EXCEPT for points u on the x-axis (that is, maximizing
-|X-u|^2).  It also works on the x-axis if you first perturb the point
OFF the axis, solve THAT problem, and use the solution file as an input
guess to CSDP for the point ON the x-axis.  This is kind of like the old
linear programming paradigm: if it cycles, perturb the point and retry.

But this is still not perfect.  For example, suppose I first solve the
problem for u = (1,0.000001).  CSDP solves it easily, and gives the
correct solution.

I then switch the SDP file to use u = (1,0) (this only involves a change
in the objective function), and use the solution file for u =
(1,0.000001) as the initial .sol file.  CSDP complains that "Z is
singular" but returns a reasonably correct file.  If I do the same
process using my csdpgmp to 256 bits precision, it solves for u =
(1,0.000001) no problem; but in the second stage complains about a
"Double line search failure in corrector step", and
only returns 7 digits of precision.  (Those digits are, however, fully correct.)

Anyway, I don't understand why any of this SHOULD work.  SDP, as I
understand the paradigm, is a CONVEX programming method.  Here I'm
trying to find the closest point of a decidedly NON-CONVEX region.  And
yet it IS a convex problem, since the constraints and objective function
are linear functions of the inner products of X, e1, e2 with one another.
Puzzlement.


-- Ron Bruck

Oh, yeah.  So it's clear how to find Y when you know the "true"
solution.  How do you find X?  I managed to find it by perturbing u and
guessing the exact values from the solution matrix.  But how would you
do it in general?

It seems to me that if you KNOW a feasible solution, that should help
you construct X and Y (or Y and Z, the terminology is always confusing).

BTW, unless I am misunderstanding something, sdpa_gmp DOESN'T GIVE YOU
the multi-precision solution; it seems to be aimed at just guaranteeing
the solution is accurate TO STANDARD PRECISION.  I see no way to recover
information to anything more than standard precision.  I much prefer my
csdpgmp, flawed as it is.



