[Coin-discuss] Clp 1.6.0 fails to make visible progress on a large model

Sebastian Nowozin nowozin at gmail.com
Tue Mar 4 06:54:40 EST 2008


Hello everybody,

for a machine learning multiclass classification model we use a large 
linear program to estimate optimal model parameters explaining given 
training data.  The model variables are highly coupled in the sense that 
for most variables there exist a constraint coupling any pair of variables.

As we do not have the training data yet, I generated models with random 
data for testing my code and the solver.  The actual data might behave 
much better due to redundancy and simplicity, however we plan to solve 
problems roughly ten times the size of the problem below (so the final 
nnz would be approximately 100 million).  I generate the model directly 
in a .cpp file, but write a MPS file for debugging purposes.

However, it seems COIN Clp 1.6.0 shipped with Osi 0.97.1 is unable to 
solve the problem.

I put two .mps.bz2 files of structurally identical problems here:
   http://user.cs.tu-berlin.de/~nowozin/mps/
The "easyone.mps" is small and all solvers I tried solve it without 
problems.  The "toughone.mps" is quite large (99100 rows, 7401 columns, 
~12.9 million nnz, you need 1GB free RAM to load/solve the model).

The output of Clp looks like this, where the number of infeasibilities 
does not seem to decrease.

Clp:import toughone.mps
At line 1 NAME          BLANK
At line 2 ROWS
At line 99104 COLUMNS
At line 6541006 RHS
At line 6541057 BOUNDS
At line 6541059 ENDATA
Problem BLANK has 99100 rows, 7401 columns and 12876400 elements
Model was imported from ./toughone.mps in 7.65248 seconds
Clp:solve
Dual of model has 7401 rows and 99100 columns
Coin0506I Presolve 7401 (0) rows, 99100 (0) columns and 12876400 (0) 
elements
Clp0014I Perturbing problem by 0.001 % of 0.73339 - largest nonzero 
change 3.02687e-05 (% 0.0846218) - largest zero change 9.99969e-05
Clp0006I 0  Obj -482.729 Primal inf 2.39382e+06 (6401)
Clp0006I 223  Obj -482.592 Primal inf 1.54823e+07 (4660)
Clp0006I 446  Obj -482.52 Primal inf 6.22071e+07 (4180)
Clp0006I 669  Obj -482.488 Primal inf 3.04425e+08 (3923)
Clp0006I 892  Obj -482.471 Primal inf 6.82636e+07 (4045)
Clp0006I 1115  Obj -482.458 Primal inf 1.77363e+08 (4052)
Clp0006I 1338  Obj -482.443 Primal inf 1.39113e+08 (4286)
Clp0006I 1561  Obj -482.43 Primal inf 2.07872e+08 (4257)
Clp0006I 1784  Obj -482.419 Primal inf 6.22298e+07 (4334)
Clp0006I 2007  Obj -482.405 Primal inf 1.17816e+08 (4379)
Clp0006I 2230  Obj -482.393 Primal inf 2.05141e+08 (4363)
Clp0006I 2453  Obj -482.381 Primal inf 7.36075e+07 (4400)
Clp0006I 2676  Obj -482.368 Primal inf 7.92307e+07 (4414)
Clp0006I 2899  Obj -482.356 Primal inf 1.06462e+08 (4462)
Clp0006I 3122  Obj -482.343 Primal inf 1.53225e+08 (4488)
Clp0006I 3345  Obj -482.33 Primal inf 1.09378e+08 (4496)
Clp0006I 3568  Obj -482.319 Primal inf 1.42576e+08 (4505)
Clp0006I 3791  Obj -482.308 Primal inf 4.68459e+08 (4677)
Clp0006I 4014  Obj -482.298 Primal inf 1.60121e+08 (4581)
Clp0006I 4237  Obj -482.289 Primal inf 4.27189e+07 (4629)
Clp0006I 4460  Obj -482.277 Primal inf 6.13483e+07 (4631)
Clp0006I 4683  Obj -482.267 Primal inf 5.522e+07 (4790)
Clp0006I 4906  Obj -482.254 Primal inf 4.94801e+07 (4755)
Clp0006I 5086  Obj -482.244 Primal inf 4.86999e+08 (4802)
Clp0006I 5269  Obj -482.236 Primal inf 1.8662e+09 (4747)
Clp0006I 5352  Obj -482.23 Primal inf 3.93626e+07 (4805)
Clp0006I 5575  Obj -482.217 Primal inf 6.28133e+07 (4862)
Clp0006I 5798  Obj -482.205 Primal inf 4.68871e+07 (4840)
Clp0006I 6021  Obj -482.192 Primal inf 3.12843e+08 (4807)
Clp0006I 6133  Obj -482.184 Primal inf 4.45966e+07 (4830)
Clp0006I 6356  Obj -482.172 Primal inf 5.81011e+07 (4898)
Clp0006I 6579  Obj -482.161 Primal inf 6.83596e+07 (4901)
Clp0006I 6693  Obj -482.156 Primal inf 3.40449e+08 (4926)
Clp0006I 6916  Obj -482.143 Primal inf 7.19014e+07 (4930)
Clp0006I 7135  Obj -482.129 Primal inf 1.2956e+08 (4975)
Clp0006I 7358  Obj -482.118 Primal inf 1.05719e+08 (5062)
Clp0006I 7581  Obj -482.106 Primal inf 7.27903e+07 (4956)
...

I tried the primal simplex, dual simplex and barrier methods with very 
similar behaviour.  Due to the large values, it seems there may be a 
scaling problem.  One nice thing I noted though is the low memory usage 
of Clp.

GLPK 4.16 nibbled away the whole night on the problem, consistently 
decreasing the number of infeasibilities (I assume it would have solved 
the problem eventually, but I aborted after 95,000 iterations and around 
10 hours of computation time).

I also tried Ilog Cplex 9.1.0, which seems to make consistent progress 
in solving the problem, but now is at ~15,000 seconds and iteration 
143,500 and still runs.  I will save the optimal solution once obtained, 
if you are interested.


I know the above problem is difficult, but it would be great if we could 
use Clp to solve it.  Do you have advice on pivoting rules, scaling and 
perturbations I should try?  I am willing to give up some accuracy for 
fast approximate solutions, as variations of the problem will be solved 
many times (for model selection and estimating generalization 
performance of the resulting classifier).  Also, the achieved objective 
is more important than the variable values, as the objective has a 
direct geometric interpretation.

Would it help if I provide a good initial basic solution? (As I do not 
know how to generate a basis for it I would just add constraints fixing 
all variables, solve, then remove the constraints and resolve.  Does 
this make sense or is there a better way?)

Thanks,
Sebastian



More information about the Coin-discuss mailing list