[Cbc] Cbc segfault after many hours of CPU time

Mon Mar 18 13:07:54 EDT 2013

Using the "deterministic parallel" feature I have now run this problem for 
hundreds of millions of nodes, so I think we can safely say that our 
segfault problem is solved.

Needless to say, we are very interested in hearing more about the "trivial 
experimental feature"; can you supply any additional information?

Allan Wechsler

From:    John Forrest <john.forrest at fastercoin.com>
To:      acw at ascent.com
Cc:      cbc at list.coin-or.org
Date:    03/11/2013 04:44 AM
Subject: Re: [Cbc] Cbc segfault after many hours of CPU time

Status report on segfault.

Unable to reproduce with debug version.

I have reproduced the segfault three times with the optimized version.  It 
is always to do with multi-threading.  Once I could see something I didn't 
like, so I have modified that.  However, the other two times the cause was 
not obvious.  Looking at the registers and disassembled code I could see 
the segfault, but tracing from a few instructions back it should have 
worked.  So it is classic overwriting due to threads.  But I can't see 
what is wrong with the locking/unlocking that is meant to stop the 
overwriting.  I will continue looking, slowly.

However, if you use -thread 104 instead of 4, that switches on 
deterministic parallel.  This is not quite as efficient (it would be a 
lot better if the effort per node was better determined, e.g. as in 
Cplex's "ticks") but it does not have the same problem.  That has been 
running a long time without a problem (>63 million nodes).
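
To make the 4 versus 104 distinction concrete, here is a minimal Python 
sketch of how such a setting could be read.  It is illustrative only and 
is not Cbc's code: the function name decode_threads is hypothetical, and 
it assumes that a value from 100 to 199 means "deterministic parallel" 
with (value - 100) threads, while a value below 100 is a plain thread 
count.  Anything of 200 or more is deliberately left out of the sketch.

```python
def decode_threads(value: int) -> tuple[int, bool]:
    """Split a thread setting into (thread count, deterministic flag).

    Assumption (not taken from Cbc's source): 100..199 selects
    deterministic parallel with value - 100 threads; values below 100
    are an ordinary thread count.
    """
    if value < 0:
        raise ValueError("thread setting must be non-negative")
    if value >= 200:
        raise ValueError("settings of 200 or more are outside this sketch")
    if value >= 100:
        return value - 100, True
    return value, False


for setting in (4, 104):
    threads, deterministic = decode_threads(setting)
    mode = "deterministic" if deterministic else "non-deterministic"
    print(f"-thread {setting}: {threads} threads, {mode}")
```

Under that assumption both settings use four threads, and only the 
second asks for a repeatable search.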

Using my (not in svn yet) trunk and throwing every cut at the problem, I 
can prove that the solution of 499243.8 is optimal after 23 million nodes.

However, looking at the problem, I tried another trivial experimental 
feature on it.  There may be bugs, but I don't think so.  That took 15,368 
nodes and 54 seconds.

John Forrest
