[Cbc] Cbc segfault after many hours of CPU time
acw at ascent.com
Mon Mar 18 13:07:54 EDT 2013
Using the "deterministic parallel" feature I have now run this problem for
hundreds of millions of nodes, so I think we can safely say that our
segfault problem is solved.
Needless to say, we are very interested in hearing more about the "trivial
experimental feature"; can you supply any additional information?
Allan Wechsler
From: John Forrest <john.forrest at fastercoin.com>
To: acw at ascent.com
Cc: cbc at list.coin-or.org
Date: 03/11/2013 04:44 AM
Subject: Re: [Cbc] Cbc segfault after many hours of CPU time
Status report on segfault.
Unable to reproduce with debug version.
Have reproduced segfaults three times with the optimized version. It
always has to do with multi-threading. Once I could see something I didn't
like, so I have modified that. However, the other two times it was not
obvious. Looking at the registers and disassembled code I could see the
segfault, but starting from a few instructions back it should have worked.
So it is classic overwriting due to threads. But I can't see what is wrong
with the locking/unlocking of threads that is meant to stop overwriting.
Will continue looking slowly.
However, if you use -thread 104 instead of 4, that switches on
deterministic parallel. This is not quite as efficient (it would be a
lot better if the effort per node was better determined, e.g. as in
Cplex's "ticks") but does not have the same problem. That has been running a
long time without a problem (>63 million nodes).
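
For anyone scripting their runs, here is a rough sketch of the "add 100 to
the thread count" convention described above. The helper names, the
model.mps file name, the 1..99 range check and the trailing -solve action
are assumptions for illustration only; the -thread spelling is taken from
this message.

```python
def cbc_thread_value(n_threads, deterministic=True):
    """Return the value to pass to Cbc's thread option.

    Per the message above, 4 means four ordinary threads and 104 means
    four threads with deterministic parallel switched on.
    """
    # Assumption: restrict to 1..99 so the result cannot be confused
    # with any other 100-based encoding.
    if not 1 <= n_threads <= 99:
        raise ValueError("n_threads must be between 1 and 99")
    return 100 + n_threads if deterministic else n_threads


def cbc_command(model_path, n_threads, deterministic=True):
    """Build an argument list for a stand-alone cbc run (hypothetical layout)."""
    value = cbc_thread_value(n_threads, deterministic)
    # "-solve" is assumed here as the action that starts branch and cut.
    return ["cbc", model_path, "-thread", str(value), "-solve"]


if __name__ == "__main__":
    # model.mps is a placeholder file name.
    print(" ".join(cbc_command("model.mps", 4)))
```

The list can be handed to subprocess.run unchanged. Passing
deterministic=False gives back the plain thread count, i.e. the setting
that showed the segfault.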
Using my trunk (not in svn yet) and throwing every cut at the problem, I can
prove the solution of 499243.8 is optimal after 23 million nodes.
However, looking at the problem, I tried another trivial experimental feature
on it. There may be bugs, but I don't think so. That took 15,368
nodes and 54 seconds.
John Forrest