[Cbc] Potential race hazard?

Mon Mar 24 14:08:38 EDT 2014

Hans,

Looking at output it may be a bug to do with maximization.

Can you reformulate the problem as a minimization problem and see if it 
works?

What mps file are you using?  I can see if I can reproduce problem.

John Forrest
On 24/03/14 16:29, hmi21 at cantab.net wrote:
> Hi John,
>
> Thank you for the quick feedback.
> I run the example mps file with the following command on the standalone
> solver: ./cbc racehazard.mps max cuts off Heur off preprocess off threads
> 6 solve
> (I've tried on machines with different numbers of cores and the issue is
> reproducible on all of them)
> With these options, CBC should not stop the search on nodes or time, so I
> don't think that's the issue.
>
> I do get a different number of nodes when I rerun the problem, so I guess
> it's using the older form of parallelisation.
>
> I've had another look at the log output when the wrong solution is found
> (see below). It seems like that the iterations are stopped because the
> solution found is within a certain gap tolerance (Cbc0011I Exiting as
> integer gap of -642429.83 less than 1e-10 or 0%). When the right solution
> is found, this doesn't happen. Cbc completes the search and exits
> (Cbc0001I Search completed - best objective -1488861, took 203 iterations
> and 56 nodes (0.03 seconds)).
> I'm happy to have a look at the source code to see what's going on. Where
> would be the best place to start?
>
> Thanks,
>
> Hans
>
>
> ----- LOG FOR WRONG COMPUTATION -----
>
> command line - ./cbc racehazard.mps max cuts off Heur off preprocess off
> threads 6 solve (default strategy 1)
> At line 1 NAME          BLANK     FREE
> At line 2 ROWS
> At line 29 COLUMNS
> At line 1523 RHS
> At line 1548 BOUNDS
> At line 1719 ENDATA
> Problem BLANK has 25 rows, 170 columns and 1325 elements
> Coin0008I BLANK read with 0 errors
> Option for cutsOnOff changed from on to off
> Option for heuristicsOnOff changed from on to off
> Option for preprocess changed from sos to off
> threads was changed from 0 to 6
> Continuous objective value is 1.53556e+06 - 0.00 seconds
> Cutoff increment increased from 1e-05 to 0.9999
> Cbc0016I Integer solution of -447863 found by strong branching after 0
> iterations and 0 nodes (0.00 seconds)
> Cbc0010I After 0 nodes, 1 on tree, -447863 best solution, best possible
> -1535555 (0.00 seconds)
> Cbc0016I Integer solution of -1469650 found by strong branching after 23
> iterations and 3 nodes (0.00 seconds)
> Cbc0004I Integer solution of -824465 found after 40 iterations and 7 nodes
> (0.01 seconds)
> Cbc0030I Thread 0 used 3 times,  waiting to start 0.0019979477,  15 locks,
> 7.1048737e-05 locked, 6.5326691e-05 waiting for locks
> Cbc0030I Thread 1 used 4 times,  waiting to start 0.0027656555,  21 locks,
> 0.0002720356 locked, 4.529953e-06 waiting for locks
> Cbc0030I Thread 2 used 2 times,  waiting to start 0.0029046535,  9 locks,
> 1.6689301e-05 locked, 0.0001001358 waiting for locks
> Cbc0030I Thread 3 used 2 times,  waiting to start 0.003194809,  10 locks,
> 1.5258789e-05 locked, 2.1457672e-06 waiting for locks
> Cbc0030I Thread 4 used 1 times,  waiting to start 0.003333807,  4 locks,
> 1.1205673e-05 locked, 9.3221664e-05 waiting for locks
> Cbc0030I Thread 5 used 0 times,  waiting to start 0.0035748482,  0 locks,
> 0 locked, 0 waiting for locks
> Cbc0030I Main thread 0.0021567345 waiting for threads,  30 locks,
> 2.2649765e-05 locked, 2.7418137e-05 waiting for locks
> Cbc0011I Exiting as integer gap of -642429.83 less than 1e-10 or 0%
> Cbc0001I Search completed - best objective -1469650, took 45 iterations
> and 9 nodes (0.01 seconds)
> Cbc0032I Strong branching done 54 times (346 iterations), fathomed 0 nodes
> and fixed 7 variables
> Cbc0035I Maximum depth 3, 0 variables fixed on reduced cost
> Cuts at root node changed objective from -1.53556e+06 to -1.53556e+06
>
> Result - Optimal solution found (within gap tolerance)
>
> Objective value:                1469650.00000000
> Lower bound:                    1469650.000
> Gap:                            0.00
> Enumerated nodes:               9
> Total iterations:               45
> Time (CPU seconds):             0.01
> Time (Wallclock seconds):       0.01
>
> Total time (CPU seconds):       0.01   (Wallclock seconds):       0.01
>
>
>
> ----- LOG FOR RIGHT COMPUTATION -----
>
> command line - ./cbc racehazard.mps max cuts off Heur off preprocess off
> threads 6 solve (default strategy 1)
> At line 1 NAME          BLANK     FREE
> At line 2 ROWS
> At line 29 COLUMNS
> At line 1523 RHS
> At line 1548 BOUNDS
> At line 1719 ENDATA
> Problem BLANK has 25 rows, 170 columns and 1325 elements
> Coin0008I BLANK read with 0 errors
> Option for cutsOnOff changed from on to off
> Option for heuristicsOnOff changed from on to off
> Option for preprocess changed from sos to off
> threads was changed from 0 to 6
> Continuous objective value is 1.53556e+06 - 0.00 seconds
> Cutoff increment increased from 1e-05 to 0.9999
> Cbc0016I Integer solution of -447863 found by strong branching after 0
> iterations and 0 nodes (0.00 seconds)
> Cbc0010I After 0 nodes, 1 on tree, -447863 best solution, best possible
> -1535555 (0.00 seconds)
> Cbc0016I Integer solution of -1469650 found by strong branching after 23
> iterations and 3 nodes (0.00 seconds)
> Cbc0004I Integer solution of -824465 found after 44 iterations and 8 nodes
> (0.01 seconds)
> Cbc0016I Integer solution of -1414633 found by strong branching after 43
> iterations and 7 nodes (0.01 seconds)
> Cbc0004I Integer solution of -1471127 found after 155 iterations and 43
> nodes (0.02 seconds)
> Cbc0038I Full problem 25 rows 170 columns, reduced to 22 rows 42 columns
> Cbc0044I Reduced cost fixing - 22 rows, 42 columns - restarting search
> Cbc0012I Integer solution of -1471127 found by Previous solution after 0
> iterations and 0 nodes (0.03 seconds)
> Cbc0010I After 0 nodes, 1 on tree, -1471127 best solution, best possible
> -1492016.7 (0.03 seconds)
> Cbc0012I Integer solution of -1488861 found by rounding after 1 iterations
> and 1 nodes (0.03 seconds)
> Cbc0030I Thread 0 used 3 times,  waiting to start 0.0013997555, 0 cpu
> time, 17 locks, 2.1934509e-05 locked, 2.6226044e-06 waiting for locks
> Cbc0030I Thread 1 used 1 times,  waiting to start 0.0020201206, 0 cpu
> time, 4 locks, 6.1988831e-06 locked, 7.1525574e-07 waiting for locks
> Cbc0030I Thread 2 used 0 times,  waiting to start 0.0020046234, 0 cpu
> time, 0 locks, 0 locked, 0 waiting for locks
> Cbc0030I Thread 3 used 0 times,  waiting to start 0.0018107891, 0 cpu
> time, 0 locks, 0 locked, 0 waiting for locks
> Cbc0030I Thread 4 used 0 times,  waiting to start 0.0016248226, 0 cpu
> time, 0 locks, 0 locked, 0 waiting for locks
> Cbc0030I Thread 5 used 0 times,  waiting to start 0.0014903545, 0 cpu
> time, 0 locks, 0 locked, 0 waiting for locks
> Cbc0030I Main thread 0.00075554848 waiting for threads,  13 locks,
> 5.4836273e-06 locked, 2.1457672e-06 waiting for locks
> Cbc0001I Search completed - best objective -1488861, took 9 iterations and
> 4 nodes (0.03 seconds)
> Cbc0032I Strong branching done 20 times (32 iterations), fathomed 2 nodes
> and fixed 0 variables
> Cbc0035I Maximum depth 1, 32 variables fixed on reduced cost
> Cbc0012I Integer solution of -1488861 found by Reduced search after 198
> iterations and 54 nodes (0.03 seconds)
> Cbc0030I Thread 0 used 13 times,  waiting to start 0.013705492,  66 locks,
> 0.00017166138 locked, 0.00041985512 waiting for locks
> Cbc0030I Thread 1 used 12 times,  waiting to start 0.014533997,  61 locks,
> 0.000644207 locked, 0.00022602081 waiting for locks
> Cbc0030I Thread 2 used 11 times,  waiting to start 0.014764071,  57 locks,
> 0.00032305717 locked, 1.0967255e-05 waiting for locks
> Cbc0030I Thread 3 used 6 times,  waiting to start 0.015071154,  29 locks,
> 8.5830688e-05 locked, 0.00019359589 waiting for locks
> Cbc0030I Thread 4 used 7 times,  waiting to start 0.01521945,  34 locks,
> 7.1048737e-05 locked, 6.3419342e-05 waiting for locks
> Cbc0030I Thread 5 used 3 times,  waiting to start 0.015607119,  15 locks,
> 9.3460083e-05 locked, 0.00033569336 waiting for locks
> Cbc0030I Main thread 0.0040099621 waiting for threads,  117 locks,
> 8.0347061e-05 locked, 0.00083732605 waiting for locks
> Cbc0001I Search completed - best objective -1488861, took 203 iterations
> and 56 nodes (0.03 seconds)
> Cbc0032I Strong branching done 128 times (469 iterations), fathomed 19
> nodes and fixed 8 variables
> Cbc0035I Maximum depth 9, 326 variables fixed on reduced cost
> Cuts at root node changed objective from -1.53556e+06 to -1.53556e+06
>
> Result - Optimal solution found
>
> Objective value:                1488861.00000000
> Enumerated nodes:               56
> Total iterations:               203
> Time (CPU seconds):             0.03
> Time (Wallclock seconds):       0.03
>
> Total time (CPU seconds):       0.04   (Wallclock seconds):       0.03
>
>
> ------------------------------------------------------
>
>> Hans,
>>
>>
>> Cbc has two parallel modes. The older and more mature method just passes
>> a subproblem to an available thread. This is efficient but is not totally
>> deterministic. It should always get the optimal solution if the complete
>> tree is searched, but may take a different number of nodes if re-run. If
>> the run is stopped on nodes or time, then the best solution returned could
>> be different. So if you are completing the search and getting different
>> solutions then that is a bug.
>>
>> The second method, which is not used as often, involves giving each
>> thread a certain amount of work and waiting for all threads to complete
>> before continuing. As not enough effort has gone into measuring the
>> "work", this is not as efficient. This should give the same results when
>> re-run.
>>
>> John Forrest
>> On 24/03/14 14:01, hmi21 at cantab.net wrote:
>>
>>> Hi,
>>>
>>>
>>> First of all a big thanks to everyone developing and maintaining CBC.
>>> It?s
>>> an incredibly powerful tool. What I find extremely useful is that CBC has
>>> been parallelised and runs large problems utilising all cores on a
>>> multicore machine. The other day, however, I?ve come across a series of
>>> problems that when rerun on a parallelised version of CBC provide
>>> different results (objective values). When run in a single thread, CBC
>>> finds the same result consistently. I?ve posted one of the problems as
>>> well as a script to run it on CBC?s bug tracking system, but haven?t had
>>> a response (https://projects.coin-or.org/Cbc/ticket/147). Note that all
>>> heuristics and cuts are turned off when running the problem. I?d like to
>>> know whether a) this is indeed a bug or just me not using the library
>>> correctly and b) how mature the parallelised code of CBC is.
>>>
>>> Any help would be appreciated!
>>>
>>>
>>> Many thanks,
>>>
>>>
>>> Hans
>>>
>>>
>>> _______________________________________________
>>> Cbc mailing list
>>> Cbc at list.coin-or.org
>>> http://list.coin-or.org/mailman/listinfo/cbc
>>>
>>>
>>>
>>
>>
>> ------------------------------
>>
>>
>> _______________________________________________
>> Cbc mailing list
>> Cbc at list.coin-or.org
>> http://list.coin-or.org/mailman/listinfo/cbc
>>
>>
>>
>> End of Cbc Digest, Vol 81, Issue 16
>> ***********************************
>>
>>
>
> _______________________________________________
> Cbc mailing list
> Cbc at list.coin-or.org
> http://list.coin-or.org/mailman/listinfo/cbc
>
>