[Coin-symphony] symphony pvm goes back to unithread

Ted Ralphs tkralphs at lehigh.edu
Thu Aug 4 18:45:08 EDT 2005


Alexandre Le Bouthillier wrote:
> After around 24h-48h of running 
> ./vrp_m_tm_cp -p 11 -F /home/$USER/data/r112.vrp -N 9 -u 833
> 
> only 1 thread remains (the last one of the pbs job).  The others stop
> sending any info, and the vrp process no longer appears in the process
> lists of the other machines.
> 
> I have no error messages in the log file, pvm still has the conf for all 11
> nodes, and there are no error messages on the machines.  I have tried with
> another end node, with the same problem.
> 
> Is this normal? Any clue?

Something strange is definitely going on here. Looking at your output,
the results are supposed to be reported every 600 seconds, but after
around 88K seconds, there is a large gap of almost 3000 seconds before
the next report, after which the number of candidates stays almost
constant, dropping by one or two each report. The reporting interval
goes down after that, but is still around 900 seconds for the next
couple of reports. To be honest, I can't really come up with a
reasonable explanation for this behavior. If the candidate list were
empty or you had set a time limit, then that might explain some of it,
but as it is, I'm at a loss. If you can give me some more information,
such as more output or an indication of whether the worker processes
exited normally, that would help. I realize that increasing the
verbosity level may result in a huge output file, but if it's possible,
that might give a clue. You could also turn on the warm starting or
logging features, so that the run can be stopped before the strange
behavior starts and then restarted with a higher verbosity level. Let
me know if you need any help.
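
If it helps to pin down where the reporting slows in a long output
file, a minimal Python sketch like the one below could flag intervals
well above the expected 600 seconds. It assumes the report times have
already been pulled out of the output into a list of seconds; the
function name, the tolerance factor, and the sample values are all
hypothetical and are not actual SYMPHONY output.

```python
from typing import List, Tuple


def find_reporting_gaps(
    timestamps: List[float],
    expected: float = 600.0,   # expected seconds between reports
    tolerance: float = 1.25,   # hypothetical slack factor before flagging
) -> List[Tuple[float, float, float]]:
    """Return (previous, current, gap) for each pair of consecutive
    reports that are more than expected * tolerance seconds apart."""
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        delta = curr - prev
        if delta > expected * tolerance:
            gaps.append((prev, curr, delta))
    return gaps


# Hypothetical report times (seconds since start), for illustration only.
report_times = [87000, 87600, 88200, 91200, 92100, 93000, 93600]

for start, end, gap in find_reporting_gaps(report_times):
    print(f"gap of {gap:.0f}s between t={start:.0f} and t={end:.0f}")
```

The time at which the first flagged gap begins would be a reasonable
point before which to stop the run and restart it with higher
verbosity.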

Cheers,

Ted
-- 
Dr. Ted Ralphs
Assistant Professor
Industrial and Systems Engineering
Lehigh University
(610)758-4784
tkralphs at lehigh.edu
www.lehigh.edu/~tkr2


