[Coin-bcpdiscuss] Intra-node parallelization

Fri Oct 26 15:56:47 EDT 2007

Hi Ali,

On Mon, 15 Oct 2007, alikoc at mail.utexas.edu wrote:

> Hi All,
>
> I need to build a parallel branch & price code in the following way: Each of the
> available processors will work on the same node of the B&B tree, altogether.
> After finishing the current node, the master processor will assign them a new
> node (based on the node selecetion strategy) and they will work on that node
> alltogether. This goes untill all nodes are exhausted.
>
> This type of parallelization makes sense in branch & price since the big chunk
> of the work goes to solving the subproblems, which are plenty in number and
> independent of each other. Hence, each processor will be responsible from some
> number of subproblems. At each iteration of the column generation algorithm,
> the master will send the corresponding duals, and the slaves will send back the
> corresponding columns (solutions of the subproblems). At the beginning of each
> node, the master will also send the information about the branching
> restrictions.
>
> This type of parallelization (it might be in other forms, but what I mean is
> intra-node parallelization) is inevitable if the problem is so big that even
> the parallel processing (such as the one BCP suggests) is not sufficient to
> solve it to optimality. That is, we can only solve the root node of the B&B
> tree (or up to some level) to get a better lower bound than the LP bound using
> the Dantzig-Wolfe reformulation.
>
> Also, in moderate size problems even if we solve the problem to optimality, this
> type of parallelization might prove better than the inter-node parallelization.
> This happnes if the number of nodes processed during the search is small so
> that in inter-node parallelization (such as BCP's) total idle time of the
> processors at the beginning and at the end of the search constitues a big
> portion as compared to the total solution time.
>
> For this, as far as I can see, we need to be able to assign an arbitrary number
> of processors that runs the VG_user class to each single processor that runs
> LP_user class. As far as I know the code, the situation is that BCP devotes at
> most one VG processor to each LP processor (I might be wrong).
>
> Hence, my question is that, what is the easiest way to implement this in BCP?
> (Well, easiest so that I will write the code in one week and get the results in
> the next week and hopefully present in INFORMS conference :))

I guess the answer is a bit late for that... And in any case you are right, 
BCP is designed with one VG process per LP process. In the trunk I'm currently 
experimenting with dynamic allocation of processes, though at this point I'm 
using a good chunk of the processes for strong branching. However, that code 
is not quite stable yet, and it broke cut & column generation. I need to move 
those to demand based allocation as well. Once that's done you'll be able to 
do what you want now, but it'll take a couple of months before I'll have to 
time to get it working properly...

Of course, you can always fire up a large MPI world but tell BCP that it has 
only a few procs, then communicate yourself with the other procs. Not fun, but 
does work...

Good luck,
--Laci

>
> Thanks so much,
>
> Ali
> _______________________________________________
> Coin-bcpdiscuss mailing list
> Coin-bcpdiscuss at list.coin-or.org
> http://list.coin-or.org/mailman/listinfo/coin-bcpdiscuss
>