[CHiPPS] Fwd: Questions on Blis
mjs at clemson.edu
Fri Mar 13 13:41:44 EDT 2009
On Fri, 2009-03-13 at 14:38 +0100, Alessandro Tiberi wrote:
> Ted Ralphs ha scritto:
> > We have been testing BLIS with some fairly large instances recently,
> > and have also been running into memory problems on long runs with
> > small numbers of processors. How many rows do your instances have? How
> > many nonzeros?
> These are the average stats of our problematic instances:
> - 500K rows
> - 1M columns
> - 2.5M non zero elements
> We are running Blis (MPI build) on a single machine with 16 GB of RAM
> and two quad-core Opterons, for a total of eight cores. We tried
> several degrees of parallelism (# of processes), from 2 to 8. The
> bottom line seems to be that the more processes we use, the sooner it
> runs out of memory.
That makes sense, given that CHiPPS is currently architected for a
distributed environment. When Ted says that more processes can
alleviate the memory issue, he is thinking of distributing the processes
among different nodes in a cluster or grid, each with its own memory.
In that world, each process carries some duplicate information, so
loading all processes onto a single node will take more memory than
running a single process would.
</grreplace>
> However, from Alp's log, it appears that relatively few nodes are
> queued (about 13K nodes with 2 processes running and 7K with 6).
> (BTW, I have been told that a famous commercial solver has no problem
> with these instances even when running on much lower-profile
> platforms, although I have not tried it personally.)
Well, CHiPPS is a development code at this point, so there are bound to
be lots of opportunities to improve it!
The famous commercial software, however, is architected for SMP, so
aside from possibly being more memory efficient in each thread, it
doesn't have the memory issues that you see running a distributed code
in a single address space.
> > What is the platform you are running it on? We are also
> > looking at ways to reduce the memory footprint. Of course,
> > distributing the computation over more processors works to some
> > degree.
> Concerning this, I guess that when using more processes we run out of
> memory earlier just because there is a certain unavoidable
> amount of data duplicated, right?
Right, see above.
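To make the duplication effect concrete, here is a back-of-the-envelope
model (the numbers and the split between duplicated and private data are
illustrative only, not measured from BLIS):

```python
# Rough model (not measured from BLIS): if each MPI process holds its
# own copy of some fixed amount of problem data, then total memory on
# one machine grows linearly with the process count, so a fixed-RAM box
# hits its limit sooner as processes are added.

def total_memory_gb(n_procs, duplicated_gb, private_gb):
    """Hypothetical per-machine footprint: every process stores its own
    copy of the duplicated data plus its private search-tree data."""
    return n_procs * (duplicated_gb + private_gb)

# Example: with (made-up) 1.5 GB duplicated + 0.5 GB private per
# process, eight processes already fill a 16 GB machine.
for n in (1, 2, 4, 8):
    print(n, "procs ->", total_memory_gb(n, 1.5, 0.5), "GB")
```

On a cluster, the same processes each land on a separate node, so the
per-node footprint stays at one process's worth regardless of count.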
> > I never really thought about the idea of using standard
> > compression software to compress the node description. I would be
> > surprised if it had much effect in most cases, since the data are
> > already compressed by the use of sparse vectors and matrices
> > everywhere and also by storing subtrees by differencing, etc.
> I do not have a very strong feeling about this. (On the one hand, even
> if the vectors are sparse, they are ultimately just a sequence of
> numbers, and hopefully they are far enough from uniformly distributed
> that we should gain something from compression...)
That's likely to be an empirical question, but probably worth a try.
Writing idle subtrees to disk is also definitely worth a try.
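Both experiments are cheap to prototype. Here is a sketch of the idea
using a generic serialized buffer as a stand-in for a node description
(`serialize_node` and its byte layout are invented for illustration; the
actual CHiPPS encoding differs):

```python
import os
import struct
import tempfile
import zlib

# Hypothetical stand-in for a node description: a sparse vector stored
# as (index, value) pairs. This is NOT the real CHiPPS/BLIS encoding;
# it only demonstrates the compress-and-spill experiment.
def serialize_node(indices, values):
    buf = struct.pack("<I", len(indices))
    for i, v in zip(indices, values):
        buf += struct.pack("<Id", i, v)
    return buf

indices = range(0, 5000, 7)
raw = serialize_node(indices, [1.0] * len(indices))

# Standard off-the-shelf compression, as suggested in the thread.
packed = zlib.compress(raw, 6)
print("raw:", len(raw), "bytes; compressed:", len(packed), "bytes")

# Writing an idle subtree to disk and restoring it later:
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(packed)
with open(path, "rb") as f:
    restored = zlib.decompress(f.read())
os.remove(path)
assert restored == raw  # round trip is lossless
```

Running something like this against real node buffers dumped from a run
would answer the empirical question directly: if the compression ratio
on actual node descriptions is poor, the differencing and sparse storage
are already doing the work.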
> However, I am not even sure that the node description is the right
> thing to compress... Any ideas are greatly appreciated!
> Do you happen to know, or have an estimate of, how much space it takes
> on average to store a node description, relative to the original
> problem?
> > I guess
> > it should be easy to try, though.
> I hope it is, I think I will give it a try.
> > Are you using BLIS out of the box to
> > solve instances in MPS format or do you have application-specific
> > methods that you are using with BLIS?
> Essentially (apart from the addition of some heuristics) out of the box,
> at least for now.
Clemson University Math Sciences
mjs AT clemson DOT edu