[Csdp] Puzzling result using icc

Ronald Bruck bruck at usc.edu
Sat Aug 22 00:23:46 EDT 2009


Ah, here it is!  Yes, that's appropriate.  And so is "-openmp", which should have been obvious.
But the only documentation I could find from Intel on OpenMP and icc was pretty sparse; the
manpage is MUCH more useful.  What I could find seemed to say that if you want to use -lmkl
then you also need to include -lguide and -lpthread.  So this was a misconception on my part
about what it takes to use MKL and OpenMP together...

That said, the p4 single-threaded version is still faster than the icc version, until you
increase the dimension: on a similar problem, but with about 1600 constraints, the icc version
took 5 sec and the p4 version took 20 sec.  I have two dual-core processors (four cores in all),
so a 4x speedup is exactly right.  (In fact, suspiciously right.)
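The arithmetic behind "exactly right" can be sketched as below. This is only an illustration: the function names are made up for the example, and the two timings come from different builds (the p4 binary and the icc build), so the ratio is a rough speedup rather than a controlled measurement.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-threaded wall time to multithreaded wall time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, cores):
    """Speedup divided by core count; 1.0 means perfect linear scaling."""
    return speedup(t_serial, t_parallel) / cores


# Timings quoted in the message: 20 sec (p4, one thread) vs 5 sec (icc build),
# on two dual-core processors, i.e. four cores.
s = speedup(20.0, 5.0)
e = parallel_efficiency(20.0, 5.0, cores=4)
print(s, e)  # 4.0 1.0
```

An efficiency of exactly 1.0 is unusual for real code, which is presumably why the result looks suspicious.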

Actually, this sort of optimization isn't of much use to me.  I'm solving many, many problems, and it's just as efficient, with less work, to do them four at a time, and let the OS schedule them.  But of course the parallelization allows you to solve much LARGER problems.
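A minimal sketch of the "four at a time" approach, in Python. Everything specific here is an assumption for illustration: the helper name run_batch, a solver binary called csdp on the PATH that takes the problem file as its first argument, and the use of OMP_NUM_THREADS=1 to ask any OpenMP runtime in the child process to stay single-threaded (a binary built without OpenMP simply ignores it).

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_batch(problem_files, solver_cmd=("csdp",), max_workers=4):
    """Run solver_cmd on each problem file, at most max_workers at a time.

    Returns a list of (problem_file, return_code) pairs in input order.
    """
    # Assumption: keep each child single-threaded so that four concurrent
    # solves do not oversubscribe four cores.
    env = dict(os.environ, OMP_NUM_THREADS="1")

    def solve(path):
        result = subprocess.run(
            list(solver_cmd) + [path],
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return path, result.returncode

    # Threads suffice here: each worker only waits on a child process,
    # and the OS schedules the children across the cores.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(solve, problem_files))
```

A call might look like run_batch(["prob1.dat-s", "prob2.dat-s"]), where the file names are hypothetical. Solver output is discarded in this sketch; a real batch run would more likely redirect it to per-problem log files.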

-- Ron Bruck

----- Original Message -----
From: Brian Borchers <borchers at nmt.edu>
Date: Friday, August 21, 2009 7:48 pm
Subject: Re: [Csdp] Puzzling result using icc
To: borchers at nmt.edu, csdp at list.coin-or.org, bruck at usc.edu

> 
> I think that the reason that Ron's compiled code is slower than the 
> single-threaded binary is probably related to 
> 
>   -DSETNUMTHREADS
> 
> You absolutely need to use this switch if you're linking with MKL.  
> Otherwise, you'll end up with n^2 threads running, where n is the 
> number of processor cores, and this can cause massive performance 
> problems.  
> 
> 
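The n^2 figure in the quoted message can be illustrated with a little arithmetic. The sketch below assumes the usual way nested parallelism multiplies: n threads at the outer (OpenMP) level, each calling a threaded library routine that starts n threads of its own. The function names are made up for the example, and the sketch does not describe how -DSETNUMTHREADS is implemented, only the thread counts before and after the inner level is limited to one thread.

```python
def runnable_threads(outer_threads, inner_threads):
    """Total threads when every outer thread spawns its own inner team."""
    return outer_threads * inner_threads


def oversubscription(outer_threads, inner_threads, cores):
    """Runnable threads per core; values above 1.0 mean cores are shared."""
    return runnable_threads(outer_threads, inner_threads) / cores


cores = 4  # two dual-core processors, as in the message above

# Both levels default to one thread per core: n^2 threads.
print(runnable_threads(4, 4), oversubscription(4, 4, cores))  # 16 4.0

# Inner level limited to one thread: back to n threads.
print(runnable_threads(4, 1), oversubscription(4, 1, cores))  # 4 1.0
```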


