[CppAD] Performance of CppAD and OpenMP

Brad Bell bradbell at seanet.com
Sat Feb 19 08:14:19 EST 2011


The test case
http://www.coin-or.org/CppAD/Doc/example_a11c.cpp.xml
is an example from the OpenMP standards document. It does not use CppAD 
at all and is intended to test the limitations of your system and 
compiler. It is one of the cases run by openmp/run.sh. I have found 
that, for some systems and compilers, it shows the same kind of poor 
scaling that you describe below.
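
A minimal sketch of such a CppAD-free scaling check follows. It is not 
the code of example_a11c.cpp itself: the loop body is modeled on the 
simple parallel loop in the OpenMP specification, and the function names 
average_neighbors and time_average_neighbors, as well as the problem 
sizes, are assumptions made for illustration. OpenMP must be enabled at 
compile time (for example -fopenmp with g++); otherwise the pragma is 
ignored and the loop runs serially.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Loop modeled on the simple parallel loop in the OpenMP specification:
// b[i] becomes the average of a[i] and a[i-1]; b[0] is left untouched.
// No CppAD types are involved, so only the system and compiler are tested.
void average_neighbors(const std::vector<float>& a,
                       std::vector<float>& b,
                       int n_threads)
{
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    const float* pa = a.data();
    float* pb = b.data();
#pragma omp parallel for num_threads(n_threads)
    for (long i = 1; i < n; ++i) {
        pb[i] = (pa[i] + pa[i - 1]) / 2.0f;
    }
}

// Hypothetical helper: wall-clock seconds needed for `repeat` passes over
// vectors of length n when n_threads threads are requested.
double time_average_neighbors(std::size_t n, int repeat, int n_threads)
{
    std::vector<float> a(n), b(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<float>(i);
    }
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeat; ++r) {
        average_neighbors(a, b, n_threads);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_average_neighbors with the same n and repeat for 1, 4, 8, 
16, ... threads and comparing the results should indicate whether the 
slowdown at high thread counts appears even without CppAD.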

On 2/18/2011 12:30 PM, schattenpflanze at arcor.de wrote:
> Hello,
>
> I have another question concerning the performance of CppAD when 
> OpenMP is enabled. It seems that CppAD scales very badly when the 
> number of threads and cores exceeds a certain number. I have tried to 
> construct a minimal example reproducing the issue. Running the simple 
> (and absolutely pointless) example code listed below on a machine with 
> 32 native cores (no hyperthreading, single workstation) yields the 
> following results:
> 1 thread:  8.6 seconds
> 4 threads: 2.8 s
> 8 threads: 2.2 s
> 10 threads: 2.4 s
> 12 threads: 4 s
> 14 threads: 3.8 s
> 16 threads: 4.2 s
> 24 threads: 8.1 s (!)
> 28 threads: 9.5 s (!)
>
> I am, of course, aware that additional threads cause additional 
> overhead, and that the performance does not necessarily increase with 
> the number of threads. However, this significant _decrease_ seems 
> strange. In particular, if I remove the line
> CppAD::Independent(x)
> from the code, I obtain:
> 4 threads: 0.38 s
> 8 threads: 0.20 s
> 16 threads: 0.14 s
> 24 threads: 0.12 s,
> which is the kind of scaling that I would have expected.
>
> Memory consumption seems to be low. I have tried various scheduling 
> and variable sharing policies, but the problem persists. I also attach 
> the interesting results of the CppAD openmp test script. What is the 
> reason for this behaviour and how can I counter it?
>
> Thank you and best regards,
> Peter
>
>
> Test code:
> ----------------------------------------------------
> int n_par = 45;
> CppAD::vector<AD<double> > x(n_par);
> for (int i=0; i<n_par; ++i) {
>   x[i] = i;
> }
>
> #pragma omp parallel for \
>     firstprivate(x) \
>     schedule(dynamic,1) \
>     num_threads(global_paras->n_threads)
> for (int i=0; i<1000; ++i) {
>   CppAD::Independent(x);
>   CppAD::vector<AD<double> > y(1);
>
>   y[0] = 0.0;
>   for (int k=0; k<1000; ++k) {
>     for (int j=0; j<(int)x.size(); ++j) {
>       y[0] += CppAD::pow(x[j] - y[0], 0.1);
>     }
>   }
> }
