[Clp] ClpSimplex openblas multithreaded

Peter Lichard peter.lichard at gmail.com
Mon Oct 14 11:56:43 EDT 2019


Hello,

I am solving an LP problem whose constraint matrix can reach a size of
26000x1020, using ClpSimplex and CoinPackedMatrix.
It has a density of about 0.3%. I have attached an image of the density.
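
As a rough sanity check on these figures, the number of nonzeros implied by the stated size and density can be estimated as below. This is only a sketch: `estimate_nonzeros` is a hypothetical helper, not part of Clp or CoinUtils, and it assumes the 0.3% figure refers to the fraction of nonzero entries in the full matrix.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a Clp/CoinUtils function): estimates how many
// nonzero entries a rows x cols matrix holds, given its density in percent.
std::int64_t estimate_nonzeros(std::int64_t rows, std::int64_t cols,
                               double density_percent) {
    // Total number of entries in the dense rows x cols matrix.
    const double total = static_cast<double>(rows) * static_cast<double>(cols);
    // Scale by the density and round to the nearest whole entry.
    return static_cast<std::int64_t>(std::llround(total * density_percent / 100.0));
}
```

With the numbers above, `estimate_nonzeros(26000, 1020, 0.3)` gives about 79,560 nonzeros out of 26,520,000 entries.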

I need this to be as fast as possible (currently about 160 ms), so I tried
configuring Clp to use openblas, hoping that SIMD-optimized code and
multi-threading would speed it up.
After building Clp (version 1.17 stable) myself with ./configure
--with-blas=openblas --with-blas-lib=-lopenblas, libClp.so does indeed
appear to be linked against the BLAS libraries.

Multi-threading aside, ltrace does not show any call to a BLAS function,
apart from my own checks of the number of openblas threads (12, with
OPENBLAS_THREAD).

Am I missing something? Can ClpSimplex actually make use of openblas?
Would it even speed things up?

Thank you for your time,
Peter.

========================================= lddtree /usr/local/ =================================
libClp.so => /usr/local/lib/libClp.so (interpreter => none)
    libCoinUtils.so.0 => /usr/local/lib/libCoinUtils.so.0
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1
        libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0
            libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
                ld-linux-x86-64.so.2 => /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
            libgfortran.so.4 => /usr/lib/x86_64-linux-gnu/libgfortran.so.4
                libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
========================================= lddtree ===========================================

========================================= ltrace -c ./timing_test ===============================
% time     seconds  usecs/call     calls      function
------ ----------- ----------- --------- --------------------
 28.91    3.483803         173     20116 tan
 20.33    2.449326         162     15112 memset
 17.49    2.107528         174     12052 sincos
 10.61    1.278199         174      7326 gsl_matrix_get
  4.50    0.542519         172      3151 free
  2.96    0.356870         177      2005 gsl_poly_complex_workspace_free
  2.92    0.352372         175      2005 gsl_poly_complex_workspace_alloc
  2.86    0.344272         171      2005 gsl_poly_complex_solve
  2.30    0.277552         170      1632 malloc
  2.14    0.257797         169      1519 calloc
  1.63    0.195977         166      1177 memcpy
  0.82    0.098717         178       552 _ZGVdN4v_cos
  0.79    0.094662         179       528 _ZGVdN4v_sin
  0.42    0.050496        2295        22 _ZN10ClpSimplex16initialDualSolveEv
  0.23    0.027238         176       154 __memcpy_chk
  0.16    0.019791         178       111 gsl_bspline_deriv_eval
  0.14    0.016458         748        22 _ZN10ClpSimplex4dualEii
  0.13    0.016218         176        92 cos
  0.13    0.015419         175        88 sin
  0.06    0.007715         175        44 __memset_chk
  0.05    0.005886         735         8 strcmp
  0.05    0.005594         254        22 _ZN16CoinPackedMatrixC1EbPKiS1_PKdi
  0.04    0.004901         222        22 _ZN10ClpSimplexD1Ev
  0.04    0.004332         196        22 _ZN18CoinMessageHandler11setLogLevelEi
  0.03    0.004174         189        22 _ZN8ClpModel18setPrimalToleranceEd
  0.03    0.003902         177        22 _ZN10ClpSimplexC1Eb
  0.03    0.003776         171        22 _ZN16CoinPackedMatrixD1Ev
  0.03    0.003708         168        22 _ZN8ClpModel17setMaximumSecondsEd
  0.03    0.003445         156        22 _ZN10ClpSimplex11loadProblemERK16CoinPackedMatrixPKdS4_S4_S4_S4_S4_
  0.03    0.003101         140        22 _ZN8ClpModel16setDualToleranceEd
  0.02    0.002599         173        15 gsl_bspline_ncoeffs
  0.02    0.002565         366         7 __printf_chk
  0.01    0.001369         342         4 puts
  0.01    0.001286         257         5 fflush
  0.01    0.001166         166         7 clock_gettime
  0.01    0.000994         994         1 __cxa_atexit
  0.01    0.000816         816         1 _ZNSt8ios_base4InitC1Ev
  0.01    0.000671         671         1 openblas_get_num_threads
  0.01    0.000640         640         1 openblas_get_parallel
  0.00    0.000444         222         2 gsl_bspline_knots_uniform
  0.00    0.000415         207         2 gsl_matrix_alloc
  0.00    0.000403         201         2 gsl_bspline_alloc
  0.00    0.000399         199         2 gsl_matrix_free
  0.00    0.000345         172         2 gsl_bspline_free
------ ----------- ----------- --------- --------------------
100.00   12.049860                 69971 total

========================================= ltrace result =======================================
-------------- next part --------------
A non-text attachment was scrubbed...
Name: constr_matrix.png
Type: image/png
Size: 13726 bytes
Desc: not available
URL: <http://list.coin-or.org/pipermail/clp/attachments/20191014/e66b2127/attachment-0001.png>