[Clp] ClpSimplex openblas multithreaded
John Forrest
jjhforrest at gmail.com
Mon Oct 14 12:48:50 EDT 2019
Peter,
ClpSimplex can use openblas, and it does help.
In the configure command you need to add the following to CXXDEFS:
-DCLP_USE_OPENBLAS=n
I configure with --disable-blas and add -lopenblas to ADD_CXXFLAGS.
As I normally use it from within Cbc, I set n to 1. You can of course
experiment, but my quick test did not show any real gain from
setting it to 4.
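As a rough sketch only, the options described above could be combined into a single configure invocation like the one below. The variable names N and CONFIGURE_CMD are purely illustrative, and the exact way CXXDEFS and ADD_CXXFLAGS are passed (here as VAR=value arguments to configure) is an assumption; check ./configure --help for your Clp version.

```shell
#!/bin/sh
# Sketch under assumptions: flag names are taken from the message above;
# passing them as VAR=value arguments to configure is an assumption.

# Value given to CLP_USE_OPENBLAS (1 is the value used with Cbc above;
# a quick test with 4 showed no real gain).
N=1

# --disable-blas, the CLP_USE_OPENBLAS define and -lopenblas, as described above.
CONFIGURE_CMD="./configure --disable-blas CXXDEFS=-DCLP_USE_OPENBLAS=${N} ADD_CXXFLAGS=-lopenblas"

# Print the command for inspection; run it from the top of the Clp source tree.
echo "${CONFIGURE_CMD}"
```

To try another value, change N and reconfigure and rebuild before timing again.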
John Forrest
On 14/10/2019 16:56, Peter Lichard wrote:
> Hello,
>
> I am solving an LP problem with a constraint matrix going up to a size
> of 26000x1020 with ClpSimplex and CoinPackedMatrix.
> It has a density of about 0.3%. I have attached an image of the density.
>
> I need this to be as fast as possible (currently about 160ms), so I
> tried configuring Clp to use openblas, hoping that SIMD-optimized code
> and multi-threading would speed it up.
> After building Clp (version 1.17 stable) myself with ./configure
> --with-blas=openblas --with-blas-lib=-lopenblas, libClp.so does indeed
> appear to be linked against the blas libraries.
> Multi-threading aside, ltrace does not show any call to a blas function,
> apart from my own checks of the number of openblas threads (12, with
> OPENBLAS_THREAD).
>
> Am I missing something? Can ClpSimplex actually make use of openblas?
> Would it even speed things up?
>
> Thank you for your time,
> Peter.
>
> ========================================= lddtree /usr/local/ =================================
> libClp.so => /usr/local/lib/libClp.so (interpreter => none)
> libCoinUtils.so.0 => /usr/local/lib/libCoinUtils.so.0
> libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1
> libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0
> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> ld-linux-x86-64.so.2 => /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
> libgfortran.so.4 => /usr/lib/x86_64-linux-gnu/libgfortran.so.4
> libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0
> libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
> libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
> ========================================= lddtree ===========================================
>
> ========================================= ltrace -c ./timing_test ===============================
> % time seconds usecs/call calls function
> ------ ----------- ----------- --------- --------------------
> 28.91 3.483803 173 20116 tan
> 20.33 2.449326 162 15112 memset
> 17.49 2.107528 174 12052 sincos
> 10.61 1.278199 174 7326 gsl_matrix_get
> 4.50 0.542519 172 3151 free
> 2.96 0.356870 177 2005 gsl_poly_complex_workspace_free
> 2.92 0.352372 175 2005 gsl_poly_complex_workspace_alloc
> 2.86 0.344272 171 2005 gsl_poly_complex_solve
> 2.30 0.277552 170 1632 malloc
> 2.14 0.257797 169 1519 calloc
> 1.63 0.195977 166 1177 memcpy
> 0.82 0.098717 178 552 _ZGVdN4v_cos
> 0.79 0.094662 179 528 _ZGVdN4v_sin
> 0.42 0.050496 2295 22 _ZN10ClpSimplex16initialDualSolveEv
> 0.23 0.027238 176 154 __memcpy_chk
> 0.16 0.019791 178 111 gsl_bspline_deriv_eval
> 0.14 0.016458 748 22 _ZN10ClpSimplex4dualEii
> 0.13 0.016218 176 92 cos
> 0.13 0.015419 175 88 sin
> 0.06 0.007715 175 44 __memset_chk
> 0.05 0.005886 735 8 strcmp
> 0.05 0.005594 254 22 _ZN16CoinPackedMatrixC1EbPKiS1_PKdi
> 0.04 0.004901 222 22 _ZN10ClpSimplexD1Ev
> 0.04 0.004332 196 22 _ZN18CoinMessageHandler11setLogLevelEi
> 0.03 0.004174 189 22 _ZN8ClpModel18setPrimalToleranceEd
> 0.03 0.003902 177 22 _ZN10ClpSimplexC1Eb
> 0.03 0.003776 171 22 _ZN16CoinPackedMatrixD1Ev
> 0.03 0.003708 168 22 _ZN8ClpModel17setMaximumSecondsEd
> 0.03 0.003445 156 22 _ZN10ClpSimplex11loadProblemERK16CoinPackedMatrixPKdS4_S4_S4_S4_S4_
> 0.03 0.003101 140 22 _ZN8ClpModel16setDualToleranceEd
> 0.02 0.002599 173 15 gsl_bspline_ncoeffs
> 0.02 0.002565 366 7 __printf_chk
> 0.01 0.001369 342 4 puts
> 0.01 0.001286 257 5 fflush
> 0.01 0.001166 166 7 clock_gettime
> 0.01 0.000994 994 1 __cxa_atexit
> 0.01 0.000816 816 1 _ZNSt8ios_base4InitC1Ev
> 0.01 0.000671 671 1 openblas_get_num_threads
> 0.01 0.000640 640 1 openblas_get_parallel
> 0.00 0.000444 222 2 gsl_bspline_knots_uniform
> 0.00 0.000415 207 2 gsl_matrix_alloc
> 0.00 0.000403 201 2 gsl_bspline_alloc
> 0.00 0.000399 199 2 gsl_matrix_free
> 0.00 0.000345 172 2 gsl_bspline_free
> ------ ----------- ----------- --------- --------------------
> 100.00 12.049860 69971 total
>
> ========================================= ltrace result =======================================