[Coin-discuss] problems with OSL on Red Hat [was: AAP_BP example]

Stephan Hennig mailing_list at web.de
Tue Dec 2 07:54:59 EST 2003


Laszlo Ladanyi wrote:
> Thanks for the stack trace, Matt!
>
> I did some digging on the web and found that most likely this crash is
> caused by a bug in RedHat's libc. A number of things crash, not just osl.
> There are two solutions:
>
> 1) A workaround: set the environment variable LD_ASSUME_KERNEL to 2.4.17:
>       setenv LD_ASSUME_KERNEL 2.4.17    # for (t)csh
>       export LD_ASSUME_KERNEL=2.4.17    # for all other ..sh
>
> 2) The real solution: according to a post on 11/14/2003: "updating to the
>    glibc errata a couple days ago fixed this". The post is at
>    http://gcc.gnu.org/ml/gcc/2003-11/msg00815.html
>
> Let me know if either/both helps.
> --Laci

Because I only have a modem connection, I cannot apply the update before next
week. I'll report the results then.
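
For comparing machines in the meantime, here is a minimal sketch that reduces
the package string printed by 'rpm -q glibc' to its glibc series. The helper
name glibc_series is made up for illustration and is not part of COIN or OSL;
the two sample strings are the ones Matthew posted (quoted below).

```shell
#!/bin/sh
# glibc_series is a hypothetical helper, not part of COIN or OSL.
# It reduces an rpm package string such as "glibc-2.3.2-27.9" to "2.3".
glibc_series() {
    version=${1#glibc-}      # strip the package name -> "2.3.2-27.9"
    major=${version%%.*}     # text before the first dot -> "2"
    rest=${version#*.}       # text after the first dot  -> "3.2-27.9"
    minor=${rest%%[!0-9]*}   # leading digits only       -> "3"
    printf '%s.%s\n' "$major" "$minor"
}

glibc_series "glibc-2.3.2-27.9"   # Red Hat 9 box where OSL crashes
glibc_series "glibc-2.2.93-5"     # Red Hat 8 box where OSL starts
```

On a live Red Hat system this could presumably be called as
glibc_series "$(rpm -q glibc)".
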
So far I have tried only the first solution (setting LD_ASSUME_KERNEL). It
removes the segfault when running AAP with OSL. However, problem 6.1 still
takes very long to solve and finally stops with a wrong solution of 48.
The same seems to apply to problems 6.4, 8.2-8.4, 10.2 and 10.4. I didn't
wait for a solution to these problems, but the behaviour looks the same
as with 6.1. The problems not named are solved quickly and correctly.
So what are your results for the problems in question? Is there another
problem with OSL (with getDualRays, as Matthew reported)? Does this affect only
AAP, or should I return to CLP in general (OSL has been faster so far)?
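
To avoid waiting on each slow instance by hand, a loop with a time limit might
help. This is only a sketch under assumptions: run_with_limit is a hypothetical
helper, it assumes the 'timeout' utility from GNU coreutils is installed, and
the 600-second limit is arbitrary. The solver output is discarded here, so the
reported status says nothing about whether the solution value is correct.

```shell
#!/bin/sh
# run_with_limit LIMIT CMD [ARGS...]
# Hypothetical helper: runs CMD under `timeout` (GNU coreutils, assumed
# to be installed) and prints "ok", "timeout" or "failed(<status>)".
run_with_limit() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    case $status in
        0)   echo "ok" ;;
        124) echo "timeout" ;;          # status GNU timeout returns when the limit is hit
        *)   echo "failed($status)" ;;  # a segfault typically shows up as 139
    esac
}

# Example: the instances that were slow with OSL. The AAP_datafile
# override on the command line is the one used earlier in this thread.
if [ -x Linux-g/bcps ]; then
    for p in 6.1 6.4 8.2 8.3 8.4 10.2 10.4; do
        printf '%s: ' "$p"
        run_with_limit 600 Linux-g/bcps ParamFile Run/par.par \
            AAP_datafile "Data/small/$p.aap"
    done
fi
```

The loop is guarded so that the script does nothing when it is not started from
the AAP_BP directory.
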

Regards,
Stephan Hennig


> On Mon, 1 Dec 2003, Matthew Galati wrote:
>
>> Stephan,
>>
>> I get the same SegFault using OSL on my Linux box running Redhat 9.0
>> which uses glibc-2.3.
>>
>> [mgalati@localhost Run]$ uname -a
>> Linux localhost.localdomain 2.4.20-9 #1 Wed Apr 2 13:42:50 EST 2003 i686
>> i686 i386 GNU/Linux
>> [mgalati@localhost Run]$ rpm -q glibc
>> glibc-2.3.2-27.9
>>
>> EKK0006I Optimization Solutions and Library Version 3.0 (Jan  7 2003)
>>
>> My debugger gives the following stack:
>> (gdb) where
>> #0  0x420747ae in _int_free () from /lib/tls/libc.so.6
>> #1  0x42073786 in free () from /lib/tls/libc.so.6
>> #2  0x4206b560 in fopen@GLIBC_2.0 () from /lib/tls/libc.so.6
>> #3  0x40155f16 in ekkdxt9 () from /home/mgalati/src/osl/osllib/libosl.so
>> #4  0x4015723f in ekkdxtn () from /home/mgalati/src/osl/osllib/libosl.so
>> #5  0x40157486 in ekkdxta () from /home/mgalati/src/osl/osllib/libosl.so
>> #6  0x4015757c in ekkdxt1 () from /home/mgalati/src/osl/osllib/libosl.so
>> #7  0x401585ad in ekkdxte () from /home/mgalati/src/osl/osllib/libosl.so
>> #8  0x401590dc in ekkchecklicense () from
>> /home/mgalati/src/osl/osllib/libosl.so
>> #9  0x4004d57d in ekk_initializeContext () from
>> /home/mgalati/src/osl/osllib/libosl.so
>> #10 0x40022095 in OsiOslSolverInterface::incrementInstanceCounter() ()
>> at OsiOslSolverInterface.cpp:1439
>> #11 0x400222d2 in OsiOslSolverInterface (this=0x8178848) at
>> OsiOslSolverInterface.cpp:1489
>> #12 0x080a7916 in AAP_lp::initialize_solver_interface() (this=0x8178508)
>> at /home/mgalati/COIN_EXAMPLES/AAP_BP/LP/AAP_lp.cpp:70
>> #13 0x0809f4f1 in BCP_single_environment::register_process()
>> (this=0x811a108) at
>> /home/mgalati/COIN/Bcp/Member/BCP_message_single.cpp:250
>> #14 0x08090986 in main (argc=3, argv=0xbfffd7f4) at
>> /home/mgalati/COIN/Bcp/TM/BCP_tm_main.cpp:47
>> #15 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6
>>
>>
>>
>> However, using OSL on my Linux box running Redhat 8.0, which uses
>> glibc-2.2, it runs ok. It actually doesn't solve it correctly - there
>> seems to be an issue with OSL's getDualRays return (or, how I am using
>> it) which might be related to a post I made earlier today. I will look
>> into this and get back to you.
>>
>> [mgalati@dyn035199 Run]$ uname -a
>> Linux dyn035199.ie.lehigh.edu 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002
>> i686 i686 i386 GNU/Linux
>> [mgalati@dyn035199 Run]$ rpm -q glibc
>> glibc-2.2.93-5
>>
>> As for the OSL issue with glibc-2.2 vs glibc-2.3, perhaps other OSL
>> users on this list can comment.
>>
>> Matt
>>
>> >> Thanks Matthew, for the AAP example. One remark on the documentation. To
>> >> run the executable, section 4 says '... type Linux-g/bcps
>> >> ParamFile par.par'. This is wrong in two respects.
>> >> - First, the parameter file is located in Run/ .
>> >> - Second, in Run/par.par the paths to data files contain a leading ../ and
>> >> bcps complains that it cannot find the data file when typing
>> >> 'Linux-g/bcps ParamFile Run/par.par' on the command line. So in the end I
>> >> had to go down to Linux-g/ and then type './bcps ParamFile
>> >> ../Run/par.par' to run the example. The other option would be to remove the
>> >> leading ../ in Run/par.par at AAP_datafile. Then you can run the
>> >> executable by typing 'Linux-g/bcps ParamFile Run/par.par'. Either way, the
>> >> doc should be consistent in section 4.
>> >>
>> >> There is another problem I encountered while playing with AAP_BP. I have
>> >> no clue what's wrong and I don't know if it is an error in the code. But
>> >> I'll describe it here anyway.
>> >>
>> >> I tried to run the example with CLP and OSL on different problem files
>> >> with the leading ../ removed in Run/par.par at AAP_datafile. So that
>> >> line reads for example
>> >>
>> >> AAP_datafile            Data/small/6.1.aap
>> >>
>> >> All problem files run fine with CLP. But when I use OSL (by modifying
>> >> Makefile.aap) with problem file 6.1.aap, I get the following error
>> >>
>> >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> BCP_parameters::read_from_stream   Scanning parameter stream.
>> >> Speicherzugriffsfehler
>> >> [work@localhost AAP_BP]$
>> >>
>> >> The last message means 'segmentation fault'. Fortunately (?) it runs
>> >> without complications when I override the AAP_datafile setting with the
>> >> same problem file by issuing
>> >>
>> >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par AAP_datafile
>> >> Data/small/6.1.aap
>> >>
>> >> Stranger still, the behaviour is just the other way around with problem
>> >> file Data/small/8.1.aap or 10.1.aap. Issuing
>> >>
>> >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par
>> >>
>> >> works, but
>> >>
>> >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par AAP_datafile
>> >> Data/small/8.1.aap
>> >>
>> >> exits with the same error as above.
>> >>
>> >> The first behaviour (overriding AAP_datafile is necessary for running)
>> >> applies to all 6.*.aap files. The second behaviour (overriding
>> >> AAP_datafile prevents it from running) applies to all 8.*.aap and
>> >> 10.*.aap files. I don't know if this is really related to COIN-BCP.
>> >> Anyway, it's very odd. Any ideas?
>> >>
>> >> Kind regards,
>> >> Stephan Hennig




