[Coin-discuss] AAP_BP example

Matthew Saltzman mjs at ces.clemson.edu
Mon Dec 1 23:34:31 EST 2003


On Mon, 1 Dec 2003, Laszlo Ladanyi wrote:

> Thanks for the stack trace, Matt!
>
> I did some digging on the web and found that this crash is most likely caused
> by a bug in Red Hat's glibc. A number of things crash, not just OSL. There are
> two solutions:
>
> 1) A workaround: set the environment variable LD_ASSUME_KERNEL to 2.4.17:
>       setenv LD_ASSUME_KERNEL 2.4.17    # for (t)csh
>       export LD_ASSUME_KERNEL=2.4.17    # for sh-style shells (sh, bash, ksh)
>
> 2) The real solution: according to a post on 11/14/2003: "updating to the
>    glibc errata a couple days ago fixed this". The post is at
>    http://gcc.gnu.org/ml/gcc/2003-11/msg00815.html

BTW, the current Red Hat 9 glibc erratum is glibc-2.3.2-27.9.7, released on
11/14.  If you haven't updated yet, that's the first thing to try.  I
don't know whether the fix for this crash is among them, but many fixes
have gone in since glibc-2.3.2-27.9:

[mjs@paladin mjs]$ rpm -q --changelog glibc | more
* Wed Nov 12 2003 Jakub Jelinek <jakub at redhat.com> 2.3.2-27.9.7

- fix support of non-RHL kernels without NPTL support (#109904)
- make sure _dl_sysinfo_int80 is in .text not .data section (#109918)

* Wed Nov 05 2003 Jakub Jelinek <jakub at redhat.com> 2.3.2-27.9.6

- fix ftw fd leak

* Tue Nov 04 2003 Jakub Jelinek <jakub at redhat.com> 2.3.2-27.9.5

- fix getifaddrs (CAN-2003-0859)
- fix linuxthreads sigaction (#108634)
- fix glibc 2.0 stdio compatibility
- fix uselocale (LC_GLOBAL_LOCALE)

* Fri Oct 31 2003 Jakub Jelinek <jakub at redhat.com> 2.3.2-27.9.4

- backport NPTL fork locking changes (#90036)

* Thu Oct 30 2003 Jakub Jelinek <jakub at redhat.com> 2.3.2-27.9.3

- backport NPTL mutex revamp changes (on IA-32 only; see
  http://people.redhat.com/drepper/futex.pdf for details)

* Wed Oct 29 2003 Jakub Jelinek <jakub at redhat.com> 2.3.2-27.9.2

- fix sprof (#103727)
- prevent dlopening of executables
- fix glob with GLOB_BRACE and without GLOB_NOESCAPE
- fix locale printing of word values on 64-bit big-endian arches
  (#107846)
- fix getnameinfo and getaddrinfo with reverse IPv6 lookups
  (#101261)
- backport TLS handling bugfixes from the trunk
- add pthread_cond_timedwait stubs to libc.so (#102709)

* Wed Oct 08 2003 Jakub Jelinek <jakub at redhat.com> 2.3.2-27.9.1

- fix getgrouplist (#101691)
- avoid nscd lockups when using LDAP (#54697)
- fix Ukrainian collation (#83973)
- fix perror (#85994)
- allow whitespace in ld.so.conf (#86032)
- fix strxfrm (#88409)
- fix LC_CTIME localedef creation (#88978)
- fix getifaddrs (#89026, #97828)
- fix getaddrinfo memory leaks (#89448)
- fix stdio glibc 2.0 compatibility (#90077)
- fix __cxa_finalize (#90301)
- fix sprintf (#90987)
- fix seteuid/setegid (#91567)
- search system library directories in ldconfig after ld.so.conf defined
  ones (#98966)
- avoid segfaults when "upgrading" from i686 to i386 rpm (#88456)
- issue warning about broken binaries using errno/h_errno on stderr,
  not stdout (#97814)
- fix pause, close and fsync cancel type restoring (#105348)
- readd _res@GLIBC_2.0 to TLS libc.so - __res_state () was not
  introduced in glibc 2.1 but glibc 2.2 (#90002)
- use NPTL libs on i686 if either AT_SYSINFO is present (stock RHL9
  kernels), ".nptl" substring is found in uname -r (Fedora Core kernels),
  kernel is 2.5.69+ or if set_tid_address (NULL) works (e.g. RHEL3
  kernels)
- run make check also with linuxthreads (on IA-32 non-FLOATING_STACKS)
  ld.so and NPTL (on IA-32 also FLOATING_STACKS linuxthreads) libraries
  and tests
- fix important hwcaps computation



>
> Let me know if either/both helps.
> --Laci
>
>
> On Mon, 1 Dec 2003, Matthew Galati wrote:
>
> > Stephan,
> >
> > I get the same segfault using OSL on my Linux box running Red Hat 9,
> > which uses glibc-2.3.
> >
> > [mgalati@localhost Run]$ uname -a
> > Linux localhost.localdomain 2.4.20-9 #1 Wed Apr 2 13:42:50 EST 2003 i686 i686 i386 GNU/Linux
> > [mgalati@localhost Run]$ rpm -q glibc
> > glibc-2.3.2-27.9
> >
> > EKK0006I Optimization Solutions and Library Version 3.0 (Jan  7 2003)
> >
> > My debugger gives the following stack:
> > (gdb) where
> > #0  0x420747ae in _int_free () from /lib/tls/libc.so.6
> > #1  0x42073786 in free () from /lib/tls/libc.so.6
> > #2  0x4206b560 in fopen@GLIBC_2.0 () from /lib/tls/libc.so.6
> > #3  0x40155f16 in ekkdxt9 () from /home/mgalati/src/osl/osllib/libosl.so
> > #4  0x4015723f in ekkdxtn () from /home/mgalati/src/osl/osllib/libosl.so
> > #5  0x40157486 in ekkdxta () from /home/mgalati/src/osl/osllib/libosl.so
> > #6  0x4015757c in ekkdxt1 () from /home/mgalati/src/osl/osllib/libosl.so
> > #7  0x401585ad in ekkdxte () from /home/mgalati/src/osl/osllib/libosl.so
> > #8  0x401590dc in ekkchecklicense () from /home/mgalati/src/osl/osllib/libosl.so
> > #9  0x4004d57d in ekk_initializeContext () from /home/mgalati/src/osl/osllib/libosl.so
> > #10 0x40022095 in OsiOslSolverInterface::incrementInstanceCounter() () at OsiOslSolverInterface.cpp:1439
> > #11 0x400222d2 in OsiOslSolverInterface (this=0x8178848) at OsiOslSolverInterface.cpp:1489
> > #12 0x080a7916 in AAP_lp::initialize_solver_interface() (this=0x8178508) at /home/mgalati/COIN_EXAMPLES/AAP_BP/LP/AAP_lp.cpp:70
> > #13 0x0809f4f1 in BCP_single_environment::register_process() (this=0x811a108) at /home/mgalati/COIN/Bcp/Member/BCP_message_single.cpp:250
> > #14 0x08090986 in main (argc=3, argv=0xbfffd7f4) at /home/mgalati/COIN/Bcp/TM/BCP_tm_main.cpp:47
> > #15 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6
> >
> >
> >
> > However, using OSL on my Linux box running Red Hat 8.0, which uses
> > glibc-2.2, it runs without crashing. It doesn't actually solve the problem
> > correctly, though: there seems to be an issue with OSL's getDualRays return
> > (or with how I am using it), which might be related to a post I made earlier
> > today. I will look into this and get back to you.
> >
> > [mgalati@dyn035199 Run]$ uname -a
> > Linux dyn035199.ie.lehigh.edu 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux
> > [mgalati@dyn035199 Run]$ rpm -q glibc
> > glibc-2.2.93-5
> >
> > As for the OSL issue with glibc-2.2 vs glibc-2.3, perhaps other OSL
> > users on this list can comment.
> >
> > Matt
> >
> >
> >
> >
> >
> >
> > > Hi Stephan,
> > >
> > > I typically run everything from the Run subdir which is where I store
> > > par.par. From there, I usually type:
> > >
> > > ../Linux-g/bcps ParamFile par.par
> > >
> > > which seems to work ok. Your way should work fine as well. I'll update
> > > the doc.
> > >
> > > I have not tried out AAP using OSL. What OS are you using? More
> > > importantly, what version of glibc are you using? I have encountered
> > > OSL errors before when using the newer glibc-2.3. I will try things out
> > > with OSL later tonight and get back to you.
> > >
> > > No, the memory leak should not cause a seg-fault.
> > >
> > > Thanks,
> > > Matt
> > >
> > >
> > >
> > >
> > >> Thanks, Matthew, for the AAP example. One note on the documentation: to
> > >> run the executable, section 4 says '... type Linux-g/bcps
> > >> ParamFile par.par'. This is wrong in two respects.
> > >> - First, the parameter file is located in Run/.
> > >> - Second, the paths to data files in Run/par.par contain a leading ../, and
> > >> bcps complains that it cannot find the data file when I type
> > >> 'Linux-g/bcps ParamFile Run/par.par' on the command line. So in the end I
> > >> had to change into Linux-g/ and type './bcps ParamFile
> > >> ../Run/par.par' to run the example. The other option would be to remove the
> > >> leading ../ in Run/par.par at AAP_datafile. Then you can run the
> > >> executable by typing 'Linux-g/bcps ParamFile Run/par.par'. Either way, the
> > >> doc should be consistent in section 4.
> > >>
> > >> There is another problem I encountered while playing with AAP_BP. I have
> > >> no clue what's wrong and I don't know if it is an error in the code. But
> > >> I'll describe it here anyway.
> > >>
> > >> I tried to run the example with CLP and OSL on different problem files
> > >> with the leading ../ removed in Run/par.par at AAP_datafile. So that
> > >> line reads for example
> > >>
> > >> AAP_datafile            Data/small/6.1.aap
> > >>
> > >> All problem files run fine with CLP. But when I use OSL (by modifying
> > >> Makefile.aap) with problem file 6.1.aap, I get the following error
> > >>
> > >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> BCP_parameters::read_from_stream   Scanning parameter stream.
> > >> Speicherzugriffsfehler
> > >> [work@localhost AAP_BP]$
> > >>
> > >> The last message is German for 'segmentation fault'. Fortunately (?) it runs
> > >> without complications when I override the AAP_datafile setting with the
> > >> same problem file by issuing
> > >>
> > >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par AAP_datafile Data/small/6.1.aap
> > >>
> > >> Stranger still, the behaviour is just the other way around with problem
> > >> file Data/small/8.1.aap or 10.1.aap. Issuing
> > >>
> > >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par
> > >>
> > >> works, but
> > >>
> > >> [work@localhost AAP_BP]$ Linux-g/bcps ParamFile Run/par.par AAP_datafile Data/small/8.1.aap
> > >>
> > >> exits with the same error as above.
> > >>
> > >> The first behaviour (overriding AAP_datafile is necessary for running)
> > >> applies to all 6.*.aap files. The second behaviour (overriding
> > >> AAP_datafile prevents it from running) applies to all 8.*.aap and
> > >> 10.*.aap files. I don't know if this is really related to COIN-BCP.
> > >> Anyway, it's very odd. Any ideas?
> > >>
> > >> Kind regards,
> > >> Stephan Hennig
> > >>
> > >> PS. I just read the posting '[Coin-discuss] BCP user_data memory leak /
> > >> BCP mailing list?'. Am I facing that memory leak mentioned?
> > >>
> > >> _______________________________________________
> > >> Coin-discuss mailing list
> > >> Coin-discuss at www-124.ibm.com
> > >> http://www-124.ibm.com/developerworks/oss/mailman/listinfo/coin-discuss
> > >
> > >
> >
> > --
> > Matthew Galati
> > ISE Lehigh University
> > IBM Service Parts Solutions
> > 610.758.4042 (Office)
> > 610.882.0779 (Home)
> > magh at lehigh.edu, magal11 at us.ibm.com
> > http://sagan.ie.lehigh.edu/mgalati/
> >
> >
>
>
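
On the separate path problem Stephan describes at the bottom of the thread: it looks like ordinary relative-path resolution, on the assumption that bcps opens AAP_datafile relative to the current working directory. A small sketch of that behaviour follows; exists_from is a hypothetical helper, and the directory layout is a throwaway mock-up in a temporary directory, not the real AAP_BP tree.

```shell
#!/bin/sh
# Hypothetical helper: succeed if relative path $2 exists when the
# working directory is $1. Runs in a subshell, so the caller's own
# working directory is left unchanged.
exists_from() {
    ( cd "$1" 2>/dev/null && [ -e "$2" ] )
}

# Mock layout loosely mirroring the thread (assumption, not the real tree).
demo=$(mktemp -d)
mkdir -p "$demo/AAP_BP/Run" "$demo/AAP_BP/Linux-g" "$demo/AAP_BP/Data/small"
: > "$demo/AAP_BP/Data/small/6.1.aap"

# A data path with a leading ../ resolves only from a subdirectory.
for dir in AAP_BP AAP_BP/Run AAP_BP/Linux-g; do
    if exists_from "$demo/$dir" ../Data/small/6.1.aap; then
        echo "$dir: ../Data/small/6.1.aap found"
    else
        echo "$dir: ../Data/small/6.1.aap NOT found"
    fi
done

rm -rf "$demo"
```

Under that assumption, a leading ../ in par.par works from Run/ or Linux-g/ but not from the AAP_BP directory itself, which matches what was reported.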

-- 
		Matthew Saltzman

Clemson University Math Sciences
mjs AT clemson DOT edu
http://www.math.clemson.edu/~mjs


