[Rivet] Sherpa fails to run on batch system
Holger Schulz holger.schulz at physik.hu-berlin.de
Wed Aug 27 11:44:44 BST 2008

Hi,

I am currently having trouble running Sherpa via Rivet on the DESY batch farm. Once again it has something to do with loading libraries.

There are no problems with Pythia6: it works both when I run jobs interactively and when I submit them to the batch farm (PBS). Sherpa jobs submitted to the batch farm through Rivet segfault, however, although interactive jobs run smoothly. Both the interactive machines and the batch farm nodes are of type sl5_amd64_gcc41.

Here is the gdb backtrace. I also enabled TRACE for AGILe.Loader:

AGILe.Loader: TRACE Trying to load /afs/ifh.de/group/atlas/users/scratch/hschulz/Software/lib/libAGILeSherpa.so
AGILe.Loader: TRACE Successfully loaded /afs/ifh.de/group/atlas/users/scratch/hschulz/Software/lib/libAGILeSherpa.so (0x1999ef90)
AGILe.Loader: TRACE Setting AGILe module handle for /afs/ifh.de/group/atlas/users/scratch/hschulz/Software/lib/libAGILeSherpa.so (0x1999ef90)

Program received signal SIGSEGV, Segmentation fault.
0x0000003add278350 in strlen () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003add278350 in strlen () from /lib64/libc.so.6
#1 0x00002aaaaacee47d in AGILe::Loader::loadGenLibs () from /afs/ifh.de/group/atlas/users/scratch/hschulz/Software/lib/libAGILe.so.2
#2 0x000000000040b871 in Rivet::generate ()
#3 0x0000000000409348 in main ()
(gdb)

And this is what valgrind says:

AGILe.Loader: TRACE Testing for /afs/cern.ch/sw/lcg/external/MCGenerators/lhapdf/5.4.0/slc5_amd64_gcc41/lib/libLHAPDF.so
AGILe.Loader: TRACE Testing for /afs/cern.ch/sw/lcg/external/MCGenerators/lhapdf/5.4.0.2/lib/libLHAPDF.so
AGILe.Loader: TRACE Testing for /afs/cern.ch/sw/lcg/external/MCGenerators/lhapdf/5.4.0/lib/libLHAPDF.so
==5208== Process terminating with default action of signal 11 (SIGSEGV)
==5208== Access not within mapped region at address 0x0
==5208== at 0x4A066C2: strlen (mc_replace_strmem.c:246)
==5208== by 0x4E4B47C: AGILe::Loader::loadGenLibs(std::string const&) (in /afs/ifh.de/group/atlas/users/scratch/hschulz/Software/lib/libAGILe.so.2.0.0)
==5208== by 0x40B870: Rivet::generate(Rivet::Configuration&, Rivet::Log&) (in /afs/ifh.de/group/atlas/users/scratch/hschulz/Software/bin/rivetgun)
==5208== by 0x409347: main (in /afs/ifh.de/group/atlas/users/scratch/hschulz/Software/bin/rivetgun)
==5208==
==5208== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 5 from 1)
==5208== malloc/free: in use at exit: 274,773 bytes in 2,828 blocks.
==5208== malloc/free: 22,808 allocs, 19,980 frees, 1,353,078 bytes allocated.
==5208== For counts of detected errors, rerun with: -v
==5208== searching for pointers to 2,828 not-freed blocks.
==5208== checked 5,049,712 bytes.

Could this be due to some environment variables not being set correctly? I simply source my .zshrc in the batch script.

Any ideas? This leaves me absolutely clueless :(

Holger
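
One way to test the environment-variable hypothesis is to save the output of `env` once in an interactive session and once from inside the batch job, then compare the two dumps. The strlen crash at address 0x0 would be consistent with a null C string, which is what `getenv` returns for an unset variable, but whether AGILe actually hits that case here is only a guess. The following is a minimal Python sketch under those assumptions; the function names and the sample values are made up for illustration and are not part of Rivet or AGILe.

```python
def parse_env_dump(text):
    """Parse the output of `env` into a dict.

    Lines without '=' are skipped, so values that span several lines
    are only partially captured; good enough for a first comparison.
    """
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:
            env[name] = value
    return env


def compare_envs(interactive, batch):
    """Return (names missing in the batch env, names whose values differ)."""
    missing = sorted(set(interactive) - set(batch))
    differing = sorted(
        name
        for name in set(interactive) & set(batch)
        if interactive[name] != batch[name]
    )
    return missing, differing


# Illustrative sample data only; in practice read the two saved dumps
# from files (for example with open(path).read()).
interactive_dump = "HOME=/home/user\nSOME_GEN_PATH=/opt/generators\nPATH=/bin\n"
batch_dump = "HOME=/home/user\nPATH=/usr/bin\n"

missing, differing = compare_envs(
    parse_env_dump(interactive_dump), parse_env_dump(batch_dump)
)
print("missing in batch:", missing)
print("different values:", differing)
```

Any variable reported as missing in the batch job, especially one pointing at generator or library locations, would be a first candidate to check.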