[Rivet] Small problem linking LCG libRivet to Gaudi algorithm in Gauss
Anton Karneyeu Anton.Karneyeu at cern.ch
Wed Sep 7 13:39:16 BST 2011
Hi Andy,

Yesterday I met with Alex to look for the origin of the crash. The
segfault happens at this line:
http://projects.hepforge.org/rivet/trac/browser/tags/rivet-1.6.0/src/Core/AnalysisHandler.cc#L269
if the type of *hobj is IProfile1D (if *hobj is an IHistogram1D, all
three dynamic_casts work as expected).

The reason is not clear to me. It looks like a compiler bug (gcc 4.3),
but it could also be due to a mismatch between the g++ options used to
build Rivet in the genser repository and the g++ options used to link
with Rivet in the LHCb software, or to the influence of the LCG AIDA
implementation used in LHCb.

So I prepared a workaround that avoids the subsequent dynamic_casts if a
preceding cast is successful:
https://svnweb.cern.ch/trac/GENSER/browser/GENSER3/pkgsrc/MCGenerators/rivet/tags/rivet-1_6_0_p1/patches/patch-aa

The Rivet build with the patch is located here:
/afs/cern.ch/sw/lcg/external/MCGenerators_hepmc2.06.05/rivet/1.6.0.p1/x86_64-slc5-gcc43-opt

Alex confirmed that the patched version now works fine with the LHCb
Rivet handler.

I am not sure the above patch is the best way to fix the problem; maybe
there is a better way. Could you have a look?

Cheers,
Anton

Alex Grecu:
> Hi Anton,
>
> Sorry for replying so late, but I needed to finish my presentation for
> tomorrow. Unfortunately, I won't be able to report that version 1 of my
> algorithm is working, but at least we're debugging it!
>
> The procedure to run the code in my public directory requires that you
> be able to use the LHCb environment and the SetupProject command
> (/afs/cern.ch/lhcb/software/releases/LBSCRIPTS/LBSCRIPTS_v6r4p1/InstallArea/scripts/SetupProject.sh).
>
> I hope you can use this environment; if not (I see you're CMS), I can
> pass by your office whenever your schedule allows it and show you
> everything, or even have a debugging session (of course, tomorrow I'm
> kind of busy the whole morning, so I'd prefer to meet after lunch).
> Well, once the LHCb environment is set up, I issue:
> $ SetupProject --build-env --nightly lhcb-head Mon Gauss HEAD
> to create the build environment for our simulation project (Gauss)
> using its latest version, which is built against HepMC 2.06,
> and I end up in a directory like ~/cmtuser/Gauss_HEAD. It is here that
> the contents of ~agrecu/public/Gen must be copied recursively.
> Then I cd to
> ~/cmtuser/Gauss_HEAD/Gen/GenAnalysis/cmt
> and I usually issue
> $ cmt make clean; cmt make config; cmt make &> make.log
> to rebuild the GenAnalysis package.
> Then I write
> $ SetupProject --nightly lhcb-head Mon Gauss HEAD
> to have the run environment properly set up. Please don't ask why I have
> to give almost the same command twice! All I noticed through
> trial and error is that without the second command I cannot run the jobs
> because some path is missing. I didn't have the time to dig into it, but
> it is possible that the system is implemented this way for a very good
> reason.
> Finally, I use the Python file in the options directory to run a Gauss
> job that generates HepMC events with Pythia6, which are fed to libRivet
> through my Gaudi algorithm:
> $ gaudirun.py Py6Perugia0@900GeV.py &> run.log
>
> The things occurring in the background are the following: an application
> manager starts the Gauss machinery, which in turn configures Pythia 6
> and initializes a few components, among which is my own. Then it begins
> generating HepMC events, which are stored in memory, and then,
> at a specific moment of the iteration, my algorithm is called (the
> execute method), which in turn calls an instance of
> Rivet::AnalysisHandler. The handler computes some observables from
> the event data and then feeds this information to the chain of
> analyses that the options instructed it to load (in the present case it
> is only one analysis from ATLAS). These analyses are in fact plugins for
> libRivet that are loaded at runtime.
> As far as I could see,
> everything works fine until the requested number of events has been
> generated and the finalize() method is called. In this method I call
> finalize() on the Rivet::AnalysisHandler instance, and it is there that
> the crash occurs. I'm almost convinced that it has something to do with
> the way libRivet is linked against AIDA, because I can further trace the
> error to a dynamic_cast that is issued for an AIDA object. Now, I admit
> it may be that the AIDA object is invalid anyway, because this situation
> was never encountered for this particular analysis (plugin) that I chose
> to load and run, but at the same time I cannot stop wondering whether
> the error comes from some mix-up in the way type_info symbols are
> mapped to classes and instantiated. I shall try to run the code with
> other analyses and let you know the result.
>
> Sorry for the long message and the lack of professional terms or
> possibly their wrong usage (I don't consider myself an expert programmer)!
>
> Cheers,
> Alex
>
> PS: Thanks for the configure flags, though I guess you're using HepMC
> 2.06 when building libRivet in LCGCMT_61. I'll try to compile
> libRivet using the supplemental flags I mentioned in my first message.
>
> ----------------------
> LHCb Experiment
> Office: 11/1-014
> Phone: +41 22 76 79058
> Postbox: F26500
>
>
> On 9/5/2011 17:04, Anton Karneyeu wrote:
>> Hi Alex,
>>
>> I am forwarding your mail to the Rivet team to comment on the crash in
>> libRivet at runtime. Could you please also prepare step-by-step
>> instructions on how to run the example in
>> ~agrecu/public/Gen/GenAnalysis and reproduce the crash? This will be
>> very helpful.
>>
>> For the Rivet package installed in the genser repo we do not specify
>> anything related to the AIDA implementation (and as far as I know there
>> are no such options at all), so the internal Rivet implementation (LWH)
>> is used.
>>
>> For reference, here are all the configure options that we specify:
>>
>> ./configure
>> --prefix=/afs/cern.ch/sw/lcg/external/MCGenerators/rivet/1.6.0/x86_64-slc5-gcc43-opt
>> --with-hepmc=/afs/cern.ch/sw/lcg/external/HepMC/2.03.11/x86_64-slc5-gcc43-opt
>> --with-boost-incpath=/afs/cern.ch/sw/lcg/external/Boost/1.44.0_python2.6/x86_64-slc5-gcc43-opt/include/boost-1_44
>> --with-fastjet=/afs/cern.ch/sw/lcg/external/fastjet/2.4.2p1/x86_64-slc5-gcc43-opt
>> --with-gsl=/afs/cern.ch/sw/lcg/external/GSL/1.10/x86_64-slc5-gcc43-opt
>> --with-lcgtag=x86_64-slc5-gcc43-opt
>> --enable-unvalidated
>> PYTHON=/afs/cern.ch/sw/lcg/external/Python/2.6.5/x86_64-slc5-gcc43-opt/bin/python
>> SWIG=/afs/cern.ch/sw/lcg/external/swig/1.3.40/x86_64-slc5-gcc43-opt/bin/swig
>>
>> Cheers,
>> Anton
>>
>>
>> Alex Grecu:
>>> Hi all,
>>>
>>> Yesterday I kept working on this issue, hoping to find a solution. I
>>> tried recompiling the rivet package separately, still using the LCG
>>> libraries (HepMC, Boost, GSL and so on), and then patching the LCG CMT
>>> interface so that the Rivet library loaded at runtime is my own build. I
>>> tried setting the CXXFLAGS environment variable to
>>> "-D LWH_USING_AIDA -U __GXX_WEAK__" to force the library to be compiled
>>> with the LCG version of AIDA and without weak references (see
>>> "https://svnweb.cern.ch/trac/gaudi/browser/Gaudi/trunk/GaudiKernel/doc/dynamic_cast.pb"
>>> for details about the Gaudi hack for weak references). The result is the
>>> same: a crash in libRivet at runtime, in the finalize method, when an
>>> ITree leaf is dynamic_cast to IProfile1D! I don't know if I did the
>>> right thing, but I think I need to share with you the tests I am making
>>> in order to expedite the finding of a solution. Hope it doesn't spoil
>>> your weekend!
>>>
>>> Best regards,
>>> Alex
>>>
>>> PS: If you have any other ideas that you think I may try, please let me
>>> know!
>>>
>>> ----------------------
>>> LHCb Experiment
>>> Office: 11/1-014
>>> Phone: +41 22 76 79058
>>> Postbox: F26500
>>>
>>>
>>> On 9/2/2011 19:52, Witold Pokorski wrote:
>>>>
>>>> I am away from CERN now, but I am sure that Anton (in cc) can help.
>>>>
>>>> Cheers,
>>>> Witek
>>>>
>>>>
>>>> On 2 Sep 2011, at 12:58, "Alex Grecu" <Alex.Grecu at cern.ch
>>>> <mailto:Alex.Grecu at cern.ch>> wrote:
>>>>
>>>>> Dear Witek,
>>>>>
>>>>> I'm Alex Grecu, the person in charge of providing LHCb with an
>>>>> algorithm that allows running Rivet analyses from our
>>>>> MC-generator/simulation project Gauss. I finally wrote my algorithm,
>>>>> and it runs perfectly until the finalize() method of the algorithm,
>>>>> where it either crashes or throws a bad_alloc whenever the AIDA
>>>>> objects created by libRivet (LCG compiled package) are accessed. I
>>>>> spent the last 3 days digging deeply into the Rivet and Gauss/Gaudi
>>>>> source code, and I noticed that Rivet uses by default the LWH
>>>>> implementation of AIDA while Gaudi uses the LCG AIDA package. I would
>>>>> need a confirmation from you or your LCG experts that this is the
>>>>> case and that the LCG rivet package was compiled using this LWH
>>>>> implementation of AIDA rather than the LCG one. For debugging
>>>>> purposes I have the current version of the algorithm in my public
>>>>> directory on lxplus (~agrecu/public/Gen/GenAnalysis). The code
>>>>> crashes with "Segmentation violation" in a dynamic_cast call if it is
>>>>> compiled without forcing libRivet to use the LCG AIDA package (the
>>>>> default, including AIDA headers from LWH/...), and it issues a
>>>>> bad_alloc if I force (#define LWH_USING_AIDA 1) before including the
>>>>> Rivet headers. The latter is the version of the code in my public
>>>>> directory (#includes of AIDA objects from AIDA/). I'm at your
>>>>> disposal for further details and/or live demonstrations (office
>>>>> 11-1-014 phone: 79058).
>>>>> In short, please let me know if the LCG rivet package is compiled
>>>>> with an implementation of AIDA other than the LCG one, or please let
>>>>> me know if I'm doing something utterly wrong, as I have little
>>>>> experience with RTTI-related issues!
>>>>> Thank you in advance for any hints or details that you may provide!
>>>>>
>>>>> Best regards,
>>>>> Alex
>>>>>
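
The workaround described at the top of this message (stop casting as
soon as one dynamic_cast has succeeded) could be sketched roughly as
below. This is only an illustration under stated assumptions and not the
contents of patch-aa: the types ManagedObject, Histogram1D, Profile1D and
DataPointSet and the function classify are hypothetical stand-ins for the
AIDA interfaces and for the code around AnalysisHandler.cc#L269.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the AIDA interfaces named in the thread.
// The real classes (from LWH or LCG AIDA) have much larger interfaces.
struct ManagedObject { virtual ~ManagedObject() = default; };
struct Histogram1D : ManagedObject {};
struct Profile1D : ManagedObject {};
struct DataPointSet : ManagedObject {};

// Try each cast in turn and return as soon as one succeeds, so that no
// further dynamic_cast is attempted on an object already identified.
std::string classify(ManagedObject* hobj) {
  if (hobj == nullptr) return "null";
  if (dynamic_cast<Histogram1D*>(hobj) != nullptr) return "histogram";
  if (dynamic_cast<Profile1D*>(hobj) != nullptr) return "profile";
  if (dynamic_cast<DataPointSet*>(hobj) != nullptr) return "datapointset";
  return "unknown";
}

// Example use: a profile object is identified by the second cast, and
// the third cast is never evaluated for it.
inline void example() {
  Profile1D prof;
  assert(classify(&prof) == "profile");
}
```

Returning early only changes which casts are evaluated; it does not by
itself explain why a later cast would crash, so the open questions about
compiler options and the two AIDA implementations remain.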