[Rivet] Extension of mcplots/Rivet for Heavy Ion Analyses

Benedikt Volkel benedikt.volkel at cern.ch
Wed Aug 10 21:55:02 BST 2016


Dear Rivet developers,

my name is Benedikt Volkel and I'm working on a summer student
project in ALICE with Jan Fiete Grosse-Oetringhaus and Jochen Klein.
The goal is to extend the mcplots project to cover specific needs
arising from heavy-ion analyses. In particular, we want to implement
a post-processing step, which is frequently required for heavy-ion
analyses. This step must take place after the production of certain
analysis output, e.g. to combine results from different generator
runs. As mcplots is based on the standard Rivet workflow, the
questions apply not just to mcplots but more generally to Rivet. To
sketch the problem in more detail and start a discussion on a
possible implementation of a standardized post-processing step we
use the example of an R_AA analysis as a starting point.

The conceptual problem of an R_AA analysis is the combination, here
a division, of heavy ion (AA) data and pp data. The two types of data
are provided by different generator runs. We will always assume that
Rivet can figure out whether it gets events from an AA generator or
a pp generator. This differentiation could be done by evaluating the
heavy ion block 'H' in a given HepMC file and/or by reading the beam
types. We have investigated the following 3 approaches and would
like to ask for comments and feedback:

1) External script: In this approach we don't modify the Rivet
    framework at all. The analysis is run independently for the two
    generators (pp and AA); in each case only one type of histogram
    is filled while the other stays empty. In the end, we use an
    external program/script to process the YODA output of both runs
    and perform the division. This can be done by using the YODA
    library and hence easily in Python or C++.

    Comments: So far there is no standard way to distribute or
    execute such a post-processing executable. A standard workflow
    to include a post-processing step would be desirable. A
    standardized hook to execute an arbitrary external script might
    provide more flexibility because those external scripts could be
    written in Python or C++ and could have an almost arbitrary level
    of complexity.

2) Specific executable, Rivet as library: In this case I wrote an
    executable which takes care of creating one instance of
    Rivet::AnalysisHandler and manages the read-in of two HepMC
    files. I based the code on the source code of the executable
    $MCPLOTS/scripts/mcprod/rivetvm/rivetvm.cc implemented in
    mcplots. My modifications are sketched in [1]. In this way, both
    data sets are available in the same analysis and the division can
    simply be done in the finalize step.

    Comments: It is already possible to pass two or more HepMC files
    to Rivet on the command line for sequential processing.

3) The goal of my last approach was to enable Rivet to produce
    reasonable analysis output without external dependencies.
    Furthermore, it should be possible to produce the pp and heavy-ion
    YODA files asynchronously and independently of each other, and to
    bring them together using only Rivet. Therefore, Rivet was modified
    to allow reading back the YODA output. This also allows us to
    implement the post-processing in the analysis class.

    Comments: You can find the code at
    https://gitlab.cern.ch/bvolkel/rivet-for-heavy-ion/tree/working.
    The basic steps can be found in [2] and more comments can be
    found directly in the source code.

    For the R_AA analysis, Rivet can first be run with a pp generator.
    In the resulting YODA file only the pp objects are filled. In a
    second run, with the AA generator, Rivet can be started passing
    the first YODA file as additional input. In the Rivet analysis
    itself the heavy ion objects are filled. However, after
    finalize(), a method Analysis::replaceByData is called. The
    objects normally produced by the analysis are replaced by those
    from the provided YODA file if the latter have a nonzero number of
    entries. Hence, after the replacement there are filled and finalized
    pp objects coming from the first run and AA objects from the second
    run. Those can now be used in a newly introduced post() method
    which manages e.g. the division of histograms in case of the R_AA
    analysis. It is also possible to provide a YODA file where both
    the pp and the AA objects are filled and the R_AA objects have to
    be calculated. In that case no actual analysis is done (0 events
    from an MC), but init(), analyze(...), finalize() and the post()
    method are called. The histograms are booked, nothing happens in
    analyze(...) and finalize(), the corresponding histograms are
    replaced after finalize() and post() handles the division.
    Basically, this is a similar approach to the one in scenario 1),
    but no external dependencies are involved. Also, scenario 2)
    remains possible if the YODA input is avoided.
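
To make the common final step concrete, here is a minimal, self-contained
sketch of the bin-wise division that an external script, finalize() or
post() would have to perform in the respective scenario. It is only an
illustration under stated assumptions: ToyBin and divideBins are
hypothetical names and not part of Rivet or YODA, the errors of the pp
and AA inputs are assumed to be uncorrelated, both inputs are assumed to
share the same binning, and any further normalization is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy stand-in for one histogram bin; NOT a YODA or Rivet type.
struct ToyBin {
    double value;
    double error;
};

// Bin-wise ratio AA/pp for identically binned inputs with uncorrelated errors.
// Bins with an empty pp denominator are set to zero instead of dividing.
std::vector<ToyBin> divideBins(const std::vector<ToyBin>& aa,
                               const std::vector<ToyBin>& pp) {
    assert(aa.size() == pp.size());
    std::vector<ToyBin> ratio;
    ratio.reserve(aa.size());
    for (std::size_t i = 0; i < aa.size(); ++i) {
        if (pp[i].value == 0.0) {
            ratio.push_back({0.0, 0.0});
            continue;
        }
        const double r = aa[i].value / pp[i].value;
        // Standard error propagation for a quotient of independent quantities.
        const double errFromAA = aa[i].error / pp[i].value;
        const double errFromPP =
            aa[i].value * pp[i].error / (pp[i].value * pp[i].value);
        ratio.push_back({r, std::hypot(errFromAA, errFromPP)});
    }
    return ratio;
}
```

In a real analysis the same operation would act on the YODA objects
themselves rather than on such toy bins.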

All methods work and lead to the desired output. The first two do
not need an extension of the Rivet source code. While the first one
allows for the greatest flexibility, the second one is the one which
can be implemented most quickly and where all steps can be
encapsulated in one analysis class in Rivet. However, two MC
generator runs are always required at the same time. Finally, all
approaches have one thing in common, namely that the rivet
executable and related ones need to be extended to account for these
analysis types on the command line: 1) linking to the external
post-processing script in a consistent way, 2) parallel processing
of at least two HepMC files, 3) read-in of a YODA file.

I hope that I could explain the issues in a reasonable manner. Jan,
Jochen and I are looking forward to a fruitful discussion of how to
implement analyses like the one mentioned above in a reasonable way.
Please give us critical feedback concerning the approaches and let us
know if there are more appropriate ways of solving our problems which I
haven't accounted for yet. A consistent and straightforward way of
implementing those analyses in Rivet would be extremely helpful.

Best regards and many thanks,

Benedikt



---------------------Appendix---------------------


[1] Sketch of the modification of $MCPLOTS/scripts/mcprod/rivetvm/rivetvm.cc

---------------------------------
...

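ifstream is1( file1 );  // HepMC file from the first generator run
ifstream is2( file2 );  // HepMC file from the second generator run

...

AnalysisHandler rivet;

HepMC::GenEvent evt1;
HepMC::GenEvent evt2;

rivet.addAnalyses( RAA_analysis );

// Process the two files in lockstep while both streams are readable
while( is1 && is2 ) {

    ...
    evt1.read(is1);
    evt2.read(is2);

    // Both events are passed to the same AnalysisHandler instance
    rivet.analyze(evt1);
    rivet.analyze(evt2);
    ...

}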

...
---------------------------------

[2] Basics of the small extension of Rivet

Rivet::Analysis:
Introducing member: bool _haveReadData
Introducing: void post()
Introducing: void Analysis::replaceByData( std::map< std::string,
AnalysisObjectPtr > readObjects )

Rivet::AnalysisHandler:
Introducing members: std::map< std::string, AnalysisObjectPtr >
_readObjects and bool _haveReadData
Introducing: void AnalysisHandler::readData(const std::string& filename)
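
To illustrate the replacement rule behind replaceByData, here is a
minimal, self-contained sketch. It is not the code from the repository:
ToyObject, ToyObjectPtr and ToyObjectMap are hypothetical stand-ins for
the real analysis objects, the function is written as a free function
with a return value for easier testing, and the histogram paths in the
usage note are made up.

```cpp
#include <map>
#include <memory>
#include <string>

// Toy stand-ins; NOT the real Rivet/YODA classes.
struct ToyObject {
    double numEntries = 0.0;
};
using ToyObjectPtr = std::shared_ptr<ToyObject>;
using ToyObjectMap = std::map<std::string, ToyObjectPtr>;

// Replace booked objects by objects read back from a YODA file, but only
// if an object with the same path was read and it has a nonzero number of
// entries. Returns the number of replaced objects.
int replaceByData(ToyObjectMap& booked, const ToyObjectMap& readObjects) {
    int replaced = 0;
    for (auto& entry : booked) {
        const auto found = readObjects.find(entry.first);
        if (found == readObjects.end() || !found->second) continue;
        if (found->second->numEntries > 0.0) {
            entry.second = found->second;  // keep the object from the file
            ++replaced;
        }
    }
    return replaced;
}
```

In an AA run that is given the pp YODA file as input, a filled object
read under a path such as "/RAA/pp" would replace the empty booked one,
while the AA object filled in the current run would stay untouched
because its counterpart in the file is empty.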

