[Rivet] Extension of mcplots/Rivet for Heavy Ion Analyses

Andy Buckley andy.buckley at cern.ch
Mon Aug 15 09:39:51 BST 2016


It may be what you were going to suggest/advertise, Leif, but in
HepMC3 "custom attributes" can be attached to event objects, meaning 
that generators have a place to store centrality information, unlike in 
HepMC2. This may be a good motivator for the HI experimental and theory 
communities to use HepMC3 -- at least that was the intention! And I'd be 
happy to extend Rivet to handle this info once we have some feedback on 
what would be useful.
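
For illustration, roughly how that could look with the HepMC3 attribute
interface (just a sketch: the attribute name "centrality" and its value are
examples rather than an agreed convention, and the header/namespace names
may differ between HepMC3 versions):

   #include "HepMC3/GenEvent.h"
   #include "HepMC3/Attribute.h"
   #include <iostream>
   #include <memory>

   int main() {
     HepMC3::GenEvent evt;

     // Generator side: attach the event centrality (e.g. in percent)
     // as a named attribute of the event.
     evt.add_attribute("centrality",
                       std::make_shared<HepMC3::DoubleAttribute>(7.5));

     // Analysis side: read it back, if present.
     std::shared_ptr<HepMC3::DoubleAttribute> cent =
         evt.attribute<HepMC3::DoubleAttribute>("centrality");
     if (cent) std::cout << "centrality = " << cent->value() << std::endl;
     return 0;
   }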

Andy


On 15/08/16 08:32, Leif Lönnblad wrote:
> Hi Benedikt,
>
> I was also planning to do some heavy-ion developments for Rivet, and I'm
> very interested in your suggestions. I have not quite made up my mind
> about 1, 2, or 3 yet, but I agree with Andy that your option 3 may be
> combined with the planned facility for re-running finalize() on multiple
> YODA files.
>
> However, one thing that was not clear from your description is how to
> handle centrality, which is essential in any R_AA measurement. Do you
> have any ideas on that?
>
> Cheers,
> Leif
>
>
>
>
> On 2016-08-12 18:39, Benedikt Volkel wrote:
>> Hi Andy, hi Frank,
>>
>> thanks for your replies! It is great that you are interested in a
>> discussion.
>>
>> Basically, the proposed solution #2 would be very easy, because it is
>> both fast and simple to implement and requires no extension of Rivet.
>> However, further desired capabilities, as well as proper resource
>> management, bring major drawbacks, so #1 or #3 should be preferred
>> over #2.
>>
>> Especially in the case of R_AA analyses it is interesting to combine
>> the output of different AA generators with different pp generators. If
>> only a direct read-in from HepMC files is possible, there are two ways
>> of doing that. Firstly, one can save the entire HepMC files of the
>> single runs in order to pass the desired combinations to Rivet
>> afterwards. This requires a lot of disk space, and deleting the files
>> means a complete new MC run in case they are needed again. The other
>> way is the one Frank suggested. The problem with this approach is the
>> large amount of time, because every combination requires two complete
>> MC runs.
>>
>> To overcome the drawbacks it would be nice to be able to recycle
>> generated YODA files and to put them together afterwards in any desired
>> combination. This saves both computing power/time and disk space.
>>
>> More generally, since #2 does not mean a full integration into Rivet,
>> the question arises of how certain pre-/post-processing steps are
>> handled in a standardized manner and how the right ones are matched to
>> particular analyses. It might be difficult to ensure that something like
>>
>> $ rivet -a EXPERIMENT_YEAR_INSPIRE fifo.hepmc
>>
>> still works. Consequently, there might be analyses of actual papers
>> which cannot be handled by non-Rivet experts.
>>
>>
>> Finally, a solution along the lines of #3 might be preferred over #2
>> and #1 because
>>
>> -> everything, including general post-processing steps, can be handled
>> by Rivet alone,
>>
>> -> resources are saved (in a more general way, not only for R_AA
>> analyses),
>>
>> -> other scenarios, like those Andy mentioned, could also be handled
>> with this approach.
>>
>> What do you think about that? Again, we would be glad to get your
>> feedback and additional ideas.
>>
>> Cheers,
>>
>> Benedikt
>>
>> On 11.08.2016 22:10, Andy Buckley wrote:
>>> Hi Benedikt,
>>>
>>> Thanks for getting in touch -- sounds like a productive project.
>>>
>>> Actually, version 3 sounds a lot like what we have had in mind for
>>> some time, as part of the development branch for handling events with
>>> complex generator weightings: we planned to be able to initialise
>>> Rivet analyses with pre-existing YODA files, populating the
>>> "temporary" data objects and re-running the finalize() function.
>>>
>>> We intended this mainly so that YODA files from homogeneous,
>>> statistically independent parallel runs could be merged into one
>>> mega-run, and then the finalize() steps re-run to guarantee that
>>> arbitrarily complicated end-of-run manipulations would be correctly
>>> computed. But it sounds like it would be similarly useful for you...
>>>
>>> Thanks for the pointer to your code. Can I ask how long you will be
>>> working on this project for? I look forward to the discussion and
>>> hopefully some of us will be able to meet in person at CERN, too...
>>> but if you would still be available at the end of September, you would
>>> be very welcome (and we can pay for you) to attend our 3-day developer
>>> workshop.
>>>
>>> Cheers,
>>> Andy
>>
>> On 11.08.2016 09:41, Frank Siegert wrote:
>>> Dear Benedikt,
>>>
>>> thanks for your mail and for describing your problem and solutions
>>> very clearly.
>>>
>>> I want to throw into the discussion a 4th alternative, which is
>>> somewhat similar to your #2 but doesn't need any modifications of
>>> Rivet itself. I have used this approach with a Bachelor student when
>>> we were trying to determine non-perturbative corrections with
>>> iterative unfolding, i.e. we needed both hadronised and non-hadronised
>>> events in the analysis at the same time to fill the response matrix.
>>> Thus, for us it was important to preserve the correlation between the
>>> hadronised and non-hadronised events, which for you is not an issue, so
>>> this method may be unnecessary or overly complicated for you, but I
>>> thought I'd mention it nonetheless.
>>>
>>> We run a standalone pre-processor script which combines the
>>> HepMC files from the two generator runs and, by using appropriate
>>> particle status codes, embeds the non-hadronised event into the
>>> hadronised one. We then wrote an analysis plugin including a custom
>>> projection, which can separately extract (based on the particle
>>> status) the non-hadronised event and the hadronised event from the
>>> same HepMC file. This allowed us not just to fill the two sets of
>>> histograms in the same run (and divide them in finalize), as you would
>>> want to do, but also to fill a response matrix with correlated events,
>>> which you probably don't care about.
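>>>
>>> To sketch the extraction step (not our actual code): assuming the
>>> pre-processor keeps status 1 for the hadron-level final state and
>>> re-tags the embedded non-hadronised final state with an otherwise
>>> unused status such as 901 (an arbitrary example value), the split can
>>> be done directly on the raw HepMC record:
>>>
>>>    #include "HepMC/GenEvent.h"
>>>    #include "HepMC/GenParticle.h"
>>>    #include <utility>
>>>
>>>    // Count the two embedded final states in a merged event. Inside a
>>>    // Rivet analysis the same loop would run over event.genEvent() in
>>>    // analyze() and fill the two sets of histograms instead.
>>>    std::pair<int, int> countSubEvents(const HepMC::GenEvent& evt) {
>>>      int nHadronised = 0, nPartonLevel = 0;
>>>      for (HepMC::GenEvent::particle_const_iterator p = evt.particles_begin();
>>>           p != evt.particles_end(); ++p) {
>>>        if ((*p)->status() == 1) ++nHadronised;         // hadron-level FS
>>>        else if ((*p)->status() == 901) ++nPartonLevel; // embedded parton-level FS
>>>      }
>>>      return std::make_pair(nHadronised, nPartonLevel);
>>>    }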
>>>
>>> So basically all you would need is a pre-processing script to combine
>>> the HepMC files, which could possibly be included in your HI generator
>>> interface and thus not disrupt the workflow. But maybe this is too
>>> complicated, and given that you don't need the correlations you might
>>> be better off with your approach #1.
>>>
>>> Cheers,
>>> Frank
>>
>>>
>>>
>>> On 10/08/16 21:55, Benedikt Volkel wrote:
>>>> Dear Rivet developers,
>>>>
>>>> my name is Benedikt Volkel and I'm working on a summer student
>>>> project in ALICE with Jan Fiete Grosse-Oetringhaus and Jochen Klein.
>>>> The goal is to extend the mcplots project to cover specific needs
>>>> arising from heavy-ion analyses. In particular, we want to implement
>>>> a post-processing step, which is frequently required for heavy-ion
>>>> analyses. This step must take place after the production of certain
>>>> analysis output, e.g. to combine results from different generator
>>>> runs. As mcplots is based on the standard Rivet workflow, the
>>>> questions do not apply just to mcplots but more generally to Rivet. To
>>>> sketch the problem in more detail and start a discussion on a
>>>> possible implementation of a standardized post-processing step we
>>>> use the example of an R_AA analysis as a starting point.
>>>>
>>>> The conceptual problem of an R_AA analysis is the combination, here
>>>> a division, of heavy ion (AA) data and pp data. The two types of data
>>>> are provided by different generator runs. We will always assume that
>>>> Rivet can figure out whether it gets events from an AA generator or
>>>> a pp generator. This differentiation could be done by evaluating the
>>>> heavy ion block 'H' in a given HepMC file and/or by reading the beam
>>>> types.
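>>>>
>>>> For example, such a check on a HepMC2 event could look roughly like
>>>> this (a sketch: the heavy ion record is only present if the generator
>>>> fills it, and the PDG-id cut for nuclei is a simple heuristic):
>>>>
>>>>    #include "HepMC/GenEvent.h"
>>>>    #include "HepMC/GenParticle.h"
>>>>    #include "HepMC/HeavyIon.h"
>>>>    #include <cstdlib>
>>>>    #include <utility>
>>>>
>>>>    // Decide whether an event comes from an AA or a pp run, using the
>>>>    // heavy ion block ('H' line) and/or the beam particle PDG ids
>>>>    // (nuclei have 10-digit PDG codes of the form 10LZZZAAAI).
>>>>    bool isHeavyIonEvent(const HepMC::GenEvent& evt) {
>>>>      if (evt.heavy_ion() != 0) return true;
>>>>      std::pair<HepMC::GenParticle*, HepMC::GenParticle*> beams =
>>>>          evt.beam_particles();
>>>>      if (beams.first && beams.second) {
>>>>        return std::abs(beams.first->pdg_id()) > 1000000000
>>>>            || std::abs(beams.second->pdg_id()) > 1000000000;
>>>>      }
>>>>      return false;
>>>>    }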
>>>>
>>>> We have investigated the following three approaches and would like
>>>> to ask for comments and feedback:
>>>>
>>>> 1) External script: In this approach we don't modify the Rivet
>>>>    framework at all. The analysis is run independently for the two
>>>>    generators (pp and AA), in each case only one type of histograms
>>>>    is filled while the other stays empty. In the end, we use an
>>>>    external program/script to process the YODA output of both runs
>>>>    and perform the division. This can be done by using the YODA
>>>>    library and hence easily in Python or C++.
>>>>
>>>>    Comments: So far there is no standard way to distribute or
>>>>    execute such a post-processing executable. A standard workflow
>>>>    to include a post-processing step would be desirable. A
>>>>    standardized hook to execute an arbitrary external script might
>>>>    provide more flexibility, because such external scripts could be
>>>>    written in Python or C++ and could have an almost arbitrary level
>>>>    of complexity. A sketch of such a script is given below.
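>>>>
>>>>    As an illustration, a minimal C++ sketch of such an external
>>>>    post-processing step using the YODA library (file names and the
>>>>    histogram path are placeholders for whatever the actual analysis
>>>>    books; any N_coll / nuclear overlap scaling is omitted):
>>>>
>>>>    #include "YODA/ReaderYODA.h"
>>>>    #include "YODA/WriterYODA.h"
>>>>    #include "YODA/Histo1D.h"
>>>>    #include "YODA/Scatter2D.h"
>>>>    #include <iostream>
>>>>    #include <string>
>>>>    #include <vector>
>>>>
>>>>    int main(int argc, char** argv) {
>>>>      if (argc < 4) {
>>>>        std::cerr << "Usage: " << argv[0] << " AA.yoda pp.yoda RAA.yoda\n";
>>>>        return 1;
>>>>      }
>>>>
>>>>      // Read the YODA output of the AA run and of the pp run.
>>>>      std::vector<YODA::AnalysisObject*> aosAA, aosPP;
>>>>      YODA::ReaderYODA::create().read(std::string(argv[1]), aosAA);
>>>>      YODA::ReaderYODA::create().read(std::string(argv[2]), aosPP);
>>>>
>>>>      // Placeholder path of the pT spectrum booked by the analysis.
>>>>      const std::string path = "/MY_RAA_ANALYSIS/pT_spectrum";
>>>>      const YODA::Histo1D *hAA = 0, *hPP = 0;
>>>>      for (size_t i = 0; i < aosAA.size(); ++i)
>>>>        if (aosAA[i]->path() == path)
>>>>          hAA = dynamic_cast<const YODA::Histo1D*>(aosAA[i]);
>>>>      for (size_t i = 0; i < aosPP.size(); ++i)
>>>>        if (aosPP[i]->path() == path)
>>>>          hPP = dynamic_cast<const YODA::Histo1D*>(aosPP[i]);
>>>>      if (!hAA || !hPP) {
>>>>        std::cerr << "Spectrum not found in one of the inputs\n";
>>>>        return 1;
>>>>      }
>>>>
>>>>      // Bin-by-bin ratio AA/pp, written out as a scatter.
>>>>      YODA::Scatter2D raa = YODA::divide(*hAA, *hPP);
>>>>      raa.setPath("/MY_RAA_ANALYSIS/RAA");
>>>>      YODA::WriterYODA::create().write(std::string(argv[3]), raa);
>>>>      return 0;
>>>>    }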
>>>>
>>>> 2) Specific executable, Rivet as library: In this case I wrote an
>>>>    executable which takes care of creating one instance of
>>>>    Rivet::AnalysisHandler and manages the read-in of two HepMC
>>>>    files. I based the code on the source code of the executable
>>>>    $MCPLOTS/scripts/mcprod/rivetvm/rivetvm.cc implemented in
>>>>    mcplots. My modifications are sketched in [1]. In this way, both
>>>>    data sets are available in the same analysis and the division can
>>>>    simply be done in the finalize step.
>>>>
>>>>    Comments: It is also already possible on the commandline to pass
>>>>    two or more HepMC files to Rivet for sequential processing.
>>>>
>>>> 3) The goal of my last approach was to enable Rivet to produce
>>>>    reasonable analysis output without external dependencies.
>>>>    Furthermore, it should be possible to produce the pp and heavy
>>>>    ion YODA files asynchronously and independently of each other,
>>>>    bringing them together using only Rivet. Therefore, Rivet
>>>>    was modified to allow reading back the YODA output. This also
>>>>    allows us to implement the post-processing in the analysis
>>>>    class.
>>>>
>>>>    Comments: You can find the code on
>>>> https://gitlab.cern.ch/bvolkel/rivet-for-heavy-ion/tree/working.
>>>>    The basic steps can be found in [2] and more comments can be
>>>>    found directly in the source code.
>>>>
>>>>    For the R_AA analysis, Rivet can first be run with a pp generator.
>>>>    In the resulting YODA file only the pp objects are filled. In a
>>>>    second run, with the AA generator, Rivet can be started passing
>>>>    the first YODA file as additional input. In the Rivet analysis
>>>>    itself the heavy ion objects are filled. However, after
>>>>    finalize(), a method Analysis::replaceByData is called: the
>>>>    objects normally produced by the analysis are replaced by those
>>>>    from the provided YODA file if the latter have a non-zero number
>>>>    of entries. Hence, after the replacement there are filled and
>>>>    finalized pp objects coming from the first run and AA objects
>>>>    from the second run. Those can now be used in a newly introduced
>>>>    post() method which manages e.g. the division of histograms in
>>>>    the case of the R_AA analysis. It is also possible to provide a
>>>>    YODA file in which both the pp and the AA objects are filled and
>>>>    only the R_AA objects have to be calculated. No actual analysis
>>>>    is done (0 events from an MC generator), but init(), analyze(...),
>>>>    finalize() and the post() method are called: the histograms are
>>>>    booked, nothing happens in analyze(...) and finalize(), the
>>>>    corresponding histograms are replaced after finalize(), and post()
>>>>    handles the division. Basically, this is a similar approach to
>>>>    the one in scenario 1) but no external dependencies are involved.
>>>>    Also, scenario 2) remains possible if the YODA input is avoided.
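>>>>
>>>>    To make the intended use on the analysis side concrete, here is a
>>>>    rough sketch (histogram names and binning are placeholders; post()
>>>>    is the proposed new hook from [2], called after finalize() once
>>>>    the objects from the YODA input have been swapped in):
>>>>
>>>>    #include "Rivet/Analysis.hh"
>>>>
>>>>    namespace Rivet {
>>>>
>>>>      class RAA_EXAMPLE : public Analysis {
>>>>      public:
>>>>        RAA_EXAMPLE() : Analysis("RAA_EXAMPLE") {}
>>>>
>>>>        void init() {
>>>>          _h_pp  = bookHisto1D("pT_pp", 50, 0.0, 100.0); // filled in the pp run
>>>>          _h_AA  = bookHisto1D("pT_AA", 50, 0.0, 100.0); // filled in the AA run
>>>>          _s_RAA = bookScatter2D("RAA");                 // computed only in post()
>>>>        }
>>>>
>>>>        void analyze(const Event& event) {
>>>>          // fill _h_pp or _h_AA depending on whether the event is pp or AA
>>>>        }
>>>>
>>>>        void finalize() {
>>>>          // per-run normalisation only; the histogram of the other run
>>>>          // is still empty here and is replaced from the YODA input
>>>>        }
>>>>
>>>>        void post() {
>>>>          // both _h_pp and _h_AA are filled now, so the ratio can be
>>>>          // formed (N_coll / nuclear overlap scaling omitted here)
>>>>          divide(_h_AA, _h_pp, _s_RAA);
>>>>        }
>>>>
>>>>      private:
>>>>        Histo1DPtr _h_pp, _h_AA;
>>>>        Scatter2DPtr _s_RAA;
>>>>      };
>>>>
>>>>      DECLARE_RIVET_PLUGIN(RAA_EXAMPLE);
>>>>    }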
>>>>
>>>> All methods work and lead to the desired output. The first two do
>>>> not need an extension of the Rivet source code. While the first one
>>>> allows for the largest amount of flexibility, the second one can be
>>>> implemented most quickly, and all steps can be encapsulated in one
>>>> analysis class in Rivet. However, it always requires two MC
>>>> generator runs at the same time. Finally, there is one thing all
>>>> approaches have in common, namely the extension of the rivet
>>>> executable and related ones, in order to account for these analysis
>>>> types on the command line: 1) linking to the external
>>>> post-processing script in a consistent way, 2) parallel processing
>>>> of at least two HepMC files, 3) read-in of a YODA file.
>>>>
>>>> I hope that I could explain the issues in a reasonable manner. Jan,
>>>> Jochen and I are looking forward to a fruitful discussion of how to
>>>> implement analyses like the one mentioned above in a reasonable way.
>>>> Please give us critical feedback concerning the approaches and let us
>>>> know if there are more appropriate ways of solving our problems which I
>>>> haven't accounted for yet. A consistent and straightforward way of
>>>> implementing those analyses in Rivet would be extremely helpful.
>>>>
>>>> Best regards and many thanks,
>>>>
>>>> Benedikt
>>>>
>>>>
>>>>
>>>> ---------------------Appendix---------------------
>>>>
>>>>
>>>> [1] Sketch of the modification of
>>>> $MCPLOTS/scripts/mcprod/rivetvm/rivetvm.cc
>>>>
>>>> ---------------------------------
>>>> ...
>>>>
>>>> std::ifstream is1( file1 );   // HepMC file from the first run (e.g. pp)
>>>> std::ifstream is2( file2 );   // HepMC file from the second run (e.g. AA)
>>>>
>>>> ...
>>>>
>>>> Rivet::AnalysisHandler rivet;
>>>>
>>>> HepMC::GenEvent evt1;
>>>> HepMC::GenEvent evt2;
>>>>
>>>> rivet.addAnalysis( RAA_analysis );
>>>>
>>>> while ( is1 && is2 ) {   // stop when either input runs out
>>>>
>>>>    ...
>>>>    evt1.read(is1);
>>>>    evt2.read(is2);
>>>>
>>>>    rivet.analyze(evt1);
>>>>    rivet.analyze(evt2);
>>>>    ...
>>>>
>>>> }
>>>>
>>>> ...
>>>> ---------------------------------
>>>>
>>>> [2] Basics of the small extension of Rivet
>>>>
>>>> Rivet::Analysis:
>>>> Introducing member: bool _haveReadData
>>>> Introducing: void post()
>>>> Introducing: void Analysis::replaceByData( std::map< std::string,
>>>> AnalysisObjectPtr > readObjects )
>>>>
>>>> Rivet::AnalysisHandler:
>>>> Introducing members: std::map< std::string, AnalysisObjectPtr >
>>>> _readObjects and bool _haveReadData
>>>> Introducing: void AnalysisHandler::readData(const std::string&
>>>> filename)
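>>>>
>>>> A driver using the extended AnalysisHandler could then look roughly
>>>> like this (a sketch: readData() is the new method above, the file
>>>> names are placeholders, and the rest is the existing AnalysisHandler
>>>> and HepMC2 interfaces):
>>>>
>>>>    #include "Rivet/AnalysisHandler.hh"
>>>>    #include "HepMC/GenEvent.h"
>>>>    #include "HepMC/IO_GenEvent.h"
>>>>
>>>>    int main() {
>>>>      Rivet::AnalysisHandler rivet;
>>>>      rivet.addAnalysis("RAA_EXAMPLE");   // the analysis sketched in 3)
>>>>      rivet.readData("pp_run.yoda");      // new: YODA output of the pp run
>>>>
>>>>      // Run over the AA HepMC file as usual.
>>>>      HepMC::IO_GenEvent io("AA_run.hepmc", std::ios::in);
>>>>      HepMC::GenEvent* evt = io.read_next_event();
>>>>      if (evt) rivet.init(*evt);
>>>>      while (evt) {
>>>>        rivet.analyze(*evt);
>>>>        delete evt;
>>>>        evt = io.read_next_event();
>>>>      }
>>>>
>>>>      // finalize(), then the replacement from the YODA input and post()
>>>>      rivet.finalize();
>>>>      rivet.writeData("RAA.yoda");
>>>>      return 0;
>>>>    }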
>>>
>>>
>>
>
>


-- 
Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow

