[Rivet] Thoughts on a Rivet 2.2.2 release?

Andy Buckley andy.buckley at cern.ch
Fri May 8 17:16:07 BST 2015


On 04/05/15 15:11, Frank Siegert wrote:
> Hi Andy,
> 
>> Another thing on my to-do list is Frank's suggestion about making
>> --assume-normalized the default behaviour of yodamerge. I think we need
>> to discuss that properly: there is a tension between practicality and
>> principle in the heuristics we use, which can only be properly fixed by
>> storing the pre-finalize data objects and re-running finalize on the
>> merged end-of-loop histos. But let's have the discussion... maybe we can
>> make a decision in the next ~week in time for 2.2.2?
> 
> Already at the end of the ~week time frame, sorry...
> So what is the principle due to which the current heuristics default
> is better? If we can't agree on a good default then we could just get
> rid of the heuristics completely and let the user specify whether the
> input yoda files come from identical runs with different random seeds
> or whether they are from different processes which should be added.

Hi Frank,

I'm not sure my argument is very good. Actually I'm not sure *any*
guesswork like this is a good idea! But at least the current approach
*does* allow for the possibility of both normalized and unnormalized
histo merging.

The choice is just between the direction of fallback if a check for
equal normalizations fails: I took the approach that if the histos to be
merged have sufficiently different normalizations, then I assume that
they are "raw", i.e. unnormalized, and should just be added together. If
I were to assume that despite the mismatched integrals, the histos really
are normalized, then there is no path by which to combine unnormalized
histos: that obviously has drawbacks.
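
As a rough illustration of the fallback direction described above, here is a
minimal Python sketch. It is not the yodamerge implementation: the data
layout (plain lists of bin heights plus a per-run sum of weights), the
function names, the 1% tolerance and the weighted-average combination are
all assumptions made for the example.

```python
import math

def integral(heights, widths):
    """Area of a histogram given bin heights and bin widths."""
    return sum(h * w for h, w in zip(heights, widths))

def merge_with_fallback(histos, widths, rel_tol=0.01):
    """Merge histograms sharing one binning.

    histos: list of (heights, sumw) pairs, where sumw is the run's
            sum of weights (hypothetical layout, not a YODA type).
    Returns (mode, merged_heights).
    """
    integrals = [integral(h, widths) for h, _ in histos]
    ref = integrals[0]
    same_norm = all(math.isclose(i, ref, rel_tol=rel_tol) for i in integrals)
    nbins = len(widths)
    if same_norm:
        # Integrals agree: assume normalized histos from equivalent runs,
        # and combine as a sumw-weighted average so the norm is preserved.
        total_w = sum(w for _, w in histos)
        merged = [sum(h[i] * w for h, w in histos) / total_w
                  for i in range(nbins)]
        return "normalized", merged
    # Integrals differ: fall back to treating the histos as raw and add them.
    merged = [sum(h[i] for h, _ in histos) for i in range(nbins)]
    return "raw", merged
```

Reversing the fallback would mean replacing the final branch with the
weighted average too, at which point nothing in the function can ever add
raw histos: that is the drawback mentioned above.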

But maybe I misunderstood: did you mean that we should check for the
"ScaledBy" attribute, and if there is one then we assume (by default,
presumably switchable) that equal norms were intended, while if there
isn't such an attribute then they *must* be raw histos? There are
downsides to this approach, too, but if you think it would be a better
match to real use-cases then I won't object to switching the logic
around -- provided that we add an --assume-unnormalized switch to
re-enable the current behaviour.
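
To make that alternative concrete, here is a small Python sketch of the
decision alone. The annotations are modelled as plain dicts, and the
function name and the handling of mixed inputs (any histo without
"ScaledBy" forces raw addition) are assumptions for illustration, not
existing yodamerge behaviour. The assume_unnormalized argument is loosely
modelled on the switch proposed above; here it simply forces raw addition,
which is a simplification of "the current behaviour".

```python
def choose_merge_mode(annotations, assume_unnormalized=False):
    """Pick a merge mode from per-histogram annotation dicts.

    annotations: one dict per input histogram (hypothetical layout).
    assume_unnormalized: stand-in for the proposed override switch.
    """
    all_scaled = all("ScaledBy" in a for a in annotations)
    if all_scaled and not assume_unnormalized:
        # Every input was scaled in finalize: assume equal norms were intended.
        return "normalized"
    # No ScaledBy on some input (or the user overrode): treat as raw histos.
    return "raw"
```

For example, choose_merge_mode([{"ScaledBy": 0.5}, {"ScaledBy": 0.25}])
selects the normalized path, while passing assume_unnormalized=True or
including a histo with no "ScaledBy" entry selects raw addition.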

Any approach like this is a hack: we have to guess what the ScaledBy *meant*
semantically, while the finalize code really *knows* what is being done
to make the final histos. And we're never going to be able to correctly
add Scatters this way. So we *need* to get the re-engineering work done
to allow "proper" re-running of finalize with the merged pre-finalize
data objects "bootstrapped" from file.

Andy

PS. I'm not even daring to think about the distinction between
hetero/homogeneous merging at the moment! Any thoughts on how we should
approach that? But I do advocate that -- for now -- when you need to
know what you're doing rather than just making a quick plot, it's better
to write a little script than to fire off yodamerge and hope that it'll
do the right thing for your data objects. We could add little merging
routines to the YODA Python library to help with such scripts -- feel
free to contribute! (And also to the yoda.plotting library...)

-- 
Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow

