<div dir="ltr"><div><div>Hi all,<br><br></div>Just to add my 2c to this discussion: I think we should remove any guesswork in yodamerge. Forcing the user to specify exactly what s/he wants should prevent bugs and miscues. Unexpected results are a big downside to providing "magic" scripts that automatically figure out what the user is most likely to want.<br><br>Taking this maybe one step further: my opinion is that yoda definitely shouldn't make any physics-motivated assumptions about its inputs. It makes sense to me to have a rivet-merge-runs script or the like, but it should live in rivet-, not yoda-space.<br><br></div><div>Regards,<br></div><div><br></div>Chris<br><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, May 8, 2015 at 6:16 PM, Andy Buckley <span dir="ltr"><<a href="mailto:andy.buckley@cern.ch" target="_blank">andy.buckley@cern.ch</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 04/05/15 15:11, Frank Siegert wrote:<br> > Hi Andy,<br> ><br> >> Another thing on my to-do list is Frank's suggestion about making<br> >> --assume-normalized the default behaviour of yodamerge. I think we need<br> >> to discuss that properly: there is a tension between practicality and<br> >> principle in the heuristics we use, which can only be properly fixed by<br> >> storing the pre-finalize data objects and re-running finalize on the<br> >> merged end-of-loop histos. But let's have the discussion... maybe we can<br> >> make a decision in the next ~week in time for 2.2.2?<br> ><br> > Already at the end of the ~week time frame, sorry...<br> > So what is the principle due to which the current heuristics default<br> > is better? 
If we can't agree on a good default then we could just get<br> > rid of the heuristics completely and let the user specify whether the<br> > input yoda files come from identical runs with different random seeds<br> > or whether they are from different processes which should be added.<br> <br> </span>Hi Frank,<br> <br> I'm not sure my argument is very good. Actually I'm not sure *any*<br> guesswork like this is a good idea! But at least the current approach<br> *does* allow for the possibility of both normalized and unnormalized<br> histo merging.<br> <br> The choice is just between the direction of fallback if a check for<br> equal normalizations fails: I took the approach that if the histos to be<br> merged have sufficiently different normalizations, then I assume that<br> they are "raw", i.e. unnormalized and should just be added together. If<br> I was to assume that despite the mismatched integrals, the histos really<br> are normalized, then there is no path by which to combine unnormalized<br> histos: that obviously has drawbacks.<br> <br> But maybe I misunderstood: did you mean that we should check for the<br> "ScaledBy" attribute, and if there is one then we assume (by default,<br> presumably switchable) that equal norms were intended, while if there<br> isn't such an attribute then they *must* be raw histos? There are<br> downsides to this approach, too, but if you think it would be a better<br> match to real use-cases then I won't object to switching the logic<br> around -- provided that we add an --assume-unnormalized switch to<br> re-enable the current behaviour.<br> <br> Any way like this is a hack: we have to guess what the ScaledBy *meant*<br> semantically, while the finalize code really *knows* what is being done<br> to make the final histos. And we're never going to be able to correctly<br> add Scatters this way. 
So we *need* to get the re-engineering work done<br> to allow "proper" re-running of finalize with the merged pre-finalize<br> data objects "bootstrapped" from file.<br> <br> Andy<br> <br> PS. I'm not even daring to think about the distinction between<br> hetero/homogeneous merging at the moment! Any thoughts on how we should<br> approach that? But I do advocate that -- for now -- when you need to<br> know what you're doing rather than just making a quick plot, it's better<br> to write a little script than to fire off yodamerge and hope that it'll<br> do the right thing for your data objects. We could add little merging<br> routines to the YODA Python library to help with such scripts -- feel<br> free to contribute! (And also to the yoda.plotting library...)<br> <span class="im HOEnZb"><br> --<br> Dr Andy Buckley, Lecturer / Royal Society University Research Fellow<br> Particle Physics Expt Group, University of Glasgow<br> </span><div class="HOEnZb"><div class="h5">_______________________________________________<br> Rivet mailing list<br> <a href="mailto:Rivet@projects.hepforge.org">Rivet@projects.hepforge.org</a><br> <a href="https://www.hepforge.org/lists/listinfo/rivet" target="_blank">https://www.hepforge.org/lists/listinfo/rivet</a><br> </div></div></blockquote></div><br></div></div></div></div>