[Rivet] Compatibility of YODA reference data from Rivet with HEPData

Peter Skands peter.skands at monash.edu
Mon Jun 18 03:21:08 BST 2018


Hi all,

Thanks, Graeme, for the new script. David/Jon/Rivet, any reactions? Would it
be possible, e.g., to show the status reported by this script alongside the
‘VALIDATED’ / ‘UNVALIDATED’ comment in the Rivet analysis summaries? That
would at least give everyone a direct, immediately visible tag when just
looking at analyses in a browser.

Cheers,
Peter


On 15 June 2018 at 11:20:25 pm, Graeme Watt (graeme.watt at durham.ac.uk)
wrote:

Dear Peter (and Rivet developers),

No, I don't think this is something that Joanne (or Keith) could help with.

https://github.com/HEPData/miscellaneous/blob/master/scripts/rivet-diffhepdata-all

I wrote another script "rivet-diffhepdata-all" that loops over all Rivet
analyses listed in http://rivet.hepforge.org/analyses.json and compares
each Rivet .yoda file with the HEPData download.  It calls functions from
the previous "rivet-diffhepdata" script which in turn calls "yodadiff".

I added the URL option to pass the Rivet analysis name to HEPData when
requesting a YODA conversion, e.g.

https://hepdata.net/record/ins319520?format=yoda&rivet=ALEPH_1991_S2435284

This is necessary, for example, for Rivet analysis names containing the
SPIRES ID rather than the INSPIRE ID.  The "rivet-mkanalysis" script could
be modified accordingly:

https://rivet.hepforge.org/trac/browser/bin/rivet-mkanalysis#L139
hdurl = "http://www.hepdata.net/record/ins%s?format=yoda&rivet=%s" % (ANAINSPIREID, ANANAME)
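
As a rough illustration of the URL pattern above, the following Python sketch
assembles the download link from an INSPIRE ID and a Rivet analysis name.  The
function name "hepdata_yoda_url" is hypothetical and is not part of
"rivet-mkanalysis"; only the URL layout is taken from the example above.

```python
from urllib.parse import urlencode


def hepdata_yoda_url(inspire_id, rivet_name):
    """Build a HEPData YODA download URL (hypothetical helper, not Rivet API)."""
    # The "rivet" parameter passes the analysis name to HEPData, which matters
    # when the name carries a SPIRES ID instead of the INSPIRE ID.
    query = urlencode({"format": "yoda", "rivet": rivet_name})
    return "https://hepdata.net/record/ins%s?%s" % (inspire_id, query)


print(hepdata_yoda_url(319520, "ALEPH_1991_S2435284"))
# prints https://hepdata.net/record/ins319520?format=yoda&rivet=ALEPH_1991_S2435284
```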

The web directory http://ippp.dur.ac.uk/~watt/RivetDiffHEPData/Rivet-2.6.0/
contains the output of running:

rivet-diffhepdata-all -r ../Rivet-2.6.0/analyses -d HEPDataYoda -o YodaDiffOutput > rivet-diffhepdata-all.txt

The summary line doesn't look too promising:

"Of 359 Rivet analyses in ../Rivet-2.6.0/analyses, 66 (18.4%) were
compatible and 293 (81.6%) were incompatible."

Here, compatibility is defined as a zero exit status returned by
"yodadiff".  Only 31 of these 293 incompatible analyses are missing a
HEPData record.
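
The tally behind such a summary line can be sketched in a few lines of Python.
This is not the code of "rivet-diffhepdata-all"; the function name and the
dictionary of exit statuses are assumptions made for illustration, with
"compatible" meaning a zero exit status as defined above.

```python
def summarise(exit_codes, analyses_dir):
    """Format a summary from a {analysis name: yodadiff exit status} dict."""
    total = len(exit_codes)
    if total == 0:
        return "No Rivet analyses found in %s." % analyses_dir
    # An analysis counts as compatible only if yodadiff returned zero.
    compatible = sum(1 for status in exit_codes.values() if status == 0)
    incompatible = total - compatible
    return ("Of %d Rivet analyses in %s, %d (%.1f%%) were compatible and "
            "%d (%.1f%%) were incompatible." % (
                total, analyses_dir,
                compatible, 100.0 * compatible / total,
                incompatible, 100.0 * incompatible / total))


# Placeholder names and statuses, purely illustrative
statuses = {"ANALYSIS_A": 0, "ANALYSIS_B": 1, "ANALYSIS_C": 1}
print(summarise(statuses, "../Rivet-2.6.0/analyses"))
```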

Note that the "yodadiff" script raises a ZeroDivisionError in the function
"eq(a, b)" when "a" and "b" have equal magnitude but opposite sign, because
the denominator in the return value is then zero:

return abs(float(a) - float(b))/(float(a) + float(b)) < opts.TOL
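
One possible way to avoid the division by zero, sketched here rather than
taken from "yodadiff", is to normalise by the sum of absolute values.  The
name "fuzzy_eq" and the default tolerance are assumptions; "opts.TOL" in the
real script plays the role of "tol".

```python
def fuzzy_eq(a, b, tol=1e-3):
    """Relative comparison that never divides by zero (illustrative sketch)."""
    a, b = float(a), float(b)
    if a == b:
        return True  # also covers a == b == 0, so no division is needed
    # Here abs(a) + abs(b) > 0, even when a == -b.
    return abs(a - b) / (abs(a) + abs(b)) < tol


print(fuzzy_eq(6.8, -6.8))  # False: opposite signs, but no exception
print(fuzzy_eq(0.0, 0.0))   # True
```

With this normalisation, values of opposite sign always compare as unequal
for any tolerance below 1, instead of raising an exception.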

For example, d01-x01-y01 of ATLAS_2013_I1190187.yoda distributed with Rivet
has yerr- = 6.8 and yerr+ = -6.8.  The HEPData table (
http://www.hepdata.net/record/ins1190187?version=1&table=Table1 ) gives
both yerr- and yerr+ as 1.212930e+01, so it seems that the Rivet .yoda file
contains only the statistical error (with a wrong sign for yerr+).  The
Rivet .yoda file also assigns an artificial bin width of 1 for sqrt(s),
whereas the HEPData table does not assign a bin width for sqrt(s).  Looking
at the YODA export from the old HepData site:

http://hepdata.cedar.ac.uk/view/ins1190187/d1/yoda

again there are zero xerr- and xerr+ values, but the AIDA export:

http://hepdata.cedar.ac.uk/view/ins1190187/d1/aida

has a unit bin width written by "AidaFormatter.java" with a comment (by
Andy?):

// If there's only one bin and it has no width, give
// it unit width so that it can be filled and the height
// doesn't go mad when Rivet tries to use it.

I would argue that any construction of artificial bin widths would be
better handled on the Rivet side rather than in the HEPData export of the
data.  Removing artificial bin widths from the Scatter objects in the Rivet
.yoda files would resolve some of the incompatibilities between HEPData and
Rivet, but many more would remain.  I don't propose a universal solution
for now, but just want to start by identifying where the differences lie.
It will be interesting to monitor whether the degree of incompatibility,
now easily quantified by my new "rivet-diffhepdata-all" script, improves
for subsequent Rivet releases.
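
A minimal sketch of what handling artificial bin widths on the Rivet side
might look like, assuming a point is represented by a plain
(x, xerr-, xerr+) tuple rather than a YODA object.  The helper name and the
tuple layout are hypothetical; only the unit-width rule comes from the
"AidaFormatter.java" comment quoted above.

```python
def pad_single_point(points, width=1.0):
    """Give a lone zero-width point an artificial width; leave other data alone.

    `points` is a list of (x, xerr_minus, xerr_plus) tuples (hypothetical layout).
    """
    if len(points) != 1:
        return list(points)
    x, xerr_minus, xerr_plus = points[0]
    if xerr_minus == 0 and xerr_plus == 0:
        # Only one bin and it has no width: spread `width` evenly around x.
        return [(x, width / 2.0, width / 2.0)]
    return list(points)


# A single point with no bin width (the x value is illustrative only)
print(pad_single_point([(7000.0, 0.0, 0.0)]))  # [(7000.0, 0.5, 0.5)]
```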

Best regards,
Graeme Watt (HEPData)


On 25/05/18 23:17, Peter Skands wrote:

Hi All,

Agreed, it sounds like it could make sense as part of Rivet, and it could
make a real difference by making the consistency check almost trivial and
easy to run when developing / releasing analyses. In the latter case,
feedback could go back to the people submitting a new analysis if there is
a discrepancy, which would then put the burden of ensuring consistency
mainly on the people who contribute the analyses, without adding
significantly to the workload of the Rivet / HepData authors. Regarding the
existing / old analyses, Graeme, do you think there is any chance of the
person in Durham having a go at the backlog, or would that be too
challenging for her? It might be worth preparing an example, showing it to
Keith, and seeing what he thinks?

Cheers,
Peter

—

*PETER SKANDS* Associate Professor


*School of Physics and Astronomy* Monash University
10 College Walk, Clayton Campus
Melbourne, VIC 3800
Australia

T: +61 3 990 53692
E: peter.skands at monash.edu
W: skands.physics.monash.edu

On 26 May 2018 at 2:52:23 am, Graeme Watt (graeme.watt at durham.ac.uk) wrote:

Dear David,

OK, here's another much simpler Python script that downloads the YODA
file from HEPData and then calls yodadiff:

https://github.com/HEPData/miscellaneous/blob/master/scripts/rivet-diffhepdata

I'm not sure if this even warrants a script, given that the
functionality of:

  rivet-diffhepdata ATLAS_2017_I1614149.yoda -i 1614149

could be obtained with two commands, e.g.

  curl -L "https://hepdata.net/record/ins1614149?format=yoda" | tar zx
  yodadiff ATLAS_2017_I1614149.yoda HEPData-ins1614149-v2-yoda.yoda
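
For scripted use, the unpacking step of the first command could also be done
in Python.  The sketch below handles only the archive contents and leaves the
download itself out; it assumes, as the "tar zx" above implies, that HEPData
returns a gzipped tarball.  The function name is hypothetical.

```python
import io
import tarfile


def extract_yoda_members(archive_bytes):
    """Return {member name: text} for every .yoda file in a gzipped tarball."""
    found = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            # Skip directories and anything that is not a YODA file.
            if member.isfile() and member.name.endswith(".yoda"):
                found[member.name] = tar.extractfile(member).read().decode("utf-8")
    return found
```

The returned text could then be written to disk and passed to "yodadiff"
together with the Rivet .yoda file.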

The yodadiff script gives more detailed output of differences than my
previous script, but it does not compare annotations (maybe this
functionality could be added to yodadiff as an option?).  Also, the
yodadiff script flags additional analysis objects (like covariance
matrices) that are present in the HEPData YODA file but not in the Rivet
YODA file (as is the case for ATLAS_2017_I1614149), whereas my previous
script was specifically written to ignore these differences.  I suppose
this is OK if the user just ignores the warnings from yodadiff about
additional analysis objects in the HEPData YODA file.

Best regards,
Graeme


On 25/05/18 11:01, David Grellscheid wrote:
> Hi Graeme,
>
> Yes, I did mean yodadiff, sorry! There's no way it will start to include
> a Hepdata download option, it is meant to do one job of comparing two
> yoda files, which may have nothing at all to do with HEP or Rivet.
>
> The download option could be provided (in rivet/bin, not yoda/bin!) by a
> thin layer over the top of yodadiff, though. If you'd like to adapt your
> script in that way, we'd be happy to include it in the rivet distribution.
>
> See you,
>
> David
>
>
> On 22/05/2018 12:20, Graeme Watt wrote:
>> Dear David,
>>
>> Thanks, I think you mean yodadiff (not yodacmp).  You're right, it looks
>> like yodadiff does much the same as my script, other than the HEPData
>> download and optional comparison of annotations.  Maybe these features
>> could be added to yodadiff?  In any case, it would be good to make such
>> comparisons part of the validation and release procedure of each new
>> Rivet analysis, which is apparently not happening at the moment (other
>> than possibly for ATLAS analyses).
>>
>> Best regards,
>> Graeme
>>
>>
>> On 22/05/18 09:54, David Grellscheid wrote:
>>> Hi Graeme,
>>>
>>> thanks for your email, we have started discussing it. Just one technical
>>> point, YODA comes with a yodacmp script already, which addresses many of
>>> your technical issues with comparisons. Maybe you can use that
>>> internally in your script?
>>>
>>> See you,
>>>
>>>   David
>>>
>>>
>>> On 21/05/2018 20:16, Graeme Watt wrote:
>>>> Dear Rivet developers,
>>>>
>>>> I wrote a Python script to compare a YODA reference data file, intended
>>>> for inclusion in Rivet, with the corresponding YODA file downloaded from
>>>> HEPData:
>>>>
>>>>
>>>> https://github.com/HEPData/miscellaneous/blob/master/scripts/yoda_compare_hepdata.py
>>>>
>>>>
>>>>    Example usage:  ./yoda_compare_hepdata.py ATLAS_2017_I1614149.yoda -i 1614149 -a
>>>>
>>>> This means: compare a local YODA file "ATLAS_2017_I1614149.yoda" with a
>>>> YODA file downloaded from the HEPData record with INSPIRE ID "1614149"
>>>> and also compare YODA annotations "-a".  Since the HEPData YODA file
>>>> might contain additional analysis objects compared to the Rivet YODA
>>>> file, and since there might be inconsequential rounding errors or
>>>> differences in number formats, comparison using a simple "diff" of .yoda
>>>> files is not always adequate.
>>>>
>>>> I had a few problems with the YODA 1.7.0 software when writing the
>>>> Python script, which could perhaps be improved in future:
>>>>
>>>> * Calling dump() on Scatter objects does not output the same format as
>>>> in the input .yoda files, e.g. dumping a Scatter2D gives "HISTO1D" in
>>>> the output and the central value of the x bin is not output.  This might
>>>> be due to deficiencies in
>>>> https://yoda.hepforge.org/trac/browser/src/WriterFLAT.cc .
>>>> * I expected to be able to check (fuzzy) equality of two Scatter objects
>>>> using "s == s1", which seems to be implemented in C++ but not in Python.
>>>> * Checking (fuzzy) equality of two Point2D objects using "p == p1" is
>>>> implemented in Python, but it only compares the x axes and not the y
>>>> axes.  Similarly, for Point3D objects, the (fuzzy) equality operator
>>>> "==" only compares the x and y axes, but not the z axes.
>>>> * In the end, I just copied the definition of "fuzzyEquals" from the C++
>>>> code into my script and did my own comparisons, without relying on the
>>>> "==" operator for Point or Scatter objects.
>>>>
>>>> Recall that Holger Schulz made some similar comparisons in 2016 between
>>>> YODA reference data files from Rivet and from the old HepData, where he
>>>> found significant inconsistencies:
>>>>
>>>> https://www.hepforge.org/lists-archive/rivet/2016-October/007318.html
>>>>
>>>> https://rivet.hepforge.org/trac/browser/contrib/devscripts/HepDataConsistency
>>>>
>>>>
>>>>
>>>> While fixing all Rivet/HEPData inconsistencies is probably unrealistic
>>>> for now, we could at least start by ensuring that analyses added to new
>>>> Rivet releases include a YODA file that's compatible with the YODA
>>>> download given by HEPData.  My new script should be useful for these
>>>> purposes.  (This work was prompted by a conversation with Peter Skands
>>>> and Jon Butterworth last month in Durham.)  You're welcome to modify my
>>>> script as you need and include it in a future Rivet release.  The script
>>>> could be run by the experiment contact persons before they upload new
>>>> analyses to the Rivet "contrib" area.  It could also be run (perhaps in
>>>> a more automated way) by the Rivet developers when moving analyses from
>>>> the "contrib" area to a new Rivet release.  I just ran the script for
>>>> all the new analyses in the current Rivet "contrib" area (
>>>> https://www.hepforge.org/archive/rivet/contrib/ ) and it already turned
>>>> up some useful information:
>>>>
>>>> * ATLAS_2014_I1310835, ATLAS_2017_I1614149, and ATLAS_2017_I1624693 are
>>>> all compatible with HEPData.
>>>> * ATLAS_2016_I1502620 has multiple .yoda files which can't be handled by
>>>> my script.
>>>> * ATLAS_2017_I1625109 showed some apparent inconsistencies but the
>>>> problem looks to be on the HEPData side.  The dataset index starts at 0
>>>> (instead of 1) due to a bug in the hepdata-converter package (which I'll
>>>> fix) and the year in the analysis name is 2018 (instead of 2017) taken
>>>> from the journal publication.
>>>> * ALICE_2017_I1620477 is incompatible with HEPData.
>>>> * CMS_2016_I1487277 is compatible with HEPData.
>>>> * CMS_2012_I1111014 and CMS_2016_I1491950 are incompatible with HEPData.
>>>> * CMS_2017_I1499471, CMS_2017_I1635889, CMS_2018_I1662081 and
>>>> CMS_2018_I1663958 are all missing from HEPData.
>>>> * CMS_2017_I1605749 has a HEPData record, but the YODA conversion fails
>>>> due to a problem with the original HEPData submission.
>>>> * LHCF_2015_I1351909 is compatible with HEPData, but the annotations
>>>> differ.
>>>> * LHCF_2016_I1385877 is incompatible with HEPData.
>>>>
>>>> I hope this gives you something to discuss at this week's Rivet
>>>> workshop! :-)
>>>>
>>>> Best regards,
>>>> Graeme Watt (HEPData)
>>>>