[Rivet] Hepdata numbering

WATT, GRAEME graeme.watt at durham.ac.uk
Wed Feb 22 00:12:32 GMT 2017

Dear Andy,

The change to the YODA export of multidimensional tables from the new HEPData site is a secondary issue, which has no practical consequences for existing Rivet analyses as far as I know. The wider issue is the inconsistency between path names in YODA files included in Rivet analyses (where people often choose their own ‘x’ and ‘y’ values) and those in YODA files exported from the old HepData site (where the ‘x’ and ‘y’ values are the numbers of the axes within a table, usually 01). I see from the Rivet mailing list that Holger made a study showing these inconsistencies last October: https://www.hepforge.org/lists-archive/rivet/2016-October/007318.html

I think it should be part of the validation procedure for a new Rivet analysis that the YODA file matches the HepData/HEPData export. I would like to keep the automatic path names on the HEPData side, so that custom path names would need to be handled on the Rivet side. Allowing an override in the HEPData input file for new records would not fix the inconsistencies with existing records. These are not new issues and I’ve raised them with you (and Holger) at various points in the last few years, e.g. from an email to you in 2014:

---

On 9 Jun 2014, at 11:09, Graeme Watt <Graeme.Watt at durham.ac.uk> wrote:

However, I'm a bit uncomfortable about writing a path that doesn't
correspond to the internal HepData IDs. Could this be better handled on
the Rivet side? For example, if a mapping between the HepData histogram
names and the Rivet histogram names was specified in the Rivet analysis,
then calling _hist1 = bookHisto1D(toHepDataIndices("d03-x01-y01")) would
be equivalent to _hist1 = bookHisto1D("d02-x01-y01"), where
"toHepDataIndices" is a function giving the mapping. Here, I'm looking
at slide 35 of your Rivet tutorial given at CERN on 21st November 2013.
Of course, such a mapping could just be left to the user and that might
be the best solution.
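
The mapping idea above could look roughly like the following. This is only a sketch in Python (Rivet analyses themselves are written in C++); the function name to_hepdata_indices, the dictionary ANALYSIS_TO_HEPDATA and its single entry are hypothetical, chosen to mirror the d03/d02 example, and are not part of Rivet or HepData.

```python
# Hypothetical per-analysis mapping: the name used in the analysis code
# on the left, the path name exported by HepData on the right.
# The single entry mirrors the example in the message above.
ANALYSIS_TO_HEPDATA = {
    "d03-x01-y01": "d02-x01-y01",
}


def to_hepdata_indices(name, mapping=ANALYSIS_TO_HEPDATA):
    """Return the HepData path for an analysis-side name.

    Names without an entry in the mapping are assumed to match the
    HepData export already and are returned unchanged.
    """
    return mapping.get(name, name)


print(to_hepdata_indices("d03-x01-y01"))
print(to_hepdata_indices("d01-x01-y01"))
```

Keeping the table inside the analysis would leave the HepData export untouched, and any name that already matches needs no entry.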

---

Best regards,
Graeme

On 21 Feb 2017, at 20:30, Andy Buckley <andy.buckley at cern.ch> wrote:

On 16/02/17 12:18, Graeme Watt wrote:
Dear All,

This was a conscious decision to improve the YODA export of
multidimensional tables, so that we now write the appropriate YODA
object for the number of independent variables, rather than always a
Scatter2D object:

https://github.com/HEPData/hepdata-converter/issues/5#issuecomment-135375309

I checked with Andy that he agreed with this decision (in an email sent
on 27th August 2015).

Aha. Yes, seemed like a good idea... but an option to get the backward-compatible format as well would help a lot in this migration phase. I don't know how well set up we are to use 1D and 3D scatters at the moment.

Most (or even all?) existing HepData tables exported as YODA for use in
Rivet analyses will only have one independent variable and one dependent
variable (x01-y01).

Definitely not all! And in some places I think they are being used for reasons other than encoding 2D histograms... Chris? I think a wider discussion is needed.

More generally, I think this flags up that the dataset/axis naming in HepData was always a bit of a hack. Maybe the input format could now let the experiments specify their own names? I am certainly not welded to the d,x,y format that I cooked up one afternoon many years ago...

I suspect that Rivet analyses containing path names
with something different were prepared independently from the
corresponding HepData record and so the path names don't match anyway
(even on the old HepData site).

This is exactly what we're trying to avoid: we don't want there to be *any* such analyses.

Please let me know if you're aware of
any existing Rivet analyses with path names containing something
different than "x01" that correspond to a HepData table with more than
one independent variable. These are the only cases that would be
affected by the change, and I'm not aware of any so far (and neither was
Andy when I asked him back in 2015).

Ah, 2015: that's why I don't remember! It's actually a bit difficult to work it out from the code, but there are a lot of x02 etc. in our ref data folder -- 848 of them, to be precise (cf. yodals *.yoda | grep x0[2345] | wc -l)
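
A rough Python equivalent of the count quoted above, which also groups the matches by analysis. It assumes the histogram paths have already been listed as plain strings of the form /REF/<analysis>/dNN-xNN-yNN; the sample paths and analysis names are invented, and the pattern is slightly stricter than the grep because it requires the hyphens around the x field.

```python
import re
from collections import Counter

# Matches an x-axis field of x02..x05 inside a dNN-xNN-yNN path name.
EXTRA_X_AXIS = re.compile(r"-x0[2-5]-")


def count_extra_x_paths(paths):
    """Count, per analysis, the paths whose x field is x02..x05."""
    counts = Counter()
    for path in paths:
        if EXTRA_X_AXIS.search(path):
            parts = path.strip("/").split("/")
            # The analysis name is assumed to be the second-to-last component.
            analysis = parts[-2] if len(parts) >= 2 else "(unknown)"
            counts[analysis] += 1
    return counts


# Invented sample paths, for illustration only.
sample = [
    "/REF/ANALYSIS_A/d01-x01-y01",
    "/REF/ANALYSIS_A/d01-x02-y01",
    "/REF/ANALYSIS_A/d02-x02-y01",
    "/REF/ANALYSIS_B/d03-x03-y02",
]
for analysis, n in sorted(count_extra_x_paths(sample).items()):
    print(analysis, n)
```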

But maybe we are not using those particular histograms... Chris/Holger, could you take a look at the Rivet MC output files from the pre-release testing using the command above, to see if any of our *output* uses second, third, etc. x-axis IDs?

It's been requested in the past to allow an option in the HepData input
file to allow some override of the automatic path names, but I think it
would be better to allow some kind of "mapping" to be coded within the
Rivet analysis between the HepData histogram names and the Rivet
histogram names, for cases where they don't match. Andy made some
comments on this last week:
https://www.hepforge.org/lists-archive/rivet/2017-February/007602.html

Well the "mapping" here *is* the HepData names. We just have functions like bookHisto1D(1,2,3) as syntactic sugar for bookHisto1D("d01-x02-y03"). The numerical names have the benefit of being easy to loop over, too -- but loopable numeric components in more "custom" names would also be very workable, IMHO.
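
To illustrate the numeric shorthand described above: the three integers are zero-padded into a dNN-xNN-yNN string, which makes the names easy to generate in a loop. The sketch is in Python for brevity, and make_path is a hypothetical helper, not Rivet's own function.

```python
def make_path(d, x, y):
    """Build a 'dNN-xNN-yNN' path name from dataset, x-axis and y-axis numbers."""
    return "d{:02d}-x{:02d}-y{:02d}".format(d, x, y)


# The example from the message above.
print(make_path(1, 2, 3))

# Looping over the numeric components, e.g. the first three tables.
for d in range(1, 4):
    print(make_path(d, 1, 1))
```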

Thanks,
Andy

On 16/02/17 11:15, David Grellscheid wrote:
Hi Graeme,

do I understand correctly that the new Hepdata engine has changed the
numbering on existing archived datasets and not just the new ones
coming in?

Thanks,

David

On 16/02/2017 10:59, Christian Gutschow wrote:
Dear Graeme,

thanks for your quick reply. Yes exactly, four 1-D histograms with
varying x and y path fields is what I’m after. I suppose I can work
around that for this analysis, but as ATLAS Rivet contact I can
already see this seemingly minor feature cause an awful lot of
frustration elsewhere.

There are already many Rivet analyses relying on path names
containing not just the "x01" that would need changing and I’m not
sure this is a path we wanna head down to be honest. On the
experiment side, we have elaborate MC validation frameworks that make
use of Rivet analyses and their existing path names, so I’m worried
that changes to the now well-established naming scheme will cause
havoc all over the place…

In fact, what will happen when (not if) we sync the Rivet reference
data repository against HEPData? I’m fairly sure this will break
everything and we’ll receive lots of abuse...

I’m cc’ing the Rivet list as I think the Rivet developers (and users)
need to be aware of this and this needs to be discussed. I wonder
whether a compromise is feasible where we just add an additional
'qualifier' field to the YAML input, that would allow us to set the
d01-x01-y01 style YODA names?

Cheers,
Chris

On 16 Feb 2017, at 10:17, Graeme Watt <Graeme.Watt at durham.ac.uk> wrote:

Dear Chris,

Good question. On the old HepData site, a data table could be
defined as "*data: x : x : y : y" (in the oldhepdata format,
corresponding to two independent variables and two dependent
variables), giving the YODA path names you mention, i.e. four 1-D
histograms: y1(x1), y2(x1), y1(x2), y2(x2), written as Scatter2D
objects. But an alternative interpretation of such a table might be
two 2-D histograms, y1(x1,x2) and y2(x1,x2). To remove the
ambiguity, we use only the latter interpretation on the new HEPData
site, writing two Scatter3D objects with path names d01-x01-y01 and
d01-x01-y02.

Since I think you want four 1-D histograms rather than two 2-D
histograms, you need to define two different tables, each with one
independent variable and two dependent variables, giving YODA path
names:
d01-x01-y01
d01-x01-y02
d02-x01-y01
d02-x01-y02
So the new HEPData site will always define YODA path names with "x01"
and it is not possible to get the path names mentioned in your
email. I hope this is not a problem for you, but you might need to
modify the path names in an existing Rivet analysis.
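
The two interpretations described above can be compared with a short sketch. It is written in Python and only reproduces the naming rules as stated in this message; the function names old_paths and new_paths are hypothetical and are not taken from the HEPData converter.

```python
def old_paths(table, n_indep, n_dep):
    """Old HepData export: one Scatter2D per (x, y) pair of axes."""
    return [
        ("d%02d-x%02d-y%02d" % (table, x, y), "Scatter2D")
        for x in range(1, n_indep + 1)
        for y in range(1, n_dep + 1)
    ]


def new_paths(table, n_indep, n_dep):
    """New HEPData export: always x01, one object per dependent variable.

    The object type follows the number of independent variables,
    e.g. Scatter3D for two independent variables.
    """
    kind = "Scatter%dD" % (n_indep + 1)
    return [("d%02d-x01-y%02d" % (table, y), kind) for y in range(1, n_dep + 1)]


# One table with two independent and two dependent variables.
print(old_paths(1, 2, 2))
print(new_paths(1, 2, 2))

# Two tables, each with one independent and two dependent variables.
print(new_paths(1, 1, 2) + new_paths(2, 1, 2))
```

Splitting the data into two tables is what recovers four 1-D objects under the new scheme, at the cost of different path names from the old export.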

Best regards,
Graeme

On 15/02/17 23:34, Christian Gutschow wrote:
Hi,

I’m trying to work out how to write a YAML input file that will be
interpreted as a table with 2 different independent variables, each
with two dependent variables. In YODA language the idea would be
something like:

d01-x01-y01
d01-x01-y02
d01-x02-y01
d01-x02-y02

I’d already be happy if perhaps you could point me to an example
entry where this has been achieved, so I can take a look at the
corresponding YAML file.

Many thanks in advance!

Cheers,
Chris

—

Dr. Christian Gütschow

Department of Physics and Astronomy
University College London
Gower Street
London WC1E 6BT

D10 Physics Building
+44 (0)20 7679 3775
chris.g at cern.ch

_______________________________________________
Rivet mailing list
Rivet at projects.hepforge.org
https://www.hepforge.org/lists/listinfo/rivet

--
Dr Andy Buckley, Lecturer / Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow
