[Rivet] Bug in analysis MC_GENERIC

Andy Buckley andy.buckley at cern.ch
Sat Jul 12 23:18:55 BST 2014


On 07/07/14 17:19, David Bjergaard wrote:
> Hi,
> 
> Here's the output of rivet-cmphistos:
>> $ rivet-cmphistos Rivet.yoda 
>> terminate called after throwing an instance of 'YODA::WeightError'
>>   what():  Undefined weighted variance
>> Aborted (core dumped)
> My directory contains:
>> -rw-rw-r--   1 dave dave 2.1K Jul  7 12:11 MC_GENERIC_EtaChPMRatio.dat
>> -rw-rw-r--   1 dave dave 3.6K Jul  7 12:11 MC_GENERIC_Phi.dat
>> -rw-rw-r--   1 dave dave  20K Jul  7 12:11 MC_GENERIC_PtCh.dat
>> -rw-------   1 dave dave 119K Jun 30 16:46 Rivet.yoda
> 
>> [snip]
>> One of the other developers has told me that they got YODA exception
>> errors when trying to plot your file. I recently made some changes to
>> how YODA creates x,y values and errors where there are not enough stats
>> in a bin to calculate a correct mean or variance -- for now setting them
>> to zero but maybe later we'll do something cleverer.
> I think this is the cause. I'm conflicted over whether the
> YODA exceptions are annoying or a good feature.

Ditto! Some design discussion needed...

> In this case, I think
> setting to zero is OK for now, but maybe the histogram structure could
> handle NaN and inf properly.  In this case (the variance), it should be
> NaN and not zero.

Or inf, since there is total uncertainty about how localised the
distribution is? Depends on whether e.g. computing a single-bin chi2
from a NaN/inf variance distribution should give delta/inf = 0 because
there is no significant deviation, or NaN to indicate that that bin is
problematic.

The former would "just work" (for that use case at least), while the
latter would need an explicit is-NaN test to skip adding that bin to
the total chi2, modify the number of degrees of freedom, etc. So I
_slightly_ lean toward the "inf way". Both inf and NaN would need to be
checked for plotting, since we don't like to *actually* draw infinitely
big error bars ;-)
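
To make the chi2 comparison concrete, here is a rough Python sketch of
the two options. It is illustration only and not YODA code: the function
names and the toy numbers are made up, and the NaN version assumes that
"skip the bin" also means dropping it from the degrees of freedom.

```python
import math

def chi2_inf(deltas, variances):
    """Plain sum of delta^2 / variance.

    A bin whose variance is inf contributes delta^2 / inf = 0, so no
    special-casing is needed (hypothetical helper, not a YODA function).
    """
    return sum(d * d / v for d, v in zip(deltas, variances))

def chi2_nan(deltas, variances):
    """Sum of delta^2 / variance that skips NaN-variance bins.

    Returns (chi2, ndf), where ndf counts only the bins actually used.
    """
    chi2, ndf = 0.0, 0
    for d, v in zip(deltas, variances):
        if math.isnan(v):
            continue  # underpopulated bin: leave it out of the sum and of ndf
        chi2 += d * d / v
        ndf += 1
    return chi2, ndf

# Toy data: the middle bin has too few fills to define a variance.
deltas = [1.0, 2.0, 3.0]
var_inf = [1.0, math.inf, 9.0]
var_nan = [1.0, math.nan, 9.0]

print(chi2_inf(deltas, var_inf))   # inf bin drops out of the sum by itself
print(chi2_nan(deltas, var_nan))   # explicit skip, ndf reduced by one
print(chi2_inf(deltas, var_nan))   # without the test, NaN poisons the total
```

Note that in the inf version the problematic bin still counts towards the
number of bins unless the caller corrects for it separately.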

> In the case of computing fractions, I would really
> like it if +/-inf went to overflow/underflow rather than crashing with
> an exception.

Taken on board, thanks. It does still require that the user (which could
be e.g. a plotting program) makes explicit tests for these values,
though: in general those special numbers won't be appropriate inputs.
And Leif spent a bit of time already *removing* NaNs from YODA (although
I forget exactly where...)

> The alternative view is that YODA exceptions are there for a reason and
> it's the programmer's responsibility to handle them appropriately.  They
> force you to avoid ill-defined situations which is usually a good thing.

Yep. Either way there is some pain, so it's not an easy decision, but we
should discuss, decide, and then use the chosen approach consistently.
Having a switch between exception/inf+nan/zero strategies might be an
option... maybe. In reality underpopulated bins are a fact of life --
overall I think that it's a good thing that YODA doesn't sweep them
under the carpet, and that we pass on that information to the next stage
of processing rather than having to guess that there were no fills.
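
As a strawman for that discussion, a strategy switch could look roughly
like the Python sketch below. Everything here is hypothetical: the
function name, the `strategy` argument and its values are invented for
illustration, the local WeightError class only mimics the name of the
YODA exception, and the formula is the usual unbiased weighted variance
estimator rather than a copy of the YODA source.

```python
import math

class WeightError(Exception):
    """Stand-in for the YODA exception of the same name (illustrative)."""

def weighted_variance(sumw, sumw2, sumwx, sumwx2, strategy="raise"):
    """Weighted variance from running sums, with a switchable fallback.

    strategy: "raise", "inf", "nan" or "zero" (hypothetical option names)
    decides what happens when the variance is undefined.
    """
    den = sumw * sumw - sumw2
    if den == 0:
        # Too few fills (e.g. none, or a single one) to define a variance.
        if strategy == "raise":
            raise WeightError("Undefined weighted variance")
        if strategy == "inf":
            return math.inf
        if strategy == "nan":
            return math.nan
        if strategy == "zero":
            return 0.0
        raise ValueError("unknown strategy: %r" % strategy)
    num = sumwx2 * sumw - sumwx * sumwx
    return num / den

# Two unit-weight fills at x = 1 and x = 3: the variance is well defined.
print(weighted_variance(2.0, 2.0, 4.0, 10.0))

# A single unit-weight fill at x = 2: the result depends on the strategy.
print(weighted_variance(1.0, 1.0, 2.0, 4.0, strategy="inf"))
print(weighted_variance(1.0, 1.0, 2.0, 4.0, strategy="zero"))
```

Whichever default were chosen, downstream code would still have to know
which strategy was in force, which is the consistency point above.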

Andy

-- 
Dr Andy Buckley, Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow / PH Dept, CERN

