[Rivet] Bug in analysis MC_GENERIC
Andy Buckley andy.buckley at cern.ch
Sat Jul 12 23:18:55 BST 2014

On 07/07/14 17:19, David Bjergaard wrote:
> Hi,
>
> Here's the output of rivet-cmphistos:
>> $ rivet-cmphistos Rivet.yoda
>> terminate called after throwing an instance of 'YODA::WeightError'
>>   what():  Undefined weighted variance
>> Aborted (core dumped)
> My directory contains:
>> -rw-rw-r-- 1 dave dave 2.1K Jul  7 12:11 MC_GENERIC_EtaChPMRatio.dat
>> -rw-rw-r-- 1 dave dave 3.6K Jul  7 12:11 MC_GENERIC_Phi.dat
>> -rw-rw-r-- 1 dave dave  20K Jul  7 12:11 MC_GENERIC_PtCh.dat
>> -rw------- 1 dave dave 119K Jun 30 16:46 Rivet.yoda
>
>> [snip]
>> One of the other developers has told me that they got YODA exception
>> errors when trying to plot your file. I recently made some changes to
>> how YODA creates x,y values and errors where there are not enough stats
>> in a bin to calculate a correct mean or variance -- for now setting them
>> to zero but maybe later we'll do something cleverer.
> I think this is the cause, I'm conflicted over whether or not the
> YODA exceptions are annoying or a good feature.

Ditto! Some design discussion needed...

> In this case, I think
> setting to zero is OK for now, but maybe the histogram structure could
> handle NaN and inf properly. In this case (the variance), it should be
> NaN and not zero.

Or inf, since there is total uncertainty about how localised the
distribution is? Depends on whether e.g. computing a single-bin chi2
from a NaN/inf variance distribution should give delta/inf = 0 because
there is no significant deviation, or NaN to indicate that that bin is
problematic. The former would "just work" (for that use case at least),
while the other way would need a special is-NaN test to skip adding that
bin to the total chi2, modify the number of degrees of freedom, etc. So
I _slightly_ lean toward the "inf way".

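To make the chi2 trade-off concrete, here is a minimal sketch in plain Python. It does not use YODA's API: the function names `chi2_inf_way` and `chi2_nan_way`, and the flat lists of per-bin deltas and variances, are hypothetical stand-ins chosen only for illustration.

```python
import math


def chi2_inf_way(deltas, variances):
    """Sum delta^2 / variance over all bins with no special-casing.

    A bin whose variance is +inf contributes exactly 0.0, so the sum
    "just works". A NaN variance would instead turn the whole sum into NaN.
    """
    return sum(d * d / v for d, v in zip(deltas, variances))


def chi2_nan_way(deltas, variances):
    """Sum delta^2 / variance, explicitly skipping NaN-variance bins.

    Returns (chi2, ndof), where ndof counts only the bins actually used,
    since a skipped bin should not count as a degree of freedom.
    """
    chi2 = 0.0
    ndof = 0
    for d, v in zip(deltas, variances):
        if math.isnan(v):
            continue  # underpopulated bin: leave it out of the total
        chi2 += d * d / v
        ndof += 1
    return chi2, ndof


if __name__ == "__main__":
    deltas = [1.0, 2.0, 3.0]

    # "inf way": the middle bin has an undefined variance, stored as +inf.
    print(chi2_inf_way(deltas, [1.0, math.inf, 4.0]))  # 3.25

    # "NaN way": the same bin stored as NaN needs the explicit test.
    print(chi2_nan_way(deltas, [1.0, math.nan, 4.0]))  # (3.25, 2)

    # Without the test, the NaN propagates into the total.
    print(math.isnan(chi2_inf_way(deltas, [1.0, math.nan, 4.0])))  # True
```

Note that in this sketch the "inf way" gives the right chi2 but says nothing about the number of degrees of freedom; a caller who needs ndof would still have to test for the special value.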
Both inf and NaN would need to be checked for plotting, since we don't
like to *actually* draw infinitely big error bars ;-)

> In the case of computing fractions, I would really
> like it if +/-inf went to overflow/underflow rather than crashing with
> an exception.

Taken on board, thanks. It does still require that the user (which could
be e.g. a plotting program) makes explicit tests for these values,
though: in general those special numbers won't be appropriate inputs.
And Leif spent a bit of time already *removing* NaNs from YODA (although
I forget exactly where...)

> The alternative view is that YODA exceptions are there for a reason and
> it's the programmer's responsibility to handle them appropriately. They
> force you to avoid ill-defined situations which is usually a good thing.

Yep. Either way there is some pain, so it's not an easy decision, but we
should discuss, decide, and then use the chosen approach consistently.
Having a switch between exception/inf+nan/zero strategies might be an
option... maybe. In reality underpopulated bins are a fact of life --
overall I think that it's a good thing that YODA doesn't sweep them
under the carpet, and that we pass on that information to the next stage
of processing rather than having to guess that there were no fills.

Andy

-- 
Dr Andy Buckley, Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow / PH Dept, CERN