[Rivet] aidamerge

Andy Buckley andy.buckley at ed.ac.uk
Sun Aug 21 11:04:52 BST 2011


Hi Hannes,

We understand that being able to merge histograms is an important 
feature. At the moment, however, you can't do it properly and generally, 
because our data storage does not support it. This is an historical 
accident rather than a design intent, but we only have a small developer 
team, and despite serious intentions to fix this for several *years* 
there has been much more demand for new analyses and new analysis 
functionality than for updated histogram persistency. We can only do so 
much at once, and until recently all use-cases could make do with simply 
using long runs.

I'm pleased to say that the replacement which will support all this and 
more (there are *many* things that we don't like about AIDA) is very 
well underway, but for now you have to make sure that the example 
approximate-merging script is doing what you need it to do. It's not 
ideal and we're working to improve on that situation, but it takes time. 
This is a physics package, not Microsoft Word -- physics users need to 
be prepared to open the code a little bit and do some debugging or 
hacking... especially when that code was only intended as an example.
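
As a rough illustration of the assumptions described in the earlier message quoted below (equal-size independent runs, normalised or profile histograms), an approximate per-bin merge might look like the sketch here. This is not the contrib script itself: the function name merge_bins and the plain unweighted mean are assumptions made for the example.

```python
import math

def merge_bins(values, errors):
    """Approximately merge one bin from N equal-size independent runs.

    Assumes normalised (or profile) histograms: the merged value is the
    plain mean of the runs, and the merged error is sqrt(sum(err^2)) / N,
    so it shrinks like 1/sqrt(N) when the per-run errors are similar.
    """
    n = len(values)
    if n == 0 or n != len(errors):
        raise ValueError("need one value and one error per run")
    mean = sum(values) / n
    err = math.sqrt(sum(e * e for e in errors)) / n
    return mean, err

# Four runs with the same per-run error of 0.2 in this bin:
# the merged error comes out at about half the per-run error.
val, err = merge_bins([1.0, 1.2, 0.8, 1.0], [0.2, 0.2, 0.2, 0.2])
print(val, err)
```

For un-normalised histograms this scaling would be wrong, which is exactly why the script needs checking against your own data.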

In your problem case I suspect that the merging algorithm *is* 
appropriate, but that the numbers being entered are extremely large... 
so large that they are overflowing the Python (double precision) float 
type! Either way, this needs more debugging from your side than just 
sending us the crash traceback... a *minimal* set of AIDA files that 
reproduces the problem would help, or alternatively just inserting a 
print statement or two before the offending line.
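
A minimal sketch of that kind of debugging, assuming the failing operation is the squaring of a bin error as in the traceback quoted below. The helper name safe_err2 and the sample values are hypothetical; falling back to float('inf') follows the approach of the patch described further down the thread.

```python
def safe_err2(err):
    """Return err**2, or float('inf') if the square overflows a double."""
    try:
        return err ** 2
    except OverflowError:
        # Debugging aid: report the offending value instead of crashing
        print("overflow squaring bin error:", repr(err))
        return float("inf")

sum_err2 = 0.0
for err in [0.1, 1e308, 0.3]:   # 1e308 mimics a badly filled bin
    sum_err2 += safe_err2(err)

print(sum_err2)
```

In CPython the power operator raises OverflowError for floats, whereas err * err would quietly return inf. Either way the infinite value still has to be handled (or the bin skipped) before the result is written out and plotted.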

In short, we need more information to help in this case, and for the 
general solution we're working on it. This will still take a while 
because a lot of migration and testing needs to happen. Telling us that 
it's not good enough won't make that happen faster, I'm afraid -- some 
development help in the last two years would have done, but it's even 
too late for that now. We'll at some point have a beta release with this 
feature, so I hope you'll be able to give us some feedback.

Best wishes,
Andy

PS. You mentioned being able to make ratios in the plotting phase... 
well, you can, but again it involves writing code. And I don't see that 
changing: we can't make a script with a "magically do what I want" 
option flag! Allowing merging *before* the finalise step is something we 
would like to do, but which requires a *lot* of design and planning. If 
you have any bright ideas about how this can be done nicely (i.e. 
uniformly for all analyses, and in the user-rather-than-developer mode) 
then please get in touch :)
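
For what it's worth, the sort of code meant here could be as small as the following sketch. It assumes the numerator and denominator bins are statistically uncorrelated and uses standard Gaussian error propagation; the function name ratio_bins and the choice to skip empty denominator bins are assumptions for the example, not anything Rivet provides.

```python
import math

def ratio_bins(num, num_err, den, den_err):
    """Bin-wise ratio with uncorrelated Gaussian error propagation.

    Returns (ratio, error), or None if the denominator bin is empty, so
    that the caller can skip the bin instead of plotting NaN or inf.
    """
    if den == 0:
        return None
    ratio = num / den
    err = math.sqrt((num_err / den) ** 2 + (num * den_err / den ** 2) ** 2)
    return ratio, err

# (value, error) pairs for numerator and denominator, bin by bin
bins = [(2.0, 0.2, 4.0, 0.4), (1.0, 0.1, 0.0, 0.0)]
ratios = [ratio_bins(n, ne, d, de) for n, ne, d, de in bins]
print(ratios)
```

The second bin has an empty denominator and comes back as None, which is the sort of case that otherwise produces NaN or infinite entries.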


On 21/08/11 08:15, Hannes Jung wrote:
> Dear Andy et al.
>
> thanks a lot for your mail and your explanations.
>
> I understand that aidamerge is not an official script.
> However, in a usable histogramming package one must be able to add
> histograms at the end and possible errors must be treated, otherwise the
> package is not really usable for users... it might be fine for developers.
> In Root as well as in the old hbook/paw package we had options which
> treated adding histograms properly, of course one has to be careful to
> add the proper things, and adding ratios might be tricky.... but then
> there must be an option to do the ratio while plotting.
>
> There is of course an issue, whether whatever is added does make sense.
> But the package should work, or give an error message and treat that
> somehow, but not just crashing....
>
> The Aida package is nice and I like very much the rivet-mkhtml script
> which makes life much easier.... but I do not want to develop the
> histogramming package, I just want to use it and to be sure that it does
> what it is supposed to do....
>
> Don't get me wrong, I appreciate very much the help and support I
> get and got in the past solving problems with the histogramming
> package... but... I just want to use it.....
>
> Thanks a lot for your support
>
> Cheers
>
> Hannes
>
> On 20.08.2011, at 19:30, Andy Buckley wrote:
>
>> Hi Hannes and all,
>>
>> Please, note that the aidamerge script is not an official Rivet
>> script: it's an example of how you might write a script to do some
>> *approximate* statistical merging of independent runs.
>>
>> Because we don't currently store enough information to do the merging
>> exactly, this script has to make some guesses, and that's why we don't
>> officially support it. So you should make sure that it's doing merging
>> appropriate for your data -- using it blindly *will* lead to errors.
>>
>> So have a look in the code. From my own glances inside it, the
>> approximate merging algorithm used assumes that the samples you are
>> merging are of the same size (you could add scale factors to a local copy
>> if you need them), and that you are either merging normalised
>> histograms or profile histograms -- if your data is of a different
>> type, most notably un-normalised histograms, then the assumed error
>> scaling will be incorrect. You mentioned ratios, Hannes: I *think* the
>> scaling is probably correct, i.e. more data makes the values converge
>> to the (weighted) mean of the runs and the errors get smaller as
>> 1/sqrt(N)... but it depends on exactly what you're doing.
>>
>> Andy
>>
>>
>> On 20/08/11 13:50, Hannes Jung wrote:
>>> Hi Daniel
>>>
>>> hm... it seems the messages below come from somewhere else....
>>> it didn't change even when setting the error to 1 instead of 1E308....
>>>
>>> Does anyone know how to fix this?
>>>
>>> Cheers
>>> Hannes
>>>
>>> On 20.08.2011, at 14:32, Daniel Weyh wrote:
>>>
>>>> Ok, ... I don't know at all what the plotting scripts themselves do.
>>>> Probably using another float, 1e+20 or something, instead of 1e308 will
>>>> not cause an overflow...
>>>>
>>>> But, I'm afk at the moment... Sry
>>>>
>>>> On 20.08.2011 at 14:20, Hannes Jung <hannes.jung at cern.ch> wrote:
>>>>
>>>>> Hi Daniel again
>>>>>
>>>>> maybe I was too fast,.... the aidamerge did work, but when plotting
>>>>> it I get the following errors:
>>>>>
>>>>> Plotting
>>>>> cascade-uPDFs/FWDCENTPHENO/Delta_phi_Delta_eta_eq_10_Et_gt_10_GeV.dat
>>>>> (33 remaining)
>>>>> Plotting
>>>>> cascade-uPDFs/FWDCENTPHENO/Delta_phi_Delta_eta_eq_10_Et_gt_30_GeV.dat
>>>>> (32 remaining)
>>>>> Error: cannot convert float NaN to integer
>>>>> Error: cannot convert float NaN to integer
>>>>> Plotting
>>>>> cascade-uPDFs/FWDCENTPHENO/Delta_phi_Delta_eta_eq_2_Et_gt_10_GeV.dat
>>>>> (29 remaining)
>>>>> Error: cannot convert float infinity to integer
>>>>>
>>>>> and then rivet-mkhtml gets stuck...
>>>>> Hm....
>>>>>
>>>>> thanks a lot
>>>>> Cheers
>>>>> Hannes
>>>>>
>>>>> On 20.08.2011, at 13:53, Daniel Weyh wrote:
>>>>>
>>>>>> Sry, I didn't know where you got your copy from.
>>>>>> It is uploaded to SVN (r3300).
>>>>>> Please check this out or look at
>>>>>> http://projects.hepforge.org/rivet/trac/browser/contrib/aidamerge
>>>>>>
>>>>>> Hope it helps,
>>>>>> Daniel
>>>>>>
>>>>>>
>>>>>> On 20.08.2011 at 13:35, Hannes Jung <hannes.jung at cern.ch> wrote:
>>>>>>
>>>>>>> Hi Daniel
>>>>>>>
>>>>>>> thanks a lot..... I guess this should work.... I just don't know
>>>>>>> where to change what...
>>>>>>> could you perhaps tell me a bit more what to change in which line,
>>>>>>> or perhaps upload the patched version somewhere ?
>>>>>>>
>>>>>>> Thanks very much
>>>>>>> cheers
>>>>>>> hannes
>>>>>>>
>>>>>>> On 20.08.2011, at 13:27, Daniel Weyh wrote:
>>>>>>>
>>>>>>>> Dear Hannes,
>>>>>>>>
>>>>>>>>> Dear Riveties
>>>>>>>>>
>>>>>>>>> adding several aida files works fine, only in some cases I get
>>>>>>>>> the error:
>>>>>>>>>
>>>>>>>>> Traceback (most recent call last):
>>>>>>>>> File "./aidamerge", line 65, in <module>
>>>>>>>>> sum_err2 += h.getBin(i).getErr()**2
>>>>>>>>> OverflowError: (34, 'Numerical result out of range')
>>>>>>>>>
>>>>>>>>> I guess it comes when a histo is not properly filled (for example
>>>>>>>>> when a ratio is taken).
>>>>>>>>> Is there a way to prevent this error message, and to continue
>>>>>>>>> with the program ?
>>>>>>>>
>>>>>>>> I added a patch to catch the exception, use float('inf') during
>>>>>>>> summing up and in the write out step this 'inf' is converted to a
>>>>>>>> vee..eery large float.
>>>>>>>> Does this work for you?
>>>>>>>>
>>>>>>>> @others: Is this the way it should work - or should we somehow
>>>>>>>> exclude such bins?!?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ***********************************************************************
>>>>>>> Hannes Jung
>>>>>>> Email: Hannes.Jung at cern.ch
>>>>>>> mobile: +49 40 8998 93741
>>>>>>> http://www.desy.de/~jung
>>>>>>>
>>>>>>> Tel: +49 (0) 40 8998 3741 (DESY)
>>>>>>> Tel: +41 22 76 62602 (CERN)
>>>>>>> CERN - PH
>>>>>>> 42-2-033
>>>>>>> CH-1211 Genève 23
>>>>>>> Switzerland
>>>>>>> ***********************************************************************
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>>> _______________________________________________
>>> Rivet mailing list
>>> Rivet at projects.hepforge.org <mailto:Rivet at projects.hepforge.org>
>>> http://www.hepforge.org/lists/listinfo/rivet
>>
>>
>> --
>> Dr Andy Buckley
>> SUPA Advanced Research Fellow
>> Particle Physics Experiment Group, University of Edinburgh
>>
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>
>




