[Rivet] mcplots
Andy Buckley andy.buckley at ed.ac.uk
Wed Jun 1 11:24:16 BST 2011
On 01/06/11 11:43, Peter Skands wrote:
> Hi Hendrik,
>
> Thanks for the tips, I hope we can implement most of them!
>
>> I've looked at this for a while, and since the problem still exists, I
>> thought I'd better mention it. What is the technical problem you
>> experience? Maybe we can help you solving it?
>
> Yes, our problem is in batching Rivet :) So far, the message I got from
> Andy was that if we came up with a solution he'd be happy to implement
> it, provided it was the solution he wanted :)

Well, that's certainly true. Unfortunately no-one has come up with a
fully worked-out and comprehensive system which will really do what is
needed!

> Our production system is
> up and running, but we are not able to combine the outputs of several
> runs yet. That's essentially the limiting factor, since we need to keep
> the running time of each job "reasonable".

Why? I know that you're using BOINC (for *all* generation?), but if the
VM can be suspended and resumed arbitrarily then is there a problem if
an MC job takes a couple of days?

> For us, this also affects
> producing histograms with Alpgen, for instance, where we need to merge
> the output of several runs. We are close, though, but the solution we
> have so far is not fully general. (Save event numbers and cross sections
> for each generator run and combine assuming errors in quadrature, which
> will only be correct for a subset of distributions, but among them are
> some of the important ones that really need higher stats so I was
> content to do this as a first step.) I understood that some work in this
> direction has also gone on inside Rivet, but that a general solution is
> also there at least a few releases away?

Actually, I don't know of *any* analysis system which can deal with this
problem exactly -- the really big problem is that in general the
finalising step of a Rivet analysis might do *anything*. So if you batch
the event loops to increase statistics, then in general you would need
to gather the interim data members of each analysis class, somehow know
how to combine them appropriately, and then let the arbitrary histogram
combination/normalising/mangling take place. That's a really hard
problem to solve without making simple things incredibly hard.

As you've noted, for plots which are just normal or profile histograms
(and in the normal case you don't care about normalisation), you can get
adequate (but inexact) statistical merging by adding/averaging the bin
values in each histo across your different runs... easiest if the runs
are all the same size. But it will only work for a subset of
distributions, so you'd need to hard-code which those are into the
mcplots backend... and anything else will need longer runs to get
appropriate statistics.

We're going to extend the histogramming a bit, to hopefully make this
sort of combination exact for the same sort of subset of plots. But I
have no idea how to solve it in general... at least not without
transforming Rivet into an unusable abomination! Thought-out solutions
to this set of problems are still more than welcome ;)

Andy

-- 
Dr Andy Buckley
SUPA Advanced Research Fellow
Particle Physics Experiment Group, University of Edinburgh

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
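
For illustration only, the approximate merging discussed above (average the bin values across runs, combine the bin errors in quadrature) might look roughly like the sketch below. It is not Rivet or mcplots code: the function name `merge_runs` and the per-run tuple layout are hypothetical, and it assumes every run produced the same binning, that the stored bin values are already normalised per event, and that the runs are statistically independent. Weighting by event count reduces to a plain average when all runs are the same size. As noted above, this is only adequate for simple histograms, not for plots whose finalising step does something arbitrary.

```python
import math


def merge_runs(runs):
    """Approximately merge per-run histograms (hypothetical helper).

    runs: list of (n_events, bin_values, bin_errors) tuples, one per run.
    Assumes identical binning, per-event-normalised bin values and
    independent runs; none of this is part of the Rivet API.
    """
    total_events = sum(n for n, _, _ in runs)
    n_bins = len(runs[0][1])

    merged_values = []
    merged_errors = []
    for b in range(n_bins):
        # Event-count-weighted average of the bin value across runs.
        value = sum(n * values[b] for n, values, _ in runs) / total_events
        # Weighted errors added in quadrature, then rescaled by the
        # total weight so the error matches the averaged value.
        error = math.sqrt(
            sum((n * errors[b]) ** 2 for n, _, errors in runs)
        ) / total_events
        merged_values.append(value)
        merged_errors.append(error)
    return merged_values, merged_errors


# Two equal-sized runs with two bins each (made-up numbers).
runs = [
    (1000, [0.2, 0.4], [0.02, 0.03]),
    (1000, [0.4, 0.2], [0.02, 0.03]),
]
values, errors = merge_runs(runs)
print(values, errors)
```

With equal-sized runs the merged error in each bin shrinks by a factor of sqrt(N) relative to a single run, which is the gain in statistics the batching is meant to buy.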