[Rivet] Madgraph (and alpgen) in Rivet

Andy Buckley andy.buckley at cern.ch
Tue Jan 6 00:55:40 GMT 2015
Hi Gavin,

I think there are cases where this won't work: observables built using
weights other than the event weights, and maybe the many cases where the
appropriate sum of weights is not the run total (but I could be wrong
about that).

I think the neatest way is for each analysis to "secretly" store a
hidden cross-section variable (as a YODA object), plus event number and
weight-sum Counters. Chris & David have been finishing off the first
stage of work on weight vectors & storing enough temporary histo state
to allow re-running the finalize step. Providing the right temporary
data objects to allow that will be a much better long-term solution than
hacking some intermediate state into extra annotations. Could you or
others help with it?

I guess for now you do know (or can find) the cross-section x_i for each
n-parton run, and N_i is stored for each histogram (and bin). It's just
s_i, the sum of weights for run i, that we don't currently store:
right? I'm a bit vague about the role of N, though: in your example all
N events in a run go into one bin, so is it really the bin or the run N
that you want? And is it really N that matters, or the sum of weights in
the bin/histo? Just for clarity...
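
For concreteness, the bookkeeping above can be sketched in a few lines of
plain Python. This is only an illustration, not what yodamerge actually
does: the function names, the dict layout and the numbers are all made up,
and it assumes the per-bin quantity is the sum of weights in the bin before
any normalisation, which is only one possible answer to the question above.

```python
# Illustrative sketch only: plain Python, no YODA calls. The function names,
# the dict layout and all numbers below are hypothetical.

def unit_normalise(bins):
    """Scale a list of bin contents so that they sum to 1."""
    total = sum(bins)
    if total == 0:
        raise ValueError("cannot unit-normalise an empty histogram")
    return [b / total for b in bins]


def merge_then_normalise(runs):
    """Weight each run by x_i / s_i, add the runs, then unit-normalise."""
    nbins = len(runs[0]["bins"])
    merged = [0.0] * nbins
    for run in runs:
        scale = run["xsec"] / run["sumw"]  # x_i / s_i = 1 / L_i
        for i, content in enumerate(run["bins"]):
            merged[i] += content * scale
    return unit_normalise(merged)


def normalise_then_merge(runs):
    """Unit-normalise each run first, then add and re-normalise."""
    nbins = len(runs[0]["bins"])
    merged = [0.0] * nbins
    for run in runs:
        for i, content in enumerate(unit_normalise(run["bins"])):
            merged[i] += content
    return unit_normalise(merged)


# Two hypothetical n-parton runs; each fills only one of the two bins.
runs = [
    {"xsec": 100.0, "sumw": 1000.0, "bins": [500.0, 0.0]},  # x_1, s_1, bins
    {"xsec": 20.0, "sumw": 2000.0, "bins": [0.0, 400.0]},   # x_2, s_2, bins
]

correct = merge_then_normalise(runs)  # bins in the ratio 50 : 4
naive = normalise_then_merge(runs)    # equal bins, whatever x_i and s_i are
print(correct, naive)
```

With these made-up numbers the first approach gives roughly (0.93, 0.07)
and the second gives (0.5, 0.5): once each run has been unit-normalised,
x_i and s_i no longer enter the result at all.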

Cheers,
Andy


On 05/01/15 16:29, Gavin Hesketh wrote:
> So I took a very quick look at this, and it seems the simplest thing to
> do is to scale plots by crossSection()/sumOfWeights() before
> normalising. This could be done in each analysis that calls normalize(),
> or more simply in the normalize function itself in Analysis.cc, eg:
> 
> 
> void Analysis::normalize(Histo1DPtr histo, double norm, bool includeoverflows) {
>   if (!histo) {
>     MSG_ERROR("Failed to normalize histo=NULL in analysis " << name() << " (norm=" << norm << ")");
>     return;
>   }
>   MSG_TRACE("Normalizing histo " << histo->path() << " to " << norm);
>   try {
>     histo->setAnnotation("normalization", norm);
>     histo->setAnnotation("includeoverflows", includeoverflows);
>     histo->scaleW(crossSection()/sumOfWeights());
>   } catch (YODA::Exception& we) {
>     MSG_WARNING("Could not scale histo " << histo->path());
>     return;
>   }
>   try {
>     histo->normalize(norm, includeoverflows);
>   } catch (YODA::Exception& we) {
>     MSG_WARNING("Could not normalize histo " << histo->path());
>     return;
>   }
> }
> 
> The user could then figure out the rest at the yodamerge step (or,
> better still, the script could be updated to use the new annotations
> along with the ScaledBy annotation). For unit-normalised plots this
> process is quite trivial, but the extra annotations would be needed to
> deal with other situations. Some protection would need to be added for
> the case where the user does not provide a cross section of course!
> 
> Are there any obvious drawbacks to this approach?
> 
> cheers,
> Gavin
> 
> 
> 
> 
> On 23/12/14 17:24, Gavin Hesketh wrote:
>> ah, I wondered if that number would be useful :) So if it is possible to
>> write the cross section and sum of weights to each yoda file, I think
>> we'd have everything needed. I'd be happy to look into this, bearing in
>> mind that I've only just started using yoda :) So if you have some
>> pointers, that would help.
>>
>> cheers,
>> Gavin
>>
>>
>> On 23/12/14 17:13, Andy Buckley wrote:
>>> yodamerge already knows the pre-normalization numbers in each run's
>>> histos before merging, via the ScaledBy annotation. I think I got the
>>> scaling logic correct so that if you pass the scalefactors on the
>>> yodamerge command line, they will be applied after unrolling the
>>> normalization and before the merging and re-normalization.
>>>
>>> But I might have missed something. Will read more closely later but have
>>> to run now...
>>>
>>> Andy
>>>
>>>
>>> On 23/12/14 17:07, Gavin Hesketh wrote:
>>>> Hi Andy,
>>>> don't think yodamerge does what is needed. If we had the xsec and sum
>>>> of weights stored in the yoda file, and the area of each plot before
>>>> unit-normalising, it would be possible. The cross section for each run
>>>> alone does not help.
>>>>
>>>> Ok, simplest case scenario: we have two jet slices we need to combine
>>>> (eg W+0jet, W+1jet), with cross sections x1 and x2. Each sample is a
>>>> different size, and has sum of weights s1 and s2. To combine them, we
>>>> need to scale each sample to the same lumi (1 pb^-1, say). The generated
>>>> lumi for each sample is L1=s1/x1 and L2=s2/x2. So we need to divide each
>>>> plot by L1 or L2.
>>>>
>>>> Now say we have a plot which has two bins. Due to the cuts, the first
>>>> bin is only filled by sample 1 (with N1 events), and the second bin is
>>>> only filled by sample 2 (with N2 events). N1!=s1 and N2!=s2, due to the
>>>> cuts.
>>>>
>>>> After scaling and merging the two samples, the first bin will have
>>>> N1/L1 = N1*x1/s1 events, and the second will have N2/L2 = N2*x2/s2.
>>>>
>>>> Then after unit-normalising the merged plot, the bins will contain:
>>>> 1) N1*x1/s1 / (N1*x1/s1 + N2*x2/s2)
>>>> 2) N2*x2/s2 / (N1*x1/s1 + N2*x2/s2)
>>>>
>>>>
>>>> With the current code, the plots are unit-normalised at the end of each
>>>> run (ie for each sample). After merging the samples, the plot contents
>>>> are (1,1). Redoing the unit-normalisation, they will be (0.5, 0.5). ie
>>>> there is no way to get the right answer...
>>>>
>>>> In order to get the right answer using yodamerge, we need N for each
>>>> plot, x and s for each run. Storing just x and s is not enough.
>>>>
>>>> Alternatively, as with the hack I've implemented, we normalise each plot
>>>> to N*x/s, and can then do anything needed with yodamerge.
>>>>
>>>> Hope that's clearer :)
>>>>
>>>> Gavin
>>>>
>>>>
>>>>
>>>> On 23/12/14 16:29, Andy Buckley wrote:
>>>>> Hi Gavin,
>>>>>
>>>>> Have you tried yodamerge? It allows you to give multiplicative
>>>>> coefficients for the merging (even of unit-normalised histograms and
>>>>> profiles) of YODA files from different np-X runs.
>>>>>
>>>>> You would still need to know the xsecs for each run, but that is not
>>>>> unusual! In future we'll make this a bit more automatic by writing the
>>>>> xsec into each .yoda file as one of several "hidden" data structures,
>>>>> but that's not in place yet -- you know the usual manpower story. We'd
>>>>> be *very* happy to have some effort donated if you need this to work,
>>>>> though!
>>>>>
>>>>> Let me know if yodamerge works for you, or if not, what I've
>>>>> misunderstood this time ;-)
>>>>>
>>>>> Merry Christmas!
>>>>> Andy
>>>>>
>>>>>
>>>>> On 23/12/14 16:06, Gavin Hesketh wrote:
>>>>>> Hi,
>>>>>> We are validating madgraph for ATLAS, and are having to generate
>>>>>> separate samples for each jet multiplicity. This causes problems for
>>>>>> some analyses where histograms are unit-normalised or ratios are taken
>>>>>> at the end of a run. I've raised this problem in the past for alpgen,
>>>>>> and unfortunately it seems like it's not going to go away any time
>>>>>> soon...
>>>>>>
>>>>>> So I wondered if there are good ideas for how to handle it, or if yoda
>>>>>> has some features I'm not aware of that we can use for work-arounds?
>>>>>>
>>>>>>
>>>>>> An example: I'm looking at ATLAS_2013_I1217867 (kt splitting scales in W
>>>>>> events). At the end of the run, each plot is unit-normalised:
>>>>>>       normalize(_h_dI[flav][i], 1.0, false);
>>>>>> So, when later adding up 0-jet, 1-jet, 2-jet, 3-jet samples, we end up
>>>>>> with a histogram with area 4, and completely the wrong shape.
>>>>>>
>>>>>> To solve this, I've hacked the routine to do a more standard
>>>>>> cross-section normalisation:
>>>>>>         double normfac = crossSection()/sumOfWeights();
>>>>>>         scale(_h_dI[flav][i], normfac);
>>>>>> then will yodamerge the jet samples and write a little yoda script to
>>>>>> unit-normalise the plot at the end.
>>>>>>
>>>>>> While I think it would be a nightmare to try to generalise and solve for
>>>>>> all possible situations, it would be very useful to have a rivet
>>>>>> command-line flag which turns off unit-normalisation / ratios, and just
>>>>>> normalises all plots to cross sections. Still not a small amount of work
>>>>>> to modify all existing analyses of course... Could then provide an
>>>>>> example script to merge and unit-normalise plots in yodamerge.
>>>>>>
>>>>>> But, like I said, I'm hoping someone has a better idea...
>>>>>>
>>>>>> Merry Christmas :)
>>>>>> Gavin
>>>>>> _______________________________________________
>>>>>> Rivet mailing list
>>>>>> Rivet at projects.hepforge.org
>>>>>> https://www.hepforge.org/lists/listinfo/rivet


-- 
Dr Andy Buckley, Royal Society University Research Fellow
Particle Physics Expt Group, University of Glasgow / PH Dept, CERN