[Rivet] [Rivet-svn] r2394 - in trunk: include/Rivet src/Core src/Tools

Andy Buckley andy at insectnation.org
Fri Apr 9 00:11:49 BST 2010


On 08/04/10 23:31, Frank Siegert wrote:
> Hi guys,
> 
> Here some explanations about this commit.
> Getting frustrated that a simple (diphoton, no jets) analysis in Rivet is 
> taking longer than the actual event generation in Sherpa, I started 
> profiling our AnalysisHandler::analyze() phase today. What I found is 
> quite simple, you can find kcachegrind input files for all of these in 
> http://www.ippp.dur.ac.uk/~fsiegert/tmp/profile/ if you want to look at it 
> yourself. The starting point (callgrid.out.0initial) was: 
> D0_2010_S8570965::analyze() takes up only 11% of the runtime of 
> AnalysisHandler::analyze()!!!

Wow. Profiling was long overdue!

> 1. The getParticleNamesMap() function in ParticleName.hh is ridiculously 
> expensive and called even more ridiculously often. It alone was 
> responsible for >50% of analyze()!
> I have changed that into a static map which is initialized upon 
> AnalysisHandler construction. If somebody has a better idea for that I'm 
> happy to hear it.
> -> callgrind.out.1particlenames

This was on my private list for overhauling: it just went up several
notches.

> 2. Now the GenEvent copying in Rivet::Event takes up 30% of analyze().
> I have changed that such that the Event is only copied if we have to 
> rotate it or change the units. Otherwise the external const reference is 
> used.
> -> callgrind.out.2geneventref

Perfect.

> 3. Now Log::getLog takes up 20% of analyze().
> This is caused by the map searching, not any actual output. I haven't 
> solved that yet, as that would probably involve some rewrite. Instead I've 
> used a simple hack (prof_getlog.patch) to disable the log searching for 
> further profiling. If anybody has a good idea here, please shout.
> -> callgrind.out.3getlog

I think we should nick some nice logging macros from ATLAS to kill
logging CPU time associated with messages below the current level. This
will take some migration but can speed up a lot.

The map search... bloody hell. I'll have a think: when I wrote it I had
no idea what would be a slow part.

> 4. The output of beam pairs to TRACE level in Event::_geNormAlignment 
> takes up 15% now. I have removed it.
> -> callgrind.out.4nooutputinalign

We could reinstate this with the logging macros for protection.

> 5. Now beam projection and sqrtS() are 20% of analyze().
> I don't think we need to continuously monitor the event beams, but one 
> initial check is enough. So I have removed that from the analyze() phase.
> -> callgrind.out.5onlyinitialbeamcheck

Why are they so slow?! It's a super-simple projection, and sqrt(s) is
arithmetic... in fact, it should be just one addition most of the time. Hmm.

> Bottom line: D0_2010_S8570965::analyze() actually takes 94% of the total 
> analyze() time now.

Cool, though I think we need to tidy up your trail of devastation ;)

> I have committed the proposed fixes for 1, 2, 4 and 5. Would somebody be 
> willing to look for a proper solution to 3?

I guess this is me. Macros are easy but need migration in the code. But
since production runs tend to be ~silent, they would save all this CPU.
Otherwise it's a logging system redesign: argh.

Thanks Frank, this is awesome... I should have done it a long time ago.
Following up will take some effort, but is a good target for a 1.2.2
release I think, in combination with getting the UNVALIDATED analyses
into production.

Andy


More information about the Rivet mailing list