[Rivet] [Fastjet] FastJet "robot" downloads blocked

Matteo Cacciari cacciari at lpthe.jussieu.fr
Fri Apr 19 14:00:30 BST 2013


Hi Frank.

I'm around all afternoon.

From what I see in the logs, it may be that the issue is not so much the
User-Agent, but rather the lack of an Accept header. Once I have an IP I
can tell for sure.
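
To separate the two hypotheses, something along these lines could be tried. This is only a minimal sketch, not Rivet's actual bootstrap code: it uses Python 3's urllib.request (the successor of the urllib2 module mentioned in this thread), and the function names and the "rivet-bootstrap" User-Agent string are hypothetical. Whether an explicit Accept header actually avoids the 403 is still to be confirmed against the server logs.

```python
import urllib.request  # Python 3 successor of Python 2's urllib2

# Tarball URL quoted in this thread.
FASTJET_URL = "http://fastjet.fr/repo/fastjet-3.0.3.tar.gz"


def build_request(url, user_agent="rivet-bootstrap", accept="*/*"):
    """Build a request that sends explicit User-Agent and Accept headers.

    The default header values are illustrative assumptions, not values
    known to be accepted by the FastJet server.
    """
    headers = {"User-Agent": user_agent, "Accept": accept}
    return urllib.request.Request(url, headers=headers)


def fetch(url, dest):
    """Download url to the local file dest using the headers above.

    urlopen raises urllib.error.HTTPError (e.g. code 403) if the
    server rejects the request.
    """
    req = build_request(url)
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
        out.write(resp.read())
```

Calling fetch(FASTJET_URL, "fastjet-3.0.3.tar.gz") once with and once without the Accept entry in the headers would show which header the server is reacting to.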

As Gavin said, our server was... downgraded a few weeks ago, so the problem
may originate from a new set of security rules introduced at that time.

See you later,
Matteo



On 19/04/2013 14:55, Frank Siegert wrote:
> Hi Gavin, all,
> 
> Thanks for looking into it. I have also not noticed any problems with
> wget, only with the Python library used in Rivet's bootstrap script
> (urllib2). One gets a 403 error in reply -- I thought this might
> be intended to deny access to (Python) bots. I have tried to work around
> this by specifying the User-Agent header in the urllib2 request, but I
> still got a 403.
> 
> I'm preparing a minimal script to reproduce this, will send it to you
> in a few minutes. The problem I had was that the rejection seemed to
> be dynamic: it worked one day but not the next. Since several users
> have reported the same issue in the last few weeks, we thought it
> made sense to contact you.
> 
> Matteo: If you are around for the afternoon tutorials at MC4BSM we
> could also have a quick look at this together.
> 
> Cheers,
> Frank
> 
> 
> On 19 April 2013 14:03, Gavin Salam <gavin.salam at cern.ch> wrote:
>> Hi Andy,
>>
>> This is weird -- I know nothing about any robot blocking and we don't have a
>> robots.txt file at all. Do you have a small example script to illustrate the
>> problem (or is the easiest option to try rivet's bootstrap script)? I just
>> checked things with wget and that worked fine. But our web servers did get
>> updated (downgraded really) lately, so something might have changed relative
>> to a few weeks ago.
>>
>> Cheers,
>> Gavin
>>
>>
>>
>> On 4/19/13 12:42 PM, Andy Buckley wrote:
>>>
>>> Hi Gregory, Gavin, et al,
>>>
>>> We've noticed recently that the FastJet website blocks "robot" downloads
>>> of the FastJet tarball, e.g. http://fastjet.fr/repo/fastjet-3.0.3.tar.gz
>>>
>>> Unfortunately this means that the Rivet bootstrap script can fail if it
>>> tries to download and build FastJet, rather than using the LCG installed
>>> copy from AFS. We're using Python's urllib2 to do the fetching... is
>>> there anything we or you can do to not fall foul of this blocking? (I'm
>>> not sure if urllib2 automatically respects robots.txt files, but if you
>>> want to give us a special unblocked User-Agent name to use, I'm sure we
>>> can manage to update our script accordingly)
>>>
>>> Thanks!
>>> Andy & co
>>>
>>
>>
>> _______________________________________________
>> Rivet mailing list
>> Rivet at projects.hepforge.org
>> http://www.hepforge.org/lists/listinfo/rivet
> _______________________________________________
> Fastjet mailing list
> Fastjet at projects.hepforge.org
> http://www.hepforge.org/lists/listinfo/fastjet
> 
