Dave's Free Press: Journal

violence, pornography, and rude words for the web generation

 

Recent posts

(subscribe)

Recently commented posts

(subscribe)

Journals what I read

geeky politics rant silly religion meta music perl weird drinking culture london language transport olympics hacking media maths sport web photography etiquette spam amazon film bastards books bryar holidays palm telecoms cars travel yapc bbc clothes rsnapshot phone whisky security home radio lolcats deafness environment curry art work privacy iphone linux unix go business engineering kindle gps economics latin anglo-saxon money bramble cars environment electronics
Tue, 21 Oct 2008

Thanks, Yahoo!

[originally posted on Apr 3 2008]

I'd like to express my warm thanks to the lovely people at Yahoo and in particular to their bot-herders. Until quite recently, their web-crawling bots had most irritatingly obeyed robot exclusion rules in the robots.txt file that I have on CPANdeps. But in the last couple of weeks they've got rid of that niggling little exclusion so now they're indexing all of the CPAN's dependencies through my site! And for the benefit of their important customers, they're doing it nice and quickly - a request every few seconds instead of the pedestrian once every few minutes that gentler bots use.

Unfortunately, because generating a dependency tree takes more time than they were allowing between requests, they were filling up my process table, and all my memory, and eating all the CPU, and the only way to get back into the machine was by power-cycling it. So it is with the deepest of regrets that I have had to exclude them.

Cunts.

[update] For fuck's sake, they're doing it again from a different netblock!

Posted at 17:35 by David Cantrell
keywords: geeky | meta | perl | rant
Permalink | 1 Comment

Can't you set an apache rewrite rule based on their UserAgent or something ? Just redirect them to a blank page.

Posted by vegiVamp on Wed, 22 Oct 2008 at 09:29:20


Sorry, this post is too old for you to comment on it.

Archive