<?xml version="1.0" encoding="iso-8859-1"?> 
<rdf:RDF
	xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
	xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
	xmlns:dc="http://purl.org/dc/elements/1.1/" 
	xmlns:dcterms="http://purl.org/dc/terms/" 
	xmlns:admin="http://webns.net/mvcb/"
	xmlns:thr="http://purl.org/rss/1.0/modules/threading/"
	xmlns:pb="http://www.ideaspace.net/users/wkearney/schema/postback/" 
	xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" 
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:mt="http://movabletype.org/"
	xmlns:foaf="http://xmlns.com/foaf/0.1/" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:html="http://www.w3.org/TR/REC-html40/"
	xmlns="http://purl.org/rss/1.0/"
> 

<rdf:Description rdf:about="http://www.ideaspace.net/users/wkearney/archives/entries/000269.html"> 
	<title>Throttle that web server!</title>
	<link>http://www.ideaspace.net/users/wkearney/archives/entries/000269.html</link>
	<description>Overnight I discovered a certain server down in New Zealand has seen fit to start trying to spider one of...</description> 

	<dc:creator>wkearney</dc:creator> 
	<dc:date>2003-04-17T13:13:02-05:00</dc:date> 
	<dc:identifier>http://www.ideaspace.net/users/wkearney/archives/entries/000269.html</dc:identifier>
	<dc:language>en-us</dc:language>

	 
	<dc:subject>Geek</dc:subject>

	<dcterms:abstract>Overnight I discovered a certain server down in New Zealand has seen fit to start trying to spider one of...</dcterms:abstract> 
	<dcterms:created>2003-04-17T13:13:02-05:00</dcterms:created> 
	<dcterms:isPartOf rdf:resource="http://www.ideaspace.net/users/wkearney/" /> 

	<mt:body><![CDATA[<p>Overnight I discovered a certain server down in New Zealand has seen fit to start trying to spider one of my servers.  The idiots.  They're spidering one of their own sites that has an external link to one of mine.  Their spider is making it worse by somehow bastardizing the URLs and recursing into non-existent subdirectories.  So I'm seeing all sorts of wasted bandwidth and a generally cluttered-up server log (~512k so far).</p>

<p><strong>UPDATE</strong>: The admins on the box have contacted me.  They've shut down the spider (htdig) and offered an apology.  Way to go, folks!  Running these machines can be a tough job.  I hope they have some luck getting it reconfigured properly.</p>]]></mt:body>
	<mt:excerpt>Overnight I discovered a certain server down in New Zealand has seen fit to start trying to spider one of...</mt:excerpt> 
	<mt:more><![CDATA[<p>What to do?  Well, using Apache deny directives is a good start.  Trouble is, their spider doesn't seem to be alone.  Another one of their hosts, on a different subnet, is making the same erroneous requests.  So if I block individual IP addresses or entire subnet ranges, I'm potentially faced with a lot of work.  I'm also faced with legitimate users on those subnets getting blocked.  I'd heard about <a href="http://www.snert.com/Software/mod_throttle/index.shtml#ThrottlePolicy">mod_throttle</a> some time ago.  It's been on my 'to check out' list for ages.  Now, it seems, I have a need for it.</p>
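
<p>For reference, a deny block of that sort might look roughly like the sketch below.  It assumes the Apache 1.3/2.0 access-control directives (Order, Allow, Deny); the directory path and the addresses are placeholders drawn from the reserved documentation ranges, not the actual offending hosts.</p>

<pre><code># Hypothetical example: the path and addresses are placeholders.
&lt;Directory "/var/www/html"&gt;
    # Evaluate Allow rules first, then Deny; a client matching both is denied.
    Order Allow,Deny
    Allow from all
    # Block one whole subnet, plus a single host on a different subnet.
    Deny from 192.0.2.0/24
    Deny from 198.51.100.17
&lt;/Directory&gt;</code></pre>

<p>Every new host or subnet means another Deny line, and the /24 entry also locks out every other client in that range.</p>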

<p>Initial setup looks good.  I've got it blocking IP addresses if they request 'too often'.  I'm sure I'll have to tweak these numbers a bit, not to mention refine which directories get this treatment and which don't.  As I get a better grip on it I'll be sure to report my experiences.</p>]]></mt:more>
	<mt:keywords></mt:keywords> 
	<mt:entryID>269</mt:entryID>

	<mt:entryPrev>268</mt:entryPrev>
	<mt:entryNext>270</mt:entryNext>

	<html:link rel="prev" type="application/xml" href="http://www.ideaspace.net/users/wkearney/archives/entries/000268.html.xml" title="When potato guns go bad" />
	<html:link rel="next" type="application/xml" href="http://www.ideaspace.net/users/wkearney/archives/entries/000270.html.xml" title="weblogger navel gazing gone bad" />
	
	<mt:author>wkearney</mt:author> 
	<mt:authorNickname>Bill Kearney</mt:authorNickname> 
	<mt:authorEmail>wkearney@ideaspace.net</mt:authorEmail>
	<mt:authorURL rdf:resource="http://www.ideaspace.net/users/wkearney" /> 
	
	<foaf:name>Bill Kearney</foaf:name>
	<foaf:mbox rdf:resource="mailto:wkearney@ideaspace.net" />
	<foaf:nick>wkearney</foaf:nick>
	<foaf:homepage rdf:resource="http://www.ideaspace.net/users/wkearney" />
	
	<rdfs:seeAlso rdf:resource="http://www.ideaspace.net/users/wkearney/xml/index.rdf" />
	<admin:generatorAgent rdf:resource="http://www.movabletype.org/?v=2.64" /> 
</rdf:Description>
</rdf:RDF>