# robots.txt for http://www.pantos.org
# Detailed philosophy at the end of this file

User-agent: *
Crawl-delay: 120
Disallow: /

User-agent: Slurp
Disallow: /brea/

User-agent: HTTrack
Disallow: /brea/

User-agent: msnbot
Disallow: /brea/

User-agent: Lycos_Spider
Disallow: /brea/

User-agent: W3C-checklink
Disallow: /brea/

User-agent: ia_archiver
Disallow: /brea/

User-agent: Googlebot
Disallow: /brea/

User-agent: Teoma
Disallow: /brea/

User-agent: LinkAlarm
Disallow: /brea/

User-agent: SideWinder
Disallow: /brea/

User-agent: ZyBorg
Disallow: /brea/

# Misc agents

User-agent: INFOMINE
Disallow: /brea/

User-agent: Pompos
Disallow: /brea/

User-agent: heritrix
Disallow: /brea/

User-agent: psbot
Disallow: /brea/

User-agent: IRLbot
Disallow: /brea/

User-agent: Argus
Disallow: /brea/

User-agent: BruinBot
Disallow: /brea/

User-agent: OmniExplorer_Bot
Disallow: /brea/

User-agent: Gaisbot
Disallow: /brea/

User-agent: schibstedsokbot
Disallow: /brea/

User-agent: GOFORITBOT
Disallow: /brea/

User-agent: ichiro
Disallow: /brea/

User-agent: http://www.almaden.ibm.com/cs/crawler
Disallow: /brea/

User-agent: *
Disallow: /brea/

# As of 25 September 2005, the policy regarding crawlers/spiders
# has fundamentally shifted. The prior philosophy was to allow
# all UAs by default. Sadly, ill-behaved robots are becoming
# more commonplace, and some people are sucking down this entire
# site in one huge blast. Alternatively, some link checkers are
# operating on a daily (or more-than-daily!) basis. (C'mon,
# links don't rot *that* fast!!)
#
# Reluctantly, the policy now is to disallow all UAs by default,
# and to welcome the most well-behaved robots to spider the site.
# ("Well-behaved" includes having a link to a page that describes
# the operation of your spider and provides a way to contact
# a human operator in the event of a problem.) In general, access
# is now limited to a few commercial crawlers, and to UAs that
# represent/support an academic (as opposed to commercial)
# research effort.
#
# Pointedly *disallowed* are browser pre-fetchers (which are a
# *really* stupid idea), and *any* UA that lies about who/what
# it is. (Under the theory that lying is necessary only when
# the truth would exclude you.)
#
# If you are a human reading this text, and you are operating
# a well-behaved robot, you may send email to terry@pantos.org
# to request permission to access the site.
#
# OTOH, if you are operating some misbehaved brat of a robot,
# don't be surprised if you suddenly find yourself getting a
# pile of 403 status codes--and not just for your robot!
#