[Update 2/14/06: checked again today, and the problem appears to be fixed. Score one for Google.]
Last week, Google released an update to Google Sitemaps that added a robots.txt checker tool. I started experimenting with it today and found rather odd behavior: the tool reports access allowed for URLs that the rules clearly block. I think I've boiled this down to the simplest case, which appears to involve paths starting with “Web” (the “W” must be capitalized). I don't know whether this is domain specific, so I'd love to hear back whether you can reproduce it. Here's the setup:
Enter the following into the large robots.txt area:
User-agent: *
Disallow: /Web.htm
Then enter the following into the URL area:
http://www.[your domain here].com/Web.htm
Now, click “Check” and watch it report “Allowed”, even though the Disallow rule plainly matches that path.
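For comparison, here's a minimal sketch of what a conforming matcher does with that same rule, using Python's standard-library robots.txt parser (example.com stands in for your domain):

import urllib.robotparser

# Parse the same two-line robots.txt used in the test above.
rp = urllib.robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /Web.htm"])

# A conforming checker should report this URL as blocked.
print(rp.can_fetch("*", "http://www.example.com/Web.htm"))  # prints False

A correct checker should agree with that False; Google's tool says “Allowed” instead.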
If you happen to see this and know someone at Google, pass it along.
posted @ Friday, February 10, 2006 10:57 AM