Implicit relationship between search crawler and content providers...

Posted on January 11th

A discussion at work today, along with my earlier post about Phil's issues with Google and MSN, had me searching like crazy for this comment:

"I think they're also vulnerable to being locked out of sites because they've missed the implicit bargain for search engines that my site being crawled results in listing for me. Without any public facing services how do I find out what value I am getting from allowing WebFountain to crawl my site?" --- Matthew Walker's comment on a Searchblog piece about WebFountain

At what point do you block a search crawler? When you don't recognize it? When you don't like the results provided by the search engine it crawls for?

Reader Comments

I've had to block crawlers that were hitting my site at a ridiculous rate, consuming server CPU to the detriment of other visitors. But that's rare.

My default impulse is to allow everybody, and only block certain people if they become a problem.

/robots.txt:
# allow everything

User-agent: *
Disallow:
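
For anyone who wants to see how an "allow everybody, block the odd troublemaker" policy reads in practice, here is a rough sketch using Python's standard urllib.robotparser. The crawler name "BadBot", the example.com URLs, and the is_allowed helper are all made up for illustration; only well-behaved crawlers honor robots.txt, so this is no substitute for blocking at the server.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block one misbehaving crawler, allow everyone else.
# "BadBot" is an invented name, not a real crawler.
ROBOTS_TXT = """\
# block one misbehaving crawler
User-agent: BadBot
Disallow: /

# allow everything else
User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "BadBot"):
        print(agent, is_allowed(agent, "https://example.com/index.html"))
```

The same check is what a polite crawler would run on its side before requesting a page, which is why the default-allow rule sits in its own "User-agent: *" group after the specific block.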