Implicit relationship between search crawler and content providers…

A discussion at work today, along with my earlier post about Phil’s issues with Google and MSN, had me searching like crazy for this comment:

“I think they’re also vulnerable to being locked out of sites because they’ve missed the implicit bargain for search engines that my site being crawled results in listing for me. Without any public facing services how do I find out what value I am getting from allowing WebFountain to crawl my site ?” — Matthew Walker’s comment on a Searchblog piece about WebFountain

At what point do you block a search crawler? If you don’t recognize it? If you don’t like the results provided by the search engine it is crawling for?


One Response

  1. I’ve had to block crawlers that were hitting my site at a ridiculous rate, consuming server CPU to the detriment of other visitors. But that’s rare.

    My default impulse is to allow everybody, and only block certain people if they become a problem.

    /robots.txt:
    # allow everything

    User-agent: *
    Disallow:

Maurits January 12, 2006 at 6:09 pm
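The default-allow, block-on-misbehavior policy the commenter describes can be checked programmatically with Python's standard-library `urllib.robotparser`. A minimal sketch, where "BadBot" is a hypothetical name for a misbehaving crawler (not any real crawler's user-agent):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow every crawler by default,
# but block one crawler ("BadBot") that became a problem.
robots_txt = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Unrecognized crawlers fall through to the "*" record and are allowed.
print(rp.can_fetch("Googlebot", "http://example.com/page"))  # True
# The blocked crawler matches its own record and is disallowed everywhere.
print(rp.can_fetch("BadBot", "http://example.com/page"))     # False
```

An empty `Disallow:` line means "nothing is disallowed," which is why the `*` record allows everything, while `Disallow: /` under a named user-agent shuts that one crawler out of the whole site.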
