Implicit relationship between search crawlers and content providers…

A discussion at work today, along with my earlier post about Phil’s issues with Google and MSN, had me searching like crazy for this comment:

“I think they’re also vulnerable to being locked out of sites because they’ve missed the implicit bargain for search engines that my site being crawled results in listing for me. Without any public facing services how do I find out what value I am getting from allowing WebFountain to crawl my site?” — Matthew Walker’s comment on a Searchblog piece about WebFountain

At what point do you block a search crawler? If you don’t recognize it? If you don’t like the results provided by the search engine it is crawling for?

Author: Duncan Mackenzie

I’m the Developer Lead for the Channel 9 team; previously I worked on MSDN as a developer, content strategist, and author.

One thought on “Implicit relationship between search crawlers and content providers…”

  1. I’ve had to block crawlers that were hitting my site at a ridiculous rate, consuming server CPU to the detriment of other visitors. But that’s rare.

    My default impulse is to allow everybody, and only block specific crawlers if they become a problem; a block-one-crawler variant is sketched below.

    /robots.txt:

    # allow everything
    User-agent: *
    Disallow:
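
    For comparison, here’s a minimal sketch of the opposite case: shutting out a single misbehaving crawler while keeping the open default for everyone else. The “BadBot” user-agent name is a placeholder for whatever agent string actually shows up in your logs.

    /robots.txt:

    # block one misbehaving crawler
    # ("BadBot" is a placeholder; substitute the agent string from your logs)
    User-agent: BadBot
    Disallow: /

    # everyone else remains welcome
    User-agent: *
    Disallow:

    Keep in mind that robots.txt is purely advisory: a crawler aggressive enough to eat your server’s CPU may well ignore it, so genuinely abusive agents usually end up blocked at the web server or firewall instead.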
