Checking how your page appears in the cache of Google or Live is one way to see how you appear to crawlers, but it doesn’t work well when you are making changes or running in development. A handier way is to view your site in Lynx, as Joshua mentions in this post on Mix Online.
The content on the site was ending up in the index of search engines anyway, by virtue of RSS feeds and incoming links… but the value of your site to crawlers is going to be much lower than it should be if they don’t see any content when they visit. As I said earlier, we always knew this would be a problem, but I guess we just didn’t get around to fixing it before pushing out a full three sites using AJAX-based paging. Last week I had a meeting with an SEO consultant, who pointed out the exact issue I’ve been describing. Well… given a long weekend and no interest in working on my actual planned tasks, I decided to implement two features to improve how our sites appear to crawlers.
First, I added some code that swaps out our fancy Ajax entry list for a simple ASP.NET Repeater if the browser doesn’t appear to be one that Microsoft Atlas supports. That makes our site usable in other browsers (Atlas supports the bulk of users, but not all) and also makes our content visible to crawlers. So far, I only output the first page of any given entry list, but that is enough to take what a crawler sees from a blank page to an actual list of entries.
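For anyone curious about the shape of that browser check, here is a rough sketch in Python rather than the ASP.NET code the sites actually run. The token list, the function names, and the markup are all made up for illustration; the real test for Atlas support is not shown in this post.

```python
from html import escape

# Hypothetical tokens for browsers treated as Ajax-capable. The real list
# used by the sites (and by Atlas) is not given in the post.
AJAX_CAPABLE_TOKENS = ("MSIE", "Firefox", "Safari")


def supports_ajax_list(user_agent):
    """Return True if the User-Agent string looks like a supported browser."""
    if not user_agent:
        return False
    return any(token in user_agent for token in AJAX_CAPABLE_TOKENS)


def render_entry_list(user_agent, entries, page_size=10):
    """Return the Ajax placeholder or a static list of the first page."""
    if supports_ajax_list(user_agent):
        # Supported browsers get an empty container that script fills in.
        return '<div id="entry-list"></div>'
    # Everyone else, crawlers included, gets the first page as plain markup.
    items = "".join(
        '<li><a href="{0}">{1}</a></li>'.format(
            escape(entry["url"], quote=True), escape(entry["title"])
        )
        for entry in entries[:page_size]
    )
    return "<ul>" + items + "</ul>"
```

The point of defaulting to the static list is that an unknown or missing User-Agent (which is what most crawlers and text browsers such as Lynx look like to this check) still receives real content.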
Next, I added an XML sitemap, following the specs from sitemaps.org: a sitemap index is output at http://<site>/sitemapindex.ashx, and a series of paged sitemaps is output from http://<site>/sitemap.ashx?page=<number> (see Mix’s sitemap index and sitemap as an example). Finally, I put a link to the sitemap index into the robots.txt file for each site.
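To make the sitemap layout a bit more concrete, here is a minimal sketch of the three pieces (index, paged sitemap, robots.txt line), again in Python rather than the actual .ashx handlers. The page size of 1000 URLs, the 1-based page numbering, and the function names are assumptions for illustration; only the two URL patterns and the sitemaps.org namespace come from above and from the protocol itself.

```python
import math
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumed page size; the sitemaps.org protocol allows up to 50,000 URLs per file.
URLS_PER_SITEMAP = 1000


def build_sitemap_index(site, total_urls):
    """Build the index listing one sitemap.ashx?page=N entry per page."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    pages = max(1, math.ceil(total_urls / URLS_PER_SITEMAP))
    for page in range(1, pages + 1):  # assumes pages are numbered from 1
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = "{0}/sitemap.ashx?page={1}".format(site, page)
    return ET.tostring(root, encoding="unicode")


def build_sitemap(urls, page):
    """Build the urlset for one page of the full URL list."""
    start = (page - 1) * URLS_PER_SITEMAP
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls[start:start + URLS_PER_SITEMAP]:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def robots_sitemap_line(site):
    """Return the robots.txt directive that points crawlers at the index."""
    return "Sitemap: {0}/sitemapindex.ashx".format(site)
```

Calling build_sitemap_index("http://example.com", 2500) would list three paged sitemaps, and robots_sitemap_line("http://example.com") gives the single line to append to robots.txt.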
Between the two, I’m hoping our content will get indexed better by a variety of search engines, resulting in more people finding us when searching for relevant topics. These changes also help to make us a little bit more usable to some users, but that is another area where we need to do a lot more work. If these changes improve our accessibility that’s great, but I’d hate to even suggest that they get us anywhere near our goals in that area.