Dynamically generated robots.txt and sitemap.xml files in ASP.NET

A robots.txt file and a sitemap are important additions to a site from an SEO point of view. You can of course just serve static files, but it’s easy to imagine cases in which you’d want the content of these to be generated dynamically:

1. You are regularly adding/changing site content. Certainly on hirespace.com, we’re adding new venue pages all the time, so our sitemap changes frequently.

2. You have some kind of test/pre-production environment set up for your site that you don’t want Google crawling and indexing. In this case, your robots.txt file needs to change depending on which environment you’re in.

This is easy to do in ASP.NET. In principle you just set up a controller and view for each page and create the content dynamically inside the controller’s Index action, then set up routes to point yoursite.com/sitemap.xml and yoursite.com/robots.txt to the right actions. There’s one further issue: by default, IIS will try to handle URLs containing a dot with the static file handler, so you also need to add custom handlers to your web.config, as I’ll show below.

For example, I created a controller called RobotsController with the following action. It outputs a plain-text document in the expected robots.txt format, varying the content depending on the deployment environment.

public virtual ActionResult Index()
{
    string robotsResult;

    switch (Config.DeploymentContext)
    {
        case DeploymentContextType.Test:
            // Block crawlers entirely on the test environment
            robotsResult = "User-agent: * \n Disallow: /";
            break;
        case DeploymentContextType.Live:
            // On live, only keep crawlers out of the account pages
            robotsResult = "User-agent: * \n Disallow: /Account";
            break;
        default:
            // Play safe: block everything if the environment is unknown
            robotsResult = "User-agent: * \n Disallow: /";
            break;
    }

    return Content(robotsResult, "text/plain");
}

Now we add the route to our RouteConfig class:

routes.MapRoute(
    name: "Robots.txt",
    url: "Robots.txt",
    defaults: new { controller = "Robots", action = "Index" });
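
For context, here’s roughly where that registration sits in a typical RouteConfig class (a sketch, not my exact file). The important detail is that the specific route is registered before the catch-all default route, otherwise MVC would try to treat "Robots.txt" as a controller name.

using System.Web.Mvc;
using System.Web.Routing;

public class RouteConfig
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

        // Register the specific route first so it wins over the default route
        routes.MapRoute(
            name: "Robots.txt",
            url: "Robots.txt",
            defaults: new { controller = "Robots", action = "Index" });

        // The standard MVC catch-all route
        routes.MapRoute(
            name: "Default",
            url: "{controller}/{action}/{id}",
            defaults: new { controller = "Home", action = "Index", id = UrlParameter.Optional });
    }
}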

So far so simple. The only problem now is that when you navigate to yoursite.com/robots.txt, IIS sees the dot in the URL and expects a static file. Since there is no such file, it fails and returns a file not found error.

I found the solution to this issue on StackOverflow (of course – yay!) and it’s dead simple. All you need to do is add a handler to your web.config (original StackOverflow answer here):

<system.webServer>
  <handlers>
    <add name="Robots-ISAPI-Integrated-4.0" path="/robots.txt" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
    ...
  </handlers>
</system.webServer>

Similarly, if you’re generating a sitemap dynamically, add a route:

routes.MapRoute(
    name: "Sitemap.xml",
    url: "Sitemap.xml",
    defaults: new { controller = "Sitemap", action = "Index" });

and a handler:

<system.webServer>
  <handlers>
    <add name="Sitemap-ISAPI-Integrated-4.0" path="/sitemap.xml" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
    ...
  </handlers>
</system.webServer>
    
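The sitemap controller itself isn’t shown above, but as a rough sketch it could look something like the following (the SitemapController name, URLs and data source here are placeholders – in practice the entries would come from your database or content store). It builds the XML with LINQ to XML and returns it as text/xml:

using System;
using System.Linq;
using System.Text;
using System.Web.Mvc;
using System.Xml.Linq;

public class SitemapController : Controller
{
    public virtual ActionResult Index()
    {
        // Placeholder URLs - in a real app these would be loaded from
        // wherever your page data lives (e.g. one entry per venue page).
        var pageUrls = new[]
        {
            "https://www.example.com/",
            "https://www.example.com/venues/some-venue"
        };

        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        var sitemap = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement(ns + "urlset",
                pageUrls.Select(url =>
                    new XElement(ns + "url",
                        new XElement(ns + "loc", url),
                        new XElement(ns + "lastmod", DateTime.UtcNow.ToString("yyyy-MM-dd"))))));

        // XDocument.ToString() omits the XML declaration, so prepend it manually
        return Content(sitemap.Declaration + Environment.NewLine + sitemap,
                       "text/xml", Encoding.UTF8);
    }
}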

And that’s it! Google and other search engine bots will pick up your dynamically generated sitemap.xml and robots.txt files, and you’ll reap the SEO benefits 🙂
