Search Engine Friendly Links in C# ASP .NET

Introduction

When designing a large C# ASP .NET web application containing hundreds or thousands of dynamically generated pages, a common afterthought is how the search engine robots will handle your site. Generated pages often contain numerous parameters in the URL query string, including content IDs, user IDs, variables, state transition placeholders, and other data. While search engine robots are getting better at handling these URLs, some engines may simply ignore the pages. This could result in a large portion of your web application never being crawled and indexed. However, creating search engine friendly links in your C# ASP .NET web application is actually much easier than you may think.

Comparing Search Engine Friendly URLs to Parameter Clogs

An example of a poorly designed URL

http://www.mysite.com/myapp?q=3&state=3&userid=21&pass=&day=2&request=5

An example of a search engine friendly URL

http://www.mysite.com/ShowPage3.aspx

In the first example, the URL contains several parameters that may be skipped entirely by the search engines. The data is also insecure: exposing values such as a user ID and password in the query string invites tampering by a malicious user. It also makes your site appear more confusing and harder to work with.

The second example compresses the parameter information into a single digit (or combination of digits). The important part is that there are no query string parameters. Instead, the application code will extract the digit 3 and interpret it to display the correct data. Other variables may be held in session state (i.e., Session["MyUserID"] = 2) or application state (i.e., Application["GlobalID"] = "ABC").
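As a minimal sketch of the receiving side (ViewPage.aspx, the pageid parameter, and the LoadContent helper are hypothetical names, not part of a standard API), the target page might read the id from the query string and keep per-user values in session state:

```csharp
// ViewPage.aspx.cs - hypothetical code-behind for the target page.
protected void Page_Load(object sender, EventArgs e)
{
    // The server-side rewrite supplies the page id as a query string parameter.
    string pageid = Request.QueryString["pageid"];

    // Per-user values can live in session state instead of the URL.
    Session["MyUserID"] = 2;

    if (!String.IsNullOrEmpty(pageid))
    {
        // Hypothetical helper: load and display the content for this id.
        LoadContent(Convert.ToInt32(pageid));
    }
}
```

Because the id arrives as an ordinary query string parameter on the server, the page code needs no knowledge of the friendly alias the user actually typed.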

Regular Expressions in C# ASP .NET

The foundation for creating a search engine friendly web application in C# ASP .NET sits upon the System.Text.RegularExpressions library and the Global.asax.cs file. In a nutshell, when a web browser requests a page from your site, you receive the call in Application_BeginRequest within Global.asax. There, you execute a regular expression against the URL to determine whether the request is for one of your search engine friendly links or some other page.

Specifically, we will use a regular expression to determine whether the URL contains a page name followed by a digit. You can make the page and id combination as complicated as you wish by adjusting the format of the regular expression. In our basic example, we will use "page(\d+)\.aspx" (note the escaped dot, since an unescaped "." matches any character). This tells us to look for any URL containing "page", followed by one or more digits, followed by ".aspx", such as page2.aspx to load content with an identifier of 2.
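To see the pattern in isolation before wiring it into Global.asax, here is a small console sketch showing how the capturing group pulls the id out of a matching path:

```csharp
using System;
using System.Text.RegularExpressions;

class RegexDemo
{
    static void Main()
    {
        Regex regex = new Regex(@"page(\d+)\.aspx", RegexOptions.IgnoreCase);

        // A friendly URL path, as it would arrive in the request.
        Match match = regex.Match("/myapp/page2.aspx");

        if (match.Success)
        {
            // Groups[1] is the captured (\d+) portion.
            Console.WriteLine(match.Groups[1].Value); // prints "2"
        }
    }
}
```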

For those of you using Visual Studio 2005, you may need to override this function manually. The definition is:

protected void Application_BeginRequest(Object sender, EventArgs e)
{
    ...
}

Taking Advantage of Application_BeginRequest and Global.asax

This is where the magic happens. By combining the regular expression with HttpContext.RewritePath(), we redirect the page content on the server to a different URL with a parameter in the query string, while leaving the original URL in the web browser unchanged. At the top of your Global.asax.cs, you will need to include the following:

using System.Text.RegularExpressions;

Next, here is your Application_BeginRequest() function:

protected void Application_BeginRequest(Object sender, EventArgs e)
{
    HttpContext incoming = HttpContext.Current;
    string oldpath = incoming.Request.Path.ToLower();
    string pageid; // page id requested

    Regex regex1 = new Regex(@"page(\d+)\.aspx", RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
    MatchCollection matches1 = regex1.Matches(oldpath);

    if (matches1.Count > 0)
    {
        // It's a page view request.
        // Extract the page id and display the proper page.
        pageid = matches1[0].Groups[1].Value;

        // We have the pageid. Now secretly redirect the request to the actual page.
        // The original URL in the web browser remains the same, but the content comes from the new URL.
        incoming.RewritePath("ViewPage.aspx?pageid=" + pageid);
    }
}

In this example, notice how RewritePath redirects the request on the server to a file called ViewPage.aspx and passes a query string parameter. On the server, we actually see the parameter in the URL and can process it to display the correct content. The originally requested "page4.aspx" is a fictitious page name and does not actually exist. We could even rename "page4.aspx" to "banana4.aspx"; it doesn't matter. As long as we have a regular expression to catch the alias, we can redirect the request to the correct content. With a little imagination, you can design an incredibly enhanced ASP .NET web application with page aliases all over.
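For instance (the "banana" alias here is purely hypothetical), a single pattern with an alternation can catch several alias names at once, so renaming the public-facing URL costs nothing on the server:

```csharp
// Catch either "pageN.aspx" or "bananaN.aspx" with one pattern.
// (?:...) is a non-capturing group, so Groups[1] is still the digit id.
Regex alias = new Regex(@"(?:page|banana)(\d+)\.aspx", RegexOptions.IgnoreCase);

Match m = alias.Match("/myapp/banana4.aspx");
if (m.Success)
{
    string pageid = m.Groups[1].Value; // "4"
    // Rewrite to the real page exactly as before:
    // incoming.RewritePath("ViewPage.aspx?pageid=" + pageid);
}
```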

RewritePath vs. Response.Redirect: Apples to Oranges

Notice the usage of incoming.RewritePath in the example above. Why didn't we use Response.Redirect(url)? The reason is that the Redirect function sends an HTTP 302 redirect response back to the web browser, telling it (and any search engine robot) that the content lives at a different URL and sending it there in a second request. Obviously, we do not want this to occur. Instead, we use the RewritePath function to change the URL on the server side only: we receive the parameters in the query string on the server, display the proper content, and the user's URL never changes.

Redirecting a User from .HTML to .ASPX

Using the above technique, we can even redirect requests for a .html page to our ASP .NET .aspx page. This may come in handy if you are moving from a static HTML web site to a C# ASP .NET web application. In this scenario, your URLs will be changing from .html to .aspx. How can you safely redirect users who request the .html pages from the search engines and send them to your .aspx pages? Using Response.Redirect or incoming.RewritePath, of course.

If you want to notify the search engines to update their links, use the Response.Redirect() function (this issues a temporary 302 redirect; to mark the move as permanent, you can send a 301 status code instead). If you do not want to notify the engines and would rather keep the .html links as aliases, use the incoming.RewritePath() function. Here is an example that goes in your Application_BeginRequest() function as described above:

// Handle redirects from old HTML pages.
if (Request.Url.AbsoluteUri.ToLower().IndexOf("sample.html") > -1
    || Request.Url.AbsoluteUri.ToLower().IndexOf("sample2.html") > -1)
{
    Response.Redirect("/Sample.aspx");
}

Notice how sample.html is redirected to Sample.aspx. The leading forward slash indicates the page is located at the root directory of the application on the web server. This is a very handy technique if you do not have full control of your web host machine and cannot create an IIS redirect.
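If the move is permanent and you want the search engines to drop the old .html links entirely, the 301 status code can be issued manually. This is a sketch using the same hypothetical sample.html and Sample.aspx names from the example above:

```csharp
// Issue a permanent (301) redirect instead of Response.Redirect's default 302.
if (Request.Url.AbsoluteUri.ToLower().IndexOf("sample.html") > -1)
{
    Response.StatusCode = 301;
    Response.Status = "301 Moved Permanently";
    Response.AddHeader("Location", "/Sample.aspx");
    Response.End();
}
```

The 302 from Response.Redirect tells the engines the old URL is still canonical; the 301 tells them to replace it with the new one in their index.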

Summary

Creating search engine friendly links is a powerful technique in C# ASP .NET web applications. A properly developed web application should very rarely contain URL parameters. Instead, aliases should be created to reference dynamic content. You’ll receive more search engine traffic, you’ll have a better designed application, and ultimately, you’ll provide a better user experience.

About the Author

This article was written by Kory Becker, software developer and architect, skilled in a range of technologies, including web application development, machine learning, artificial intelligence, and data science.
