Microsoft Content Management Server : Increasing Search Accuracy by Generating Search Engine Specific Pages

Because the menus on our templates display not only the name of the current posting but also links to other postings, they add noise to the search results. The way to overcome this is to detect when a page is requested by a crawler and hide the listing controls. However, because search engines traverse a site by following the links within each page, we cannot simply strip out the navigational controls and leave only the main posting content: we must also emit plain hyperlinks to the channel items (postings and channels) within the current channel so that the crawler can still reach them.

Interrogating the User Agent

To determine whether the site is being crawled, we will create a helper method that reads the request's User-Agent header (Request.UserAgent) and compares it against a list of known search agents stored in the web.config file.

First, we have to set up the list of search agents in the web.config file. Under the <configuration> | <appSettings> element, we will insert an <add> element:

<!-- SharePoint, Google, MSN Search (Separated by | )-->
<add key="SearchUserAgents" value="MS Search|GoogleBot|msnbot" />
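
For reference, here is a minimal sketch of where the entry sits within web.config (the rest of your configuration stays as it is):

<configuration>
  <appSettings>
    <!-- SharePoint, Google, MSN Search (separated by |) -->
    <add key="SearchUserAgents" value="MS Search|GoogleBot|msnbot" />
  </appSettings>
  <!-- other configuration sections remain unchanged -->
</configuration>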

Next, we will create a helper class in the Tropical Green project. To do this, follow these steps:

1. Open the TropicalGreen solution in Visual Studio .NET.

2. In the TropicalGreen project, create a new folder called Classes.

3. Right-click on the Classes folder and choose Add | Add Class.

4. Enter the name SearchHelper.cs and click OK.

5. Import the System.Web and System.Configuration namespaces:

using System;
using System.Text.RegularExpressions;
using System.Web;
using System.Configuration;

namespace TropicalGreen.Classes
{
  /// <summary>
  /// Helper methods for detecting search engine crawlers.
  /// </summary>
  public class SearchHelper
  {
    ... code continues ...
  }
}

Let’s add a static method called IsCrawler(), which returns true if the current request’s User-Agent header matches one of the values specified in the web.config:

public static bool IsCrawler()
{
  // Get the list of search agent names from the web.config
  string strUserAgents =
    ConfigurationSettings.AppSettings.Get("SearchUserAgents");

  // Only proceed if the list is present and non-empty
  if(strUserAgents != null && strUserAgents != "")
  {
    // Regular expression to identify all robots.
    // Robot strings must be separated with "|" in the web.config.
    // IgnoreCase guards against casing differences such as
    // "GoogleBot" in the config versus "Googlebot" on the wire.
    Regex reAllSearch = new Regex("(" + strUserAgents + ")",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Get the current user agent; it can be null if the client sent none
    string currentUserAgent = HttpContext.Current.Request.UserAgent;
    if(currentUserAgent == null)
    {
      return false;
    }
    return reAllSearch.Match(currentUserAgent).Success;
  }

  // No agents are specified in the web.config
  return false;
}
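
Before wiring the check into the navigation controls, you can verify the helper from any throwaway test page; a minimal sketch of a diagnostic Page_Load() (the test page itself is our own addition, not part of the TropicalGreen templates):

private void Page_Load(object sender, System.EventArgs e)
{
  // Echo the raw user agent and the helper's verdict to the response
  Response.Write("User-Agent: " + Request.UserAgent + "<br/>");
  Response.Write("IsCrawler: " +
    TropicalGreen.Classes.SearchHelper.IsCrawler().ToString());
}

Requesting the page from a browser should print false; sending the same request with the User-Agent header set to one of the configured values (for example, GoogleBot) should print true.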

Hiding Navigational Elements

For each of the navigation controls in the Tropical Green solution, we will need to add a check during their loading or rendering to see if the page is being crawled.

The easiest way to do this is in the Page_Load() or Render() method of the control. We check whether the current request is from a crawler and hide the control if it is; otherwise, we load it as normal:

if (!TropicalGreen.Classes.SearchHelper.IsCrawler())
{
  // Bind the Data
}
else
{
  // Being crawled, so hide the user control
  this.Visible = false;
}
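
If you would rather not touch Page_Load(), the same check can live in a Render() override instead; a minimal sketch (HtmlTextWriter comes from System.Web.UI):

protected override void Render(System.Web.UI.HtmlTextWriter writer)
{
  // Emit the control's markup only for ordinary visitors;
  // crawlers receive no output at all
  if (!TropicalGreen.Classes.SearchHelper.IsCrawler())
  {
    base.Render(writer);
  }
}

Either approach works: overriding Render() skips the markup entirely, while setting Visible to false in Page_Load() achieves the same result slightly later in the page life cycle.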

Creating a Posting/Channel Listing User Control

Now that we have hidden the navigational controls, we must still provide a mechanism for crawlers to traverse the site. The simplest way is to build a user control that lists hyperlinks to the postings and channels in the current channel without using their display names as link text.

Let’s now build the CrawlingNavigationControl user control:

1. Open the TropicalGreen project in Visual Studio .NET.

2. Under the user controls folder, create a new user control and give it the name CrawlingNavigationControl.

3. Switch to the code-behind file (CrawlingNavigationControl.ascx.cs).

4. Import the following namespaces:

using System.Text;
using Microsoft.ContentManagement.Publishing;

In the user control's Page_Load(), we will create a literal control that contains the hyperlinks. The hyperlink labels will be text that the indexer ignores (noise words); in this case, we use the word “and”.

private void Page_Load(object sender, System.EventArgs e)
{
  // Is the site being crawled?
  if(TropicalGreen.Classes.SearchHelper.IsCrawler())
  {
    // Declare and instantiate a string builder to hold the hyperlinks
    StringBuilder sb = new StringBuilder();

    // Loop through all the channel items in the current channel
    foreach(ChannelItem item in CmsHttpContext.Current.Channel.AllChildren)
    {
      // Append a hyperlink whose label is the noise word "and"
      sb.Append("<a href=\"" + item.Url + "\">and</a>");
    }

    // Instantiate a new literal control
    Literal litLinks = new Literal();

    // Set the literal control's text to the markup we built
    litLinks.Text = sb.ToString();

    // Add the literal control to the control collection
    this.Controls.Add(litLinks);
  }
}


To enable the CrawlingNavigationControl, add it to the bottom of every template file. Alternatively, if your site uses a footer or header user control on every template, you can place the CrawlingNavigationControl there instead.
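
As a sketch, registering and placing the control in a template file might look like the following (the Src path assumes the control sits in a UserControls folder; adjust it to match your project layout):

<%@ Register TagPrefix="TropicalGreen" TagName="CrawlingNavigation"
    Src="~/UserControls/CrawlingNavigationControl.ascx" %>

<!-- At the bottom of the template's form -->
<TropicalGreen:CrawlingNavigation id="crawlingNavigation" runat="server" />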
