High availability is an
important topic for most server-based products. It is
often expressed as a percentage of “uptime”: the share of total
time during which the server is accessible and functional for users, as
opposed to time when it is not accessible.
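As a quick illustration of the uptime arithmetic, the following Python sketch converts between downtime and an uptime percentage. The function names and the sample figures are illustrative assumptions, not values taken from SharePoint or from any service-level agreement.

```python
def uptime_percentage(total_hours: float, downtime_hours: float) -> float:
    """Return uptime as a percentage of the total measured period."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= downtime_hours <= total_hours:
        raise ValueError("downtime_hours must be between 0 and total_hours")
    return 100.0 * (total_hours - downtime_hours) / total_hours


def allowed_downtime_hours(total_hours: float, target_percent: float) -> float:
    """Return how many hours of downtime a given uptime target permits."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    return total_hours * (1 - target_percent / 100.0)


# Illustrative figures only: a 365-day year has 8,760 hours.
HOURS_PER_YEAR = 365 * 24
print(f"{uptime_percentage(HOURS_PER_YEAR, 24):.3f}% uptime with one day of downtime")
print(f"{allowed_downtime_hours(HOURS_PER_YEAR, 99.9):.2f} hours of downtime allowed at 99.9%")
```

Working the numbers this way makes it easier to see how little downtime a high uptime target actually leaves over a year.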
SharePoint
provides many mechanisms for achieving high availability. This section
covers some of the more common tools and configurations you can
implement to maximize and secure the operation of your installation.
Examining the Management Pack
A free SharePoint management pack
is available for Microsoft’s System Center
Operations Manager (SCOM). The management pack is titled Microsoft
SharePoint 2010 Products Management Pack and can be downloaded from http://download.microsoft.com.
This management pack must
be installed on a SCOM server that is monitoring the SharePoint
deployment. It deploys several rules that monitor the
SharePoint farm and enables you to attach custom actions or workflows
that run when a rule is violated. Three PPS-specific triggers
are included; you can create notification rules that fire when they are
tripped, as follows:
Service Availability: Monitors the PPS Service and triggers an action when the service stops running.
Database Availability: Monitors the availability of PPS Service Application databases and triggers an action if the database becomes unavailable.
Unattended Service Account Status: Monitors the validity of the Unattended Service Account and triggers an action if the account can no longer authenticate. Often this catches expired passwords or domain connectivity problems.
Tip
SCOM was formerly known as Microsoft Operations Manager (MOM), so the management pack is still frequently referred to as a MOM pack.
Examining Network Load Balancing
SharePoint 2010 supports most standard network load-balancing schemes. Load balancing
refers to the distribution of the workload across multiple computers. A
load-balancing scheme can be implemented through either a hardware or a
software solution. The network load-balancing
scheme is configured entirely outside SharePoint.
To use network load
balancing effectively, it is important to configure Alternate Access
Mappings (AAM) in SharePoint. AAMs are important because they allow
SharePoint to properly recognize where users are coming from, especially
in cases where there are multiple ways to access a SharePoint web
application. In deployments involving reverse proxies or load balancing,
the URL that the user enters can differ from the URL that is passed
to SharePoint. Hence, when SharePoint generates links, it needs to know
which public
entry points exist for the site (Public URLs), which URLs
the load balancer could pass along (Internal URLs), and the mappings
between them.
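The relationship between Internal URLs and Public URLs can be pictured as a simple lookup table. The Python sketch below is a conceptual model only, not how SharePoint implements AAM; the host names (`sp-wfe01`, `sp-wfe02`, `portal.example.com`) and the function name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mappings: Internal URL (as delivered by the load balancer)
# -> Public URL (what the user typed and should see in generated links).
MAPPINGS = {
    "http://sp-wfe01:8080": "https://portal.example.com",
    "http://sp-wfe02:8080": "https://portal.example.com",
}


def to_public_url(request_url: str, mappings: dict) -> str:
    """Rewrite the scheme and host of request_url using the mapping table."""
    parts = urlsplit(request_url)
    internal_root = f"{parts.scheme}://{parts.netloc}".lower()
    public_root = mappings.get(internal_root)
    if public_root is None:
        # No mapping found: the generated link keeps the internal host,
        # which is the symptom of a missing or incorrect mapping.
        return request_url
    public = urlsplit(public_root)
    return urlunsplit(
        (public.scheme, public.netloc, parts.path, parts.query, parts.fragment)
    )


print(to_public_url("http://sp-wfe01:8080/sites/bi/default.aspx?x=1", MAPPINGS))
```

In this model, a request that arrives on an unmapped internal address yields links that point at the internal host rather than the public one.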
Tip
If you do not configure
AAM for a web application, traffic may be diverted to one web front-end
(WFE) server or you may experience Internet Information Services (IIS)
or SharePoint errors.
To check your configuration, browse to a site collection
in the web application and look at the URL in the browser. Does the address match
the hostname you are attempting to use? If yes, your AAM
configuration is correct. Does the address revert to the name of a web
application? If yes, check your AAM settings again.
For more information on configuring AAM, see Microsoft SharePoint 2010 Unleashed (0672333252).
Configuring Multiple Application Servers
SharePoint automatically
configures multiple application servers. When requests come to the
application servers, SharePoint applies a round-robin scheduling
mechanism. With SharePoint 2010, round-robin scheduling is the only
available load-balancing scheme. The scheme is implemented at the
per-service level, not at the machine level. This means, for example, that
Excel Services and PPS each have their own schedule. The first request
to Excel Services will go to the first Excel Services service that was
started, and the first request for PPS will go to the first PPS service
that was started.
Round-Robin Scheduling
Round-robin scheduling
is a simple form of scheduling that rotates requests among the
available application servers that are running the PPS Monitoring
Service.
Take a look at the server farm configuration in Figure 1, which includes one WFE server with three application servers.
The
first request that requires a call to the application server will go to
Application Server 1. The second request that requires a call to the
application server will go to Application Server 2, and the third
request will go to Application Server 3. When a fourth request comes in,
it is routed to Application Server 1, and the cycle
repeats.
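The rotation just described, together with the rule that each service keeps its own schedule, can be sketched in a few lines of Python. This is a conceptual model under stated assumptions, not SharePoint's internal code; the class name and server names are hypothetical.

```python
from itertools import cycle


class ServiceScheduler:
    """Keeps one independent round-robin cycle per service."""

    def __init__(self, servers_by_service):
        # Each service gets its own iterator, so a request to one service
        # never advances the rotation of another service.
        self._cycles = {
            service: cycle(list(servers))
            for service, servers in servers_by_service.items()
        }

    def next_server(self, service):
        """Return the server that should handle the next request."""
        return next(self._cycles[service])


# Hypothetical farm: both services run on the same three application servers.
scheduler = ServiceScheduler({
    "PPS": ["Application Server 1", "Application Server 2", "Application Server 3"],
    "Excel Services": ["Application Server 1", "Application Server 2", "Application Server 3"],
})

for request_number in range(1, 5):
    print(f"PPS request {request_number} -> {scheduler.next_server('PPS')}")
print(f"Excel Services request 1 -> {scheduler.next_server('Excel Services')}")
```

Note that the scheduler stores no load or health information about the servers; it only remembers its position in each rotation.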
Round-robin scheduling
is one of the simplest forms of load balancing available. It works well
in a SharePoint farm because the servers do not need to communicate
detailed information about their current states. Often this detailed
server information is outdated by the time an available application
server has been identified to handle the call.
When a number of calls to a
web service fail, SharePoint posts an error to the Unified Logging
Service (ULS) trace logs and the Application event log. In addition,
SharePoint stops the service and prevents it from servicing further
requests. To resume its function, the service must be restarted.
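The failure behavior described above can be modeled as a simple counter. The sketch below is illustrative only: the text does not state how many failures trigger the shutdown or whether they must be consecutive, so the threshold of 3 is an assumption, as are the class and attribute names, and the in-memory list merely stands in for the ULS trace log and Application event log.

```python
class ServiceInstance:
    """Models a service that is stopped after too many failed calls."""

    def __init__(self, name, max_failures=3):
        self.name = name
        self.max_failures = max_failures  # assumed threshold, not a documented value
        self.failures = 0
        self.running = True
        self.log = []  # stand-in for the ULS trace log and event log

    def record_failure(self):
        """Count a failed call; stop the service once the threshold is reached."""
        if not self.running:
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.running = False
            self.log.append(f"{self.name}: stopped after {self.failures} failures")

    def restart(self):
        """An explicit restart is required before requests are served again."""
        self.failures = 0
        self.running = True


instance = ServiceInstance("PPS on Application Server 2")
for _ in range(3):
    instance.record_failure()
print(instance.running, instance.log)
instance.restart()
print(instance.running)
```

The key point the model captures is that the stopped state is sticky: nothing brings the service back until it is restarted.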
Starting and stopping the PPS
Monitoring Service on individual application servers is the best way to shape the load on the application
servers in a farm. There should be no downtime when you do this.