Voice resiliency in Lync Server 2010 is achieved by
providing endpoints with a primary and backup registrar service. The
registrar service existed in Office Communications Server 2007 R2 as
part of the Front End Service, but has been separated into its own role
in Lync Server 2010 to provide failover capabilities for voice features.
When Lync endpoints sign in, they are informed through in-band
signaling of both a primary and backup registrar pool associated with
their account. The primary registrar pool will typically be the Front
End pool where the user account is homed, except in branch office
scenarios. There are two different voice resilience scenarios that
should be accounted for: datacenter survivability and branch site
survivability.
Datacenter Survivability
To provide the failover capability when the primary
pool is unavailable, each Front End pool can be assigned a backup pool.
This can be another pool in the same site, or more commonly will be a
pool in a separate datacenter across a WAN link. Using a separate
datacenter ensures the voice services are available in the event of a
primary datacenter site failure. When a backup pool is assigned, clients
receive information at sign-in about which pool is the primary and which pool
is the backup. Figure 1 shows what happens in a datacenter failover scenario.
Note
Under normal operating circumstances, endpoints use
the registrar service located on the primary pool. In the event the
endpoint cannot contact the primary pool, it will attempt to contact the
backup. If the backup pool determines the endpoint’s primary pool is
indeed unavailable, it will accept the registration for the user.
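The registration behavior in this note can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Lync code: the Pool class, its reachable flag, and the choose_registrar function are hypothetical names, and real endpoints negotiate this over SIP rather than through function calls.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    """Hypothetical stand-in for a registrar pool."""
    name: str
    reachable: bool = True  # can the endpoint contact this pool?


def choose_registrar(primary: Pool, backup: Pool,
                     backup_sees_primary_down: bool) -> Optional[Pool]:
    """Return the pool that accepts the registration, or None."""
    # Normal operation: the endpoint registers with its primary pool.
    if primary.reachable:
        return primary
    # The backup accepts the user only if it agrees the primary is down.
    if backup.reachable and backup_sees_primary_down:
        return backup
    return None
```

For example, with the primary unreachable and the backup agreeing that it is down, choose_registrar returns the backup pool. If the backup still considers the primary healthy, the registration is not accepted.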
As indicated previously, the backup pool continuously
monitors whether the primary pool is available.
This monitoring is accomplished through the use of heartbeat messages
exchanged between the two pools across a WAN link. Only after the backup
pool stops receiving heartbeat messages from the primary will it mark
the pool as offline and begin accepting user registrations. The default
timeout interval for the heartbeat messages is 300 seconds, but it can be
modified by an administrator if a longer or shorter timeout period is
required.
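The heartbeat timeout logic can be sketched as follows. This is an illustrative model only: the HeartbeatMonitor class and its method names are hypothetical, the timeout is expressed in whatever unit the supplied clock returns, and the real pools exchange heartbeats over the network rather than through method calls.

```python
import time
from typing import Callable


class HeartbeatMonitor:
    """Hypothetical model of a backup pool watching its primary."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout          # administrator-configurable interval
        self._clock = clock             # injectable so the logic is testable
        self._last_seen = clock()       # time of the most recent heartbeat

    def record_heartbeat(self) -> None:
        """Call whenever a heartbeat arrives from the primary pool."""
        self._last_seen = self._clock()

    def primary_offline(self) -> bool:
        """True once no heartbeat has arrived within the timeout."""
        return self._clock() - self._last_seen > self.timeout
```

Injecting the clock keeps the sketch deterministic: a test can advance a fake clock past the timeout and confirm that the primary is marked offline, then record a heartbeat and confirm that it is considered online again.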
Pools in different sites can serve as backup
registrars for each other. For example, Company ABC can have Lync Server
2010 Front End pools in both San Francisco and Chicago. Each pool can
serve as a backup registrar for the other so that users will still have
voice services through the opposite registrar in the event of a pool
failure.
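As a small illustration of reciprocal pairing, the sketch below checks a mapping of pools to their backups and reports any pool whose backup does not point back at it. The pool names and the unpaired_pools function are hypothetical, and reciprocal pairing is optional, so a non-empty result is not necessarily a misconfiguration.

```python
from typing import Dict, List


def unpaired_pools(backup_of: Dict[str, str]) -> List[str]:
    """Return pools whose assigned backup does not use them as its backup."""
    return sorted(
        pool for pool, backup in backup_of.items()
        if backup_of.get(backup) != pool
    )


# Hypothetical names for the San Francisco and Chicago pools.
pairs = {"sf-pool": "chi-pool", "chi-pool": "sf-pool"}
print(unpaired_pools(pairs))  # every pool is reciprocally paired
```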
When clients are using a backup registrar service,
not all features are available. Features that are available in a
failover scenario include
PSTN Calls—
Outbound calls should have no issues, but inbound call availability is
dependent on the PSTN carrier service delivering inbound calls to the
backup location. The capability for this feature varies depending on
location and carrier.
Internal Calls— Internal voice calls are possible between users in the same site and to additional sites.
Call Control—
Users are able to use basic call features such as hold and transfer.
Advanced features such as call forwarding, simultaneous ringing, and
team call are also available in a failover scenario.
Instant Messaging— Instant messaging service is available, but only between two parties. No instant messaging conferencing services are available.
Audio/Video Calls— Audio and video calls are available between two parties only. Audio/video conferencing services are unavailable.
Call Detail Records— As long as the backup pool is associated with a CDR collector, data is still added.
Features that are unavailable to users in a failover scenario include
Conferencing Auto Attendant— The dial-in conferencing attendant service and all scheduled meetings with the conference bridge are unavailable.
Conferencing—
Any type of conferencing is unavailable. This applies to instant
message, meeting, and audio/video conferences involving more than two
parties.
Presence-Based Routing— Because
presence data is unavailable, calls to users homed on the failed pool
are not routed according to presence. For example, if a user
sets the status to Do Not Disturb, instant messages and phone calls are
still delivered.
Call Park— The ability to park calls is unavailable to users homed on the failed pool.
Response Group Service— Any workflows and queues associated with the failed pool are unavailable. Agents will be unable to sign in.
Call Forwarding Settings—
Although call forwarding capabilities remain in effect during a
failover, users are unable to update or change their call forwarding
settings.
Voicemail Delivery— Assuming the users’ Exchange servers were part of a site failure, new voicemail messages cannot be delivered.
Voicemail Retrieval— Again, assuming the users’ Exchange servers were part of a site failure, voicemail messages cannot be accessed.
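The two lists above can be condensed into a simple lookup. This is a sketch under stated assumptions: the feature labels and the available_in_failover function are hypothetical, voicemail is treated as working whenever Exchange is still reachable (an inference from the text), and the carrier and CDR collector conditions noted above are ignored.

```python
# Feature names are hypothetical labels for the items listed above.
AVAILABLE = frozenset({
    "pstn_calls", "internal_calls", "call_control",
    "instant_messaging", "audio_video_calls", "call_detail_records",
})
UNAVAILABLE = frozenset({
    "conferencing_auto_attendant", "conferencing", "presence_based_routing",
    "call_park", "response_group_service", "call_forwarding_settings",
})
# Assumption: voicemail works only while Exchange is still reachable.
EXCHANGE_DEPENDENT = frozenset({"voicemail_delivery", "voicemail_retrieval"})


def available_in_failover(feature: str, exchange_reachable: bool) -> bool:
    """Report whether a feature works while on the backup registrar."""
    if feature in AVAILABLE:
        return True
    if feature in UNAVAILABLE:
        return False
    if feature in EXCHANGE_DEPENDENT:
        return exchange_reachable
    raise KeyError(f"unknown feature: {feature}")
```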
Tip
If resilient voice services are required,
organizations should plan to designate a backup registrar pool in the
event of a primary pool outage. The WAN link between the two pools
should be sufficient to support the additional voice traffic and ideally
be resilient.
Branch Site Survivability
Very similar to how voice resiliency can be achieved
between multiple datacenters, branch office survivability depends
heavily on using a separate registrar service. The size of the branch
site is a major factor in determining how the resiliency is achieved.
For branch sites of more than 5,000 users, it makes
sense to deploy an entire Lync Server 2010 Front End pool in a highly
available configuration where user accounts are homed. Just as in a
datacenter failover scenario, user accounts use the local pool as a
primary registrar and a pool across the WAN link as a backup registrar. Figure 2 shows how the primary and backup registrars operate for a branch office user.
Medium-sized sites of 1,000 to 5,000 users can deploy
a survivable branch server, which is a server running the registrar
service with a collocated Mediation Server role. User accounts are homed to the
survivable branch server, but still receive the majority of services
from an associated Front End pool in a central site. The survivable
branch server is designated as the primary registrar for the users so
that when they sign in, they will register to a local server.
Conferencing services and web component services are still always
accessed across a WAN link to the pool associated with the branch site.
The survivable branch server is typically paired with a local IP/PSTN
gateway to provide branch users with a local route to the PSTN for inbound
and outbound calls.
For branch office deployments of 25 to 1,000 users, dedicated hardware devices called survivable branch appliances
are available from third-party Microsoft partners. These survivable branch
appliances are similar to a survivable branch server, but combine a
registrar service, Mediation Server, and IP/PSTN gateway in one
hardware device. These devices are typically more economical than
deploying a separate server and IP/PSTN gateway, so it makes sense to
use them in small branch offices. As with the survivable
branch server, users in the branch office use the survivable branch
appliance as their primary registrar.
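The sizing guidance in this section can be summarized as a small decision function. This is a sketch only: the function name is hypothetical, the text gives no guidance for sites under 25 users, and because the stated ranges meet at 1,000 and 5,000 users, the choice to assign those boundary values to the smaller option is an assumption.

```python
def recommend_branch_deployment(user_count: int) -> str:
    """Map a branch site's user count to the option described in the text."""
    if user_count < 25:
        # The text does not cover sites this small.
        raise ValueError("no guidance for sites under 25 users")
    if user_count <= 1000:
        return "survivable branch appliance"
    if user_count <= 5000:
        return "survivable branch server"
    return "Front End pool"
```

A site with 3,000 users, for example, maps to a survivable branch server, while a site with 6,000 users maps to a full Front End pool.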
In the event of a WAN outage, users in branch sites
will have continued voice services through the use of a survivable
branch server or appliance because the primary registrar service is
local to the site. Not all features are available to users when the WAN
link is unavailable. Figure 3 shows how a branch office user remains connected to their local registrar service when a WAN link is unavailable.
The features available during a WAN outage include
PSTN Calls— Inbound and outbound calls are possible because the local IP/PSTN gateway is available and not dependent on the WAN link.
Internal Calls—
Internal voice calls between users within the branch site have no
issues. Calls to users in another site must leverage PSTN rerouting to
be completed.
Call Control—
Users are able to use basic call features such as hold and transfer.
Advanced features such as call forwarding, simultaneous ringing, and
team call are also available during a WAN outage.
Instant Messaging—
Instant messaging services are available, but only between two parties
within the branch site. No instant messaging conferencing services are
available.
Audio/Video Calls— Audio and video calls are available only between two parties within the branch site. Audio/video conferencing services are unavailable.
Call Detail Records— Call detail records continue to queue on the primary registrar and are delivered after the WAN link is restored.
Audio Conferencing—
Audio conferencing attendants can be used, but only by dialing the PSTN
access numbers so that calls are routed through the local PSTN gateway
to their associated datacenter site.
Voicemail Retrieval—
Assuming the Exchange infrastructure exists in a primary datacenter,
users can still dial the PSTN number for Outlook Voice Access to
retrieve voicemail.
Voicemail Deposit—
Assuming the Exchange infrastructure exists in a primary datacenter,
survivable branch servers and appliances can reroute voicemail delivery
across the PSTN. The rerouted calls reach an Exchange Unified Messaging
auto attendant that only accepts voicemail and does not allow transfer to
users.
Features that are unavailable to users during a WAN outage include
Cross-Site Communication— Any form of IP-based communication to other sites is unavailable. Only voice calls rerouted across the PSTN can reach users in other sites.
Conferencing—
Any type of conferencing is unavailable. This applies to instant
message, meeting, and audio/video conferences involving more than two
parties because the multipoint control units (MCUs) exist at the
datacenter pool where user accounts are homed.
Presence-Based Routing—
Because presence data is unavailable, calls to users homed on the
inaccessible pool across the WAN link are not routed according to
presence. For example, if a user sets the status to Do Not Disturb, instant messages and phone calls are still delivered.
Call Park— The ability to park calls is unavailable to users whose pool is inaccessible across the WAN link.
Response Group Service— Any workflows and queues associated with the pool across the WAN link are unavailable.
Call Forwarding Settings—
Although call forwarding capabilities remain in effect during a
failover, users are unable to update or change their call forwarding
settings.
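The internal call behavior described for a WAN outage can be sketched as a routing decision. The function name and the returned labels are hypothetical, and the assumption that cross-site calls normally travel over the WAN when it is available is an inference from the text rather than something it states.

```python
def route_internal_call(caller_site: str, callee_site: str,
                        wan_up: bool) -> str:
    """Pick a path for a voice call between two internal users."""
    # Calls within the same branch site never depend on the WAN link.
    if caller_site == callee_site:
        return "local"
    # Assumption: cross-site calls use the WAN whenever it is available.
    if wan_up:
        return "wan"
    # During a WAN outage, cross-site calls are rerouted across the PSTN.
    return "pstn_reroute"
```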
If the pool, survivable branch server, or survivable
branch appliance in the branch site becomes unavailable,
users begin to use the backup registrar service located across the
WAN link. In this scenario, users do not experience any loss in
functionality. Additional bandwidth might be used on the WAN link,
but clients can still access the PSTN through IP/PSTN gateways located
in the backup site.