5. Flexible Single Master Operations (FSMO) Placement
The placement of FSMO role holders, sometimes
referred to as operations masters, is an important design consideration.
Although FSMO placement is important to ensure high accessibility by
clients, DCs, and GCs for such things as password changes, new account
creations, Group Policy modifications, and Time Services, recovery of
FSMO role holders is equally critical.
A brief review of FSMO roles will help determine
where the DCs hosting those roles should be placed. The sections
following describe the potential impact to the environment at the loss
of each FSMO role holder. This helps determine how important the
availability of that DC is and what kind of support effort needs to be
expended to restore it after a failure (that is, does it
require immediate attention, or can it be handled at a lower severity
level?). This also helps determine whether another DC should seize the
role, and if so, when that should be done.
DCPromo
When demoting a DC that is an FSMO role holder, the
roles should be transferred to another DC first. However, the demotion
process identifies any FSMO roles held by the DC and transfers them to
another DC. This usually works successfully, but the DC that DCPromo
chooses to receive the roles might not be the one you want. Make sure to
transfer the roles before demoting a DC if possible. If the DC has to
be forcefully demoted, or if the DC can never be brought back online,
then seizure is the solution to making the roles available again.
Forest-Wide Roles
These roles relate to operations in the schema and
configuration naming contexts (NCs), and apply to every DC in the
forest. As such, any DC in any domain in the forest can hold these
roles. The two forest-wide roles are schema master and domain naming
master.
Schema Master
Availability of the schema master is required only
when modifications to the schema take place. This could include
execution of Exchange's ForestPrep utility, which adds classes and
attributes to the schema in preparation for installing Exchange;
execution of the ADPrep utility to prepare a Windows 2000 forest to be
upgraded to Windows Server 2003; execution of Windows 2003's Domain Rename operation;
and installation of third-party applications that modify the schema.
However, on a day-to-day operational level, loss of this role holder
will not affect the user population.
If the original role holder can come back online
before any schema modifications must take place, don't move the role to
another DC. In most cases, the schema master can be left offline until
the original is restored.
Domain Naming Master
Contact with the domain naming master is required to
create or delete domains (during DCPromo) and for the Domain Rename
process as well as other operations that require modifications of the
domain structure. Loss of the domain naming master role holder does not
have an immediate impact on the forest and usually does not require
seizing the role. If the role must be moved, choose a DC that has good
network connectivity to the other DCs as well as the server resources to
handle the extra load. The domain naming master role must be held by a
GC server.
note
If you have a mixed Windows 2000 and 2003 domain in
the forest (that is, Windows 2000 and Windows Server 2003 DCs are in the
domain), you must put the domain-naming master on a Windows Server 2003
DC so it can support application partitions.
Domain-Wide Roles
The Relative IDentifier (RID) master, Primary Domain
Controller (PDC) Emulator, and infrastructure master are the three
operation master roles whose scope is the domain. Thus, each domain
contains DCs with these roles.
RID Master
Each security principal (user, computer, or group) is
identified in AD by a unique Security IDentifier (SID). The SID
consists of two parts: a domain SID (a value that is unique to the
domain) and an RID. All security principals in the domain share the same
domain SID, and each has its own unique RID; together they form the
object's unique SID.
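The two-part structure of a SID can be illustrated with a short Python sketch. The domain SID below is a made-up value used only for illustration, and the helper names `make_sid` and `split_sid` are hypothetical; they are not part of any Windows API.

```python
from typing import Tuple


def make_sid(domain_sid: str, rid: int) -> str:
    """Append a RID to a domain SID to form an object's full SID."""
    return f"{domain_sid}-{rid}"


def split_sid(sid: str) -> Tuple[str, int]:
    """Split a full SID string into (domain SID, RID)."""
    domain_sid, _, rid = sid.rpartition("-")
    return domain_sid, int(rid)


# Hypothetical domain SID, for illustration only.
DOMAIN_SID = "S-1-5-21-1111111111-2222222222-3333333333"

user_sid = make_sid(DOMAIN_SID, 1105)
group_sid = make_sid(DOMAIN_SID, 1106)

# Both objects share the domain part and differ only in the RID.
print(split_sid(user_sid))
print(split_sid(group_sid))
```

Because the domain part is identical for every principal in the domain, uniqueness of the full SID depends entirely on the RID never being issued twice.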
RIDs are assigned by a DC in the domain in which the
security principal resides. Because RIDs must be unique, a single DC
holds the FSMO “RID master” role. The RID master is the single source
for generating RIDs and handing them out to the DCs, thus ensuring that
two DCs don't give out the same RID to an object.
The RID master allocates blocks of RIDs to DCs to
allow them to create new accounts (user and computer) and groups, and
assign a unique SID to each account. The RID master allocates blocks of
500 RIDs at a time to each DC and when that block is 50% depleted,
another block is allocated to the DC. Thus, the DC has a considerable
buffer to allow it to create accounts even if connectivity to the RID
master is broken. Loss of this role holder is not critical unless a
large number of accounts are created, such as during a migration, or an
application is run that creates large numbers of accounts. The size of
the RID pool allocated to each DC can be modified in the Registry at
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\RID Values
by setting the RID Block Size value (REG_DWORD)
to a value greater than 500. Setting it to less than 500 leaves the
default setting of 500 in place. Microsoft recommends leaving this at
the default, but if you do modify it, note that setting it abnormally
high can have “adverse effects on the domain longevity,” although the
reasoning for these recommendations is not given (see Microsoft KB
article 316201, “RID Pool Allocation and Sizing Changes in Windows 2000
SP4”).
note
Windows 2000 preSP4 requested a new RID pool for a DC
when its current pool was 80% depleted, that is, when about 100 RIDs
remained. When performing migrations or other automated tasks in which
large numbers of users, groups, or computer accounts were created, this
could deplete the RID pool quickly. Lowering the threshold for a refresh
of the pool is intended to minimize the probability of exhausting the
RID pool, which would leave a DC unable to create accounts if the RID
master cannot be contacted or the pool cannot be refreshed in time.
The threshold for Windows 2000 SP4 and Windows Server 2003 RID masters to refresh the DCs' RID pool is now 50%.
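The effect of the two thresholds can be checked with simple arithmetic. The sketch below uses the default block size of 500 and the 80% and 50% figures given in the text; the function name `rids_left_at_refresh` is hypothetical.

```python
def rids_left_at_refresh(block_size: int, depleted_pct: int) -> int:
    """RIDs still unused at the moment the DC asks for a new block."""
    return block_size * (100 - depleted_pct) // 100


DEFAULT_BLOCK = 500  # default block size stated in the text

pre_sp4 = rids_left_at_refresh(DEFAULT_BLOCK, 80)   # Windows 2000 before SP4
sp4_plus = rids_left_at_refresh(DEFAULT_BLOCK, 50)  # Windows 2000 SP4 / Server 2003

print(pre_sp4, sp4_plus)  # 100 250
```

With the 50% threshold, a DC that cannot reach the RID master still has 250 RIDs in hand rather than 100, which is the buffer the paragraph above describes.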
One issue that surfaced in Windows 2000 was the case
of an RID master being brought back online when the role had been
seized. This sometimes caused duplicate SIDs to be assigned because the
old RID master had not replicated to find out that it wasn't the RID
master anymore. Windows 2000 postSP2 and Windows 2003 changed this
behavior by requiring the RID master to complete one full sync of its
domain NC before advertising itself as the RID master and handing out RIDs.
PDC Emulator
The PDC Emulator has a number of critical roles, many of which affect users. These functions include the following:
Password changes:
Password changes are recorded on the PDC Emulator as well as the
authenticating DC to ensure the new password can be used until it is
fully replicated to all DCs.
Group Policy editing:
Group Policy Objects (GPOs) are modified preferentially on the PDC to
permit a single source for changes and reduce the probability of losing
changes to the “last writer.”
Time Services:
The PDC in each domain is responsible for time synchronization of all
DCs, which in turn are authoritative time sources for all clients in the
domain. In a multiple-domain forest, the root domain PDC is the time
source for PDCs in other domains in the forest.
Account lockouts:
Account lockouts are processed on the PDC Emulator right away. This
blocks attacks that try to guess accounts and passwords against multiple
DCs before the lockout would otherwise reach them through normal
replication.
Acting as the domain master browser:
This is a role that has been associated with the PDC since Windows NT 4.0.
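The time synchronization chain described under Time Services in the list above can be sketched as a small lookup. This is only an illustration of the hierarchy as the text states it; the function name, the machine labels, and the domain names are hypothetical, and the source for the forest root PDC is left undefined because the text does not specify it.

```python
from typing import Optional


def time_source(machine: str, domain: str, forest_root: str) -> Optional[str]:
    """Return where a machine gets its time, per the hierarchy in the text."""
    if machine == "client":
        return f"a DC in {domain}"
    if machine == "dc":
        return f"PDC Emulator of {domain}"
    if machine == "pdc":
        if domain == forest_root:
            return None  # not covered by the text
        return f"PDC Emulator of {forest_root}"
    raise ValueError(f"unknown machine type: {machine}")


# Hypothetical forest: root domain example.com, child europe.example.com.
print(time_source("client", "europe.example.com", "example.com"))
print(time_source("dc", "europe.example.com", "example.com"))
print(time_source("pdc", "europe.example.com", "example.com"))
```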
The PDC Emulator plays a number of roles, and the
list is growing. You will probably note additional functions given to
the PDC Emulator as Microsoft develops the operating system (OS) and
finds additional need for a single source for functions.
Thus, the PDC Emulator failure is immediately visible
to users in a mixed-mode domain or in a domain supporting downlevel
clients because it has security implications, can cause browsing
failures, and could cause time sync failures, possibly resulting in
authentication failures and security breakdowns. The PDC Emulator is
perhaps the most critical of all role holders and should be brought back
online via transfer or seizure if it will be offline for an extended
period of time. Each organization must define this period.
Infrastructure Master
The infrastructure master is responsible for
resolving interdomain lookups. If a user from the Americas domain is
added to a group in the Europe domain, the infrastructure master
compares the user-group references that it knows about for objects in
its domain with what a GC knows about those objects. If the GC has
different information, the infrastructure master updates its data.
Loss of the infrastructure master is not serious and
won't affect users. For example, suppose an account was created for
Abigail Witbeck in the North America domain, was added to the
LondonUsers security group in the Europe domain, and replicated to other
DCs in the domain as well as GCs in the forest. Abigail sends a request
that her username be changed to Shanna Witbeck (as she prefers using
her middle name), so you change the account to Shanna Witbeck. If the
infrastructure master in the Europe domain is unavailable, the
LondonUsers group still lists the member as “Abigail Witbeck.” This
poses no security risk, and would cause confusion only if an
administrator inspected the group membership before the
infrastructure master came back online to make the change. Note that the
infrastructure master in the domain that the group lives in (Europe
domain in this example) is responsible for updating the name change in
the group membership.
The infrastructure master should not be on a GC. This
would cause the infrastructure master to fail to update other DCs,
because it updates only data that differs from the GC. If the
infrastructure master is a GC, there is no difference and the domain DCs
don't get the update.
In two instances, the infrastructure master role is irrelevant:
A single domain in which there are no cross-domain references and each DC is essentially a GC.
A multiple-domain forest in which every DC is designated as a GC; thus
there is no difference between what a DC has and what a GC has.
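The placement rule and its two exceptions can be written as a small check. This is a sketch of the logic as stated above, not a tool that queries AD; the function and parameter names are hypothetical, and the inputs would have to be gathered by other means.

```python
from typing import Optional


def infrastructure_role_irrelevant(domain_count: int, all_dcs_are_gcs: bool) -> bool:
    """True in the two cases where the infrastructure master has no work to do."""
    return domain_count == 1 or all_dcs_are_gcs


def placement_warning(holder_is_gc: bool, domain_count: int,
                      all_dcs_are_gcs: bool) -> Optional[str]:
    """Return a warning when the infrastructure master sits on a GC and it matters."""
    if holder_is_gc and not infrastructure_role_irrelevant(domain_count, all_dcs_are_gcs):
        return "infrastructure master is on a GC; move the role to a non-GC DC"
    return None


# Multi-domain forest, role holder is a GC, other DCs are not all GCs.
print(placement_warning(True, 3, False))
# Single-domain forest: placement on a GC is harmless.
print(placement_warning(True, 1, False))
```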
With the FSMO roles defined, let's examine how to
determine where to place the FSMO role holder DCs to make sure they are
able to efficiently serve their purposes.
Placement of FSMO Role Holders
In general, follow these guidelines when placing DCs that hold FSMO roles:
Have sufficient resources (memory, processor,
disk space, and so on) to handle the load. The PDC Emulator demands
more resources, typically, than the other role holders, and generally is
the only FSMO that requires special consideration for hardware
resources.
Have sufficient network
connectivity to be accessible to clients, servers, and DCs in the domain
or forest. Placing FSMO role holders at the well-connected hub sites is
a common practice. Remember that there is only one PDC Emulator in the
domain. If you have a single-domain model and have clients all over the
world, connectivity to the PDC could be an issue. Make certain the PDC
is located in a well-connected site.
Have sufficient security to protect role holders against malicious physical
attacks. Just like any DC, an FSMO role holder must be protected against
the possibility of thieves stealing the server or the disk. There have
been a number of instances of thieves stealing the disk drive out of a
DC or GC server, and thus a copy of all usernames and other sensitive
data. Don't overlook this security measure.
Have sufficient support resources (personnel and hardware replacements) to
mitigate outages. Make sure the after-hours support team has the
training and information they need to resolve problems should they occur
over nights and weekends. Make sure you have good coordination between
shifts.
Define FSMO outage in the SLA
(Service Level Agreement) so everyone knows how to handle outages (this
is related to the previous item). Determine what conditions should be
present for each FSMO role holder to justify a seizure of the role to
another machine. Make sure your staff understands these rules.
Define and maintain one or more “standby” DCs to transfer FSMO roles to in
case of an outage that requires transfer or seizure of the role(s).
These DCs should have the resources and network connectivity required to
handle the load that the role carries.
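One way to make the SLA guideline above concrete is to record, per role, what is lost during an outage and how long the organization is willing to go without it. The sketch below is only an illustration: the impact notes paraphrase the role descriptions earlier in this section, while every hour value is a hypothetical placeholder that each organization must replace with its own figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolePolicy:
    impact: str            # what is unavailable while the holder is down
    max_outage_hours: int  # HYPOTHETICAL limit before seizure is considered


# Hour values are placeholders only; take real limits from your own SLA.
SLA_POLICY = {
    "schema master": RolePolicy("schema modifications", 168),
    "domain naming master": RolePolicy("creating or deleting domains", 168),
    "RID master": RolePolicy("new RID blocks for DCs", 72),
    "PDC Emulator": RolePolicy("password changes, time sync, lockouts", 4),
    "infrastructure master": RolePolicy("cross-domain reference updates", 168),
}


def seizure_justified(role: str, expected_outage_hours: float) -> bool:
    """True when the expected outage exceeds the SLA limit for the role."""
    return expected_outage_hours > SLA_POLICY[role].max_outage_hours


print(seizure_justified("PDC Emulator", 8))
print(seizure_justified("schema master", 8))
```

Keeping the limits in one table gives the after-hours team a single place to look when deciding whether an outage warrants seizure.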
Transfer and Seizure of Roles
To move a role from one DC to another, the role should be
“transferred.” This is accomplished via the Active Directory Users and
Computers snap-in for domain-wide roles, the Active Directory Domains
and Trusts snap-in for the domain naming master role, or the Schema
Manager snap-in for the schema master role. The drawback of this
approach is that it takes three different snap-ins to manage the
five roles. The NTDSUtil.exe tool, which ships with Windows 2000 Server
and Windows Server 2003, is my personal favorite because it displays all
current role holders and handles transfer or seizure to new DCs from a
single tool. Another tool, Replication Monitor, available in the
Windows 2000 and 2003 support tools, not only allows you to see who the
role holders are for all five roles and transfer the roles, but it also
allows you to see whether the current role holders can be contacted. In
the Replication Monitor application, you can add a server using the Add
Monitored Server menu option. Once added, right-click on the server
icon, go to Properties, and then select the FSMO tab. The role holders
are listed along with a Query button. Clicking this button causes that
server to query the FSMO role holder to see whether it can be contacted.
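NTDSUtil accepts its menu commands as command-line arguments, so a transfer or seizure can be scripted. The Python sketch below only builds the argument list and does not run anything. The role phrases are the ones I recall from the NTDSUtil fsmo maintenance menu on Windows 2000 and Windows Server 2003 and should be verified on your build; the function name, dictionary keys, and server name are hypothetical.

```python
from typing import List

# Phrases that follow "transfer" or "seize" at the fsmo maintenance prompt
# (as recalled for Windows 2000 / Server 2003; verify before use).
ROLE_COMMANDS = {
    "schema": "schema master",
    "naming": "domain naming master",
    "rid": "RID master",
    "pdc": "PDC",
    "infrastructure": "infrastructure master",
}


def ntdsutil_args(target_dc: str, role: str, seize: bool = False) -> List[str]:
    """Build the ntdsutil argument list to move a role to target_dc."""
    verb = "seize" if seize else "transfer"
    return [
        "ntdsutil",
        "roles",
        "connections",
        f"connect to server {target_dc}",
        "quit",
        f"{verb} {ROLE_COMMANDS[role]}",
        "quit",
        "quit",
    ]


# Hypothetical target DC name.
print(ntdsutil_args("dc5.example.com", "pdc"))
```

Building the list separately from executing it makes it easy to review the exact command before it touches a production forest.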
tip
The fastest way to find who the role holders are for the five FSMO roles is using the Netdom command:
C:\>netdom query fsmo
Schema owner                qtest-dc22.Qtest.cpqcorp.net
Domain role owner           qtest-dc22.Qtest.cpqcorp.net
PDC role                    qtest-dc22.Qtest.cpqcorp.net
RID pool manager            qtest-dc22.Qtest.cpqcorp.net
Infrastructure owner        qtest-dc5.Qtest.cpqcorp.net
The command completed successfully.
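If the Netdom output needs to be consumed by a script, it can be parsed by splitting each line at its last run of whitespace. The following Python sketch works on the sample output shown above; the function name `parse_fsmo` is hypothetical, and the parsing assumes host names contain no spaces.

```python
from typing import Dict

SAMPLE = """\
Schema owner qtest-dc22.Qtest.cpqcorp.net
Domain role owner qtest-dc22.Qtest.cpqcorp.net
PDC role qtest-dc22.Qtest.cpqcorp.net
RID pool manager qtest-dc22.Qtest.cpqcorp.net
Infrastructure owner qtest-dc5.Qtest.cpqcorp.net
The command completed successfully.
"""


def parse_fsmo(output: str) -> Dict[str, str]:
    """Map each role label in the Netdom output to its holder's host name."""
    holders: Dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("The command"):
            continue  # skip blank lines and the status line
        label, host = line.rsplit(None, 1)
        holders[label] = host
    return holders


holders = parse_fsmo(SAMPLE)
for label, host in holders.items():
    print(f"{label:22} {host}")
```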
Transferring roles requires that the existing role
holder be online and accessible during the transfer process. The role is
moved to another DC and the original DC relinquishes the role. However,
in the case of a DC that is unavailable because of hardware failure,
network failure, and so forth, transferring the role is not possible.
Seizure of roles can be executed via the snap-in or
NTDSUtil, which simply assigns a particular DC to be the new role holder
and advertises that fact to the other DCs in the domain or forest as
needed. The danger, of course, is when the original comes back online
and it doesn't know of the role change. This scenario has been modified
somewhat in Windows 2003 to reduce problems that occurred in Windows
2000, such as duplicate RIDs being assigned.
Whenever a seizure is attempted via the snap-in or
NTDSUtil, a transfer is always attempted first. If the transfer fails,
the seizure proceeds. A seizure should be used only when it's critical
that the role become available again without waiting for the original
role holder to return.
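The "transfer first, then seize" behavior can be sketched as a simple fallback. This illustrates only the control flow described above, not how the snap-in or NTDSUtil implement it; `move_role` and the two callables are hypothetical stand-ins for whatever performs the real operations.

```python
from typing import Callable


def move_role(transfer: Callable[[], bool], seize: Callable[[], bool]) -> str:
    """Try a graceful transfer first; seize only if the transfer fails."""
    if transfer():
        return "transferred"
    if seize():
        return "seized"
    return "failed"


# Original holder reachable: the transfer succeeds and no seizure happens.
print(move_role(lambda: True, lambda: True))   # transferred
# Original holder offline: the transfer fails, so the role is seized.
print(move_role(lambda: False, lambda: True))  # seized
```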
note
Since the early days of Windows 2000, Microsoft has
always recommended that FSMO role holders never come back online after
their role has been seized. Although Windows 2003 and Windows 2000 SP3+
have made the AD more tolerant of this situation, it's best to be safe
and just wipe and reload the machine, cleaning the objects out of the
AD. (See Microsoft KB article 216498, “How to Remove Active Directory
objects after an unsuccessful DC demotion,” for more information.) Thus,
in determining whether a role should be seized, assess the impact to
the environment by going without that role as opposed to wiping and
reloading a DC or GC and cleaning up the AD.
The policy of FSMO role seizure should be defined in an SLA. This definition will vary from environment to environment.