3. BizTalk Group Disaster Recovery Procedures
A disaster recovery event for the BizTalk Group
consists of restoring the BizTalk Group databases as well as related
non-BizTalk databases on the destination system SQL Server instances.
This also includes any DTS packages as well as SQL Agent jobs that exist
in the source system (production).
The first step is to ensure that the last backup set
has been restored to all SQL Server instances that are part of the
destination system. This can be confirmed by reviewing the
Master.dbo.bts_LogShippingHistory table that is populated by the Get
Backup History SQL Agent job. When a backup is successfully restored,
the Restored column is set to 1, and the RestoreDateTime is set to the
date/time the restore was completed. When all of the databases that are
part of a backup set have been successfully restored, the backup set ID
is written to the Master.dbo.bts_LogShippingLastRestoreSet table. Once
you have confirmed that available backup files have been applied, follow
these steps on each SQL Server instance in the destination system:
  1. Navigate to the SQL Agent Jobs view.
  2. Right-click and select Disable Job to disable the following SQL Agent jobs:
     BTS Log Shipping - Get Backup History
     BTS Log Shipping - Restore Databases
  3. Right-click BTS Log Shipping - Restore To Mark and select Start Job.
Once you have verified that the BTS Log Shipping - Restore To Mark job has
completed, copy the UpdateDatabase.vbs script and the SampleUpdateInfo.xml
file to the server where the SQL Server instance is running, and execute
the following command:
cscript UpdateDatabase.vbs SampleUpdateInfo.xml
NOTE
On 64-bit servers, run the UpdateDatabase.vbs script from a 64-bit command prompt.
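The restore check described at the start of this procedure can be sketched in Python. This is only a minimal sketch, not part of the BizTalk tooling. It assumes the rows of Master.dbo.bts_LogShippingHistory have already been fetched into dictionaries. The key names BackupSetId and DatabaseName are assumptions about the table's columns (the text above names only Restored and RestoreDateTime), as is the idea that a larger backup set ID means a newer set. The database names in the sample rows are illustrative.

```python
from collections import defaultdict


def unrestored_databases(history_rows):
    """Return {backup_set_id: [names of databases not yet restored]}.

    Each row is a dict with the keys BackupSetId, DatabaseName, and
    Restored (1 once the backup has been restored, otherwise 0).
    BackupSetId and DatabaseName are assumed column names.
    """
    pending = defaultdict(list)
    for row in history_rows:
        if row["Restored"] != 1:
            pending[row["BackupSetId"]].append(row["DatabaseName"])
    return dict(pending)


def latest_set_fully_restored(history_rows):
    """True when every database in the newest backup set is restored.

    Assumes that a larger BackupSetId means a newer backup set.
    """
    if not history_rows:
        return False
    latest = max(row["BackupSetId"] for row in history_rows)
    return latest not in unrestored_databases(history_rows)


# Illustrative rows; real values come from the history table.
rows = [
    {"BackupSetId": 41, "DatabaseName": "BizTalkMgmtDb", "Restored": 1},
    {"BackupSetId": 41, "DatabaseName": "BizTalkMsgBoxDb", "Restored": 1},
    {"BackupSetId": 42, "DatabaseName": "BizTalkMgmtDb", "Restored": 1},
    {"BackupSetId": 42, "DatabaseName": "BizTalkMsgBoxDb", "Restored": 0},
]
print(unrestored_databases(rows))       # {42: ['BizTalkMsgBoxDb']}
print(latest_set_fully_restored(rows))  # False
```

A check like this would need to pass on every SQL Server instance in the destination system before the log shipping jobs are disabled.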
As promised, we next cover the disaster recovery
procedures for the BizTalk runtime servers. Later subsections cover
disaster recovery procedures for BAM and EDI functionality.
4. BizTalk Runtime Server Disaster Recovery Procedures
The BizTalk runtime servers in the destination system
should have BizTalk Server 2009 as well as any required third-party
adapters or software installed using the same guidelines for the
production BizTalk runtime servers. There are generally two methods for
setting up the BizTalk runtime servers:
Method 1: Restore BizTalk Group, configure BizTalk servers in BizTalk Group, and deploy applications.
Method 2: Configure disaster recovery BizTalk servers in production BizTalk Group, disable services, keep the server up to date, and run an update script to update locations of databases in the destination system.
Both methods have advantages and disadvantages, which are
discussed in detail next.
4.1. Method 1
To proceed with method 1, first verify that the
procedures to restore the BizTalk Group databases and related
application databases have been completed. With method 1, all software
is preinstalled on the BizTalk servers in the destination system, but
the servers are not configured and no applications are deployed. After
the BizTalk Group is restored in the destination system, configure the
BizTalk servers using Configuration.exe and select Join for the
BizTalk Group, not Create. The first server configured should have the
master secret restored on it and then be designated as the master secret
server for the BizTalk Group using the Enterprise SSO management tools.
Once all of the BizTalk servers are configured in the BizTalk Group at
the destination system, deploy the BizTalk applications (assemblies and
bindings).
While many of the steps can be scripted, this method
essentially brings online a new environment when recovering from a
disaster. At the same time, it reduces the amount of ongoing maintenance
work for the destination system to a degree, since just the latest
version of the application is deployed.
4.2. Method 2
Method 2 also has all software preinstalled, but
takes it a step further and actually configures the BizTalk servers in
the destination system to be member servers in the production BizTalk
Group. Applications (assemblies and bindings) are deployed to the
destination system BizTalk servers just like in production, except that
the BizTalk host instances and all other BizTalk-related Windows
Services are disabled and do not perform any processing in the
destination system. During a disaster recovery event, a script is run on
the destination system BizTalk servers to update the new location of
the BizTalk Group in the destination system SQL instances. Once updated,
processing can be enabled. Method 2 is recommended because it results
in a faster recovery and less change overall. To proceed with method 2,
first verify that procedures to restore the BizTalk Group databases and
related application databases have been completed.
NOTE
Path references to Microsoft BizTalk Server 2009
will instead be located in the installation directory of the earlier
BizTalk Server version if an in-place upgrade was performed when
BizTalk Server 2009 was installed. For example, if you upgraded BTS 2006
to BTS 2009, your installation directory will be
{Program Files}\Microsoft BizTalk Server 2006\.
Once verification is completed, perform these steps:
  1. Copy the edited SampleUpdateInfo.xml file to the \Program Files\Microsoft BizTalk Server 2009\Schema\Restore directory on every BizTalk server in the destination system.
  2. On each BizTalk server, open a command prompt (must be 64-bit if on a 64-bit OS) by selecting Start => Run, typing cmd, and then clicking OK.
  3. At the command prompt, navigate to the location of the edited SampleUpdateInfo.xml file and the script (\Program Files\Microsoft BizTalk Server 2009\Schema\Restore is the default), and enter this command:
     cscript UpdateRegistry.vbs SampleUpdateInfo.xml
  4. Enable and restart all BizTalk host instances and all other BizTalk services on the BizTalk servers in the destination system.
  5. Restart WMI on each BizTalk server in the destination system by selecting Start => Run, typing services.msc, and clicking OK. Then right-click Windows Management Instrumentation and select Restart.
  6. On each BizTalk server, open the BizTalk Server Administration Console, right-click BizTalk Group, and select Remove.
  7. Right-click BizTalk Server 2009 Administration, select Connect to Existing Group, select the SQL Server database instance and database name that corresponds to the BizTalk Management database for the BizTalk Group, and click OK.
  8. Restore the master secret on the master secret server in the destination system if not already completed by following the steps detailed in the subsection titled "The Master Secret Restore Procedures" earlier.
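When the same script has to be run on several BizTalk servers, the invocation in the steps above can be wrapped in a small helper. The following Python is a hedged sketch rather than anything shipped with BizTalk Server. The function names are hypothetical, the C: drive letter in the example is an assumption, and install_dir is a parameter because the directory name may differ after an in-place upgrade. The actual run must still happen from a 64-bit process on a 64-bit OS, which this sketch does not enforce.

```python
import ntpath
import subprocess


def update_registry_command(install_dir, xml_name="SampleUpdateInfo.xml"):
    """Return (working_directory, argument_list) for UpdateRegistry.vbs.

    install_dir is the BizTalk Server installation directory; Schema\\Restore
    beneath it is the default location of the script and the XML file.
    """
    restore_dir = ntpath.join(install_dir, "Schema", "Restore")
    args = ["cscript", "UpdateRegistry.vbs", xml_name]
    return restore_dir, args


def run_update_registry(install_dir):
    """Run the script on the local server (Windows only, not called here)."""
    restore_dir, args = update_registry_command(install_dir)
    # check=True raises CalledProcessError if cscript exits with an error code.
    subprocess.run(args, cwd=restore_dir, check=True)


# Example with an assumed drive letter; adjust to the real installation path.
directory, command = update_registry_command(
    r"C:\Program Files\Microsoft BizTalk Server 2009"
)
print(directory)
print(" ".join(command))
```

Separating the command construction from the execution keeps the path logic checkable on any machine, while the actual run stays a deliberate, per-server action.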
5. Restore Procedures for BAM
The BizTalk Server 2009 documentation covers these
procedures extensively, so we won't repeat them here. BAM consists of
SQL Server databases, SQL Server Analysis Services databases, and DTS packages. Refer to
the section titled "Backing Up and Restoring BAM" in the BizTalk Server
2009 documentation for the details.
6. Other Disaster Recovery Tasks
This subsection covers other tasks and recommendations related to disaster recovery. Tracking data is an important part of a BizTalk
solution, since that data can be used for reporting and as part of
recordkeeping regulations compliance. It can also be used to help
recover from a disaster, because it is a record of data processing
activity. For this reason, we recommend separating your tracking
databases from the runtime databases that generate tracking data by
configuring your databases in separate SQL Server instances on different
disks in production. Data in the tracking databases can be used to help
determine the state of the system up to the point of failure for the
runtime databases. Tracked messages and events can indicate what
processes may have already happened and what messages have been received
or sent.
NOTE
Tracking data is not written directly to the
tracking databases. Instead, it is cached on the Messageboxes and moved
to the Tracking database. Therefore, in the event of a Messagebox data
loss, some tracking data may be lost as well.
The next subsection covers steps to evaluate data loss for the BizTalk Group with tips on how to recover data.
6.1. Evaluating Data Loss for the BizTalk Group
After data loss has occurred, recovering the data is often
difficult or impossible. For this reason, using a fault-tolerant
system to prevent data loss is extremely important. Even so, a
disaster may occur, because even the most fault-tolerant system has some
chance of failure. This subsection covers methods to help determine the
state of the system when the failure occurred and how to evaluate
corrective action.
6.1.1. Managing In-Flight Orchestrations
The Messagebox databases contain the state of
orchestrations that are currently in progress. When data is lost from
the Messagebox databases, it is not possible to tell exactly what data
has been lost. Therefore, it will be necessary to examine external
systems to see what activities have occurred in relation to the
in-progress orchestrations.
Once it is determined what has occurred, steps can be
taken to restore processes. For example, if upon looking at external
systems or logs it is determined that an orchestration was activated but
didn't perform any work, the message can be resubmitted to complete the
operation.
It is important to consider what information will be
available to compare with in-flight orchestrations in order to decide
whether to terminate or resume particular in-flight orchestrations.
The available information is largely determined by the architecture and
design of the system. For example, logging that is performed
"out-of-band" avoids impacting performance while still providing an
audit of events for comparison purposes.
6.1.2. Viewing After the Log Mark in Tracking Databases
While all databases need to be restored to the same
mark in order to restore a consistent BizTalk Group, administrators can
use a Tracking database that was not lost, in Archive mode, to see what
happened after the mark. The process of evaluating the data begins by
comparing the service instances that are in flight in the BizTalk
Administration Console Operations views against their state in BizTalk
Group Hub reporting. If Group Hub reporting shows an instance as
having completed, that instance can be terminated.
BizTalk Message Tracking Reporting may show instances
that started after the point of recovery. If so, any actions these
instances took must be compensated, and then the initial activation
messages can be submitted.
Reporting may also show that instances have
progressed beyond the point that the Operations view indicates. In
this case, use the Orchestration Debugger in Reporting to see the last
shapes that were executed, and then use Message Flow to see what messages
should have been sent or received. If they do not match the state in
the Operations view, corrective action is required. The options are to
terminate, compensate and restart, or resubmit any lost messages.
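The comparison rules in this subsection can be summarized as a small decision function. The Python below is only an illustrative sketch of that triage. The class, its field names, and the returned strings are all hypothetical, and the inputs stand for findings an administrator has gathered manually from the Operations views and from tracking-based reporting. It does not call any BizTalk API.

```python
from dataclasses import dataclass


@dataclass
class InstanceEvidence:
    """Manual findings about one service instance (hypothetical fields)."""
    instance_id: str
    started_after_mark: bool      # tracking shows it started after the restore point
    completed_in_tracking: bool   # reporting shows the instance completed
    state_matches_tracking: bool  # last shapes and messages agree in both views


def recommend_action(evidence):
    """Map the findings to the corrective actions described in the text."""
    if evidence.started_after_mark:
        return "compensate actions, then submit the activation message"
    if evidence.completed_in_tracking:
        return "terminate"
    if not evidence.state_matches_tracking:
        return "review: terminate, compensate and restart, or resubmit lost messages"
    return "no corrective action"


findings = [
    InstanceEvidence("A", False, True, True),
    InstanceEvidence("B", True, False, False),
    InstanceEvidence("C", False, False, False),
    InstanceEvidence("D", False, False, True),
]
for item in findings:
    print(item.instance_id, "->", recommend_action(item))
```

The third case deliberately returns a set of options rather than a single action, because the text leaves that choice to the administrator and to what the external systems show.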
NOTE
If the BizTalk Tracking database is lost, all
discovery of what happened past the point of recovery will need to be
done using the external system's reporting mechanisms.
6.1.3. Marking In-Flight Transactions as Complete in BAM
BAM maintains data for incomplete trace instances in a
special active instance table. If some instance records were started
before the last backup but completed after the backup, those records
will remain in the active instance table because the completion records
for the instance will have been lost. Although this does not prevent the
system from functioning, it may be desirable to mark these records as
completed so that they can be moved out of the active instance table. To
accomplish this, manual intervention is necessary.
A list of incomplete ActivityIDs for a given activity
can be determined by issuing the following query against the BAM
Primary Import database:
Select ActivityID from bam_<ActivityName> where IsComplete = 0
If data from external systems indicates that the
activity instance is in fact completed, use the following query to
manually complete the instance:
exec bam_<ActivityName>_PrimaryImport @ActivityID=N'<ActivityID>', @IsStartNew=0, @IsComplete=1
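When many activity instances have to be completed, the statement above can be generated for a list of ActivityIDs. The following Python is a minimal sketch under stated assumptions: the function name is hypothetical, the activity name is restricted to letters, digits, and underscores as a conservative safeguard (not a documented BAM rule), and the output is plain text to be reviewed and then run against the BAM Primary Import database. Only the stored procedure pattern and its parameters come from the text above.

```python
def complete_activity_statements(activity_name, activity_ids):
    """Return one exec statement per ActivityID to mark it as complete.

    Only IDs that external systems confirm as completed should be passed in.
    """
    # Conservative assumption: accept only simple activity names.
    if not activity_name or not activity_name.replace("_", "").isalnum():
        raise ValueError("unexpected activity name: %r" % activity_name)

    statements = []
    for activity_id in activity_ids:
        # Double any single quotes so the ID stays inside the string literal.
        escaped = activity_id.replace("'", "''")
        statements.append(
            "exec bam_%s_PrimaryImport @ActivityID=N'%s', "
            "@IsStartNew=0, @IsComplete=1" % (activity_name, escaped)
        )
    return statements


# Hypothetical activity name and IDs, for illustration only.
for statement in complete_activity_statements("PurchaseOrder", ["PO-1001", "PO-1002"]):
    print(statement)
```

Generating the statements as text, instead of executing them directly, leaves room for a manual review before any active instance record is changed.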
7. Related Non-BizTalk Application Disaster Recovery Procedures
There may be additional
non-BizTalk applications that must be restored as part of the overall
application solution. If these application databases participate in
distributed transactions with the BizTalk Group databases, the databases
should be part of the Backup BizTalk Server SQL Agent job and restored
to the same mark as the other BizTalk Group databases. In general, each
application should have its own tailored disaster recovery plan, and
that plan should be part of the overall solution disaster recovery
plan.