Windows Error Reporting (WER) is the client
component for the overall Watson Feedback Platform (WFP), which allows
Microsoft to collect reports about failure events that occur on a user’s
system, analyze the data contained in those reports, and respond
to the user in a meaningful and actionable manner.
WER is the technology that reports user-mode
hangs, user-mode faults, and kernel-mode faults to the back-end servers
at Microsoft and replaces Dr. Watson as the default application
exception handler.
Note
WER in Windows Vista supports any kind of problem event as defined
by the developer, not just critical failures as in Windows XP.
1. Overview of Windows Error Reporting
The Watson Feedback Platform is illustrated in the high-level flow diagram in Figure 1, with Windows Error Reporting labeled as the Watson Client.
In Windows Vista, the user interface for Windows
Error Reporting is the Problem Reports and Solutions Control Panel
applet. When installing Vista, you can choose whether WER sends
basic problem reports automatically. Basic problem reports include
only the minimum amount of information necessary to search for a
solution. Later you can choose to send additional information
automatically as well. The goal of the Problem Reports and Solutions
control panel is to provide you with one location to simply and
efficiently view the problem events that have occurred on your computer,
track your reports, manage responses from Microsoft, and act on these
responses to prevent failures in the future.
One significant improvement of Windows Error
Reporting in Windows Vista is the concept of queuing. In Windows XP, WER
reports could only be sent at the time the event occurred, with few
exceptions. In Windows Vista, WER provides a flexible queuing
architecture where users, administrators, or WER integrators can
determine the queuing behavior of their WER events.
2. Error Reporting Cycle
The cycle begins when a
report is generated on a user’s system and completes when a response is
returned to the user. Overall, five primary steps are involved in this
process:
Reporting
The first step is the creation and submission
of the report. This can be triggered by a number of events, including an
application crash, application hang, or stop error (blue screen). In
Windows Vista, applications can also be designed to define their own
custom event types, allowing them to initiate the reporting process when
any type of problem occurs.
Categorization
After the back-end servers at Microsoft receive
the report, it is categorized by problem type. Categorization may be
possible with only the event parameters (text descriptors of the event),
or it may require additional data (dumps). The end result of
categorization is that the event reported by the customer is assigned a
Watson Bucket ID. This allows the developers investigating the events to
determine the most frequently reported problems and focus on the most
common issues.
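The idea of grouping reports so that the most frequent problems surface first can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Watson implementation: the `bucket_reports` function, the three-field parameter tuples, and the sample values are all hypothetical, and the real back end may also consult dump data.

```python
from collections import Counter

def bucket_reports(reports):
    """Group reports whose event parameters are identical and rank the groups.

    Each report is a tuple of string parameters (hypothetical sample format).
    Returns (parameters, count) pairs, most frequently reported first.
    """
    return Counter(reports).most_common()

# Hypothetical reports: (application name, module name, exception code)
reports = [
    ("app.exe", "render.dll", "c0000005"),
    ("app.exe", "net.dll", "c0000005"),
    ("app.exe", "render.dll", "c0000005"),
]

ranked = bucket_reports(reports)
top_parameters, top_count = ranked[0]
```

Here identical parameter tuples stand in for a bucket; a developer reading `ranked` from the top sees the most commonly reported problem first.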
Investigation
After the problem is categorized, development
teams may view the report data via the Watson portal. The Watson portal
provides the data necessary to understand high-level trends and
aggregate data, such as the top errors reported against an application.
It also provides a mechanism to investigate the low-level data that was
reported to debug the root cause of the problem.
Resolution
After a developer has determined the root cause
of a problem, ideally a fix, workaround, or new version will be created
that can be made available to the customer.
Response
The final step is to close the loop with the
customer who reported the problem by responding to the report with
information the customer can use to mitigate the issue. A customer may
receive a response in two ways:
If the issue is understood at the time
an error report is submitted, the customer can receive a response in the
form of a balloon notification immediately after the categorization
step.
If the issue is not understood at
the time an error report is submitted, but is resolved some time after
the report, the customer will be able to query for updated knowledge of
the problem at a later time.
Users can also elect to manually check for new solutions using the Problem Reports and Solutions control panel.
3. Report Data Overview
To optimize the reporting process, the WER error
data is divided into first- and second-level data. During first-level
communication with the back-end servers, WER determines if more data is
needed. If the server returns a request for more data, collection of the
second-level data begins immediately. Simultaneously, a second-level
consent dialog is displayed.
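The two-level exchange described above can be modeled roughly as follows. This is a simplified sketch, not WER's actual protocol: `submit_report` and its four arguments are hypothetical stand-ins, and the sketch runs collection and consent one after the other, whereas WER starts both at the same time.

```python
def submit_report(parameters, server_wants_more, collect_second_level, ask_consent):
    """Model the first-level / second-level exchange.

    All four arguments are hypothetical stand-ins:
      parameters            -- the first-level data sent initially
      server_wants_more     -- callable(parameters) -> bool, the server's answer
      collect_second_level  -- callable() -> the collected second-level data
      ask_consent           -- callable() -> bool, the user's decision
    Returns a list describing what was actually sent.
    """
    sent = [("first-level", parameters)]
    if not server_wants_more(parameters):
        return sent  # the server did not request more data
    # Collection begins as soon as the server asks for more data, but the
    # collected data is sent only if the user consents.
    data = collect_second_level()
    if ask_consent():
        sent.append(("second-level", data))
    return sent
```

Passing the server's answer and the consent prompt as callables keeps the sketch testable without any network or user interface.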
First-Level Data
First-level data consists of up to 10 string
parameters that identify a particular classification of the problem.
This data is stored in the report manifest file, report.wer, and is
initially submitted to the Watson back-end servers. (The report.wer file
is not itself sent—only the parameters are sent.) The included
parameters are used to identify a class of problems. For example, the
parameters for a crash (Application Name, Application Version, Module
Name, Module Version, Module Offset, AppTimeStamp, ModTimeStamp, and
ExceptionCode) provide a unique way to accurately classify a crash. The
parameters are the only data submitted to the Watson back-end during
first-level communication.
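As a rough illustration of the constraints just described (string parameters only, at most 10 of them), the following Python sketch validates a parameter list. The `make_first_level_data` helper and every sample value are hypothetical; the order simply follows the crash parameters as listed above and is not a statement about how WER lays them out.

```python
MAX_PARAMETERS = 10  # first-level data holds up to 10 string parameters

def make_first_level_data(parameters):
    """Validate first-level data and return it as a tuple of strings."""
    parameters = tuple(parameters)
    if len(parameters) > MAX_PARAMETERS:
        raise ValueError(f"at most {MAX_PARAMETERS} parameters allowed")
    if not all(isinstance(p, str) for p in parameters):
        raise TypeError("every parameter must be a string")
    return parameters

# Hypothetical values for the eight crash parameters named in the text
crash = make_first_level_data([
    "app.exe", "1.0.0.0",      # Application Name, Application Version
    "render.dll", "2.1.0.0",   # Module Name, Module Version
    "0001a2b3",                # Module Offset
    "45d6f0aa", "45d6f0bb",    # AppTimeStamp, ModTimeStamp
    "c0000005",                # ExceptionCode
])
```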
Report.wer File
Reports are stored in an archive as a folder
structure on the system. Each report subfolder contains, at a minimum,
the report manifest text file (report.wer), which describes the contents
of the error report. Although the report.wer file is a simple text
file, it is not meant to be human-readable or editable. Any files
referenced by the report are also placed in this folder. The following
major sections appear in most report.wer files:
Version
Event Information
Signature
UI
State
Files
Response
Second-Level Data
Second-level data is additional data that may
be needed to diagnose and resolve a particular bucket. Because Microsoft
usually needs only a small sample of this verbose data, the second-level
data is submitted only if the back-end server requests it and the user
consents to sharing the data. Second-level data is split into two
categories:
Safe data: This is information that the developer feels is unlikely to
contain any personal information, such as a small section of memory, a
specific registry key, or a log file.
Other data: This encompasses everything else, which may or may not contain personal information.
You have the option to always send safe data
automatically. Second-level data is specified by the back-end Watson
servers and can include, but is not limited to, the following items:
Files
Minidump
Heap
Registry Keys
WMI queries
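The effect of the "always send safe data" option on a server request can be sketched as below. This is an assumption-laden illustration: `split_by_consent`, the `(name, is_safe)` item format, and the sample items are hypothetical, and the sketch assumes that anything not marked safe always requires explicit consent.

```python
def split_by_consent(items, always_send_safe):
    """Split requested second-level items into two lists.

    items            -- list of (name, is_safe) pairs (hypothetical format)
    always_send_safe -- True if the user chose to send safe data automatically
    Returns (auto_send, needs_consent) as lists of item names.
    """
    auto_send, needs_consent = [], []
    for name, is_safe in items:
        if is_safe and always_send_safe:
            auto_send.append(name)      # covered by the standing choice
        else:
            needs_consent.append(name)  # the user must be asked first
    return auto_send, needs_consent

# Hypothetical request: a log file marked safe, a heap dump marked as other data
requested = [("app.log", True), ("heap", False)]
auto_send, needs_consent = split_by_consent(requested, always_send_safe=True)
```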
Note
For security reasons, report manifests are not allowed to reference files outside the report directory.
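A containment rule of this kind is commonly implemented by resolving the referenced path and checking that it stays under the allowed directory. The Python sketch below shows that general technique only; it is not how WER enforces the restriction, and `is_inside_report_dir` and the sample paths are hypothetical.

```python
from pathlib import Path

def is_inside_report_dir(report_dir, referenced):
    """Return True if `referenced` resolves to a path inside `report_dir`.

    Relative references are interpreted against the report directory.
    Resolving first collapses ".." components, so a reference cannot
    escape the directory by climbing upward.
    """
    root = Path(report_dir).resolve()
    target = (root / referenced).resolve()
    return root in target.parents

# Hypothetical report folder and file references
inside = is_inside_report_dir("/tmp/reports/r1", "sub/log.txt")
outside = is_inside_report_dir("/tmp/reports/r1", "../r2/secret.txt")
```

An absolute reference replaces the report directory when the two are joined, so it is rejected unless it already points inside that directory.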