Metadata is information associated with a document or file that is not
necessarily part of the visible document. It is often held in hidden
tags within the document or in files or records associated with it.
SharePoint 2010 has a powerful mechanism for assigning a large number
of properties to lists and documents. These properties are configurable
by the administrator and updatable by authors and collaborators.
In this section, we will cover how to crawl metadata such as metatags,
document properties, and SharePoint custom properties, and how to map
that metadata to managed properties so that it is available in search.
The first step in working with metadata in SharePoint search is to
become familiar with the existing property mappings and the crawled
properties. In the Search service application, under the Queries and
Results section of the left navigation, there is a link to the Metadata
Properties page. This page lists all of the managed properties and
their mappings to crawled properties. There are several default
mappings. Many of the crawled property mappings are obvious, for
example, People:AccountName (text). Others are not, for example,
Office:5 and Basic:6. Those beginning with OWS are from SharePoint list
columns.
Selecting Crawled Properties at the top of the Metadata Properties page
shows a list of all crawled properties and their respective mappings.
In some cases it is possible to glean what a specific crawled property
means, but in many cases they are a mystery. This is generally of
little consequence. What matters is naming columns that contain custom
properties with unique, descriptive names so that they can be easily
identified and mapped to managed properties.
By default, columns are indexed by the crawler, but not all of them are
mapped to a managed property, and unmapped columns are not searchable
as properties. The exception is that crawled text properties included
in the index are searchable as free text, even though they cannot be
searched explicitly as properties. To map a crawled property from a
column to a managed property, navigate to the Metadata Properties page
in the Search service application. Select New Managed Property at the
top of the page (see Figure 1).
On the New Managed Property page, the property can be defined. The name
should be indicative of the column and can simply match the column
name, but it cannot contain spaces. Users should be able to enter this
name in the search box with appropriate search syntax to return
documents with these properties, so unusual codes and mysterious naming
conventions should be avoided. Declare the property type and whether
the property will hold multiple values. The multiple values check box
is needed only if a single property entry associated with a single
record may have multiple values. It is not needed when each record
holds a single value in the column and the values merely differ from
record to record.
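
The naming advice above can be illustrated with a minimal Python sketch. It is not SharePoint code: the helper names and the sample property `ProjectCode` are hypothetical, the validity check covers only the "no spaces" rule stated here (the product may enforce further rules), and the query is assumed to use the `property:"value"` form of keyword syntax.

```python
def is_valid_managed_property_name(name: str) -> bool:
    """Hypothetical check: the name is non-empty and has no whitespace."""
    return bool(name) and not any(ch.isspace() for ch in name)


def property_query(name: str, value: str) -> str:
    """Build a property restriction of the assumed form name:"value"."""
    if not is_valid_managed_property_name(name):
        raise ValueError(f"invalid managed property name: {name!r}")
    return f'{name}:"{value}"'


# "ProjectCode" is an illustrative managed property, not a default one.
print(property_query("ProjectCode", "A-100"))
```

A name such as `Project Code` would be rejected by this check, which is one reason to pick a compact, readable name that users can type directly into the search box.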
Multiple crawled properties can be mapped to a single managed property.
This is useful when indexing several lists or libraries that have
similar columns but different headings, and hence different crawled
property names. Crawled properties from different document types or
from other sites can also be merged into a single, searchable managed
property.
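
The idea of several crawled properties feeding one managed property can be sketched in Python as a simple lookup. This is only a conceptual model: the crawled property names (`ows_Department`, `ows_Dept`, `ows_BusinessUnit`) and the managed property `Department` are hypothetical, and taking the first non-empty value in mapping order is an assumption of this sketch rather than a statement of how the search service resolves values.

```python
from typing import Dict, List, Optional

# Hypothetical mapping: one managed property fed by several crawled properties.
MAPPINGS: Dict[str, List[str]] = {
    "Department": ["ows_Department", "ows_Dept", "ows_BusinessUnit"],
}


def managed_value(item: Dict[str, str], managed_name: str,
                  mappings: Dict[str, List[str]] = MAPPINGS) -> Optional[str]:
    """Return the first non-empty crawled value, following mapping order."""
    for crawled_name in mappings.get(managed_name, []):
        value = item.get(crawled_name)
        if value:
            return value
    return None


# Two items from lists whose columns have different headings.
list_a_item = {"ows_Department": "Finance"}
list_b_item = {"ows_Dept": "Legal"}
print(managed_value(list_a_item, "Department"))
print(managed_value(list_b_item, "Department"))
```

Both items become findable through the same managed property even though their source columns were named differently.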
Select Add Mapping to find the
crawled property to map to the managed property. The Crawled Property
Selection dialog allows a category (such as SharePoint) to be chosen and
a title filter applied to narrow a potentially long list of crawled
properties. The title filter is controlled by the Find search box and
uses a "contains" operator so that any property with the entered term in
it will be returned (see Figure 2).
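
The behavior of a "contains" title filter can be sketched in a few lines of Python. The function name is hypothetical, the sample list mixes property names mentioned in this section with invented `ows_` names, and case-insensitive matching is an assumption of the sketch.

```python
from typing import List


def filter_crawled_properties(names: List[str], term: str) -> List[str]:
    """Keep every name that contains the term (case-insensitive by assumption)."""
    needle = term.lower()
    return [name for name in names if needle in name.lower()]


crawled = ["ows_Department", "ows_ProjectCode", "Office:5", "People:AccountName"]
print(filter_crawled_properties(crawled, "project"))
print(filter_crawled_properties(crawled, "ows"))
```

Because the match is on any part of the name, a short but distinctive term from the column heading is usually enough to narrow the list.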
All managed properties can be made available for use in scopes, which
divide the search index logically. This allows searching within a
particular group of documents that have specific properties. For
example, People Search in SharePoint uses a scope that restricts
queries to only the People items in the index. See more on scopes in
the next section.
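
As a rough illustration of a scope acting as a property-based subset of the index, consider the following Python sketch. The in-memory index, the `ItemKind` property, and both functions are hypothetical and exist only to show the idea. Real scopes are defined by rules in the Search service application.

```python
from typing import Dict, List, Optional

# Hypothetical index entries; "ItemKind" is an invented managed property.
INDEX: List[Dict[str, str]] = [
    {"title": "Jane Smith", "ItemKind": "Person"},
    {"title": "Smith contract", "ItemKind": "Document"},
]


def in_scope(item: Dict[str, str], rules: Dict[str, str]) -> bool:
    """An item is in scope when every rule property has the required value."""
    return all(item.get(prop) == value for prop, value in rules.items())


def search(index: List[Dict[str, str]], term: str,
           scope_rules: Optional[Dict[str, str]] = None) -> List[str]:
    """Match the term against titles, then narrow the hits by the scope rules."""
    needle = term.lower()
    hits = [item for item in index if needle in item["title"].lower()]
    if scope_rules:
        hits = [item for item in hits if in_scope(item, scope_rules)]
    return [item["title"] for item in hits]


print(search(INDEX, "smith"))
print(search(INDEX, "smith", {"ItemKind": "Person"}))
```

The same query returns fewer results once the scope is applied, which is the effect a people-only scope has on a name search.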
Finally, managed properties that are text can be stored as a hash in
the database. To do this, select the "Reduce storage requirements for
text properties by using a hash for comparison" check box when creating
a new managed property. This reduces the amount of space needed to
store the property values but limits the search operators that can be
used on the property to equality and inequality.
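
Why hashing limits the available operators can be seen in a small Python sketch. SHA-256 and the function names are assumptions chosen for illustration. The hash function SharePoint actually uses is not stated here.

```python
import hashlib


def stored_form(value: str) -> str:
    """Replace the text with a fixed-length digest (SHA-256 is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def equals(stored: str, query_value: str) -> bool:
    """Equality still works: hash the query value and compare digests."""
    return stored == stored_form(query_value)


stored = stored_form("Finance Department")
print(equals(stored, "Finance Department"))  # exact match succeeds
print(equals(stored, "Finance"))             # a partial value does not match
```

Once only the digest is kept, the original text is no longer available, so operators that depend on the text itself, such as partial or ordered comparisons, have nothing to work with.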