Things you should know about SharePoint 2016 Search


The search architecture of SharePoint 2016 consists of search components and databases that work together to perform search operations. The size and structure of the search architecture depend on the volume of your content, your availability and fault-tolerance requirements, the estimated number of page views, and the expected queries per second. The architecture can also vary with the composition of the data that is crawled, and it can be designed for enterprise search or for Internet sites.

SharePoint Search components

In SharePoint Server 2016, most search components are not hosted on web servers: components reside on application servers and databases reside on database servers. The exceptions are the query processing component and the index components, which reside on the web servers to make maximum use of the available hardware resources and to simplify scaling out the search topology. Several components are involved in search; they are listed and described in the sections below.

1. Search index

The search index is a set of files that are stored in separate folders on a server. The content processing component processes items delivered by the crawl components, maps crawled properties to managed properties, and formats the results as artifacts that can be stored in the search index. The indexes can include:

  • Full-text indexes.
  • Indexes of the managed properties (marked as retrievable or queryable).
  • An index for attribute vectors.
  • Numeric indexes.


Because the search index can grow very large, you can divide it into index partitions to improve performance and manageability. An index partition is a logical portion of the entire search index. The index comprises the index component, index partitions, and index replicas, which are described as follows:



Index component
The index component is the logical representation of an index replica. It is used both in the content ingestion process and in the content querying process. For the former, the index component receives items from the content processing component and writes them to the index files. For the latter, it receives queries from the query processing component and returns the result sets that match the query. The index component is also responsible for much of the index content management. For example, it physically reorders index content if changes to the index architecture are triggered by the search administration component.
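To make the two roles concrete, here is a minimal sketch of a component with one ingestion path and one query path. It is a toy in-memory inverted index, not how SharePoint stores its index files; the class name `ToyIndexComponent` and its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class ToyIndexComponent:
    """Toy stand-in for an index component (hypothetical, in-memory only)."""

    def __init__(self):
        # Maps each term to the set of item ids that contain it.
        self._postings = defaultdict(set)

    def add_item(self, item_id, text):
        """Ingestion path: write a processed item into the index."""
        for term in text.lower().split():
            self._postings[term].add(item_id)

    def query(self, terms):
        """Query path: return ids of items that contain every query term."""
        term_list = terms.lower().split()
        if not term_list:
            return set()
        result = set(self._postings.get(term_list[0], set()))
        for term in term_list[1:]:
            result &= self._postings.get(term, set())
        return result


index = ToyIndexComponent()
index.add_item("doc1", "quarterly sales report")
index.add_item("doc2", "sales forecast")
matches = index.query("sales report")  # only doc1 contains both terms
```

The same object serves both paths, which mirrors the description above: items arrive from content processing, and queries arrive from query processing.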

Index replicas
An index replica is a physical copy of an index partition. A replica is either a primary replica or a secondary replica. The content processing component contacts the primary replica to write new items to an index partition. Secondary replicas are read-only copies of the same data and are used to serve results. You can scale your search index in two ways:


  1. You can add index partitions to manage increasing search content volume. For example, in a farm with three index partitions, each index partition contains one-third of the entire search index.
  2. You can add index replicas to an index partition to manage high query loads or to provide increased fault tolerance. Each index partition has one or more index replicas. For example, in a farm with one index partition that contains three index replicas, each index replica serves one-third of the total queries.

Each index partition holds one or more index replicas that contain the same information. You have to provision one index component for each index replica. To achieve fault tolerance and redundancy, create additional index replicas for each index partition and distribute the index replicas over multiple application servers.
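The arithmetic behind the two scaling options can be sketched as follows. This is only a back-of-the-envelope model: the `Topology` class is a hypothetical helper, and it assumes content is spread evenly across partitions and queries are balanced evenly across replicas.

```python
from dataclasses import dataclass


@dataclass
class Topology:
    """Hypothetical model of a search index layout."""

    partitions: int              # number of index partitions
    replicas_per_partition: int  # index replicas in each partition

    def index_components(self):
        # One index component is provisioned for each index replica.
        return self.partitions * self.replicas_per_partition

    def index_share_per_partition(self):
        # Assumption: content is distributed evenly across partitions.
        return 1 / self.partitions

    def query_share_per_replica(self):
        # Assumption: queries are balanced evenly within a partition.
        return 1 / self.replicas_per_partition


three_partitions = Topology(partitions=3, replicas_per_partition=1)
three_replicas = Topology(partitions=1, replicas_per_partition=3)

print(three_partitions.index_share_per_partition())  # share of the index held by each partition
print(three_replicas.query_share_per_replica())      # share of queries served by each replica
print(three_replicas.index_components())             # index components to provision
```

Adding partitions shrinks the share of the index each one holds, while adding replicas shrinks the share of queries each one serves and increases the number of index components to provision.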

Index partitions
 This is a logical portion of the entire search index; the index is the aggregated result of all of the index partitions.
  • You can divide the index into discrete portions, each holding a separate part of the index.
  • An index partition is stored in a set of files on a disk.
  • The search index is the aggregation of all index partitions.

In SharePoint Online, content is automatically crawled based on a defined crawl schedule. The crawler picks up content that has changed and updates the index. In some cases, you may want to manually request crawling and full re-indexing of a site, a document library, or a list.


2. Query processing component 
Analyzes and processes search queries and results.


3. Search administration component

Runs system processes that are essential to search. There can be more than one search administration component per Search service application, but only one component is active at any given time.

4. Crawl Component 
Crawls content based on what is specified in the crawl databases. The crawl component crawls content sources such as file shares, SharePoint content, line-of-business applications, and many more. It connects to each content source by invoking the appropriate indexing connector or protocol handler to retrieve information, and then passes the crawled items to the content processing component.

• Retrieves content that needs to be indexed
• Brings actual content and the metadata
• Invokes the protocol handlers
• Utilizes the Crawl Database to maintain list of items to be crawled


Crawl Modes:

  • Full Crawl Mode – Discover and Crawl every possible document on the specific content source.
  • Incremental Crawl Mode – Discover and Crawl only documents that have changed since the last full or incremental crawl.
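The difference between the two modes can be sketched as a selection rule over the items in a content source. This is a simplified illustration, not the crawler's actual logic: the function `select_items` and the sample URLs are hypothetical, and the sketch only compares modification times (it ignores deletions and permission changes).

```python
from datetime import datetime


def select_items(items, mode, last_crawl=None):
    """Return the URLs to crawl.

    items maps each item URL to its last-modified datetime.
    """
    if mode not in ("full", "incremental"):
        raise ValueError(f"unknown crawl mode: {mode}")
    if mode == "full" or last_crawl is None:
        # Full crawl (or no previous crawl to compare against): take everything.
        return sorted(items)
    # Incremental crawl: only items changed since the previous crawl.
    return sorted(url for url, modified in items.items() if modified > last_crawl)


source = {
    "https://intranet.example/docs/a.docx": datetime(2024, 1, 5),
    "https://intranet.example/docs/b.docx": datetime(2024, 1, 12),
}
previous_crawl = datetime(2024, 1, 10)

full_set = select_items(source, "full")
changed_set = select_items(source, "incremental", last_crawl=previous_crawl)
```

Here `full_set` contains both documents, while `changed_set` contains only the document modified after `previous_crawl`.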

5. Content processing component 
Carries out various processes on the crawled items, such as document parsing and property mapping.

6. Analytics processing component 
Carries out search analytics and usage analytics.

7. Search administration database 
Stores search configuration data. There is only one search administration database per Search service application.

8. Crawl database
Stores the crawl history and manages crawl operations. Each crawl database can have one or more crawl components associated with it.

9. Link database
Stores the information extracted by the content processing component and also stores click-through information.

10. Analytics reporting database
Stores the results of usage analytics.


Search components and interactions

All search components reside on application servers and all databases reside on database servers. The search components and search databases can be grouped into four areas:

  • Crawl and content processes
  • Index and query processes
  • Search administration
  • Analytics processes

Crawl and content processes

The crawl and content processing architecture includes the following components, which can be scaled out based on crawl volume and performance requirements:

Crawl component
The crawl component is responsible for crawling content sources. It delivers crawled items, including both the actual content and the associated metadata, to the content processing component. The crawl component invokes connectors (protocol handlers) that interact with content sources to retrieve data. Multiple crawl components can be deployed to crawl simultaneously. The crawl component uses one or more crawl databases to temporarily store information about crawled items and to track crawl history.
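As a rough illustration of how a crawler can pick a handler per content source and record crawl history, consider the sketch below. Everything in it is hypothetical: the handler functions return placeholder data instead of contacting a real source, the `HANDLERS` table keyed by URL scheme is an assumption made for the example, and a plain dictionary stands in for the crawl database.

```python
from urllib.parse import urlsplit


def fetch_web(url):
    # Placeholder handler: a real one would request the page.
    return {"url": url, "content": "<page body>", "metadata": {"source": "web"}}


def fetch_file_share(url):
    # Placeholder handler: a real one would read the file from the share.
    return {"url": url, "content": "<file body>", "metadata": {"source": "file share"}}


# Hypothetical mapping from URL scheme to handler.
HANDLERS = {"http": fetch_web, "https": fetch_web, "file": fetch_file_share}


def crawl(start_urls, crawl_history):
    """Fetch each URL with the matching handler and record the outcome."""
    crawled_items = []
    for url in start_urls:
        handler = HANDLERS.get(urlsplit(url).scheme)
        if handler is None:
            crawl_history[url] = "skipped: no handler"
            continue
        crawled_items.append(handler(url))  # content plus metadata
        crawl_history[url] = "crawled"
    return crawled_items


history = {}  # stands in for the crawl database
crawled = crawl(
    ["https://intranet.example/sites/hr", "file://server/share/plan.txt", "ftp://old.example/x"],
    history,
)
```

Each returned item carries both content and metadata, which is what the crawl component hands to the content processing component.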

Content processing component
The content processing component sits between the crawl component and the index component. It transforms crawled items into artifacts that can be included in the search index by carrying out operations such as document parsing and mapping crawled properties to managed properties, and then sends the results to the index component.
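The two operations named above, parsing and property mapping, can be sketched in a few lines. This is a simplified illustration: the mapping table, the property names in it, and the function `process_item` are assumptions for the example, and splitting on whitespace stands in for real document parsing.

```python
# Hypothetical mapping from crawled property names to managed property names.
PROPERTY_MAPPING = {
    "ows_Title": "Title",
    "ows_Created_By": "Author",
}


def process_item(crawled_item):
    """Turn a crawled item into an artifact ready for indexing."""
    managed = {}
    for crawled_name, value in crawled_item["properties"].items():
        managed_name = PROPERTY_MAPPING.get(crawled_name)
        if managed_name is not None:
            # Only crawled properties with a mapping become managed properties.
            managed[managed_name] = value
    # Stand-in for document parsing: lowercase and split on whitespace.
    tokens = crawled_item["content"].lower().split()
    return {"managed_properties": managed, "tokens": tokens}


artifact = process_item(
    {
        "content": "Quarterly Sales Report",
        "properties": {"ows_Title": "Q3 Sales", "ows_Created_By": "Dana", "ows_Unmapped": "x"},
    }
)
```

In this sketch, crawled properties without a mapping are dropped, so `artifact["managed_properties"]` holds only `Title` and `Author`.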

Analytics processing component
Carries out search analytics and usage analytics.
