Monday, 23 July 2018

Table Scan VS Index Scan VS Index Seek


What are the differences between Table Scan, Index Scan and Index Seek?

TABLE SCAN:

The SQL Server optimizer's job is to choose the best way to execute a query. The optimizer uses indexes to improve query execution time. When you query a table that doesn't have indexes, or if the optimizer decides not to use an existing index or indexes, the system performs a table scan.

In a table scan, SQL Server sequentially (row by row) reads the table's data pages to find rows that belong to the result set.

EXAMPLE:

-- Create a Table with out Index
CREATE TABLE [dbo].[ScanSeek](
        [ScanSeekID] [int] IDENTITY(1,1) NOT NULL,
        [ScanSeekValue] [varchar](100) NULL
)
GO

-- Insert 1 Lac Records to ScanSeek table.
INSERT INTO ScanSeek (ScanSeekValue)
VALUES (CONVERT (VARCHAR(100), (SELECT ISNULL(MAX(ScanSeekID),0) + 1  FROM [ScanSeek] WITH (NOLOCK))))
GO 100000

-- Select all the records from ScanSeek table.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
GO

-- Select a record from ScanSeek table where ScanSeekID = 100.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
WHERE [ScanSeekID] = 100
GO






From the above two select queries, either to fetch all the records from table OR to fetch only one the matched record from table where ScanID=100, while seeing the execution plan the Cost is 100%, so it is obviouly bottleneck for the performance,

From the second SELECT query, the query optimizer used the table scan which means it reads/scans 100000 (1Lac) records sequentially row by row of the table for fetching the matched row.

How to overcome this issue OR to fetch only the matched records from table?
Adding/creating appropriate indices on table that helps for retriving the data fast.


INDEX SCAN:

An index scan is, scan the data or index pages to find the appropriate records, an index scan retrieves all the rows from the table, it means that all the leaf-level of the index was searched to find the information for the query,

There are two types of Index Scan.

  • Index Scan on Clustered Index
  • Index Scan On Non-Clustred Index

1.     Index Scan on Clustered Index -  In a table, when the index is a clustered index and search key column is not indexed, in this case, SQL server navigates from root level to leaf-level (data pages)  until the data match, which means that  scanning the data page by using the clustring key value as a pointer/reference.


-- Create a Table without Index
CREATE TABLE [dbo].[ScanSeek](
    [ScanSeekID] [int] IDENTITY(1,1) NOT NULL,
    [ScanSeekValue] [varchar](100) NULL
)
GO

-- Insert 1 Lac Records to ScanSeek table.
INSERT INTO ScanSeek (ScanSeekValue)
VALUES (CONVERT (VARCHAR(100), (SELECT ISNULL(MAX(ScanSeekID),0) + 1 
FROM [ScanSeek] WITH (NOLOCK))))
GO 100000

-- Create a Clustered index on ScanSeek table.
CREATE CLUSTERED INDEX [PK_ScanSeek_ScanSeekID] ON ScanSeek(ScanSeekID)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

-- Select all the records from ScanSeek table.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
GO

-- Select a record from ScanSeek table where ScanSeekValue = 100.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
WHERE ScanSeekValue = '100'
GO








From the above two select queries, either to fetch all the records from table OR to fetch only one the matched record from table where ScanID=100, while seeing the execution plan the Cost is 100%, search column [ScanSeekValue ] is not Indexed on table,

From the both SELECT queries, the query optimizer used the clustered Index scan which means retrieving the data by just using the index tree from root level to leaf-level (data pages) .

2.     Index Scan On Non-Clustred Index – a table has a cluster and  a non-clustered index and search key column is not specifed, retrieving the data by just using the index tree from root level to leaf-level pages (key values not the actual data), which means that  scanning the leaf-level pages by using the non-clustring key value as a pointer to the data pages/complete rows.

A non-cluster index built on top of the cluster index structure. SQL server uses the clustring keys in the leaf-level pages of the non-cluster index to point to the cluster index.


-- Create a Table without Index
CREATE TABLE [dbo].[ScanSeek](
    [ScanSeekID] [int] IDENTITY(1,1) NOT NULL,
    [ScanSeekValue] [varchar](100) NULL
)
GO

-- Insert 1 Lac Records to ScanSeek table.
INSERT INTO ScanSeek (ScanSeekValue)
VALUES (CONVERT (VARCHAR(100), (SELECT ISNULL(MAX(ScanSeekID),0) + 1  FROM [ScanSeek] WITH (NOLOCK))))
GO 100000

-- Create a Clustered index on ScanSeek table.
CREATE CLUSTERED INDEX [PK_ScanSeek_ScanSeekID] ON ScanSeek(ScanSeekID)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

-- Create a Non Clustered index on ScanSeek table.
CREATE NONCLUSTERED INDEX [IX_ScanSeek_ScanSeekValue] ON ScanSeek(ScanSeekValue)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

-- Select all the records from ScanSeek table.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
GO





The Clustered Index Scan operator scans the clustered index specified in the Argument column of the query execution plan. When an optional WHERE:() predicate is present, only those rows that satisfy the predicate are returned. If the Argument column contains the ORDER clause, the query processor has requested that the output of the rows be returned in the order in which the clustered index has sorted it. If the ORDER clause is not present, the storage engine scans the index in the optimal way, without necessarily sorting the output.

INDEX SEEK:

Seek uses the index to pinpoint the records that are needed to satisfy the query.

There are two types of Index Seek.

1.     Index Seek on Clustered Index -  a table has clustered index, retrieving the data by using cluster index column, In Index tree, SQL server navigates from root level to leaf-level (data pages)  until the data match, which means that  pointing the data pages by using the clustring key value as a pointer/reference.

-- Create a Table without Index
CREATE TABLE [dbo].[ScanSeek](
   [ScanSeekID] [int] IDENTITY(1,1) NOT NULL,
   [ScanSeekValue] [varchar](100) NULL
)
GO

-- Insert 1 Lac Records to ScanSeek table.
INSERT INTO ScanSeek (ScanSeekValue)
VALUES (CONVERT (VARCHAR(100), (SELECT ISNULL(MAX(ScanSeekID),0) + 1  FROM [ScanSeek] WITH (NOLOCK))))
GO 100000

-- Create a Clustered index on ScanSeek table.
CREATE CLUSTERED INDEX [PK_ScanSeek_ScanSeekID] ON ScanSeek(ScanSeekID)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)


-- Select a record from ScanSeek where ScanSeekID = 100.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
WHERE [ScanSeekID] = 100
GO







2.     Index Seek on Non-Clustered Index -  a table has clustered index and non cluster index, retrieving the data by using non-cluster index column, In Index tree, SQL server navigates from root level to data pages until the data match, which means, a non-cluster index built on top of the cluster index structure. SQL server uses the clustring keys in the leaf-level pages (key values not the actual data)  of the non-cluster index to point to the data pages.


-- Create a Table without Index
CREATE TABLE [dbo].[ScanSeek](
    [ScanSeekID] [int] IDENTITY(1,1) NOT NULL,
    [ScanSeekValue] [varchar](100) NULL
)
GO

-- Insert 1 Lac Records to ScanSeek table.
INSERT INTO ScanSeek (ScanSeekValue)
VALUES (CONVERT (VARCHAR(100), (SELECT ISNULL(MAX(ScanSeekID),0) + 1  FROM [ScanSeek] WITH (NOLOCK))))
GO 100000

-- Create a Clustered index on ScanSeek table.
CREATE CLUSTERED INDEX [PK_ScanSeek_ScanSeekID] ON ScanSeek(ScanSeekID)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

-- Create a Non Clustered index on ScanSeek table.
CREATE NONCLUSTERED INDEX [IX_ScanSeek_ScanSeekValue] ON ScanSeek(ScanSeekValue)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
-- Select a record from ScanSeek where ScanSeekID = 100.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
WHERE [ScanSeekID] = 100
GO
-- Select a record from ScanSeek table where ScanSeekValue = 100.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
WHERE ScanSeekValue = '100'
GO






LOOKUP HEAP:

A table has only non cluster index, this case SQL server manages the table as Heap,  retrieving the data by using non-cluster index column, In Index tree, SQL server navigates from root level to data pages until the data match, which means, SQL server uses the leaf-level pages (key values not the actual data)  of the non-cluster index to point to the Heap.

-- Create a Table without Index
CREATE TABLE [dbo].[ScanSeek](
    [ScanSeekID] [int] IDENTITY(1,1) NOT NULL,
    [ScanSeekValue] [varchar](100) NULL
)
GO

-- Insert 1 Lac Records to ScanSeek table.
INSERT INTO ScanSeek (ScanSeekValue)
VALUES (CONVERT (VARCHAR(100), (SELECT ISNULL(MAX(ScanSeekID),0) + 1  FROM [ScanSeek] WITH (NOLOCK))))
GO 100000

-- Create a Non Clustered index on ScanSeek table.
CREATE NONCLUSTERED INDEX [IX_ScanSeek_ScanSeekValue] ON ScanSeek(ScanSeekValue)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

-- Select a record from ScanSeek where ScanSeekID = 100.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
WHERE [ScanSeekID] = 100
GO
-- Select a record from ScanSeek table where ScanSeekValue = 100.
SELECT [ScanSeekID], [ScanSeekValue]
FROM [dbo].[ScanSeek] WITH (NOLOCK)
WHERE ScanSeekValue = '100'