SQL - Indexes



Indexes are special lookup tables that the database search engine can use to speed up data retrieval. Simply put, an index is a pointer to data in a table. An index in a database is very similar to an index in the back of a book.

For example, if you want to find all pages in a book that discuss a certain topic, you first refer to the index, which lists all the topics alphabetically and refers you to one or more specific page numbers.

An index helps to speed up SELECT queries and WHERE clauses, but it slows down data input, with the UPDATE and the INSERT statements. Indexes can be created or dropped with no effect on the data.

Creating an index involves the CREATE INDEX statement, which allows you to name the index, to specify the table and which column or columns to index, and to indicate whether the index is in an ascending or descending order.

Indexes can also be unique, like the UNIQUE constraint, in that the index prevents duplicate entries in the column or combination of columns on which there is an index.

The CREATE INDEX Command

The basic syntax of a CREATE INDEX is as follows.

CREATE INDEX index_name ON table_name (column1, column2, ...);

Single-Column Indexes

A single-column index is created based on only one table column. The basic syntax is as follows.

CREATE INDEX index_name ON table_name (column_name);
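As a rough illustration of the statement above, the following sketch runs it against an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is only a stand-in here, and the table name customers and index name idx_customers_name are hypothetical; other database engines expose their catalogs differently.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")

# Single-column index on the name column.
conn.execute("CREATE INDEX idx_customers_name ON customers (name)")

# SQLite records every index in its sqlite_master catalog table.
indexes = conn.execute(
    "SELECT name, tbl_name FROM sqlite_master WHERE type = 'index'"
).fetchall()
print(indexes)
```

The catalog query returns one row naming the new index and the table it belongs to.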

Unique Indexes

Unique indexes are used not only for performance, but also for data integrity. A unique index does not allow any duplicate values to be inserted into the table. The basic syntax is as follows.

CREATE UNIQUE INDEX index_name ON table_name (column_name);
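To see the integrity side of a unique index in action, here is a minimal sketch using Python's sqlite3 module with an in-memory SQLite database as a stand-in. The users table and idx_users_email index are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")
conn.execute("INSERT INTO users VALUES (1, 'a@example.com')")

try:
    # Same email again: the unique index should reject this row.
    conn.execute("INSERT INTO users VALUES (2, 'a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(duplicate_rejected, row_count)
```

Only the first row is stored; the second insert fails before any data is written.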

Composite Indexes

A composite index is an index on two or more columns of a table. Its basic syntax is as follows.

CREATE INDEX index_name ON table_name (column1, column2);

When deciding whether to create a single-column index or a composite index, take into consideration the column(s) that you use very frequently in a query's WHERE clause as filter conditions.

Should there be only one column used, a single-column index should be the choice. Should there be two or more columns that are frequently used in the WHERE clause as filters, the composite index would be the best choice.
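The sketch below shows a composite index being picked up when both of its columns appear as filters. It assumes SQLite (via Python's sqlite3 module) as a stand-in engine, and the orders table, its columns, and the index name are all hypothetical. The plan wording is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, customer_id INTEGER, status TEXT, total REAL)"
)
# Composite index on the two columns that are filtered together.
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)

# Ask SQLite how it would run a query that filters on both columns.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM orders WHERE customer_id = ? AND status = ?",
    (42, "shipped"),
).fetchall()
plan_detail = plan[0][-1]  # the last column holds the human-readable step
print(plan_detail)
```

The printed step names idx_orders_customer_status, which indicates that the planner chose the composite index for this filter.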

Implicit Indexes

Implicit indexes are indexes that are automatically created by the database server when an object is created. Indexes are automatically created for primary key constraints and unique constraints.
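A quick way to observe implicit indexes is to create a table with constraints and then list its indexes without ever running CREATE INDEX. This sketch assumes SQLite through Python's sqlite3 module; the accounts table is hypothetical, and the sqlite_autoindex naming is SQLite's own convention (other servers name implicit indexes differently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No CREATE INDEX statement is issued anywhere in this example.
conn.execute(
    "CREATE TABLE accounts ("
    " account_no TEXT PRIMARY KEY,"
    " email TEXT UNIQUE,"
    " balance REAL)"
)

# PRAGMA index_list rows are (seq, name, unique, origin, partial).
implicit = conn.execute("PRAGMA index_list(accounts)").fetchall()
index_names = sorted(row[1] for row in implicit)
print(index_names)
```

Two indexes appear, one backing the primary key and one backing the unique constraint.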

The DROP INDEX Command

An index can be dropped using the SQL DROP INDEX command. Care should be taken when dropping an index because performance may either slow down or improve.

The basic syntax is as follows −

DROP INDEX index_name;
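The claim that indexes can be dropped with no effect on the data can be checked with a small sketch. It assumes an in-memory SQLite database driven from Python's sqlite3 module; the products table and idx_products_sku index are hypothetical. Note that some engines require a table name in the DROP INDEX statement, so the exact syntax varies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, sku TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(i, f"SKU-{i}") for i in range(100)],
)
conn.execute("CREATE INDEX idx_products_sku ON products (sku)")


def count_rows():
    return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]


def index_names():
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return [row[0] for row in rows]


rows_before = count_rows()
conn.execute("DROP INDEX idx_products_sku")
rows_after = count_rows()
remaining_indexes = index_names()
print(rows_before, rows_after, remaining_indexes)
```

The row count is the same before and after; only the index entry disappears from the catalog.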

You can check the INDEX Constraint chapter to see some actual examples on Indexes.

When should indexes be avoided?

Although indexes are intended to enhance a database's performance, there are times when they should be avoided.

The following guidelines indicate when the use of an index should be reconsidered.

  • Indexes should not be used on small tables.

  • Indexes should not be used on tables that have frequent, large batch update or insert operations.

  • Indexes should not be used on columns that contain a high number of NULL values.

  • Columns that are frequently manipulated should not be indexed.

Source: https://www.tutorialspoint.com/sql/sql-indexes.htm

A SQL index is used to retrieve data from a database very fast. Indexing a table or view is, without a doubt, one of the best ways to improve the performance of queries and applications.

A SQL index is a quick lookup table for finding records users need to search frequently. An index is small, fast, and optimized for quick lookups. It is very useful for connecting the relational tables and searching large tables.

SQL indexes are primarily a performance tool, so they really apply if a database gets large. SQL Server supports several types of indexes but one of the most common types is the clustered index. This type of index is automatically created with a primary key. To make the point clear, the following example creates a table that has a primary key on the column “EmployeeId”:

CREATE TABLE dbo.EmployeePhoto
(EmployeeId      INT NOT NULL PRIMARY KEY,
 Photo           VARBINARY(MAX) NULL,
 MyRowGuidColumn UNIQUEIDENTIFIER NOT NULL
                 ROWGUIDCOL UNIQUE
                 DEFAULT NEWID()
);

You’ll notice in the create table definition for the “EmployeePhoto” table, the primary key at the end of “EmployeeId” column definition. This creates a SQL index that is specially optimized to get used a lot. When the query is executed, SQL Server will automatically create a clustered index on the specified column and we can verify this from Object Explorer if we navigate to the newly created table, and then the Indexes folder:

An executed query for creating a clustered index on a specified column

Notice that creating a primary key is not the only thing that creates a unique SQL index. The unique constraint does the same on the specified columns. Therefore, we got one additional unique index for the “MyRowGuidColumn” column. There are no remarkable differences between a unique constraint and a unique index created independently of a constraint. Data validation happens in the same manner, and the query optimizer does not differentiate between a unique SQL index created by a constraint and one created manually. However, a unique or primary key constraint should be created on the column when data integrity is the objective, because doing so makes the purpose of the index clear.

So, if we use a lot of joins on the newly created table, SQL Server can look up indexes quickly and easily instead of searching sequentially through a potentially large table.

SQL indexes are fast partly because they don’t have to carry all the data for each row in the table, just the data that we’re looking for. This makes it easy for the operating system to cache a lot of indexes into memory for faster access and for the file system to read a huge number of records simultaneously rather than reading them from the disk.

Additional indexes can be created by using the Index keyword in the table definition. This can be useful when there is more than one column in the table that will be searched often. The following example creates indexes within the Create table statement:

CREATE TABLE Bookstore2
(ISBN_NO    VARCHAR(15) NOT NULL PRIMARY KEY,
 SHORT_DESC VARCHAR(100),
 AUTHOR     VARCHAR(40),
 PUBLISHER  VARCHAR(40),
 PRICE      FLOAT,
 INDEX SHORT_DESC_IND (SHORT_DESC, PUBLISHER)
);

This time, if we navigate to Object Explorer, we’ll find the index on multiple columns:

An executed query for creating a clustered index on specified columns

We can right-click the index, hit Properties and it will show us what exactly this index spans like table name, index name, index type, unique/non-unique, and index key columns:

Index Properties window in SSMS

We must briefly mention statistics. As the name implies, statistics are stat sheets for the data within the columns that are indexed. They primarily measure data distribution within columns and are used by the query optimizer to estimate rows and make high-quality execution plans.

Therefore, any time a SQL index is created, stats are automatically generated to store the distribution of the data within that column. Right under the Indexes folder, there is the Statistics folder. If expanded, you’ll see a statistics sheet with the same name that we gave to our index (the same goes for the primary key):

Statistics folder of a column in Object Explorer

There is not much for users to do on SQL Server when it comes to statistics because leaving the defaults is generally the best practice which ultimately auto-creates and updates statistics. SQL Server will do an excellent job with managing statistics for 99% of databases but it’s still good to know about them because they are another piece of the puzzle when it comes to troubleshooting slow running queries.

Also worth mentioning are selectivity and density when creating SQL indexes. These are just measurements used to measure index weight and quality:

  • Selectivity – number of distinct key values
  • Density – number of duplicate key values

These two are inversely related: as one goes up, the other goes down. How this works in the real world can be explained with an artificial example. Let’s say that there’s an Employees table with 1000 records and a birth date column that has an index on it. If a query that hits that column often, coming either from us or from an application, retrieves no more than 5 rows, then our density is 0.005 (5 of 1000 rows) and our selectivity is 0.995. That is what we should aim for when creating an index. In the best-case scenario, we should have indexes that are highly selective, which basically means that queries coming at them should return a low number of rows.
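The arithmetic in the Employees example can be written out as a short sketch. It follows the simplified definitions used in this article (density as the fraction of rows a query returns, selectivity as its complement), which is an assumption of the example and not necessarily how any particular engine computes its internal statistics.

```python
# Figures taken from the Employees example in the text.
total_rows = 1000      # rows in the table
rows_returned = 5      # rows a typical query on the indexed column retrieves

# Simplified definitions assumed for this sketch.
density = rows_returned / total_rows
selectivity = 1 - density

print(f"density={density:.3f}, selectivity={selectivity:.3f}")
```

A query that returned 500 of the 1000 rows would give a density of 0.5 under the same definitions, which is a sign that the index is unlikely to help much.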

When creating SQL indexes, I always like to set SQL Server to display information of disk activity generated by queries. So the first thing we can do is to enable IO statistics. This is a great way to see how much work SQL Server has to do under the hood to retrieve the data. We also need to include the actual execution plan and for that, I like to use a SQL execution plan viewing and analysis tool called ApexSQL Plan. This tool will show us the execution plan that was used to retrieve the data so we can see what SQL indexes, if any, are used. When using ApexSQL Plan, we don’t really need to enable IO statistics because the application has advanced I/O reads stats like the number of logical reads including LOB, physical reads (including read-ahead and LOB), and how many times a database table was scanned. However, enabling the stats on SQL Server can help when working in SQL Server Management Studio. The following query will be used as an example:

CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
DBCC FREESYSTEMCACHE('ALL');
GO
SET STATISTICS IO ON;
GO
SELECT sod.SalesOrderID,
       sod.ProductID,
       sod.ModifiedDate
FROM Sales.SalesOrderDetail sod
     JOIN Sales.SpecialOfferProduct sop ON sod.SpecialOfferID = sop.SpecialOfferID
                                           AND sod.ProductID = sop.ProductID
WHERE sop.ModifiedDate >= '2013-04-30 00:00:00.000';
GO

Notice that we also have the CHECKPOINT and DBCC DROPCLEANBUFFERS that are used to test queries with a clean buffer cache. They are basically creating a clean system state without shutting down and restarting the SQL Server.

So, we got a table inside the sample AdventureWorks2014 database called “SalesOrderDetail”. By default, this table has three indexes, but I’ve deleted those for the testing purposes. If expanded, the folder is empty:

An empty Index folder of a column in Object Explorer

Next, let’s get the actual execution plan by simply pasting the code in ApexSQL Plan and clicking the Actual button. This will prompt the Database connection dialog first time in which we have to choose the SQL Server, authentication method and the appropriate database to connect to:

Database connection dialog in ApexSQL Plan

This will take us to the query execution plan where we can see that SQL Server is doing a table scan and it’s taking the most resources (56.2%) relative to the batch. This is bad because it’s scanning everything in that table to pull a small portion of the data. To be more specific, the query returns only 1,021 rows out of 121,317 in total:

An operation in the execution plan scanning all rows from a table

If we hover the mouse over the red exclamation mark, an additional tooltip will show the IO cost as well. In this case, 99.5 percent:

An operation in the execution plan showing I/O and total cost of a query

So, 1,021 rows out of 121,317 are returned almost instantly on a modern machine, but SQL Server still has to do a lot of work, and as data fills up the table the query could get slower and slower over time. So, the first thing we have to do is create a clustered index on the “SalesOrderDetail” table. Bear in mind that we should always choose the clustered index wisely. In this case, we are creating it on the “SalesOrderID” and “SalesOrderDetailID” because we’re expecting so much data on them. Let’s just go ahead and create this SQL index by executing the query below:

ALTER TABLE [Sales].[SalesOrderDetail] ADD CONSTRAINT [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] PRIMARY KEY CLUSTERED
(
    [SalesOrderID] ASC,
    [SalesOrderDetailID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Actually, before we do that, let’s quickly switch over to the IO reads tab and take a screenshot from there just so we have this information before doing anything:

I/O reads tab showing stats for different columns

After executing the above query, we will have a clustered index created by a primary key constraint. If we refresh the Indexes folder in Object Explorer, we should see the newly created clustered, unique, primary key index:

An executed query for creating clustered index created by a primary key constraint

Now, this isn’t going to improve performance a great deal. As a matter of fact, if we run the same query again it will just switch from the table scan to a clustered index scan:

An operation in the execution plan scanning a clustered index

However, we paved the way for the future nonclustered SQL indexes. So, without further ado let’s create a nonclustered index. Notice that ApexSQL Plan determines missing indexes and creates queries for (re)creating them from the tooltip. Feel free to review and edit the default code or just hit Execute to create the index:

Index creation window in ApexSQL Plan for improving indexing strategy

If we execute the query again, SQL Server is doing a nonclustered index seek instead of the previous scan. For a query like this one, which returns only a small fraction of the table, a seek is far cheaper than a scan:

An operation in the execution plan scanning a particular range of rows from a clustered index for improving indexing strategy

Don’t let the numbers fool you. Even though some numbers are higher relative to the batch compared to the previous runs, this doesn’t necessarily mean that it’s a bad thing. If we switch over to IO reads again and compare them to the previous results, the reads go down drastically, from 1,237 to 349 and from 1,244 to 136. The reason this was so efficient is that SQL Server used only the SQL indexes to retrieve the data:

Indexing strategy showing the comparison of I/O reads of two query results going drastically down
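The scan-to-seek change described above can be reproduced in miniature. The following sketch does not use SQL Server or AdventureWorks; it assumes SQLite (through Python's sqlite3 module) as a stand-in, with a hypothetical order_detail table and made-up rows, and compares the query plan before and after an index is added. SQLite words its plan steps as SCAN and SEARCH rather than scan and seek.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_detail (order_id INTEGER, product_id INTEGER, modified TEXT)"
)
# Made-up rows purely for illustration.
conn.executemany(
    "INSERT INTO order_detail VALUES (?, ?, ?)",
    [(i, i % 50, f"2013-{(i % 12) + 1:02d}-15") for i in range(1, 501)],
)

QUERY = "SELECT order_id, product_id FROM order_detail WHERE modified >= ?"


def plan_steps(sql, params):
    """Return the readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


before = plan_steps(QUERY, ("2013-04-30",))
conn.execute("CREATE INDEX idx_order_detail_modified ON order_detail (modified)")
after = plan_steps(QUERY, ("2013-04-30",))

print("before:", before)
print("after: ", after)
```

Before the index exists the only available plan reads the whole table; afterwards the plan names the new index for the date filter.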

Indexing strategy guidelines

Poorly designed SQL indexes and a lack of them are primary sources of database and application performance issues. Here are a few indexing strategies that should be considered when indexing tables:

  • Avoid indexing highly used tables/columns – The more indexes on a table, the bigger the effect on the performance of Insert, Update, Delete, and Merge statements, because all indexes must be modified appropriately. This means that SQL Server will have to do page splitting and move data around, and it will have to do that for all indexes affected by those DML statements
  • Use narrow index keys whenever possible – Keep indexes narrow, that is, with as few columns as possible. Exact numeric keys are the most efficient SQL index keys (e.g. integers). These keys require less disk space and maintenance overhead
  • Use clustered indexes on unique columns – Consider columns that are unique or contain many distinct values and avoid them for columns that undergo frequent changes
  • Nonclustered indexes on columns that are frequently searched and/or joined on – Ensure that nonclustered indexes are put on foreign keys and columns frequently used in search conditions, such as a Where clause that returns exact matches
  • Cover SQL indexes for big performance gains – Improvements are attained when the index holds all columns in the query
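The last guideline, covering indexes, can be sketched as follows. This example assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server, reuses the column names from the Bookstore2 table above in a hypothetical bookstore table, and relies on SQLite's plan wording, which labels an index as COVERING when the query can be answered from the index alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookstore ("
    " isbn_no TEXT PRIMARY KEY, short_desc TEXT, author TEXT,"
    " publisher TEXT, price REAL)"
)
conn.execute("CREATE INDEX short_desc_ind ON bookstore (short_desc, publisher)")


def plan_step(sql):
    """Return the first readable plan step SQLite reports for a query."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]


# Every column the query touches is stored in the index.
covered = plan_step(
    "SELECT publisher FROM bookstore WHERE short_desc = 'SQL basics'"
)
# The author column is not in the index, so the table must be visited too.
not_covered = plan_step(
    "SELECT author FROM bookstore WHERE short_desc = 'SQL basics'"
)

print(covered)
print(not_covered)
```

Both queries use the index to find matching rows, but only the first avoids the extra lookup into the table.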

I hope this article on the SQL indexing strategy has been informative and I thank you for reading.

Bojan Petrovic

Bojan aka “Boksi”, an AP graduate in IT Technology focused on Networks and electronic technology from the Copenhagen School of Design and Technology, is a software analyst with experience in quality assurance, software support, product evangelism, and user engagement.

He has written extensively on both the SQL Shack and the ApexSQL Solution Center, on topics ranging from client technologies like 4K resolution and theming, error handling to index strategies, and performance monitoring.

Bojan works at ApexSQL in Nis, Serbia as an integral part of the team focusing on designing, developing, and testing the next generation of database tools including MySQL and SQL Server, and both stand-alone tools and integrations into Visual Studio, SSMS, and VSCode.

Source: https://www.sqlshack.com/sql-index-overview-and-strategy/
SQL indexes

An index is a schema object. It is used by the server to speed up the retrieval of rows by using a pointer. It can reduce disk I/O (input/output) by using a rapid path access method to locate data quickly. An index helps to speed up select queries and where clauses, but it slows down data input, with the update and the insert statements. Indexes can be created or dropped with no effect on the data. In this article, we will see how to create, delete, and use an index in the database.

For example, if you want to find all pages in a book that discuss a certain topic, you first refer to the index, which lists all the topics alphabetically and refers you to one or more specific page numbers.

Creating an Index:

Syntax:

CREATE INDEX index ON TABLE (column);

where index is the name given to the index, TABLE is the name of the table on which the index is created, and column is the name of the column to which it is applied.



For multiple columns:

 Syntax:

CREATE INDEX index ON TABLE (column1, column2, ...);

Unique Indexes:

Unique indexes are used to maintain the integrity of the data in the table as well as for fast performance. A unique index does not allow duplicate values to be entered into the table.
 Syntax:

CREATE UNIQUE INDEX index ON TABLE (column);

When should indexes be created:
 

  • A column contains a wide range of values.
  • A column does not contain a large number of null values.
  • One or more columns are frequently used together in a where clause or a join condition.

When should indexes be avoided:
 

  • The table is small
  • The columns are not often used as a condition in the query
  • The column is updated frequently

Removing an Index:

Remove an index from the data dictionary by using the DROP INDEX command.

Syntax:

DROP INDEX index;

To drop an index, you must be the owner of the index or have the DROP ANY INDEX privilege. 
 

Altering an Index: 

Modify an existing table’s index by rebuilding or reorganizing the index.

ALTER INDEX IndexName ON TableName REBUILD;

Confirming Indexes :

You can check the different indexes present on a particular table, whether created by the user or by the server itself, along with their uniqueness.

Syntax:

select * from USER_INDEXES;

In Oracle, this lists the indexes owned by the current user, so you can locate the ones on your own tables.
 

Renaming an index :

 In SQL Server, you can use the system stored procedure sp_rename to rename any index in the database. The current name must be qualified with the table name.

Syntax:

EXEC sp_rename N'table_name.index_name', N'new_index_name', N'INDEX';
Source: https://www.geeksforgeeks.org/sql-indexes/
Indexing

Last modified: August 09, 2021

What is Indexing?

Indexing makes columns faster to query by creating pointers to where data is stored within a database.

Imagine you want to find a piece of information that is within a large database. To get this information out of the database the computer will look through every row until it finds it. If the data you are looking for is towards the very end, this query would take a long time to run.

Visualization for finding the last entry:

Gif of a basic table scan

If the table were ordered alphabetically, searching for a name could happen a lot faster because we could skip looking for the data in certain rows. If we wanted to search for “Zack” and we know the data is in alphabetical order, we could jump down to halfway through the data to see if Zack comes before or after that row. We could then halve the remaining rows and make the same comparison.

Gif of an index scan

This took 3 comparisons to find the right answer instead of 8 in the unindexed data.
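The difference between scanning every row and repeatedly halving a sorted list can be sketched in a few lines. The eight names below are hypothetical apart from Matt, Todd, and Zack, which appear in this article, and the function names are made up for the example. The exact number of probes a binary search needs depends on how the midpoint is chosen, so it can differ by one from the count shown in the animation.

```python
def linear_probes(rows, target):
    """Count how many rows are inspected when scanning from the top."""
    for probes, value in enumerate(rows, start=1):
        if value == target:
            return probes
    return len(rows)


def binary_probes(sorted_rows, target):
    """Count how many rows are inspected when halving a sorted list."""
    lo, hi = 0, len(sorted_rows) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_rows[mid] == target:
            return probes
        if sorted_rows[mid] < target:
            lo = mid + 1   # target sorts after the middle entry
        else:
            hi = mid - 1   # target sorts before the middle entry
    return probes


# Hypothetical names, already in alphabetical order.
names = ["Andrew", "Blake", "Evan", "Matt", "Nick", "Todd", "Will", "Zack"]

linear = linear_probes(names, "Zack")
binary = binary_probes(names, "Zack")
print(linear, binary)
```

The scan has to inspect all eight rows to reach the last entry, while the halving search needs only a handful of probes.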

Indexes allow us to create sorted lists without having to create all new sorted tables, which would take up a lot of storage space.

What Exactly is an Index?

An index is a structure that holds the field the index is sorting and a pointer from each record to their corresponding record in the original table where the data is actually stored. Indexes are used in things like a contact list where the data may be physically stored in the order you add people’s contact information but it is easier to find people when listed out in alphabetical order.

Let’s look at the index from the previous example and see how it maps back to the original Friends table:

Shows how an index is structured relative to the table

We can see here that the table has the data stored ordered by an incrementing id based on the order in which the data was added. And the Index has the names stored in alphabetical order.

Types of Indexing

There are two types of database indexes:

  1. Clustered
  2. Non-clustered

Both clustered and non-clustered indexes are stored and searched as B-trees, a data structure similar to a binary tree. A B-tree is a “self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time.” Basically it creates a tree-like structure that sorts data for quick searching.

Shows an image of a B-tree's structure

Here is a B-tree of the index we created. Our smallest entry is the leftmost entry and our largest is the rightmost entry. All queries start at the top node and work their way down the tree: if the target entry is less than the current node, the left path is followed; if greater, the right path is followed. In our case it checked against Matt, then Todd, and then Zack.

To increase efficiency, many B-trees will limit the number of characters you can enter into an entry. The B-tree will do this on its own and does not require column data to be restricted. In the example above, the B-tree limits entries to 4 characters.

Clustered Indexes

The clustered index is the one index per table that uses the primary key to organize the data within the table. It ensures that the primary key is stored in increasing order, which is also the order in which the table holds its rows.

  • Clustered indexes do not have to be explicitly declared.
  • Created when the table is created.
  • Use the primary key sorted in ascending order.

Creating Clustered Indexes

The clustered index will be automatically created when the primary key is defined:

Once filled in, that table would look something like this:

Image showing a complete table with a primary key clustered index on it

The created table, “friends”, will have a clustered index automatically created, organized around the Primary Key “id” called “friends_pkey”:

Shows the pkey relative to the table

When searching the table by “id”, the ascending order of the column allows for optimal searches to be performed. Since the numbers are ordered, the search can navigate the B-tree allowing searches to happen in logarithmic time.

However, in order to search for the “name” or “city” in the table, we would have to look at every entry because these columns do not have an index. This is where non-clustered indexes become very useful.
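The contrast between searching by “id” and searching by “name” can be sketched with a query plan. This example assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL, so the plan wording differs from what PostgreSQL prints, and the column types and sample rows for the friends table are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types and rows are assumed; only the table and column names
# come from the article.
conn.execute("CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO friends VALUES (?, ?, ?)",
    [(1, "Matt", "San Francisco"), (2, "Todd", "Denver"), (3, "Zack", "Austin")],
)


def plan_step(sql):
    """Return the first readable plan step SQLite reports for a query."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]


by_id = plan_step("SELECT * FROM friends WHERE id = 3")
by_name = plan_step("SELECT * FROM friends WHERE name = 'Zack'")

print(by_id)    # uses the primary key
print(by_name)  # has to scan the table
```

The lookup by id goes through the primary key, while the lookup by name has no index to use and falls back to reading every row.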

Non-Clustered Indexes

Non-clustered indexes are sorted references for a specific field, from the main table, that hold pointers back to the original entries of the table. The first example we showed is an example of a non-clustered index:

Shows a nonclustered index relative to the table

They are used to increase the speed of queries on the table by creating columns that are more easily searchable. Non-clustered indexes can be created by data analysts/ developers after a table has been created and filled.

Note: Non-clustered indexes are not new tables. Non-clustered indexes hold the field that they are responsible for sorting and a pointer from each of those entries back to the full entry in the table.

You can think of these just like indexes in a book. The index points to the location in the book where you can find the data you are looking for.

image of a book index

Non-clustered indexes point to memory addresses instead of storing data themselves. This makes them slower to query than clustered indexes but typically much faster than a non-indexed column.

You can create many non-clustered indexes. As of 2008, you can have up to 999 non-clustered indexes in SQL Server and there is no limit in PostgreSQL.

Creating Non-Clustered Indexes (PostgreSQL)

To create an index to sort our friends’ names alphabetically:

CREATE INDEX friends_name_asc ON friends(name ASC);

This would create an index called “friends_name_asc”, indicating that this index is storing the names from “friends” sorted alphabetically in ascending order.

image showing a representation of an index

Note that the “city” column is not present in this index. That is because indexes do not store all of the information from the original table. The “id” column would be a pointer back to the original table. The pointer logic would look like this:

how the pointers point to original table

Creating Indexes

In PostgreSQL, the “\d” command is used to list details on a table, including table name, the table columns and their data types, indexes, and constraints.

The details of our friends table now look like this:

Query providing details on the friends table: \d friends;

Using \d to show clustered and non clustered indexes

Looking at the above image, the “friends_name_asc” is now an associated index of the “friends” table. That means the query plan, the plan that SQL creates when determining the best way to perform a query, will begin to use the index when queries are being made. Notice that “friends_pkey” is listed as an index even though we never declared that as an index. That is the clustered index that was referenced earlier in the article that is automatically created based off of the primary key.

We can also see there is a “friends_city_desc” index. That index was created similarly to the names index:

CREATE INDEX friends_city_desc ON friends(city DESC);

This new index will be used to sort the cities and will be stored in reverse alphabetical order because the keyword “DESC” was passed, short for “descending”. This provides a way for our database to swiftly query city names.

Searching Indexes

After your non-clustered indexes are created you can begin querying with them. Indexes use an optimal search method known as binary search. Binary searches work by constantly cutting the data in half and checking if the entry you are searching for comes before or after the entry in the middle of the current portion of data. This works well with B-trees because they are designed to start at the middle entry; to search for the entries within the tree you know the entries down the left path will be smaller or before the current entry and the entries to the right will be larger or after the current entry. In a table this would look like:

Gif of a binary search on a balanced tree

Comparing this method to the query of the non-indexed table at the beginning of the article, we are able to reduce the total number of searches from eight to three. Using this method, a search of 1,000,000 entries can be reduced down to just 20 jumps in a binary search.

Table showing the growth rate of the number of searches relative to the number of entries being searched
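The growth rate in that table follows from counting how many times a list can be cut in half before a single entry remains, which is the base-2 logarithm rounded up. A short sketch, with a hypothetical helper name:

```python
def halvings_needed(entries):
    """Number of times `entries` must be halved to reach one entry,
    i.e. ceil(log2(entries)), computed with integer arithmetic."""
    return (entries - 1).bit_length()


for n in (8, 1_000, 1_000_000):
    print(f"{n:>9} entries -> {halvings_needed(n)} halvings")
```

Doubling the number of entries adds only one more halving, which is why the million-entry case stays at 20.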

When to use Indexes

Indexes are meant to speed up the performance of a database, so use indexing whenever it significantly improves the performance of your database. As your database becomes larger and larger, the more likely you are to see benefits from indexing.

When not to use Indexes

When data is written to the database, the original table (the clustered index) is updated first and then all of the indexes off of that table are updated. Every time a write is made to the database, the indexes are unusable until they have updated. If the database is constantly receiving writes then the indexes will never be usable. This is why indexes are typically applied to databases in data warehouses that get new data updated on a scheduled basis(off-peak hours) and not production databases which might be receiving new writes all the time.

NOTE: The newest version of Postgres (that is currently in beta) will allow you to query the database while the indexes are being updated.

Testing Index performance

To test if indexes will begin to decrease query times, you can run a set of queries on your database, record the time it takes those queries to finish, and then begin creating indexes and rerunning your tests.

To do this, try using the EXPLAIN ANALYZE clause in PostgreSQL:

Which on my small database yielded:

shows a sample query plan

This output will tell you which method of search from the query plan was chosen and how long the planning and execution of the query took.

Only create one index at a time because not all indexes will decrease query time.

  • PostgreSQL’s query planning is pretty efficient, so adding a new index may not affect how fast queries are performed.
  • Adding an index will always mean storing more data
  • Adding an index will increase how long it takes your database to fully update after a write operation.

If adding an index does not decrease query time, you can simply remove it from the database.

To remove an index, use the DROP INDEX command:

DROP INDEX friends_name_asc;

The outline of the database now looks like:

Shows that the index was dropped and no longer appears in \d+ friends

Which shows the successful removal of the index for searching names.

Summary

  • Indexing can vastly reduce the time of queries
  • Every table with a primary key has one clustered index
  • Every table can have many non-clustered indexes to aid in querying
  • Non-clustered indexes hold pointers back to the main table
  • Not every database will benefit from indexing
  • Not every index will increase the query speed for the database

References:

  • https://www.geeksforgeeks.org/indexing-in-databases-set-1/
  • https://www.c-sharpcorner.com/blogs/differences-between-clustered-index-and-nonclustered-index1
  • https://en.wikipedia.org/wiki/B-tree
  • https://www.tutorialspoint.com/postgresql/postgresql_indexes.htm
  • https://www.cybertec-postgresql.com/en/postgresql-indexing-index-scan-vs-bitmap-scan-vs-sequential-scan-basics/#

Written by: Blake Barnhill
Reviewed by: Matt David , Matthew Layne

Source: https://dataschool.com/sql-optimization/how-indexing-works/


CREATE INDEX (Transact-SQL)

Applies to: SQL Server (all supported versions), Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics, Analytics Platform System (PDW)

Creates a relational index on a table or view. Also called a rowstore index because it is either a clustered or nonclustered B-tree index. You can create a rowstore index before there is data in the table. Use a rowstore index to improve query performance, especially when the queries select from specific columns or require values to be sorted in a particular order.

Note

Azure Synapse Analytics and Analytics Platform System (PDW) currently do not support Unique constraints. Any examples referencing Unique Constraints are only applicable to SQL Server and SQL Database.

Simple examples:
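As a minimal sketch of the simplest forms, assuming a hypothetical table dbo.t1 with three integer columns (the object and index names are illustrative only):

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.t1 (col1 int NOT NULL, col2 int NOT NULL, col3 int NOT NULL);

-- Clustered index on one column.
CREATE CLUSTERED INDEX i1 ON dbo.t1 (col1);

-- Nonclustered index (the default type when neither keyword is given).
CREATE INDEX i2 ON dbo.t1 (col2);

-- Unique nonclustered index on three columns with an explicit sort order per column.
CREATE UNIQUE INDEX i3 ON dbo.t1 (col1 DESC, col2 ASC, col3 DESC);
```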

Key scenario:

Starting with SQL Server 2016 (13.x) and SQL Database, use a nonclustered index on a columnstore index to improve data warehousing query performance. For more information, see Columnstore Indexes - Data Warehouse.

For additional types of indexes, see CREATE XML INDEX, CREATE SPATIAL INDEX, and CREATE COLUMNSTORE INDEX.

Transact-SQL Syntax Conventions

Syntax

Syntax for SQL Server and Azure SQL Database

Backward Compatible Relational Index

Important

The backward compatible relational index syntax structure will be removed in a future version of SQL Server. Avoid using this syntax structure in new development work, and plan to modify applications that currently use the feature. Use the syntax structure specified in <relational_index_option> instead.

Syntax for Azure Synapse Analytics and Parallel Data Warehouse

Arguments

UNIQUE

Creates a unique index on a table or view. A unique index is one in which no two rows are permitted to have the same index key value. A clustered index on a view must be unique.

The Database Engine does not allow creating a unique index on columns that already include duplicate values, whether or not IGNORE_DUP_KEY is set to ON. If this is tried, the Database Engine displays an error message. Duplicate values must be removed before a unique index can be created on the column or columns. Columns that are used in a unique index should be set to NOT NULL, because multiple null values are considered duplicates when a unique index is created.
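One way to check for existing duplicates before creating a unique index is sketched below. The table dbo.Customers and its columns are hypothetical.

```sql
-- Hypothetical table and column names.
CREATE TABLE dbo.Customers (
    CustomerID int NOT NULL,
    Email nvarchar(256) NOT NULL
);

-- List key values that occur more than once; these rows must be cleaned up first.
SELECT Email, COUNT(*) AS Occurrences
FROM dbo.Customers
GROUP BY Email
HAVING COUNT(*) > 1;

-- Succeeds only when the query above returns no rows.
CREATE UNIQUE NONCLUSTERED INDEX UX_Customers_Email
    ON dbo.Customers (Email);
```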

CLUSTERED

Creates an index in which the logical order of the key values determines the physical order of the corresponding rows in a table. The bottom, or leaf, level of the clustered index contains the actual data rows of the table. A table or view is allowed one clustered index at a time.

A view with a unique clustered index is called an indexed view. Creating a unique clustered index on a view physically materializes the view. A unique clustered index must be created on a view before any other indexes can be defined on the same view. For more information, see Create Indexed Views.

Create the clustered index before creating any nonclustered indexes. Existing nonclustered indexes on tables are rebuilt when a clustered index is created.

If CLUSTERED is not specified, a nonclustered index is created.

Note

Because the leaf level of a clustered index and the data pages are the same by definition, creating a clustered index and using the ON partition_scheme_name or ON filegroup_name clause effectively moves a table from the filegroup on which the table was created to the new partition scheme or filegroup. Before creating tables or indexes on specific filegroups, verify which filegroups are available and that they have enough empty space for the index.

In some cases creating a clustered index can enable previously disabled indexes. For more information, see Enable Indexes and Constraints and Disable Indexes and Constraints.

NONCLUSTERED

Creates an index that specifies the logical ordering of a table. With a nonclustered index, the physical order of the data rows is independent of their indexed order.

Each table can have up to 999 nonclustered indexes, regardless of how the indexes are created: either implicitly with PRIMARY KEY and UNIQUE constraints, or explicitly with CREATE INDEX.

For indexed views, nonclustered indexes can be created only on a view that has a unique clustered index already defined.

If not otherwise specified, the default index type is nonclustered.

index_name

Is the name of the index. Index names must be unique within a table or view, but do not have to be unique within a database. Index names must follow the rules of identifiers.

column

Is the column or columns on which the index is based. Specify two or more column names to create a composite index on the combined values in the specified columns. List the columns to be included in the composite index, in sort-priority order, inside the parentheses after table_or_view_name.

Up to 32 columns can be combined into a single composite index key. All the columns in a composite index key must be in the same table or view. The maximum allowable size of the combined index values is 900 bytes for a clustered index, or 1,700 for a nonclustered index. The limits are 16 columns and 900 bytes for versions before SQL Database and SQL Server 2016 (13.x).

Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot be specified as key columns for an index. Also, a view definition cannot include ntext, text, or image columns, even if they are not referenced in the CREATE INDEX statement.

You can create indexes on CLR user-defined type columns if the type supports binary ordering. You can also create indexes on computed columns that are defined as method invocations off a user-defined type column, as long as the methods are marked deterministic and do not perform data access operations. For more information about indexing CLR user-defined type columns, see CLR User-defined Types.

[ ASC | DESC ]

Determines the ascending or descending sort direction for the particular index column. The default is ASC.

INCLUDE (column [ ,... n ] )

Specifies the non-key columns to be added to the leaf level of the nonclustered index. The nonclustered index can be unique or non-unique.

Column names cannot be repeated in the INCLUDE list and cannot be used simultaneously as both key and non-key columns. Nonclustered indexes always contain the clustered index columns if a clustered index is defined on the table. For more information, see Create Indexes with Included Columns.

All data types are allowed except text, ntext, and image. The index must be created or rebuilt offline if any one of the specified non-key columns are varchar(max), nvarchar(max), or varbinary(max) data types.

Computed columns that are deterministic and either precise or imprecise can be included columns. Computed columns derived from image, ntext, text, varchar(max), nvarchar(max), varbinary(max), and xml data types can be included in non-key columns as long as the computed column data types is allowable as an included column. For more information, see Indexes on Computed Columns.
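A minimal sketch of an index with included columns, assuming a hypothetical dbo.Orders table (all names are illustrative):

```sql
-- Hypothetical table; the PRIMARY KEY creates a clustered index by default.
CREATE TABLE dbo.Orders (
    OrderID int NOT NULL PRIMARY KEY,
    CustomerID int NOT NULL,
    OrderDate date NOT NULL,
    TotalDue money NOT NULL
);

-- CustomerID is the key column; OrderDate and TotalDue are stored only at the leaf level.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, TotalDue);

-- Every column this query needs is present in the index.
SELECT OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerID = 42;
```

Because the included columns are not part of the key, they do not count toward the index key size.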

For information on creating an XML index, see CREATE XML INDEX.

WHERE <filter_predicate>

Creates a filtered index by specifying which rows to include in the index. The filtered index must be a nonclustered index on a table. Creates filtered statistics for the data rows in the filtered index.

The filter predicate uses simple comparison logic and cannot reference a computed column, a UDT column, a spatial data type column, or a hierarchyID data type column. Comparisons using NULL literals are not allowed with the comparison operators. Use the IS NULL and IS NOT NULL operators instead.

Here are some examples of filter predicates for the table:

Filtered indexes do not apply to XML indexes and full-text indexes. For UNIQUE indexes, only the selected rows must have unique index values. Filtered indexes do not allow the IGNORE_DUP_KEY option.
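A minimal sketch of a filtered index, assuming a hypothetical dbo.Tickets table in which open tickets have no closing date:

```sql
-- Hypothetical table and column names.
CREATE TABLE dbo.Tickets (
    TicketID int NOT NULL PRIMARY KEY,
    AssignedTo int NULL,
    ClosedDate date NULL
);

-- Index only the open tickets. IS NULL is used because a comparison
-- with a NULL literal (ClosedDate = NULL) is not allowed in a filter predicate.
CREATE NONCLUSTERED INDEX IX_Tickets_Open
    ON dbo.Tickets (AssignedTo)
    WHERE ClosedDate IS NULL;
```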

ON partition_scheme_name( column_name )

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Specifies the partition scheme that defines the filegroups onto which the partitions of a partitioned index will be mapped. The partition scheme must already exist within the database, created by executing either CREATE PARTITION SCHEME or ALTER PARTITION SCHEME. column_name specifies the column against which a partitioned index will be partitioned. This column must match the data type, length, and precision of the argument of the partition function that partition_scheme_name is using. column_name is not restricted to the columns in the index definition. Any column in the base table can be specified, except that when partitioning a UNIQUE index, column_name must be chosen from among those used as the unique key. This restriction allows the Database Engine to verify uniqueness of key values within a single partition only.

Note

When you partition a non-unique, clustered index, the Database Engine by default adds the partitioning column to the list of clustered index keys, if it is not already specified. When partitioning a non-unique, nonclustered index, the Database Engine adds the partitioning column as a non-key (included) column of the index, if it is not already specified.

If partition_scheme_name or filegroup is not specified and the table is partitioned, the index is placed in the same partition scheme, using the same partitioning column, as the underlying table.

Note

You cannot specify a partitioning scheme on an XML index. If the base table is partitioned, the XML index uses the same partition scheme as the table.

For more information about partitioning indexes, see Partitioned Tables and Indexes.
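The following sketch shows an index placed on a partition scheme. The names pfSaleYear, psSaleYear, and dbo.Sales are hypothetical, and all partitions are mapped to the PRIMARY filegroup only to keep the example short.

```sql
-- Hypothetical partition function: four partitions split on the year value.
CREATE PARTITION FUNCTION pfSaleYear (int)
    AS RANGE RIGHT FOR VALUES (2022, 2023, 2024);

-- Hypothetical partition scheme: every partition goes to PRIMARY for simplicity.
CREATE PARTITION SCHEME psSaleYear
    AS PARTITION pfSaleYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.Sales (
    SaleID int NOT NULL,
    SaleYear int NOT NULL,
    Amount money NOT NULL
) ON psSaleYear (SaleYear);

-- Partition the index on SaleYear; its data type matches the partition function argument.
CREATE NONCLUSTERED INDEX IX_Sales_Amount
    ON dbo.Sales (Amount)
    ON psSaleYear (SaleYear);
```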

ON filegroup_name

Applies to: SQL Server (Starting with SQL Server 2008)

Creates the specified index on the specified filegroup. If no location is specified and the table or view is not partitioned, the index uses the same filegroup as the underlying table or view. The filegroup must already exist.

ON "default"

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Creates the specified index on the same filegroup or partition scheme as the table or view.

The term default, in this context, is not a keyword. It is an identifier for the default filegroup and must be delimited, as in ON "default" or ON [default]. If "default" is specified, the QUOTED_IDENTIFIER option must be ON for the current session. This is the default setting. For more information, see SET QUOTED_IDENTIFIER.

Note

"default" does not indicate the database default filegroup in the context of CREATE INDEX. This differs from CREATE TABLE, where "default" locates the table on the database default filegroup.

[ FILESTREAM_ON { filestream_filegroup_name | partition_scheme_name | "NULL" } ]

Applies to: SQL Server (Starting with SQL Server 2008)

Specifies the placement of FILESTREAM data for the table when a clustered index is created. The FILESTREAM_ON clause allows FILESTREAM data to be moved to a different FILESTREAM filegroup or partition scheme.

filestream_filegroup_name is the name of a FILESTREAM filegroup. The filegroup must have one file defined for the filegroup by using a CREATE DATABASE or ALTER DATABASE statement; otherwise, an error is raised.

If the table is partitioned, the FILESTREAM_ON clause must be included and must specify a partition scheme of FILESTREAM filegroups that uses the same partition function and partition columns as the partition scheme for the table. Otherwise, an error is raised.

If the table is not partitioned, the FILESTREAM column cannot be partitioned. FILESTREAM data for the table must be stored in a single filegroup that is specified in the FILESTREAM_ON clause.

FILESTREAM_ON NULL can be specified in a CREATE INDEX statement if a clustered index is being created and the table does not contain a FILESTREAM column.

For more information, see FILESTREAM (SQL Server).

<object>::=

Is the fully qualified or nonfully qualified object to be indexed.

database_name

Is the name of the database.

schema_name

Is the name of the schema to which the table or view belongs.

table_or_view_name

Is the name of the table or view to be indexed.

The view must be defined with SCHEMABINDING to create an index on it. A unique clustered index must be created on a view before any nonclustered index is created. For more information about indexed views, see the Remarks section.

Starting with SQL Server 2016 (13.x), the object can be a table stored with a clustered columnstore index.

Azure SQL Database supports the three-part name format database_name.[schema_name].object_name when the database_name is the current database or the database_name is tempdb and the object_name starts with #.

<relational_index_option>::=

Specifies the options to use when you create the index.

PAD_INDEX = { ON | OFF }

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Specifies index padding. The default is OFF.

ON
The percentage of free space that is specified by fillfactor is applied to the intermediate-level pages of the index.

OFF or fillfactor is not specified
The intermediate-level pages are filled to near capacity, leaving sufficient space for at least one row of the maximum size the index can have, considering the set of keys on the intermediate pages.

The PAD_INDEX option is useful only when FILLFACTOR is specified, because PAD_INDEX uses the percentage specified by FILLFACTOR. If the percentage specified for FILLFACTOR is not large enough to allow for one row, the Database Engine internally overrides the percentage to allow for the minimum. The number of rows on an intermediate index page is never less than two, regardless of how low the value of fillfactor is.

In backward compatible syntax, WITH PAD_INDEX is equivalent to WITH PAD_INDEX = ON.

FILLFACTOR = fillfactor

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during index creation or rebuild. The value for fillfactor must be an integer value from 1 to 100. Fill factor values 0 and 100 are the same in all respects. If fillfactor is 100, the Database Engine creates indexes with leaf pages filled to capacity.

The FILLFACTOR setting applies only when the index is created or rebuilt. The Database Engine does not dynamically keep the specified percentage of empty space in the pages.

To view the fill factor setting, use fill_factor in sys.indexes.

Important

Creating a clustered index with a FILLFACTOR less than 100 affects the amount of storage space the data occupies because the Database Engine redistributes the data when it creates the clustered index.

For more information, see Specify Fill Factor for an Index.
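A minimal sketch that combines FILLFACTOR with PAD_INDEX and then reads the stored settings back, assuming a hypothetical dbo.Products table:

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Products (
    ProductID int NOT NULL PRIMARY KEY,
    Name nvarchar(100) NOT NULL
);

-- Leave 20 percent free space in the leaf pages and,
-- through PAD_INDEX, in the intermediate-level pages as well.
CREATE NONCLUSTERED INDEX IX_Products_Name
    ON dbo.Products (Name)
    WITH (FILLFACTOR = 80, PAD_INDEX = ON);

-- Inspect the settings recorded for the table's indexes.
SELECT name, fill_factor, is_padded
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Products');
```

The value 80 is only an illustration; a suitable fill factor depends on how often the indexed data changes.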

SORT_IN_TEMPDB = { ON | OFF }

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Specifies whether to store temporary sort results in tempdb. The default is OFF except for Azure SQL Database Hyperscale. For all index build operations in Hyperscale, SORT_IN_TEMPDB is always ON, regardless of the option specified unless resumable index rebuild is used.

ON
The intermediate sort results that are used to build the index are stored in tempdb. This may reduce the time required to create an index if tempdb is on a different set of disks than the user database. However, this increases the amount of disk space that is used during the index build.

OFF
The intermediate sort results are stored in the same database as the index.

In addition to the space required in the user database to create the index, tempdb must have about the same amount of additional space to hold the intermediate sort results. For more information, see SORT_IN_TEMPDB Option For Indexes.

In backward compatible syntax, WITH SORT_IN_TEMPDB is equivalent to WITH SORT_IN_TEMPDB = ON.

IGNORE_DUP_KEY = { ON | OFF }

Specifies the error response when an insert operation attempts to insert duplicate key values into a unique index. The option applies only to insert operations after the index is created or rebuilt. The option has no effect when executing CREATE INDEX, ALTER INDEX, or UPDATE. The default is OFF.

ON
A warning message will occur when duplicate key values are inserted into a unique index. Only the rows violating the uniqueness constraint will fail.

OFF
An error message will occur when duplicate key values are inserted into a unique index. The entire INSERT operation will be rolled back.

IGNORE_DUP_KEY cannot be set to ON for indexes created on a view, non-unique indexes, XML indexes, spatial indexes, and filtered indexes.

To view IGNORE_DUP_KEY, use sys.indexes.

In backward compatible syntax, WITH IGNORE_DUP_KEY is equivalent to WITH IGNORE_DUP_KEY = ON.
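The effect of IGNORE_DUP_KEY = ON can be sketched as follows, assuming a hypothetical single-column table dbo.Codes:

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Codes (Code int NOT NULL);

CREATE UNIQUE NONCLUSTERED INDEX UX_Codes_Code
    ON dbo.Codes (Code)
    WITH (IGNORE_DUP_KEY = ON);

-- The repeated value 2 is rejected with a warning instead of an error;
-- the statement is not rolled back and the other rows are inserted.
INSERT INTO dbo.Codes (Code) VALUES (1), (2), (2), (3);

-- ignore_dup_key is 1 for this index.
SELECT name, ignore_dup_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Codes');
```

With the option left at OFF, the same INSERT would raise an error and insert no rows.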

STATISTICS_NORECOMPUTE = { ON | OFF}

Specifies whether distribution statistics are recomputed. The default is OFF.

ON
Out-of-date statistics are not automatically recomputed.

OFF
Automatic statistics updating is enabled.

To restore automatic statistics updating, set the STATISTICS_NORECOMPUTE to OFF, or execute UPDATE STATISTICS without the NORECOMPUTE clause.

Important

Disabling automatic recomputation of distribution statistics may prevent the query optimizer from picking optimal execution plans for queries involving the table.

In backward compatible syntax, WITH STATISTICS_NORECOMPUTE is equivalent to WITH STATISTICS_NORECOMPUTE = ON.
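A minimal sketch of turning automatic recomputation off and back on, assuming a hypothetical dbo.Readings table:

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Readings (
    ReadingID int NOT NULL,
    SensorID int NOT NULL
);

-- Out-of-date statistics on this index are not recomputed automatically.
CREATE NONCLUSTERED INDEX IX_Readings_SensorID
    ON dbo.Readings (SensorID)
    WITH (STATISTICS_NORECOMPUTE = ON);

-- Either statement restores automatic statistics updating:
ALTER INDEX IX_Readings_SensorID ON dbo.Readings
    SET (STATISTICS_NORECOMPUTE = OFF);

UPDATE STATISTICS dbo.Readings IX_Readings_SensorID;  -- no NORECOMPUTE clause
```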

STATISTICS_INCREMENTAL = { ON | OFF }

Applies to: SQL Server (Starting with SQL Server 2014 (12.x)) and Azure SQL Database

When ON, the statistics created are per partition statistics. When OFF, the statistics tree is dropped and SQL Server re-computes the statistics. The default is OFF.

If per partition statistics are not supported, the option is ignored and a warning is generated. Incremental stats are not supported for the following statistics types:

  • Statistics created with indexes that are not partition-aligned with the base table.
  • Statistics created on Always On readable secondary databases.
  • Statistics created on read-only databases.
  • Statistics created on filtered indexes.
  • Statistics created on views.
  • Statistics created on internal tables.
  • Statistics created with spatial indexes or XML indexes.

DROP_EXISTING = { ON | OFF }

Is an option to drop and rebuild the existing clustered or nonclustered index with modified column specifications, and keep the same name for the index. The default is OFF.

ON
Specifies to drop and rebuild the existing index, which must have the same name as the parameter index_name.

OFF
Specifies not to drop and rebuild the existing index. SQL Server displays an error if the specified index name already exists.

With DROP_EXISTING, you can change:

  • A nonclustered rowstore index to a clustered rowstore index.

With DROP_EXISTING, you cannot change:

  • A clustered rowstore index to a nonclustered rowstore index.
  • A clustered columnstore index to any type of rowstore index.

In backward compatible syntax, WITH DROP_EXISTING is equivalent to WITH DROP_EXISTING = ON.
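Redefining an index under the same name can be sketched as follows, assuming a hypothetical dbo.Employees table:

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Employees (
    EmployeeID int NOT NULL,
    LastName nvarchar(50) NOT NULL,
    FirstName nvarchar(50) NOT NULL
);

-- Original definition: one key column.
CREATE NONCLUSTERED INDEX IX_Employees_Name
    ON dbo.Employees (LastName);

-- Same index name, modified column specification.
-- Without DROP_EXISTING = ON this statement would fail because the name already exists.
CREATE NONCLUSTERED INDEX IX_Employees_Name
    ON dbo.Employees (LastName, FirstName)
    WITH (DROP_EXISTING = ON);
```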

ONLINE = { ON | OFF }

Specifies whether underlying tables and associated indexes are available for queries and data modification during the index operation. The default is OFF.

ON
Long-term table locks are not held for the duration of the index operation. During the main phase of the index operation, only an Intent Share (IS) lock is held on the source table. This enables queries or updates to the underlying table and indexes to proceed. At the start of the operation, a Shared (S) lock is held on the source object for a very short period of time. At the end of the operation, for a short period of time, an S (Shared) lock is acquired on the source if a nonclustered index is being created. A Sch-M (Schema Modification) lock is acquired when a clustered index is created or dropped online and when a clustered or nonclustered index is being rebuilt. ONLINE cannot be set to ON when an index is being created on a local temporary table.

OFF
Table locks are applied for the duration of the index operation. An offline index operation that creates, rebuilds, or drops a clustered index, or rebuilds or drops a nonclustered index, acquires a Schema modification (Sch-M) lock on the table. This prevents all user access to the underlying table for the duration of the operation. An offline index operation that creates a nonclustered index acquires a Shared (S) lock on the table. This prevents updates to the underlying table but allows read operations, such as SELECT statements.

For more information, see Perform Index Operations Online.

Indexes, including indexes on global temp tables, can be created online except for the following cases:

  • XML index
  • Index on a local temp table
  • Initial unique clustered index on a view
  • Disabled clustered indexes
  • Columnstore indexes
  • Clustered index, if the underlying table contains LOB data types (image, ntext, text) and spatial data types
  • varchar(max) and varbinary(max) columns cannot be part of an index key. In SQL Server (Starting with SQL Server 2012 (11.x)) and Azure SQL Database, when a table contains varchar(max) or varbinary(max) columns, a clustered index containing other columns can be built or rebuilt using the ONLINE option.

For more information, see How Online Index Operations Work.
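A minimal sketch of an online index build, assuming a hypothetical dbo.Invoices table. Online index operations are not available in every edition of SQL Server, so the statement may be rejected on some installations.

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Invoices (
    InvoiceID int NOT NULL PRIMARY KEY,
    CustomerID int NOT NULL
);

-- Queries and data modifications against dbo.Invoices can proceed
-- during the main phase of the index build.
CREATE NONCLUSTERED INDEX IX_Invoices_CustomerID
    ON dbo.Invoices (CustomerID)
    WITH (ONLINE = ON);
```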

RESUMABLE = { ON | OFF}

Applies to: SQL Server (Starting with SQL Server 2019 (15.x)) and Azure SQL Database

Specifies whether an online index operation is resumable.

ON
Index operation is resumable.

OFF
Index operation is not resumable.

MAX_DURATION = time [MINUTES] used with RESUMABLE = ON (requires ONLINE = ON)

Applies to: SQL Server (Starting with SQL Server 2019 (15.x)) and Azure SQL Database

Indicates time (an integer value specified in minutes) that a resumable online index operation is executed before being paused.

Note

Resumable online index rebuilds are not supported on columnstore indexes.
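A resumable index creation can be sketched as follows. The table dbo.Shipments, the index name, and the 60-minute limit are hypothetical.

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Shipments (
    ShipmentID int NOT NULL PRIMARY KEY,
    Carrier nvarchar(50) NOT NULL
);

-- RESUMABLE = ON requires ONLINE = ON. The operation pauses on its own
-- if it has not finished after 60 minutes.
CREATE NONCLUSTERED INDEX IX_Shipments_Carrier
    ON dbo.Shipments (Carrier)
    WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- Issued later, only while the operation is paused:
-- ALTER INDEX IX_Shipments_Carrier ON dbo.Shipments RESUME;
-- ALTER INDEX IX_Shipments_Carrier ON dbo.Shipments ABORT;
```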

ALLOW_ROW_LOCKS = { ON | OFF }

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Specifies whether row locks are allowed. The default is ON.

ON
Row locks are allowed when accessing the index. The Database Engine determines when row locks are used.

OFF
Row locks are not used.

ALLOW_PAGE_LOCKS = { ON | OFF }

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Specifies whether page locks are allowed. The default is ON.

ON
Page locks are allowed when accessing the index. The Database Engine determines when page locks are used.

OFF
Page locks are not used.

OPTIMIZE_FOR_SEQUENTIAL_KEY = { ON | OFF }

Applies to: SQL Server (Starting with SQL Server 2019 (15.x)) and Azure SQL Database

Specifies whether or not to optimize for last-page insert contention. The default is OFF. See the Sequential Keys section for more information.

MAXDOP = max_degree_of_parallelism

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Overrides the max degree of parallelism configuration option for the duration of the index operation. For more information, see Configure the max degree of parallelism Server Configuration Option. Use MAXDOP to limit the number of processors used in a parallel plan execution. The maximum is 64 processors.

max_degree_of_parallelism can be:

1
Suppresses parallel plan generation.

>1
Restricts the maximum number of processors used in a parallel index operation to the specified number or fewer based on the current system workload.

0 (default)
Uses the actual number of processors or fewer based on the current system workload.

For more information, see Configure Parallel Index Operations.

DATA_COMPRESSION

Specifies the data compression option for the specified index, partition number, or range of partitions. The options are as follows:

NONE
Index or specified partitions are not compressed.

ROW
Index or specified partitions are compressed by using row compression.

PAGE
Index or specified partitions are compressed by using page compression.

For more information about compression, see Data Compression.

ON PARTITIONS ( { <partition_number_expression> | <range> } [ ,...n ] )

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

Specifies the partitions to which the DATA_COMPRESSION setting applies. If the index is not partitioned, the ON PARTITIONS argument will generate an error. If the ON PARTITIONS clause is not provided, the DATA_COMPRESSION option applies to all partitions of a partitioned index.

<partition_number_expression> can be specified in the following ways:

  • Provide the number for a partition, for example: ON PARTITIONS (2).
  • Provide the partition numbers for several individual partitions separated by commas, for example: ON PARTITIONS (1, 5).
  • Provide both ranges and individual partitions, for example: ON PARTITIONS (2, 4, 6 TO 8).

<range> can be specified as partition numbers separated by the word TO, for example: ON PARTITIONS (6 TO 8).

To set different types of data compression for different partitions, specify the DATA_COMPRESSION option more than once, for example:
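As a minimal sketch, the statement below applies a different compression setting to each part of a four-partition index. The names pfQuarter, psQuarter, and dbo.Ledger are hypothetical, and all partitions are mapped to PRIMARY only for brevity.

```sql
-- Hypothetical partition function with three boundary values, giving four partitions.
CREATE PARTITION FUNCTION pfQuarter (int)
    AS RANGE LEFT FOR VALUES (1, 2, 3);

CREATE PARTITION SCHEME psQuarter
    AS PARTITION pfQuarter ALL TO ([PRIMARY]);

CREATE TABLE dbo.Ledger (
    EntryID int NOT NULL,
    QuarterNo int NOT NULL
) ON psQuarter (QuarterNo);

-- DATA_COMPRESSION is repeated once per compression type.
CREATE CLUSTERED INDEX CIX_Ledger
    ON dbo.Ledger (QuarterNo, EntryID)
    WITH (
        DATA_COMPRESSION = NONE ON PARTITIONS (1),
        DATA_COMPRESSION = ROW  ON PARTITIONS (2 TO 3),
        DATA_COMPRESSION = PAGE ON PARTITIONS (4)
    )
    ON psQuarter (QuarterNo);
```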

Remarks

The CREATE INDEX statement is optimized like any other query. To save on I/O operations, the query processor may choose to scan another index instead of performing a table scan. The sort operation may be eliminated in some situations. On multiprocessor computers, CREATE INDEX can use more processors to perform the scan and sort operations associated with creating the index, in the same way as other queries do. For more information, see Configure Parallel Index Operations.

The CREATE INDEX operation can be minimally logged if the database recovery model is set to either bulk-logged or simple.

Indexes can be created on a temporary table. When the table is dropped or the session ends, the indexes are dropped.

A clustered index can be built on a table variable when a Primary Key is created. When the query completes or the session ends, the index is dropped.

Indexes support extended properties.

Clustered Indexes

Creating a clustered index on a table (heap) or dropping and re-creating an existing clustered index requires additional workspace to be available in the database to accommodate data sorting and a temporary copy of the original table or existing clustered index data. For more information about clustered indexes, see Create Clustered Indexes and the SQL Server Index Architecture and Design Guide.

Nonclustered Indexes

Starting with SQL Server 2016 (13.x) and in Azure SQL Database, you can create a nonclustered index on a table stored as a clustered columnstore index. If you first create a nonclustered index on a table stored as a heap or clustered index, the index will persist if you later convert the table to a clustered columnstore index. It is also not necessary to drop the nonclustered index when you rebuild the clustered columnstore index.

Limitations and Restrictions:

  • The FILESTREAM_ON option is not valid when you create a nonclustered index on a table stored as a clustered columnstore index.

Unique Indexes

When a unique index exists, the Database Engine checks for duplicate values each time data is added by insert operations. Insert operations that would generate duplicate key values are rolled back, and the Database Engine displays an error message. This is true even if the insert operation changes many rows but causes only one duplicate. If an attempt is made to enter data for which there is a unique index and the IGNORE_DUP_KEY clause is set to ON, only the rows violating the UNIQUE index fail.

Partitioned Indexes

Partitioned indexes are created and maintained in a similar manner to partitioned tables, but like ordinary indexes, they are handled as separate database objects. You can have a partitioned index on a table that is not partitioned, and you can have a nonpartitioned index on a table that is partitioned.

If you are creating an index on a partitioned table, and do not specify a filegroup on which to place the index, the index is partitioned in the same manner as the underlying table. This is because indexes, by default, are placed on the same filegroups as their underlying tables, and for a partitioned table in the same partition scheme that uses the same partitioning columns. When the index uses the same partition scheme and partitioning column as the table, the index is aligned with the table.

Warning

Creating and rebuilding nonaligned indexes on a table with more than 1,000 partitions is possible, but is not supported. Doing so may cause degraded performance or excessive memory consumption during these operations. We recommend using only aligned indexes when the number of partitions exceeds 1,000.

When partitioning a non-unique, clustered index, the Database Engine by default adds any partitioning columns to the list of clustered index keys, if not already specified.

Indexed views can be created on partitioned tables in the same manner as indexes on tables. For more information about partitioned indexes, see Partitioned Tables and Indexes and the SQL Server Index Architecture and Design Guide.

In SQL Server, statistics are not created by scanning all the rows in the table when a partitioned index is created or rebuilt. Instead, the query optimizer uses the default sampling algorithm to generate statistics. To obtain statistics on partitioned indexes by scanning all the rows in the table, use CREATE STATISTICS or UPDATE STATISTICS with the FULLSCAN clause.
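A minimal sketch of a full-scan statistics update on a single index. The table dbo.Visits is hypothetical and, to keep the example short, is not partitioned; the statement has the same form for a partitioned index.

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Visits (
    VisitID int NOT NULL,
    PageID int NOT NULL
);

CREATE NONCLUSTERED INDEX IX_Visits_PageID
    ON dbo.Visits (PageID);

-- Rebuild the statistics for this index by reading every row instead of a sample.
UPDATE STATISTICS dbo.Visits IX_Visits_PageID WITH FULLSCAN;
```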

Filtered Indexes

A filtered index is an optimized nonclustered index, suited for queries that select a small percentage of rows from a table. It uses a filter predicate to index a portion of the data in the table. A well-designed filtered index can improve query performance, reduce storage costs, and reduce maintenance costs.

Required SET Options for Filtered Indexes

The SET options in the Required Value column are required whenever any of the following conditions occur:

  • Create a filtered index.

  • INSERT, UPDATE, DELETE, or MERGE operation modifies the data in a filtered index.

  • The filtered index is used by the query optimizer to produce the query plan.

SET options               Required value   Default server value   Default OLE DB and ODBC value   Default DB-Library value
ANSI_NULLS                ON               ON                     ON                              OFF
ANSI_PADDING              ON               ON                     ON                              OFF
ANSI_WARNINGS*            ON               ON                     ON                              OFF
ARITHABORT                ON               ON                     OFF                             OFF
CONCAT_NULL_YIELDS_NULL   ON               ON                     ON                              OFF
NUMERIC_ROUNDABORT        OFF              OFF                    OFF                             OFF
QUOTED_IDENTIFIER         ON               ON                     ON                              OFF

* Setting ANSI_WARNINGS to ON implicitly sets ARITHABORT to ON when the database compatibility level is set to 90 or higher. If the database compatibility level is set to 80 or earlier, the ARITHABORT option must explicitly be set to ON.

If the SET options are incorrect, the following conditions can occur:

  • The filtered index is not created.
  • The Database Engine generates an error and rolls back INSERT, UPDATE, DELETE, or MERGE statements that change data in the index.
  • Query optimizer does not consider the index in the execution plan for any Transact-SQL statements.
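One way to put a session into the required state and verify it is sketched below. The column aliases are illustrative only.

```sql
-- Set each option to the value listed in the Required value column.
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;

-- Verify the current session: 1 means ON, 0 means OFF.
SELECT
    SESSIONPROPERTY('ANSI_NULLS')              AS ansi_nulls,
    SESSIONPROPERTY('ANSI_PADDING')            AS ansi_padding,
    SESSIONPROPERTY('ANSI_WARNINGS')           AS ansi_warnings,
    SESSIONPROPERTY('ARITHABORT')              AS arithabort,
    SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS concat_null_yields_null,
    SESSIONPROPERTY('QUOTED_IDENTIFIER')       AS quoted_identifier,
    SESSIONPROPERTY('NUMERIC_ROUNDABORT')      AS numeric_roundabort;
```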

For more information about Filtered Indexes, see Create Filtered Indexes and the SQL Server Index Architecture and Design Guide.

Spatial Indexes

For information about spatial indexes, see CREATE SPATIAL INDEX and Spatial Indexes Overview.

XML Indexes

For information about XML indexes, see CREATE XML INDEX and XML Indexes (SQL Server).

Index Key Size

The maximum size for an index key is 900 bytes for a clustered index and 1,700 bytes for a nonclustered index. (Before SQL Database and SQL Server 2016 (13.x) the limit was always 900 bytes.) Indexes on varchar columns that exceed the byte limit can be created if the existing data in the columns does not exceed the limit at the time the index is created; however, subsequent insert or update actions on the columns that cause the total size to be greater than the limit will fail. The index key of a clustered index cannot contain varchar columns that have existing data in the ROW_OVERFLOW_DATA allocation unit. If a clustered index is created on a varchar column and the existing data is in the IN_ROW_DATA allocation unit, subsequent insert or update actions on the column that would push the data off-row will fail.

Nonclustered indexes can include non-key columns in the leaf level of the index. These columns are not considered by the Database Engine when calculating the index key size. For more information, see Create Indexes with Included Columns and the SQL Server Index Architecture and Design Guide.

Note

When tables are partitioned, if the partitioning key columns are not already present in a non-unique clustered index, they are added to the index by the Database Engine. The combined size of the indexed columns (not counting included columns), plus any added partitioning columns cannot exceed 1800 bytes in a non-unique clustered index.

Computed Columns

Indexes can be created on computed columns. In addition, computed columns can have the property PERSISTED. This means that the Database Engine stores the computed values in the table, and updates them when any other columns on which the computed column depends are updated. The Database Engine uses these persisted values when it creates an index on the column, and when the index is referenced in a query.

To index a computed column, the computed column must be deterministic and precise. However, using the PERSISTED property expands the type of indexable computed columns to include:

  • Computed columns based on Transact-SQL and CLR functions and CLR user-defined type methods that are marked deterministic by the user.
  • Computed columns based on expressions that are deterministic as defined by the Database Engine but imprecise.

Persisted computed columns require the SET options to be set as shown in the previous section, Required SET Options for Filtered Indexes.

The UNIQUE or PRIMARY KEY constraint can contain a computed column as long as it satisfies all conditions for indexing. Specifically, the computed column must be deterministic and precise or deterministic and persisted. For more information about determinism, see Deterministic and Nondeterministic Functions.
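
A minimal sketch of the PERSISTED case, assuming a hypothetical dbo.OrderLinesDemo table and a session that uses the required SET options (the defaults in most client tools). The computed column is based on float arithmetic, which the Database Engine treats as deterministic but imprecise, so it is marked PERSISTED before it is indexed.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.OrderLinesDemo
(
    OrderLineId int   NOT NULL PRIMARY KEY,
    Quantity    int   NOT NULL,
    UnitPrice   float NOT NULL,
    -- float arithmetic is deterministic but imprecise,
    -- so the column is persisted to make it indexable
    LineTotal AS (Quantity * UnitPrice) PERSISTED
);

CREATE NONCLUSTERED INDEX IX_OrderLinesDemo_LineTotal
    ON dbo.OrderLinesDemo (LineTotal);

-- Inspect how the Database Engine classifies the computed column.
SELECT
    COLUMNPROPERTY(OBJECT_ID(N'dbo.OrderLinesDemo'), N'LineTotal', 'IsDeterministic') AS IsDeterministic,
    COLUMNPROPERTY(OBJECT_ID(N'dbo.OrderLinesDemo'), N'LineTotal', 'IsPrecise')       AS IsPrecise,
    COLUMNPROPERTY(OBJECT_ID(N'dbo.OrderLinesDemo'), N'LineTotal', 'IsIndexable')     AS IsIndexable;
```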

Computed columns derived from image, ntext, text, varchar(max), nvarchar(max), varbinary(max), and xml data types can be indexed either as a key or included non-key column as long as the computed column data type is allowable as an index key column or non-key column. For example, you cannot create a primary XML index on a computed xml column. If the index key size exceeds 900 bytes, a warning message is displayed.

Creating an index on a computed column may cause the failure of an insert or update operation that previously worked. Such a failure may take place when the computed column results in an arithmetic error. For example, when a table has a computed column that is not indexed, an INSERT statement works even though the computed column results in an arithmetic error for the inserted row.

If, instead, after creating the table, you create an index on the computed column, the same statement will now fail.

For more information, see Indexes on Computed Columns.
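
The arithmetic-error case described above can be sketched as follows. Both tables are hypothetical and differ only in whether the computed column is indexed; the division by zero is an assumed way to trigger the arithmetic error.

```sql
-- Without an index, the computed column is not evaluated at insert time.
CREATE TABLE dbo.ComputedNoIndexDemo (a int, b int, c AS a / b);
INSERT INTO dbo.ComputedNoIndexDemo (a, b) VALUES (1, 0);   -- works

-- With an index on the computed column, the value must be computed
-- and stored in the index when the row is inserted.
CREATE TABLE dbo.ComputedIndexedDemo (a int, b int, c AS a / b);
CREATE UNIQUE CLUSTERED INDEX IX_ComputedIndexedDemo_c
    ON dbo.ComputedIndexedDemo (c);
INSERT INTO dbo.ComputedIndexedDemo (a, b) VALUES (1, 0);   -- expected to fail (divide by zero)
```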

Included Columns in Indexes

Non-key columns, called included columns, can be added to the leaf level of a nonclustered index to improve query performance by covering the query. That is, all columns referenced in the query are included in the index as either key or non-key columns. This allows the query optimizer to locate all the required information from an index scan; the table or clustered index data is not accessed. For more information, see Create Indexes with Included Columns and the SQL Server Index Architecture and Design Guide.
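
As an illustration of a covering index, the sketch below uses a hypothetical dbo.CustomersDemo table. The index has one key column and two included columns, so the query that follows can be answered from the index alone.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.CustomersDemo
(
    CustomerId   int          NOT NULL PRIMARY KEY,
    PostalCode   nvarchar(15) NOT NULL,
    City         nvarchar(30) NOT NULL,
    AddressLine1 nvarchar(60) NOT NULL
);

-- PostalCode is the key column; City and AddressLine1 are stored
-- only at the leaf level and do not count toward the key size.
CREATE NONCLUSTERED INDEX IX_CustomersDemo_PostalCode
    ON dbo.CustomersDemo (PostalCode)
    INCLUDE (City, AddressLine1);

-- Every referenced column is in the index, so the query is covered.
SELECT AddressLine1, City, PostalCode
FROM dbo.CustomersDemo
WHERE PostalCode BETWEEN N'98000' AND N'99999';
```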

Specifying Index Options

SQL Server 2005 (9.x) introduced new index options and also modified the way in which options are specified. In backward compatible syntax, WITH option_name is equivalent to WITH ( option_name = ON ). When you set index options, the following rules apply:

  • New index options can only be specified by using WITH ( option_name = ON | OFF ).
  • Options cannot be specified by using both the backward compatible and new syntax in the same statement. For example, specifying WITH ( DROP_EXISTING, ONLINE = ON ) causes the statement to fail.
  • When you create an XML index, the options must be specified by using WITH ( option_name = ON | OFF ).
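
A short sketch of the current option syntax, using a hypothetical dbo.OptionsDemo table. The particular options chosen here are only examples.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.OptionsDemo
(
    Id   int      NOT NULL,
    Code char(10) NOT NULL
);

-- All options are listed inside a single parenthesized WITH clause.
CREATE NONCLUSTERED INDEX IX_OptionsDemo_Code
    ON dbo.OptionsDemo (Code)
    WITH (PAD_INDEX = ON, FILLFACTOR = 80, STATISTICS_NORECOMPUTE = OFF);
```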

DROP_EXISTING Clause

You can use the DROP_EXISTING clause to rebuild the index, add or drop columns, modify options, modify column sort order, or change the partition scheme or filegroup.

If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.

DROP_EXISTING enhances performance when you re-create a clustered index, with either the same or a different set of keys, on a table that also has nonclustered indexes. DROP_EXISTING replaces the execution of a DROP INDEX statement on the old clustered index followed by the execution of a CREATE INDEX statement for the new clustered index. The nonclustered indexes are rebuilt once, and then only if the index definition has changed. The DROP_EXISTING clause does not rebuild the nonclustered indexes when the index definition has the same index name, key and partition columns, uniqueness attribute, and sort order as the original index.

Whether the nonclustered indexes are rebuilt or not, they always remain in their original filegroups or partition schemes and use the original partition functions. If a clustered index is rebuilt to a different filegroup or partition scheme, the nonclustered indexes are not moved to coincide with the new location of the clustered index. Therefore, even if the nonclustered indexes were previously aligned with the clustered index, they may no longer be aligned with it. For more information about partitioned index alignment, see Partitioned Tables and Indexes.

The DROP_EXISTING clause will not sort the data again if the same index key columns are used in the same order and with the same ascending or descending order, unless the index statement specifies a nonclustered index and the ONLINE option is set to OFF. If the clustered index is disabled, the operation must be performed with ONLINE set to OFF. If a nonclustered index is disabled and is not associated with a disabled clustered index, the operation can be performed with ONLINE set to OFF or ON.
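
The following sketch shows DROP_EXISTING being used to add a column to an existing index while keeping its name. The dbo.FinanceDemo table and its columns are hypothetical.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.FinanceDemo
(
    FinanceKey      int NOT NULL,
    DateKey         int NOT NULL,
    OrganizationKey int NOT NULL
);

-- Original definition: two key columns.
CREATE NONCLUSTERED INDEX IX_FinanceDemo
    ON dbo.FinanceDemo (FinanceKey, DateKey);

-- Re-create the index under the same name with one more key column.
-- Without DROP_EXISTING = ON this statement would fail because the name is in use.
CREATE NONCLUSTERED INDEX IX_FinanceDemo
    ON dbo.FinanceDemo (FinanceKey, DateKey, OrganizationKey DESC)
    WITH (DROP_EXISTING = ON);
```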

Note

When indexes with 128 extents or more are dropped or rebuilt, the Database Engine defers the actual page deallocations, and their associated locks, until after the transaction commits.

ONLINE Option

The following guidelines apply for performing index operations online:

  • The underlying table cannot be altered, truncated, or dropped while an online index operation is in process.
  • Additional temporary disk space is required during the index operation.
  • Online operations can be performed on partitioned indexes and indexes that contain persisted computed columns, or included columns.
  • The WAIT_AT_LOW_PRIORITY option allows you to decide how the index operation can proceed when blocked on the Sch-M lock. This is currently supported in Azure SQL Database and Azure SQL Managed Instance only.

For more information, see Perform Index Operations Online.

Resources

The following resources are required for a resumable online index create operation:

  • Additional space to keep the index being built, including while the index build is paused.
  • Additional log throughput during the sorting phase. The overall log space usage for a resumable index is lower than for a regular online index create, and log truncation is allowed during the operation.
  • A DDL state preventing any DDL modification.
  • Ghost cleanup is blocked on the in-build index for the duration of the operation, both while paused and while the operation is running.

Current functional limitations

The following functionality is disabled for resumable index create operations:

  • After a resumable online index create operation is paused, the initial value of MAXDOP cannot be changed.

  • Creating an index that contains:

    • Computed or TIMESTAMP column(s) as key columns
    • A LOB column as an included column
    • A filter predicate (filtered index)

Resumable index operations

Applies to: SQL Server (Starting with SQL Server 2019 (15.x)) and Azure SQL Database

The following guidelines apply for resumable index operations:

  • Online index create is specified as resumable using the RESUMABLE = ON option.
  • The RESUMABLE option is not persisted in the metadata for a given index and applies only to the duration of the current DDL statement. Therefore, the RESUMABLE = ON clause must be specified explicitly to enable resumability.
  • The MAX_DURATION option is only supported for the RESUMABLE = ON option.
  • MAX_DURATION for the RESUMABLE option specifies the time interval for an index being built. Once this time is used up, the index build is either paused or it completes its execution. The user decides when a build for a paused index can be resumed. The time in minutes for MAX_DURATION must be greater than 0 minutes and less than or equal to one week (7 * 24 * 60 = 10080 minutes). A long pause in an index operation may impact DML performance on the table as well as database disk capacity, since both indexes, the original one and the newly created one, require disk space and need to be updated during DML operations. If the MAX_DURATION option is omitted, the index operation continues until its completion or until a failure occurs.
  • To pause the index operation immediately, you can stop (Ctrl-C) the ongoing command, execute the ALTER INDEX PAUSE command, or execute the KILL <session_id> command. Once the command is paused, it can be resumed using the ALTER INDEX command.
  • Re-executing the original statement for a resumable index automatically resumes a paused index create operation.
  • The SORT_IN_TEMPDB = ON option is not supported for a resumable index.
  • The DDL command with RESUMABLE = ON cannot be executed inside an explicit transaction (it cannot be part of a BEGIN TRAN ... COMMIT block).
  • To resume or abort an index create or rebuild, use the ALTER INDEX T-SQL syntax.

Note

The DDL command runs until it completes, pauses, or fails. If the command pauses, an error is issued indicating that the operation was paused and that the index creation did not complete. More information about the current index status can be obtained from sys.index_resumable_operations. As before, in case of a failure, an error is issued as well.

To indicate that an index create is executed as resumable operation and to check its current execution state, see sys.index_resumable_operations.
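
The guidelines above can be sketched as follows, assuming a version that supports resumable index creation and a hypothetical dbo.ResumableDemo table. The statements are meant to be run individually: PAUSE, RESUME, and ABORT only succeed while a resumable operation exists for the index, and on a tiny table the build finishes before it can be paused.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.ResumableDemo
(
    Id      int          NOT NULL,
    Payload varchar(100) NOT NULL
);

-- Start a resumable online build that pauses itself after 60 minutes.
CREATE NONCLUSTERED INDEX IX_ResumableDemo_Id
    ON dbo.ResumableDemo (Id)
    WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- From a second session, while the build is still running:
ALTER INDEX IX_ResumableDemo_Id ON dbo.ResumableDemo PAUSE;

-- Check the state of paused or running resumable operations.
SELECT name, state_desc, percent_complete, last_pause_time
FROM sys.index_resumable_operations;

-- Continue the paused build ...
ALTER INDEX IX_ResumableDemo_Id ON dbo.ResumableDemo RESUME;

-- ... or, instead of resuming, discard the work done so far.
ALTER INDEX IX_ResumableDemo_Id ON dbo.ResumableDemo ABORT;
```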

WAIT_AT_LOW_PRIORITY with online index operations

Applies to: This syntax for CREATE INDEX currently applies to Azure SQL Database and Azure SQL Managed Instance only. For ALTER INDEX, this syntax applies to SQL Server (Starting with SQL Server 2014 (12.x)) and Azure SQL Database. For more information, see ALTER INDEX.

The low_priority_lock_wait syntax allows for specifying WAIT_AT_LOW_PRIORITY behavior. WAIT_AT_LOW_PRIORITY can be used with ONLINE = ON only.

The WAIT_AT_LOW_PRIORITY option allows DBAs to manage the Sch-S and Sch-M locks required for online index creation and allows them to select one of 3 options. In all 3 cases, if there are no blocking activities during the wait time (MAX_DURATION = n [MINUTES]), the online index operation is executed immediately without waiting and the DDL statement is completed.

WAIT_AT_LOW_PRIORITY indicates that the online index create operation will wait for low priority locks, allowing other operations to proceed while the online index build operation is waiting. Omitting the WAIT_AT_LOW_PRIORITY option is equivalent to WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = NONE).

MAX_DURATION = time [MINUTES]

The wait time (an integer value specified in minutes) that the online index create locks will wait with low priority when executing the DDL command. If the operation is blocked for the MAX_DURATION time, the specified ABORT_AFTER_WAIT action will be executed. MAX_DURATION time is always in minutes, and the word MINUTES can be omitted.

ABORT_AFTER_WAIT = [ NONE | SELF | BLOCKERS ]

NONE
Continue waiting for the lock with normal (regular) priority.

SELF
Exit the online index create DDL operation currently being executed, without taking any action. The option SELF cannot be used with a MAX_DURATION of 0.

BLOCKERS
Kill all user transactions that block the online index rebuild DDL operation so that the operation can continue. The BLOCKERS option requires the login to have ALTER ANY CONNECTION permission.
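
A brief sketch of the low priority lock syntax, using a hypothetical dbo.LowPriorityDemo table. As noted above, this form of CREATE INDEX is expected to work only on the platforms that support WAIT_AT_LOW_PRIORITY for index creation.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.LowPriorityDemo
(
    Id   int         NOT NULL,
    Name varchar(50) NOT NULL
);

-- Wait up to 5 minutes at low priority; if still blocked,
-- give up on this CREATE INDEX (SELF) rather than kill the blockers.
CREATE CLUSTERED INDEX CIX_LowPriorityDemo_Id
    ON dbo.LowPriorityDemo (Id)
    WITH (ONLINE = ON (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 5 MINUTES, ABORT_AFTER_WAIT = SELF)));
```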

Row and Page Locks Options

When ALLOW_ROW_LOCKS = ON and ALLOW_PAGE_LOCKS = ON, row-, page-, and table-level locks are allowed when accessing the index. The Database Engine chooses the appropriate lock and can escalate the lock from a row or page lock to a table lock.

When ALLOW_ROW_LOCKS = OFF and ALLOW_PAGE_LOCKS = OFF, only a table-level lock is allowed when accessing the index.
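
As a small illustration, the sketch below turns both lock options off for one index on a hypothetical dbo.LockDemo table and then reads the settings back from sys.indexes.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.LockDemo
(
    Id   int      NOT NULL,
    Code char(10) NOT NULL
);

-- Only table-level locks are allowed when this index is accessed.
CREATE NONCLUSTERED INDEX IX_LockDemo_Code
    ON dbo.LockDemo (Code)
    WITH (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF);

-- Confirm the lock settings recorded for each index on the table.
SELECT name, allow_row_locks, allow_page_locks
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.LockDemo');
```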

Sequential Keys

Applies to: SQL Server (Starting with SQL Server 2019 (15.x)) and Azure SQL Database

Last-page insert contention is a common performance problem that occurs when a large number of concurrent threads attempt to insert rows into an index with a sequential key. An index is considered sequential when the leading key column contains values that are always increasing (or decreasing), such as an identity column or a date that defaults to the current date/time. Because the keys being inserted are sequential, all new rows will be inserted at the end of the index structure - in other words, on the same page. This leads to contention for the page in memory which can be observed as several threads waiting on PAGELATCH_EX for the page in question.

Enabling the OPTIMIZE_FOR_SEQUENTIAL_KEY index option enables an optimization within the database engine that helps improve throughput for high-concurrency inserts into the index. It is intended for indexes that have a sequential key and thus are prone to last-page insert contention, but it may also help with indexes that have hot spots in other areas of the B-Tree index structure.
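
A minimal sketch of the option on an index with a sequential key, assuming SQL Server 2019 (15.x) or later and a hypothetical dbo.EventLogDemo table whose leading key column is an identity column.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.EventLogDemo
(
    EventId  bigint IDENTITY(1, 1) NOT NULL,
    LoggedAt datetime2     NOT NULL DEFAULT SYSUTCDATETIME(),
    Message  nvarchar(400) NULL
);

-- EventId always increases, so concurrent inserts target the last page.
CREATE CLUSTERED INDEX CIX_EventLogDemo_EventId
    ON dbo.EventLogDemo (EventId)
    WITH (OPTIMIZE_FOR_SEQUENTIAL_KEY = ON);
```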

Viewing Index Information

To return information about indexes, you can use catalog views, system functions, and system stored procedures.
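
The three approaches can be sketched against a hypothetical dbo.IndexInfoDemo table: a catalog view query, a system function, and a system stored procedure.

```sql
-- Hypothetical table and index; all names are illustrative only.
CREATE TABLE dbo.IndexInfoDemo
(
    Id   int          NOT NULL,
    Name nvarchar(50) NOT NULL
);
CREATE UNIQUE NONCLUSTERED INDEX IX_IndexInfoDemo_Name
    ON dbo.IndexInfoDemo (Name) INCLUDE (Id);

-- Catalog views: one row per index column, with key order and included flag.
SELECT i.name AS index_name, i.type_desc, i.is_unique,
       c.name AS column_name, ic.key_ordinal, ic.is_included_column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.IndexInfoDemo')
ORDER BY i.index_id, ic.key_ordinal;

-- System function: read a single property of one index.
SELECT INDEXPROPERTY(OBJECT_ID(N'dbo.IndexInfoDemo'),
                     N'IX_IndexInfoDemo_Name', 'IsUnique') AS IsUnique;

-- System stored procedure: summary of all indexes on the table.
EXEC sp_helpindex N'dbo.IndexInfoDemo';
```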

Data Compression

Data compression is described in the topic Data Compression. The following are key points to consider:

  • Compression can allow more rows to be stored on a page, but does not change the maximum row size.
  • Non-leaf pages of an index are not page compressed but can be row compressed.
  • Each nonclustered index has an individual compression setting, and does not inherit the compression setting of the underlying table.
  • When a clustered index is created on a heap, the clustered index inherits the compression state of the heap unless an alternative compression state is specified.

The following restrictions apply to partitioned indexes:

  • You cannot change the compression setting of a single partition if the table has nonaligned indexes.
  • The ALTER INDEX ... REBUILD PARTITION ... syntax rebuilds the specified partition of the index.
  • The ALTER INDEX ... REBUILD WITH ... syntax rebuilds all partitions of the index.

To evaluate how changing the compression state will affect a table, an index, or a partition, use the sp_estimate_data_compression_savings stored procedure.
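
As a sketch, the following creates a row-compressed index on a hypothetical dbo.CompressionDemo table and then estimates the effect of switching the table and its indexes to page compression. It assumes an edition that supports data compression.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.CompressionDemo
(
    Id          int           NOT NULL,
    Description nvarchar(200) NOT NULL
);

-- The index carries its own compression setting.
CREATE NONCLUSTERED INDEX IX_CompressionDemo_Description
    ON dbo.CompressionDemo (Description)
    WITH (DATA_COMPRESSION = ROW);

-- Estimate the size under PAGE compression for all indexes and partitions.
EXEC sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'CompressionDemo',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = N'PAGE';
```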

Permissions

Requires ALTER permission on the table or view, or membership in the db_ddladmin fixed database role.

Limitations and Restrictions

In Azure Synapse Analytics and Analytics Platform System (PDW), you cannot create:

  • A clustered or nonclustered rowstore index on a data warehouse table when a columnstore index already exists. This behavior is different from SMP SQL Server, which allows both rowstore and columnstore indexes to co-exist on the same table.
  • An index on a view.

Metadata

To view information on existing indexes, you can query the sys.indexes catalog view.

Version Notes

SQL Database does not support filegroup and filestream options.

Examples: All versions. Uses the AdventureWorks database

A. Create a simple nonclustered rowstore index

The following examples create a nonclustered index on a single column of a table.

B. Create a simple nonclustered rowstore composite index

The following example creates a nonclustered composite index on two columns of a table.

C. Create an index on a table in another database

The following example creates a clustered index on a single column of a table in another database.

D. Add a column to an index

The following example creates index IX_FF with two columns from the dbo.FactFinance table. The next statement rebuilds the index with one more column and keeps the existing name.

Examples: SQL Server, Azure SQL Database

E. Create a unique nonclustered index

The following example creates a unique nonclustered index on a column of a table in the AdventureWorks2012 database. The index will enforce uniqueness on the data inserted into the column.

The following query tests the uniqueness constraint by attempting to insert a row with the same value as that in an existing row.

The statement fails with a duplicate-key error.

F. Use the IGNORE_DUP_KEY option

The following example demonstrates the effect of the IGNORE_DUP_KEY option by inserting multiple rows into a temporary table, first with the option set to ON and again with the option set to OFF. A single row is inserted into the temporary table that will intentionally cause a duplicate value when the second multiple-row INSERT statement is executed. A count of rows in the table returns the number of rows inserted.

With the option set to ON, the rows from the source table that did not violate the uniqueness constraint are inserted successfully. A warning is issued and the duplicate row is ignored, but the entire transaction is not rolled back.

The same statements are executed again, but with IGNORE_DUP_KEY set to OFF.

This time, none of the rows from the source table are inserted into the temporary table, even though only one row in the source table violates the index constraint.
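
A simplified sketch of this comparison follows. It uses literal values instead of a source table, and the temporary table and index names are assumptions for illustration.

```sql
-- IGNORE_DUP_KEY = ON: the duplicate is skipped with a warning.
CREATE TABLE #DupOn (C1 nvarchar(10), C2 nvarchar(50));
CREATE UNIQUE INDEX AK_DupOn ON #DupOn (C2) WITH (IGNORE_DUP_KEY = ON);
INSERT INTO #DupOn VALUES (N'OC', N'Ounces');
INSERT INTO #DupOn VALUES (N'A', N'Ounces'), (N'B', N'Grams'), (N'C', N'Liters');
SELECT COUNT(*) AS [Number of rows] FROM #DupOn;   -- 3: one seed row plus two non-duplicates

-- IGNORE_DUP_KEY = OFF: the whole multiple-row INSERT fails.
CREATE TABLE #DupOff (C1 nvarchar(10), C2 nvarchar(50));
CREATE UNIQUE INDEX AK_DupOff ON #DupOff (C2) WITH (IGNORE_DUP_KEY = OFF);
INSERT INTO #DupOff VALUES (N'OC', N'Ounces');
INSERT INTO #DupOff VALUES (N'A', N'Ounces'), (N'B', N'Grams'), (N'C', N'Liters');
SELECT COUNT(*) AS [Number of rows] FROM #DupOff;  -- 1: only the seed row remains

DROP TABLE #DupOn;
DROP TABLE #DupOff;
```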

G. Using DROP_EXISTING to drop and re-create an index

The following example drops and re-creates an existing index on a column of a table in the AdventureWorks2012 database by using the DROP_EXISTING option. Additional index options are also set.

H. Create an index on a view

The following example creates a view and an index on that view. Two queries are included that use the indexed view.

I. Create an index with included (non-key) columns

The following example creates a nonclustered index with one key column and four non-key columns. A query that is covered by the index follows. To display the index that is selected by the query optimizer, on the Query menu in SQL Server Management Studio, select Display Actual Execution Plan before executing the query.

J. Create a partitioned index

The following example creates a nonclustered partitioned index on an existing partition scheme in the AdventureWorks2012 database. This example assumes the partitioned index sample has been installed.

Applies to: SQL Server (Starting with SQL Server 2008) and Azure SQL Database

K. Creating a filtered index

The following example creates a filtered index on the Production.BillOfMaterials table in the AdventureWorks2012 database. The filter predicate can include columns that are not key columns in the filtered index. The predicate in this example selects only the rows where EndDate is non-NULL.
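
A sketch of such a filtered index follows. To stay self-contained it uses a hypothetical stand-in table rather than Production.BillOfMaterials; the index name and the choice of key columns are assumptions.

```sql
-- Hypothetical stand-in table; all names are illustrative only.
CREATE TABLE dbo.BillOfMaterialsDemo
(
    BillOfMaterialsID int      NOT NULL PRIMARY KEY,
    ComponentID       int      NOT NULL,
    StartDate         datetime NOT NULL,
    EndDate           datetime NULL
);

-- EndDate appears only in the filter predicate, not in the index key.
CREATE NONCLUSTERED INDEX FIX_BillOfMaterialsDemo_WithEndDate
    ON dbo.BillOfMaterialsDemo (ComponentID, StartDate)
    WHERE EndDate IS NOT NULL;

-- A query whose predicate matches the filter can use the smaller index.
SELECT ComponentID, StartDate
FROM dbo.BillOfMaterialsDemo
WHERE EndDate IS NOT NULL
  AND ComponentID = 5;
```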

L. Create a compressed index

The following example creates an index on a nonpartitioned table by using row compression.

The following example creates an index on a partitioned table by using row compression on all partitions of the index.

The following example creates an index on a partitioned table by using page compression on one partition of the index and row compression on a range of other partitions of the index.

M. Create, resume, pause, and abort resumable index operations

Applies to: SQL Server (Starting with SQL Server 2019 (15.x)) and Azure SQL Database

N. CREATE INDEX with different low priority lock options

The following examples use the WAIT_AT_LOW_PRIORITY option to specify different strategies for dealing with blocking.

The following example uses the RESUMABLE option and specifies two MAX_DURATION values: the first applies to the ABORT_AFTER_WAIT option, and the second applies to the RESUMABLE option.

Examples: Azure Synapse Analytics and Analytics Platform System (PDW)

O. Basic syntax

P. Create a nonclustered index on a table in the current database

The following example creates a nonclustered index on a single column of a table in the current database.

Q. Create a clustered index on a table in another database

The following example creates a clustered index on a single column of a table in another database.

R. Create an ordered clustered index on a table

The following example creates an ordered clustered index on two columns of a table.

S. Convert a CCI to an ordered clustered index on a table

The following example converts an existing clustered columnstore index (CCI) to an ordered clustered columnstore index on two columns of a table.

Source: https://docs.microsoft.com/en-us/sql/t-sql/statements/create-index-transact-sql
How do SQL Indexes Work

Clustered and Nonclustered Indexes Described

Applies to: SQL Server (all supported versions), Azure SQL Database

An index is an on-disk structure associated with a table or view that speeds retrieval of rows from the table or view. An index contains keys built from one or more columns in the table or view. These keys are stored in a structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently.

A table or view can contain the following types of indexes:

  • Clustered

    • Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be stored in only one order.  
    • The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.
  • Nonclustered

    • Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value.

    • The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.

    • You can add nonkey columns to the leaf level of the nonclustered index to bypass existing index key limits and execute fully covered, indexed queries. For more information, see Create Indexes with Included Columns. For details about index key limits, see Maximum Capacity Specifications for SQL Server.

Both clustered and nonclustered indexes can be unique. This means no two rows can have the same value for the index key. Otherwise, the index is not unique and multiple rows can share the same key value. For more information, see Create Unique Indexes.

Indexes are automatically maintained for a table or view whenever the table data is modified.

See Indexes for additional types of special purpose indexes.

Indexes and Constraints

Indexes are automatically created when PRIMARY KEY and UNIQUE constraints are defined on table columns. For example, when you create a table with a UNIQUE constraint, Database Engine automatically creates a nonclustered index. If you configure a PRIMARY KEY, Database Engine automatically creates a clustered index, unless a clustered index already exists. When you try to enforce a PRIMARY KEY constraint on an existing table and a clustered index already exists on that table, SQL Server enforces the primary key using a nonclustered index.

For more information, see Create Primary Keys and Create Unique Constraints.
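
The automatic index creation described above can be observed with a small sketch. The dbo.ConstraintDemo table and constraint names are hypothetical; the query lists the indexes that back the two constraints.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.ConstraintDemo
(
    Id    int           NOT NULL CONSTRAINT PK_ConstraintDemo PRIMARY KEY,
    Email nvarchar(100) NOT NULL CONSTRAINT UQ_ConstraintDemo_Email UNIQUE
);

-- Each constraint is backed by an index with the same name;
-- type_desc shows whether it is clustered or nonclustered.
SELECT name, type_desc, is_primary_key, is_unique_constraint
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.ConstraintDemo');
```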

How Indexes are used by the Query Optimizer

Well-designed indexes can reduce disk I/O operations and consume fewer system resources, thereby improving query performance. Indexes can be helpful for a variety of queries that contain SELECT, UPDATE, DELETE, or MERGE statements. Consider a query in the AdventureWorks2012 database. When this query is executed, the query optimizer evaluates each available method for retrieving the data and selects the most efficient method. The method may be a table scan, or may be scanning one or more indexes if they exist.

When performing a table scan, the query optimizer reads all the rows in the table, and extracts the rows that meet the criteria of the query. A table scan generates many disk I/O operations and can be resource intensive. However, a table scan could be the most efficient method if, for example, the result set of the query is a high percentage of rows from the table.

When the query optimizer uses an index, it searches the index key columns, finds the storage location of the rows needed by the query and extracts the matching rows from that location. Generally, searching the index is much faster than searching the table because unlike a table, an index frequently contains very few columns per row and the rows are in sorted order.

The query optimizer typically selects the most efficient method when executing queries. However, if no indexes are available, the query optimizer must use a table scan. Your task is to design and create indexes that are best suited to your environment so that the query optimizer has a selection of efficient indexes from which to select. SQL Server provides the Database Engine Tuning Advisor to help with the analysis of your database environment and in the selection of appropriate indexes.
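
To see the difference between the two access methods, a sketch along these lines can be run with the actual execution plan displayed. The dbo.EmployeesDemo table, its columns, and the lookup value are assumptions for illustration.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.EmployeesDemo
(
    EmployeeId int          NOT NULL,
    Title      nvarchar(50) NOT NULL,
    HireDate   date         NOT NULL
);

-- No index exists yet, so the only available method is a table scan.
SELECT Title, HireDate
FROM dbo.EmployeesDemo
WHERE EmployeeId = 250;

-- Give the optimizer an index that matches the WHERE clause and covers the query.
CREATE UNIQUE NONCLUSTERED INDEX IX_EmployeesDemo_EmployeeId
    ON dbo.EmployeesDemo (EmployeeId)
    INCLUDE (Title, HireDate);

-- The same query can now be answered by searching the index.
SELECT Title, HireDate
FROM dbo.EmployeesDemo
WHERE EmployeeId = 250;
```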

Source: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/clustered-and-nonclustered-indexes-described

What is an index in SQL?

So, how does indexing actually work?

Well, first off, the database table does not reorder itself when we put an index on a column to optimize query performance. Instead, the database builds a separate data structure for the index, most commonly a B-tree.

The major advantage of a B-tree is that the data in it is kept in sorted order. In addition, the B-tree is time-efficient: operations such as searching, insertion, and deletion can be done in logarithmic time.

[Figure: the index structure]

In the index, each indexed column value is mapped to a database-internal identifier (a pointer) that points to the exact location of the row. When a query filters on the indexed column, the database searches the index and follows the pointer instead of scanning every row.

[Figure: visual representation of the query execution]

So, indexing cuts the time complexity of the lookup down from O(n) to O(log n).

More detail: https://pankajtanwar.in/blog/what-is-the-sorting-algorithm-behind-order-by-query-in-mysql

Source: https://stackoverflow.com/questions/2955459/what-is-an-index-in-sql

