SQL Server Index on Computed Columns

Learning Objective

The objective of this SQL Server tutorial is to teach you what is a computed column and why and how to create an index on a computed column.

What is a Computed Column?

A Computed Column is a column whose values are derived or computed by an expression. The expression might be a constant, function or combination of values from one or more columns in the same table. The need for such a column arises when we need more information or insight about a particular piece of data in a table but which does not exist in the table as a column. One such example is total price which is a product of unit price and quantity of a commodity. The unit price and quantity values are available in the table but the total price which is variable and dependent is not. Another example might be an employee table containing the employee’s salary and tax slab to which the employee belongs but not the exact tax amount payable per month as part of salary deduction. In such cases computed columns come to the rescue. There can be one or more Computed Columns defined on a table.

A computed column is of 2 types – PERSISTED and VIRTUAL (or NOT PERSISTED).

A Persisted Computed Column is permanently physically stored as an additional column in the table. It is more efficient of the two as the column is always available in the actual table but requires additional disk space for storage.

The other is when the Computed Column exists as a virtual or logical entity and its values are computed dynamically at run time when it is referred to by a query. This saves disk space but slows query processing as the Computed Column has to generated every time it is referred to in a query. If you want to create a Computed Column as PERSISTED you have to precede the column name with the PERSISTED keyword. Otherwise SQL Server will create it as a virtual Computed Column i.e. not PERSISTED. If you want to add a Computed Column to an already existing table you can do so with ALTER TABLE statement and using or skipping the PERSISTED keyword as per your requirement.

What is an Index on Computed Column?

An Index on a Computed Column is simply an index which one or more Computed Columns as it’s key. Or to put it otherwise it is indexing a Computed Column(s) to enable faster search and retrieval of values from the column(s). However, to be able to create an index on a Computed Column, the Computed Column must satisfy the below 2 conditions.

  1. Determinism – Determinism means that the column values of a computed column should be fixed and not change with time. The expression for computation should always produce the same value. An example is that the computed tax payable per month for an employee will be the same always until the employee’s salary changes and a different slab becomes applicable or the government changes the tax rate for the employees’ current slab.
  2. Precision – Precision means that the datatype of the Computed Column or the table columns from which it is derived cannot have FLOAT or REAL numeric datatypes. If the data type of the Computed Column is numeric it has to be a whole number and cannot be decimal which is an approximate value and never absolute.

If the above conditions are met, we can create a non-clustered index using one or more persisted or normal Computed Columns.

Operation

There is no special syntax to create an index on Computed Columns. It is the same CREATE INDEX statement which is used to create the other types of indexes. However, the syntax to create a Computed Column using CREATE TABLE or ALTER TABLE is different and the same is highlighted below.

SQL Server Index on Computed Columns Syntax

The basic syntax to create a Computed Column using CREATE TABLE is as below.

CREATE TABLE table_name
column1 datatype,
column2 datatype,
………………………………
computed_column1 AS expression,
computed_column2 AS expression PERSISTED;

The basic syntax to create a Computed Column using ALTER TABLE is as below.

ALTER TABLE table_name
computed_column1 AS expression,
computed_column2 AS expression PERSISTED;

In this syntax,

  • expression – the expression to compute the Computed Column values.
  • PERSISTED – optional keyword to create a PERSISTED Computed Column. If skipped SQL Server will create a virtual Computed Column. Please note that on the demo syntax above I have cited both cases.

SQL Server Index on Computed Columns Example

Let us consider a table called employee which contains the basic information of employees. The table is represented below. The emp_id column is the primary key column in the table with an associated default clustered index by the name PK_employee.

emp_idemp_nameemp_sexemp_dob
1David JacksonMale8/8/1998
2Jim JamesonFemale11/26/1998
3Kate JohnsonFemale1/21/1987
4Will RayMale9/19/1989
5Shane MathewsFemale10/13/1987
6Shed PriceMale2/3/1987
7Viktor SmithMale2/22/2000
8Vincent SmithsonFemale9/15/1999
9Janice StreepFemale12/29/2000
10Laura WellsFemale1/1/2000
11Mac BullMale5/24/1996
12Patrick PattersonMale1/7/1999
13Julie OrbisonFemale4/24/1988
14Elice HemingwayFemale7/27/1988
15Wayne JohnsonMale3/8/1997
Table: Employee

Now suppose HR wants the birth year of the employees for some reason. We can do so using the year () function on the emp_DOB column values as below.

Query

SELECT emp_id, year(emp_dob) AS emp_birth_year
FROM employee
WHERE emp_birth_year=1999;

If we display the estimated execution plan for the query it will be as below. From the query plan we can see that SQL Server is doing an index scan of the default clustered index which is costly both in terms of time and computing resource. This is because there is no Computed Column or an index on it.

Query Execution Plan

SQL Server Index on Computed Columns

So, we add a PERSISTED Computed Column called emp_birth_year to the employee table.

Adding the Computed Column

The below ALTER TABLE statement does the same.

ALTER TABLE employee
ADD emp_birth_year AS year(emp_dob);

Now we can specify the Computed Column directly in the SELECT query as below.

SELECT emp_id, emp_birth_year
FROM employee
WHERE emp_birth_year=1999;

But this will not change the query processing since we still do not have any index on it. The only way for SQL Server to resolve the query is by scanning the default clustered index.

So now we will create a nonclustered index on the Computed Column to improve the query processing and replace inefficient index scan with efficient index seek.

Creating Index on the Computed Column

The below CREATE INDEX statement does the same.

Using T-SQL

CREATE NONCLUSTERED INDEX ix_cc_yob
ON employee(emp_dob);

The above index can also be created from the SSMS (SQL Server Management Studio) GUI in the following steps.

Using SSMS

  1. Right click in the Index node under the employee table in Object Explorer and select New Index – > Non-Clustered Index.
SQL Server Index on Computed Columns

2. In the General tab of the New Index dialog box that will open, type in the name of the index in the Index Name textbox. Then go to the Index Key Columns tab in the middle of the dialog and add Computed Column using the Add button on the right as shown in the figure below.

3. Once added it will show in the main window as seen below. Then click OK on all dialogs to create the new index and it will be created.

Checking the new Index

You can check the new index using below T-SQL command or from the SSMS GUI as follows.

Using T-SQL

EXEC SP_HELPINDEX customer;
index_nameindex_descriptionindex_keys
ix_cc_yobnonclustered located on PRIMARYemp_birth_year
PK_employeeclustered, unique, primary key located on PRIMARYemp_id

Using SSMS

Checking the New Index in Action

With the index in place if we generate the estimated query plan for the above query, we can see the difference. The estimated query execution plan and the query output are shown below. We can see that the query processor is now doing an index seek on the newly created nonclustered index. So, our purpose is achieved.

Query Execution Plan

Query execution plan after index on computed columns

Query Resultset

emp_idemp_birth_year
81999
121999

Advertisement