SQL Server CORRELATED SUBQUERY

Learning Objective

The objective of this SQL Server tutorial is to teach you how to use a CORRELATED SUBQUERY in a SQL statement to accept input from the main query for processing and generation of final resultset.

What is CORRELATED SUBQUERY in SQL Server?

There are two types of subqueries – CORRELATED SUBQUERY and UNCORRELATED SUBQUERY (simply put a normal subquery). A correlated subquery accepts input from the main query for processing and hence is dependent upon it. That is why it is called a CORRELATED SUBQUERY i.e. a subquery related to the main query. Without the main query a CORRELATED SUBQUERY does not have a life of its own. A CORRELATED SUBQUERY executes multiple times for every row fetched from the table by the main query. That is why a CORRELATED SUBQUERY is also called a REPEATING SUBQUERY.

The differences between CORRELATED and UNCORRELATED SUBQUERY are listed below.

Pt. No.Uncorrelated SubqueryCorrelated Subquery
1Subquery executes first and only onceSubquery executes after main query and executes multiple times
2Subquery provides input to the main querySubquery accepts input from the main query
3Subquery resultset value or values are patched to the final resultsetSubquery resultset value or values are used for evaluation or comparison with main query row value or values as per condition specified
4Subquery resultset value or values form part of the final resultsetSubquery resultset value or values do not form part of the final resultset
5Efficient and faster processing since subquery executes only onceInefficient and slower processing since subquery processes repeatedly for every row fetched by main query from the table

Operation

A CORRELATED SUBQUERY can be part of SELECT, UPDATE and DELETE statements. A CORRELATED SUBQUERY can return one or more values and is usually used in conjunction with an aggregate function. The value or values returned is compared or evaluated against the specified outer query row value to keep or reject the outer query row (i.e. record) as a part of the final resultset.

SQL Server CORRELATED SUBQUERY Syntax

The basic syntax of a CORRELATED SUBQUERY can be presented as below.

SELECT column_list
FROM table1 outer_table
WHERE outer_column operator
(select column_list
FROM table2 inner_table
WHERE inner_table.inner_column= outer_table.outer_column);

In this syntax,

  • column_list – list of table columns.
  • outer – table on which the main query operates.
  • outer_column – column belonging to the outer table.
  • operator – a comparison, logical or mathematical operator.
  • inner – table on which the subquery operates. It can be the same as outer table or another table.
  • inner _column – column belonging to the outer table.

1)  Example 1: SQL Server CORRELATED SUBQUERY

Let us understand the same with the help of some examples. Suppose we have a table called orders containing the information of orders placed by different customers as below. We will use this table as the sample reference table for our example.

order_idcust_nameorder_dateorder_value
210Michael Smith6/11/20195500
211Michael Smith6/11/20195100
212Jackie Collins6/13/20193139
213Jackie Collins6/13/20193000
214Howard Spencer6/13/20193550
215Howard Spencer6/13/20193500
216Michael Smith6/22/20195000
217Jackie Collins6/22/20195500
218Jackie Collins6/22/20193000
Table: orders

Suppose we want to find out the highest value order placed by each customer with the resultset sorted by order value. We can do so through the following query with a CORRELATED SUBQUERY in it.

SELECT cust_name, order_id, order_value
FROM orders o1
WHERE order_value IN
(SELECT MAX (order_value) FROM orders o2
WHERE o2.cust_name = o1.cust_name GROUP BY cust_name)
ORDER BY order_value DESC;

It will generate the following result which fulfils our requirement.

cust_nameorder_idorder_value
Michael Smith2105500
Jackie Collins2175500
Howard Spencer2143550

Explanation: Here the outer main query executes first and fetches the mentioned column values from every row of the table one by one and passes it to the subquery. The subquery then runs in its entirety determining the highest value order for each customer with the help of the MAX aggregate function. The outer table row order value is then compared with this value. If outer table row order value matches any of the MAX values then that row is retained otherwise rejected. In this manner the query isolates the highest value customer order records for each customer which constitutes the final resultset.

2) Example 2: SQL Server CORRELATED SUBQUERY

Let us consider another example of an employees table containing the data of company employees. The table is represented below. The table has employees belonging to 3 departments identified by the id’s 221, 222, 223.

emp_idfirst_namelast_nameemp_salarydept_id
1270ClaudiaCrawford52000221
1271JeffGoldsmith50000221
1272NigelKnight60000222
1273JamesOrwell66000222
1274CindySmith65000222
1275NikiBailey55000223
1276MikeMattis69000223
1277AngeliaJackson59000223
1278MarthaDeClarke55000223
1279AlbertHemingway55000221

Suppose we want to retrieve the list of employees whose salary is higher than the department average sorted by employee salary. We can do so through the following query using a CORRELATED SUBQUERY in it.

SELECT emp_id, first_name, last_name, emp_salary
FROM employees e1
WHERE emp_salary >
(SELECT AVG (emp_salary)
FROM employees
WHERE dept_id = e1.dept_id)
ORDER BY emp_salary DESC;

It will generate the following result which fulfils our requirement.

emp_idfirst_namelast_nameemp_salary
1276MikeMattis69000
1273JamesOrwell66000
1274CindySmith65000
1279AlbertHemingway55000

Explanation: Just like in the previous example the outer main query executes first and fetches the mentioned column values from every row of the table one by one and passes it to the subquery. The subquery then runs in its entirety determining the average salary for each department with the help of the AVG aggregate function. The outer table row employee salary value is then compared with this value. If outer table employee salary is higher than any of the department values then that row is retained otherwise rejected. In this manner the query isolates the employee records of those employees whose salary above the department average and that constitutes the final resultset.

Advertisement