Often times during development, we come across requirements which ends up producing complex SQL statements. One way to make complex query more readable and manageable is by making use of views. One issue with this is that, view stores the schema definition permanently in the database. So if the result a view produces will not be used in other procedures, then creating a view would be a bad design choice.
Instead of creating a view, we could also make use of temporary result sets such as a temporary table or a derived table. The issue with using temporary table is that it’s a multi-step approach. First you would need to define a temporary table, then you would need to populate the temporary table and after using the temporary table, you need to drop the temporary table. With a derived table, although it has limited scope and disappears after the query has been completed, it affects readability of a query. A better solution in this case is to make use of Common Table Expression (CTE).
CTE improves readability and makes a complex query more manageable. It is similar to a derived table in that its scope is also limited to a single query. But unlike a derived table, it has the ability to run recursive queries by referencing itself multiple times within the same query.
SYNTAX:
WITH
CTE_Name[Column1, Column2 …]
AS
(
CTE_Definition
)
OuterQuery_Definition
You can divide the syntax into three parts:
WITH Clause
Definition of a CTE needs to be defined within a WITH clause. You can define multiple CTE’s within a single WITH clause. It is important to note here that the query preceding the WITH clause needs to be terminated using a semicolon ”;”.
CTE Name and Definition
Name of the CTE is used to identify itself. If there exists multiple CTE’s within a WITH clause, then each of them must have unique names.
CTE name is followed by optional column names. If column names are not specified, CTE will use the column names from the SELECT clause in the CTE definition.
CTE definition is simply one or more SELECT statements which populates the CTE.
Outer Query
This is the query which immediately follows the CTE definition. If there exists multiple CTE definition’s within a WITH clause, then Outer query will appear after defining all the CTE’s.
There are two types of CTE in SQL SERVER:
  • Non recursive CTE
  • Recursive CTE
Non Recursive CTE
Non recursive CTE’s are simply CTE’s which does not reference itself.
Before analyzing an example using CTE, lets first look into a query which does not use CTE.
For example, you have 3 tables in your database: Customer, CustomerProduct and Product.
Customer table contains details of all customers, CustomerProduct table contains the product each customer has bought and Product table contains details of each product. The requirement is that, we need to find total amount spent by customers who are at least 30 years old and has bought products worth $200 or more.
So without using CTE, we might create a query like this:
SELECT C.Name, C.Age, C.City, C.State, A.TotalSpent
FROM Customer AS C
INNER JOIN (
SELECT CP.CustomerID, SUM(P.Price) AS TotalSpent
FROM CustomerProduct AS CP
INNER JOIN Product AS P ON CP.ProductID = P.ProductID
WHERE P.Price > 200
GROUP BY CP.CustomerID
) AS A ON C.CustomerID = A.CustomerID
WHERE C.Age > 30
The above query makes use of a derived table. The derived table in this case returns list of customer id and the total amount spent by each customer on products worth more than $200. The main query joins the derived table, filters the customers who are older than 30 years and finally selects the customer information and the total amount spent by the customer.
As you can see, the more complex derived table we have, the more it affects the readability. A better approach is to make use of CTE as shown below.
WITH
AmountSpent(CustomerID, TotalSpent)
AS
(
SELECT CP.CustomerID, SUM(P.Price)
FROM CustomerProduct AS CP
INNER JOIN Product AS P ON CP.ProductID = P.ProductID
WHERE P.Price > 200
GROUP BY CP.CustomerID
)
SELECT C.Name, C.Age, C.City, C.State, A.TotalSpent
FROM AmountSpent AS A
INNER JOIN Customer AS C ON C.CustomerID = A.CustomerID
WHERE C.Age > 30
In the above query, AmountSpent is the name of the CTE. Following the CTE name, we have CustomerID and TotalSpent which are the column names for the CTE. CTE definition in the above query is a SELECT statement which returns list of customer id and the total amount spent by each customer on products worth more than $200. This definition is similar to the derived table we saw in the previous example. The output of this query gets inserted into the CTE AmountSpent.
Next we have an outer query, which joins the customer table with the CTE AmountSpent, filters the customers who are older than 30 years and finally selects the customer information and the total amount spent by the customer.
You can also add multiple CTE definitions within the WITH clause, and the final query will be able to reference all of them making the query much more readable compared to having multiple derived tables within a query.
Recursive CTE
As the name suggests, recursive CTE provides the ability to reference itself multiple times. It is typically used to obtain hierarchical or tree structured data.
SYNTAX:
WITH
CTE_Name[Column1, Column2 …]
AS
(
AnchorQuery_Definition
UNION ALL
RecursiveQuery_Definition
)
OuterQuery_Definition
Definition of Recursive CTE can been divided in to two parts.
Anchor Query
Anchor query is the one which gets invoked only once and does not reference the CTE.
Recursive Query
Recursive query on the other hand references the CTE and will keep getting executed multiple times until it returns an empty data set.
Let’s take a look at an example. Suppose we are interested in finding hierarchy of managers for a particular employee, we can make use of recursive CTE as shown below.
WITH
EmployeeManager (EmployeeID, ManagerID)
AS
(
SELECT EmployeeID, ManagerID
FROM Employee
WHERE EmployeeID = 10
UNION ALL
SELECT E.EmployeeID, E.ManagerID
FROM Employee AS E
INNER JOIN EmployeeManager AS M ON E.EmployeeID = M.ManagerID
)
SELECT E.EmployeeID, E.FirstName + ' ' + E.LastName AS Name, E.Title, E.ManagerID
FROM EmployeeManager AS M
INNER JOIN Employee AS E ON M.EmployeeID = E.EmployeeID
In the above query, EmployeeManager is the name of the CTE. It has two columns, EmployeeID and ManagerID. Anchor query is the first query in the CTE definition and is the one which gets executed first. It selects EmployeeID and ManagerID from Employee table. The query inserts a single row with EmployeeID 10 in to the CTE.
Now that we have the primary employee record in the CTE, what we need now is to find the manager of this employee and insert the manager record in the CTE. To achieve this, the second query which is the recursive query in the CTE gets executed.
The recursive query references the existing record in the CTE and selects the EmployeeID and ManagerID of the manager of the primary employee. At this point, the CTE would have two records, the first record is the primary employee with EmployeeID 10 and the second record is the immediate manager of the primary employee.
Since the recursive query did not return an empty data set, the recursive query again executes by referencing the CTE (which now has 2 records) and inserts managers of the existing records in the CTE. This continues until the recursive query inserts all the managers in the hierarchy in to the CTE.
Once the CTE has been completely populated, the control goes to the outer query, which references the CTE and displays details of those employees in the CTE.
INVENTORY DATABASE
Let’s consider a furniture database with 3 tables: Inventory table, Location table and Orders table. Inventory table stores data regarding furniture details, Location table stores information regarding all the store locations and Orders table keeps track of sales for each inventory in each location.
Table: Inventory
InventoryID Name Price
1 Queen Bed 1000
2 King Bed 1500
3 6pc. Bedroom Set 3500
Table: Location
LocationID Name State
1 San Francisco CA
2 New York NY
3 Phoenix AZ
Table: Orders
OrderID InventoryID LocationID OrderDate DailyTotalOrders
1 1 1 1/1/2013 5
2 1 2 1/1/2013 8
3 1 3 1/1/2013 0
4 2 1 1/1/2013 8
5 2 2 1/1/2013 3
6 2 3 1/1/2013 10
7 3 1 1/1/2013 5
8 3 2 1/1/2013 0
9 3 3 1/1/2013 0
10 1 1 7/1/2013 15
11 1 2 7/1/2013 18
12 1 3 7/1/2013 10
13 2 1 7/1/2013 19
14 2 2 7/1/2013 12
15 2 3 7/1/2013 13
16 3 1 7/1/2013 8
17 3 2 7/1/2013 5
18 3 3 7/1/2013 9
Example 1 (Non Recursive CTE)
Query :
WITH
InventorySales(InventoryID, Inventory, TotalSales)
AS
(
SELECT I.InventoryID, I.Name, SUM(DailyTotalOrders)
FROM InventoryOrders AS O
INNER JOIN Inventory AS I ON I.InventoryID = O.InventoryID
GROUP BY I.InventoryID, I.Name
)
SELECT *
FROM InventorySales
Output :
InventoryID Inventory TotalSales
1 Queen Bed 56
2 King Bed 65
3 6pc. Bedroom Set 27
Description :
The above query is a simple example of CTE. InventorySales is the name of the CTE and it contains 3 columns: InventoryID, Inventory and TotalSales.
In the CTE definition, we have a query which returns total sales for each inventory which then gets inserted into the CTE InventorySales. The outer query simply selects all the records from the CTE InventorySales.
Example 2 (Non Recursive CTE)
In this example, the objective is to compare total orders for each inventory in each location with the total orders across all location for the specific inventory.
Query :
WITH
NationalInventorySales(InventoryID, TotalSales)
AS
(
SELECT InventoryID, SUM(DailyTotalOrders)
FROM InventoryOrders
GROUP BY InventoryID
)
SELECT I.InventoryID, I.Name AS Inventory, L.Name AS Region, SUM(DailyTotalOrders) AS RegionalTotalSales, N.TotalSales AS NationalTotalSales
FROM NationalInventorySales AS N
INNER JOIN InventoryOrders AS O ON O.InventoryID = N.InventoryID
INNER JOIN Inventory AS I ON I.InventoryID = O.InventoryID
INNER JOIN InventoryLocation AS L ON L.LocationID = O.LocationID
GROUP BY I.InventoryID, I.Name, L.Name, N.TotalSales
Output :
InventoryID Inventory Region RegionalTotalSales NationalTotalSales
1 Queen Bed New York 26 56
1 Queen Bed Phoenix 10 56
1 Queen Bed San Francisco 20 56
2 King Bed New York 15 65
2 King Bed Phoenix 23 65
2 King Bed San Francisco 27 65
3 6pc. Bedroom Set New York 5 27
3 6pc. Bedroom Set Phoenix 9 27
3 6pc. Bedroom Set San Francisco 13 27
Description :
In the above example, CTE NationalInventorySales contains sum of daily orders across all locations for each inventory. The outer query displays total orders for each Inventory at each location using the InventoryOrders table, as well as total orders for each Inventory across all locations using the NationalInventorySales CTE.
Example 3 (Non Recursive CTE)
In this example, the objective is to compare daily sales of inventory at each location with the yearly average sales for the specific inventory and location.
Query :
WITH
YearlyAverage(InventoryID, LocationID, OrderYear, AverageOrders)
AS
(
SELECT InventoryID, LocationID, YEAR(OrderDate), AVG(DailyTotalOrders)
FROM InventoryOrders
GROUP BY InventoryID, YEAR(OrderDate), LocationID
)
SELECT I.Name AS Inventory, L.Name AS Region, O.OrderDate, O.DailyTotalOrders, Y.AverageOrders AS YearlyAverage
FROM YearlyAverage AS Y
INNER JOIN InventoryOrders AS O ON O.InventoryID = Y.InventoryID
AND O.LocationID = Y.LocationID AND YEAR(O.OrderDate) = Y.OrderYear
INNER JOIN Inventory AS I ON I.InventoryID = O.InventoryID
INNER JOIN InventoryLocation AS L ON L.LocationID = O.LocationID
Output :
Inventory Region OrderDate DailyTotalOrders YearlyAverage
Queen Bed San Francisco 1/1/2013 5 10
Queen Bed New York 1/1/2013 8 13
Queen Bed Phoenix 1/1/2013 0 5
King Bed San Francisco 1/1/2013 8 13
King Bed New York 1/1/2013 3 7
King Bed Phoenix 1/1/2013 10 11
6pc. Bedroom Set San Francisco 1/1/2013 5 6
6pc. Bedroom Set New York 1/1/2013 0 2
6pc. Bedroom Set Phoenix 1/1/2013 0 4
Queen Bed San Francisco 7/1/2013 15 10
Queen Bed New York 7/1/2013 18 13
Queen Bed Phoenix 7/1/2013 10 5
King Bed San Francisco 7/1/2013 19 13
King Bed New York 7/1/2013 12 7
King Bed Phoenix 7/1/2013 13 11
6pc. Bedroom Set San Francisco 7/1/2013 8 6
6pc. Bedroom Set New York 7/1/2013 5 2
6pc. Bedroom Set Phoenix 7/1/2013 9 4
Description :
In the above example, CTE YearlyAverage extracts average number of orders for each year for each inventory at each location.
The outer query then displays total daily orders for each inventory at each location along with the yearly average from the YearlyAverage CTE.
RESTAURANT DATABASE
Let’s consider a restaurant database with 2 tables: Category table and Items table. Category table contains menu categories and Items table contains details of each item in the menu.
Table: Category
CategoryID Name
1 Appetizer
2 Salad
3 Entree
4 Dessert
Table: Items
ItemID CategoryID Name Price Orders
1 1 Oysters & Clams 10 946
2 1 Shrimp & Calamari 10 756
3 1 Bruschetta Caprese 6 514
4 1 Lobster Bisque 9 789
5 1 Buffalo Wings 12 846
6 2 Mediterranean Chicken Salad 9 812
7 2 Caesar Salad 10 579
8 2 Garden Salad 6 669
9 2 Grilled Salmon Salad 14 635
10 2 Greek Salad 10 768
11 3 Baby Back Ribs 20 859
12 3 Fried Oyster & Crab Cake 24 917
13 3 Chicken Fettuccine 18 701
14 3 Drunken Ribeye 30 859
15 3 Herb Crusted Salmon 18 768
16 4 Tennessee Whiskey Cake 8 782
17 4 Tiramisu 8 691
18 4 Warm Apple Crostada 8 737
19 4 Crème Brulée 10 849
20 4 Bread Pudding 7 959
Example 4 (Non Recursive CTE)
In this example, the objective is to find the most priced item in each catregory.
Query :
WITH
MostExpensive(CategoryID, Price)
AS
(
SELECT CategoryID, MAX(Price)
FROM RestaurantItem
GROUP BY CategoryID
)
SELECT C.Name AS Category, R.Name AS 'Most Expensive Item', R.Price
FROM RestaurantCategory AS C
INNER JOIN MostExpensive AS ME ON ME.CategoryID = C.CategoryID
INNER JOIN RestaurantItem AS R ON R.CategoryID = ME.CategoryID AND R.Price = ME.Price
ORDER BY C.CategoryID
Output :
Category Most Expensive Item Price
Appetizer Buffalo Wings 12
Salad Grilled Salmon Salad 14
Entree Drunken Ribeye 30
Dessert Crème Brulée 10
Description :
In the above example, the CTE groups each item by category and returns the max price among all the items within each category. The outer query then displays details of the item which has the maximum price within each category.
Example 5 (Non Recursive CTE)
In this example, the objective is to find the most expensive item, the most popular item and the lease popular item within each category.
Query :
WITH
MostExpensive(CategoryID, Price)
AS
(
SELECT CategoryID, MAX(Price)
FROM RestaurantItem
GROUP BY CategoryID
),
MostPopular(CategoryID, Popularity)
AS
(
SELECT CategoryID, MAX(Orders)
FROM RestaurantItem
GROUP BY CategoryID
),
LeastPopular(CategoryID, Popularity)
AS
(
SELECT CategoryID, Min(Orders)
FROM RestaurantItem
GROUP BY CategoryID
)
SELECT C.Name AS Category, R1.Name AS 'Most Expensive', R1.Price, R2.Name AS 'Most Popular', R2.Price, R3.Name AS 'Least Popular', R3.Price
FROM RestaurantCategory AS C
INNER JOIN MostExpensive AS ME ON ME.CategoryID = C.CategoryID
INNER JOIN RestaurantItem AS R1 ON R1.CategoryID = ME.CategoryID AND R1.Price = ME.Price
INNER JOIN MostPopular AS MP ON MP.CategoryID = C.CategoryID
INNER JOIN RestaurantItem AS R2 ON R2.CategoryID = MP.CategoryID AND R2.Orders = MP.Popularity
INNER JOIN LeastPopular AS LP ON LP.CategoryID = C.CategoryID
INNER JOIN RestaurantItem AS R3 ON R3.CategoryID = LP.CategoryID AND R3.Orders = LP.Popularity
ORDER BY C.CategoryID
Output :
Category Most Expensive Item Price Most Popular Item Price Least Popular Item Price
Appetizer Buffalo Wings 12 Oysters & Clams 10 Bruschetta Caprese 6
Salad Grilled Salmon Salad 14 Mediterranean Chicken Salad 9 Caesar Salad 10
Entree Drunken Ribeye 30 Fried Oyster & Crab Cake 24 Chicken Fettuccine 18
Dessert Crème Brulée 10 Bread Pudding 7 Tiramisu 8
Description :
In the above example, we have multiple CTE’s.
MostExpensive CTE contains the max price among all items within each category.
MostPopular CTE contains the max orders among all items within each category.
LeastPopular CTE contains he minimum orders among all items within each category
The outer query then displays name and price of the item with maximum price, name and price of the item with most number of orders, name and price of the item with least number of orders for each category.
Note that all 3 CTE’s in the above query is defined within a single WITH clause and each CTE is separated by a comma.
EMPLOYEE DATABASE
Let’s consider a database with an Employee table. The Employee table consists of employee details along with their ManagerID. ManagerID is just the EmployeeID of the employee’s immediate manager.
Table: Employee
EmployeeID FirstName LastName Title ManagerID
1 George Cruz CEO NULL
2 John Mathew Vice President 1
3 Ethan William Director I 2
4 Jacob Logan Director II 3
5 Nathan Mateo Senior Manager 4
6 Alice Charles Manager I 5
7 John Robert Manager II 6
8 Sophia James Manager III 7
9 Kevin Chan Product Manager 5
10 Susan Mortiz Solutions Arthitect 6
11 Martha Miller Systems Arthitect 10
Example 6 (Recursive CTE)
In this example, the objective is to find the hierarchy of managers for an employee with EmployeeID 11.
Query :
WITH
EmployeeManager(EmployeeID, ManagerID)
AS
(
SELECT EmployeeID, ManagerID
FROM Employee
WHERE EmployeeID = 11
UNION ALL
SELECT E.EmployeeID, E.ManagerID
FROM Employee AS E
INNER JOIN EmployeeManager AS M ON E.EmployeeID = M.ManagerID
)
SELECT E.EmployeeID, E.FirstName + ' ' + E.LastName AS Name, E.Title, E.ManagerID
FROM EmployeeManager AS M
INNER JOIN Employee AS E ON M.EmployeeID = E.EmployeeID
Output :
EmployeeID Name Title ManagerID
11 Martha Miller Systems Arthitect 10
10 Susan Mortiz Solutions Arthitect 6
6 Alice Charles Manager I 5
5 Nathan Mateo Senior Manager 4
4 Jacob Logan Director II 3
3 Ethan William Director I 2
2 John Mathew Vice President 1
1 George Cruz CEO NULL
Description :
In the above example, anchor query for the CTE (EmployeeManager) adds EmployeeID and ManagerID into the CTE for employeeID 11 from the Employee table.
At this point, the CTE has one record of employee with EmployeeID 11.
Next the recursive query is executed which joins the CTE (EmployeeManager) with the Employee table to extract EmployeeID and ManagerID of employee’s immediate manager.
At this point, the CTE will have two records, one with EmployeeID 11 and another with EmployeeID 8.
Now the recursive query again joins the CTE with the Employee table and adds EmployeeID and ManagerID of managers for both the employees in the CTE. This continues until the recursive query returns an empty record set which implies that recursion has reached the top level of the hierarchy, which means CEO has been added to the CTE and since there is no one above CEO, the recursive query returned an empty data set.
The outer query then displays details of all employees in the CTE EmployeeManager.
Example 7 (Recursive CTE)
As you can see from the Employee table, every employee in the organization except the CEO is subordinate to someone else within the organization. CEO is the only employee who does not have a ManagerID. In this example, the objective is to find the hierarchical structure of the organization which displays the leadership within the organization.
Query :
WITH
EmployeeManager (EmployeeID, ManagerID, RecursionLevel)
AS
(
SELECT EmployeeID, ManagerID, 0
FROM Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT E.EmployeeID, E.ManagerID, RecursionLevel + 1
FROM Employee AS E
INNER JOIN EmployeeManager AS M ON E.ManagerID = M.EmployeeID
)
SELECT E.EmployeeID, E.FirstName + ' ' + E.LastName AS Name, E.Title, E.ManagerID, RecursionLevel
FROM EmployeeManager AS M
INNER JOIN Employee AS E ON M.EmployeeID = E.EmployeeID
Output :
EmployeeID Name Title ManagerID RecursionLevel
1 George Cruz CEO NULL 0
2 John Mathew Vice President 1 1
3 Ethan William Director I 2 2
4 Jacob Logan Director II 3 3
5 Nathan Mateo Senior Manager 4 4
6 Alice Charles Manager I 5 5
9 Kevin Chan Product Manager 5 5
7 John Robert Manager II 6 6
10 Susan Mortiz Solutions Arthitect 6 6
11 Martha Miller Systems Arthitect 10 7
8 Sophia James Manager III 7 7
Description :
In the above example, anchor query extracts the CEO of the organization and inserts it into the CTE EmployeeManager. It sets EmployeeID as 1, ManagerID as NULL and RecursionLevel as 0.
Now the recursive query joins the record in the CTE with the Employee table to extract all the employees who report to the CEO. From the Employee table, there exists only one employee who reports to the CEO, so the recursive query extracts EmployeeID and ManagerID for EmployeeID 2 and sets the recursion level to 1.
Since the recursive query did not return an empty result set, it once again executes. This time it extracts all the employees who report to either EmployeeID 1 or EmployeeID 2 and sets the recursion level to 2.
Recursive query will keep executing until all the employees have been inserted into the CTE.
Once the CTE has been completely populated, the outer query executes and displays employee details of all the employees in the EmployeeManager CTE.