3 min read · Feb 17, 2023

Common Table Expressions (CTEs) are a powerful feature in SQL that allow for breaking down complex queries into smaller, more manageable parts. They provide a way to temporarily store intermediate results, making it easier to write complex queries and improving their readability. In this article, we’ll explore the basics of CTEs, including how to create and use them, as well as some of their more advanced applications.

A CTE is a named temporary result set that is defined within a SELECT, INSERT, UPDATE, or DELETE statement. Unlike a derived table, which is only accessible within the scope of a single query, a CTE can be used multiple times within a query. This allows for the reuse of complex subqueries, making it easier to write complex queries and improving their readability.

To create a CTE, you simply include the CTE definition within the query, preceded by the keyword “WITH”. For example, consider the following query:

WITH cte_name (column_name1, column_name2, ...) AS (
SELECT column_expression1, column_expression2, ...
FROM table_name
WHERE some_condition
)
SELECT *
FROM cte_name

In this example, the CTE is defined as “cte_name” and it has two columns, “column_name1” and “column_name2”. The SELECT statement within the CTE definition specifies the expressions for these columns and the source of the data. The final SELECT statement then retrieves all data from the CTE.

One of the key benefits of CTEs is their ability to break down complex queries into smaller, more manageable parts. For example, consider a query that calculates the running total of sales for each day in a given year. Without a CTE, the query would be complex and difficult to read

SELECT date, SUM(sales) OVER (ORDER BY date) AS running_total
FROM sales_data
WHERE date BETWEEN '2022-01-01' AND '2022-12-31'

With a CTE, however, the query can be broken down into smaller parts, making it easier to understand:

WITH daily_sales AS (
SELECT date, SUM(sales) AS total
FROM sales_data
WHERE date BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY date
)
SELECT date, SUM(total) OVER (ORDER BY date) AS running_total
FROM daily_sales

In this example, the CTE “daily_sales” calculates the total sales for each day in the given year. The final SELECT statement then uses the running total calculation to calculate the running total of sales for each day.

Another benefit of CTEs is that they can improve query performance. This is because CTEs are optimized as part of the query execution plan, and can be materialized as temporary tables if necessary. This can lead to significant performance improvements, especially in complex queries.

CTEs can also be used to simplify complex queries that would otherwise require multiple subqueries or temporary tables. For example, consider the following query:

WITH cte_name AS (
SELECT column1, column2, ...
FROM table_name
WHERE some_condition
)
SELECT c1.column1, c2.column2, ...
FROM cte_name

I love CTEs. They help me stage my data, before summarizing it for example.

Here is a simple CTE created to detect and remove duplicates on the Norwind database in SQL Server:

with cte as (
select CustomerID,
row_number() over(partition by CustomerID order by CustomerID) as rn
from orders
)

select * from cte where rn = 1

The 2 main benefits of using a CTE are:

  1. Simplifies complex queries by allowing you to break them down into smaller, more manageable chunks.
  2. Increases query readability by allowing you to give meaningful names to intermediate results.

CTEs are particularly useful for recursive queries, where a query needs to repeatedly reference a subquery that refers to itself. They are supported by most relational database management systems (RDBMS) including SQL Server, PostgreSQL, Oracle, MySQL, and others.


Leave a Reply

Your email address will not be published. Required fields are marked *