8 min read โ€“ Mar 22, 2023

SQL CASE statements for transforming, aggregating, filtering and labeling dataโ€Šโ€”โ€ŠImage by Author
SQL CASE statements for transforming, aggregating, filtering and labeling data โ€” Image by Author

CASE statements in SQL are a powerful tool for creating conditional logic and branching within SQL queries.

They allow you to evaluate expressions and return different results depending on the conditions you specify.

Here is the syntax of Simple and Search CASE Statements

Syntax of a Simple CASE statement
CASE expression
WHEN value1 THEN result1
WHEN value2 THEN result2
...
ELSE default_result
END

Simple CASE statement compares an expression to a set of values and returns a result based on the matching value

Hereโ€™s an example of a simple CASE statement from the Northwind database that checks the value of the CategoryID column for each row in the Products table. If the value matches one of the categories (1-8), the corresponding category name is returned. If the value does not match any of the categories, the result is โ€˜Unknownโ€™.

SELECT 
ProductName,
CategoryID,
CASE CategoryID
WHEN 1 THEN 'Beverages'
WHEN 2 THEN 'Condiments'
WHEN 3 THEN 'Confections'
WHEN 4 THEN 'Dairy Products'
WHEN 5 THEN 'Grains/Cereals'
WHEN 6 THEN 'Meat/Poultry'
WHEN 7 THEN 'Produce'
WHEN 8 THEN 'Seafood'
ELSE 'Unknown'
END AS Category
FROM Products;
Syntax of a Searched CASE statement
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
...
ELSE default_result
END

Search CASE statement, on the other hand, allows for more complex comparisons by evaluating multiple conditions.

Hereโ€™s an example of a searched CASE statement that returns a list of orders and their shipping status based on the ShippedDate and RequiredDate columns. We can use a Searched CASE statement to determine whether the order was shipped on time, early, or late:

SELECT 
OrderID,
ShippedDate,
RequiredDate,
CASE
WHEN ShippedDate IS NULL THEN 'Not Shipped Yet'
WHEN ShippedDate <= RequiredDate THEN 'On Time'
ELSE 'Late'
END AS ShippingStatus
FROM Orders;

In this query, the CASE statement evaluates multiple conditions for each row in the Orders table. If the ShippedDate is NULL, the result is โ€˜Not Shipped Yetโ€™. If the ShippedDate is on or before the RequiredDate, the result is โ€˜On Timeโ€™. If the ShippedDate is after the RequiredDate, the result is โ€˜Lateโ€™. The query returns four columns: OrderIDShippedDateRequiredDate, and ShippingStatus. The ShippingStatus column is derived from the CASE statement and indicates whether the order was shipped on time, early, or late.

In addition to categorizing data, CASE statements can also be used for:

Data Transformation: CASE statements can be used to transform data within a SQL query. For example, you could use a CASE statement to convert a string value to a numeric value, or to format dates in a specific way.

Aggregating Data: CASE statements can be used with aggregate functions such as COUNT, SUM, and AVG to create customized calculations. For example, you could use a CASE statement to count the number of orders that fall within a specific date range.

Filtering Data: CASE statements can be used to filter data in a SQL query. For example, you could use a CASE statement to exclude certain products or orders from your query based on specific criteria.

Labeling Data: CASE statements can be used to label data within a SQL query. For example, you could use a CASE statement to label customers as โ€œhigh valueโ€ or โ€œlow valueโ€ based on their total order amount.

Below are two examples of each of the four use cases described above, which utilize tables from the Northwind database in SQL Server.

Data Transformation with CASE Statements

SQL CASE Statement can be used for data transformationโ€Šโ€”โ€ŠImage by Author

Example 1

Letโ€™s say you have a column in your Northwind database that contains a string representation of a date, and you want to convert it to a date data type. You can use a CASE statement like this:

SELECT OrderID, 
CASE
WHEN ISDATE(OrderDate) = 1 THEN CONVERT(date, OrderDate, 101)
ELSE NULL
END AS OrderDate
FROM Orders;

This CASE statement checks if the value in the OrderDate column can be converted to a date data type using the ISDATE function. The ISDATE function returns 1 if the expression is a valid datetime value; otherwise, 0. If it can be converted, the CASE statement uses the CONVERT function to convert the value to a date in the format โ€˜mm-dd-yyyyโ€™. If it canโ€™t be converted, the CASE statement returns NULL.

Example 2

Letโ€™s say you have a column in your Northwind database that contains a currency value as a string, and you want to convert it to a numeric data type. You can use a CASE statement like this:

SELECT 
ProductName,
CASE
WHEN CHARINDEX('$', UnitPrice) > 0 THEN CONVERT(NUMERIC(10,2), REPLACE(UnitPrice, '$', ''))
WHEN CHARINDEX('ยฃ', UnitPrice) > 0 THEN CONVERT(NUMERIC(10,2), REPLACE(UnitPrice, 'ยฃ', '')) / 1.5
ELSE CONVERT(NUMERIC(10,2), UnitPrice)
END AS NumericPrice
FROM Products;

This CASE statement checks if the value in the UnitPrice column contains a dollar sign ($) or a pound sign (ยฃ). If it does, the CASE statement uses the REPLACE function to remove the symbol and then converts the value to a numeric data type. If it doesnโ€™t contain a symbol, the CASE statement simply converts the value to a numeric data type. If the currency is pounds, it also converts it to dollars by dividing by 1.5.

Aggregating Data with CASE Statements

SQL CASE Statement can be used for aggregating dataโ€Šโ€”โ€ŠImage by Author

Example 1

Letโ€™s say you want to count the number of orders in the Northwind database that fall within specific date ranges. You can use a CASE statement like this:

SELECT 
COUNT(*) AS TotalOrders,
SUM(CASE
WHEN OrderDate BETWEEN '1997-01-01' AND '1997-03-31' THEN 1
ELSE 0
END) AS Q1Orders,
SUM(CASE
WHEN OrderDate BETWEEN '1997-04-01' AND '1997-06-30' THEN 1
ELSE 0
END) AS Q2Orders,
SUM(CASE
WHEN OrderDate BETWEEN '1997-07-01' AND '1997-09-30' THEN 1
ELSE 0
END) AS Q3Orders,
SUM(CASE
WHEN OrderDate BETWEEN '1997-10-01' AND '1997-12-31' THEN 1
ELSE 0
END) AS Q4Orders
FROM Orders;

This CASE statement uses the BETWEEN operator to check if the value in the OrderDate column falls within specific date ranges. If it does, the CASE statement returns a value of 1, which is then added up using the SUM function to calculate the total number of orders that fall within each quarter of the year.

Example 2

Letโ€™s say you want to calculate the average order value for each customer in the Northwind database, but you want to exclude any orders that have a discount greater than 10%. You can use a CASE statement like this:

SELECT 
c.CustomerID,
AVG(CASE
WHEN o.Freight > 1 THEN od.UnitPrice * od.Quantity
ELSE 0
END) AS AverageOrderValue
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
JOIN OrderDetails od ON o.OrderID = od.OrderID
GROUP BY c.CustomerID;

This CASE statement checks the value in the Freight column for each order. If the freight value is more than 1, the CASE statement calculates the total value of the order by multiplying the unit price by the quantity, and then takes the average of those values using the AVG function. If the freight is less or equal to 1, the CASE statement returns a value of 0, which is then excluded from the average calculation.

Filtering Data with CASE Statements

SQL CASE Statement can be used for filtering data

Example 1

Letโ€™s say you want to exclude certain products from a query based on their category. You can use a CASE statement like this:

SELECT ProductName, CategoryID
FROM Products
WHERE
(CASE WHEN CategoryID in (1,2) THEN 'exclude' ELSE 'include' END)
<> 'exclude';

This CASE statement checks the value in the CategoryID column for each product. If the value is 1 or 2 โ€” weโ€™re using the IN operator here โ€” (which represent categories that you want to exclude), the CASE statement returns the string โ€˜excludeโ€™. If the value is any other category, the CASE statement returns the string โ€˜includeโ€™. The WHERE clause then filters out any products that have a value of โ€˜excludeโ€™. This is an example of a CASE statement in a WHERE clause.

Example 2

Letโ€™s say you want to exclude any orders from the Northwind database that were shipped outside of the United States. You can use a CASE statement like this:

SELECT 
OrderID,
ShipCountry,
CASE
WHEN ShipCountry = 'USA' THEN 1
ELSE 0
END AS InUSA
FROM Orders
WHERE CASE
WHEN ShipCountry = 'USA' THEN 1
ELSE 0
END = 1;

This CASE statement checks the value in the ShipCountry column for each order. If the value is โ€˜USAโ€™, the CASE statement returns a value of 1, indicating that the order was shipped within the United States. If the value is anything else, the CASE statement returns a value of 0. The WHERE clause then filters out any orders that have a value of 0, leaving only orders that were shipped within the United States.

Labeling Data with CASE Statements

SQL CASE Statement can be used for labeling dataโ€Šโ€”โ€ŠImage by Author

Example 1

Letโ€™s say you want to label customers in the Northwind database as โ€œhigh valueโ€ or โ€œlow valueโ€ based on their total order amount. You can use a CASE statement like this:

SELECT 
CustomerID,
SUM(od.UnitPrice * od.Quantity) AS TotalSpent,
CASE
WHEN SUM(od.UnitPrice * od.Quantity) > 10000 THEN 'high value'
ELSE 'low value'
END AS ValueCategory
FROM Orders o
JOIN OrderDetails od ON o.OrderID = od.OrderID
GROUP BY CustomerID;

This CASE statement calculates the total amount that each customer has spent on orders.

Example 2

Letโ€™s say you want to label each employee in the Northwind database as โ€œexperiencedโ€ or โ€œinexperiencedโ€ based on their hire date. You can use a CASE statement like this:

SELECT 
EmployeeID,
HireDate,
CASE
WHEN DATEDIFF(year, HireDate, GETDATE()) >= 5 THEN 'experienced'
ELSE 'inexperienced'
END AS ExperienceLevel
FROM Employees

The query utilizes two functions, namely GETDATE and DATEDIFF. The GETDATE function returns the current datetime value of the database system, while the DATEDIFF function calculates the difference between two dates based on a specified datepart.

Conclusion

In conclusion, CASE statements are a powerful tool in SQL that allow for conditional logic and branching within queries. They can be used for a variety of purposes, including data transformation, aggregating data, filtering data, and labeling data.

By using CASE statements in SQL, you can create more complex and customized calculations, as well as filter and label data based on specific criteria. This makes them an extremely useful tool for manipulating and analyzing data within a SQL database.

In addition, the examples provided above illustrate how CASE statements can be used in a variety of scenarios to accomplish different tasks, such as converting data types, excluding certain values, or labeling data based on specific criteria. This flexibility and versatility make CASE statements a must-have tool for any SQL developer or data analyst.


Leave a Reply

Your email address will not be published. Required fields are marked *