GROUP BY Clause
GROUP BY clause is used to create a summary of the dataset by grouping the dataset by one or more columns. GROUP BY clause is usually used with aggregate functions to define subtotals for the aggregate data.
When a GROUP BY clause is used in a query, SELECT statement for the query can only contain aggregate functions or columns which are listed in the GROUP BY clause.
SYNTAX:
SELECT Column1, Column2, Aggregate Function (Column3)
FROM Table1
GROUP BY Column1, Column2
HAVING Clause
HAVING clause is used to filter data generated using aggregate functions. It can also be used to filter data for columns present in the GROUP BY clause. It is typically used along with GROUP BY clause, but it is not mandatory to have a GROUP BY clause to use HAVING clause.
Difference between HAVING clause and WHERE clause is that, WHERE cause applies filter to data extracted from tables specified in the FROM clause, whereas HAVING clause applies filter only for data generated by a group or an aggregate function.
SYNTAX:
SELECT Column1, Column2, Aggregate Function (Column3)
FROM Table1
GROUP BY Column1, Column2
HAVING [condition]
Table: Movie
ID PurchaseType Category SubCategory Quantity
1 Purchase Action Action-Comedy 100
2 Purchase Action Superheroes 100
3 Purchase Drama Classics 200
4 Purchase Drama Romantic 200
5 Rental Action Action-Comedy 500
6 Rental Action Superheroes 1000
7 Rental Drama Classics 500
8 Rental Drama Romantic 1500
Example 1 (GROUP BY clause)
Query :
SELECT PurchaseType, SUM(Quantity) AS TotalMovies
FROM Movie
GROUP BY PurchaseType
Output :
PurchaseType TotalMovies
Purchase 600
Rental 3500
Description :
The above query groups all the rows in Movie table by PurchaseType column and uses aggregate function “SUM” to calculate total number of movies for each PurchaseType.
Example 2 (GROUP BY clause with multiple columns)
Query :
SELECT PurchaseType, Category, SUM(Quantity) AS TotalMovies
FROM Movie
GROUP BY PurchaseType, Category
Output :
PurchaseType Category TotalMovies
Purchase Action 200
Rental Action 1500
Purchase Drama 400
Rental Drama 2000
Description :
The above query groups all the rows in Movie table by PurchaseType column and Category column. For each combination of PurchaseType and Category, it calculates the total number of movies using aggregate function “SUM”.
Example 3 (HAVING clause)
Query :
SELECT PurchaseType, Category, SUM(Quantity) AS TotalMovies
FROM Movie
GROUP BY PurchaseType, Category
HAVING SUM (Quantity) > 300
Output :
PurchaseType Category TotalMovies
Rental Action 1500
Purchase Drama 400
Rental Drama 2000
Description :
The above query groups the dataset by PurchaseType and Category, then calculates total number of movies for each combination of PurchaseType and Category using aggregate function “SUM” and then it filters the results to output records having total number of movies greater than 300 using HAVING clause.
Example 4 (HAVING clause with WHERE clause)
Query :
SELECT PurchaseType, Category, SUM(Quantity) AS TotalMovies
FROM Movie
WHERE SubCategory = 'Classics'
GROUP BY PurchaseType, Category
HAVING SUM (Quantity) > 300
AND PurchaseType = 'Rental'
Output :
PurchaseType Category TotalMovies
Rental Drama 500
Description :
In the above query, first the WHERE clause is applied to filter records by SubCategory. The records which meet the condition in the WHERE clause are then grouped using the GROUP BY clause. The groups generated by the GROUP BY clause are filtered using HAVING clause.
Notice the columns used to filter records in WHERE clause and HAVING clause. SubCategory column used in the WHERE clause cannot be used in the HAVING clause because this column is not listed in the GROUP BY clause. Similarly Aggregate function “SUM” used in HAVING clause cannot be used in the WHERE clause because WHERE clause does not allow aggregate functions, but the second condition in the HAVING clause can be moved to the WHERE clause because PurchaseType is a column within the table listed in the FROM clause.