MySQL Group Filtering with HAVING

What is the HAVING Clause?

The HAVING clause in MySQL is used to filter results of a GROUP BY query based on aggregate conditions. Unlike WHERE, which operates on individual rows, HAVING applies to grouped data after aggregate functions like COUNT, SUM, AVG, MAX, and MIN have been computed.

Basic Syntax:

SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1
HAVING aggregate_function(column2) condition;
  • column1 – Column(s) to group by.
  • aggregate_function(column2) – Aggregated value to filter.
  • condition – The criteria to filter groups.

Setting Up the Example Table

Consider the following employees table:

CREATE TABLE employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10,2),
    hire_date DATE
);

This table will serve as the basis for demonstrating HAVING.


Basic Use of HAVING

The HAVING clause is most useful when filtering results after applying aggregate functions.

Example 1: Departments with More Than 5 Employees

SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;

Step-by-Step Analysis:

  • GROUP BY department groups employees by department.
  • COUNT(*) calculates the number of employees in each department.
  • HAVING COUNT(*) > 5 filters out departments with 5 or fewer employees.
  • Returns only departments meeting the employee threshold.

Logic Behind the Query:
MySQL first groups rows by department, calculates the count for each group, and then filters groups based on the aggregated value.


Using HAVING with SUM()

HAVING can filter groups based on the total of numeric columns.

Example 2: Departments with Total Salary Above 300,000

SELECT department, SUM(salary) AS total_salary
FROM employees
GROUP BY department
HAVING SUM(salary) > 300000;

Step-by-Step Analysis:

  • SUM(salary) calculates the total salaries per department.
  • HAVING SUM(salary) > 300000 returns only departments whose total salary exceeds 300,000.
  • Useful for budgeting or payroll analysis.

Combining Multiple Aggregates in HAVING

You can apply multiple aggregate conditions in a single HAVING clause.

Example 3: Departments with More Than 5 Employees and Average Salary Above 50,000

SELECT department, COUNT(*) AS employee_count, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING COUNT(*) > 5 AND AVG(salary) > 50000;

Step-by-Step Analysis:

  • COUNT(*) > 5 ensures only departments with sufficient employees are included.
  • AVG(salary) > 50000 filters based on average salary.
  • AND combines the two conditions.
  • Returns departments meeting both criteria.

Logic Behind the Query:
MySQL performs the grouping first, calculates both aggregates per group, and then applies the HAVING filter to determine which groups appear in the final result.


Using HAVING with Aliases

You can reference column aliases in the HAVING clause for readability.

Example 4: Using Alias for Average Salary

SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING avg_salary > 50000;

Step-by-Step Analysis:

  • AVG(salary) AS avg_salary assigns a readable name to the aggregated value.
  • HAVING avg_salary > 50000 filters departments where the average salary exceeds 50,000.
  • Improves clarity in queries with multiple aggregates.

Key Differences Between WHERE and HAVING

FeatureWHEREHAVING
Applies toIndividual rows before groupingGroups after aggregation
Can use aggregate
Used forFiltering raw dataFiltering aggregated data

Example Comparison:

-- WHERE cannot use aggregates
SELECT department, AVG(salary)
FROM employees
WHERE AVG(salary) > 50000
GROUP BY department;
-- ❌ This will cause an error

-- Correct use with HAVING
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;
-- ✅ Correct

Best Practices for Using HAVING

  1. Always Combine with GROUP BY: HAVING is meaningful only when groups are created.
  2. Use Aggregates Only: Avoid using non-aggregated columns in HAVING unless they appear in GROUP BY.
  3. Prefer Aliases for Clarity: Use descriptive names for aggregated columns.
  4. Optimize for Large Datasets: Index columns used in GROUP BY to improve query performance.
  5. Test Conditions Separately: Validate GROUP BY and aggregate functions with a simple query before applying HAVING.

Practical Use Cases

  • Identifying departments with high payroll costs.
  • Finding teams with exceptionally high or low average performance scores.
  • Reporting on stores with total sales exceeding targets.
  • Filtering product categories with a large number of items.

Leave a Reply

Your email address will not be published. Required fields are marked *