Advanced Filtering with WHERE Clauses in SQL
Description
This coding project will guide you through the essential skills needed to effectively filter and retrieve data from SQL databases. We will start with basic filtering and progressively work towards advanced techniques including multiple conditions, wildcard characters, and subqueries. Each step includes practical examples to solidify your understanding and enhance your coding proficiency.
The original prompt:
Create a detailed guide around the following topic - 'Advanced Filtering with WHERE Clauses'. Be informative by explaining the concepts thoroughly. Also, add many examples to assist with the understanding of topics.
Introduction to Basic WHERE Clauses
Setup Instructions
- Create Sample Database and Table
CREATE DATABASE SchoolDB; USE SchoolDB; CREATE TABLE Students ( StudentID INT PRIMARY KEY, FirstName VARCHAR(50), LastName VARCHAR(50), Age INT, Gender VARCHAR(1), Grade INT ); INSERT INTO Students (StudentID, FirstName, LastName, Age, Gender, Grade) VALUES (1, 'John', 'Doe', 15, 'M', 10), (2, 'Jane', 'Smith', 14, 'F', 9), (3, 'Emily', 'Jones', 15, 'F', 10), (4, 'Michael', 'Brown', 16, 'M', 11), (5, 'Emma', 'Wilson', 14, 'F', 9);
Basic WHERE Clauses Examples
Select Students with Age 15
SELECT * FROM Students WHERE Age = 15;
Select Female Students
SELECT * FROM Students WHERE Gender = 'F';
Select Students in Grade 10
SELECT * FROM Students WHERE Grade = 10;
Select Students with Last Name 'Smith'
SELECT * FROM Students WHERE LastName = 'Smith';
Select Students Older Than 14
SELECT * FROM Students WHERE Age > 14;
Combining Conditions
Select Female Students with Grade 9
SELECT * FROM Students WHERE Gender = 'F' AND Grade = 9;
Select Male Students Older Than 14
SELECT * FROM Students WHERE Gender = 'M' AND Age > 14;
Select Students with Age 14 or 15
SELECT * FROM Students WHERE Age = 14 OR Age = 15;
Conclusion
Practice these examples in your SQL environment to get comfortable with basic WHERE clauses. These examples will serve as a foundation for more advanced filtering techniques.
Combining Conditions with AND & OR in SQL
-- Select all columns from the 'employees' table where the employee's department is 'Sales'
-- AND the employee's salary is greater than 50000, OR hire_date is before '2022-01-01'.
SELECT *
FROM employees
WHERE (department = 'Sales' AND salary > 50000)
OR hire_date < '2022-01-01';
-- Select specific columns from the 'products' table where the category is 'Electronics'
-- AND the product is available in stock, OR the product has a discount of more than 20%.
SELECT product_id, product_name, price, discount
FROM products
WHERE (category = 'Electronics' AND in_stock = 'Y')
OR discount > 20;
-- Select customers from the 'customers' table where the country is 'USA' OR the age is less than 30
-- AND the customer has made more than 3 purchases.
SELECT customer_id, customer_name, country, age, num_of_purchases
FROM customers
WHERE (country = 'USA' OR age < 30)
AND num_of_purchases > 3;
-- Select orders from the 'orders' table where the order total is greater than 200
-- OR the order_date is within the current year AND the order status is 'Confirmed'.
SELECT order_id, customer_id, order_total, order_date, status
FROM orders
WHERE (order_total > 200 OR YEAR(order_date) = YEAR(CURDATE()))
AND status = 'Confirmed';
-- Select employees from the 'employees' table where the position is 'Manager'
-- AND the years_of_experience is greater than 5 OR the age is greater than 40.
SELECT employee_id, name, position, years_of_experience, age
FROM employees
WHERE (position = 'Manager' AND years_of_experience > 5)
OR age > 40;
Practical Implementation of Using Wildcards for Pattern Matching
SQL Query Implementation
-- Select students whose first names start with 'A'
SELECT *
FROM students
WHERE first_name LIKE 'A%';
-- Select products with descriptions that end with 'xyz'
SELECT *
FROM products
WHERE description LIKE '%xyz';
-- Select employees with email addresses that include 'gmail'
SELECT *
FROM employees
WHERE email LIKE '%gmail%';
-- Select orders with IDs containing the number '123'
SELECT *
FROM orders
WHERE order_id LIKE '%123%';
-- Select books where the author's last name starts with 'S' and is followed by exactly four characters
SELECT *
FROM books
WHERE author_last_name LIKE 'S____';
Practical Notes:
%
matches any number of characters (including zero characters)._
matches exactly one character.
This set of examples can be integrated into the advanced section of your data retrieval project to help master wildcard pattern matching within SQL queries.
Practical Implementation: Incorporating Subqueries within WHERE Clauses
SQL Code Implementation
-- Creating sample tables for the demonstration
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department_id INT
);
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(100)
);
CREATE TABLE salaries (
employee_id INT,
salary DECIMAL(10, 2),
FOREIGN KEY (employee_id) REFERENCES employees(employee_id)
);
-- Sample data insertion
INSERT INTO departments (department_id, department_name) VALUES
(1, 'HR'), (2, 'Engineering'), (3, 'Marketing');
INSERT INTO employees (employee_id, name, department_id) VALUES
(1, 'Alice', 2), (2, 'Bob', 1), (3, 'Charlie', 2), (4, 'Daisy', 3);
INSERT INTO salaries (employee_id, salary) VALUES
(1, 75000), (2, 50000), (3, 100000), (4, 60000);
-- Query to find details of employees in departments with average salary greater than 60000
SELECT*
FROM employees
WHERE department_id IN
(
SELECT department_id
FROM salaries s
JOIN employees e ON s.employee_id = e.employee_id
GROUP BY department_id
HAVING AVG(s.salary) > 60000
);
Explanation of the SQL Code
- CREATE TABLE: Defines
employees
,departments
, andsalaries
tables. - INSERT INTO: Populates the tables with sample data.
- SELECT: Retrieves employee details.
- WHERE department_id IN (Subquery): Filters departments with an average salary above 60000 using a subquery inside the
WHERE
clause. - JOIN: Combines
salaries
andemployees
tables onemployee_id
. - GROUP BY: Groups results based on
department_id
. - HAVING AVG(s.salary) > 60000: Ensures only departments where the average salary exceeds 60000 are selected.
This concise, complete implementation allows you to incorporate subqueries in WHERE
clauses effectively.
Complex Data Retrieval with Aggregate Functions using WHERE Clauses in SQL
-- Filtering employees, we want to select employees who have a salary
-- greater than the average salary of the department they belong to.
SELECT
employee_id,
employee_name,
department_id,
salary
FROM
employees e
WHERE
salary > (
SELECT
AVG(salary)
FROM
employees
WHERE
department_id = e.department_id
);
-- Filtering products, we want to select products that have sales
-- above the average sales for their corresponding categories.
SELECT
product_id,
product_name,
category_id,
sales
FROM
products p
WHERE
sales > (
SELECT
AVG(sales)
FROM
products
WHERE
category_id = p.category_id
);
-- Filtering orders, we want to select orders placed on dates
-- where the total sales were greater than a specified threshold.
SELECT
order_id,
order_date,
customer_id,
total_sales
FROM
orders o
WHERE
order_date IN (
SELECT
order_date
FROM
orders
GROUP BY
order_date
HAVING
SUM(total_sales) > 10000
);
This script can be used to perform advanced filtering using aggregate functions and WHERE
clauses, useful in different scenarios like employee salary review, product sales analysis, and order filtering. Make sure your database tables (employees
, products
, orders
) contain relevant columns as referenced in the queries for smooth execution.
Optimizing Query Performance - Advanced Filtering Techniques with WHERE Clauses
-- Use Indexes on Columns Frequently Used in WHERE Clauses
CREATE INDEX idx_customer_name ON customers(name);
-- Use Proper Data Types and Avoid Functions on Indexed Columns
SELECT customer_id, name, email
FROM customers
WHERE name = 'John Doe';
-- Use BETWEEN for Range Queries
SELECT order_id, order_date, total
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';
-- Use EXISTS for Subqueries Instead of IN
SELECT customer_id, name
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id AND o.total > 100
);
-- Use JOINs Over Subqueries for Enhanced Performance
SELECT c.customer_id, c.name, o.order_id, o.total
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.total > 100;
-- Avoid SELECT *
SELECT customer_id, name, email
FROM customers
WHERE email LIKE '%@example.com';
-- Use LIMIT and OFFSET for Paging
SELECT customer_id, name, email
FROM customers
WHERE email LIKE '%@example.com'
ORDER BY name
LIMIT 10 OFFSET 20;
-- Narrow Down Results as Early as Possible
SELECT customer_id, name, email
FROM customers
WHERE name LIKE 'J%'
AND email LIKE '%@example.com'
AND active = TRUE;
General Practices
- Create indexes on columns frequently used in WHERE clauses.
- Avoid using functions on indexed columns to retain index use.
- Use range queries (
BETWEEN
) efficiently. - Prefer
EXISTS
overIN
for better performance in subqueries. - Joins can be more efficient than subqueries when pulling related data.
- Select only needed columns, avoid
SELECT *
. - Apply paging using
LIMIT
andOFFSET
. - Filter out as much data as possible at the beginning of the query.
Apply these structured SQL improvements to optimize your query performance effectively.