Project

Advanced Filtering with WHERE Clauses in SQL

Learn how to master complex data retrieval using advanced filtering techniques with WHERE clauses in SQL.

Empty image or helper icon

Advanced Filtering with WHERE Clauses in SQL

Description

This coding project will guide you through the essential skills needed to effectively filter and retrieve data from SQL databases. We will start with basic filtering and progressively work towards advanced techniques including multiple conditions, wildcard characters, and subqueries. Each step includes practical examples to solidify your understanding and enhance your coding proficiency.

The original prompt:

Create a detailed guide around the following topic - 'Advanced Filtering with WHERE Clauses'. Be informative by explaining the concepts thoroughly. Also, add many examples to assist with the understanding of topics.

Introduction to Basic WHERE Clauses

Setup Instructions

  1. Create Sample Database and Table
    CREATE DATABASE SchoolDB;
    USE SchoolDB;
    
    CREATE TABLE Students (
        StudentID INT PRIMARY KEY,
        FirstName VARCHAR(50),
        LastName VARCHAR(50),
        Age INT,
        Gender VARCHAR(1),
        Grade INT
    );
    
    INSERT INTO Students (StudentID, FirstName, LastName, Age, Gender, Grade)
    VALUES
    (1, 'John', 'Doe', 15, 'M', 10),
    (2, 'Jane', 'Smith', 14, 'F', 9),
    (3, 'Emily', 'Jones', 15, 'F', 10),
    (4, 'Michael', 'Brown', 16, 'M', 11),
    (5, 'Emma', 'Wilson', 14, 'F', 9);

Basic WHERE Clauses Examples

  1. Select Students with Age 15

    SELECT * FROM Students
    WHERE Age = 15;
  2. Select Female Students

    SELECT * FROM Students
    WHERE Gender = 'F';
  3. Select Students in Grade 10

    SELECT * FROM Students
    WHERE Grade = 10;
  4. Select Students with Last Name 'Smith'

    SELECT * FROM Students
    WHERE LastName = 'Smith';
  5. Select Students Older Than 14

    SELECT * FROM Students
    WHERE Age > 14;

Combining Conditions

  1. Select Female Students with Grade 9

    SELECT * FROM Students
    WHERE Gender = 'F' AND Grade = 9;
  2. Select Male Students Older Than 14

    SELECT * FROM Students
    WHERE Gender = 'M' AND Age > 14;
  3. Select Students with Age 14 or 15

    SELECT * FROM Students
    WHERE Age = 14 OR Age = 15;

Conclusion

Practice these examples in your SQL environment to get comfortable with basic WHERE clauses. These examples will serve as a foundation for more advanced filtering techniques.

Combining Conditions with AND & OR in SQL

-- Select all columns from the 'employees' table where the employee's department is 'Sales'
-- AND the employee's salary is greater than 50000, OR hire_date is before '2022-01-01'.

SELECT *
FROM employees
WHERE (department = 'Sales' AND salary > 50000)
   OR hire_date < '2022-01-01';
-- Select specific columns from the 'products' table where the category is 'Electronics'
-- AND the product is available in stock, OR the product has a discount of more than 20%.

SELECT product_id, product_name, price, discount
FROM products
WHERE (category = 'Electronics' AND in_stock = 'Y')
   OR discount > 20;
-- Select customers from the 'customers' table where the country is 'USA' OR the age is less than 30
-- AND the customer has made more than 3 purchases.

SELECT customer_id, customer_name, country, age, num_of_purchases
FROM customers
WHERE (country = 'USA' OR age < 30)
   AND num_of_purchases > 3;
-- Select orders from the 'orders' table where the order total is greater than 200
-- OR the order_date is within the current year AND the order status is 'Confirmed'.

SELECT order_id, customer_id, order_total, order_date, status
FROM orders
WHERE (order_total > 200 OR YEAR(order_date) = YEAR(CURDATE()))
   AND status = 'Confirmed';
-- Select employees from the 'employees' table where the position is 'Manager'
-- AND the years_of_experience is greater than 5 OR the age is greater than 40.

SELECT employee_id, name, position, years_of_experience, age
FROM employees
WHERE (position = 'Manager' AND years_of_experience > 5)
   OR age > 40;

Practical Implementation of Using Wildcards for Pattern Matching

SQL Query Implementation

-- Select students whose first names start with 'A'
SELECT * 
FROM students
WHERE first_name LIKE 'A%';

-- Select products with descriptions that end with 'xyz'
SELECT *
FROM products
WHERE description LIKE '%xyz';

-- Select employees with email addresses that include 'gmail'
SELECT *
FROM employees
WHERE email LIKE '%gmail%';

-- Select orders with IDs containing the number '123'
SELECT *
FROM orders
WHERE order_id LIKE '%123%';

-- Select books where the author's last name starts with 'S' and is followed by exactly four characters
SELECT *
FROM books
WHERE author_last_name LIKE 'S____';

Practical Notes:

  • % matches any number of characters (including zero characters).
  • _ matches exactly one character.

This set of examples can be integrated into the advanced section of your data retrieval project to help master wildcard pattern matching within SQL queries.

Practical Implementation: Incorporating Subqueries within WHERE Clauses

SQL Code Implementation

-- Creating sample tables for the demonstration
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(100),
    department_id INT
);

CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(100)
);

CREATE TABLE salaries (
    employee_id INT,
    salary DECIMAL(10, 2),
    FOREIGN KEY (employee_id) REFERENCES employees(employee_id)
);

-- Sample data insertion
INSERT INTO departments (department_id, department_name) VALUES
(1, 'HR'), (2, 'Engineering'), (3, 'Marketing');

INSERT INTO employees (employee_id, name, department_id) VALUES
(1, 'Alice', 2), (2, 'Bob', 1), (3, 'Charlie', 2), (4, 'Daisy', 3);

INSERT INTO salaries (employee_id, salary) VALUES
(1, 75000), (2, 50000), (3, 100000), (4, 60000);

-- Query to find details of employees in departments with average salary greater than 60000

SELECT*
FROM employees
WHERE department_id IN
(
    SELECT department_id
    FROM salaries s
    JOIN employees e ON s.employee_id = e.employee_id
    GROUP BY department_id
    HAVING AVG(s.salary) > 60000
);

Explanation of the SQL Code

  • CREATE TABLE: Defines employees, departments, and salaries tables.
  • INSERT INTO: Populates the tables with sample data.
  • SELECT: Retrieves employee details.
  • WHERE department_id IN (Subquery): Filters departments with an average salary above 60000 using a subquery inside the WHERE clause.
  • JOIN: Combines salaries and employees tables on employee_id.
  • GROUP BY: Groups results based on department_id.
  • HAVING AVG(s.salary) > 60000: Ensures only departments where the average salary exceeds 60000 are selected.

This concise, complete implementation allows you to incorporate subqueries in WHERE clauses effectively.

Complex Data Retrieval with Aggregate Functions using WHERE Clauses in SQL

-- Filtering employees, we want to select employees who have a salary 
-- greater than the average salary of the department they belong to.

SELECT 
    employee_id,
    employee_name,
    department_id,
    salary
FROM 
    employees e
WHERE 
    salary > (
        SELECT 
            AVG(salary) 
        FROM 
            employees 
        WHERE 
            department_id = e.department_id
    );

-- Filtering products, we want to select products that have sales 
-- above the average sales for their corresponding categories.

SELECT 
    product_id,
    product_name,
    category_id,
    sales
FROM 
    products p
WHERE 
    sales > (
        SELECT 
            AVG(sales)
        FROM 
            products 
        WHERE 
            category_id = p.category_id
    );

-- Filtering orders, we want to select orders placed on dates 
-- where the total sales were greater than a specified threshold.

SELECT 
    order_id,
    order_date,
    customer_id,
    total_sales
FROM 
    orders o
WHERE 
    order_date IN (
        SELECT 
            order_date
        FROM 
            orders 
        GROUP BY 
            order_date
        HAVING 
            SUM(total_sales) > 10000
    );

This script can be used to perform advanced filtering using aggregate functions and WHERE clauses, useful in different scenarios like employee salary review, product sales analysis, and order filtering. Make sure your database tables (employees, products, orders) contain relevant columns as referenced in the queries for smooth execution.

Optimizing Query Performance - Advanced Filtering Techniques with WHERE Clauses

-- Use Indexes on Columns Frequently Used in WHERE Clauses
CREATE INDEX idx_customer_name ON customers(name);

-- Use Proper Data Types and Avoid Functions on Indexed Columns
SELECT customer_id, name, email 
FROM customers 
WHERE name = 'John Doe';

-- Use BETWEEN for Range Queries
SELECT order_id, order_date, total 
FROM orders 
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

-- Use EXISTS for Subqueries Instead of IN
SELECT customer_id, name 
FROM customers c
WHERE EXISTS (
    SELECT 1 
    FROM orders o 
    WHERE o.customer_id = c.customer_id AND o.total > 100
);

-- Use JOINs Over Subqueries for Enhanced Performance
SELECT c.customer_id, c.name, o.order_id, o.total 
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.total > 100;

-- Avoid SELECT *
SELECT customer_id, name, email 
FROM customers 
WHERE email LIKE '%@example.com';

-- Use LIMIT and OFFSET for Paging
SELECT customer_id, name, email 
FROM customers 
WHERE email LIKE '%@example.com'
ORDER BY name
LIMIT 10 OFFSET 20;

-- Narrow Down Results as Early as Possible
SELECT customer_id, name, email 
FROM customers 
WHERE name LIKE 'J%' 
  AND email LIKE '%@example.com'
  AND active = TRUE;

General Practices

  • Create indexes on columns frequently used in WHERE clauses.
  • Avoid using functions on indexed columns to retain index use.
  • Use range queries (BETWEEN) efficiently.
  • Prefer EXISTS over IN for better performance in subqueries.
  • Joins can be more efficient than subqueries when pulling related data.
  • Select only needed columns, avoid SELECT *.
  • Apply paging using LIMIT and OFFSET.
  • Filter out as much data as possible at the beginning of the query.

Apply these structured SQL improvements to optimize your query performance effectively.