Project

Comprehensive Retail Sales Analysis Using SQL

This project aims to analyze retail sales data using SQL queries to draw meaningful insights and create visuals to aid business decisions.

Empty image or helper icon

Comprehensive Retail Sales Analysis Using SQL

Description

The project involves processing and analyzing sales data stored in a SQL database. Participants will write SQL queries to perform various kinds of analysis, including sales trends, category performance, and the impact of economic indexes and promotions on sales. The final output will include a series of charts and graphs that visually represent the insights derived from the data.

The original prompt:

Based on this dataset can you work out various types of analysis that can be completed and then create specific SQL code to get subsets of data that could be turned into charts to highlight the analysis.

I'm looking for a range of SQL code that could be easily used for the analysis of this dataset

Date Store Category Sales Economic_Index Promotion 1/1/2019 Store A Electronics 412 99.44954273 0 1/2/2019 Store A Electronics 202 98.85525499 0 1/3/2019 Store A Electronics 299 101.1858996 0 1/4/2019 Store A Electronics 181 98.69758959 1 1/5/2019 Store A Electronics 490 97.86122082 1 1/6/2019 Store A Electronics 306 103.6332865 1 1/7/2019 Store A Electronics 327 99.49102562 1 1/8/2019 Store A Electronics 197 102.1630683 0 1/9/2019 Store A Electronics 252 92.6185201 0 1/10/2019 Store A Electronics 413 98.16149309 0 1/11/2019 Store A Electronics 219 98.27721518 1 1/12/2019 Store A Electronics 188 104.2501821 0 1/13/2019 Store A Electronics 455 101.9019185 0 1/14/2019 Store A Electronics 169 103.4198824 0 1/15/2019 Store A Electronics 472 99.72160068 0

Introduction to SQL and Relational Databases

Understanding Relational Databases

A relational database is a type of database that stores and provides access to data points that are related to one another. Relational databases are based on the relational model, an intuitive, straightforward way to represent data in tables. In a relational database, each row in the table is a record with a unique ID called the key. The columns of the table hold attributes of the data, and each record usually has a value for each attribute, making it easy to establish relationships between data points.

SQL Basics

SQL (Structured Query Language) is the standard programming language used to interact with relational databases. It allows you to create, update, delete, and query data within a database. Below are some fundamental SQL statements and their purposes:

Data Definition Language (DDL)

  • CREATE: Used to create a new table or database.
CREATE DATABASE retail_sales;

USE retail_sales;

CREATE TABLE sales (
    sale_id INT PRIMARY KEY,
    product_id INT,
    quantity INT,
    price DECIMAL,
    sale_date DATE
);
  • ALTER: Modify an existing database object, like a table.
ALTER TABLE sales
ADD COLUMN customer_id INT;
  • DROP: Delete a table or database.
DROP TABLE sales;
DROP DATABASE retail_sales;

Data Manipulation Language (DML)

  • INSERT: Add new records to a table.
INSERT INTO sales (sale_id, product_id, quantity, price, sale_date)
VALUES (1, 101, 2, 19.99, '2023-01-15');
  • UPDATE: Modify existing records.
UPDATE sales
SET price = 17.99
WHERE sale_id = 1;
  • DELETE: Remove records from a table.
DELETE FROM sales
WHERE sale_id = 1;

Data Query Language (DQL)

  • SELECT: Retrieve data from the database.
SELECT * FROM sales;

SELECT product_id, SUM(quantity) as total_quantity
FROM sales
GROUP BY product_id;

Setting Up Your Environment

  1. Install a Database Server: To start working with SQL, you need a relational database management system (RDBMS) like MySQL, PostgreSQL, or SQLite.

  2. Connect to the Database: Use a client tool like mysql CLI for MySQL, psql for PostgreSQL, or visual clients like MySQL Workbench or pgAdmin.

    # For MySQL
    mysql -u  -p
    
    # For PostgreSQL
    psql -U  -d 
  3. Create Database and Table: Use the CREATE statements as shown above to set up your initial database and table structure.

Practical Example: Retail Sales Data Analysis

Let's assume you have a retail sales table and you want to analyze your sales data:

  1. Total Sales by Product:

    SELECT product_id, SUM(quantity * price) as total_sales
    FROM sales
    GROUP BY product_id;
  2. Monthly Sales Report:

    SELECT DATE_FORMAT(sale_date, '%Y-%m') as month, SUM(quantity * price) as total_sales
    FROM sales
    GROUP BY month;
  3. Top 5 Products by Sales:

    SELECT product_id, SUM(quantity * price) as total_sales
    FROM sales
    GROUP BY product_id
    ORDER BY total_sales DESC
    LIMIT 5;

This basic setup and these queries should provide a strong foundation for analyzing your retail sales data using SQL. Continue from here to build more complex queries and derive deeper insights from your data.

Setting Up the Database Environment

SQL Script for Environment Setup

The following SQL script will guide you through setting up your database environment for analyzing retail sales data. This includes creating the database, defining the required tables, and establishing relationships.

1. Create Database

CREATE DATABASE retail_sales_analysis;

2. Switch to the New Database

USE retail_sales_analysis;

3. Create Tables

3.1. Customers Table

CREATE TABLE customers (
    customer_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(20),
    address VARCHAR(255),
    city VARCHAR(100),
    state VARCHAR(100),
    zip_code VARCHAR(15),
    country VARCHAR(100)
);

3.2. Products Table

CREATE TABLE products (
    product_id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100),
    description TEXT,
    price DECIMAL(10, 2),
    category VARCHAR(100),
    stock_quantity INT
);

3.3. Sales Table

CREATE TABLE sales (
    sale_id INT AUTO_INCREMENT PRIMARY KEY,
    customer_id INT,
    product_id INT,
    quantity INT,
    sale_date DATE,
    total_amount DECIMAL(10, 2),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id),
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

3.4. Employees Table

CREATE TABLE employees (
    employee_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(20),
    position VARCHAR(50),
    salary DECIMAL(10, 2),
    hire_date DATE
);

3.5. Store Information Table

CREATE TABLE stores (
    store_id INT AUTO_INCREMENT PRIMARY KEY,
    store_name VARCHAR(100),
    location VARCHAR(255),
    manager_id INT,
    FOREIGN KEY (manager_id) REFERENCES employees(employee_id)
);

4. Populate Tables with Sample Data

4.1. Insert Data into Customers Table

INSERT INTO customers (first_name, last_name, email, phone, address, city, state, zip_code, country)
VALUES
('John', 'Doe', 'john.doe@example.com', '123-456-7890', '123 Main St', 'Cityville', 'State', '12345', 'Country'),
('Jane', 'Smith', 'jane.smith@example.com', '098-765-4321', '456 Elm St', 'Townsville', 'State', '67890', 'Country');

4.2. Insert Data into Products Table

INSERT INTO products (name, description, price, category, stock_quantity)
VALUES
('Product A', 'Description for Product A', 19.99, 'Category 1', 100),
('Product B', 'Description for Product B', 29.99, 'Category 2', 150);

4.3. Insert Data into Sales Table

INSERT INTO sales (customer_id, product_id, quantity, sale_date, total_amount)
VALUES
(1, 1, 2, '2023-10-01', 39.98),
(2, 2, 1, '2023-10-02', 29.99);

4.4. Insert Data into Employees Table

INSERT INTO employees (first_name, last_name, email, phone, position, salary, hire_date)
VALUES
('Alice', 'Johnson', 'alice.johnson@example.com', '555-1234', 'Manager', 60000.00, '2020-01-15'),
('Bob', 'Brown', 'bob.brown@example.com', '555-5678', 'Cashier', 30000.00, '2019-06-01');

4.5. Insert Data into Stores Table

INSERT INTO stores (store_name, location, manager_id)
VALUES
('Store 1', '789 Oak St, Bigcity', 1),
('Store 2', '101 Pine St, Smalltown', 2);

Your database environment is now set up and ready to be used for analyzing retail sales data. You have created the necessary schemas and populated them with sample data to get started.

Loading and Preparing Data for Analysis in SQL

Load Data into SQL Tables

Assume you have a sales data file sales_data.csv that needs to be loaded into a sales table in your SQL database.

-- Create the table structure for sales data
CREATE TABLE sales (
    sale_id INT PRIMARY KEY,
    product_id INT,
    customer_id INT,
    sale_date DATE,
    sale_amount DECIMAL(10, 2)
);

-- Use SQL Server's BULK INSERT or similar for other SQL engines to load the CSV data into the sales table.
BULK INSERT sales
FROM 'C:\path\to\sales_data.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
);

Data Cleaning

Ensure the data is clean and consistent. For instance, you may need to remove duplicate records, handle null values, and standardize date formats.

Remove Duplicate Records

DELETE FROM sales
WHERE sale_id IN (
    SELECT sale_id
    FROM (
        SELECT sale_id, ROW_NUMBER() OVER (PARTITION BY sale_id ORDER BY sale_id) AS row_num
        FROM sales
    ) t
    WHERE t.row_num > 1
);

Handle Null Values

Replace null values in sale_amount with 0.

UPDATE sales
SET sale_amount = 0
WHERE sale_amount IS NULL;

Standardize Date Format

Ensure sale_date is in YYYY-MM-DD format.

-- Assuming your SQL dialect stores DATE data type correctly.
-- Otherwise, you may need to convert the format explicitly when inserting.

Data Transformation

You may need to transform some columns for analysis. For example, extract the year and month from sale_date.

ALTER TABLE sales
ADD sale_year INT,
ADD sale_month INT;

UPDATE sales
SET sale_year = YEAR(sale_date),
    sale_month = MONTH(sale_date);

Indexing

Create indexes to optimize query performance.

CREATE INDEX idx_sale_date ON sales(sale_date);
CREATE INDEX idx_product_id ON sales(product_id);
CREATE INDEX idx_customer_id ON sales(customer_id);

Verify Data Integrity

Run basic queries to ensure your data has loaded and transformed correctly.

Count of Records

SELECT COUNT(*) AS total_records FROM sales;

Sample Records

SELECT TOP 10 * FROM sales;

Sales Summary

Get a quick summary to ensure key figures look correct.

SELECT sale_year, sale_month, SUM(sale_amount) AS total_sales
FROM sales
GROUP BY sale_year, sale_month
ORDER BY sale_year, sale_month;

Result

Now your data is loaded, cleaned, transformed, and ready for analysis. You can proceed to writing more complex queries to draw meaningful insights from your retail sales data.

Basic SQL Queries for Data Retrieval

Fetch All Records

To retrieve all records from a table, use the SELECT * statement. For instance, fetching all records from the sales table:

SELECT * FROM sales;

Filtering Data

Use the WHERE clause to filter records. For example, to retrieve sales records where the total amount is greater than $100:

SELECT * FROM sales
WHERE total_amount > 100;

Sorting Data

To sort the data, use the ORDER BY clause. To sort sales records by total amount in descending order:

SELECT * FROM sales
ORDER BY total_amount DESC;

Selecting Specific Columns

Fetch specific columns to narrow down the retrieved data. To get only the date and total_amount from the sales table:

SELECT date, total_amount FROM sales;

Aggregating Data

Use aggregate functions like COUNT, SUM, AVG, MIN, and MAX to perform calculations on your data. For instance, to get the total sales amount:

SELECT SUM(total_amount) AS total_sales FROM sales;

Grouping Data

Use the GROUP BY clause for grouping records. Combined with aggregate functions, it provides useful summaries. To get total sales amount per day:

SELECT date, SUM(total_amount) AS daily_sales
FROM sales
GROUP BY date;

Joining Tables

Use the JOIN clause to combine records from two or more tables. If you have a customers table, to join it with the sales table based on customer_id:

SELECT sales.date, sales.total_amount, customers.customer_name
FROM sales
JOIN customers ON sales.customer_id = customers.customer_id;

Filtering Grouped Data

To filter grouped records, use the HAVING clause. For instance, to get dates with total sales greater than $500:

SELECT date, SUM(total_amount) AS daily_sales
FROM sales
GROUP BY date
HAVING daily_sales > 500;

Using Subqueries

A subquery can be used to provide data to the main query. To find customers who made sales greater than $2000:

SELECT customer_name
FROM customers
WHERE customer_id IN (
    SELECT customer_id
    FROM sales
    GROUP BY customer_id
    HAVING SUM(total_amount) > 2000
);

These examples should provide you with the foundational SQL queries necessary to retrieve and analyze retail sales data effectively.

Aggregating Data Using SQL

In this section of the project, we will focus on aggregating data using SQL to extract meaningful insights from retail sales data. We'll cover various aggregation functions and techniques to summarize data and draw insights. Let's assume we have a table named sales with the following columns: sale_id, product_id, customer_id, store_id, sale_date, quantity, price.

Aggregation Queries

1. Total Sales Revenue

To calculate the total sales revenue, we can use the SUM() function on the price column multiplied by the quantity column.

SELECT SUM(price * quantity) AS total_revenue
FROM sales;

2. Total Quantity Sold

To find the total quantity of products sold, we use the SUM() function on the quantity column.

SELECT SUM(quantity) AS total_quantity_sold
FROM sales;

3. Average Sale Amount

To calculate the average sale amount, we use the AVG() function on the total price (price * quantity).

SELECT AVG(price * quantity) AS average_sale_amount
FROM sales;

4. Sales Revenue by Product

To find the sales revenue for each product, we group by the product_id and use the SUM() function.

SELECT product_id, SUM(price * quantity) AS product_revenue
FROM sales
GROUP BY product_id;

5. Sales Revenue by Store

To find the sales revenue for each store, we group by the store_id and use the SUM() function.

SELECT store_id, SUM(price * quantity) AS store_revenue
FROM sales
GROUP BY store_id;

6. Number of Sales per Customer

To count the number of sales transactions per customer, we group by the customer_id and use the COUNT() function.

SELECT customer_id, COUNT(sale_id) AS number_of_sales
FROM sales
GROUP BY customer_id;

7. Monthly Sales Revenue

To calculate monthly sales revenue, we need to extract the month and year from the sale_date and then group by these values.

SELECT EXTRACT(YEAR FROM sale_date) AS year, 
       EXTRACT(MONTH FROM sale_date) AS month,
       SUM(price * quantity) AS monthly_revenue
FROM sales
GROUP BY year, month
ORDER BY year, month;

8. Top 5 Products by Revenue

To find the top 5 products by revenue, we use the SUM() function to calculate revenue for each product and then use the ORDER BY clause and LIMIT clause to get the top 5.

SELECT product_id, SUM(price * quantity) AS product_revenue
FROM sales
GROUP BY product_id
ORDER BY product_revenue DESC
LIMIT 5;

9. Average Quantity Sold per Product

To find the average quantity sold per product, we use the AVG() function on the quantity column grouped by product_id.

SELECT product_id, AVG(quantity) AS average_quantity_sold
FROM sales
GROUP BY product_id;

Aggregating Data for Visualizations

To aid in business decision-making, we can aggregate data specifically for visualizations. For instance, aggregating data for a sales trend chart could look like this:

Monthly Sales Trend

SELECT EXTRACT(YEAR FROM sale_date) AS year, 
       EXTRACT(MONTH FROM sale_date) AS month,
       SUM(price * quantity) AS monthly_revenue
FROM sales
GROUP BY year, month
ORDER BY year, month;

This query provides a structured timeline of monthly sales revenue, which can be used to plot a line chart or bar chart to visualize sales trends over time.

Conclusion

These SQL queries demonstrate various methods of aggregating retail sales data to extract insightful metrics that can inform business decisions. The examples provided cover the most common types of aggregation tasks you might need when analyzing sales data.

Analyzing Sales Trends Over Time

SQL Implementation

To analyze sales trends over time using SQL, we need to perform a few steps:

  1. Aggregate sales data on different time granularity levels such as daily, monthly, and yearly.
  2. Calculate year-over-year (YoY), month-over-month (MoM), or day-over-day (DoD) growth rates.
  3. Create views or temporary tables for visualizations.

Here’s how you can implement these steps:

-- Aggregate daily sales
SELECT 
    DATE(sale_date) AS day,
    SUM(sales_amount) AS total_sales
FROM 
    sales
GROUP BY 
    DATE(sale_date)
ORDER BY 
    DATE(sale_date);

-- Aggregate monthly sales
SELECT 
    DATE_FORMAT(sale_date, '%Y-%m') AS month,
    SUM(sales_amount) AS total_sales
FROM 
    sales
GROUP BY 
    DATE_FORMAT(sale_date, '%Y-%m')
ORDER BY 
    DATE_FORMAT(sale_date, '%Y-%m');

-- Aggregate yearly sales
SELECT 
    YEAR(sale_date) AS year,
    SUM(sales_amount) AS total_sales
FROM 
    sales
GROUP BY 
    YEAR(sale_date)
ORDER BY 
    YEAR(sale_date);
Calculate Growth Rates
-- Calculate Month-over-Month growth
SELECT 
    DATE_FORMAT(sale_date, '%Y-%m') AS month,
    SUM(sales_amount) AS total_sales,
    (SUM(sales_amount) - LAG(SUM(sales_amount)) OVER (ORDER BY DATE_FORMAT(sale_date, '%Y-%m'))) / LAG(SUM(sales_amount)) OVER (ORDER BY DATE_FORMAT(sale_date, '%Y-%m')) * 100 AS MoM_growth
FROM 
    sales
GROUP BY 
    DATE_FORMAT(sale_date, '%Y-%m')
ORDER BY 
    DATE_FORMAT(sale_date, '%Y-%m');

-- Calculate Year-over-Year growth
SELECT 
    YEAR(sale_date) AS year,
    SUM(sales_amount) AS total_sales,
    (SUM(sales_amount) - LAG(SUM(sales_amount)) OVER (ORDER BY YEAR(sale_date)) ) / LAG(SUM(sales_amount)) OVER (ORDER BY YEAR(sale_date)) * 100 AS YoY_growth
FROM 
    sales
GROUP BY 
    YEAR(sale_date)
ORDER BY 
    YEAR(sale_date);

Creating Views for Visualization

-- Create a view for daily sales
CREATE VIEW daily_sales AS
SELECT 
    DATE(sale_date) AS day,
    SUM(sales_amount) AS total_sales
FROM 
    sales
GROUP BY 
    DATE(sale_date);

-- Create a view for monthly sales
CREATE VIEW monthly_sales AS
SELECT 
    DATE_FORMAT(sale_date, '%Y-%m') AS month,
    SUM(sales_amount) AS total_sales
FROM 
    sales
GROUP BY 
    DATE_FORMAT(sale_date, '%Y-%m');

-- Create a view for yearly sales
CREATE VIEW yearly_sales AS
SELECT 
    YEAR(sale_date) AS year,
    SUM(sales_amount) AS total_sales
FROM 
    sales
GROUP BY 
    YEAR(sale_date);

These SQL queries and views will help you to analyze the sales trends over different periods, calculate growth rates, and prepare the data for visualization purposes.

7. Impact of Economic Indicators on Sales

This section focuses on analyzing how various economic indicators influence retail sales. Assuming the economic indicators data are already imported into the database, we will perform SQL queries to draw insights.

Prerequisites

  • sales_data table: Contains retail sales information (columns: date, product_id, amount, quantity, etc.)
  • economic_indicators table: Contains economic indicators information (columns: date, indicator_name, indicator_value, etc.)

Step 1: Data Join

To analyze the impact of economic indicators on sales, we need to join the sales_data with the economic_indicators table based on the date column.

CREATE VIEW sales_with_indicators AS
SELECT 
    s.date,
    s.product_id,
    s.amount,
    s.quantity,
    e.indicator_name,
    e.indicator_value
FROM 
    sales_data s
JOIN 
    economic_indicators e
ON 
    s.date = e.date;

Step 2: Aggregation by Indicators

Aggregate the sales data by different economic indicators to understand their direct impact.

CREATE VIEW aggregated_sales_by_indicators AS
SELECT 
    indicator_name,
    indicator_value,
    SUM(amount) as total_sales,
    SUM(quantity) as total_quantity
FROM 
    sales_with_indicators
GROUP BY 
    indicator_name, indicator_value;

Step 3: Correlation Analysis

To quantify the impact, compute the correlation between economic indicators and sales figures.

-- Use aggregate function to pull the data needed for correlation calculation.

WITH sales_cor AS (
    SELECT 
        indicator_name,
        indicator_value,
        total_sales
    FROM 
        aggregated_sales_by_indicators
)

-- Assuming you have a function `CORR` available for computing correlation
SELECT 
    indicator_name,
    CORR(indicator_value, total_sales) AS sales_correlation
FROM 
    sales_cor
GROUP BY 
    indicator_name;

Step 4: Visualize the Impact

You can use external tools like Tableau, PowerBI, or your preferred visualization tool to create visualizations. Here's an example SQL query to extract data for a specific indicator for visualization:

SELECT 
    date,
    indicator_value AS economic_indicator,
    SUM(amount) AS total_sales
FROM 
    sales_with_indicators
WHERE 
    indicator_name = 'specific_indicator_name'
GROUP BY 
    date, economic_indicator
ORDER BY 
    date;

Conclusion

The SQL queries outlined above allow detailed analysis of the relationship between economic indicators and retail sales. By joining the data sources, aggregating relevant stats, and calculating correlations, we can draw meaningful insights. Visualizing these relationships provides a comprehensive view of the impact.

Apply the extracted data to your visualization tool of choice for further graphical insights.

Analyzing Sales by Category and Store

SQL Queries to Analyze Sales by Category and Store

Step 1: Total Sales by Category and Store

This query calculates the total sales for each category and store.

SELECT 
    store_name, 
    category, 
    SUM(sales_amount) AS total_sales
FROM 
    sales
JOIN 
    products ON sales.product_id = products.product_id
JOIN 
    stores ON sales.store_id = stores.store_id
GROUP BY 
    store_name, 
    category
ORDER BY 
    store_name, 
    category;

Step 2: Average Sales by Category and Store

This query calculates the average sales for each category and store.

SELECT 
    store_name, 
    category, 
    AVG(sales_amount) AS average_sales
FROM 
    sales
JOIN 
    products ON sales.product_id = products.product_id
JOIN 
    stores ON sales.store_id = stores.store_id
GROUP BY 
    store_name, 
    category
ORDER BY 
    store_name, 
    category;

Step 3: Number of Sales Transactions by Category and Store

This query counts the number of sales transactions for each category and store.

SELECT 
    store_name, 
    category, 
    COUNT(sales.transaction_id) AS number_of_sales
FROM 
    sales
JOIN 
    products ON sales.product_id = products.product_id
JOIN 
    stores ON sales.store_id = stores.store_id
GROUP BY 
    store_name, 
    category
ORDER BY 
    store_name, 
    category;

Step 4: Visualizing Sales by Category and Store

Though SQL does not directly create visuals, we can extract the data and utilize a visualization tool like Excel, Tableau, or a matplotlib script in Python. Here’s an example of how the data might be prepped for export:

SELECT 
    store_name, 
    category, 
    SUM(sales_amount) AS total_sales,
    AVG(sales_amount) AS average_sales,
    COUNT(sales.transaction_id) AS number_of_sales
FROM 
    sales
JOIN 
    products ON sales.product_id = products.product_id
JOIN 
    stores ON sales.store_id = stores.store_id
GROUP BY 
    store_name, 
    category
ORDER BY 
    store_name, 
    category;

Export the results of this final query into a CSV file, which can subsequently be used to create visuals.

Conclusion

These SQL queries will help to aggregate and analyze sales data by category and store. You can copy the results into visualization tools for further analysis and reporting.

Effect of Promotions on Sales Performance

In this section, we will focus on how to analyze the effect of promotions on sales performance using SQL queries.

Step 1: Identify Relevant Tables and Columns

Assume we have the following tables in the database:

  • sales: Contains sales transactions

    • sale_id: Primary key
    • date: Date of the sale
    • store_id: Store identifier
    • product_id: Product identifier
    • quantity_sold: Quantity of product sold
    • total_amount: Total sale amount
  • promotions: Contains promotion information

    • promo_id: Primary key
    • promo_desc: Description of promotion
    • start_date: Promotion start date
    • end_date: Promotion end date
    • discount: Discount percentage
  • products: Contains product information

    • product_id: Primary key
    • product_name: Name of the product
    • category: Category of the product

Step 2: Analyze Sales Performance with and without Promotions

To determine the effect of promotions on sales performance, we will compare total sales during promotion periods to periods without promotions.

Query 1: Total Sales During Promotion Periods

SELECT 
    p.product_id,
    p.product_name,
    SUM(s.quantity_sold) AS total_quantity_sold,
    SUM(s.total_amount) AS total_sales_amount
FROM 
    sales s
JOIN 
    products p ON s.product_id = p.product_id
JOIN 
    promotions pr ON s.date BETWEEN pr.start_date AND pr.end_date
GROUP BY 
    p.product_id, p.product_name
ORDER BY 
    total_sales_amount DESC;

Query 2: Total Sales Outside of Promotion Periods

SELECT 
    p.product_id,
    p.product_name,
    SUM(s.quantity_sold) AS total_quantity_sold,
    SUM(s.total_amount) AS total_sales_amount
FROM 
    sales s
JOIN 
    products p ON s.product_id = p.product_id
WHERE 
    s.date NOT BETWEEN 
    (SELECT start_date FROM promotions WHERE product_id = p.product_id) AND 
    (SELECT end_date FROM promotions WHERE product_id = p.product_id)
GROUP BY 
    p.product_id, p.product_name
ORDER BY 
    total_sales_amount DESC;

Step 3: Calculate the Effectiveness of Promotions

To quantify the effect, we can compute the difference or ratio between sales during promotions and sales outside promotions.

Query 3: Sales Performance Comparison

WITH promoted_sales AS (
    SELECT 
        p.product_id,
        SUM(s.total_amount) AS promoted_sales_amount
    FROM 
        sales s
    JOIN 
        products p ON s.product_id = p.product_id
    JOIN 
        promotions pr ON s.date BETWEEN pr.start_date AND pr.end_date
    GROUP BY 
        p.product_id
),
non_promoted_sales AS (
    SELECT 
        p.product_id,
        SUM(s.total_amount) AS non_promoted_sales_amount
    FROM 
        sales s
    JOIN 
        products p ON s.product_id = p.product_id
    WHERE 
        s.date NOT BETWEEN 
        (SELECT start_date FROM promotions WHERE product_id = p.product_id) AND 
        (SELECT end_date FROM promotions WHERE product_id = p.product_id)
    GROUP BY 
        p.product_id
)

SELECT 
    ps.product_id,
    p.product_name,
    ps.promoted_sales_amount,
    nps.non_promoted_sales_amount,
    (ps.promoted_sales_amount - nps.non_promoted_sales_amount) AS sales_difference,
    (ps.promoted_sales_amount / nps.non_promoted_sales_amount) AS sales_ratio
FROM 
    promoted_sales ps
JOIN 
    non_promoted_sales nps ON ps.product_id = nps.product_id
JOIN 
    products p ON ps.product_id = p.product_id
ORDER BY 
    sales_difference DESC;

Conclusion

The provided queries help you analyze sales performance by comparing data from promotion periods against non-promotion periods. These insights will enable you to understand the impact of promotions on sales and make data-driven decisions for future promotions.

Creating Visual Representations of Data Insights

To create visual representations of data insights from SQL queries, you can use a reporting or visualization tool such as Tableau, Power BI, or similar platforms. Below are the steps and SQL queries for extracting data required for the visual representations:

1. Total Sales Over Time

SELECT 
    DATE_TRUNC('month', sale_date) AS sale_month,
    SUM(total_amount) AS total_sales
FROM 
    sales
GROUP BY 
    sale_month
ORDER BY 
    sale_month;

2. Sales by Category

SELECT
    category,
    SUM(total_amount) AS total_sales
FROM
    sales
JOIN
    products
ON
    sales.product_id = products.product_id
GROUP BY
    category
ORDER BY
    total_sales DESC;

3. Sales Performance by Store

SELECT
    store_name,
    SUM(total_amount) AS total_sales
FROM
    sales
JOIN
    stores
ON
    sales.store_id = stores.store_id
GROUP BY
    store_name
ORDER BY
    total_sales DESC;

4. Effectiveness of Promotions

SELECT
    p.promotion_name,
    SUM(s.total_amount) AS total_sales,
    COUNT(s.sale_id) AS total_transactions
FROM
    sales s
JOIN
    promotions p
ON
    s.promotion_id = p.promotion_id
GROUP BY
    p.promotion_name
ORDER BY
    total_sales DESC;

5. Impact of Economic Indicators on Sales

SELECT
    e.economic_indicator,
    e.indicator_value,
    SUM(s.total_amount) AS total_sales
FROM
    sales s
JOIN
    economic_data e
ON
    DATE_TRUNC('month', s.sale_date) = DATE_TRUNC('month', e.date)
GROUP BY
    e.economic_indicator, e.indicator_value
ORDER BY
    e.indicator_value DESC;

Exporting Queries for Visualization

After extracting the data with the above SQL queries, export these results to a CSV file or connect the above SQL queries directly to a visualization tool like Tableau or Power BI.

Example Workflow in Tableau:

  1. Connect to SQL Database: Open Tableau > Connect to Data > Select your SQL Database.
  2. Write Custom SQL Queries: In the Data source tab, use New Custom SQL to paste the queries above and load data.
  3. Create Visualizations: Use the worksheet area to drag and drop the required fields, creating bar charts, line graphs, or other visualizations based on the exported data.
  4. Combine Visuals in a Dashboard: Create a new dashboard in Tableau and add all the visualizations created from the SQL queries.
  5. Publish Dashboard: Share the insights with stakeholders by publishing the dashboard to Tableau server or Tableau Public.

Example Workflow in Power BI:

  1. Connect to SQL Database: Open Power BI Desktop > Get Data > SQL Server.
  2. Write Custom Queries: Use the Advanced Editor to insert SQL queries and load data.
  3. Create Visualizations: Utilize the Report view to draw different visuals such as pie charts, line charts, bar graphs, etc., by dragging the required fields.
  4. Arrange in Report View: Organize all visuals into different pages or sections of the report.
  5. Publish Report: Publish the report to Power BI Service to share with team members.

By following this implementation, you'll be able to create detailed visual representations of your retail sales data, providing meaningful insights for business decisions.