Project

Absenteeism Analysis with Power BI using DAX and SQL

This project aims to calculate and analyze the percentage of absenteeism for the financial year to date using Power BI, leveraging DAX and SQL.

Empty image or helper icon

Absenteeism Analysis with Power BI using DAX and SQL

Description

The project will guide you through the process of importing data, creating necessary relationships, and writing DAX and SQL queries to calculate absenteeism percentages. You will then visualize the data in Power BI to derive meaningful insights. By the end of this project, you will be equipped with the skills to perform data analysis and create dynamic reports on employee absenteeism.

The original prompt:

I need to calculate the percentage of absenteeism for the financial year to date.

Data Import and Preparation

In this section, we will focus on importing and preparing data for calculating and analyzing the percentage of absenteeism for the financial year to date using Power BI, with DAX and SQL.

Data Import in Power BI

  1. Open Power BI Desktop:

    • Launch Power BI Desktop application.
  2. Get Data:

    • Click on the Home ribbon.
    • Select Get Data.
  3. Choose Data Source:

    • Choose the appropriate data source (e.g., SQL Server, Excel, CSV).
    • For SQL Server:
      • Click SQL Server.
      • Enter Server and Database details.
      • Provide the credentials if necessary.
    • For other sources:
      • Click the corresponding data source (e.g., Excel, Text/CSV) and browse to the file location.
  4. Load Data:

    • Once the data source is connected, select the relevant tables or views you need.
    • Click Load.

Data Preparation Using Power Query

  1. Transform Data:

    • After loading your data into Power BI, navigate to the Transform Data tab. This will open the Power Query Editor.
  2. Inspect and Clean Data:

    • Inspect the data for any anomalies, such as null values, duplicates, or incorrect data types.
    • Remove unnecessary columns.
    • Filter out irrelevant rows.

    Example steps in Power Query:

    // Remove null values
    Select Column -> Remove Rows -> Remove Rows with Errors
    
    // Remove duplicates
    Select Column -> Remove Rows -> Remove Duplicates
    
    // Convert Data Types
    Select Column -> Data Type -> Choose appropriate type (e.g., Date, Text, Number)
  3. Create Necessary Columns:

    • Create new columns if needed, for example, a calculated column for 'Absenteeism Percentage'.

    Example DAX formula for Absenteeism Percentage in Power BI:

    Absenteeism_Percentage = DIVIDE(ABSENT_DAYS, TOTAL_WORK_DAYS, 0)
  4. Ensure Data Consistency:

    • Check for consistency in date formats, text casing, and numerical precision.
  5. Apply Changes:

    • Click on Close & Apply to save and apply transformations.

Creating Calculated Columns and Measures using DAX

  1. Create Measures:

    • Go to the Modeling tab.
    • Click on New Measure.
  2. Define Measure:

    • Define the DAX formula for absenteeism percentage calculation.

    Example:

    Total_Work_Days = SUM('Attendance'[Work_Days])
    Total_Absent_Days = SUM('Attendance'[Absent_Days])
    Absenteeism_Percentage = DIVIDE([Total_Absent_Days], [Total_Work_Days], 0)

Data Import Using SQL (if applicable)

  1. Create SQL Queries:

    • Write SQL queries to import relevant data needed for the analysis.

    Example:

    -- Query to get attendance data
    SELECT 
        EmployeeID,
        Work_Days,
        Absent_Days,
        Date
    FROM 
        Attendance
    WHERE
        Date BETWEEN '2022-04-01' AND '2023-03-31';  -- Financial Year to Date
  2. Execute Queries in Power BI:

    • In Power BI Desktop, use the Advanced option in the Get Data for SQL Server
    • Paste your SQL query to import the data.

By following these steps, you will effectively import and prepare your data in Power BI, ensuring it is ready for detailed analysis using DAX formulas and SQL queries.

Data Relationships and Models

1. Define Relationships in Power BI

  1. Import Data into Power BI, ensuring that all relevant tables are loaded.

  2. Create Relationships by navigating to the 'Model' view:

    • For instance, assuming you have three tables: Employees, Attendance, and Dates.
    • Link Employees[EmployeeID] to Attendance[EmployeeID].
    • Link Attendance[Date] to Dates[Date].

2. DAX for Calculating Absenteeism

Example Measures

  1. Total Days:

    TotalDays = COUNTROWS(Dates)
  2. Total Working Days:

    TotalWorkingDays = 
        CALCULATE(
            COUNTROWS(Dates),
            Dates[IsWorkingDay] = TRUE()  -- Assuming a flag indicating working days.
        )
  3. Total Absenteeism Days:

    TotalAbsenteeismDays = 
        CALCULATE(
            COUNTROWS(Attendance),
            Attendance[IsAbsent] = TRUE()  -- Assuming a flag indicating absenteeism.
        )
  4. Absenteeism Percentage Per Employee:

    AbsenteeismPercentage = 
        DIVIDE(
            [TotalAbsenteeismDays],
            [TotalWorkingDays],
            0
        ) * 100

3. SQL for Data Preparation

Assuming you have a database from where you get the raw data, you might need a script to preprocess it.

  1. Create Absenteeism View:

    CREATE VIEW EmployeeAbsenteeism AS
    SELECT 
        e.EmployeeID,
        SUM(CASE WHEN a.IsAbsent = 1 THEN 1 ELSE 0 END) AS TotalAbsenteeismDays,
        COUNT(d.Date) AS TotalWorkingDays
    FROM 
        Employees e
    JOIN 
        Attendance a ON e.EmployeeID = a.EmployeeID
    JOIN 
        Dates d ON a.Date = d.Date
    WHERE
        d.IsWorkingDay = 1
    GROUP BY 
        e.EmployeeID;
  2. Absenteeism Percentage Calculation (if needed in SQL):

    SELECT 
        EmployeeID,
        (TotalAbsenteeismDays * 100.0 / TotalWorkingDays) AS AbsenteeismPercentage
    FROM 
        EmployeeAbsenteeism;

4. Power BI Report and Visualization

  1. Import the Preprocessed Data:

    • Use SQL views or pre-calculated tables to bring the dataset into Power BI.
  2. Create Visuals:

    • Use charts like bar charts or pie charts to visualize AbsenteeismPercentage per employee.
    • Create slicers for better data filtering and analysis.
  3. KPIs:

    • Add Key Performance Indicators (KPIs) to track overall absenteeism trends.

Summary

By setting up relationships in Power BI, defining key metrics with DAX, preprocessing data with SQL views, and creating visualizations, you can effectively analyze absenteeism percentages in your organization. Apply these steps as indicated, aligning with your existing data preparation methods.

DAX Fundamentals and Calculations for Absenteeism Analysis in Power BI

Overview

This section will cover the practical implementation of DAX (Data Analysis Expressions) to calculate and analyze the percentage of absenteeism for the financial year to date. This assumes data has already been imported, relationships are set up, and models are prepared.

SQL Query to Extract Absenteeism Data

SELECT 
    EmployeeID,
    AbsenceDate,
    ISNULL(AbsenceHours, 0) AS AbsenceHours
FROM 
    AbsenteeismTable
WHERE 
    YEAR(AbsenceDate) = YEAR(GETDATE())

DAX Calculations

Calculation Columns

Total Absence Hours

TotalAbsenceHours = 
SUMX(AbsenteeismTable, AbsenteeismTable[AbsenceHours])

Measures

Total Working Hours YTD

TotalWorkingHoursYTD = 
CALCULATE(
    SUM(WorkingHoursTable[WorkingHours]),
    DATESYTD(Calendar[Date], "30/6")  -- Assuming financial year ends on June 30th
)

Total Absence Hours YTD

TotalAbsenceHoursYTD = 
CALCULATE(
    SUM(AbsenteeismTable[AbsenceHours]),
    DATESYTD(Calendar[Date], "30/6")  -- Assuming financial year ends on June 30th
)

Absenteeism Percentage YTD

AbsenteeismPercentageYTD = 
DIVIDE(
    [TotalAbsenceHoursYTD],
    [TotalWorkingHoursYTD],
    0 
) * 100

Visualizations in Power BI

Steps to Create the Visualization

  1. Add a Card Visualization:

    • Drag and drop the AbsenteeismPercentageYTD measure to display the overall percentage of absenteeism YTD.
  2. Add a Line Chart Visualization:

    • Axis: Calendar[Date]
    • Values: AbsenteeismPercentageYTD
    • This can show the trend of absenteeism over the financial year.
  3. Add a Table Visualization:

    • Columns: EmployeeID, TotalAbsenceHoursYTD, AbsenteeismPercentageYTD
    • This can provide detailed view per employee.

Conclusion

By following the DAX calculations and utilizing SQL for data extraction, this setup allows for effective calculation and analysis of absenteeism percentage for the financial year to date in Power BI. This guide should enable you to calculate, visualize, and interpret absenteeism data accurately.

Implementing SQL Integration with Power BI to Analyze Absenteeism

Connecting Power BI to SQL Server

  1. Open Power BI Desktop.

  2. Navigate to the Home Tab:

    • Click on Get Data.
    • Choose SQL Server from the list of available data sources.
  3. Enter SQL Server Connection Details:

    • Server: <Your_SQL_Server_Name>
    • Database: <Your_Database_Name>
    • Click OK.
  4. Authentication:

    • Choose the appropriate authentication method (Windows or database credentials).
    • Click Connect.
  5. Select Data for Import:

    • In the Navigator window, select the tables or views relevant to absenteeism data.
    • Click Load to import the data into Power BI.

Writing SQL Queries for Data Extraction

  1. Advanced SQL Query for Absenteeism Data:

    • Instead of importing entire tables, you can use an advanced SQL query to pull only the required data.
    SELECT 
        EmployeeID, 
        AbsenceDate, 
        AbsenceReason, 
        DATEDIFF(day, AbsenceStartDate, AbsenceEndDate) AS AbsenceDays
    FROM 
        EmployeeAbsences
    WHERE 
        AbsenceDate BETWEEN '2023-01-01' AND GETDATE();
  2. Execute the SQL Query:

    • In the Power BI data source setup, choose Advanced Options.
    • Paste the SQL query and click OK to execute and load the result.

Creating Measures for Absenteeism Analysis using DAX

  1. Calculate Total Days of Absenteeism:

    TotalAbsenceDays = SUM(EmployeeAbsences[AbsenceDays])
  2. Calculate Total Workdays in the Financial Year (up to today):

    TotalWorkDaysYTD = 
    CALCULATE (
        COUNTROWS ( 'Calendar' ),
        'Calendar'[Date] <= TODAY(),
        'Calendar'[IsWorkDay] = TRUE
    )
  3. Calculate the Percentage of Absenteeism:

    AbsenteeismPercentage = 
    DIVIDE (
        [TotalAbsenceDays],
        [TotalWorkDaysYTD],
        0
    )

Creating Visuals in Power BI

  1. Add a New Page in Power BI Report.

  2. Insert a Card Visual:

    • Drag AbsenteeismPercentage into the Fields pane of the card visual to show the overall percentage of absenteeism year to date.
  3. Insert a Line Chart Visual:

    • Axis: Calendar[Date]
    • Values: TotalAbsenceDays
    • This visual will show the trend of absenteeism over the financial year.
  4. Insert a Table Visual:

    • Columns: EmployeeID, AbsenceDate, AbsenceReason, AbsenceDays
    • This provides a detailed view of absenteeism records.

Finalizing the Dashboard

  1. Filter and Slicers:

    • Add slicers for AbsenceReason and EmployeeID to allow dynamic filtering of the data.
  2. Formatting:

    • Ensure proper formatting for readability (number formatting, labels, and titles).
  3. Publish the Report:

    • Once satisfied with the dashboard, publish the report to Power BI Service for sharing and collaboration.

This completes the implementation. Save your work frequently, and you can now use this analysis to monitor and act on absenteeism trends in your organization.

Visualizing Absenteeism Data in Power BI

Calculating Percentage Absenteeism

Assuming that you have already imported your absenteeism data and prepared it as per your earlier steps, the next step is to calculate the percentage of absenteeism using DAX within Power BI.

Define Measures

  1. Total Working Days Measure:

    TotalWorkingDays = COUNTROWS(AttendanceTable)
  2. Total Absent Days Measure:

    TotalAbsentDays = CALCULATE(
        COUNTROWS(AttendanceTable),
        AttendanceTable[Status] = "Absent"
    )
  3. Percentage Absenteeism Measure:

    PercentageAbsenteeism = DIVIDE(
        [TotalAbsentDays], 
        [TotalWorkingDays],
        0
    )

SQL Integration for Absenteeism Data

You can leverage a SQL query in Power BI to fetch the absenteeism data. Assuming you have connected the SQL database in Power BI, the query might look something like this:

SELECT 
    EmployeeID,
    Date,
    Status
FROM 
    AttendanceTable
WHERE 
    Date BETWEEN '2022-04-01' AND '2023-03-31'  -- Financial year date range

Visualization in Power BI

Step-by-Step to Create a Visualization

  1. Navigate to Visualizations Pane:

    • In the summarized report view, navigate to the Visualization pane in Power BI Desktop.
  2. Create a Bar Chart:

    • Drag EmployeeID or relevant grouping dimension to the axis.
    • Drag TotalAbsentDays and TotalWorkingDays measures to the value field.
    • Drag PercentageAbsenteeism to the tooltips section for additional insights.
  3. Create a Pie Chart:

    • Drag the Status column to the Legend.
    • Drag TotalAbsentDays to the Values field.
  4. Line Chart for Trend Analysis:

    • Choose Date for the X-axis.
    • Drag TotalAbsentDays to the Values field.
    • Configure the time filter for monthly or weekly analysis.

Applying Filters

  • Use slicers for the Employee names, departments, roles, and any other relevant categorical data to filter and drill down into the visualizations to focus on different segments.

By following these steps, you should be able to correctly visualize the absenteeism data in Power BI and leverage the DAX calculations for meaningful insights.