Formula Generator | DAX

Outlier Detection Methods in DAX

This guide explains how to identify outliers in DAX using mean and standard deviation. It includes formulas for calculating these metrics, defining thresholds, and classifying data points to enhance data analysis and quality.



Prompt

Can you create some formulas that will help me showcase outliers?

Answer

Outlier Detection in DAX

Introduction

Outliers are data points that deviate significantly from the norm. Identifying outliers can provide valuable insights or indicate data quality issues. In DAX, outliers can be detected using statistical measures such as mean and standard deviation.

Formulas

  1. Calculate Mean

    This formula calculates the mean (average) of a given column.

    Mean = AVERAGE(Table[Column])
  2. Calculate Standard Deviation

    This formula calculates the standard deviation of a given column.

    StandardDeviation = STDEV.P(Table[Column])
  3. Define Thresholds for Outliers

    A common rule of thumb treats data points more than 2 or 3 standard deviations from the mean as outliers. The thresholds here use 3.

    LowerThreshold = [Mean] - 3 * [StandardDeviation]
    UpperThreshold = [Mean] + 3 * [StandardDeviation]
  4. Identify Outliers

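    This calculated column classifies each row as an outlier or not. It computes the thresholds in variables rather than referencing the measures above: a measure referenced from a calculated column is evaluated through context transition, that is, filtered to the current row, so the mean would collapse to that row's own value and no row would ever be flagged.

    IsOutlier = 
    VAR MeanValue = AVERAGE(Table[Column])
    VAR StdDevValue = STDEV.P(Table[Column])
    VAR LowerBound = MeanValue - 3 * StdDevValue
    VAR UpperBound = MeanValue + 3 * StdDevValue
    RETURN
        IF(
            OR(Table[Column] < LowerBound, Table[Column] > UpperBound),
            "Outlier",
            "Normal"
        )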
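
An equivalent way to express the same rule is a z-score, the number of standard deviations a value lies from the mean: a value falls outside the 3-standard-deviation thresholds exactly when its absolute z-score exceeds 3. The following is a minimal sketch of a calculated column, not part of the original answer. The column name ZScore is hypothetical, and Table and Column are the same placeholders used above.

```dax
-- Calculated column (hypothetical name): distance from the mean, in standard deviations.
-- The table name is quoted here because single quotes around a table name are always valid.
ZScore =
VAR MeanValue = AVERAGE('Table'[Column])      -- mean over the whole table
VAR StdDevValue = STDEV.P('Table'[Column])    -- population standard deviation
RETURN
    DIVIDE('Table'[Column] - MeanValue, StdDevValue)   -- BLANK when the deviation is zero
```

A row could then be flagged with a test such as ABS('Table'[ZScore]) > 3, which also makes it easy to try a stricter or looser cutoff without redefining the thresholds.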

Explanation

  • Mean Calculation: The AVERAGE function computes the mean of the Column in Table.
  • Standard Deviation Calculation: The STDEV.P function calculates the standard deviation for the entire population of values in the Column. If the column holds only a sample of a larger population, STDEV.S is the sample counterpart.
  • Thresholds Definition: The lower threshold is the mean minus three standard deviations, and the upper threshold is the mean plus three standard deviations.
  • Outlier Identification: The IF function examines whether each value in the Column falls outside the lower and upper thresholds. If it does, the value is classified as an "Outlier"; otherwise, it is classified as "Normal".

Practical Illustration

Suppose you have a table named Sales with a column Revenue. To detect outliers in revenue:

  1. Mean of Revenue

    MeanRevenue = AVERAGE(Sales[Revenue])
  2. Standard Deviation of Revenue

    StandardDeviationRevenue = STDEV.P(Sales[Revenue])
  3. Revenue Thresholds

    LowerRevenueThreshold = [MeanRevenue] - 3 * [StandardDeviationRevenue]
    UpperRevenueThreshold = [MeanRevenue] + 3 * [StandardDeviationRevenue]
  4. Revenue Outliers Classification

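    IsRevenueOutlier = 
    -- Calculated column on Sales. The thresholds are computed in variables:
    -- referencing the measures here would trigger context transition and
    -- evaluate them for the current row only.
    VAR MeanValue = AVERAGE(Sales[Revenue])
    VAR StdDevValue = STDEV.P(Sales[Revenue])
    VAR LowerBound = MeanValue - 3 * StdDevValue
    VAR UpperBound = MeanValue + 3 * StdDevValue
    RETURN
        IF(
            OR(Sales[Revenue] < LowerBound, Sales[Revenue] > UpperBound),
            "Outlier",
            "Normal"
        )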

By applying these calculations, you can easily identify which sales revenues are outliers. This facilitates focused analysis on data points that significantly deviate from the expected range.
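
To showcase the result in a card or chart, a measure that counts the outlying rows can be useful. The following is a rough sketch under stated assumptions, not part of the original answer: the measure name Outlier Revenue Count is hypothetical, and the thresholds are deliberately computed over all rows of Sales (via ALL) so that slicers change which rows are counted but not what qualifies as an outlier.

```dax
-- Measure (hypothetical name): number of Sales rows outside mean ± 3 standard deviations.
Outlier Revenue Count =
VAR MeanRevenue = CALCULATE(AVERAGE(Sales[Revenue]), ALL(Sales))     -- mean over every Sales row
VAR StdDevRevenue = CALCULATE(STDEV.P(Sales[Revenue]), ALL(Sales))   -- population std dev over every row
VAR LowerBound = MeanRevenue - 3 * StdDevRevenue
VAR UpperBound = MeanRevenue + 3 * StdDevRevenue
RETURN
    COUNTROWS(
        FILTER(
            Sales,   -- only the rows visible in the current filter context
            Sales[Revenue] < LowerBound || Sales[Revenue] > UpperBound
        )
    )
```

The measure returns BLANK when no visible row is outside the thresholds. If the thresholds should instead follow the current slicer selection, ALLSELECTED(Sales) may be a better fit than ALL(Sales).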

Conclusion

Outlier detection in DAX involves calculating the mean and standard deviation, defining thresholds, and then classifying data points. This systematic approach helps in maintaining data quality and uncovering significant insights. For deeper understanding and advanced techniques, consider exploring courses on the Enterprise DNA Platform.
