Project

Practical Data Analysis in Power BI: Implementing Categorical Scatter Plots with DAX

Learn to create and implement categorical scatter plots in Power BI using DAX expressions, overcoming common challenges such as unit alignment and aggregation.

Description

This project focuses on creating meaningful visualizations in Power BI by combining data analysis skills with DAX language proficiency. Users will learn to handle categorical data in scatter plots, map phases correctly, and ensure units are appropriately aligned. Real-world examples and hands-on exercises will guide users through practical implementation, allowing them to understand and overcome common obstacles in data visualization.

The original prompt:

I want to use categorical values on the X-axis of a Power BI scatter plot. The dimension table that holds the categorical data is dPhaseTitles, and it contains two columns: [PhaseIndex] and [Phase]. When I use the DAX expression below on the X-axis, the scatter plot goes blank, because Power BI scatter plots require the values on the X and Y axes to be aggregated. When I try this with the PhaseIndex column, all of the unit markers align in one vertical line. The unit markers need to be aligned with the phase they belong to. There are 5 phases (Maintenance, Basic, Integrated, Deployed, Sustained). The DAX expression needs to align the unit markers in five vertical lines.

PhaseMapping = SWITCH( 'dPhaseTitles'[Phase], "Phase1", 1, "Phase2", 2, "Phase3", 3, "Phase4", 4, "Phase5", 5, 0 )

Introduction to Power BI and Scatter Plots

Overview

This unit covers the creation and implementation of categorical scatter plots in Power BI using DAX expressions. Scatter plots are a powerful way to visualize data, showing the relationship between two numerical variables and categorizing them by a third variable. This lesson will guide you through the practical implementation and address common challenges such as unit alignment and aggregation.

Setup Instructions

  1. Open Power BI Desktop

    • Ensure you have Power BI Desktop installed on your computer.
    • Launch Power BI Desktop.
  2. Import Data

    • Go to the Home tab.
    • Click Get Data, select your data source, and load the data into Power BI.

Creating Scatter Plots

  1. Create a Scatter Plot

    • Go to the Report view.
    • Click on the Scatter chart in the Visualizations pane.
  2. Assign Data Fields

    • Drag the numeric fields you want to compare to the X Axis and Y Axis wells.
    • Drag the categorical field that you want to use for category differentiation to the Details well.

Using DAX for Calculations

Example Data

Assume we have a table named SalesData with the columns ProductType, UnitsSold, and Revenue.

Creating New Measures

Create new measures using DAX to handle unit alignment and aggregation challenges.

  1. Total Units Sold

    TotalUnitsSold = SUM(SalesData[UnitsSold])
  2. Total Revenue

    TotalRevenue = SUM(SalesData[Revenue])
  3. Average Unit Price

    AverageUnitPrice = DIVIDE([TotalRevenue], [TotalUnitsSold], 0)
  4. Aggregation by Category

    TotalUnitsByCategory = 
    CALCULATE(
        SUM(SalesData[UnitsSold]),
        ALLEXCEPT(SalesData, SalesData[ProductType])
    )
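
To check what these measures return before charting them, a DAX query can list them per product type. This is only a sketch: it assumes the measures above exist in the model and that the query is run in a tool that accepts EVALUATE statements, such as the DAX query view.

```dax
// Lists each ProductType with the measures defined above.
EVALUATE
SUMMARIZECOLUMNS(
    SalesData[ProductType],
    "Total Units Sold", [TotalUnitsSold],
    "Total Revenue", [TotalRevenue],
    "Average Unit Price", [AverageUnitPrice],
    "Units By Category", [TotalUnitsByCategory]
)
ORDER BY SalesData[ProductType]
```

Each row of the result corresponds to one marker when ProductType is placed in the Details well.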

Visual Interaction and Formatting

  1. Adjust Axis Scaling

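    • In the Format pane of the scatter plot, set the range (minimum and maximum) and display units of each axis so that the markers are spread across the plot area and the values on each axis read clearly.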
  2. Customize Colors

    • Use the format pane to set distinct colors for each category based on the ProductType field.
  3. Tooltip Configuration

    • Enhance scatter plot tooltips by dragging additional measures into the Tooltips well, such as TotalRevenue and AverageUnitPrice.
  4. Enable Legend

    • In the Format pane, turn on the legend to display category names on the scatter plot. The legend appears only when a field such as ProductType is in the Legend well.

Final Steps

  • Validate the Scatter Plot

    • Compare the scatter plot against your original data to ensure accuracy.
    • Utilize slicers and filters to explore specific subsets of the data interactively.
  • Save Your Report

    • Save your Power BI report file to retain the scatter plot and all settings.

By following these steps, you should be able to successfully create categorical scatter plots in Power BI and overcome common challenges related to unit alignment and aggregation using DAX expressions.

Understanding the Data Model: dPhaseTitles Dimension Table

Data Model: dPhaseTitles Dimension Table

The dPhaseTitles dimension table stores descriptive information about the phases. In the original prompt it has two columns, PhaseIndex and Phase; the generic example in this section uses the equivalent names PhaseID and PhaseTitle, and a real table may carry further attributes such as PhaseDescription, start dates, and end dates. Below is a typical structure of the dPhaseTitles table.

Table Structure

Table: dPhaseTitles
----------------------------------
| PhaseID | PhaseTitle           |
----------------------------------
| 1       | Initiation           |
| 2       | Planning             |
| 3       | Execution            |
| 4       | Monitoring & Control |
| 5       | Closure              |
----------------------------------

Steps to Create and Implement Categorical Scatter Plots Using DAX in Power BI

Define Relationships

Ensure that your dPhaseTitles table is correctly related to the fact table (e.g., fTasks) using the PhaseID. If not, create the relationship.

Establish Relationship:
dPhaseTitles[PhaseID] -> fTasks[PhaseID]
Relationship Type: One to Many
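
Markers can end up under a blank phase when fact rows carry a PhaseID that has no match in dPhaseTitles. The measure below is a sketch for spotting that; the measure name is an assumption, and it relies on the relationship described above being active.

```dax
Tasks Without Phase =
// Counts fTasks rows whose PhaseID finds no matching row in dPhaseTitles.
COUNTROWS(
    FILTER(
        fTasks,
        ISBLANK( RELATED( dPhaseTitles[PhaseTitle] ) )
    )
)
```

A blank result suggests every task maps to a phase; any positive count points at keys worth cleaning up.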

Create Scatter Plot in Power BI

  1. Open Power BI Desktop: Ensure your tables (e.g., dPhaseTitles and fTasks) are loaded and connected.
  2. Add a Scatter Chart: Choose the scatter chart from the Visualizations pane.

Configuring the Scatter Plot

  • X-Axis: Select a numeric field from the fTasks table, like TaskDuration.
  • Y-Axis: Choose another numeric field from fTasks, like TaskCost.
  • Details: Add the PhaseTitle field from the dPhaseTitles table to this well to categorize your scatter plot.
X-Axis: fTasks[TaskDuration]
Y-Axis: fTasks[TaskCost]
Details: dPhaseTitles[PhaseTitle]

DAX Expressions for Calculations

  • Total Task Duration: A measure to sum the TaskDuration
TotalTaskDuration = SUM(fTasks[TaskDuration])
  • Total Task Cost: A measure to sum the TaskCost
TotalTaskCost = SUM(fTasks[TaskCost])
  • Average Task Duration per Phase:
AvgTaskDurationPerPhase = 
   AVERAGEX(
      VALUES(dPhaseTitles[PhaseTitle]), 
      [TotalTaskDuration]
   )
  • Average Task Cost per Phase:
AvgTaskCostPerPhase = 
   AVERAGEX(
      VALUES(dPhaseTitles[PhaseTitle]), 
      [TotalTaskCost]
   )

Adding Measures to Scatter Plot Axis

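Replace the fields in the X-Axis and Y-Axis with your new DAX measures when you want explicit control over the aggregation instead of relying on the default summarization of the columns. Note that with PhaseTitle in Details each marker is evaluated for a single phase, so these two measures return that phase's total duration and total cost; they average across phases only where several phases are in the filter context, such as in a card or a total.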

X-Axis: [AvgTaskDurationPerPhase]
Y-Axis: [AvgTaskCostPerPhase]
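
If the intent is the average duration and cost of the individual tasks within each phase, a plain AVERAGE over the fact table is a simpler alternative. This is a sketch with assumed measure names, not something the steps above prescribe; each definition is entered as its own measure.

```dax
Avg Task Duration =
// Mean of TaskDuration over the fTasks rows in the current filter context.
AVERAGE( fTasks[TaskDuration] )

Avg Task Cost =
// Mean of TaskCost over the same rows.
AVERAGE( fTasks[TaskCost] )
```

With PhaseTitle in Details, each marker then sits at the mean duration and mean cost of the tasks in that phase.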

Adjust Visualization

Ensure the scatter plot settings are adjusted for clarity:

  • Adjust size based on another numerical field if necessary (e.g., sum of task complexity).
  • Customize tooltips to show relevant phase data.

Summary

By following these steps, you can effectively create a categorical scatter plot in Power BI using DAX expressions. The dPhaseTitles dimension table provides a categorical dimension to visualize and compare metrics (TaskDuration, TaskCost) across different phases.

Creating Categorical Scatter Plots in Power BI using DAX

In this part, we will implement categorical scatter plots in Power BI using DAX expressions. This will include creating a measure for the categorical values, handling unit alignment, and enabling proper aggregation for the scatter plot.

Step 1: Creating the Necessary Measures

  1. Calculate Aggregated Measures: Create measures for the numerical values you want to plot. For instance, if you want to plot 'Total Sales' and 'Average Unit Price', you would create measures for these.

    Total Sales = SUM(Sales[SalesAmount])
    
    Average Unit Price = AVERAGE(Sales[UnitPrice])
  2. Categorical Measure: Create a measure that returns the current category if required. A measure cannot be placed in the Details or Legend well, but it is useful for hover-over descriptions (tooltips) and dynamic titles.

    Product Category = SELECTEDVALUE(Product[Category])

Step 2: Create the Scatter Plot in Power BI

  1. Add Scatter Plot Visual

    • Open your Power BI report.
    • Add a Scatter Plot visual from the Visualizations pane.
  2. Drag Measures to Visual

    • Drag the 'Total Sales' measure to the 'X-Axis'.
    • Drag the 'Average Unit Price' measure to the 'Y-Axis'.
    • Drag the Product[Category] column to the 'Details' or 'Legend' field to categorize the points. These wells accept columns only, so the 'Product Category' measure is reserved for tooltips.

Step 3: Handle Unit Alignment

  • Ensure that both axes have appropriate units and scales. You can adjust these by selecting the visual, opening the X-axis or Y-axis settings in the Format pane, and setting the display units (e.g., thousands, millions).

Step 4: Enabling Data Aggregation

  1. Aggregation Control: Ensure that the aggregations are correctly set. A measure carries its aggregation in its DAX definition, so to change it you edit the measure. If you drag a numeric column to an axis instead, Power BI applies a default summarization (SUM, AVERAGE), which you can change from the field's drop-down in the Visualizations pane.

  2. Tooltips for Additional Context: You can add more context to your scatter plot by dragging additional measures to the Tooltips area.

Total Products Sold = SUM(Sales[Quantity])

Distinct Customers = DISTINCTCOUNT(Sales[CustomerID])
  • Drag Total Products Sold and Distinct Customers to the 'Tooltips' area.

Step 5: Overcoming Common Challenges

  1. Unit Alignment

    • Ensure consistency in the units across different measures. Use explicit formatting where needed.
  2. Handling Missing Values

    • If you notice gaps or missing values, you might want to create custom filters or replace them with a default value using DAX.
    Average Unit Price = IF(ISBLANK(AVERAGE(Sales[UnitPrice])), 0, AVERAGE(Sales[UnitPrice]))
  3. Scale Adjustments

    • Adjust the scale of your scatter plot axes to make sure the data points are appropriately spread out. This will help in better visualization and analysis.
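
The blank-handling measure in item 2 evaluates AVERAGE twice. A variable together with COALESCE expresses the same idea more compactly; the measure name below is an assumption. Keep in mind that replacing blanks with 0 also makes categories without any sales appear as markers at 0.

```dax
Average Unit Price (No Blanks) =
// Store the average once, then fall back to 0 when it is blank.
VAR AvgPrice = AVERAGE( Sales[UnitPrice] )
RETURN
    COALESCE( AvgPrice, 0 )
```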

Example

Here’s a concise example that ties in all the steps:

  1. Create Measures:

    Total Sales = SUM(Sales[SalesAmount])
    Average Unit Price = IF(ISBLANK(AVERAGE(Sales[UnitPrice])), 0, AVERAGE(Sales[UnitPrice]))
    Product Category = SELECTEDVALUE(Product[Category])
    Total Products Sold = SUM(Sales[Quantity])
    Distinct Customers = DISTINCTCOUNT(Sales[CustomerID])
  2. Configure Scatter Plot:

    • X-Axis: Total Sales
    • Y-Axis: Average Unit Price
    • Details: Product[Category] column
    • Tooltips: Total Products Sold, Distinct Customers

By following these steps and using the provided DAX expressions, you can effectively create and implement categorical scatter plots in Power BI, handling common challenges like unit alignment and data aggregation.

Using DAX Expressions for Data Mapping and Aggregation in Power BI

Data Mapping Using DAX Expressions

Assume you have the following columns in your data tables:

  • Sales table with columns: SalesAmount, CategoryID
  • Categories table with columns: CategoryID, CategoryName

Step 1: Merge Data using DAX

To map the CategoryName from the Categories table to the Sales table in Power BI, use the RELATED function:

  1. Go to the Sales table.
  2. Create a new column for CategoryName with the following DAX expression:
CategoryName = RELATED(Categories[CategoryName])

This expression fetches the CategoryName from the Categories table based on the CategoryID in the Sales table, provided there is a relationship between these tables based on CategoryID.

Aggregation Using DAX Expressions

You might want to aggregate SalesAmount by category for visual representation.

Step 2: Generate Aggregated Sales by Category

  1. In the Sales table, create a new measure to sum the SalesAmount per CategoryName:
TotalSalesByCategory = CALCULATE(
    SUM(Sales[SalesAmount]),
    ALLEXCEPT(Sales, Sales[CategoryName])
)

This measure calculates the total sales for each category while disregarding any filters except for the CategoryName filter.
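
One way to see the effect of ALLEXCEPT is to compare the category total with the value in the current filter context. The sketch below assumes the TotalSalesByCategory measure above; the measure name Share Of Category Sales is hypothetical.

```dax
Share Of Category Sales =
// Sales in the current filter context divided by the total of the whole category.
DIVIDE(
    SUM( Sales[SalesAmount] ),
    [TotalSalesByCategory]
)
```

When a filter on another Sales column narrows the data, the numerator shrinks while the denominator stays at the category total, so the ratio falls below 1.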

Implementing Categorical Scatter Plots

Step 3: Setup Scatter Plot Visualization

After the DAX expressions above, follow these steps in Power BI to create the scatter plot:

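  1. Drag CategoryName into the X Axis well. In recent versions of Power BI Desktop, a categorical field on the X-axis makes the scatter chart behave as a dot plot, with one vertical position per category.
  2. Drag TotalSalesByCategory into the Y Axis well.
  3. Adjust other visual settings as required.

Make sure the fields have suitable data types: CategoryName is categorical (text) and TotalSalesByCategory is numerical.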

Overcoming Common Challenges

Unit Alignment

Unit alignment often involves ensuring that data in different units are converted to a common unit before performing aggregation or comparison.

Example: If SalesAmount has different currencies, you might need to standardize it to a single currency:

  1. Create a CurrencyRates table with Currency, Rate, and Date.
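  2. Create another column in the Sales table to convert SalesAmount to a standard currency (e.g., USD). RELATED works here only if Sales has a many-to-one relationship to CurrencyRates on a single key column.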
ConvertedSalesAmount = Sales[SalesAmount] * RELATED(CurrencyRates[Rate])
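
Because the CurrencyRates table is keyed by both Currency and Date, a single-column relationship may not identify one rate per sale. A lookup on both keys is one alternative; Sales[Currency] and Sales[OrderDate] are assumed column names that do not appear in the example above.

```dax
ConvertedSalesAmount =
// Calculated column on Sales: find the rate for this row's currency and date.
VAR Rate =
    LOOKUPVALUE(
        CurrencyRates[Rate],
        CurrencyRates[Currency], Sales[Currency],
        CurrencyRates[Date], Sales[OrderDate]
    )
RETURN
    Sales[SalesAmount] * Rate
```

Rows without a matching rate come back blank, which makes gaps in the rate table easy to spot.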

Handling Missing Data

In datasets where some categories might not have records, use DAX functions like COALESCE to handle such scenarios.

Example:

TotalSalesByCategory = COALESCE(
    CALCULATE(
        SUM(Sales[SalesAmount]),
        ALLEXCEPT(Sales, Sales[CategoryName])
    ), 
    0
)

This expression ensures TotalSalesByCategory returns 0 instead of BLANK() when there are no sales records for a category.

Conclusion

This implementation covers practical usage of DAX expressions for data mapping and aggregation directly in Power BI, providing a foundation to create meaningful scatter plots overcoming common challenges of unit alignment and missing data. Apply these steps in Power BI to visualize aggregated data accurately.

Implementing Unit Marker Alignment Using PhaseIndex in Power BI with DAX

Step 1: Create the PhaseIndex Table

Create a calculated table, PhaseIndex, that maps each index value to a phase title aligned with your scatter plot dimensions. Note that this table shares its name with the PhaseIndex column of dPhaseTitles but is a separate object.

PhaseIndex = 
ADDCOLUMNS(
    GENERATESERIES(1, 100, 1),
    "PhaseTitle", 
        SWITCH(
            TRUE(),
            [Value] >= 1 && [Value] <= 10, "Phase A",
            [Value] >= 11 && [Value] <= 20, "Phase B",
            [Value] >= 21 && [Value] <= 30, "Phase C",
            "--"
        )
)
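
For the five phases named in the original prompt, the same idea can be written as a small fixed table. This is a sketch: the table name PhaseLookup is hypothetical, and the index order simply follows the order in which the prompt lists the phases.

```dax
PhaseLookup =
// One row per phase; PhaseIndex is the numeric position used on the X-axis.
DATATABLE(
    "PhaseIndex", INTEGER,
    "Phase", STRING,
    {
        { 1, "Maintenance" },
        { 2, "Basic" },
        { 3, "Integrated" },
        { 4, "Deployed" },
        { 5, "Sustained" }
    }
)
```

Sorting the Phase column by PhaseIndex (Sort by column) keeps the phases in this order wherever the names are shown.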

Step 2: Align Unit Markers Using PhaseIndex

  1. Create a new column in your primary data table to reference the PhaseIndex table.
AlignedPhaseMarker = 
LOOKUPVALUE(
    PhaseIndex[PhaseTitle],
    PhaseIndex[Value], 
    'YourDataTable'[YourUnitMarkerColumn] // Replace with the table and column from your main data table
)

Step 3: Aggregate Data Using the Aligned Markers

Create a measure to aggregate your data, aligning units through the PhaseIndex.

AlignedAggregation = 
CALCULATE(
    SUM('YourDataTable'[MeasureColumn]), // Replace with the relevant measure column
    'YourDataTable'[AlignedPhaseMarker] = "Phase A" // Adjust as needed
)

Step 4: Build the Scatter Plot

In Power BI, use the aligned markers and aggregation as part of your scatter plot's data setup.

  • X-Axis: Choose a relevant continuous field.
  • Y-Axis: Use the AlignedAggregation measure.
  • Legend: Use the AlignedPhaseMarker column to categorize the data points.

Step 5: Fine-Tune the Visualization

Customize the scatter plot's appearance to highlight aligned unit markers effectively:

  • Adjust bubble sizes and colors using Markers settings.
  • Use tooltips to display PhaseTitle and other relevant information.
  • Configure data labels for better clarity.

Example Scatter Plot Configuration in Power BI

  1. Add a Scatter plot visualization.
  2. Place Your Continuous Column in the X-Axis.
  3. Assign AlignedAggregation to the Y-Axis.
  4. Use AlignedPhaseMarker for the legend.
  5. Fine-tune visualization properties under the Formatting pane.

This approach aligns unit markers using PhaseIndex and allows for effective categorical scatter plots in Power BI, addressing both unit alignment and aggregation challenges with DAX expressions.
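
For the scenario in the original prompt, one way to get five vertical lines is to compute each unit's phase index from the fact side of the relationship. The sketch below assumes a hypothetical fact table fUnits with one row per unit, related many-to-one to dPhaseTitles, and a unit column from fUnits in the Details well.

```dax
Unit Phase Position =
// For the unit(s) in the current filter context, return the index of their phase.
// fUnits is an assumed table name; RELATED follows the relationship to dPhaseTitles.
MAXX(
    fUnits,
    RELATED( dPhaseTitles[PhaseIndex] )
)
```

Placed on the X-axis, this measure returns the PhaseIndex of each unit's phase. By contrast, an aggregate taken directly on the dimension, such as MAX(dPhaseTitles[PhaseIndex]), is usually not filtered by the unit when the relationship filters in a single direction, so every unit receives the same value and the markers collapse into one vertical line.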

Unit 6: Practical Solutions for Visual Alignment Challenges in Power BI

Aligning Unit Markers Using DAX Expressions

In this section, we'll focus on overcoming alignment challenges of unit markers in scatter plots by leveraging the DAX language. We'll align categorical units using DAX expressions based on the PhaseIndex we've already created.

  1. Creating a New Measure for Alignment Calculation:

    First, we'll create a new measure to properly calculate the positioning for unit markers.

    UnitAlignment =
    CALCULATE(
        AVERAGE('dPhaseTitles'[NumericalValue]),
        ALLEXCEPT('dPhaseTitles', 'dPhaseTitles'[Category])
    )

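    This measure calculates the average NumericalValue for the current Category by removing filters from every other column of dPhaseTitles. NumericalValue and Category are illustrative columns; the dPhaseTitles table in the original prompt contains only PhaseIndex and Phase.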

  2. Adjusting Marker Positions Based on Aggregated Data:

    To overcome challenges in visual alignment, use an adjusted measure for aggregation and display.

    AdjustedUnitPosition =
    SUMX(
        dPhaseTitles,
        [UnitAlignment] * dPhaseTitles[PhaseIndex]
    )

    This expression sums up the adjusted unit positions by multiplying UnitAlignment with PhaseIndex. It provides a weighted position for accurate scatter plotting.

  3. Implementing the Adjusted Measure in Scatter Plot:

    In the Power BI scatter plot, use the AdjustedUnitPosition measure:

    • Add X-Axis and Y-Axis Values: Drag and drop the NumericalValue and AdjustedUnitPosition to the respective axes of your scatter plot.
    • Categorize Markers: Use the Category field to categorize the markers.
  4. Handling Aggregation Challenges:

    Aggregating data points correctly is crucial for both visual alignment and accurate data representation. Here’s a DAX expression for a calculated table (created with New table, since SUMMARIZE returns a table rather than a single value) that aggregates the data points per category:

    AggregatedDataPoint =
    SUMMARIZE(
        'dPhaseTitles',
        'dPhaseTitles'[Category],
        "AvgValue", AVERAGE('dPhaseTitles'[NumericalValue]),
        "WeightedPosition", SUMX('dPhaseTitles', [UnitAlignment] * 'dPhaseTitles'[PhaseIndex])
    )

    This summarization groups data by Category, calculates the average value (AvgValue), and computes the weighted position (WeightedPosition) for better visual alignment in scatter plots.

  5. Fine-Tuning the Scatter Plot Appearance:

    To refine the appearance and ensure better readability:

    • Customize Marker Shapes and Sizes: Under the Format Pane, use the Shapes section to adjust marker shapes and sizes for better visibility.
    • Color Coding: Utilize different marker colors based on categories for distinction.
    • Add Trendlines: Enable a trend line from the Analytics pane if needed to visualize patterns.
  6. Deploy and Test:

    Deploy the scatter plot in your Power BI workspace and verify the alignment visually. Ensure the markers align correctly as per the calculated measures and address any disparities accordingly.
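
The grouping in step 4 can also be written with ADDCOLUMNS around SUMMARIZE, a pattern often preferred because the added columns are then evaluated through an explicit CALCULATE for each category. This is a sketch of an alternative calculated table, using the same illustrative columns and a hypothetical table name.

```dax
AggregatedDataPointAlt =
// Group by Category first, then add one aggregated value per group.
ADDCOLUMNS(
    SUMMARIZE( dPhaseTitles, dPhaseTitles[Category] ),
    "AvgValue", CALCULATE( AVERAGE( dPhaseTitles[NumericalValue] ) ),
    "WeightedPosition",
        CALCULATE(
            SUMX( dPhaseTitles, [UnitAlignment] * dPhaseTitles[PhaseIndex] )
        )
)
```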

By employing these DAX expressions and visualization adjustments, you can effectively resolve visual alignment challenges in Power BI scatter plots, providing a clear and accurate representation of your categorical data.


This unit provides you with the tools and knowledge to manage and rectify common visual alignment issues in Power BI using DAX expressions. Make sure to incorporate the calculations and visual tweaks into your Power BI reports for optimal results.