Mastering DAX Table Functions: A Practical Approach
Description
This project is tailored for Power BI users who have an intermediate understanding of DAX and aim to be advanced users. We will shine a light on all key table functions, allowing the user to create and manipulate virtual tables with ease and efficiency. Through numerous examples, the user will be provided with real-world context that underscores the practicality and applicability of these functions. This progressive learning approach will not only increase knowledge depth but also improve the practical skill-set.
Unit 1: Introduction to Virtual Tables in DAX
Data Analysis Expressions (DAX) is a library of functions and operators that can be combined to build formulas and expressions in Power BI, Analysis Services, and Power Pivot in Excel. Virtual Tables in DAX are used as the source table for many DAX functions and play a crucial role in many analysis scenarios.
Setup
Ensure that you have the following software installed on your machine:
- Power BI Desktop
You can download Power BI Desktop from the official Microsoft link -
After installing Power BI Desktop, launch the application.
Section 1: Loading the Data
For our demonstration, we will be using a simple dataset in CSV format that contains sales data. Load the CSV file into Power BI by following the steps:
Select
Get Data
on the top bar ofPower BI Desktop
, click on theText/CSV
option.Navigate to the location where you have stored your dataset and select your CSV file.
Click load which roughly transcribes the CSV file into a table that DAX can utilize.
Section 2: Creating Calculated Tables
After data loading, the next step will be creating calculated tables. Perceive a calculated table as a type of virtual table that you define in a tabular model's calculation area. Here are the steps:
Click on
Modeling
in the top menu, thenNew Table
.We are going to create a virtual table that consists of unique products sold. Type the following:
UniqueProduct = DISTINCT('SalesData'[Product])
Above, SalesData
is name of the table loaded in the model and Product
is the column in that table.
Section 3: Utilizing Virtual Tables in DAX functions
Finally, let's use the SUMMARIZE
function which is often used in combination with other DAX functions and serves to summarize a table/timeline of data. In our case, let's compute the total quantity sold per product. Type this into the new table under the Modeling
tab:
ProductQuantity =
SUMMARIZE (
SalesData,
SalesData[Product],
"Total_Quantity", SUM(SalesData[Quantity])
)
The SUMMARIZE
function above creates a new table with unique Products(<SalesData>[Product]
) and sum of quantity(SUM(SalesData[Quantity])
) for each product.
Now, you have a basic understanding of how to create and utilize virtual tables in DAX. This is, of course, a simple demonstration and real-world scenarios may require more complex approaches and functions.
Remember, practice is the key to mastering DAX. Continue experimenting with different functions and applying them to your reporting needs!
Unit 2: Handling Relationships with RELATED Function
The RELATED function is a powerful feature in DAX (Data Analysis Expressions) used in Power BI for fetching data from a different (related) table using the relationship between tables.
For demonstration purposes, let's pretend we have two tables: Sales
and Products
with established relationships.
Sales Table
SaleID | ProductID | Quantity |
---|---|---|
1 | 101 | 5 |
2 | 102 | 2 |
3 | 103 | 1 |
4 | 101 | 3 |
5 | 104 | 4 |
Products Table
ProductID | ProductName | ProductPrice |
---|---|---|
101 | Product A | 10 |
102 | Product B | 15 |
103 | Product C | 20 |
104 | Product D | 25 |
We'll use the RELATED
function to pull the ProductName
and ProductPrice
from the Products
table to the Sales
table based on the ProductID
.
Implementing the RELATED function
To implement a RELATED function, follow these steps:
- Create a calculated column
Add a new column to the Sales
table that will store the product names fetched from the Products
table.
Sales[ProductName] = RELATED(Products[ProductName])
We'll do the same for the Product Price:
Sales[ProductPrice] = RELATED(Products[ProductPrice])
Now our Sales
table looks like:
Sales Table
SaleID | ProductID | Quantity | ProductName | ProductPrice |
---|---|---|---|---|
1 | 101 | 5 | Product A | 10 |
2 | 102 | 2 | Product B | 15 |
3 | 103 | 1 | Product C | 20 |
4 | 101 | 3 | Product A | 10 |
5 | 104 | 4 | Product D | 25 |
- Calculate total sales per product
Now that we have the Sales
table enriched with product details, we can calculate the total sales for each product.
Sales[TotalSale] = Sales[Quantity] * Sales[ProductPrice]
and our Sales
table now is:
Sales Table
SaleID | ProductID | Quantity | ProductName | ProductPrice | TotalSale |
---|---|---|---|---|---|
1 | 101 | 5 | Product A | 10 | 50 |
2 | 102 | 2 | Product B | 15 | 30 |
3 | 103 | 1 | Product C | 20 | 20 |
4 | 101 | 3 | Product A | 10 | 30 |
5 | 104 | 4 | Product D | 25 | 100 |
Congratulations, you have just used the RELATED
function to enrich one table with details from a related table and then used that additional information to perform calculations. This concept applies not only to Power BI but also to any kind of relational data model.
Implementing Aggregations using SUMMARIZE and ADDCOLUMNS Functions
To practically demonstrate the applications of the SUMMARIZE and ADDCOLUMNS functions for data aggregations, this guide assumes the presence of a sample dataset named 'Sales' in the Power BI environment. This dataset has the columns 'Product', 'Region', 'SalesAmount', 'Cost' and 'UnitsSold' for simplicity. Different scenarios of applying SUMMARIZE and ADDCOLUMNS functions will be demonstrated.
To start with, ensure that your Power BI environment is set up properly and the Sales dataset has been loaded correctly.
Section I: Using SUMMARIZE Function
The SUMMARY function is used for grouping data and performing an operation such as count, sum, average, minimum, or maximum on the grouped data.
Sales_Sum = SUMMARIZE(
Sales,
Sales[Product],
Sales[Region],
"Total Sales", SUM(Sales[SalesAmount]),
"Total Cost", SUM(Sales[Cost]),
"Total Units", SUM(Sales[UnitsSold])
)
In the given formula, SUMMARIZE function is creating a summary of the 'Sales' dataset by grouping based on 'Product' and 'Region', and then calculating the sum of 'SalesAmount', 'Cost', and 'UnitsSold' for each group.
Section II: Using ADDCOLUMNS Function
The ADDCOLUMNS function is used to add calculated columns to the given table in Power BI.
SalesDetails = ADDCOLUMNS(
Sales,
"Profit", Sales[SalesAmount] - Sales[Cost],
"ProfitMargin", DIVIDE((Sales[SalesAmount] - Sales[Cost]), Sales[SalesAmount])
)
In the provided formula, ADDCOLUMNS function is adding two new columns – 'Profit' and 'ProfitMargin' to the 'Sales' table. 'Profit' is calculated by subtracting 'Cost' from 'SalesAmount' whereas 'ProfitMargin' is calculated by dividing 'Profit' by 'SalesAmount'.
Section III: Combining SUMMARIZE and ADDCOLUMNS Functions
Both SUMMARY and ADDCOLUMNS can be combined to provide a more thorough data analysis.
CombinedSalesDetails = ADDCOLUMNS(
SUMMARIZE(
Sales,
Sales[Product],
Sales[Region],
"Total Units", SUM(Sales[UnitsSold])
),
"Total Sales", [Total Units] * AVERAGE(Sales[SalesAmount]),
"Total Cost", [Total Units] * AVERAGE(Sales[Cost]),
"Profit", [Total Sales] - [Total Cost],
"ProfitMargin",DIVIDE(([Total Sales] - [Total Cost]), [Total Sales])
)
In this example, SUMMARIZE function is creating a summary of the 'Sales' dataset by grouping the data based on 'Product', and 'Region', and then calculating sum of 'UnitsSold' for each group. Subsequently, ADDCOLUMNS function is adding calculated columns for 'Total Sales', 'Total Cost', 'Profit', and 'ProfitMargin'.
Note: The syntax and column names used might differ based on the actual dataset and requirements. The examples provided are generic. Please replace with appropriate dataset/column/field names as necessary.
Please try the provided DAX scripts in your Power BI environment using your 'Sales' dataset.
Unit 4: Developing Filters with CALCULATETABLE and FILTER
In this section, we will cover the practical implementation of the CALCULATETABLE and FILTER functions to create a dynamic filter. These functions are an important part of Data Analysis Expressions (DAX) and serve a broad range of functionalities in Power BI.
Creating Filters with CALCULATETABLE
First, let's review the CALCULATETABLE function syntax:
CALCULATETABLE(,,,...)CALCULATETABLE function receives a table as input and returns an altered version of that table according to the filters applied.
Let's say that you have an existing Sales table with a number of columns, and you want to create a new table that only contains this year's sales data.
SalesThisYear = CALCULATETABLE(
Sales,
YEAR(Sales[OrderDate]) = YEAR(TODAY())
)
Developing Filters with FILTER
Now, let's look into the FILTER function:
FILTER(, )The FILTER function evaluates <expression>
for each row of <table>
and only keeps the rows where <expression>
is TRUE.
For example, from the same Sales table, we want to have a table that only includes recrods where product quantity is more than 500; we could implement the FILTER function like this:
HighQuantitySales = FILTER(
Sales,
Sales[Quantity] > 500
)
Combining CALCULATETABLE and FILTER
For a more complex scenario, we will now combine CALCULATETABLE and FILTER functions.
In this scenario, we want to retrieve sales that were made in the summer months and where the sales quantity was more than 500. A summer month is one where the 'Month' value is between June and August (6 and 8).
Here's how you might do that:
SummerHighQuantitySales = CALCULATETABLE(
Sales,
FILTER(
Sales,
Sales[Quantity] > 500
),
FILTER(
ALL(Sales[Month]),
Sales[Month] >= 6 &&
Sales[Month] <= 8
)
)
In this example, CALCULATETABLE uses two FILTER expressions to create a new table. The first FILTER expression returns a new table of sales where the quantity is greater than 500. The second FILTER expression further filters that table to only include sales from the summer months. The ALL function is used to clear any previously applied filters on the Sales[Month] column. When these two FILTER functions are used in combination with CALCULATETABLE, they operate on the result of previous filters, effectively creating an "AND" operation between the filters.
This concludes the practical application of CALCULATETABLE and FILTER DAX functions. Note that understanding and mastering these table functions is crucial as they form the basis of many complex DAX formulas and use-cases.
Unit 5: Advanced Techniques for JOIN Functions
In this unit, we will explore the advanced techniques for JOIN functions in DAX. Specifically, we will focus on two primary DAX JOIN functions - NATURALINNERJOIN
and NATURALLEFTOUTERJOIN
. These functions merge two tables into one, based on shared column names.
Prerequisites
Before we dive in, ensure you're already familiar with basic DAX functions, table functions, and table relationships. Also, it's assumed that you've completed the initial four units of this project.
Practical Implementation
Suppose we have two tables, Sales
and Products
. Let's first display these tables using EVALUATE
.
// display Sales table
EVALUATE Sales
// display Products table
EVALUATE Products
1. NATURALINNERJOIN
The NATURALINNERJOIN
function in DAX performs an inner join of two tables based on their shared column names. Only the rows containing matching values in the shared columns are returned. If a column name in table1 exists in table2, the join uses that column. The returned table includes all shared columns.
Let's join our Sales
and Products
tables on their shared columns.
EVALUATE
NATURALINNERJOIN(Sales, Products)
2. NATURALLEFTOUTERJOIN
The NATURALLEFTOUTERJOIN
function in DAX performs a left outer join of two tables based on their shared column names. All rows from the left table and any rows in the right table that match up are returned. If a shared column name is in both tables, the join uses that column. The resulting table includes all shared columns.
In the following example, a NATURALLEFTOUTERJOIN
operation is executed between Sales
and Products
.
EVALUATE
NATURALLEFTOUTERJOIN(Sales, Products)
Remember, in both functions, a maximum of two tables can be joined, and the joined tables should share at least one column name.
By mastering these JOIN functions, you can combine, filter, and aggregate data from different tables in your DAX data model. This allows for comprehensive analyses of the interactions between different sections of your data, providing you with a deeper understanding of your data's structure and relationships.
Hope this gives you a real-life implementation of advanced techniques for JOIN functions in DAX. Now you can easily apply these techniques in your DAX queries to manipulate data in a more advanced way.
Unit 6: Row Context and EARLIER Function in Depth
In Data Analysis Expressions (DAX), row context refers to the context where each row is evaluated individually in relation to other rows. DAX provides certain functions to manipulate, evaluate, or compare the values of the current row against other rows. One of these functions is the EARLIER
function.
The EARLIER
function is quite handy when you want to use a certain value of the current row within an aggregation function, such as SUM
, AVERAGE
, COUNT
, etc. In other words, it allows you to access data from an "earlier" row context in the calculation chain.
Sample Dataset
The sample dataset Sales
has following fields:
SalesID
: Unique identifier for each sale transaction
Year
: Year of the sale
ProductName
: Name of the product sold
UnitsSold
: Number of units sold
PricePerUnit
: Price per unit of the product
EARLIER Function Usage
Let's try to calculate the cumulative sales for each product for each year.
The DAX formula for this would be:
Cumulative Sales =
CALCULATE(
SUM(Sales[UnitsSold] * Sales[PricePerUnit]),
FILTER(
ALL(Sales),
Sales[Year] <= EARLIER(Sales[Year]) &&
Sales[ProductName] = EARLIER(Sales[ProductName])
)
)
Breakdown of the formula:
Here, CALCULATE
function modifies the context in which the data is filtered. In this case, it's calculating the total sales (SUM(Sales[UnitsSold] * Sales[PricePerUnit]
), but for a filtered subset of data.
Inside CALCULATE
, the FILTER
function generates a new table that includes all rows (because of ALL(Sales)
) that meet the specified conditions.
The conditions we have are Sales[Year] <= EARLIER(Sales[Year])
and Sales[ProductName] = EARLIER(Sales[ProductName])
, which basically compares the current row's 'Year' and 'ProductName' with all other rows in the table, gathering only those rows that have the same product name and a year less than or equal to the current row's year.
This way, for each row, we get cumulative sales by year for each product. EARLIER
function here allows us to refer to the current row's values while running the aggregate calculations inside CALCULATE
.
Now you can add [Cumulative Sales]
to any report/table vizualization you have in Power BI, and it will display the cumulative sales by each year for each product.
Unit 7: ALL Function Variations and Their Uses
The ALL Function in DAX (Data Analysis Expressions) essentially removes any filters that might've been applied to the column or table being referenced. It's often used in combination with other functions to create complex expressions. In this unit, we'll explore the use of the ALL function - along with some of its variants (ALLSELECTED, ALLEXCEPT), and how these can be utilized in real-world scenarios.
1. ALL Function in DAX
Let’s consider an example where you're working with a sales dataset, and you want to calculate the total sales. Later, you want to compare individual product sales with the total sales.
Imagine a table - FACT_Sales:
Product_ID
Sales_Amount
1
200
1
300
2
150
2
170
3
300
3
100
Total_Sales = SUM( FACT_Sales[Sales_Amount] )
Product_Total = CALCULATE (
SUM ( FACT_Sales[Sales_Amount] ),
ALL ( FACT_Sales[Product_ID] )
)
Total_Sales
is straightforward, it’s just the sum of Sales_Amount. Product_Total
uses CALCULATE to change the context of the calculation. It removes filters on Product_ID by using ALL, thus giving us overall total sales.
2. ALLSELECTED Function
ALLSELECTED returns all the rows in a table (or all the values in a column), while keeping intact the filters that have been applied to other columns.
Continuing the previous example, suppose you've a slicer connected to your sales dashboard, where you can pick the year. If you select 2020 and then calculate total sales using ALL, it will not respect your slicer selection. This problem is solved with ALLSELECTED.
Product_Total_AllSelected = CALCULATE (
SUM ( FACT_Sales[Sales_Amount] ),
ALLSELECTED ( FACT_Sales[Product_ID] )
)
This version respects filters on other columns, while removing filters on the Product_ID column.
3. ALLEXCEPT Function
ALLEXCEPT is a variation that keeps all filters on all other columns, except the ones specified. Suppose, to calculate the total sales for each product in a given year, you might do:
Product_Total_AllExcept = CALCULATE (
SUM ( FACT_Sales[Sales_Amount] ),
ALLEXCEPT ( FACT_Sales, FACT_Sales[Year] )
)
This variation will give us a total sales for each product and respect all filters except the ones on 'Year'.
In conclusion, the ALL function and its variants can be used effectively to control the context of your calculations in DAX according to your specific requirements.
Unit 8: Dealing with Blanks in DAX Tables
In DAX, dealing with blanks might be necessary especially when we are dealing with real world data, as there is often the chance that some values are missing in certain columns. These missing values are represented as blank in DAX. By understanding how to handle these blanks can aids in creating more robust and error-free reports.
A. Using IF to Replace Blanks
Often we may want to replace blank values with a placeholder or another value, and the IF
function in DAX allows us to do this effectively.
Suppose we have a 'Sales' table, which contains a 'Product' and 'Price' column. For some reason, some values in the 'Price' column are left blank. We can replace these blank values with 0 using the IF
function.
Sales :=
ADDCOLUMNS (
'OriginalSales',
"Price", IF ( ISBLANK ( 'OriginalSales'[Price] ), 0, 'OriginalSales'[Price] )
)
Here, ISBLANK
is used to query whether a cell is blank. If it is, the IF
function replaces it with 0.
B. Excluding Blanks Using FILTER
We might want to exclude rows containing blanks from our analysis. We can do this by creating a new table that only contains rows where the 'Price' is not blank using FILTER
.
FilteredSales :=
FILTER (
'OriginalSales',
NOT ( ISBLANK ( 'OriginalSales'[Price] ) )
)
Here, ISBLANK
is again used to check whether a cell is blank, but negated with the NOT
function, so the filter only includes rows where the 'Price' is not blank.
C. Handling Blanks in Measures
Sometimes you would like to display different value in measures when the result is blank. For instance, if we have a measure [Total Sales] that sums up the 'Price' column and we would like to display "N/A" when the result is blank, we can modify our measure as follows:
Total Sales :=
IF (
ISBLANK ( SUM( 'Sales'[Price] ) ),
"N/A",
SUM( 'Sales'[Price] )
)
This defensive coding technique helps to improve the readability of your reports.
D. Counting Blanks with DAX
If you want a count of rows where the 'Price' value is blank you can do so using the COUNTBLANK
function.
Blank Count :=
COUNTBLANK ( 'Sales'[Price] )
Conclusion
Blanks in your data can lead to incorrect calculations and misleading reports. These techniques shown here provide several ways you can handle blanks in DAX, but note that the best treatment always depends on the specific context and meaning of the data.
Unit 9: Practical Examples: Nested Functions and Complex Tables
Objective: To demonstrate the practical use of nested functions and complex table structures in DAX (Data Analysis Expressions)
Note: This unit assumes that you have a data model defined with necessary relationships needed for these examples.
I. Nested Functions in DAX
Nested functions are a concept in DAX where one function is used as an argument for another function. This functionality essentially allows a DAX function to use the result of another DAX function for computation. In order to illustrate the use of nested functions effectually, we will use a combination of DAX functions (such as SUM
, CALCULATE
, FILTER
and ALL
) in the examples below.
Example: Accumulative Sales per Category
This cumulative sales DAX measure uses nested SUM
and CALCULATE
functions together with a filter ALL
.
Accumulative Sales per Category =
CALCULATE(
SUM(Sales[SalesAmount]),
FILTER(
ALL(Sales[OrderDate]),
Sales[OrderDate] <= MAX(Sales[OrderDate])
),
VALUES(Products[Category])
)
- It first accumulates
SalesAmount
until the latest order date with SUM(Sales[SalesAmount])
.
- The
FILTER
function is then used to include all dates that are less than or equal to the latest order date.
VALUES(Products[Category])
makes sure the calculation is done per each category.
II. Complex Tables in DAX
Complex table structures entail nested tables, tables with multiple conditions and tables that use DAX functions to compute and filter data.
Example: Sales by Category and Year with Conditions and Nesting
This complex table DAX measure generates a summarized table presenting sales per category and year, but only including categories and years with sales above a given threshold.
Sales Summary =
ADDCOLUMNS(
SUMMARIZE(
FILTER(
ALL(Sales[OrderDate], Products[Category], Sales[SalesAmount]),
[Total Sales by Category and Year] > Threshold_Value
),
Products[Category],
"Year", YEAR(Sales[OrderDate])
),
"Total Sales by Category and Year", [Total Sales by Category and Year]
)
- The
FILTER
function is used to include all sales data but only where [Total Sales by Category and Year] > Threshold_Value
.
SUMMARIZE
then creates a summarized table based on the filtered sales data.
ADDCOLUMNS
finally adds a column to the summarized table to display the total sales by category and year.
You can replace Threshold_Value
with the desired sales threshold. The [Total Sales by Category and Year]
measure needs to be defined in your DAX model.
Please remember to tailor the table and field names according to your data model for these formulas to work.
Unit 10: Case Study: Applying DAX Table Functions in Real-World Scenarios
For this case study, we'll be making use of a hypothetical dataset from a bookstore, and we will focus on using a combination of various DAX functions to solve two real-world scenarios: Analyzing best selling books and finding target customers. We will be implementing all DAX functions via the Power BI environment.
Let's assume we have three primary tables in our data model: Sales (Sales
), Books (Books
), and Customers (Customers
).
Note: For this exercise, it's not important what these tables contain, we'll just assume that the relationships between these tables are already properly set up.
Scenario 1: Analyzing best selling books
We want to find the top 3 best selling books in terms of quantity sold.
Here is a DAX measure that does exactly that:
Top 3 Best Selling Books =
VAR SellingBooks = SUMMARIZE(
Sales,
Books[BookTitle],
"Total Quantity Sold", SUM(Sales[Quantity])
)
VAR Top3Books=TOPN(3, SellingBooks, [Total Quantity Sold], DESC)
RETURN
CONCATENATEX(Top3Books, [BookTitle], ",")
Where:
- SUMMARIZE generates a summary table over a set of groups.
- TOPN returns the top N rows of a table.
- CONCATENATEX concatenates the result of an expression evaluated for each row in a table.
The returned string will contain the top selling book title first, followed by the second and third best selling book.
Scenario 2: Finding target customers
Assume the marketing department wants to target customers who have purchased at least 5 different books.
We will use the COUNTROWS and DISTINCT functions, which allow us to manipulate data in a table and return the count of rows that are unique in a column, respectively.
Here's the DAX formula:
Target Customers =
VAR PurchasedBookCount = SUMMARIZE(
Sales,
Customers[CustomerName],
"Distinct Book Count", COUNTROWS(DISTINCT(Sales[BookID]))
)
VAR TargetedCustomers = FILTER(
PurchasedBookCount,
[Distinct Book Count] >= 5
)
RETURN
CONCATENATEX(TargetedCustomers , Customers[CustomerName], ",")
Where:
- FILTER allows us to reduce the number of rows in the table where the conditions that follow the function are met.
The return value will be a string, which contains the names of the target customers.
In this case study, by using DAX table functions, we are able to analyze complex real-world scenarios.
More Code Generators
Apache Flink Code Generator Apache Pig Code Generator Azure Data Factory Code Generator C/C++ Code Generator CouchDB Code Generator DAX Code Generator Excel Code Generator Firebase Code Generator Google BigQuery Code Generator Google Sheets Code Generator GraphQL Code Generator Hive Code Generator Java Code Generator JavaScript Code Generator Julia Code Generator Lua Code Generator M (Power Query) Code Generator MATLAB Code Generator MongoDB Code Generator Oracle Code Generator PostgreSQL Code Generator Power BI Code Generator Python Code Generator R Code Generator Redis Code Generator Regex Code Generator Ruby Code Generator SAS Code Generator Scala Code Generator Shell Code Generator SPSS Code Generator SQL Code Generator SQLite Code Generator Stata Code Generator Tableau Code Generator VBA Code Generator