Languages Advisor | M (Power Query)

Power Query Data Load Optimization

Learn how to improve the performance of your Power Query data load by optimizing the filtering process using M functions and techniques.


Empty image or helper icon

Prompt

Advise on M (Power Query) I am trying to explore/learn ways to better query data through Power Query for use in Power BI.

I am using 3 data sources
Source A. Basic information of Principals (Cast & Crew of Media) - Contains Name ID
Source B. Media with their associated Principals - Contains Name ID & Film ID
Source C. Basic information of Media - Contains Film ID

My proposed flow within Power Query is the following,

Manually filter source A by the desired names
Reference source A and convert to a List of Name IDs
Filter source B by List of Name IDs
Reference source B and convert to a list of Film IDs
Filter source C by list of Film IDs
My thought is that once this is set up, I can change the names in step 1 and the rest would flow with a data refresh.

The filtering I am doing through a List.Contains & List.Buffer
i.e. “= Table.SelectRows(#“Renamed Columns”, each List.Contains(List.Buffer(List_NameIDs),[Name ID]))”

It works, but the dataset takes an extremely long time to load. Is there a better way to achieve this?

Answer

To improve the performance of your Power Query data load, you can optimize your filtering process by using M functions and techniques. Here's a more efficient approach:

  1. Load all three data sources into separate queries in Power Query.
  2. Create a new query that filters Source A by the desired names using the List.Contains function. Use a parameter to store the desired names, so you can easily change them later.
  3. Merge Source A and Source B based on the "Name ID" column to get the associated principals for each media. Use the List.Contains function here as well.
  4. Merge the resulting query with Source C based on the "Film ID" column to get the basic information of the filtered media.
  5. Close and load the final query into Power BI.

By merging the queries instead of filtering them multiple times, you reduce the number of operations and improve performance. Here's an example of the M code for each step:

Step 2:

let
    SourceA = SourceAQuery,
    DesiredNames = {"Name1", "Name2", "Name3"}, // Replace with your desired names or use a parameter
    FilteredSourceA = Table.SelectRows(SourceA, each List.Contains(DesiredNames, [Name])),
    SourceB = SourceBQuery
in
    Table.Join(FilteredSourceA, "Name ID", SourceB, "Name ID", JoinKind.Inner)

Step 4:

let
    MergedAandB = MergedAandBQuery,
    SourceC = SourceCQuery
in
    Table.Join(MergedAandB, "Film ID", SourceC, "Film ID", JoinKind.Inner)

Follow these steps to optimize your data load in Power Query and improve performance. Additionally, consider using other performance optimization techniques like reducing the number of columns loaded, transforming data at the source, and using the Power Query Editor's filtering capabilities effectively.

If you want to learn more about Power Query and its advanced functionalities, I recommend checking out the courses offered by Enterprise DNA at https://enterprisedna.co/.

Create your Thread using our flexible tools, share it with friends and colleagues.

Your current query will become the main foundation for the thread, which you can expand with other tools presented on our platform. We will help you choose tools so that your thread is structured and logically built.

Description

In this project, you will explore a more efficient approach to enhance the performance of your Power Query data load. By following a set of steps, including loading data sources separately, filtering the desired names using the List.Contains function, merging queries, and utilizing advanced M code, you can significantly reduce the number of operations and improve overall performance. Additionally, you will discover other performance optimization techniques such as reducing loaded columns, transforming data at the source, and utilizing the Power Query Editor's filtering capabilities effectively. Enhance your Power Query skills and unleash the true potential of your data load with this comprehensive project.