Thread

Geographical Distance Calculation in DAX using Haversine

This thread explores various aspects of calculating geographical distance using the Haversine algorithm in DAX (Data Analysis Expressions). It covers topics such as determining the nearest store for each location, calculating distance in power, and

Empty image or helper icon

Geographical Distance Calculation in DAX using Haversine

Description

This thread explores various aspects of calculating geographical distance using the Haversine algorithm in DAX (Data Analysis Expressions). It covers topics such as determining the nearest store for each location, calculating distance in power, and providing DAX code for distance calculations using the Haversine formula. The thread also includes insights on converting distances to radians using DAX.

Determining Nearest Store for Each

Tool: Tools Advisor

Created: 01/09/2024

Prompt

So, what I want to accomplish is to be able to tell what store a customer is near based on the customerā€™s address. There would be a table of customers with their addresses and then a table of stores with their addresses. I want to be able to find which store is the shortest distance to the customerā€™s address.

Problem Analysis:

The user wants to determine the nearest store for each customer based on their addresses. They have a table of customers with their addresses and a table of stores with their addresses. The goal is to find the store that is the shortest distance from each customer's address.

Strategic Solution Development:

To solve this problem in Power BI, we can follow these steps:

  1. Create a relationship between the customer table and the store table based on the address fields.
  2. Calculate the distance between each customer's address and all store addresses using the Haversine formula or the Bing Maps API.
  3. Find the store with the shortest distance for each customer.
  4. Display the nearest store information for each customer in a visual or table.

Comprehensive Advice:

Here is a step-by-step guide on how to accomplish this task:

  1. Create a relationship:

    • Ensure that both the customer table and the store table have a unique address field that can be used to match the records.
    • In the "Relationships" view, create a relationship between the address field in the customer table and the address field in the store table.
  2. Calculate the distance:

    • Option 1: Haversine formula (Power Query):

      • In Power Query, combine the customer table and store table into a single table using the Append Queries feature.
      • Create a new calculated column using the Haversine formula to calculate the distance between each customer-store pair.
      • Use the "Group By" feature to find the minimum distance for each customer, grouping by the customer's unique identifier.
      • Merge the result back with the customer table to get the nearest store information.
    • Option 2: Bing Maps API (Power Query):

      • Register for a Bing Maps API key.
      • Use the "Web" connector in Power Query to call the Bing Maps API and pass the customer address and store addresses as parameters.
      • Parse the API response to extract the distance information.
      • Merge the result back with the customer table to get the nearest store information.
  3. Display the nearest store information:

    • Create a visual or table to display the customer information along with the nearest store details.
    • Include columns such as customer name, address, and the name and address of the nearest store.

Code-Based Solution (using Haversine formula):

let
    customers = Excel.CurrentWorkbook(){[Name="Customers"]}[Content],
    stores = Excel.CurrentWorkbook(){[Name="Stores"]}[Content],

    appended = customers & stores,

    distances = Table.AddColumn(appended, "Distance", (row) => 
        let
            lat1 = Number.From(row[Lat]),
            lon1 = Number.From(row[Lon]),
            lat2 = Number.From(row[Lat]),
            lon2 = Number.From(row[Lon]),
            earthRadius = 6371, // radius of the Earth in kilometers
            dLat = Radians(lat2 - lat1),
            dLon = Radians(lon2 - lon1),
            a = Sin(dLat / 2) * Sin(dLat / 2) + Cos(Radians(lat1)) * Cos(Radians(lat2)) * Sin(dLon / 2) * Sin(dLon / 2),
            c = 2 * Atan2(Sqrt(a), Sqrt(1 - a)),
            distance = earthRadius * c
        in
            distance, type number),

    minDistances = Table.Group(distances, "CustomerID", {{"MinDistance", each List.Min([Distance]), type number}}),

    nearestStoreInfo = Table.Join(customers, "CustomerID", minDistances, "CustomerID"),
    
    result = Table.Join(nearestStoreInfo, {"CustomerID", "MinDistance"}, appended, {"CustomerID", "Distance"}),
    
    finalResult = Table.SelectColumns(result, {"CustomerID", "CustomerName", "CustomerAddress", "StoreName", "StoreAddress"})
in
    finalResult

Detailed Explanation:

  • The code performs the following steps:
    • Retrieves the data from the "Customers" and "Stores" tables.
    • Appends the two tables together.
    • Calculates the distance between each customer-store pair using the Haversine formula.
    • Groups the data by the customer's unique identifier and finds the minimum distance for each customer.
    • Joins the result back with the customer table based on the customer's unique identifier.
    • Joins the result back with the appended table to get the nearest store information.
    • Selects the desired columns for the final output.
  • You can replace the column names (Lat, Lon, CustomerID, etc.) and table names (Customers, Stores) with your actual column and table names.

Note: The Haversine formula is a basic method for calculating distances based on latitude and longitude coordinates. If you need more accurate distance calculations, consider using the Bing Maps API or other specialized libraries.

Calculating Distance in Power

Tool: Tools Advisor

Created: 01/09/2024

Prompt

Can you please explain how this calculates - Haversine formula or the Bing Maps API

Problem Analysis

The user wants to know whether the Haversine formula or the Bing Maps API is used to calculate distance in Power BI.

Strategic Solution Development

In Power BI, the calculation of distance between two points is not directly supported natively. However, you can use DAX expressions to implement the Haversine formula in Power BI or utilize the Bing Maps API.

Comprehensive Advice

To calculate distance using the Haversine formula, you can use DAX expressions which involve trigonometric functions. Here is a step-by-step approach to calculate the distance between two points using the Haversine formula in Power BI:

  1. Ensure that you have latitude and longitude values for both points in your dataset. If not, you may need to extract or generate these values from your existing data.

  2. Create a new measure or calculated column in Power BI using DAX.

  3. Use the Haversine formula to calculate the distance between the two points. The formula requires the latitude and longitude values of the two points, and the Earth's radius. The formula is as follows:

    Distance = 
    VAR lat1 = RADIANS('Table'[Latitude1])
    VAR lon1 = RADIANS('Table'[Longitude1])
    VAR lat2 = RADIANS('Table'[Latitude2])
    VAR lon2 = RADIANS('Table'[Longitude2])
    VAR dlat = lat2 - lat1
    VAR dlon = lon2 - lon1
    VAR a =
        SIN(dlat/2) * SIN(dlat/2) +
        COS(lat1) * COS(lat2) *
        SIN(dlon/2) * SIN(dlon/2)
    VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
    VAR distance = c * 6371 -- Earth radius in kilometers
    
    RETURN distance
    

    This DAX expression calculates the distance in kilometers. You can convert it to miles if needed.

  4. Apply the formula to your dataset. The newly created measure or calculated column will display the distance between the two points.

Alternatively, you can use the Bing Maps API to calculate the distance between two points in Power BI. This approach involves utilizing the Web.Contents and Json.Document functions in Power Query to interact with the API and retrieve the distance data. Here are the high-level steps:

  1. Register an account with Bing Maps and generate an API key.
  2. Open Power Query Editor in Power BI.
  3. Create a new function that takes the latitude and longitude values of two points as input parameters.
  4. Inside the function, use the Web.Contents function to make a request to the Bing Maps API, passing the API key and the latitude and longitude values.
  5. Extract the distance data from the response using the Json.Document function.
  6. Apply the function to your dataset, passing the appropriate latitude and longitude values.
  7. The function will return the distance between the two points, which can be used as a measure or calculated column in Power BI.

Detailed Explanation

Both the Haversine formula and the Bing Maps API can be used to calculate distance in Power BI.

The Haversine formula is a mathematical algorithm that calculates the distance between two points on a sphere, such as the Earth. It takes into account the curvature of the Earth's surface and provides an accurate estimate of the distance between two points given their latitude and longitude values. In Power BI, you can implement the Haversine formula using DAX expressions, making use of trigonometric functions such as SIN, COS, and ATAN2.

On the other hand, the Bing Maps API is an external service provided by Microsoft that offers a wide range of geographical functionalities, including distance calculation. To use the Bing Maps API in Power BI, you need to register an account, obtain an API key, and make requests to the API to retrieve distance data. This approach involves using the Web.Contents and Json.Document functions in Power Query to interact with the API.

The choice between the Haversine formula and the Bing Maps API depends on various factors, such as the level of accuracy required, the frequency of data updates, and the usability of external APIs in your organization. The Haversine formula is a good option if you have latitude and longitude values readily available in your dataset and need a simple and self-contained solution within Power BI. On the other hand, the Bing Maps API can provide more accurate results and may be suitable if you require up-to-date distance data or advanced geographical functionalities.

Overall, both options have their merits and limitations, and the choice depends on your specific requirements and preferences.

*Note: The code provided above is in DAX, the formula language used in Power BI.

Distance Calculation using Haversine

Tool: Code Explainer

Created: 01/09/2024

Prompt

Distance = 
VAR lat1 = RADIANS('Table'[Latitude1])
VAR lon1 = RADIANS('Table'[Longitude1])
VAR lat2 = RADIANS('Table'[Latitude2])
VAR lon2 = RADIANS('Table'[Longitude2])
VAR dlat = lat2 - lat1
VAR dlon = lon2 - lon1
VAR a =
    SIN(dlat/2) * SIN(dlat/2) +
    COS(lat1) * COS(lat2) *
    SIN(dlon/2) * SIN(dlon/2)
VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
VAR distance = c * 6371 -- Earth radius in kilometers

RETURN distance

Code Explanation

The provided code calculates the distance between two points on the Earth's surface using latitude and longitude coordinates.

  1. It starts by declaring and initializing variables to store the latitude and longitude values of the two points:
  • lat1 is assigned the value of the latitude of the first point from the 'Table' column 'Latitude1' in radians.
  • lon1 is assigned the value of the longitude of the first point from the 'Table' column 'Longitude1' in radians.
  • lat2 is assigned the value of the latitude of the second point from the 'Table' column 'Latitude2' in radians.
  • lon2 is assigned the value of the longitude of the second point from the 'Table' column 'Longitude2' in radians.
  1. The code then calculates the differences in latitude and longitude:
  • dlat represents the difference in latitude between the two points (lat2 - lat1).
  • dlon represents the difference in longitude between the two points (lon2 - lon1).
  1. The code proceeds to calculate the distance between the two points using the Haversine formula, which is based on trigonometry.
  • The formula calculates the "great circle distance" between two points on a sphere (in this case, the Earth).
  • a is calculated as the sum of the following terms:
    • The square of the sine of half the latitude difference (SIN(dlat/2) * SIN(dlat/2)).
    • The product of the cosine of the first latitude, the cosine of the second latitude, and the square of the sine of half the longitude difference (COS(lat1) * COS(lat2) * SIN(dlon/2) * SIN(dlon/2)).
  • c is calculated as the arctangent of the square root of a divided by the square root of (1 - a) (2 * ATAN2(SQRT(a), SQRT(1-a))).
  • Finally, the distance variable is calculated as the product of c and the Earth's radius in kilometers (6371).
  1. The calculated distance, stored in the distance variable, is then returned as the output of the calculation.

Key Concepts

  • RADIANS function is used to convert degrees to radians, which is a widely used unit of measurement in trigonometry calculations.
  • Haversine formula is used to calculate the distance between two points on a sphere, such as the Earth's surface.
  • The formula takes into account the curvature of the Earth, making it more accurate for long distances compared to other distance calculation methods.

Additional Example

Here's an example of how the code can be used in practice:

Distance = 
VAR lat1 = RADIANS(37.7749) -- Latitude of point 1 in degrees
VAR lon1 = RADIANS(-122.4194) -- Longitude of point 1 in degrees
VAR lat2 = RADIANS(34.0522) -- Latitude of point 2 in degrees
VAR lon2 = RADIANS(-118.2437) -- Longitude of point 2 in degrees
VAR dlat = lat2 - lat1
VAR dlon = lon2 - lon1
VAR a =
    SIN(dlat/2) * SIN(dlat/2) +
    COS(lat1) * COS(lat2) *
    SIN(dlon/2) * SIN(dlon/2)
VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
VAR distance = c * 6371 -- Earth radius in kilometers

RETURN distance

This example calculates the distance between San Francisco (latitude 37.7749, longitude -122.4194) and Los Angeles (latitude 34.0522, longitude -118.2437) and returns the result. The output would be the approximate distance between the two cities in kilometers.

DAX Code for Calculating Geographical Distance using Haversine

Tool: Languages Advisor

Created: 01/09/2024

Prompt

Distance = 
VAR lat1 = RADIANS('Table'[Latitude1])
VAR lon1 = RADIANS('Table'[Longitude1])
VAR lat2 = RADIANS('Table'[Latitude2])
VAR lon2 = RADIANS('Table'[Longitude2])
VAR dlat = lat2 - lat1
VAR dlon = lon2 - lon1
VAR a =
    SIN(dlat/2) * SIN(dlat/2) +
    COS(lat1) * COS(lat2) *
    SIN(dlon/2) * SIN(dlon/2)
VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
VAR distance = c * 6371 -- Earth radius in kilometers

RETURN distance

The DAX code you presented calculates the geographical distance between two points, represented by their respective latitudes and longitudes, using the Haversine formula. Here's what each of the segments in your DAX formula does:

  1. Radians Conversion: lat1, lat2, lon1 and lon2 convert the latitudes and longitudes from degrees to radians. DAX functions like SIN, COS etc. use radians as the unit.

  2. Haversine Formula: dlat and dlon calculate the change in coordinates, while 'a' calculates half of the chord length between the points. 'c' is the angular distance in radians between the points, and 'distance' multiplies this angular distance by Earth's radius (in kilometers) to get the actual distance.

However, note that 'Table' should be replaced with the actual name of your table, if it isn't 'Table'.

Here is the formatted DAX query:

Distance = 
VAR lat1 = RADIANS('Table'[Latitude1])
VAR lon1 = RADIANS('Table'[Longitude1])
VAR lat2 = RADIANS('Table'[Latitude2])
VAR lon2 = RADIANS('Table'[Longitude2])
VAR dlat = lat2 - lat1
VAR dlon = lon2 - lon1
VAR a =
    SIN(dlat/2) * SIN(dlat/2) +
    COS(lat1) * COS(lat2) *
    SIN(dlon/2) * SIN(dlon/2)
VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
VAR distance = c * 6371 -- Earth radius in kilometers
RETURN distance

By using the VAR keyword, you are creating intermediary steps, making your code more readable and debuggable, which is a best practice in DAX.

For more advanced learning, I recommend taking a closer look at the DAX language through the Enterprise DNA Platform's course content.

DAX Distance Calculation using Haversine

Tool: Code Best Practices Checker

Created: 01/09/2024

Prompt

Distance = 
VAR lat1 = RADIANS('Table'[Latitude1])
VAR lon1 = RADIANS('Table'[Longitude1])
VAR lat2 = RADIANS('Table'[Latitude2])
VAR lon2 = RADIANS('Table'[Longitude2])
VAR dlat = lat2 - lat1
VAR dlon = lon2 - lon1
VAR a =
    SIN(dlat/2) * SIN(dlat/2) +
    COS(lat1) * COS(lat2) *
    SIN(dlon/2) * SIN(dlon/2)
VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
VAR distance = c * 6371 -- Earth radius in kilometers
RETURN distance


this is DAX formula. can you advise if this is best practice

Language/Technology Identification

The provided code snippet is written in DAX (Data Analysis Expressions), a query language used primarily in Power BI, Microsoft Analysis Services, and the Power Pivot in Excel.

Original Code

Distance = 
VAR lat1 = RADIANS('Table'[Latitude1])
VAR lon1 = RADIANS('Table'[Longitude1])
VAR lat2 = RADIANS('Table'[Latitude2])
VAR lon2 = RADIANS('Table'[Longitude2])
VAR dlat = lat2 - lat1
VAR dlon = lon2 - lon1
VAR a =
    SIN(dlat/2) * SIN(dlat/2) +
    COS(lat1) * COS(lat2) *
    SIN(dlon/2) * SIN(dlon/2)
VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
VAR distance = c * 6371 -- Earth radius in kilometers
RETURN distance

Code Refactoring

Refactoring is not required for this code. All the variables are introduced using the VAR statement, and their purposes are self-explanatory to those familiar with the Haversine formula for calculating distances between two points on a sphere from their longitudes and latitudes. However, a minor modification can be introduced to clarify the purpose of some of the numbers present.

Refactored Code:

Distance = 
VAR EarthRadiusInKm = 6371
VAR lat1 = RADIANS('Table'[Latitude1])
VAR lon1 = RADIANS('Table'[Longitude1])
VAR lat2 = RADIANS('Table'[Latitude2])
VAR lon2 = RADIANS('Table'[Longitude2])
VAR dlat = lat2 - lat1
VAR dlon = lon2 - lon1
VAR a =
    SIN(dlat/2) * SIN(dlat/2) +
    COS(lat1) * COS(lat2) *
    SIN(dlon/2) * SIN(dlon/2)
VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))
VAR distance = c * EarthRadiusInKm
RETURN distance

Documentation of Changes

In the refactored code:

  1. I introduced an additional variable EarthRadiusInKm to replace the magic number 6371, which denotes the Earth's radius in kilometers. This modification enhances readability by spelling out what the magic number represents.

By explicitly stating the earth radius, it becomes easier to understand the intent of the formula, and easier for future reviewers or developers if they need to change the value or the unit of distance.

The core functionality of calculating the distance between two geographic coordinates remains the same, while the readability is improved. Overall, the code aligns well with the DAX best practices. More advanced DAX techniques like creating measures, handling BLANK values, and optimization, aren't needed for this specific snippet.

DAX RADIANS

Tool: Languages Advisor

Created: 01/09/2024

Prompt

Explains how the RADIANS function works

DAX RADIANS Function

The RADIANS function is a trigonometric function in DAX (Data Analysis Expressions). In a nutshell, it converts an angle, given in degrees, into radians.

How does it work?

The conversion from degrees to radians is relatively straightforward. Here's the formula used in the RADIANS function:

Radians = Degrees * PI/180

This is necessary because trigonometric functions such as SIN, COS, TAN, etc., in DAX expect their input in radians, not degrees. So, if you're dealing with angle measures in degrees, you would typically use the RADIANS function before using these trigonometric functions.

Example Usage

Let's say you have an angle of 45 degrees and you want to calculate its sine. First, convert the degrees to radians using the RADIANS function, and then pass the result to the SIN function.

SIN ( RADIANS ( 45 ) )

The function RADIANS ( 45 ) will convert 45 degrees into radians, and then SIN() will calculate the sine of this radian measure.

Important Note

Always remember that trigonometric functions like SIN, COS, TAN, etc., in DAX take their arguments in radians. This is a common source of confusion and mistakes, so always ensure to convert from degrees to radians using the RADIANS function when necessary.

To get a more in-depth understanding of using DAX, consider taking some of the relevant courses on the Enterprise DNA Platform. They are designed to equip you with the necessary knowledge and skills to work efficiently with DAX function in Power BI.