The Distance Between Two Points

13 Mar 2014

Algorithms, C#, Maths, SQL


What can mathematics do for your programming? That's an interesting question to explore, even more so when answered with an example. As luck would have it, such an example presented itself to me at work a few months ago.


The Problem

Our mobile phone application makes use of mobile phone GPS positioning to assist the user in finding a resource that is within their current vicinity. A list of resources is created, then sorted based on the distance between the user's location and the location of each resource. Distance is approximate, and calculated based on a straight line between the two points, and doesn't take into account streets, buildings, or anything else that modern mapping software does so well. It's a quick and dirty way to give the end user an approximate idea of how far they will need to travel. However it was super slow, taking at least 18 seconds to respond for only a handful of records - what was going on?


The Implementation

Now that we understand what needs to happen, let's have a look at how it was implemented. The mobile app first initiates the request to the web service, then:

This all seems pretty straight forward, so let's drill a little more into the key areas of the code. Note that this code appears in sanitised form! Here's the code that first queries the database via Entity Framework:

public IEnumerable GetMethod(int startRow, int rowCount, double lon, double lat, double eventRadiusKm) { using (var db = new Context { Log = new DebuggerWriter() }) { IEnumerable results = ( from e in db.Table.ToList() where e.Deleted == false && ((db.DistanceBetween(lat, lon, e.Latitude, e.Longitude) < eventRadiusKm)) orderby db.DistanceBetween(lat, lon, e.Latitude, e.Longitude) ascending select SetDistanceProperty(e, db.DistanceBetween(lat, lon, e.Latitude, e.Longitude)) ).Skip(startRow - 1).Take(rowCount); } } private static SomeObject SetDistanceProperty(SomeObject e, double? distance) { e.DistanceFromLocation = distance; return e; }

So here's what's this LINQ statement is essentially achieving:

  1. A new database context is created
  2. A list of records is retrieved from the underlying data structure, then materialised (serialised into objects) into the appropriate list form. At this point, it's important to understand that all records from the query are going to be materialised here, even if you only need a handful. So if there are 100000 records, you're going to get every single one
  3. The method db.DistanceBetween is invoked during each serialise operation, this is the method that performs the distance calculation
  4. Next, the LINQ statement performs an ORDER BY operation, again invoking db.DistanceBetween for each serialised item in the list
  5. Next, the LINQ statement performs a SELECT using the SetDistanceProperty method. This method sets the DistanceFromLocation property on the object, then returns it back to the LINQ query. Once again, the db.DistanceBetween method is invoked for a third time
  6. Lastly, the LINQ statement skips an amount of rows, then returns a list equal to the size of the row count

Key points from the above LINQ statement:

This is all well and good, until you dig into the code contained within the db.DistanceBetween method. And here it is:

[global::System.Data.Linq.Mapping.FunctionAttribute(Name="dbo.DistanceBetween", IsComposable=true)] public System.Nullable DistanceBetween([global::System.Data.Linq.Mapping.ParameterAttribute(Name="Lat1", DbType="Float")] System.Nullable lat1, [global::System.Data.Linq.Mapping.ParameterAttribute(Name="Long1", DbType="Float")] System.Nullable long1, [global::System.Data.Linq.Mapping.ParameterAttribute(Name="Lat2", DbType="Float")] System.Nullable lat2, [global::System.Data.Linq.Mapping.ParameterAttribute(Name="Long2", DbType="Float")] System.Nullable long2) { return ((System.Nullable)(this.ExecuteMethodCall(this, ((MethodInfo)(MethodInfo.GetCurrentMethod())), lat1, long1, lat2, long2).ReturnValue)); }

Yep, that's correct. This code goes back to the database, and executes the following scalar function:

ALTER FUNCTION [dbo].[DistanceBetween] (@Lat1 AS real, @Long1 AS real, @Lat2 AS real, @Long2 AS real)

RETURNS REAL

AS

BEGIN

DECLARE @dLat1InRad AS float(53);
SET @dLat1InRad = @Lat1 * (PI()/180.0);
DECLARE @dLong1InRad AS float(53);
SET @dLong1InRad = @Long1 * (PI()/180.0);
DECLARE @dLat2InRad AS float(53);
SET @dLat2InRad = @Lat2 * (PI()/180.0);
DECLARE @dLong2InRad AS float(53);
SET @dLong2InRad = @Long2 * (PI()/180.0);
DECLARE @dLongitude AS float(53);
SET @dLongitude = @dLong2InRad - @dLong1InRad;
DECLARE @dLatitude AS float(53);
SET @dLatitude = @dLat2InRad - @dLat1InRad;
/* Intermediate result a. */
DECLARE @a AS float(53);
SET @a = SQUARE (SIN (@dLatitude / 2.0)) + COS (@dLat1InRad) * COS (@dLat2InRad) * SQUARE(SIN (@dLongitude / 2.0));
/* Intermediate result c (great circle distance in Radians). */
DECLARE @c AS real;
SET @c = 2.0 * ATN2 (SQRT (@a), SQRT (1.0 - @a));
DECLARE @kEarthRadius AS real;
/* SET kEarthRadius = 3956.0 miles */
SET @kEarthRadius = 6376.5;        /* kms */
DECLARE @dDistance AS real;
SET @dDistance = @kEarthRadius * @c;
RETURN (@dDistance)

END

This function is obviously a hack (we don't use miles in Australia) and has just been copied in from somewhere.

Let's not forget that the following is executed 3 times per record! When you factor in all method calls involved in performing this calculation, it becomes one EF method, and one scalar function call per record, multiplied by 3. That's now 6 moving parts per record!

As you can see, end-to-end, there's a fair bit of code being pushed around here to calculate the distance between two points - too much code in my opinion.


The Solution

When I first uncovered this code, I spotted instantly the pattern, and new that there was a simple mathematical solution for this. I used the Pythagorean Distance Formula:

d = √ΔX2 + ΔY2

You can read this is: distance is the square root of delta X squared plus delta Y squared. ΔX denotes the change (Delta) in X.

Expanding this slightly:

Distance = √(X2 - X1)2 + (Y2 - Y1)2

In code, this became:

private const double KilometersPerDegree = 111.325; private static double DistanceBetweenTwoPoints(double Latitude1, double Longitude1, double? Latitude2, double? Longitude2) { double dDistanceBetweenTwoPoints = 0; if ((Latitude2.HasValue == true) && (Longitude2.HasValue == true)) { // Using Pythagoras to calculate this using the distance formula: // d = SQRT((x2 - x1)^2 + (y2 - y1)^2) dDistanceBetweenTwoPoints = Maths.Sqrt(Maths.Pow(((Latitude1) - (Latitude2.Value)), 2) + Maths.Pow(((Longitude1) - (Longitude2.Value)), 2)); // Converts our distance value into a kilometer value dDistanceBetweenTwoPoints *= KilometersPerDegree; } return dDistanceBetweenTwoPoints; }

Making a few other adjustments (including throwing away Entity Framework) reduced the code size dramatically, and reduced the execution time for equivalent operations from a minimum of 18 seconds to less than a millisecond.


The Conclusion

My approach was as pragmatic as I could be - get it done, and get it out the door. There are other ways of achieving a similar result. For example, a much nicer solution could involving sending the user's coordinates into the database query and return the results with the addition of computed column that contains the calculated value already.

It may seem that this article is simply a dig at the programmer who orignally wrote this code, after all, we all try and write the best possible solution at the time don't we? And it's very easy to be smug with the benefit of hindsight. That aside, that's not my intention at all. The goal of this article is to demonstrate how simple mathematics can improve your programming in so many ways:

I hope this article has been of some use to you. Don't be scared of maths, get into it, the results are worthwhile and are sometimes immediate.


 

Copyright © 2024 carlbelle.com