Skyhook Under The Hood: How We Scale Geopositioning for More Than 2 Billion Wi-Fi Access Points

May 8, 2015   

Posted by Kipp Jones

Lat. 42.351994 Long. -71.047663

In this series so far, we’ve discussed the multi-step process of geopositioning billions of Wi-Fi access points. As a refresher, here are the steps we have covered to date:

Step 1: Determine the location of a Wi-Fi access point

Step 2: Gather on-device signals and information

Step 3: Get signals to server or get beacon locations to device

Step 4: Compute location of device

Beyond the steps covered so far, a positioning system also needs to take into account any additional information that may be available. For example, if cell information is available, we use it to help ascertain what we call the ‘Coarse Location’. This can increase the overall accuracy of the system by anchoring the location within the sphere of influence of the given cell signal. We may also use sensor information to handle stationary devices (removing jitter) and historical information to inform subsequent positions (smoothing).
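
As a rough illustration of the anchoring and smoothing ideas above, here is a minimal Python sketch. It is not Skyhook's actual implementation: the function names, the circular model of a cell's coverage, the fallback to the cell center, and the smoothing factor `alpha` are all assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def anchor_to_cell(wifi_fix, cell_center, cell_radius_m):
    """Keep the Wi-Fi fix only if it lies inside the cell's (assumed circular)
    coverage area; otherwise fall back to the coarse cell location."""
    if wifi_fix is None:
        return cell_center
    if haversine_m(*wifi_fix, *cell_center) <= cell_radius_m:
        return wifi_fix
    return cell_center


def smooth(previous, current, alpha=0.5, stationary=False):
    """Blend the new fix with the previous one (exponential smoothing).
    If sensors report the device as stationary, hold the previous fix
    to suppress jitter."""
    if previous is None:
        return current
    if stationary:
        return previous
    return (previous[0] + alpha * (current[0] - previous[0]),
            previous[1] + alpha * (current[1] - previous[1]))
```

In this sketch a Wi-Fi fix that lands outside the cell's coverage circle is treated as untrustworthy and replaced by the coarse location; a real system would more likely weight the sources than discard one outright.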

Bottom line: using Wi-Fi, Cell, GPS, IP, and sensor data (with more to come), we are able to provide location that is:

  • Highly available (high yield, nearly anywhere)

  • Fast (quick Time To First Fix)

  • Highly accurate (deliver the best location based on available information)

  • Battery-conscious; and

  • Privacy preserving

A differentiating feature of the Skyhook location network is that we can provide location regardless of the type of device or the hardware installed. Our software has been ported to many different operating systems and has run on mobile devices, laptops, netbooks, gaming devices, tablets, watches, media devices, femto cells, offender trackers, pallets, and others. We can support location on-device, off-device, or ‘device-less’ via our SDKs and our APIs.


Step 5: Throw all newly gathered signal information into the soup and stir

Phew, that was easy.

While some of the signal information we use to expand and update our location network comes from planned collections, every request that we get also provides us with additional valuable signal data.  Processing billions and billions of location requests every month means that we are getting a lot of signal data directly from the devices we serve, and with that feedback, we are also getting a ton of additional information about the world every day.

All of this information is pumped back into our processing, allowing us to continuously update our database with the latest, greatest beacon location, keeping you and your devices locatable even as the world changes around you.

But we also get a lot of questions about how we deal with things like mobile APs, such as “What happens when somebody takes their Wi-Fi access point and moves?” These are good questions, and our challenge from the very beginning has been to develop innovative solutions to them.

For one thing, our positioning system relies largely on beacons that we do not own or control and that can, and do, move around. Moreover, a lot of the data we receive is incredibly noisy (radio signals notoriously misbehave in real-world situations).

To address this, we developed a number of methods, algorithms, and heuristics for dealing with ‘moved’, ‘misbehaving’ or ‘pathological’ beacons. In particular, we are able to discover when access points move based on spatiotemporal clustering of the observations. As we obtain new observations, we recompute the location — and if we see a new cluster of observations that is trustworthy, we will make the decision to reposition that access point.
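
To make the idea of spotting a moved access point more concrete, here is a minimal Python sketch. It is a hypothetical illustration, not Skyhook's algorithm: the greedy clustering, the `Observation` type, and the `radius_m`, `min_obs`, and `window_s` thresholds are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    timestamp: float  # seconds since epoch
    lat: float
    lon: float


def distance_m(a, b):
    """Approximate ground distance in meters between two (lat, lon) pairs."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return 6_371_000.0 * math.hypot(dx, dy)


def centroid(points):
    """Mean latitude and longitude of a non-empty list of points."""
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))


def cluster(points, radius_m):
    """Greedy clustering: a point joins the first cluster whose centroid
    is within radius_m, otherwise it starts a new cluster."""
    clusters = []
    for p in points:
        for members in clusters:
            if distance_m(centroid(members), p) <= radius_m:
                members.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def detect_move(current, observations, radius_m=200.0, min_obs=5,
                window_s=7 * 86400):
    """Return a new (lat, lon) if recent observations form a sufficiently
    large cluster away from the current position, else None."""
    if not observations:
        return None
    latest = max(o.timestamp for o in observations)
    recent = [(o.lat, o.lon) for o in observations
              if o.timestamp >= latest - window_s]
    biggest = max(cluster(recent, radius_m), key=len)
    new_location = centroid(biggest)
    if len(biggest) >= min_obs and distance_m(current, new_location) > radius_m:
        return new_location
    return None
```

The time window supplies the "temporal" part: only recent observations vote on where the access point is now, and the minimum cluster size stands in for the judgment of whether the new cluster is trustworthy.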

Similarly, we developed methods to find pathological beacons. ‘Pathological’ to us is any beacon that is not trustworthy for use in computing a location. Some of the categories of pathological Wi-Fi access points include:

  • Mobile APs

  • Ubiquitous APs

  • Big Coverage APs

In today’s world, mobile Wi-Fi access points are the most prolific category of ‘pathological’ beacon. This includes access points mounted in cars, airplanes, buses, trains, and so on, as well as MiFi devices and smartphones being used as Wi-Fi access points. Once we have enough observations to see these mobile behaviors, we put the offender into what we call ‘Quarantine’, essentially removing it from our system to ensure that we don’t use that device for computing locations.

Similarly, there are misbehaving access points that have a ‘fake’ or ‘reused’ MAC address. While a MAC address is ‘supposed’ to be unique, MAC addresses are sometimes incorrectly reused. When we detect these types of devices, they also get relegated to the Quarantine, and if they are really bad, we’ll throw them in the isolation tank (just kidding, that would be cruel).

Another type of ‘pathological’ beacon that we look for is what we call a ‘Big Coverage’ Wi-Fi access point. This is an access point that exhibits a large range of signal propagation. This may be the result of it being located high up (say, on the 40th floor of a building) or because it has been overpowered. Rather than isolating these, we classify them, allowing them to continue to be used, but only for ‘Coarse Location’.
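
The different treatment of these categories can be sketched as a small classifier. This Python example is purely illustrative: the `BeaconStats` fields, the status names, and the `max_clusters` and `big_coverage_m` thresholds are assumptions made for the sketch, not values or logic taken from Skyhook's system.

```python
from dataclasses import dataclass
from enum import Enum


class BeaconStatus(Enum):
    PRECISE = "precise"          # usable for normal positioning
    COARSE_ONLY = "coarse_only"  # 'Big Coverage': usable only for Coarse Location
    QUARANTINE = "quarantine"    # excluded from location computation


@dataclass
class BeaconStats:
    distinct_clusters: int       # separate places the beacon has been observed
    coverage_radius_m: float     # spread of observations around its position
    duplicate_mac: bool = False  # same MAC seen far apart at the same time


def classify(stats, max_clusters=2, big_coverage_m=500.0):
    """Map per-beacon statistics to a usage status (hypothetical rules)."""
    # Beacons that keep turning up in new places, or whose MAC address is
    # evidently shared by several devices, cannot be trusted at all.
    if stats.duplicate_mac or stats.distinct_clusters > max_clusters:
        return BeaconStatus.QUARANTINE
    # A beacon heard over a very wide area is still informative,
    # but only at coarse granularity.
    if stats.coverage_radius_m > big_coverage_m:
        return BeaconStatus.COARSE_ONLY
    return BeaconStatus.PRECISE
```

For example, `classify(BeaconStats(distinct_clusters=12, coverage_radius_m=90.0))` returns `BeaconStatus.QUARANTINE`, while a single-location beacon with a 1,500 m spread is kept as `BeaconStatus.COARSE_ONLY`.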

Every week, we go through this full process, cleaning, updating, removing, characterizing and testing. Each week, we have a shiny new database with updated Wi-Fi, Cell, and IP location data available to power requests from our customers and end users.

Step 6: Do it again, and again, and again...

While I have focused this process on our Wi-Fi positioning system, we do the same thing for Cell positioning and IP positioning, using inbound observations to continually refresh, renew, correct, and improve our positioning system. And while those systems burn through the CPU cycles, we spend our human cycles adding features, improving our algorithms, creating new SDKs, slimming down our code for small devices such as wearables and IoT, and continuing to grow our suite of location-based tools.

We’re in the process of building some really cool stuff here at Skyhook, so my next blog post might take a little bit longer. But in that next post, I’ll discuss a bit about how we rolled our experience in positioning and location into the location-based contextual space and what we are doing to solve our customers’ problems in that arena.
