Nightmare on GIS Street: Accuracy, Datums, and Geospatial Data

March 5, 2013 - By Eric Gakstatter

This subject scares me. I’m not a trained geodesist. I’m not a mathematician. Yet, I’d be derelict in my duty if I didn’t write about this subject. I know enough to be dangerous, and enough to know this subject is going to be a nightmare for people managing geospatial databases.

Headache today, nightmare tomorrow

The only reason it’s not a nightmare today is because most of you don’t know it’s even a problem. Or, you know it’s a problem, but let it slide because dealing with it is not easy. It’s going to get worse in the future, much worse. It’s going to get worse because sensors (GPS, GNSS, imagery, etc.) are getting much more accurate. The cost of acquiring high-precision (centimeter-level) data, whether it’s via GNSS, scanning, or some other technology, is falling hard and fast. As I’ve written before, high-precision GNSS receivers are getting much cheaper. Geodata 2.0 is coming, and it is making existing geospatial databases look like my kids’ coloring books.

It reminds me of an experience I had nearly 20 years ago.

I was traveling through the southeastern U.S. demonstrating a new GPS mapping handheld that I helped develop. Mind you, this was in the early days of GPS mapping. WAAS/SBAS didn’t exist, sub-meter receivers didn’t exist, CORS didn’t exist, and real-time corrections were only a dream so almost everyone post-processed using a local base station, if they could find one — and achieving 1-3 meter accuracy was pretty dang good.

I was showing this new GPS mapping receiver to a forestry company that owned a lot of land in the southeast. We traversed a ~40 acre parcel of land, brought it back to the office and post-processed the data against a nearby GPS base station. After post-processing, the data looked very clean and I was eager to see it inserted into the company’s GIS, hoping it would slide into the right spot in the GIS and they would purchase a bunch of GPS units. No dice. When it was inserted into their GIS, the perfectly shaped polygon fit imperfectly into the GIS. It didn’t match up with the orthophotos and it didn’t match up with their existing vector data (point/line/polygon). It was offset enough to make you raise your eyebrows and think to yourself — hmmm, that’s a problem.

Of course, I did my due diligence by checking the integrity of the GPS base station data I used and verified its surveyed antenna location. Everything checked out. I was confident that my data was accurate. I started questioning the GIS manager about the horizontal datum used in their GIS database. It quickly became clear to me that the enterprise GIS database was referenced to something different than the modern horizontal datum of that era. It was also clear that there were varying types of accurate and less accurate data in the GIS. A mish-mash of geographic data with some of it based on the legacy NAD27 horizontal datum that was transformed to NAD83/86 using NADCON or something similar.

After discussing this a bit with the GIS manager, he admitted that my GPS data was likely more accurate than his GIS database, but he was clear that “I’m not going to readjust my entire GIS database for your GPS unit.” My counter-argument that “you’re going to have to do it eventually anyway” was met with “I honestly don’t see this happening anytime soon.”

I may have won the battle, but lost the war.

Later that same year, I had a similar experience in California. A major environmental consulting company wanted to delve into using GPS for mapping. I sent them one of my GPS units to try. After a few days of the company collecting GPS data and post-processing, I got the call.

“Your GPS unit isn’t accurate enough for our work.”

Whaaaat? From the outset, I was clear to them that the GPS unit would deliver accuracy within 1-3 meters, and they stated this was acceptable accuracy to them. I looked at the data. It was clean and point averages were tight. It looked very good. I verified the GPS base station they were using. No problems there.

“What are you comparing the GPS data to?” I asked.

“USGS 7.5’, 1:24,000 scale topo maps,” he replied.

Ruh roh.

Me: Wellllll, you know that USGS 7.5’ topos are referenced to NAD27 and have gross errors up to hundreds of feet in some places, especially rural areas, don’t you?

Him: We’ve used 7.5’ topo maps for many years and feel good about the accuracy they provide. Your GPS data is on the wrong side of the creek.

Me: Hmmm, how about you go occupy a survey mark with known coordinates and compare the GPS data to the survey mark coordinates? That will tell you how accurately the GPS is performing.

Him: We need it to work where we work, and it’s giving us data on the wrong side of the creek. Thanks for your time. Click.

Sigh, lost the battle, and lost the war.

After nearly 25 years in the GPS/GNSS and GIS industries, data mismatch (“my data doesn’t line up”) is still the #1 question I get from people.

The problem is two-fold.

People, even educated geospatial professionals, have a general lack of understanding of the different horizontal datums being used (not to mention vertical datums).
Software vendors (even the major ones) have generally done a poor job of keeping up with modern datum transformations. Most software makes it easy to transform data from one horizontal datum to another, but mostly does it wrong.

The errors can vary from a few centimeters to a few meters to tens of meters. In the world of GPS data collection, the most common datum transformation error is when software considers WGS-84 equivalent to NAD83 and applies no transformation when, in reality, the latest version of NAD83 differs from the latest version of WGS-84 by more than a meter in most parts of the USA.
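For readers who want to see what "applies no transformation" means in practice, here is a minimal sketch of a seven-parameter (Helmert) similarity transformation on Earth-centered (ECEF) coordinates, compared against the identity that software effectively uses when it treats two datums as equivalent. The function name `helmert_7p`, the position-vector rotation convention, the sample point, and every parameter value are assumptions chosen for illustration; they are not published NAD83 or WGS-84 parameters.

```python
import math

def helmert_7p(xyz, tx, ty, tz, rx, ry, rz, scale_ppb):
    """Apply a small-angle 7-parameter similarity transformation.

    xyz        : (X, Y, Z) Earth-centered coordinates in meters
    tx, ty, tz : translations in meters
    rx, ry, rz : rotations in radians (position-vector convention, assumed)
    scale_ppb  : scale change in parts per billion
    """
    x, y, z = xyz
    s = 1.0 + scale_ppb * 1e-9
    # Small-angle rotation matrix applied to the input vector
    xr = x - rz * y + ry * z
    yr = rz * x + y - rx * z
    zr = -ry * x + rx * y + z
    return (tx + s * xr, ty + s * yr, tz + s * zr)

# "No transformation": every parameter is zero, so the point is unchanged.
IDENTITY = dict(tx=0.0, ty=0.0, tz=0.0, rx=0.0, ry=0.0, rz=0.0, scale_ppb=0.0)

# Arbitrary placeholder parameters (NOT published values).
MAS = math.radians(1.0 / 3_600_000.0)  # one milliarcsecond in radians
PLACEHOLDER = dict(tx=1.0, ty=-2.0, tz=-0.5,
                   rx=20 * MAS, ry=10 * MAS, rz=10 * MAS, scale_ppb=0.0)

point = (-2_500_000.0, -4_500_000.0, 3_700_000.0)  # hypothetical ECEF point

skipped = helmert_7p(point, **IDENTITY)      # what the software did
applied = helmert_7p(point, **PLACEHOLDER)   # what it should have done
error_m = math.dist(skipped, applied)
print(f"Position error from skipping the transformation: {error_m:.2f} m")
```

Real parameter sets also come with a stated rotation sign convention, and mixing up the two common conventions is its own source of error, so check the convention before plugging in published values.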

In this day of ever-increasing availability of public GIS data, it’s soooo easy to download an orthophoto (ortho-rectified aerial photograph), or GIS vector data from a public website and import it into your GIS. When importing, you’ll likely be asked to select the horizontal datum and the measurement units of the new data. More than likely, information about the new GIS data (metadata) isn’t readily obvious or available so you make your best guess from the list of choices presented. Is the data referenced to NAD83/86? Is it referenced to NAD83/HARN? Is it referenced to WGS-84? If so, which version of WGS-84? Your selection might significantly affect the accuracy of imported features in your GIS. What if you make the wrong selection with an aerial photo, then months or years later you have someone digitize (heads-up with a mouse) road centerlines, fire hydrants, manhole covers, etc., based on that aerial photo? Any transformation error you introduced when importing the original aerial photo will carry through to the digitized features.

The good news is that GIS software makes it very easy to import raster (images) and vector (points/polylines/polygons) data. That’s also the bad news. With a few clicks of a mouse, your GIS database can be infected with data you think is accurate to a certain level, but it’s really not, maybe due to the way you imported the data. I’m not saying that every piece of data imported into a GIS needs to be a certain (or the same) accuracy level. The problem is if you don’t keep track of the metadata for items that you import into your database, you will quickly lose a grip on the accuracy integrity of your GIS. As GIS data becomes more accurate, as I discussed above, the accuracy disparity among different layers in your GIS will increase. In other words, the problem will become bigger than it is today.
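One way to picture "keeping track of the metadata" is a small record attached to every layer at import time, plus a check that flags layers whose reference frame was never written down. The sketch below is only an illustration under my own assumptions: the `LayerMetadata` fields, the helper names, and the sample layers are hypothetical and do not correspond to any particular GIS product's schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LayerMetadata:
    name: str
    datum: str                    # e.g. "NAD27", "NAD83", "WGS-84"
    realization: Optional[str]    # e.g. "86", "HARN", "2011"; None if unknown
    epoch: Optional[float]        # decimal year of the coordinates; None if unknown
    accuracy_m: Optional[float]   # estimated horizontal accuracy in meters
    source: str                   # where the data came from

def reference_frame_known(layer):
    """A frame is only usable at high accuracy if realization and epoch are recorded."""
    return layer.realization is not None and layer.epoch is not None

def frames_match(a, b):
    """True only when both frames are fully documented and identical."""
    if not (reference_frame_known(a) and reference_frame_known(b)):
        return False
    return (a.datum, a.realization, a.epoch) == (b.datum, b.realization, b.epoch)

def audit(layers):
    """Return the names of layers whose reference frame is undocumented."""
    return [layer.name for layer in layers if not reference_frame_known(layer)]

# Hypothetical layers for illustration
hydrants = LayerMetadata("hydrants", "NAD83", "2011", 2010.0, 0.02, "RTK survey")
orthophoto = LayerMetadata("orthophoto", "NAD83", None, None, None, "public download")

print(audit([hydrants, orthophoto]))
print(frames_match(hydrants, orthophoto))
```

Treating an unknown realization as "does not match" is a deliberately conservative design choice: it forces someone to investigate before a guessed datum gets baked into digitized features.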

I’ll give you a scenario I’m involved with now that highlights this challenge. I used a pseudonym for the company and have embellished a bit to emphasize some points, but the basic facts are correct.

ABC Company has tens of thousands of small infrastructure assets in the field across the U.S. It already has the desired location accuracy (within 30 cm, or 1 foot) on some of them. For the remaining assets, the company wants to improve the accuracy of the features. To do this, the company plans to use GPS/GNSS receivers to collect position and attribute information on the assets. A second requirement is to deploy GPS/GNSS receivers capable of sub-meter accuracy to navigate back to assets when necessary.

They are now in the initial phase of testing various GPS/GNSS receivers.

Their first try was using a handheld GNSS receiver capable of “sub-foot” accuracy and post-processing against GPS CORS. It didn’t take long for them to figure out the workflow was a headache. I agree, the whole GPS post-processing workflow is so last decade (and mind you, I helped design one of the first Windows-based GPS post-processing software programs back in the 1990s).

For the second iteration, the workflow was much smoother. They used a GNSS receiver that utilized real-time WAAS corrections for sub-meter accuracy. The workflow was smooth due to real-time GNSS data being brought directly into ArcGIS Mobile in the field. The problem was accuracy. All of the coordinates collected during the testing were offset to the northwest by about 3 feet. Precision was great, but accuracy was unacceptable. Was it a problem with the GNSS receiver? No. When GPS/GNSS data is shifted consistently in one direction when compared to other data, it is almost always due to a difference in horizontal datums. In this case, it didn’t take long to determine that the difference was data referenced to ITRF (WAAS) vs. NAD83 (basemap). However, we had to do a little more investigation to understand which version of NAD83 was being used in order to find the best horizontal datum transformation choice in ArcGIS Mobile. It wasn’t obvious, not by a long shot. In fact, it was downright cryptic. There wasn’t a datum transformation labeled “WAAS” or anything close to it. As an example, one of the transformation names was cryptically named NAD_1983_To_WGS_1984_5. What does that mean? Which version of NAD83? Which version of WGS-1984? What does _5 mean?
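The "precise but not accurate" pattern described above can be checked numerically: compare the mean offset between collected points and reference points (the bias) with the scatter of the individual offsets around that mean. The sketch below is a rough illustration with made-up coordinates in feet; the function names and the 3-to-1 threshold in `looks_systematic` are my own assumptions, not an industry rule.

```python
import math
import statistics

def offset_summary(measured, reference):
    """Summarize offsets between paired (easting, northing) points in the same units."""
    de = [m[0] - r[0] for m, r in zip(measured, reference)]
    dn = [m[1] - r[1] for m, r in zip(measured, reference)]
    mean_e = statistics.fmean(de)
    mean_n = statistics.fmean(dn)
    # Bias: length of the average offset vector
    bias = math.hypot(mean_e, mean_n)
    # Scatter: RMS of the horizontal residuals about the average offset
    scatter = math.sqrt(statistics.fmean(
        [(e - mean_e) ** 2 + (n - mean_n) ** 2 for e, n in zip(de, dn)]))
    # Direction of the average offset, degrees clockwise from north
    azimuth = math.degrees(math.atan2(mean_e, mean_n)) % 360.0
    return {"mean_e": mean_e, "mean_n": mean_n,
            "bias": bias, "scatter": scatter, "azimuth_deg": azimuth}

def looks_systematic(summary, ratio=3.0):
    """Flag a likely datum problem when the bias dwarfs the scatter (assumed threshold)."""
    return summary["bias"] > ratio * summary["scatter"]

# Made-up example in feet: every point lands about 3 ft to the northwest.
reference = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
measured = [(-2.2, 2.0), (98.0, 2.2), (97.9, 102.1), (-2.1, 102.1)]

summary = offset_summary(measured, reference)
print(f"bias {summary['bias']:.2f} ft toward {summary['azimuth_deg']:.0f} deg, "
      f"scatter {summary['scatter']:.2f} ft, "
      f"systematic: {looks_systematic(summary)}")
```

A large bias with a small scatter points at the reference frame rather than the receiver; a large scatter with little bias points the other way.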

With some investigation and experimenting with different transformation choices, we finally got it dialed in to a reasonable level. Remember, we were only looking for sub-meter accuracy so ~10 cm of datum transformation error here or there wasn’t significant. Even if we didn’t make the perfect transformation choice, we were close enough. However, the investigation and experimenting drill (locate a high-integrity survey mark nearby and occupy it) was painfully time-consuming, more than it should have been.

The next step, setting up the workflow for the “sub-foot” mapping GPS/GNSS receivers, wasn’t as easy. First of all, instead of using WAAS as a correction source (not accurate enough), they used an RTK network. The network base stations were tied to the latest version of NAD83, which is NAD83/2011. The challenges were a bit different than testing the datum transformation for the sub-meter equipment: this time they wanted to dial in the datum transformation as closely as possible. Again, the datum transformation selection choices in ArcGIS Mobile were cryptic. But this wasn’t the only challenge. Since they were using an RTK GPS/GNSS receiver capable of 1-2 cm accuracy, errors within the different GIS layers emerged. Some layers were referenced to NAD83/2011, which was perfect, while other layers were referenced to much older versions of NAD83. To the software’s credit, an alarm popped up noting the difference in datums of the older layers, but it didn’t give them any guidance as to how they should proceed. In that case, Cancel is the normal response and is what they selected.

After experimenting and testing the different datum transformations in ArcGIS Mobile, they found the one that seemed to produce the best results (confirmed by testing against a high-integrity survey mark). All in all, a very time-consuming process spread out over a few weeks.
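The trial-and-error described here boils down to occupying a survey mark, running the observation through each candidate transformation, and ranking the candidates by how far each result lands from the mark's published coordinates. Here is a minimal sketch of that bookkeeping. The candidate names and all coordinates are invented for illustration; they are not actual ArcGIS Mobile transformation names or real survey data.

```python
import math

def rank_candidates(known_mark, candidate_results):
    """Rank candidate transformations by horizontal error at a survey mark.

    known_mark        : (easting, northing) published for the mark, in meters
    candidate_results : dict mapping a candidate name to the (easting, northing)
                        obtained at the mark after applying that candidate
    Returns a list of (name, error_m) sorted from smallest error to largest.
    """
    ranked = [
        (name, math.hypot(e - known_mark[0], n - known_mark[1]))
        for name, (e, n) in candidate_results.items()
    ]
    return sorted(ranked, key=lambda item: item[1])

# Invented numbers: one mark, three hypothetical candidates
mark = (500000.000, 4000000.000)
results = {
    "candidate_a": (500000.850, 4000000.620),
    "candidate_b": (500000.015, 3999999.990),
    "candidate_c": (499999.700, 4000000.400),
}

for name, error_m in rank_candidates(mark, results):
    print(f"{name}: {error_m:.3f} m")
```

A single occupation of a single mark cannot separate transformation error from receiver noise, so repeating the test on more than one mark and averaging several occupations gives a more trustworthy ranking.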

A challenge that still remains is “hot-swapping” between using the RTK Network (NAD83/2011) or WAAS (ITRF08) as a source of GPS/GNSS corrections. ArcGIS Mobile doesn’t seem to deal very well with changes in the incoming GPS/GNSS datum on the fly (in the field).

If, after reading the above, you’re confused or feel the need to read it again to understand it, welcome to the club. Plenty of brainpower was spent sorting out this problem and verifying the solution. When your GIS has plenty of slop in it, no worries. When you start dissecting it at the centimeter level, you’ll really be forced to take a microscope to each data layer and all of a sudden metadata becomes very important.

This article is just an introduction to the challenge of dealing with disparate horizontal datums in your GIS. As the programmer for datum transformation at a major GIS software manufacturer said, “We are moving into a new era” in dealing with datum transformations. Although I mention Esri software in this article, other leading software vendors aren’t doing any better. I discussed the issue of supporting the 14-parameter transformation between NAD83/2011 and ITRF08 with another major software vendor late last year. Their CEO’s response? “Yeah, we just had an internal meeting on that and need to support it.” Whaaaat? I wonder how his thousands of users utilizing WAAS as a source of GPS corrections have been handling this in the past 10 years. Not surprisingly, they aren’t the only major geospatial software vendor that is falling down in this area. More than likely the software you use isn’t handling this correctly.
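For anyone wondering where the number 14 comes from: it is the seven parameters of a similarity transformation (three translations, three rotations, one scale) plus a rate of change for each, all tied to a reference epoch. The sketch below shows only that bookkeeping, evaluating each parameter at a chosen date. The class and function names are my own, and the values and epochs are placeholders, not the published NAD83/2011 to ITRF08 parameters.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class TimeDependentParameter:
    value_at_ref: float    # parameter value at the reference epoch
    rate_per_year: float   # how much the parameter changes per year

    def at(self, ref_epoch, epoch):
        """Linearly propagate the parameter from ref_epoch to epoch (decimal years)."""
        return self.value_at_ref + self.rate_per_year * (epoch - ref_epoch)

def parameters_at_epoch(params, ref_epoch, epoch):
    """Evaluate all seven parameters at the requested epoch."""
    return {name: p.at(ref_epoch, epoch) for name, p in params.items()}

# Seven parameters x (value, rate) = 14 numbers. All values are placeholders.
REF_EPOCH = 1997.0
PARAMS = {
    "tx": TimeDependentParameter(1.0, 0.001),     # meters, meters/year
    "ty": TimeDependentParameter(-2.0, -0.001),
    "tz": TimeDependentParameter(-0.5, 0.0),
    "rx": TimeDependentParameter(1.0e-7, 1.0e-10),  # radians, radians/year
    "ry": TimeDependentParameter(5.0e-8, -1.0e-9),
    "rz": TimeDependentParameter(5.0e-8, 0.0),
    "scale": TimeDependentParameter(0.0, 0.0),    # unitless
}

at_2013 = parameters_at_epoch(PARAMS, REF_EPOCH, 2013.0)
print(at_2013["tx"], at_2013["ry"])
```

The resulting seven values are what a conventional seven-parameter transformation would then use, which is why the date attached to the coordinates matters: without it there is no way to know which set of seven to apply.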

Lastly, in speaking with Michael Dennis at the U.S. National Geodetic Survey, he said that while the 14-parameter transformation algorithm is important, the step that people are leaving out is reconciling epoch dates of the data. Why is a date stamp of the data important? That’s the focus of my next article on this subject.

Thanks, and see you next time.