Why Simple Geotagging Software Isn't Enough to Create a Map from Text

September 29, 2017 | Enterprise Search, Geotagging, Intelligence Analysis


For thousands of years, cartographers have drawn maps to help us define and navigate the physical world. And while there are few places left on Earth that have not been mapped, the dawn of the digital world has created an entirely new way to map, visualize, and analyze the world around us. In particular, geospatial analysis via Geographic Information Systems (GIS), which render spatial data on maps, is now used in almost every industry, from environmental science and epidemiology to defense, disaster management, utilities, and natural resources. Naturally, geospatial analysis depends on the spatial data it is based on: the more complete and accurate the spatial data, the more complete and accurate the analysis.

Traditionally, geospatial analysis is based on already structured spatial data (e.g., a database of oil wells with coordinates). There is, however, a wealth of spatial information to exploit in unstructured data, and in particular in text (e.g., news, research reports, social media). In fact, by commonly cited estimates, unstructured data makes up about 80% of all data, and it continues to grow rapidly.

Exploiting text data requires geotagging software that automatically identifies spatial references (e.g., place names) in text and assigns latitude and longitude values to them.

Simple geotagging vs. advanced geotagging software

Not all geotagging software is created equal, though. Simple geotagging software relies on gazetteers (geographical dictionaries) and pays minimal or no attention to context. There are three areas where simple geotagging is inadequate:

  1. Language is ambiguous. For instance, the word “London” may refer to the city of London, England, or to the author Jack London. Simple geotagging tools do not perform this kind of disambiguation and so produce a large number of false positives, which can lead to incorrect geographical analysis.
  2. Place names themselves are often ambiguous. For instance, simple geotagging tools will not determine whether “Springfield” refers to Springfield, Massachusetts or Springfield, Illinois.
  3. Geo-codable information is not limited to place names. Other entities (e.g., people, organizations) and even events can be geocoded. Basic geotagging tools will not recognize those other entities and events in text.
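
The first two failure modes are easy to reproduce with a minimal sketch of a context-free gazetteer lookup. Everything here is an assumption made for illustration: the toy `GAZETTEER` table, its approximate coordinates, and the `naive_geotag` function are hypothetical and do not represent any particular product.

```python
import re

# Hypothetical toy gazetteer: place name -> candidate (label, lat, lon) entries.
# Coordinates are approximate and included only for illustration.
GAZETTEER = {
    "London": [("London, England", 51.5074, -0.1278)],
    "Springfield": [
        ("Springfield, Massachusetts", 42.1015, -72.5898),
        ("Springfield, Illinois", 39.7817, -89.6501),
    ],
}


def naive_geotag(text):
    """Tag every gazetteer name found in the text, ignoring context."""
    hits = []
    for name, candidates in GAZETTEER.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            # With no context to go on, the tagger simply takes the first candidate.
            label, lat, lon = candidates[0]
            hits.append({
                "offset": match.start(),
                "name": name,
                "label": label,
                "lat": lat,
                "lon": lon,
                "candidate_count": len(candidates),
            })
    return sorted(hits, key=lambda hit: hit["offset"])


# False positive: the author's surname is tagged as the city.
print(naive_geotag("Jack London wrote The Call of the Wild."))

# Unresolved ambiguity: two candidates exist, and the first one wins arbitrarily.
print(naive_geotag("The state fair in Springfield opens today."))
```

A lookup like this has no way to tell a person from a place or to choose between same-named places, which is why context-aware analysis is needed on top of the dictionary.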

NetOwl’s Smart Geotagging and rich semantic extraction capabilities work together to correctly identify and disambiguate place names and other geo-codable pieces of information. NetOwl analyzes the content of the text using advanced Natural Language Processing to help differentiate ambiguous geo-codable entities.

In addition to disambiguation, NetOwl’s Smart Geotagging is also able to:

  • Output a confidence ranking for each geotagged location to convey how likely the assigned location is to be correct
  • Assign latitude and longitude values to relative location phrases (e.g., “a town 30km northeast of Paris”)
  • Convert Military Grid Reference System (MGRS) coordinates to latitude and longitude values
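
To make the relative-location idea concrete, here is a rough sketch of how a phrase such as “a town 30km northeast of Paris” could be turned into coordinates. It is not NetOwl’s method: it assumes a spherical Earth, a hypothetical `ANCHORS` table with an approximate coordinate for Paris, a deliberately narrow regular expression, and the standard great-circle destination-point formula.

```python
import math
import re

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation

# Compass words mapped to bearings in degrees clockwise from north.
BEARINGS = {
    "north": 0, "northeast": 45, "east": 90, "southeast": 135,
    "south": 180, "southwest": 225, "west": 270, "northwest": 315,
}

# Hypothetical anchor lookup; a real system would resolve the place name first.
ANCHORS = {"Paris": (48.8566, 2.3522)}

# Matches phrases such as "30km northeast of Paris" (illustrative pattern only).
PHRASE = re.compile(
    r"(\d+(?:\.\d+)?)\s*km\s+(" + "|".join(BEARINGS) + r")\s+of\s+([A-Z]\w+)"
)


def offset_point(lat, lon, distance_km, bearing_deg):
    """Return the point reached by travelling distance_km along bearing_deg."""
    lat1, lon1 = math.radians(lat), math.radians(lon)
    bearing = math.radians(bearing_deg)
    angular = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    return math.degrees(lat2), math.degrees(lon2)


def resolve_relative(phrase):
    """Resolve a relative location phrase to (lat, lon), or None if unsupported."""
    match = PHRASE.search(phrase)
    if match is None or match.group(3) not in ANCHORS:
        return None
    distance_km = float(match.group(1))
    lat, lon = ANCHORS[match.group(3)]
    return offset_point(lat, lon, distance_km, BEARINGS[match.group(2)])


print(resolve_relative("a town 30km northeast of Paris"))
```

The result is a point estimate for a phrase that is inherently approximate, which is one reason a confidence value alongside the coordinates is useful.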

To perform geospatial analysis, simple geotagging software isn’t enough. This kind of basic software often produces false positives and false negatives because it fails to take context into account. NetOwl’s Smart Geotagging and rich semantic extraction offer superior precision by performing disambiguation and superior coverage by significantly expanding the range of geo-codable information.