
Large-Scale Real-Time Media Monitoring with Entity Extraction


Many applications require users to monitor news from traditional news sources, social media, RSS newsfeeds, or any of the myriad other sources now available. Examples include:

  • A media company that needs to monitor global news in a multitude of languages;
  • Financial analysts seeking to track information of economic interest, such as stock price movements and exchange rates, as well as events around the world, such as political unrest, that could affect their portfolios;
  • A data provider looking to collect adverse news information to support due diligence and screening;
  • Medical researchers tracking disease outbreaks;
  • Companies with long international supply chains that need to monitor the environmental, political, and other events that could disrupt those chains.

What the data sources for all of these applications have in common is that they contain an enormous amount of unstructured data, far more than any purely manual review could handle. According to the International Data Corporation, new data is generated at a rate of up to 1.7 MB per person, per second. Even so, organizations can turn news and social media data to their advantage with Entity Extraction software, which finds the relevant information buried within these large volumes of unstructured data.

10 Things That Your Entity Extraction Software Should Be Able to Do

For Entity Extraction software to be truly useful for media monitoring, it is important that the software be able to:

  1. Automatically aggregate the news items that are on the same topic;
  2. Identify named entities within the news items, including, at a minimum, people, organizations, places, time expressions, and monetary amounts (plus any domain-specific terms relevant to technical or scientific areas such as epidemiology);
  3. Identify relationships between entities, such as the locations of companies, the relationships between companies, and the affiliations of people;
  4. Identify a large set of relevant event types such as political changes, crime, cyber security incidents, conflicts, and natural disasters, and also identify who the players are in the events (“Who did what to whom?”);
  5. Perform geotagging, that is, disambiguate and assign coordinates to the extracted places. Once place names are geotagged, the extracted information, including events, can be viewed on a map;
  6. Support the real-time generation of alerts for highly relevant news;
  7. Above all, allow users to perform a semantic search over all of the extracted information. Users need to be able to search for all events of a given type, e.g., all occurrences of disease outbreaks, political unrest, or violence;
  8. Be accurate as measured by standard metrics: maximize recall (i.e., have a low number of false negatives, or missed information) and precision (i.e., have a low number of false positives, or incorrect extractions);
  9. Be fast to support real-time monitoring;
  10. Be able to process data in multiple languages.
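
To make the accuracy metrics in item 8 concrete, here is a minimal Python sketch of how precision and recall could be computed for one news item by comparing extracted entities against a hand-annotated reference set. The function name, the (text, type) representation of an entity, and the sample data are all illustrative assumptions, not the behavior of any particular Entity Extraction product.

```python
from typing import Set, Tuple

# Assumption: an extracted entity is modeled as a (text, type) pair.
Entity = Tuple[str, str]


def precision_recall(predicted: Set[Entity], gold: Set[Entity]) -> Tuple[float, float]:
    """Return (precision, recall) for one set of extractions.

    precision = TP / (TP + FP): the share of extractions that are correct.
    recall    = TP / (TP + FN): the share of reference entities that were found.
    """
    true_positives = len(predicted & gold)    # extracted and correct
    false_positives = len(predicted - gold)   # extracted but incorrect
    false_negatives = len(gold - predicted)   # present in the text but missed

    precision = true_positives / (true_positives + false_positives) if predicted else 0.0
    recall = true_positives / (true_positives + false_negatives) if gold else 0.0
    return precision, recall


# Hypothetical reference annotations and system output for a single news item.
gold = {
    ("Acme Corp", "ORGANIZATION"),
    ("Geneva", "PLACE"),
    ("Jane Doe", "PERSON"),
    ("$5 million", "MONEY"),
}
predicted = {
    ("Acme Corp", "ORGANIZATION"),
    ("Geneva", "PLACE"),
    ("Jane Doe", "ORGANIZATION"),  # right span, wrong type: counts as a false positive
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.67 recall=0.50
```

In this example, two of the three extractions are correct (precision 0.67), while only two of the four reference entities were found (recall 0.50). The mistyped "Jane Doe" hurts both scores, which is why the two metrics are best tracked together.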

When your organization depends on critical news in order to flourish, it’s essential to use Entity Extraction software that goes beyond the usual standard.
