NetOwl Extractor
Entity Extraction
Entity extraction (aka named entity recognition or NER) finds entities such as person, place, organization, and product names in text using AI-based natural language processing and machine learning technologies. NetOwl offers highly accurate, fast, and scalable entity extraction in multiple languages.

Advanced Entity Extraction Powered by NLP
Entity extraction is a critical technology that turns large volumes of unstructured text data into semantically structured information, which is then exploitable by many data analytics applications such as search, link analysis, and business intelligence tools.
However, entity extraction capabilities vary greatly across tools, and different use cases call for different sets of capabilities. Does your use case require accurate entity extraction not only on well-formed text (e.g., news articles) but also on informal text (e.g., text messages, social media)? Does it require fast and reliable entity extraction for real-time processing and/or high scalability for data surges? Does it need an extensive set of entity types beyond the basic ones? Does your specific domain require customization?
NetOwl’s NLP-based entity extraction offers best-of-breed performance in terms of accuracy, speed, and scalability with the most comprehensive set of advanced capabilities, including relationship extraction, event extraction, and geotagging.
Key Product Features
Accurate
Provides state-of-the-art entity extraction accuracy even on noisy text.
Extensive Ontology Coverage
Extracts to a semantic ontology consisting of over 100 types of entities.
Customizable
Creator Edition (CE) enables the customization of existing entity types or addition of new entity types.
Multilingual
Supports multiple languages including English, Arabic, Chinese (traditional and simplified), French, German, Korean, Persian (Farsi and Dari), Russian, and Spanish.
Fast & Scalable
Extremely fast for real-time analysis. Highly scalable entity extraction software with Docker and Kubernetes support.
Easy Integration
Easy-to-integrate entity extraction product with a REST API. Pre-integrated with popular search and analytics tools like Elasticsearch and Esri ArcGIS.
The Challenges of Entity Extraction
NetOwl addresses a wide variety of entity extraction challenges including:
- Entity Type Ambiguity: The same name may refer to entities of different types depending on context:
  - Washington State (state) vs. Washington DC (city) vs. George Washington (person)
- Part of Speech Ambiguity: The same word may have multiple grammatical functions depending on context:
  - May Stevens (first name) vs. Tom May (last name) vs. May (month) vs. may happen (auxiliary verb)
- Semantic Ambiguity: The same word may have different meanings that result in an ambiguous context for entity extraction:
  - crane operator Chris Johnson (person) vs. tour operator XYZ Adventure (organization) vs. logical operator XOR (a function in programming)
- Natural Language Creativity: New names are constantly created for companies, products, and even personal and place names.
- Noisy Text: Unlike well-edited texts such as newspaper articles, casual language, which is common in social media, email, and texting, often lacks proper capitalization and punctuation, contains typos and misspellings, and may not be grammatical. Also, certain sources like OCR and ASR output frequently contain errors.
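To make the entity type ambiguity above concrete, here is a deliberately tiny Python sketch that guesses the type of "Washington" from the words next to it. It is an illustration only and says nothing about how NetOwl works internally: the function name `classify_washington`, the `PERSON_FIRST_NAMES` table, and the three context rules are all assumptions invented for this example.

```python
import re

# Hypothetical list of first names used as a context cue (illustrative only).
PERSON_FIRST_NAMES = {"george", "denzel"}


def classify_washington(sentence: str) -> str:
    """Guess the entity type of the first 'Washington' in a sentence
    by looking at the word immediately before and after it."""
    words = re.findall(r"[A-Za-z]+", sentence)
    if "Washington" not in words:
        return "unknown"
    i = words.index("Washington")
    prev_word = words[i - 1].lower() if i > 0 else ""
    next_word = words[i + 1].lower() if i + 1 < len(words) else ""

    if prev_word in PERSON_FIRST_NAMES:
        return "person"   # e.g., "George Washington"
    if next_word == "dc":
        return "city"     # e.g., "Washington DC"
    if next_word == "state":
        return "state"    # e.g., "Washington State"
    return "unknown"      # not enough context for these toy rules


if __name__ == "__main__":
    for s in ("George Washington crossed the Delaware.",
              "She moved to Washington DC last year.",
              "Apples are grown in Washington State."):
        print(s, "->", classify_washington(s))
```

Hand-written rules like these break down quickly on new names and noisy text, which is why the challenges listed above call for context-aware NLP rather than simple lookups.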
Extensive Ontology for Entity Extraction
With over 100 types of entities, NetOwl entity extraction offers a broad semantic ontology that goes beyond that of standard named entity extraction software. It includes people, various types of organizations (e.g., companies, governments), several types of places (e.g., countries, cities), addresses, artifacts, phone numbers, titles, etc. This expansive named entity recognition forms the foundation for more advanced relationship extraction and event extraction.
Domains include Business, Finance, Politics, Homeland Security, Law Enforcement, Military, National Security, and Social Media.
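One practical benefit of an ontology with broad types and more specific subtypes is that downstream applications can filter at whatever level of detail they need. The Python sketch below illustrates the idea with a hypothetical fragment of a type hierarchy. The type names, the `PARENT` table, and the helper functions are assumptions made for illustration and are not NetOwl's actual ontology labels or API.

```python
from typing import List, Optional, Tuple

# Hypothetical fragment of a type hierarchy (child -> parent).
# These labels are illustrative, not NetOwl's actual type names.
PARENT = {
    "company": "organization",
    "government": "organization",
    "country": "place",
    "city": "place",
}


def is_a(entity_type: str, ancestor: str) -> bool:
    """Return True if entity_type equals ancestor or is one of its subtypes."""
    current: Optional[str] = entity_type
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)  # None once we pass a top-level type
    return False


def filter_by_type(entities: List[Tuple[str, str]], ancestor: str) -> List[Tuple[str, str]]:
    """Keep only the (text, type) pairs whose type falls under ancestor."""
    return [(text, t) for text, t in entities if is_a(t, ancestor)]


if __name__ == "__main__":
    extracted = [("France", "country"), ("Paris", "city"), ("Acme Corp", "company")]
    print(filter_by_type(extracted, "place"))
```

A search facet could use the broad type ("place") while a geospatial layer asks only for the narrower one ("city"), without either needing a separate extraction pass.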
Multilingual Entity Extraction
NetOwl supports entity extraction in multiple languages, including:
- English
- Arabic
- Chinese (traditional and simplified)
- French
- German
- Korean
- Persian (Farsi and Dari)
- Russian
- Spanish
The same comprehensive entity ontology available in English is also available for all the other languages, making multilingual and cross-lingual text analytics applications easier to develop.
Coreference Resolution for Advanced Entity Extraction
Entity Extraction is not only about identifying Named Entities, that is, names of people, organizations, places, etc., but also about identifying which names are mentions of the same entity. Named entities often appear in various forms, from full names (e.g., Katherine Isabel Jones) to partial names such as last name only (e.g., Jones), first name only (e.g., Katherine), no middle name (e.g., Katherine Jones), and acronyms (e.g., FBI for Federal Bureau of Investigation, NYC for New York City).
Entity Extraction recognizes which names refer to the same entity via a complex process called Coreference Resolution. In addition to recognizing possible name variants, Coreference Resolution takes into account other factors such as:
- Gender. For instance, a last name like “Jones” in the context “Mr. Jones” will be assigned male gender and will therefore not be treated as a possible reference to “Katherine Jones”.
- Distance. The further away a name is from the closest previous mention (i.e., the so-called antecedent), the less likely it is that they refer to the same entity.
Coreference Resolution is required for more complete and accurate Entity Extraction. It is also needed to calculate entity frequencies, which determine saliency and document topics, and to capture an entity’s correct and complete relationship and event information from content scattered throughout a document (e.g., “General Dynamics… GD purchased Praxis in 2018”).
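The way name variants, gender, and distance interact can be illustrated with a toy Python sketch. It is not NetOwl's algorithm. The `Mention` class, the lookup tables, the `MAX_DISTANCE` cutoff, and the matching rule are all assumptions invented for this example, and acronym variants are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical cue tables (illustrative only).
TITLE_GENDER = {"mr.": "male", "mrs.": "female", "ms.": "female"}
FIRST_NAME_GENDER = {"katherine": "female", "john": "male"}
MAX_DISTANCE = 500  # assumed cutoff, in characters


@dataclass
class Mention:
    text: str
    offset: int  # character position in the document


def name_tokens(m: Mention) -> List[str]:
    """Lowercased name words, with titles such as 'Mr.' removed."""
    return [t for t in m.text.lower().split() if t not in TITLE_GENDER]


def gender(m: Mention) -> Optional[str]:
    """Gender hinted by a title or a known first name, if any."""
    for t in m.text.lower().split():
        if t in TITLE_GENDER:
            return TITLE_GENDER[t]
        if t in FIRST_NAME_GENDER:
            return FIRST_NAME_GENDER[t]
    return None


def is_variant(short: Mention, full: Mention) -> bool:
    """True if every word of the shorter name occurs in the fuller name."""
    full_tokens = name_tokens(full)
    return all(t in full_tokens for t in name_tokens(short))


def find_antecedent(mention: Mention, previous: List[Mention]) -> Optional[Mention]:
    """Return the nearest earlier mention that is compatible, or None."""
    for cand in sorted(previous, key=lambda c: mention.offset - c.offset):
        if mention.offset - cand.offset > MAX_DISTANCE:
            break  # candidates are sorted nearest first, so the rest are farther
        if not is_variant(mention, cand):
            continue
        g1, g2 = gender(mention), gender(cand)
        if g1 and g2 and g1 != g2:
            continue  # e.g., "Mr. Jones" cannot refer to "Katherine Jones"
        return cand
    return None


if __name__ == "__main__":
    earlier = [Mention("Katherine Isabel Jones", 0)]
    print(find_antecedent(Mention("Jones", 40), earlier))
    print(find_antecedent(Mention("Mr. Jones", 40), earlier))
```

In this sketch, a bare "Jones" shortly after "Katherine Isabel Jones" is linked to it, while "Mr. Jones" is rejected on gender and a mention far beyond the assumed cutoff is rejected on distance. A production system would weigh such factors rather than apply hard cutoffs.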
Name Normalization
NetOwl not only performs entity extraction but also assigns normalized forms to extracted person, organization, and place names, taking into account capitalization, acronyms, abbreviations, nicknames, etc. When Smart Geotagging is used, place names are both disambiguated and normalized. Name normalization is ideal for cross-document name resolution for applications such as faceted search, and various forms of intelligence analysis such as geospatial analysis and link analysis.
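As a rough illustration of why normalized forms help cross-document applications such as faceted search, the Python sketch below maps surface variants to a single form and counts them. It is a simplified stand-in, not NetOwl's normalization logic. The `ACRONYMS` and `NICKNAMES` tables and both function names are assumptions for this example.

```python
from collections import Counter
from typing import Iterable

# Hypothetical lookup tables (illustrative only).
ACRONYMS = {"fbi": "Federal Bureau of Investigation", "nyc": "New York City"}
NICKNAMES = {"kate": "Katherine", "bill": "William"}


def normalize(name: str) -> str:
    """Map a surface form to one normalized form.

    Handles capitalization, dotted acronyms, and nicknames in a
    very naive way; real normalization is far more involved.
    """
    key = name.strip().lower()
    acronym_key = key.replace(".", "")  # "f.b.i." -> "fbi"
    if acronym_key in ACRONYMS:
        return ACRONYMS[acronym_key]
    words = [NICKNAMES.get(w, w.capitalize()) for w in key.split()]
    return " ".join(words)


def facet_counts(names: Iterable[str]) -> Counter:
    """Count mentions per normalized name, as a search facet would."""
    return Counter(normalize(n) for n in names)


if __name__ == "__main__":
    mentions = ["FBI", "F.B.I.", "fbi", "kate JONES", "Katherine Jones"]
    for name, count in facet_counts(mentions).items():
        print(name, count)
```

Without normalization, the five mentions in the usage example would show up as five separate facet values instead of two.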
NetOwl integrates easily with many popular search, geospatial, and business intelligence tools such as Elasticsearch and Esri ArcGIS.
Smart Name Translation
NetOwl Extractor’s Smart Name Translation provides English translations of named entities extracted from foreign-language texts written in other alphabets or scripts, such as Arabic, Chinese (traditional and simplified), Korean, Persian (Farsi and Dari), and Russian. Smart Name Translation gives monolingual users an effective way to search and gist foreign-language content.
Deploy on Premise or in the Cloud

Entity Extraction Solutions

Intelligence Analysis

Enterprise Search

E-discovery

Featured Blog Posts

What is Entity Extraction?
What Does Entity Extraction Do? It is commonly said that about 80% of all data is unstructured data, which means it…

How to Choose an Entity Extraction Product
From accuracy to coverage, scalability, customization, and others. There are quite a few factors to consider when choosing an entity…

Entity Extraction for Knowledge Discovery
Organizations of all types need to discover critical knowledge contained in overwhelming, ever larger data volumes. That’s where Entity Extraction…
Frequently Asked Questions
- What are the advantages of NetOwl Entity Extraction over other commercial or open-source entity extraction software?
NetOwl Extractor offers several advantages:
- Unparalleled breadth and depth of the out-of-the-box entity ontology;
- High accuracy even with documents that are not well edited and lack proper capitalization or punctuation (e.g., social media);
- High throughput. When evaluators test various entity extraction software products, they often tell us that NetOwl Extractor processes input at least 10 times faster while producing more accurate and detailed output.
- How does NetOwl Extractor’s out-of-the-box entity ontology compare with alternatives?
NetOwl Entity Extraction offers a rich hierarchical entity ontology consisting of over 100 types and subtypes of entities. It includes broad concepts like person, organization, place, artifact, address, and numeric entities at the top level while also providing more specific subtypes (e.g., country, province, city, etc. under place).
- Can NetOwl extraction be customized?
Yes, NetOwl offers both simple and advanced customization options to extract new entity types and/or expand the coverage of the existing entity types.