NetOwl DocMatcher
Overview
NetOwl DocMatcher Version 2 is the latest version of SRA's document comparison and categorization software. Using advanced linguistic features together with robust machine learning algorithms, DocMatcher supports a variety of question-answer matching tasks, ranging from intelligence analysis applications that match incoming reports against established information needs to Customer Relationship Management (CRM), patent analysis, and resume routing. DocMatcher is also used to identify both exact duplicate and near-duplicate documents. Unlike other trainable categorization products, DocMatcher overcomes the bottleneck of requiring training documents. It ranked highest, by a significant margin, among the categorization tools evaluated in a large-scale operational benchmark conducted by a large Government Agency.
Product Features
- Provides, for each target document, a ranked list of the most similar documents, each with a similarity score.
- Provides the ability to incrementally update the trained model.
- Leverages information extraction technology for advanced linguistic features to aid in similarity computation.
- Provides built-in thesauri/stop word lists as well as capabilities to add custom thesauri and stop word lists.
- Allows combination of matching results from multiple document sections into a single overall similarity score.
- Provides advanced "FeatureSet" processing to require that particular document features be present.
- Provides high-throughput evaluations through highly efficient comparison algorithms working on in-memory document models.
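
To make two of the features above more concrete (a ranked list of similar documents with scores, and the combination of per-section matches into one overall score), here is a minimal Python sketch. It does not use DocMatcher's API and is not its algorithm: it substitutes plain bag-of-words cosine similarity for the product's linguistic features, and every name in it (`STOP_WORDS`, `term_vector`, `combined_score`, `rank_similar`, the section names, and the weights) is a hypothetical choice made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def term_vector(text):
    """Lowercase the text, split it into word tokens, and count non-stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_score(target, candidate, section_weights):
    """Weighted average of per-section cosine similarities.

    Documents are dicts mapping a section name to its text. The weights are
    an assumption of this sketch and must sum to a positive value.
    """
    total_weight = sum(section_weights.values())
    score = 0.0
    for section, weight in section_weights.items():
        target_vec = term_vector(target.get(section, ""))
        candidate_vec = term_vector(candidate.get(section, ""))
        score += weight * cosine(target_vec, candidate_vec)
    return score / total_weight


def rank_similar(target, candidates, section_weights):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, combined_score(target, doc, section_weights))
        for doc_id, doc in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    target = {"title": "patent analysis software",
              "body": "software for matching patent documents"}
    candidates = {
        "doc-1": {"title": "patent search tools",
                  "body": "tools for matching patent claims"},
        "doc-2": {"title": "cooking recipes",
                  "body": "how to bake bread"},
    }
    for doc_id, score in rank_similar(target, candidates,
                                      {"title": 1.0, "body": 2.0}):
        print(doc_id, round(score, 3))
```

In this sketch the body section is weighted twice as heavily as the title; those weights are arbitrary and only show how several section-level scores can be folded into a single ranking score.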