Entity, concept, and event extraction: NaturalExtractor
Bitext has developed NaturalExtractor, a tool that uses linguistic technology to extract several types of information from large text databases:
- Entities: proper names of people, companies, products, places, etc., such as “Barack Obama”, “Spanish Agency of Intercultural Communication”, “Boulevard of the Allies”
- Concepts: ideas or topics found in a text, such as “global warming”, “developing countries”, “main sources of urban noise”
- Events: relationships between entities and concepts. For example, from the sentence “President Barack Obama has visited the US allied countries in the Persian Gulf”, the following relationship is extracted: author “Barack Obama (president)”, action “to visit”, object “US allied countries in the Persian Gulf”.
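As a rough illustration of the author/action/object relationship described above, an extracted event could be held in a small record like the one below. This is only a sketch: the class name Event and its field names are hypothetical, the record is filled in by hand from the example sentence, and nothing here reflects NaturalExtractor's actual output format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One extracted relationship: who did what to what (hypothetical layout)."""
    author: str  # entity that performs the action
    action: str  # the action, as a verb in its base form
    obj: str     # entity or concept the action applies to


# Hand-built record for the example sentence; no extraction is performed here.
event = Event(
    author="Barack Obama (president)",
    action="to visit",
    obj="US allied countries in the Persian Gulf",
)

print(event.author, "|", event.action, "|", event.obj)
```

Keeping the three parts as separate fields makes it straightforward to filter events by action or to group them by author.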
These extraction capabilities are useful in areas such as:
- Business Intelligence
- Press Clipping
- Automatic E-mail Management
- Forensic computing and fraud detection
- etc.
In addition, NaturalExtractor can be adapted to solve specific problems such as:
- detecting all new appointments to positions reported in sources such as the specialized press (for the private sector) or official bulletins (for the public sector)
- compiling a list of all legal and natural persons involved in civil, commercial, or other relations in legal documents such as sale and purchase agreements
- collecting all press coverage of a specific company or topic
- identifying relationships between the people who appear in a specific set of documents or emails
- and many more
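To make the idea of identifying relationships between people more concrete, one simple approach is to count how often two names appear in the same document. The sketch below assumes the person names have already been extracted from each document; the function co_occurrences and the sample names are hypothetical, and this is a generic co-occurrence count, not a description of how NaturalExtractor works.

```python
from collections import Counter
from itertools import combinations


def co_occurrences(docs_people):
    """Count how many documents each pair of people appears in together."""
    pairs = Counter()
    for people in docs_people:
        # Deduplicate and sort so (a, b) and (b, a) count as the same pair.
        for a, b in combinations(sorted(set(people)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical input: names already extracted from three documents.
docs_people = [
    ["Alice Smith", "Bob Jones"],
    ["Alice Smith", "Bob Jones", "Carol White"],
    ["Carol White"],
]

pairs = co_occurrences(docs_people)
for (a, b), count in pairs.most_common():
    print(f"{a} and {b}: {count} shared document(s)")
```

Pairs that share many documents are candidates for a closer look; the count alone says nothing about the nature of the relationship.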
NaturalExtractor also supports the creation and maintenance of customer-specific knowledge bases (specialized dictionaries) to tackle problems such as automatic cataloguing or ontology creation. It offers a wide range of personalization and configuration features.