Extract and classify different types of entities
The entity extraction service detects and extracts:
- Proper names, such as Lionel Messi, Tom Brady, Puerto Rico, or the United Nations. These are classified into categories: people, places, and organizations.
- Numeric entities, such as bank account numbers, monetary amounts, or phone numbers.
- Alphanumeric entities, such as license plates, web addresses, dates, or identity card numbers.
- E-mail addresses, URLs, social media usernames, and hashtags.
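Pattern-like entities such as those in the last item are often found with regular expressions. The following is a minimal sketch of that idea, not the service's actual implementation: the patterns, the `PATTERNS` table, and the `extract_pattern_entities` function are all assumptions made for illustration, and the patterns are deliberately simple rather than exhaustive.

```python
import re

# Illustrative patterns only (hypothetical, not the service's rules).
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "url": re.compile(r"https?://[^\s<>\"']+"),
    "hashtag": re.compile(r"(?<!\w)#\w+"),
    # The lookbehind keeps the "@domain" part of an e-mail from matching.
    "mention": re.compile(r"(?<![\w@])@\w+"),
}


def extract_pattern_entities(text):
    """Return (type, matched text, start offset) tuples sorted by position."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((entity_type, match.group(), match.start()))
    return sorted(found, key=lambda item: item[2])


sample = (
    "Write to info@example.com or visit https://example.com/docs "
    "and follow @example_user #nlp"
)
for entity_type, value, offset in extract_pattern_entities(sample):
    print(entity_type, value, offset)
```

Proper names, by contrast, generally need linguistic analysis or dictionaries rather than surface patterns, which is why they are not covered by this sketch.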
Normalizes variants into standard forms
The service detects entities even when they are written in different forms (for example: 20:00, 20 hours, 20h, 8 pm).
It also normalizes entities to a standard form so that all instances of the same entity are handled consistently (for example, NYSE, New York Stock Exchange, and NY Stock Exchange are instances of the same entity).
On demand, the service can also detect entities that are not capitalized, as in “I am in new york”.
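To make the idea of normalization concrete, here is a small sketch that maps the time variants and the stock exchange names mentioned above to one standard form each. It is an illustration under stated assumptions, not the service's method: the `ALIASES` table, the `TIME_PATTERN` expression, and the functions `normalize_time` and `normalize_name` are hypothetical, and 24-hour `HH:MM` is simply chosen here as the standard form.

```python
import re

# Hypothetical alias table: surface variants mapped to one canonical name.
ALIASES = {
    "nyse": "New York Stock Exchange",
    "ny stock exchange": "New York Stock Exchange",
    "new york stock exchange": "New York Stock Exchange",
}

# Hour, optional ":MM", optional suffix (am, pm, h, hours).
TIME_PATTERN = re.compile(
    r"^(?P<hour>\d{1,2})"
    r"(?::(?P<minute>\d{2}))?"
    r"\s*(?P<suffix>am|pm|h|hours)?$",
    re.IGNORECASE,
)


def normalize_time(text):
    """Return the time as 24-hour HH:MM, or None if the text is not recognized."""
    match = TIME_PATTERN.match(text.strip())
    if match is None:
        return None
    suffix = (match.group("suffix") or "").lower()
    if match.group("minute") is None and not suffix:
        return None  # a bare number is too ambiguous to treat as a time
    hour = int(match.group("hour"))
    minute = int(match.group("minute") or 0)
    if suffix in ("am", "pm"):
        if not 1 <= hour <= 12:
            return None
        hour = hour % 12 + (12 if suffix == "pm" else 0)
    if hour > 23 or minute > 59:
        return None
    return f"{hour:02d}:{minute:02d}"


def normalize_name(text):
    """Look up a canonical name, ignoring case and extra whitespace."""
    key = " ".join(text.lower().split())
    return ALIASES.get(key, text)


for variant in ["20:00", "20 hours", "20h", "8 pm"]:
    print(variant, "->", normalize_time(variant))
for variant in ["NYSE", "NY Stock Exchange", "new york stock exchange"]:
    print(variant, "->", normalize_name(variant))
```

Because the alias lookup lowercases its input, the same table also resolves names that are not capitalized, which mirrors the lowercase detection described above in a very simplified way.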
Our technology distinguishes between Barack Obama (person) and Barack Obama (avenue)
Bitext’s linguistic engine assigns types to entities based on syntactic rules. For example, in the sentence “I live at Barack Obama”, the name of the president is interpreted as the name of an avenue, whereas in the sentence “As Barack Obama said”, the same proper noun is identified as the name of the US president. This feature is provided on demand.
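The two example sentences can be told apart with very simple context cues, as the sketch below shows. This is only a toy approximation of context-based typing: a real linguistic engine would rely on syntactic analysis, whereas this code uses two hand-written regular expressions. The function `classify_mention` and its cue lists are hypothetical.

```python
import re


def classify_mention(sentence, name):
    """Guess the type of `name` from nearby words: 'place', 'person', or 'unknown'."""
    escaped = re.escape(name)
    # Locative cue: a preposition such as "at", "on", or "in" right before the name.
    if re.search(rf"\b(?:at|on|in)\s+{escaped}\b", sentence, re.IGNORECASE):
        return "place"
    # Speech cue: the name followed by a verb of saying, as in "As X said".
    if re.search(
        rf"\b{escaped}\s+(?:said|says|stated|announced)\b", sentence, re.IGNORECASE
    ):
        return "person"
    return "unknown"


print(classify_mention("I live at Barack Obama", "Barack Obama"))
print(classify_mention("As Barack Obama said", "Barack Obama"))
```

Cue lists like these break down quickly on real text, which is the motivation for assigning types from the syntactic structure of the sentence instead.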
If you want additional information, schedule a demo.