Common problem:

How to find useful information buried in data?

Raw data can be difficult to handle when you need to look for specific information within the text. Entities are a good example: they are useful for many purposes (text anonymization, knowledge graph generation, etc.), yet they pose ambiguity problems that are hard to solve with high accuracy.


The entity extraction service delivers structure, clarity, and insight from raw text data

Bitext Entity Extraction locates and classifies over 16 types of entities, such as people, places, and organizations, using a combination of NLP technologies:

  • Deep Linguistic Analysis based on grammars

  • Alphanumeric pattern detection using regular expressions

  • Monolingual and multilingual dictionaries

Use entity analysis to detect personal data in a secure way to ensure compliance with European GDPR Data Privacy legislation
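
As a rough illustration of how detected entities could feed an anonymization step, the sketch below replaces each entity span with a placeholder for its type. The `Entity` record, the type labels, and the `anonymize` helper are assumptions made for this example and are not part of the Bitext API; the spans are supplied by hand rather than produced by a real extractor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical detected entity: a character span plus a type label."""
    start: int
    end: int  # exclusive
    type: str


def anonymize(text: str, entities: list[Entity]) -> str:
    """Replace every entity span with a placeholder such as [PERSON]."""
    # Work from the end of the text so that earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e.start, reverse=True):
        text = text[:ent.start] + f"[{ent.type}]" + text[ent.end:]
    return text


text = "Tom Brady called 555-0100 from Puerto Rico."
# Spans written by hand for the example (assumed extractor output).
entities = [
    Entity(0, 9, "PERSON"),
    Entity(17, 25, "PHONE"),
    Entity(31, 42, "PLACE"),
]
masked = anonymize(text, entities)
print(masked)
```

Replacing spans from right to left avoids recomputing offsets after each substitution, which keeps the helper short.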



Enhance search quality

Having access to large volumes of text data can be a real opportunity, but data is of no use unless it is analyzed and understood. Entity extraction tools make it possible to take advantage of that opportunity.

Build a knowledge graph

Entities are more than just isolated strings: they have properties, they are connected to other entities, and they perform actions. Pure entity detection does not do the job, because it does not extract everything there is to learn, such as those actions and properties. Bitext Entity Extraction is the perfect tool for creating advanced knowledge graphs.
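
One common way to picture such a graph is as a set of subject-relation-object triples. The minimal sketch below stores a few hand-written triples and looks up what an entity is connected to. The relation names (`type`, `located_in`) and the helpers `build_graph` and `related` are illustrative assumptions, not output or API of the service.

```python
from collections import defaultdict

# Hypothetical triples: (subject entity, relation, object entity or value).
triples = [
    ("Barack Obama", "type", "person"),
    ("New York Stock Exchange", "type", "organization"),
    ("New York Stock Exchange", "located_in", "New York"),
    ("New York", "type", "place"),
]


def build_graph(triples):
    """Index the triples by subject so an entity's edges are easy to find."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def related(graph, entity, relation):
    """Return every object linked to `entity` through `relation`."""
    return [obj for rel, obj in graph.get(entity, []) if rel == relation]


graph = build_graph(triples)
print(related(graph, "New York Stock Exchange", "located_in"))
```

Properties and connections live in the same structure here, so an entity's type and its links to other entities can be queried in the same way.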


Extract and classify different types of entities

The entity extraction service detects and extracts:

  • Proper names such as Lionel Messi, Tom Brady, Puerto Rico, United Nations. These can be classified into different categories: people, places, organizations.

  • Numeric entities such as bank accounts, money amounts, and phone numbers.

  • Alphanumeric entities such as car plates, web addresses, dates, and identity cards.

  • E-mail addresses, URLs, social media users and hashtags.
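
Pattern-based entities like those in the last bullet lend themselves to regular expressions. The sketch below is a simplified illustration: the patterns are deliberately loose assumptions written for this example (real e-mail and URL grammars are more involved), and the labels and the `extract` helper are hypothetical rather than taken from the service.

```python
import re

# Simplified, assumed patterns; production rules would be stricter.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "URL": re.compile(r"https?://[^\s,]+"),
    "HASHTAG": re.compile(r"(?<!\w)#\w+"),
    # The lookbehind keeps the "@" inside an e-mail address from matching.
    "MENTION": re.compile(r"(?<![\w@.])@\w+"),
}


def extract(text):
    """Return (label, value) pairs in the order they appear in the text."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    found.sort()
    return [(label, value) for _, label, value in found]


sample = ("Write to info@example.com or visit https://example.com/docs, "
          "and follow @bitext #NLP")
for label, value in extract(sample):
    print(label, value)
```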

Normalizes variants into standard forms

The service detects entities even though they may be written in different forms (for example: 20:00, 20 hours, 20h, 8 pm).

In addition, it applies a normalization process to the entities, presenting them in a standard form so that all instances of the same entity are handled consistently (NYSE, New York Stock Exchange, and NY Stock Exchange are instances of the same entity). On demand, the service can also detect entities that are not capitalized: “I am in new york”.
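
The normalization idea can be sketched with the examples above: time variants are mapped to a 24-hour `HH:MM` form, and name variants are mapped to a canonical name through an alias table. The regular expression, the `ALIASES` table, and both helper functions are assumptions made for illustration; they cover only the variants mentioned here and do not describe how the service is implemented.

```python
import re

# Assumed alias table: lower-cased variant -> canonical form.
ALIASES = {
    "nyse": "New York Stock Exchange",
    "ny stock exchange": "New York Stock Exchange",
    "new york stock exchange": "New York Stock Exchange",
}

# Hour, optional minutes, optional suffix (am, pm, h, hours).
TIME_RE = re.compile(
    r"^(?P<hour>\d{1,2})(?::(?P<minute>\d{2}))?\s*(?P<suffix>am|pm|h|hours)?$",
    re.IGNORECASE,
)


def normalize_time(raw):
    """Return the time as 'HH:MM', or None if the string is not recognized."""
    match = TIME_RE.match(raw.strip())
    if match is None:
        return None
    hour = int(match.group("hour"))
    minute = int(match.group("minute") or 0)
    suffix = (match.group("suffix") or "").lower()
    if suffix == "pm" and hour < 12:
        hour += 12
    elif suffix == "am" and hour == 12:
        hour = 0
    if hour > 23 or minute > 59:
        return None
    return f"{hour:02d}:{minute:02d}"


def normalize_name(raw):
    """Return the canonical name for a known variant, else the input itself."""
    key = " ".join(raw.lower().split())
    return ALIASES.get(key, raw)


for variant in ["20:00", "20 hours", "20h", "8 pm"]:
    print(variant, "->", normalize_time(variant))
print(normalize_name("NY Stock Exchange"))
```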

Our technology distinguishes between Barack Obama (person) and Barack Obama (avenue)

Bitext’s linguistic engine assigns types to entities depending on syntactic rules: for example, in the sentence “I live at Barack Obama” the name of the president is interpreted as the name of an avenue, whereas in the sentence “As Barack Obama said” the proper noun is identified as the name of the US president. This feature is provided on demand.
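
To make the idea concrete, here is a toy sketch that assigns a type to a name from the words around it. It uses simple surface cues (the preceding preposition, the following verb) as a stand-in for real syntactic rules, so it is not the linguistic engine described above; the cue lists and the `classify_mention` helper are assumptions for this example only.

```python
# Assumed cue words; a grammar-based engine would use syntactic structure.
LOCATION_CUES = {"at", "on"}
SPEECH_VERBS = {"said", "says", "stated", "announced"}


def classify_mention(sentence, name):
    """Guess the type of `name` in `sentence` from its neighboring words."""
    tokens = sentence.rstrip(".!?").split()
    name_tokens = name.split()
    n = len(name_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == name_tokens:
            before = tokens[i - 1].lower() if i > 0 else ""
            after = tokens[i + n].lower() if i + n < len(tokens) else ""
            if after in SPEECH_VERBS:
                return "person"
            if before in LOCATION_CUES:
                # Mirrors the example above, where the name is an avenue.
                return "avenue"
            return "unknown"
    return "unknown"


print(classify_mention("I live at Barack Obama", "Barack Obama"))
print(classify_mention("As Barack Obama said", "Barack Obama"))
```

The same string receives a different type in each sentence because the decision depends on context rather than on the name itself.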


Bitext Madrid offices


José Echegaray 8, building 3, office 4
Parque Empresarial Las Rozas
28232 Las Rozas

Bitext San Francisco offices


541 Jefferson Ave., Ste. 100
Redwood City
CA 94063