Applications

Bitext has developed in-house a set of linguistic technologies for working in a multilingual context. No third-party or open source software has been used:

Segmentation:

Sentence identification: knowing where each sentence ends. For example, our technology can tell when “.” is a full stop rather than part of an abbreviation, as in “J. Smith”.
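
As an illustration of the problem only (not Bitext's implementation), here is a minimal rule-based sketch in Python. The abbreviation list and the function name `split_sentences` are hypothetical, and the rule is deliberately naive: it would fail to split after an abbreviation that really does end a sentence.

```python
import re

# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"mr", "mrs", "dr", "inc", "etc"}

def split_sentences(text):
    """Split at '.', '!' or '?' unless the period closes an initial or abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]", text):
        end = match.end()
        # A boundary must be followed by whitespace or the end of the text.
        if end < len(text) and not text[end].isspace():
            continue
        if match.group() == ".":
            words = text[start:match.start()].split()
            last = words[-1] if words else ""
            # Skip single-letter initials ("J.") and known abbreviations ("Inc.").
            if (len(last) == 1 and last.isupper()) or last.lower() in ABBREVIATIONS:
                continue
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

print(split_sentences("J. Smith arrived. He sat down."))
```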

 

Tokenization:

Identifying the individual words in the text.
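
A rough sketch of what tokenization involves, assuming a simple regular-expression approach for English. The pattern below is an illustrative assumption; real multilingual tokenizers need language-specific rules.

```python
import re

# Words may contain internal apostrophes or hyphens; any other
# non-space, non-word character becomes its own token.
TOKEN_PATTERN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")

def tokenize(text):
    """Return the list of word and punctuation tokens found in text."""
    return TOKEN_PATTERN.findall(text)

print(tokenize("Don't panic, it's state-of-the-art!"))
```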

 

Chunking:

Identification of meaningful concepts, including those expressed as multiword expressions.

 

Parsing:

Generation of the parse tree of a sentence, in order to understand its syntactic structure.

 

Incremental Parsing:

“On-the-fly” parsing that generates the possible parses in real time as the sentence is being written.

 

Reference Resolution:

Used to identify the word that a pronoun refers to.

 

Semantic Role Labeling:

Identifying the role a participant plays in an action. It is used to recognize that in both “Acme Inc. is being acquired by John Smith Industries” and “John Smith Industries are acquiring Acme Inc.” the acquired company is the same.
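
To make the idea concrete, here is a toy sketch that handles only the two acquisition sentences above using hand-written patterns. The patterns and the function name `acquisition_roles` are hypothetical; a real semantic role labeler works from a syntactic analysis rather than from fixed strings.

```python
import re

# Hand-written patterns for the passive and active forms of "acquire".
PASSIVE = re.compile(r"^(?P<acquired>.+?) (?:is|are) being acquired by (?P<acquirer>.+)$")
ACTIVE = re.compile(r"^(?P<acquirer>.+?) (?:is|are) acquiring (?P<acquired>.+)$")

def acquisition_roles(sentence):
    """Return who acquires whom, or None if neither pattern matches."""
    for pattern in (PASSIVE, ACTIVE):
        match = pattern.match(sentence.strip())
        if match:
            # Strip trailing periods so "Acme Inc." and "Acme Inc" compare equal.
            return {
                "acquirer": match.group("acquirer").rstrip("."),
                "acquired": match.group("acquired").rstrip("."),
            }
    return None

passive = acquisition_roles("Acme Inc. is being acquired by John Smith Industries")
active = acquisition_roles("John Smith Industries are acquiring Acme Inc.")
print(passive == active)
```

Both word orders yield the same role assignment, which is the point of the labeling step.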

 

Lemmatization:

Identification of the lemma (the base or dictionary form) of a word. This is a manageable task in languages with regular morphology, but the complexity grows in languages with richer or more irregular morphology, such as Spanish or Hungarian.

 

Disambiguation:

Knowing which lemma to choose when starting from an inflected form. For example, knowing whether “plays” refers to the verb or to the noun.
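
The following sketch shows how lemmatization and disambiguation fit together, under strong simplifying assumptions: the two-entry lexicon, the determiner list, and the "previous word" heuristic are all hypothetical and stand in for the full dictionaries and context models a production system would need.

```python
# Hypothetical lexicon: inflected form -> possible (lemma, part of speech) readings.
LEXICON = {
    "plays": [("play", "VERB"), ("play", "NOUN")],
    "saw": [("see", "VERB"), ("saw", "NOUN")],
}
DETERMINERS = {"the", "a", "an", "his", "her", "these", "those"}

def candidate_lemmas(word):
    """Lemmatization step: list every reading the lexicon allows."""
    return LEXICON.get(word.lower(), [(word.lower(), "UNKNOWN")])

def disambiguate(tokens, index):
    """Disambiguation step: choose one reading using the preceding word."""
    candidates = candidate_lemmas(tokens[index])
    if len(candidates) == 1:
        return candidates[0]
    previous = tokens[index - 1].lower() if index > 0 else ""
    # Naive heuristic: a determiner is followed by a noun, anything else by a verb.
    wanted = "NOUN" if previous in DETERMINERS else "VERB"
    for lemma, pos in candidates:
        if pos == wanted:
            return (lemma, pos)
    return candidates[0]

print(disambiguate(["She", "saw", "it"], 1))
print(disambiguate(["The", "saw", "broke"], 1))
```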

Our cloud services help market research professionals and data scientists perform sentiment analysis, categorization and entity & concept extraction, easily and effectively.

Madrid, SPAIN

José Echegaray 8, building 3, office 4
Parque Empresarial Las Rozas
28232 Las Rozas

SAN FRANCISCO, USA

1700 Montgomery Street, Suite 101
CA 94111