Nowadays, there is a frequent need to extract trends and other relevant information from large collections of text (social media, online reviews, etc.) quickly and accurately. However, it is not easy to extract meaningful information from unstructured, “messy” data.
The Bitext Phrase Extraction service lets you go beyond hashtags or keywords, whether you are working with high-quality texts, like news or legislation, or with colloquial ones, like blogs or social media. It also identifies the type of each extracted phrase: noun, verb, adjective or adverb phrase.
Our linguistic expertise sets our phrase extraction apart: it turns your raw data into more accurate, less noisy input. The extracted phrases can provide:
General information to build Knowledge Graphs
Topics, such as nouns, to enhance Topic Modelling
Relations between topics, for example noun phrases connected via verbs
The phrase extraction service detects and extracts:
Simple phrases: “checking account”, “twin brother”
Compound or nested phrases: “my brother’s checking account”
Combinations of the above: “account”, “checking account”, “checking account of the bank”, “bank”, “my brother’s bank”
E-mail addresses, URLs, social media users and hashtags
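As a rough illustration of the last item, the sketch below labels e-mail addresses, URLs, social media users and hashtags in a piece of text. It is not the Bitext implementation: the patterns, the `PATTERNS` table and the `find_special_tokens` function are simplified assumptions made up for this example.

```python
import re

# Hypothetical, deliberately simple patterns; real-world rules are more involved.
PATTERNS = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("url", re.compile(r"https?://[^\s]+")),
    ("hashtag", re.compile(r"#\w+")),
    ("user", re.compile(r"@\w+")),
]


def find_special_tokens(text):
    """Return (label, token) pairs for tokens that match one of the patterns."""
    found = []
    for token in text.split():
        # Trim punctuation that commonly surrounds a token in running text.
        token = token.strip(".,;:!?()\"'")
        for label, pattern in PATTERNS:
            if pattern.fullmatch(token):
                found.append((label, token))
                break  # first matching label wins
    return found


sample = "Mail support@example.com or see https://example.com/help, thanks @acme #banking"
for label, token in find_special_tokens(sample):
    print(label, token)
```

Splitting on whitespace and matching whole tokens keeps the sketch short; it will miss items glued to other text, which a linguistic tokenizer would handle.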
The phrase extraction service normalizes the phrases it extracts so that all instances of the same phrase are handled consistently. For example, the following phrases are instances of the same concept:
“The checking account”
“These checking accounts”
“One of my checking accounts”
Correct normalization of concepts is essential for services such as categorization and trend detection.
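To make the idea of normalization concrete, here is a minimal sketch that maps the three example phrases above to a single concept. It is a toy approximation, not the method the service uses: the `LEADING_WORDS` list, the naive plural rule and both function names are assumptions introduced for this example.

```python
import re
from collections import defaultdict

# Hypothetical list of determiners and quantifier words to drop from the front.
LEADING_WORDS = {"the", "these", "this", "those", "a", "an", "my", "your", "one", "of"}


def normalize_phrase(phrase):
    """Lowercase, drop leading determiners, and singularize the last word."""
    tokens = re.findall(r"[a-z]+", phrase.lower())
    # Remove determiner and quantifier words from the start of the phrase.
    while tokens and tokens[0] in LEADING_WORDS:
        tokens.pop(0)
    # Naive singularization: assumes a regular "-s" plural on the final word.
    if tokens and tokens[-1].endswith("s") and not tokens[-1].endswith("ss"):
        tokens[-1] = tokens[-1][:-1]
    return " ".join(tokens)


def group_by_concept(phrases):
    """Group surface phrases under their normalized form."""
    groups = defaultdict(list)
    for phrase in phrases:
        groups[normalize_phrase(phrase)].append(phrase)
    return dict(groups)


examples = [
    "The checking account",
    "These checking accounts",
    "One of my checking accounts",
]
print(group_by_concept(examples))
```

With all three variants grouped under one key, a count of mentions per concept (the basis of trend detection) no longer fragments across surface forms. Irregular plurals and other languages would need real morphological analysis rather than this suffix rule.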