The impact of lemmatization for morphologically-rich languages Abstract Are there ways to improve the performance of language models beyond increases in size, both in the number of model parameters and in the size of training corpora? Our benchmarks show that another...
The case for evaluation of NLU platforms Synthetic images and video have proven to be a big success for cost-cutting. Synthetic text is following suit: tabular data (that is, data organized in a table with rows and columns) is already becoming mainstream, and the...
What Is Synthetic Training Data? Synthetic training data is the data that is used to train an NLU engine. An NLU engine allows chatbots to understand the intent of user queries. The training data is enriched by data labeling or data annotation, with information about...
Arabic is a complex language for NLP tasks, even for simple ones like lemmatization. There are several reasons for this. Arabic builds words from roots: for example, the word كتاب (kitab, “book”) is derived from ك ت ب (k t b). Many related words are derived from...
How Synthetic Text can solve your training and evaluation problems for your virtual assistants / chatbots When shopping online, customers frequently need to modify their order: exchanging an item in the basket, deleting something already added… Customers ask...