The case for evaluation of NLU platforms

Synthetic images and video have proven to be a big success for cost-cutting. Synthetic text is following suit: tabular data (that is, data organized in a table with rows and columns) is already becoming mainstream, and the next step is synthetic unstructured text, which is data that doesn't have a predefined format.

Synthetic unstructured text supports more complex cases, where actual text in the form of full sentences or documents is required.

One of the most popular use cases of synthetic unstructured text is evaluation of NLU engines or intent classification engines. Evaluating an NLU engine like Dialogflow, Lex, RASA, Ada or Kore-ai is a time-consuming task. It involves:

  • finding and augmenting the data, or generating it by hand
  • making sure the data is comprehensive enough to test all intents or classes
  • making sure the data captures the language of different user profiles: young people tend to use more colloquial language and make more typos, while senior users tend to be more formal, etc.

This is particularly relevant in multilingual scenarios, where languages like Arabic, Japanese or German have fewer resources available than English, even though they are mainstream languages in business terms.
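The coverage checks listed above can be sketched in a few lines of Python. This is only an illustration, not part of any NLU platform: the field names `intent` and `register`, the intent labels and the `coverage_gaps` helper are all assumptions made up for this example.

```python
from collections import Counter

def coverage_gaps(utterances, intents, registers, min_examples=1):
    """Return the (intent, register) pairs with fewer than min_examples utterances."""
    counts = Counter((u["intent"], u["register"]) for u in utterances)
    return [
        (intent, register)
        for intent in intents
        for register in registers
        if counts[(intent, register)] < min_examples
    ]

# Toy test set; field names and labels are hypothetical.
test_set = [
    {"text": "I would like to cancel my order, please.", "intent": "cancel_order", "register": "formal"},
    {"text": "pls cancel my ordr", "intent": "cancel_order", "register": "colloquial"},
    {"text": "Could you tell me where my parcel is?", "intent": "track_order", "register": "formal"},
]

gaps = coverage_gaps(
    test_set,
    intents=["cancel_order", "track_order"],
    registers=["formal", "colloquial"],
)
print(gaps)  # [('track_order', 'colloquial')]
```

Here the test set has no colloquial example for the hypothetical `track_order` intent, which is exactly the kind of gap that is tedious to spot by hand.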

Additionally, synthetic unstructured text provides the usual advantages of synthetic data: 

  • Speed up evaluation cycles: using NLG (Natural Language Generation) is faster than compiling data manually
  • Avoid GDPR issues: anonymized text is not as safe as synthetic data
  • Guarantee wider coverage: there is virtually no limit to the amount of text that can be generated

The key point: unstructured text allows us to handle more complex cases than tabular data.

To help push forward research on this use case, we have published a dataset with more than 260,000 utterances, labeled with intent, semantic category, language register and more.
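As a rough idea of how labeled utterances like these can be used, the sketch below breaks down an engine's accuracy by one of the labels. Everything in it is an assumption for illustration: the field names (`text`, `intent`, `register`) may not match the dataset's actual columns, and `keyword_engine` is a toy stand-in for a call to a real NLU engine.

```python
from collections import defaultdict

def accuracy_by(examples, predict, field):
    """Accuracy of predict(text) against the gold intent, grouped by a label field."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for ex in examples:
        key = ex[field]
        totals[key] += 1
        if predict(ex["text"]) == ex["intent"]:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}

def keyword_engine(text):
    # Toy stand-in for a real NLU engine: matches a single keyword.
    return "cancel_order" if "cancel" in text.lower() else "track_order"

# Hypothetical labeled utterances.
examples = [
    {"text": "I would like to cancel my order.", "intent": "cancel_order", "register": "formal"},
    {"text": "cancl my order pls", "intent": "cancel_order", "register": "colloquial"},
    {"text": "Where is my parcel?", "intent": "track_order", "register": "formal"},
    {"text": "wheres my stuff", "intent": "track_order", "register": "colloquial"},
]

print(accuracy_by(examples, keyword_engine, "register"))
# {'formal': 1.0, 'colloquial': 0.5}
```

Grouping by `register` rather than reporting a single overall score shows that the toy engine fails on the misspelled colloquial utterance, which an aggregate number would hide.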

Take a look at our GitHub repository and access our dataset to try it yourself.


GitHub Repository | Hugging Face Repository

Please feel free to use it for your testing tasks and share your results.

Synthetic unstructured text is also being used for training purposes, but we will cover that in another post.
