In a previous post, we benchmarked the role of synthetic text generation in intent detection on traditional Natural Language Understanding (NLU) platforms. That study examined the performance of Rasa as an example, along with other NLU platforms. You can find the previous benchmark here.

With the emergence and widespread adoption of chatbots based on Large Language Models (LLMs) on platforms such as LivePerson, ADA and Rasa, we have updated our benchmark to investigate how integrating discriminative LLMs can enhance traditional NLUs in one key task: increasing accuracy in intent detection.

The primary objective of our updated benchmark is to assess how LLMs, particularly BERT-based LLMs, can improve the NLU's ability to understand and classify user intents accurately.


Impact of Increasing Training Utterances

Our benchmark results clearly demonstrate that increasing the number of training utterances, particularly when using synthetically generated data, significantly boosts the performance of traditional NLUs.

On average, we observed a 15% increase in accuracy when incorporating larger training datasets. This finding underscores the importance of ample and diverse training data for achieving superior intent detection results.

Results and Discussion

We report the accuracy of each model on the external test dataset. Accuracy is the percentage of test examples that a model classified into the correct intent with high confidence. The following table contains the results:




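To make the accuracy definition above concrete, here is a minimal Python sketch of a confidence-thresholded accuracy. It is an illustration under stated assumptions, not the benchmark's actual scoring code: the function name `thresholded_accuracy`, the 0.7 threshold, and the sample intents are all hypothetical, since the post does not state which confidence cutoff counts as "high".

```python
from typing import List, Tuple


def thresholded_accuracy(
    predictions: List[Tuple[str, float]],
    gold_intents: List[str],
    threshold: float = 0.7,  # assumed cutoff; the post does not specify one
) -> float:
    """Return the percentage of examples whose predicted intent is correct
    AND whose confidence is at least `threshold`."""
    if len(predictions) != len(gold_intents):
        raise ValueError("predictions and gold_intents must have the same length")
    if not gold_intents:
        raise ValueError("cannot compute accuracy on an empty test set")

    correct = sum(
        1
        for (intent, confidence), gold in zip(predictions, gold_intents)
        if intent == gold and confidence >= threshold
    )
    return 100.0 * correct / len(gold_intents)


if __name__ == "__main__":
    # Hypothetical model outputs: (predicted intent, confidence)
    preds = [
        ("check_balance", 0.93),   # correct and confident: counted
        ("transfer_money", 0.55),  # correct but below threshold: not counted
        ("check_balance", 0.81),   # confident but wrong: not counted
        ("cancel_card", 0.88),     # correct and confident: counted
    ]
    gold = ["check_balance", "transfer_money", "cancel_card", "cancel_card"]
    print(thresholded_accuracy(preds, gold))
```

Under this definition a correct prediction with low confidence counts as a miss, so the reported figure can be lower than plain top-1 accuracy on the same test set.
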
Incorporating LLMs into Rasa

LLMs have revolutionized the field of conversational AI and have been widely adopted across various chatbot platforms, including Rasa. By integrating LLMs, such as BERT, with Rasa NLU, businesses can leverage the strengths of both technologies to create more powerful and accurate chatbot experiences.
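As one possible illustration of such an integration, the Python sketch below renders a Rasa NLU pipeline that feeds BERT-style embeddings into an intent classifier. The component names (`WhitespaceTokenizer`, `LanguageModelFeaturizer`, `DIETClassifier`) are existing Rasa components, but the selection, the model weights, and the epoch count are assumptions chosen for illustration; the post does not describe the configuration used in the benchmark. The helper `render_config` is hypothetical and only formats text.

```python
from typing import Any, Dict, List

# Hypothetical pipeline: an assumed setup, not the benchmark's configuration.
PIPELINE: List[Dict[str, Any]] = [
    {"name": "WhitespaceTokenizer"},
    {
        "name": "LanguageModelFeaturizer",
        "model_name": "bert",          # BERT-family architecture
        "model_weights": "rasa/LaBSE",  # assumed pre-trained weights
    },
    {"name": "DIETClassifier", "epochs": 100},  # assumed training length
]


def render_config(pipeline: List[Dict[str, Any]], language: str = "en") -> str:
    """Render the pipeline as YAML text in the shape of a Rasa config file."""
    lines = [f"language: {language}", "pipeline:"]
    for component in pipeline:
        for index, (key, value) in enumerate(component.items()):
            # The first key of each component starts a new YAML list item.
            prefix = "  - " if index == 0 else "    "
            # Quote string parameters, but leave component names bare.
            if isinstance(value, str) and key != "name":
                value = f'"{value}"'
            lines.append(f"{prefix}{key}: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_config(PIPELINE))
```

The printed text could serve as a starting point for a Rasa config file. The exact top-level keys a project needs vary between Rasa versions, so treat the output as a draft to adapt rather than a drop-in file.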

We believe that by enhancing traditional NLUs with LLMs, businesses can unlock new opportunities to provide more accurate and context-aware chatbot interactions. The findings of our benchmark offer valuable insights into the benefits and potential applications of integrating LLMs into NLU frameworks.

Note: As with any AI technology, ethical considerations must be taken into account. It is crucial to ensure responsible use of LLMs, including ongoing monitoring, evaluation, and refinement, to mitigate any risks associated with biased or inappropriate responses.


In conclusion, our updated benchmark showcases the effectiveness of integrating LLMs, particularly BERT-based LLMs, with traditional NLUs like Rasa. By doing so, businesses can enhance intent detection accuracy and broaden the semantic scope of their chatbots. The significant impact of increasing training utterances, along with the power of LLMs, enables improved performance and delivers more accurate and contextually aware chatbot experiences.

We hope that this updated benchmark inspires businesses to leverage the potential of LLMs to enhance their traditional NLUs and deliver more effective and intelligent chatbot interactions.
