Getting GPT to Answer Consistently and with Style
Behavior of GPT-3.5 Standard
Behavior of Fine-tuned GPT-3.5 by Bitext
We’ve approached the problem using data: can we make GPT to answer properly to some questions? By properly we mean in the right tone and corporate style, with the right content and without glitches/contradictions/hallucinations.
For this experiment, we’ve generated a dataset verticalized for the banking sector (retail banking, loans, wealth management…) and customized for Customer Support. Available at Hugging Face for free on Customer Support Dataset.
We call this dataset hybrid because it combines the advantages of synthetic text (low cost, high scalability and speed, privacy, and corporate identity) without the disadvantages (hallucination and bias mainly). It contains 30M tokens which have been infused with anonymized PII (Personally Identifiable Information), establishing an empathic connection with users while being tailored to the sector’s target demographic. Below is a sample from the dataset:
Question:
“how much does it cost to open an account for international banking.”
Response:
“Thank you for considering opening an account for international banking! I can provide you with the information you need.
The cost of opening an account for international banking may vary depending on the financial institution and the specific account type you choose. To get the most accurate and up-to-date information. I recommend [….]
You can reach our customer service team at {{Customer Support Phone Number}} or through the Live Chat on our website at {{Company Website URL}}. They are available {{Customer Support Working Hours}} to […]
We look forward to the possibility of serving you and providing a seamless international banking experience!”
These are the answers provided by the data-finetuned version of GPT 3.5 for the same question ‘How much does it cost to open an international banking account?’:
When the questions used for testing are part of the training data, it’s less surprising that the answers are positively influenced by the training.
It’s a bit more surprising when using questions that where not used in the training, like the one we used “how much does it cost to open an account for international banking”.
As we can see, they all:
- Provide correct content (according to training).
- Follow the same structure (company policy for Customer Support).
- They are consistent among them, eliminating the “disparateness” factor we saw in GPT answers.
Conclusion
- The creativity of the generative capabilities of GPT.
- The accuracy and consistency of fine-tuning.