From General-Purpose LLMs to Verticalized Enterprise Models

An Extremely Simple Approach to Domain Adaptation for Enterprise GenAI Use

Verticalization is a necessary step for deploying AI in the enterprise. But what does verticalizing a model mean, anyway? In practical terms, this means that when we ask the AI model, for example, “what’s needed to open an account?”, if the model is for the Banking domain it will know that the user is referring to a bank account (savings, current account…) and not an e-commerce account. In technical terms: the model knows how to disambiguate between the different meanings of a word depending on the vertical/domain. Verticalizing covers means more things (for example, the model will speak in the tone and style typical for that industry: polite, verbose…), but we will not focus on those here.

So far, there are two approaches to this:

Build a foundation model from scratch, like Bloomberg or SambaNova have done in the Finance domain. This approach is extremely expensive in every aspect, not only budget but also time; it’s reserved to a few powerhouses.

Start with a general-purpose model (GPT, Mistral…), add end-user vertical data and apply it to a use case. This approach is the most widely used. It still requires a significant amount of work in terms of data (selection, cleaning, normalization…), human resources (good talent capable of handling fine-tuning, evaluation…) and tools (data and cloud platform…) Besides, delivering the expected results remains a challenge. As a sign of this, most of the use cases targeted by enterprises are internal use cases; external ones remain too risky.

We propose the use of a faster and more effective approach to using general-purpose GenAI for any domain at the enterprise level. The approach decomposes the problem into two steps:

Step 1 – Verticalize your favorite model(s) for a particular domain. Note: we’ve run this process both with GPT and Mistral for the Banking vertical.

Step 2 – Customize this verticalized model to your enterprise particular use case(s) with your own data.

What are the advantages? This two-step approach reduces needs on all fronts:

Time: it can be executed in a matter of weeks

Processing power: it can be executed on typical hardware (e.g., A100 GPU servers)

Tools: it can be executed using the regular fine-tuning tools provided by model

The time & resource savings come from the fact that vertical models can be pre-built (as we do in Bitext) and the task can focus only on Step 2. Bitext bases its pre-built models on proprietary Natural Language Generation technology, free of the typical issues with Generative AI and generating training data: hallucinations, PII, bias…

For more references about our finetuning services and the copilot demo performed with finetuning, here:

https://www.bitext.com/datasets-for-fine-tuning-llms

https://www.bitext.com/travel-copilot