The Evolution of LLM Architecture: From Simple Chatbot to Complex System

Justin Torre · September 18, 2025

Figuring out the right tech stack can be challenging. This simplified guide illustrates how a basic LLM chatbot application can evolve in complexity.

Example: Evolution of a Chatbot

Let's consider a simple internal chatbot designed to help employees of a small business manage their inbox.

Stage 1: The Basics

Initially, you can simply copy and paste the last 10 emails into the context.

LLM Stack Example - Stage 1

System:

HERE ARE THE LAST 10 EMAILS IN THE INBOX

EMAILS: [{
...
}, ...]

Answer the user's questions.

User:

What is the status of the order with ID 123456?
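In code, this stage is little more than string assembly. A minimal sketch in Python, where the email fields and the `build_prompt` helper are hypothetical:

```python
import json

def build_prompt(emails: list[dict]) -> str:
    """Build the system prompt from the 10 most recent emails."""
    recent = emails[-10:]  # naive approach: just take the last 10
    return (
        "HERE ARE THE LAST 10 EMAILS IN THE INBOX\n\n"
        f"EMAILS: {json.dumps(recent, indent=2)}\n\n"
        "Answer the user's questions."
    )

# Toy inbox: 25 emails, only the last 10 make it into the context.
emails = [{"id": str(i), "subject": f"Order update {i}"} for i in range(25)]
system_prompt = build_prompt(emails)
```

The resulting string is sent as the system message of a single chat-completion call; there is no retrieval, memory, or tooling yet.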

Stage 2: Observability

As your app gains popularity, you may find yourself spending $100 a day on OpenAI. At this stage, basic observability becomes essential.

LLM Stack Example - Stage 2
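The simplest form of observability is logging token usage and cost per request. A sketch of that idea, with illustrative per-token prices (assumptions, not real pricing):

```python
from dataclasses import dataclass, field

# Illustrative per-million-token prices; check your provider's pricing page.
PRICES = {"gpt-4o": {"prompt": 2.50 / 1e6, "completion": 10.00 / 1e6}}

@dataclass
class UsageLog:
    records: list = field(default_factory=list)

    def log(self, model: str, prompt_tokens: int, completion_tokens: int) -> float:
        """Record one request and return its estimated cost in dollars."""
        price = PRICES[model]
        cost = prompt_tokens * price["prompt"] + completion_tokens * price["completion"]
        self.records.append({"model": model, "cost": cost})
        return cost

    def daily_spend(self) -> float:
        return sum(r["cost"] for r in self.records)

log = UsageLog()
log.log("gpt-4o", prompt_tokens=1000, completion_tokens=500)
```

In practice you would send these records to an observability platform rather than keep them in memory, but the data model is the same: per-request model, tokens, and cost.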

Stage 3: Scaling

Users may complain that the chatbot only considers the last 10 emails. To address this, implement a Vector DB to store all emails and use embeddings to retrieve the 10 most relevant ones.

LLM Stack Example - Stage 3
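Under the hood, retrieval ranks stored email embeddings by similarity to the query embedding. A self-contained sketch using cosine similarity over toy vectors (a real system would use a provider's embedding model and a vector database):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 10):
    """Return the k emails whose embeddings are closest to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [email for email, _ in ranked[:k]]

# Toy 2-D embeddings standing in for real embedding vectors.
store = [("refund email", [1.0, 0.0]),
         ("lunch invite", [0.0, 1.0]),
         ("order 123456 shipped", [0.9, 0.1])]
results = top_k([1.0, 0.0], store, k=2)
```

The chatbot then builds its context from `results` instead of blindly taking the last 10 emails.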

Stage 4: Gateway

To manage costs, you may need to rate-limit users and add a caching layer. This is where a gateway comes into play.

LLM Stack Example - Stage 4
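A gateway sits between your app and the provider, enforcing per-user limits and short-circuiting repeated prompts. A minimal in-memory sketch, assuming a sliding one-minute window and an exact-match cache (the class and its API are illustrative):

```python
import hashlib
import time

class Gateway:
    """Toy LLM gateway: per-user rate limiting plus a response cache."""

    def __init__(self, max_per_minute: int = 5):
        self.max_per_minute = max_per_minute
        self.calls = {}   # user_id -> list of request timestamps
        self.cache = {}   # prompt hash -> cached response

    def request(self, user_id: str, prompt: str, call_model):
        now = time.time()
        window = [t for t in self.calls.get(user_id, []) if now - t < 60]
        if len(window) >= self.max_per_minute:
            raise RuntimeError("rate limit exceeded")
        window.append(now)
        self.calls[user_id] = window
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key not in self.cache:  # only hit the provider on a cache miss
            self.cache[key] = call_model(prompt)
        return self.cache[key]

call_count = {"n": 0}
def fake_model(prompt):  # stands in for a real provider call
    call_count["n"] += 1
    return f"answer to: {prompt}"

gw = Gateway(max_per_minute=2)
first = gw.request("alice", "status of order 123456?", fake_model)
second = gw.request("alice", "status of order 123456?", fake_model)  # cache hit
```

Production gateways add semantic (similarity-based) caching, retries, and key management on top of this basic shape.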

Stage 5: Tools

Enhance functionality by adding tools that perform actions on behalf of users, such as marking emails as read or adding events to a calendar.

LLM Stack Example - Stage 5
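With tool calling, the model returns a structured request like `{"name": ..., "arguments": {...}}` and your code executes it. A sketch with a hypothetical registry and two inbox tools:

```python
# Hypothetical tool registry; tool names and signatures are illustrative.
TOOLS = {}

def tool(fn):
    """Decorator that registers a function as a callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def mark_as_read(email_id: str) -> str:
    return f"email {email_id} marked as read"

@tool
def add_calendar_event(title: str, date: str) -> str:
    return f"event '{title}' added on {date}"

def dispatch(tool_call: dict) -> str:
    """Execute a tool call the model requested."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["arguments"])

result = dispatch({"name": "mark_as_read", "arguments": {"email_id": "123456"}})
```

The tool result is appended to the conversation so the model can report back to the user.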

Stage 6: Prompting

Implement a robust prompt management solution to handle prompt versions for testing and observability.

LLM Stack Example - Stage 6

Stage 7: Agents

Some actions may require multiple tool calls in a loop, where the model uses each tool's result to decide on the next action. This is where Agents come into play.

LLM Stack Example - Stage 7

Agents are higher-level integrations that operate within complex environments: instead of a single direct provider call, the application runs a loop of prompts and tool executions until the task is complete.
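The loop above can be sketched in a few lines. Here `plan_next_step` stands in for an LLM call that proposes the next tool; the planner, tool names, and `finish` convention are all illustrative:

```python
def run_agent(goal: str, plan_next_step, tools: dict, max_steps: int = 5):
    """Agent loop: the model proposes a tool call, we run it, feed the result back."""
    history = [goal]
    for _ in range(max_steps):
        step = plan_next_step(history)          # stands in for an LLM call
        if step["name"] == "finish":            # model decides it is done
            return step["arguments"]["answer"]
        result = tools[step["name"]](**step["arguments"])
        history.append(result)                  # tool result informs the next step
    return None  # give up after max_steps to avoid infinite loops

def fake_planner(history):
    if len(history) == 1:
        return {"name": "lookup_order", "arguments": {"order_id": "123456"}}
    return {"name": "finish", "arguments": {"answer": history[-1]}}

tools = {"lookup_order": lambda order_id: f"order {order_id} shipped"}
answer = run_agent("status of 123456?", fake_planner, tools)
```

The `max_steps` cap is the important design choice: without it, a confused model can loop forever and burn tokens.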

Stage 8: Model Load Balancer

As your application grows, different models may be better suited for specific tasks. A model load balancer can help distribute the workload effectively.

LLM Stack Example - Stage 8
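A simple load balancer maps each task type to a pool of suitable models and rotates through them. A round-robin sketch; the routing table and model names are examples, not recommendations:

```python
import itertools

# Illustrative routing table: cheap models for summaries, stronger ones for drafting.
ROUTES = {
    "summarize": ["gpt-4o-mini", "claude-3-haiku"],
    "draft_reply": ["gpt-4o", "claude-3-5-sonnet"],
}

# One round-robin iterator per task type.
_counters = {task: itertools.cycle(models) for task, models in ROUTES.items()}

def pick_model(task: str) -> str:
    """Return the next model in the pool for this task."""
    return next(_counters[task])
```

Real balancers also weight by latency and cost, and fail over to the next model in the pool when a provider errors out.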

Stage 9: Testing

To make data actionable, implement a testing framework that provides insights and evaluators to assess the quality of your model's outputs.

LLM Stack Example - Stage 9
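An evaluator is just a function that scores an output; a test run averages those scores over a set of cases. A minimal sketch with two simple evaluators (real frameworks add LLM-as-judge scoring and dashboards):

```python
def exact_match(output: str, expected: str) -> float:
    """1.0 if the output matches the expected answer, ignoring case/whitespace."""
    return float(output.strip().lower() == expected.strip().lower())

def contains_keywords(output: str, keywords: list[str]) -> float:
    """Fraction of required keywords present in the output."""
    hits = sum(1 for k in keywords if k.lower() in output.lower())
    return hits / len(keywords)

def run_evals(cases: list[dict], evaluator) -> float:
    """Average evaluator score across all test cases."""
    scores = [evaluator(c["output"], c["expected"]) for c in cases]
    return sum(scores) / len(scores)

cases = [
    {"output": "Shipped", "expected": "shipped"},
    {"output": "pending", "expected": "shipped"},
]
score = run_evals(cases, exact_match)
```

Running this suite on every prompt or model change turns "it seems worse" into a number you can track.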

Stage 10: Fine Tuning

Fine-tuning is typically employed for workloads requiring significant customization, especially when optimizing for specific problems or cost savings.

LLM Stack Example - Stage 10
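The bulk of fine-tuning work is preparing training data: exporting your logged request/response pairs into the provider's expected format. A sketch that formats one example as an OpenAI-style chat fine-tuning JSONL record (the helper name and default system prompt are hypothetical):

```python
import json

def to_finetune_record(question: str, ideal_answer: str,
                       system: str = "You manage an email inbox.") -> str:
    """Format one training example as a chat-format JSONL line."""
    return json.dumps({
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
            {"role": "assistant", "content": ideal_answer},
        ]
    })

record = to_finetune_record(
    "What is the status of order 123456?",
    "Order 123456 shipped on September 16 and should arrive within 3 days.",
)
```

One record per line gives you the JSONL file a fine-tuning job consumes; the payoff is a smaller, cheaper model specialized to your inbox workload.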


Questions or feedback?

Is the information out of date? Please open a support request or contact us; we'd love to hear from you!