How Agents Talk to LLMs

Understanding API Calls, Messages, and Roles

Diagram: Agent-LLM Communication Flow

[Diagram: Agent-LLM communication loop. The agent (your code) manages state as a messages list of system, user, and assistant entries that holds the full conversation history. The LLM API (OpenAI, Claude) is stateless. Flow: (1) build the messages, (2) POST them to the API, (3) receive the response, extract the content, and append it to messages.]

Step 1: The Basic Structure

When an agent talks to an LLM, it sends an HTTP POST request with a JSON body:

{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
}
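The same body can be assembled in code before it is sent. This is a minimal sketch using only the standard library; `build_payload` is a hypothetical helper name, not part of any SDK.

```python
import json

def build_payload(system_prompt, user_text, model="gpt-4o"):
    """Return the chat request body as a plain dict (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
    }

payload = build_payload("You are a helpful assistant.", "Hello!")
body = json.dumps(payload)  # the JSON string that goes into the POST body
print(body)
```

Keeping the payload as a dict until the last moment makes it easy to append more messages before serializing.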

Step 2: The Three Roles

Role       Purpose
system     Sets behavior/personality. The LLM follows these instructions.
user       Human input. Questions, commands, context.
assistant  The LLM's previous responses (for conversation history).
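Because every message must carry one of these roles, an agent can check its list before sending it. The sketch below assumes a hypothetical helper `check_messages` and plain string content. Real APIs may accept additional roles and richer content types; this check covers only the three roles in the table.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}  # only the roles covered here

def check_messages(messages):
    """Raise ValueError if a message is malformed (hypothetical helper)."""
    for i, msg in enumerate(messages):
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: unknown role {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            raise ValueError(f"message {i}: content must be a string")
    return messages

check_messages([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```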

Step 3: Real OpenAI API Call (curl)

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a pirate. Respond in pirate speak."},
      {"role": "user", "content": "What is 2+2?"}
    ]
  }'

Response structure (abridged; the full response also carries fields such as id, model, and usage):

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Arrr! That be 4, matey!"
      }
    }
  ]
}
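The curl call and the response handling can be mirrored with Python's standard library. This is a sketch, not how any SDK does it: `build_request` and `extract_content` are hypothetical helper names, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"  # same URL as the curl call

def build_request(payload, api_key):
    """Build (but do not send) the POST request that the curl command issues."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def extract_content(response_text):
    """Pull the assistant's text out of a raw JSON response string."""
    data = json.loads(response_text)
    return data["choices"][0]["message"]["content"]

sample = '{"choices": [{"message": {"role": "assistant", "content": "Arrr! That be 4, matey!"}}]}'
print(extract_content(sample))  # Arrr! That be 4, matey!
```

To send the request for real, pass the result of `build_request` to `urllib.request.urlopen` and give the decoded body to `extract_content`.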

Step 4: Real OpenAI API Call (Python)

# Install: pip install openai
from openai import OpenAI

client = OpenAI()  # Uses OPENAI_API_KEY env var

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

# Extract the response
answer = response.choices[0].message.content
print(answer)  # "The capital of France is Paris."
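In the OpenAI Python SDK, `message.content` is typed as optional, so it can be `None` (for example, when the model returns something other than plain text). The defensive sketch below assumes a hypothetical helper `text_of` that accepts any object shaped like the SDK response. A stand-in object is used so it runs without a network call.

```python
from types import SimpleNamespace

def text_of(response, default=""):
    """Return the first choice's text, or `default` if there is none."""
    if not response.choices:
        return default
    content = response.choices[0].message.content
    return content if content is not None else default

# Stand-in with the same attribute shape as the SDK response object.
fake = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Paris"))]
)
print(text_of(fake))  # Paris
```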

Step 5: Multi-Turn Conversation

Agents maintain conversation history by appending messages:

# Reuses the `client` created in Step 4 (client = OpenAI())
messages = [
    {"role": "system", "content": "You are a math tutor."}
]

# Turn 1
messages.append({"role": "user", "content": "What is 5 * 3?"})
response = client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_msg = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_msg})

# Turn 2 - LLM now has context from Turn 1
messages.append({"role": "user", "content": "Now divide that by 5"})
response = client.chat.completions.create(model="gpt-4o", messages=messages)
# The request includes Turn 1, so the model can resolve "that" to 15 and answer 3

Key insight: The LLM is stateless. It sees only what is in the current request, so the agent must send the entire conversation each time.
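Statelessness can be demonstrated without calling a real API. In this sketch, `fake_llm` is a hypothetical stand-in for the model: like the real endpoint, it can use only the messages passed in a single call.

```python
def fake_llm(messages):
    """Stand-in for the API: it sees only what is passed in this one call."""
    numbers = [
        m["content"] for m in messages
        if m["role"] == "assistant" and m["content"].isdigit()
    ]
    if numbers:
        return str(int(numbers[-1]) // 5)
    return "Divide what? I have no earlier result."

history = [
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "What is 5 * 3?"},
    {"role": "assistant", "content": "15"},
    {"role": "user", "content": "Now divide that by 5"},
]

print(fake_llm(history))       # 3
print(fake_llm(history[-1:]))  # Divide what? I have no earlier result.
```

Sending only the latest user message drops the "15" from the request, so the follow-up question can no longer be answered.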

Step 6: The Agent Loop Pattern

from openai import OpenAI

client = OpenAI()  # Uses OPENAI_API_KEY env var

def agent_loop():
    messages = [{"role": "system", "content": "You are a helpful assistant."}]

    while True:
        user_input = input("You: ")
        if user_input == "quit":
            break

        # 1. Add user message
        messages.append({"role": "user", "content": user_input})

        # 2. Call LLM
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages
        )

        # 3. Extract and store assistant response
        assistant_content = response.choices[0].message.content
        messages.append({"role": "assistant", "content": assistant_content})

        # 4. Display to user
        print(f"Assistant: {assistant_content}")
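The same loop is easier to test if the model call is passed in as a function. This sketch assumes hypothetical names `run_agent` and `echo_llm`, with a stub in place of the real API so it runs offline.

```python
def run_agent(call_llm, user_inputs, system_prompt="You are a helpful assistant."):
    """Run the loop over an iterable of inputs and return the final history.

    call_llm is any function that maps a messages list to a reply string.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_input in user_inputs:
        if user_input == "quit":
            break
        messages.append({"role": "user", "content": user_input})
        reply = call_llm(list(messages))  # pass a copy of the full history
        messages.append({"role": "assistant", "content": reply})
    return messages

def echo_llm(messages):
    """Stub model: reports how many messages it received."""
    return f"I received {len(messages)} messages"

history = run_agent(echo_llm, ["hi", "again", "quit", "ignored"])
for m in history:
    print(m["role"], "-", m["content"])
```

With the real SDK, `call_llm` could wrap `client.chat.completions.create(...)` and return `response.choices[0].message.content`. The growing count in the stub's replies shows the full history being resent on every turn.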

Key Takeaways

  1. Messages are the interface - Everything flows through the messages array
  2. Roles define context - system (instructions), user (input), assistant (history)
  3. LLMs are stateless - Agents must track and resend the full conversation on every call
  4. The loop pattern - User input → append → call API → append response → repeat