AI Agent Design Patterns: ReAct, Tool Use, Memory, and Multi-Agent Systems

AI agents combine LLM reasoning with external action. This guide covers the core patterns used in production agent systems and the failure modes you must plan for.

What makes something an agent?

An agent is an LLM that decides which actions to take and observes the results in a feedback loop, rather than responding in a single pass. The loop can run for many steps until the task is complete.

let done = false
while (!done) {                  // illustrative only; real loops also need a step cap
  thought = llm.think(context)   // reason about next step
  action = llm.act(thought)      // select tool and args
  result = tool.execute(action)  // run the tool
  context.push(thought, action, result)
  done = llm.shouldStop(context)
}

ReAct pattern (Reason + Act)

ReAct interleaves reasoning traces with tool calls, making agent behavior inspectable:

Thought: The user wants to know the current BTC price.
         I should call the market data tool.
Action: get_price({ symbol: "BTC-USD" })
Observation: { price: 67420.15, timestamp: "2026-06-10T05:00:00Z" }
Thought: I have the price. I can now answer the user.
Answer: The current BTC price is $67,420.15.

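Implementation tip: describe the Thought:, Action:, and Observation: prefixes in your system prompt to guide structured output. It is common to stop generation at Observation: so that the runtime, not the model, supplies the tool result.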
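When the model emits these prefixes as plain text, the runtime has to parse them before it can call a tool. The following is a minimal sketch, not a reference implementation. It assumes the action fits on one line in the form name(args) and that the arguments are strict JSON (quoted keys), which is stricter than the trace shown above. The names parseReActStep and ReActStep are hypothetical.

```typescript
// Hypothetical helper for illustration; not part of any SDK.
type ReActStep =
  | { kind: 'action'; thought: string; tool: string; args: unknown }
  | { kind: 'answer'; thought: string; answer: string }
  | { kind: 'invalid'; reason: string };

function parseReActStep(text: string): ReActStep {
  // The thought runs until the next Action:/Answer: prefix or the end of the text.
  const thoughtMatch = /Thought:\s*([\s\S]*?)(?=\n\s*(?:Action|Answer):|$)/.exec(text);
  const thought = thoughtMatch ? thoughtMatch[1].trim().replace(/\s+/g, ' ') : '';

  // An Action takes priority: anything the model wrote after it (for example
  // an invented Observation) is ignored.
  const action = /^[ \t]*Action:[ \t]*([A-Za-z_]\w*)\((.*)\)[ \t]*$/m.exec(text);
  if (action) {
    const rawArgs = action[2].trim();
    try {
      const args: unknown = rawArgs === '' ? {} : JSON.parse(rawArgs);
      return { kind: 'action', thought, tool: action[1], args };
    } catch {
      return { kind: 'invalid', reason: `Arguments for ${action[1]} are not valid JSON` };
    }
  }

  const answer = /^[ \t]*Answer:[ \t]*([\s\S]*)$/m.exec(text);
  if (answer) return { kind: 'answer', thought, answer: answer[1].trim() };

  return { kind: 'invalid', reason: 'No Action: or Answer: line found' };
}
```

Returning an explicit invalid result, rather than throwing, lets the loop feed the reason back to the model as an observation so it can retry with a well-formed step.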

Tool definition

const tools = [
  {
    type: 'function',
    function: {
      name: 'search_database',
      description: 'Run a SQL query against the analytics database. Use for data questions.',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'A valid read-only SQL SELECT statement' },
        },
        required: ['query'],
        additionalProperties: false,
      },
    },
  },
];

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages,
  tools,
  tool_choice: 'auto',
});

Tool execution loop

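import OpenAI from 'openai';

const openai = new OpenAI();  // reads OPENAI_API_KEY from the environment

// Assumes `tools`, `systemPrompt`, and `executeTool` are defined elsewhere.
async function runAgent(userMessage: string): Promise<string> {
  const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: userMessage },
  ];

  for (let step = 0; step < 10; step++) {  // max steps guard
    const response = await openai.chat.completions.create({ model: 'gpt-4o', messages, tools });
    const { message: msg, finish_reason } = response.choices[0];
    messages.push(msg);

    // Check for tool calls directly instead of relying only on finish_reason.
    const calls = msg.tool_calls ?? [];
    if (calls.length === 0) {
      if (finish_reason !== 'stop') {
        // For example 'length' (truncated) or 'content_filter': do not loop on these.
        throw new Error(`Model stopped without a final answer: ${finish_reason}`);
      }
      return msg.content ?? '';  // done
    }

    // Every tool call needs a matching tool message, even when the call fails.
    for (const call of calls) {
      let content: string;
      try {
        const args = JSON.parse(call.function.arguments);  // may throw on malformed JSON
        content = JSON.stringify(await executeTool(call.function.name, args));
      } catch (err) {
        // Report the failure to the model so it can correct the call.
        content = JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
      }
      messages.push({ role: 'tool', tool_call_id: call.id, content });
    }
  }
  throw new Error('Agent exceeded maximum steps');
}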
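The loop above calls executeTool without defining it. One possible shape is sketched below, assuming a small in-memory registry. The names toolRegistry, RegisteredTool, validateToolCall, and ToolResult are hypothetical, and the handler is a stub rather than a real database call. The check on required string arguments is a deliberately simple stand-in for full JSON Schema validation.

```typescript
// Hypothetical registry for illustration; the handler body is a stub.
type ToolResult = { ok: true; result: unknown } | { ok: false; error: string };

interface RegisteredTool {
  requiredStrings: string[]; // names of required string arguments
  handler: (args: Record<string, unknown>) => Promise<unknown>;
}

const toolRegistry = new Map<string, RegisteredTool>([
  ['search_database', {
    requiredStrings: ['query'],
    handler: async (args) => {
      // Stub: a real handler would run args.query over a read-only connection.
      return { rows: [], query: args.query };
    },
  }],
]);

// Returns an error message, or null when the call is acceptable.
function validateToolCall(name: string, args: unknown): string | null {
  const tool = toolRegistry.get(name);
  if (!tool) return `Unknown tool "${name}"`;
  if (typeof args !== 'object' || args === null || Array.isArray(args)) {
    return 'Arguments must be a JSON object';
  }
  const record = args as Record<string, unknown>;
  for (const key of tool.requiredStrings) {
    if (typeof record[key] !== 'string' || record[key] === '') {
      return `Argument "${key}" must be a non-empty string`;
    }
  }
  return null;
}

async function executeTool(name: string, args: unknown): Promise<ToolResult> {
  const problem = validateToolCall(name, args);
  if (problem !== null) return { ok: false, error: problem };
  try {
    const tool = toolRegistry.get(name)!; // presence was checked by validateToolCall
    return { ok: true, result: await tool.handler(args as Record<string, unknown>) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Because failures come back as data instead of exceptions, the serialized result can go straight into the tool message and the model sees why a call was rejected.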

Memory strategies

Memory type	Storage	Use case
In-context	Messages array	Single session, short task
Summary memory	LLM-compressed history	Long conversations, reduce tokens
Entity memory	Key-value store	Track named entities across turns
Episodic memory	Vector store	Long-term recall across sessions
Procedural memory	Tool definitions	Learned task-specific workflows

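// Summary memory: compress history to save tokens
async function summarizeHistory(messages: { role: string; content?: unknown }[]): Promise<string> {
  // Tool-call messages can have null or non-string content, so serialize those explicitly.
  const transcript = messages
    .map(m => `${m.role}: ${typeof m.content === 'string' ? m.content : JSON.stringify(m.content ?? '')}`)
    .join('\n');
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: 'Summarize the following conversation concisely.' },
      { role: 'user', content: transcript },
    ],
  });
  return response.choices[0].message.content ?? '';
}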
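A summarizer only saves tokens once its output replaces the older messages. The sketch below shows one way to do that, assuming a simplified ChatMessage shape without tool-call messages. A real implementation must keep each assistant tool call together with its tool results. The names compactHistory and Summarizer are hypothetical. The summarizer is injected, so summarizeHistory above or a test stub can be passed in.

```typescript
// Hypothetical shapes for illustration; tool-call messages are left out on purpose.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

type Summarizer = (older: ChatMessage[]) => Promise<string>;

async function compactHistory(
  messages: ChatMessage[],
  keepRecent: number,
  summarize: Summarizer,
): Promise<ChatMessage[]> {
  // System messages carry instructions, so they are never summarized away.
  const system = messages.filter(m => m.role === 'system');
  const dialogue = messages.filter(m => m.role !== 'system');
  if (dialogue.length <= keepRecent) return messages; // nothing old enough to compress

  const cut = dialogue.length - keepRecent;
  const summary = await summarize(dialogue.slice(0, cut));
  return [
    ...system,
    { role: 'system', content: `Summary of earlier conversation: ${summary}` },
    ...dialogue.slice(cut), // the most recent turns stay verbatim
  ];
}
```

Placing the summary in a system message is one design choice among several. Some systems put it in a user message or a dedicated memory section of the prompt instead.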

Multi-agent patterns

Orchestrator + Worker

One planner agent breaks the task into sub-tasks and delegates to specialist agents:

// Orchestrator decides which specialist to call
const plan = await plannerAgent.run(userRequest);
const results = await Promise.all(
  plan.subtasks.map(task => specialistAgents[task.type].run(task))
);
const finalAnswer = await plannerAgent.synthesize(results);
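Promise.all rejects as soon as one specialist fails, which discards the results of the others. The sketch below shows a more tolerant variant using Promise.allSettled. It assumes hypothetical Subtask and Agent shapes. The runSubtasks function and the SubtaskOutcome type are illustrative names, not part of any framework.

```typescript
// Hypothetical shapes for illustration.
interface Subtask { type: string; input: string }
interface Agent { run(task: Subtask): Promise<string> }

type SubtaskOutcome =
  | { task: Subtask; ok: true; output: string }
  | { task: Subtask; ok: false; error: string };

async function runSubtasks(
  subtasks: Subtask[],
  specialists: Map<string, Agent>,
): Promise<SubtaskOutcome[]> {
  const settled = await Promise.allSettled(
    subtasks.map(async (task) => {
      const agent = specialists.get(task.type);
      // The planner may name a specialist that does not exist.
      if (!agent) throw new Error(`No specialist registered for type "${task.type}"`);
      return agent.run(task);
    }),
  );

  // allSettled preserves input order, so index i maps back to subtasks[i].
  return settled.map((s, i): SubtaskOutcome =>
    s.status === 'fulfilled'
      ? { task: subtasks[i], ok: true, output: s.value }
      : { task: subtasks[i], ok: false, error: s.reason instanceof Error ? s.reason.message : String(s.reason) },
  );
}
```

The planner can then synthesize from the successful outcomes and decide whether to retry, re-plan, or report the failed ones.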

Pipeline (sequential)

Each agent's output is the next agent's input, which suits structured multi-step transformations:

const extracted = await extractionAgent.run(rawDocument);
const validated = await validationAgent.run(extracted);
const formatted  = await formattingAgent.run(validated);
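In a longer pipeline it helps to know which stage failed. The sketch below runs an array of stages in order and wraps any error with the stage name. Stage and runPipeline are hypothetical names, and values are typed as unknown for brevity. A real pipeline would usually give each stage typed inputs and outputs.

```typescript
// Hypothetical stage shape for illustration.
interface Stage {
  name: string;
  run(input: unknown): Promise<unknown>;
}

async function runPipeline(stages: Stage[], input: unknown): Promise<unknown> {
  let value = input;
  for (const stage of stages) {
    try {
      // Each stage receives the previous stage's output.
      value = await stage.run(value);
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      // Stop at the first failure and say where it happened.
      throw new Error(`Pipeline stage "${stage.name}" failed: ${detail}`);
    }
  }
  return value;
}
```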

Debate (critic + generator)

A generator produces output and a critic scores it; the cycle repeats until the score meets a quality threshold or the round limit is reached:

let draft = await generatorAgent.run(task);
for (let i = 0; i < 3; i++) {
  const critique = await criticAgent.evaluate(draft);
  if (critique.score >= 0.9) break;
  draft = await generatorAgent.refine(draft, critique.feedback);
}
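A refinement can score lower than the draft it replaced, and a loop of this shape can end on a draft that was never scored. The sketch below is one variant that scores every draft and returns the best one seen. Generator, Critic, Critique, and refineWithCritic are hypothetical names, and the defaults of 3 rounds and a 0.9 threshold mirror the snippet above.

```typescript
// Hypothetical agent interfaces for illustration.
interface Critique { score: number; feedback: string }
interface Generator {
  run(task: string): Promise<string>;
  refine(draft: string, feedback: string): Promise<string>;
}
interface Critic { evaluate(draft: string): Promise<Critique> }

async function refineWithCritic(
  task: string,
  generator: Generator,
  critic: Critic,
  maxRounds = 3,
  threshold = 0.9,
): Promise<{ draft: string; score: number }> {
  let draft = await generator.run(task);
  let best = { draft, score: -Infinity };

  // Up to maxRounds refinements, so at most maxRounds + 1 evaluations.
  for (let round = 0; round <= maxRounds; round++) {
    const critique = await critic.evaluate(draft);
    if (critique.score > best.score) best = { draft, score: critique.score };
    if (critique.score >= threshold || round === maxRounds) break;
    draft = await generator.refine(draft, critique.feedback);
  }
  return best; // highest-scoring draft, which may not be the last one produced
}
```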

Failure mode checklist

Infinite loops: always enforce a max-step limit.
Tool hallucination: validate tool names and arguments against the declared schema before execution.
Context overflow: monitor the token count and summarize or trim old messages before the context window fills.
Cost explosion: set a per-session token budget with a hard cutoff.
Prompt injection: treat tool outputs as untrusted data; sanitize or clearly delimit them before reinserting them into context.
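The cost-explosion item can be enforced with a small counter that is charged after every model call. The sketch below is a minimal version. TokenBudget is a hypothetical class, and the limit is whatever your deployment decides per session.

```typescript
// Hypothetical helper for illustration.
class TokenBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Record usage and throw once the session total exceeds the limit (hard cutoff).
  charge(tokens: number): void {
    this.used += tokens;
    if (this.used > this.limit) {
      throw new Error(`Token budget exceeded: ${this.used} of ${this.limit} tokens used`);
    }
  }

  get remaining(): number {
    return Math.max(0, this.limit - this.used);
  }
}

// In an agent loop, charge the budget after each completion, for example:
//   budget.charge(response.usage?.total_tokens ?? 0);
```

Because the charge happens after the call, the budget can be overshot by one response. Checking remaining before each call tightens the bound.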

Human-in-the-loop gates

// Require human approval before irreversible actions
if (isDestructiveAction(action)) {
  const approved = await requestHumanApproval(action, context);
  if (!approved) return { status: 'cancelled', reason: 'user denied' };
}
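The gate above depends on how isDestructiveAction classifies an action. One conservative approach is default-deny: only tools on an explicit read-only list skip approval. The sketch below assumes that approach. AgentAction, GateResult, READ_ONLY_TOOLS, and runWithApproval are hypothetical names, and the two listed tools are assumed to be read-only.

```typescript
// Hypothetical shapes for illustration.
interface AgentAction { tool: string; args: Record<string, unknown> }

type GateResult =
  | { status: 'executed'; result: unknown }
  | { status: 'cancelled'; reason: string };

// Assumption: these tools only read data. Anything not listed is treated as destructive.
const READ_ONLY_TOOLS = new Set(['search_database', 'get_price']);

function isDestructiveAction(action: AgentAction): boolean {
  return !READ_ONLY_TOOLS.has(action.tool);
}

async function runWithApproval(
  action: AgentAction,
  execute: (action: AgentAction) => Promise<unknown>,
  requestHumanApproval: (action: AgentAction) => Promise<boolean>,
): Promise<GateResult> {
  if (isDestructiveAction(action)) {
    const approved = await requestHumanApproval(action);
    // The action is never executed without an explicit yes.
    if (!approved) return { status: 'cancelled', reason: 'user denied' };
  }
  return { status: 'executed', result: await execute(action) };
}
```

With an allowlist, a newly added tool requires approval until someone deliberately marks it as read-only, which fails safe when the list is out of date.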

Takeaway

Agent reliability depends largely on tool quality, guardrail coverage, and memory design. Enforce strict step limits, validate every tool input, and log every reasoning step for post-mortem debugging.