AI Agent Design Patterns: ReAct, Tool Use, Memory, and Multi-Agent Systems
AI agents combine LLM reasoning with external action. This guide covers the core patterns used in production agent systems and the failure modes you must plan for.
What makes something an agent?
An agent is an LLM that decides which actions to take and observes the results in a feedback loop, rather than responding in a single pass. The loop can run for many steps until the task is complete.
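let done = false;
while (!done) {
  const thought = llm.think(context);     // reason about the next step
  const action = llm.act(thought);        // select a tool and its arguments
  const result = tool.execute(action);    // run the tool
  context.push(thought, action, result);  // record the step for the next iteration
  done = llm.shouldStop(context);         // stop once the task is complete
}

ReAct pattern (Reason + Act)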
ReAct interleaves reasoning traces with tool calls, making agent behavior inspectable:
Thought: The user wants to know the current BTC price.
I should call the market data tool.
Action: get_price({ symbol: "BTC-USD" })
Observation: { price: 67420.15, timestamp: "2026-06-10T05:00:00Z" }
Thought: I have the price. I can now answer the user.
Answer: The current BTC price is $67,420.15.

Implementation tip: include the Thought:, Action:, and Observation: prefixes in your system prompt to guide structured output.
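If the model follows these prefixes, its text output can be split back into fields before any tool is run. The sketch below is one hedged way to do that. `parseReActStep` and the `ReActStep` type are hypothetical names, and the parser assumes the action has the form `tool_name(<JSON arguments>)` with strictly valid JSON (the trace above uses a looser object-literal style, which `JSON.parse` would reject).

```typescript
// Hypothetical helper: parse one ReAct step emitted as plain text.
// Assumes "Action: tool_name({...})" where the arguments are strict JSON.
interface ReActStep {
  thought: string;
  action?: { tool: string; args: unknown };
  answer?: string;
}

function parseReActStep(text: string): ReActStep {
  const step: ReActStep = { thought: '' };
  let current: 'thought' | null = null;

  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '') continue;

    if (line.startsWith('Thought:')) {
      const body = line.slice('Thought:'.length).trim();
      step.thought = step.thought ? `${step.thought} ${body}` : body;
      current = 'thought';
    } else if (line.startsWith('Action:')) {
      const call = line.slice('Action:'.length).trim();
      const match = /^([A-Za-z_][A-Za-z0-9_]*)\((.*)\)$/.exec(call);
      if (!match) throw new Error(`Malformed action line: ${line}`);
      const rawArgs = match[2].trim();
      step.action = { tool: match[1], args: rawArgs === '' ? {} : JSON.parse(rawArgs) };
      current = null;
    } else if (line.startsWith('Observation:')) {
      // Observations come from the runtime, not the model; skip them here.
      current = null;
    } else if (line.startsWith('Answer:')) {
      step.answer = line.slice('Answer:'.length).trim();
      current = null;
    } else if (current === 'thought') {
      // Continuation line of a multi-line thought
      step.thought = `${step.thought} ${line}`;
    }
  }
  return step;
}
```

With native tool calling, as in the next section, the API returns structured `tool_calls` and this kind of text parsing is not needed.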
Tool definition
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const tools: OpenAI.Chat.Completions.ChatCompletionTool[] = [
  {
    type: 'function',
    function: {
      name: 'search_database',
      description: 'Run a SQL query against the analytics database. Use for data questions.',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'A valid read-only SQL SELECT statement' },
        },
        required: ['query'],
        additionalProperties: false,
      },
    },
  },
];

// `messages` is the conversation so far (system, user, assistant, and tool messages)
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages,
  tools,
  tool_choice: 'auto',
});

Tool execution loop
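The loop in this section calls an `executeTool` helper that the guide leaves undefined. One possible shape, offered as a sketch rather than the guide's own implementation, is a registry keyed by tool name that rejects unknown names and malformed arguments before anything runs. The `ToolHandler` type, the `toolRegistry` map, and the stubbed `search_database` handler are all assumptions made for illustration.

```typescript
// Hypothetical registry: each tool has an argument validator and a handler.
interface ToolHandler {
  validate: (args: unknown) => string | null; // error message, or null if valid
  run: (args: Record<string, unknown>) => Promise<unknown>;
}

const toolRegistry = new Map<string, ToolHandler>([
  ['search_database', {
    validate: (args) => {
      if (typeof args !== 'object' || args === null) return 'arguments must be an object';
      const query = (args as Record<string, unknown>).query;
      if (typeof query !== 'string') return 'query must be a string';
      // Coarse read-only check; a real deployment should also use a read-only DB role.
      if (!/^\s*select\b/i.test(query)) return 'only SELECT statements are allowed';
      return null;
    },
    // Stub handler: a real one would run the query against the database.
    run: async (args) => ({ rows: [], query: args.query }),
  }],
]);

async function executeTool(name: string, args: unknown): Promise<unknown> {
  const tool = toolRegistry.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`); // the model named a tool that does not exist
  const problem = tool.validate(args);
  if (problem) throw new Error(`Invalid arguments for ${name}: ${problem}`);
  return tool.run(args as Record<string, unknown>);
}
```

Throwing on a bad call lets the caller pass the error text back to the model as the tool result, so the model can correct itself on the next step.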
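import type { ChatCompletionMessageParam } from 'openai/resources/chat/completions';

async function runAgent(userMessage: string): Promise<string | null> {
  // Explicit typing lets assistant and tool messages be pushed onto the same array
  const messages: ChatCompletionMessageParam[] = [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: userMessage },
  ];

  for (let step = 0; step < 10; step++) { // max-steps guard
    const response = await openai.chat.completions.create({ model: 'gpt-4o', messages, tools });
    const choice = response.choices[0];
    const msg = choice.message;
    messages.push(msg); // keep the assistant turn, including any tool_calls

    if (choice.finish_reason === 'stop') {
      return msg.content; // done
    }

    if (choice.finish_reason === 'tool_calls') {
      for (const call of msg.tool_calls ?? []) {
        if (call.type !== 'function') continue; // only function tools are defined
        let content: string;
        try {
          const result = await executeTool(call.function.name, JSON.parse(call.function.arguments));
          content = JSON.stringify(result);
        } catch (err) {
          // Malformed JSON or a failed tool: report it to the model instead of crashing
          content = JSON.stringify({ error: String(err) });
        }
        messages.push({ role: 'tool', tool_call_id: call.id, content });
      }
    }
  }
  throw new Error('Agent exceeded maximum steps');
}

Memory strategies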
| Memory type | Storage | Use case |
|---|---|---|
| In-context | Messages array | Single session, short task |
| Summary memory | LLM-compressed history | Long conversations, reduce tokens |
| Entity memory | Key-value store | Track named entities across turns |
| Episodic memory | Vector store | Long-term recall across sessions |
| Procedural memory | Tool definitions | Learned task-specific workflows |
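The entity-memory row in the table can be made concrete with a small key-value store. The following is a minimal sketch, not a prescribed design: `EntityMemory` and its method names are hypothetical, and it keeps everything in process memory, whereas a real system would likely back it with a persistent store.

```typescript
// Hypothetical entity memory: a key-value store of facts about named entities.
class EntityMemory {
  private facts = new Map<string, Map<string, string>>();

  // Record or overwrite one attribute of an entity.
  remember(entity: string, attribute: string, value: string): void {
    const attrs = this.facts.get(entity) ?? new Map<string, string>();
    attrs.set(attribute, value);
    this.facts.set(entity, attrs);
  }

  // Look up one attribute; undefined if the entity or attribute is unknown.
  recall(entity: string, attribute: string): string | undefined {
    return this.facts.get(entity)?.get(attribute);
  }

  // Render the known facts as text that can be placed in the system prompt.
  toPrompt(): string {
    const lines: string[] = [];
    this.facts.forEach((attrs, entity) => {
      const parts: string[] = [];
      attrs.forEach((value, key) => parts.push(`${key}=${value}`));
      lines.push(`${entity}: ${parts.join(', ')}`);
    });
    return lines.join('\n');
  }
}
```

Only the rendered facts are sent to the model on each turn, so the prompt stays small even when the raw conversation is long.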
// Summary memory: compress history to save tokens
async function summarizeHistory(messages: { role: string; content: string }[]): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: 'Summarize the following conversation concisely.' },
      { role: 'user', content: messages.map(m => `${m.role}: ${m.content}`).join('\n') },
    ],
  });
  return response.choices[0].message.content ?? ''; // content can be null
}

Multi-agent patterns
Orchestrator + Worker
One planner agent breaks the task into sub-tasks and delegates to specialist agents:
// Orchestrator decides which specialist to call
const plan = await plannerAgent.run(userRequest);
const results = await Promise.all(
  plan.subtasks.map(task => specialistAgents[task.type].run(task))
);
const finalAnswer = await plannerAgent.synthesize(results);
Pipeline (sequential)
Each agent's output is the next agent's input — good for structured multi-step transformations:
const extracted = await extractionAgent.run(rawDocument);
const validated = await validationAgent.run(extracted);
const formatted = await formattingAgent.run(validated);
Debate (critic + generator)
A generator produces output and a critic scores it; the cycle repeats until a quality threshold is met or an iteration cap is reached:
let draft = await generatorAgent.run(task);
for (let i = 0; i < 3; i++) {
  const critique = await criticAgent.evaluate(draft);
  if (critique.score >= 0.9) break;
  draft = await generatorAgent.refine(draft, critique.feedback);
}

Failure mode checklist
- Infinite loops: always enforce a max-step limit.
- Tool hallucination: validate tool names and arguments before execution.
- Context overflow: monitor the token count and summarize or trim old messages.
- Cost explosion: set a per-session token budget with a hard cutoff.
- Prompt injection: sanitize tool outputs before reinserting into context.
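The cost-explosion item above could be enforced with a small budget tracker that is charged after every model call. This is a sketch under assumptions: `TokenBudget` is a hypothetical class, and it presumes the caller passes in a token count such as the `usage.total_tokens` field reported in a Chat Completions response.

```typescript
// Hypothetical per-session token budget with a hard cutoff.
class TokenBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Add the tokens consumed by one model call; throw once the limit is exceeded.
  charge(tokens: number): void {
    this.used += tokens;
    if (this.used > this.limit) {
      throw new Error(`Token budget exceeded: ${this.used} > ${this.limit}`);
    }
  }

  // Tokens still available, never negative.
  get remaining(): number {
    return Math.max(0, this.limit - this.used);
  }
}
```

In the agent loop, a call such as `budget.charge(response.usage?.total_tokens ?? 0)` after each completion would end the session as soon as the limit is crossed.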
Human-in-the-loop gates
// Require human approval before irreversible actions
if (isDestructiveAction(action)) {
  const approved = await requestHumanApproval(action, context);
  if (!approved) return { status: 'cancelled', reason: 'user denied' };
}

Takeaway
Agent reliability depends on tool quality, guardrail coverage, and memory design. Build agents with strict step limits, validate every tool input, and log every reasoning step for post-mortem debugging.