AI agent: definition
An AI agent is a software system powered by a large language model that can perceive its environment, decide what to do next, and take actions to accomplish a goal, all without a human telling it each step. The model provides the reasoning. The agent layer wraps the model with memory, tools, and the ability to act on its own decisions.
In practice, when someone in 2026 says "AI agent", they almost always mean a system built on a frontier LLM (Claude, ChatGPT, Gemini) that has three things a chat interface does not: persistent memory between runs, the ability to call external tools, and a goal it is trying to reach without step-by-step instructions.
How an agent differs from a chatbot
This is the first confusion worth clearing up because most people's mental model of "AI" is still ChatGPT in a browser tab.
A chatbot is reactive. You send it a message, it sends one back, and the conversation ends when you close the window. It does not remember you next time. It does not do anything unless you ask. It does not have a goal beyond responding to your current message.
An AI agent is proactive. It runs on its own (often on a schedule, or triggered by an event). It remembers context across runs. It can take actions in the world: send emails, create files, query databases, post to social media, hand off tasks to other systems. It is trying to accomplish something specific, and it makes its own decisions about how to get there.
The shortest version: chatbots talk. Agents do.
How an agent differs from a script or automation
This is the second confusion. People who already use Zapier, Make.com, or custom Python scripts sometimes ask "is that not just an automation?". The answer is no, and the difference is the judgement layer.
A script or automation follows a fixed sequence of steps you defined when you wrote it. If the input matches what you expected, it works. If the input is shaped slightly differently, or the situation is one you did not anticipate, the script breaks or produces wrong output.
An agent uses a language model to reason about each step. Faced with an unexpected input, it can adapt: "this looks unusual, let me try a different approach". Faced with an error, it can recover: "the API returned an error, let me retry with different parameters". Faced with a vague instruction, it can interpret: "the user said 'summarise this', so I should produce a 3-paragraph summary because that fits the document length".
That judgement is what makes an agent worth building. If your workflow is fully predictable, a script is cheaper, faster, and more reliable. If your workflow needs interpretation, adaptation, or recovery, an agent is the right tool.
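The judgement layer can be sketched as a recovery loop: instead of failing on the first unexpected response, the system feeds the error back into a decision step and tries a different approach. This is a minimal illustration, not a real framework; `decide_next_step` is a stand-in for an actual LLM call, and the failing API is faked.

```python
def decide_next_step(error):
    """Stand-in for an LLM call. A real agent would send the error context
    to a model and parse its suggested next action."""
    if error is None:
        return {"action": "call_api", "params": {"page_size": 100}}
    if "too large" in error:
        # The "judgement": interpret the failure and adjust the approach.
        return {"action": "call_api", "params": {"page_size": 10}}
    return {"action": "give_up", "params": {}}

def call_api(page_size):
    # Fake API that rejects large requests, simulating an unanticipated input.
    if page_size > 50:
        raise ValueError("page_size too large")
    return f"fetched {page_size} records"

def run_with_judgement(max_attempts=3):
    error = None
    for _ in range(max_attempts):
        step = decide_next_step(error)
        if step["action"] == "give_up":
            break
        try:
            return call_api(**step["params"])
        except ValueError as exc:
            error = str(exc)  # feed the failure back into the next decision
    return "failed"
```

A fixed script would hardcode `page_size=100` and break on the first error; here the retry parameters come from the (stubbed) reasoning step instead.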
The four things that make a system an agent
In my work building production systems, I use four criteria to decide if something is genuinely an agent or just a wrapper around an LLM call:
- An LLM at the reasoning core. The judgement comes from a frontier model (Claude, ChatGPT, or similar), not from code you wrote yourself. If the decisions are all hardcoded, it is automation, not an agent.
- Tools the LLM can call. The system has a defined set of actions the model can take: search, file access, API calls, database queries. The model decides which tool to use based on the situation.
- Memory across runs. The system remembers what it has done before. This can be a database, a file system, a knowledge base, or a vector store. Without memory, the system is just a one-shot prompt.
- A goal, not a script. The system is trying to accomplish something. You tell it the goal, not the sequence of steps. It works out the steps itself.
If a system has all four, it is an agent. If it has only the LLM core (reasoning without tools or memory), it is closer to a chatbot. If it has tools and memory but no LLM judgement, it is a traditional automation.
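The four criteria map directly onto the shape of the code. Below is a hypothetical skeleton, not any real agent framework: `llm_decide` stands in for the frontier-model call, `TOOLS` is an illustrative registry, and the in-memory dict would be a database or vault in production.

```python
TOOLS = {  # criterion 2: a defined set of actions the model can take
    "search": lambda query: f"results for {query}",
}

def llm_decide(goal, memory):
    """Criterion 1 (stand-in for the LLM core): given the goal and what has
    been done so far, pick the next tool call, or signal completion."""
    if not memory.get("searched"):
        return {"tool": "search", "args": {"query": goal}}
    return {"tool": None}  # goal reached

def run_agent(goal):
    # Criterion 4: the caller supplies a goal, not a sequence of steps.
    # Criterion 3: memory persists across loop iterations (and, in a real
    # system, across runs via a database, file system, or vector store).
    memory = {"searched": False}
    while True:
        decision = llm_decide(goal, memory)
        if decision["tool"] is None:
            return memory
        result = TOOLS[decision["tool"]](**decision["args"])
        memory["searched"] = True
        memory["last_result"] = result
```

Remove the `llm_decide` stub and hardcode the branch, and this collapses back into a traditional automation; remove `TOOLS` and `memory`, and it is a one-shot prompt.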
Real examples
Concrete examples make the definition stick. Here are three real agent systems built in 2025-2026, including one I built myself:
The Camille AI Operating System. Four specialised Claude Code agents that run a social media manager's entire workflow. One agent fires every Monday at 8am to produce a sector briefing per client. Another generates draft posts on demand. A third produces monthly client reports. A fourth runs continuous research in the background. They share an Obsidian vault as memory and hand off tasks to each other. Goal: handle 70-80% of the manager's repetitive client work without losing the human voice. Read the full case study.
Customer support triage agents. An agent reads incoming support tickets, classifies them by urgency and topic, drafts a first response, and either sends it directly (for low-stakes issues) or routes it to a human agent (for anything sensitive). Memory: ticket history per customer. Tools: ticket system API, knowledge base search.
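The triage pattern reduces to classify, draft, then route by stakes. A minimal sketch, assuming a `classify_ticket` stand-in for the model's judgement call; a real system would also pull per-customer ticket history (memory) and knowledge-base search (tools).

```python
def classify_ticket(text):
    """Stand-in for the LLM's urgency/topic judgement. The keyword check is
    a placeholder; the real decision would come from the model."""
    urgent_words = ("outage", "refund", "legal")
    urgent = any(word in text.lower() for word in urgent_words)
    return {
        "urgency": "high" if urgent else "low",
        "draft": f"Thanks for reaching out about: {text[:40]}",
    }

def triage(text):
    verdict = classify_ticket(text)
    if verdict["urgency"] == "low":
        return f"auto-sent: {verdict['draft']}"      # low-stakes: send directly
    return f"escalated to human: {verdict['draft']}"  # sensitive: route to a person
```

The routing threshold (what counts as "low-stakes") is the design choice that matters most here: it decides how much of the queue the agent handles alone versus hands to a human.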
Code review agents. An agent reviews pull requests in a Git repository, runs the test suite, checks the code against project conventions, and posts comments asking for changes or approving the merge. Memory: project history and previous review patterns. Tools: Git, file system, test runner.
Each of these has the four characteristics: LLM reasoning, tools, memory, and a goal it pursues without step-by-step instructions.
When to build one
Agents are powerful but they are not always the right answer. They make sense when:
- The workflow has multiple steps that depend on judgement at each step, not just at the start.
- The system needs to remember context across runs (clients, history, decisions).
- The work is high-volume enough that human babysitting is expensive.
- The output needs to adapt to inputs you cannot fully predict in advance.
Agents are the wrong answer when:
- A simple deterministic script would do the job.
- The workflow runs once a week or less and a human can handle it faster.
- The team is not ready to commit to a frontier LLM as the reasoning core (agents are tightly coupled to one model).
- The use case is exploratory and the requirements are still changing weekly.
If your workflow fits the first list, the AI Agent Architecture service covers building production multi-agent systems on Claude Code. If you are not sure whether you need an agent or a simpler automation, the free 30-minute consultation is the right place to find out.