What an AI agent actually is in 2026
The word “agent” got beaten to death in 2025. Every product launch, every framework, every demo. By the end of the year it had drifted so far from any specific meaning that vendors were calling a chatbot with a function-call tool an “autonomous AI agent.”
It’s 2026. The dust has settled enough to say something useful.
A working definition
An AI agent is a program that:
- Has a goal, not just a prompt.
- Can take actions in the world — not just generate text.
- Decides what to do next, in a loop, until the goal is met or it gives up.
- Persists state across turns, often across sessions.
If a system fails any of those, call it what it is. A chatbot. A copilot. A function-calling LLM. They’re useful. They’re not agents.
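The four criteria above can be sketched as a single loop. This is a minimal illustration, not any particular framework's API: `call_model` and the tool functions are hypothetical stand-ins for whatever model and actions you wire in.

```python
def run_agent(goal, tools, call_model, max_steps=20):
    """Decide -> act -> observe, in a loop, until done or budget exhausted.

    - goal, not just a prompt: carried in state the whole run
    - actions: tool calls, not just generated text
    - loop: the model decides the next step each turn
    - persistent state: history survives across turns
    """
    state = {"goal": goal, "history": []}
    for _ in range(max_steps):
        decision = call_model(state)              # model picks next action
        if decision["action"] == "done":
            return state, decision.get("result")  # goal met
        tool = tools[decision["action"]]          # act in the world
        observation = tool(**decision.get("args", {}))
        state["history"].append((decision["action"], observation))
    return state, None                            # gave up: step budget hit
```

Everything that fails one of the criteria degenerates to a special case of this loop: a chatbot is `max_steps=1` with no tools; a function-calling LLM is one tool call with no loop.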
The thing that actually changed
The shift in 2025-26 wasn’t the models — though Claude 4.x and GPT-5 are obviously stronger. The shift was that the loop got cheap and reliable enough to leave running.
A year ago, leaving an agent running overnight meant waking up to a $40 API bill, a corrupted state file, and three half-finished tasks. Today my OpenClaw cron jobs run forty times a day on a local Qwen3.6 model. Cost: zero. Failures: rare enough that I notice them.
That’s the unlock. Not intelligence. Reliability.
Where agents still break
Three places, in my experience:
- Long-horizon planning. They drift. They forget the original goal three steps in. The fix is structural — explicit task decomposition, persistent memory, hard checkpoints — not bigger models.
- Tool selection. Give an agent twenty tools and it picks badly. Give it five and it picks well. Sub-agents with focused tool sets beat generalists every time.
- Knowing when to stop. This is the hardest one. Models will happily “keep going” forever if you don’t tell them what done looks like. Verifiable success criteria are non-negotiable.
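The third point is worth making concrete. The pattern that works is to make "done" a predicate the harness verifies independently, never a claim the model asserts about itself. A minimal sketch, with illustrative names (in practice `is_done` might check that a file exists or a test suite passes):

```python
def run_until_done(step, is_done, max_steps=10):
    """Run one agent turn at a time until an independent check verifies
    success, or the step budget runs out. The model never decides 'done';
    the harness does."""
    state = None
    for n in range(max_steps):
        state = step(state)        # one agent turn
        if is_done(state):         # verifiable success criterion
            return state, n + 1
    raise TimeoutError("budget exhausted without verified success")
```

The hard budget matters as much as the predicate: it converts "keeps going forever" into a loud, checkable failure instead of a silent one.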
What I’m watching
The interesting frontier in 2026 isn’t single-agent capability. It’s agent fleets — systems where dozens of agents collaborate, hand off work, criticise each other’s output, and self-organise around a goal. OpenClaw is my own small experiment in that direction. Anthropic’s Claude Agent SDK and OpenAI’s Swarm work hint at where this goes.
If you’re still asking “is this an agent?” you’re asking the wrong question. Ask: what loop is it running, and what stops it? That tells you everything.