The AI Tool Landscape

Field Notes

The first time you look at the AI tool market, it feels like a wall of logos.

ChatGPT. Claude. Claude Code. Codex. Claude Desktop. Claude CoWork. OpenCode. OpenClaw. Hermes. Cursor. Zapier. n8n. MCPs. Browser agents. Local models. Cloud models. Open-source models. Closed-source platforms.

The mistake is treating all of it as one category called "AI tools." That is how people end up asking the wrong question.

Which one is best?

That question sounds reasonable, but it breaks down quickly. Claude may be better at one kind of reasoning. Codex may be better at a particular software task.

AI is not either-or.

It is and.

If you catch yourself trying to pick one model, one company, or one tool for everything, you are probably thinking about the category too narrowly. The better operators are building a stack that can use the right brain (model), body (harness), and tool for the job.

In The Agentic Tool Test, the question was whether your software has doors agents can use. Can they read, act, stop, and leave proof?

In What Is an Agent, Actually?, I broke the agent into parts. The model is the brain. The harness is the body. The tools are the hands. Memory is the backpack. Receipts are the trail.

Now we can talk about the market with more precision.

The tool landscape is really a platform map.

Some platforms are built for conversation. Some build software. Some live on your desktop. Some let you change the model, tools, memory, and rules.

The difference matters.

Software is the foundation of almost everything you do on a computer. Every app, dashboard, form, workflow, report, and internal tool is software.

Building any of it used to require someone who could write code, which made software creation feel like a locked room.

That door is opening.

I wrote about this in The Real Measuring Stick. Coding is collapsing toward language. You do not need to understand every technical layer to start building. You need to describe the software solution you want clearly enough that an agent can help create it.

This is why coding agents matter even if you do not think of yourself as technical.

They are for anyone who sees a recurring problem and says, "I wish software existed for this."


Playbook

Here is the practical map I would use.

1. Coding agents

A coding agent helps you build or change software.

That may mean an app, a dashboard, a script, a website, a database tool, a report generator, or a small system that removes manual work.

Claude Code is Anthropic's coding agent. You describe what you want to build, fix, test, or ship, and Claude Code works through the software task.

Codex is OpenAI's version of that idea. If Claude Code is Anthropic's software-building agent, Codex is OpenAI's.

One is not permanently better than the other.

Models matter. Tools matter. The harness matters too.

The right question is not, "Is Claude Code better than Codex?"

The right question is, "What is Claude Code better at, what is Codex better at, and which one fits this job?"

This is the same frame from Specified Intelligence. AI capability is jagged. One model can be excellent at one task and weaker at another.

The winner changes by job.

Use coding agents when you want to create software solutions. A simple internal tool that saves you three hours a week counts.

2. Closed-source personal agents

Think Claude Desktop and Claude CoWork style tools.

These are useful because they move closer to your real work surface. They can connect to files, apps, messages, browsers, calendars, and other tools through approved paths like MCPs, connectors, and product features.

Powerful, but bounded.

With a closed-source personal agent, the provider controls the model, the feature list, the permissions, the interface, the roadmap, and the ways the agent can connect to the rest of your work.

You can get a lot out of what Anthropic gives you. But you are still waiting on Anthropic to decide which features exist, which integrations are allowed, and how much control you get.

This is why I wrote Don't Marry Claude. I like Claude. I use Claude. But loyalty to one provider becomes expensive when the whole market keeps moving.

3. Open personal-agent systems

OpenClaw and Hermes-style agents are different.

They are closer to an operating layer than a single app.

The practical ceiling is not the vendor's feature list. The ceiling is what you can wire together, what you allow the agent to access, and how well you verify the work.

You can change the model.

You can change the tools.

You can add memory.

You can connect local files, browser control, APIs, CLIs, scripts, review gates, and multiple agents.

This is why open systems feel so different once you understand them.

The iPhone analogy makes this simple.

When the first iPhone came out, you could not put your own photo behind the app icons as a background. But that was not because the iPhone could not technically show a background image.

It could.

Apple just had that ability walled behind the features it was willing to give you.

If you "jailbroke" the phone, which is just a fancy way of saying you removed the software restrictions Apple had placed on it, you could put a background on it. Same phone. Same screen. Different level of control.

AI is starting to rhyme with that.

When a new Claude Desktop or Claude CoWork feature gets announced, a lot of open-agent users have already been doing some version of it for months. Not because they are smarter. Because they are not waiting for one company to package the feature.

They can assemble the pieces themselves.

This is the reason I use open personal-agent systems. I want the freedom to route across models, connect to the tools I use, build my own review gates, and keep the system moving when one provider is behind.

It is more responsibility.

It is also more control.

4. Open-source coding harnesses

OpenCode and Pi give you coding-agent bodies without locking you to one brain. OpenCode is the fuller open-source coding agent. Pi is the minimal open-source coding agent harness you can shape around your own workflow.

The model race changes constantly.

If a new model gets better at software planning, you should be able to try it. If another model is cheaper and good enough for cleanup work, you should be able to use it. If Claude is better at one task and Kimi, Qwen, DeepSeek, or an OpenAI model is better at another, your system should not force a fake choice.
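
If you like seeing ideas as code, here is a minimal sketch of what "route by job" could look like. It is an illustration, not how OpenCode, Pi, or any other harness actually works. The model labels, the job types, and the pick_model function are all placeholders I made up; you would fill in the table from your own tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str   # hypothetical label, not a real model ID
    reason: str  # why this model currently wins this job


# Hypothetical routing table. Replace the entries with results from your own work.
ROUTES = {
    "planning": Route("strong-planner", "best at software planning in my tests"),
    "cleanup": Route("cheap-good-enough", "low cost and fine for cleanup work"),
}

# Fallback when no job-specific winner has been picked yet.
DEFAULT_ROUTE = Route("general-default", "no clear winner for this job yet")


def pick_model(job_type: str) -> Route:
    """Return the route for a job type, or the default if none is listed."""
    return ROUTES.get(job_type.strip().lower(), DEFAULT_ROUTE)


if __name__ == "__main__":
    for job in ("planning", "cleanup", "research"):
        route = pick_model(job)
        print(f"{job}: {route.model} ({route.reason})")
```

The point of the table is that swapping a winner is a one-line change. When a new model gets better at planning, you update one entry and the rest of the system keeps working.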

Use Artificial Analysis when you want to compare models by quality, speed, latency, price, and context window. Treat it like a scouting report, then test the model on your actual work.

Closed-source tools can be easier to start with.

Open systems can take you further once you know what you are trying to build.

5. The decision rule

Here is the simple version.

If you want help thinking, use chat.

If you want help building software, use a coding agent.

If you want a polished personal-agent experience with the provider managing the edges, use a closed-source desktop agent.

If you want control over the model, tools, memory, browser, files, scripts, and review gates, use an open personal-agent system.

If you want repeatable trigger-and-action work, use automation.

If you want agents to operate your existing software, go back to The Agentic Tool Test and ask whether the software has doors.

Do not ask one tool to be everything.

Build the stack so the right tool can do the right job.
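
The same rule can be written down as a small lookup, which is roughly how I hold it in my head. This is a sketch only. The need labels and the tool_category function are names I invented for the example, and the categories are the ones from this playbook.

```python
# The decision rule from this section as a lookup table.
# Keys are hypothetical labels for "what you want help with".
DECISION_RULE = {
    "thinking": "chat",
    "building software": "coding agent",
    "polished personal agent": "closed-source desktop agent",
    "full control": "open personal-agent system",
    "repeatable trigger-and-action work": "automation",
    "operating existing software": "agentic tool test: does the software have doors?",
}


def tool_category(need: str) -> str:
    """Map a need to a tool category, refusing needs the rule does not cover."""
    key = need.strip().lower()
    if key not in DECISION_RULE:
        # No catch-all on purpose: one tool should not be asked to be everything.
        raise ValueError(f"No rule for {need!r}. Start by naming the job.")
    return DECISION_RULE[key]


if __name__ == "__main__":
    print(tool_category("building software"))
    print(tool_category("full control"))
```

There is deliberately no default answer. If the job does not match a row, the fix is to describe the job more clearly, not to reach for a favorite tool.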


Orientation

This is where the AI tool market starts to make sense. People keep comparing tools that are doing different jobs.

Claude and ChatGPT are conversation surfaces. Claude Code and Codex are software-building agents. Claude Desktop and Claude CoWork are closed-source personal-agent surfaces. OpenClaw and Hermes-style systems are open personal-agent operating layers. OpenCode-style tools are flexible coding harnesses. Zapier, n8n, and scripts still matter when the workflow is repeatable.

Once you see the categories, the choice gets easier.

Start with the job.

Then choose the brain (model), the body (harness), the tools, the memory, and the review gate.

That is the useful map.

Next is The New Software Stack.

This will not just repeat the keep, wrap, replace, or watch audit from The Agentic Tool Test. That was the single-tool question.

The next question is bigger: how do you arrange the whole stack so agents can move work through it with context, permissions, memory, and proof?

For now, comment below with one AI tool you are confused by.

I will tell you where I think it fits on the map.

I read every one.

— Brian