AI Lies When You Do Not Give It an Out

Field Notes

I caught a small AI failure recently that would have been a much bigger problem if it had reached a customer.

We were looking at a guest-facing workflow.

The kind of thing that should eventually help someone answer normal travel questions. Local recommendations. Nearby activities. What is happening in the area. The practical stuff guests actually ask when they are trying to enjoy a trip.

And the system did something AI systems love to do.

It tried to be helpful.

Too helpful.

Instead of staying at the category level, it drifted toward specifics that were not verified. Event names. Places. Details that sound useful, but become a trust problem the second they are wrong.

That is the part people miss.

If I am brainstorming with AI and it gives me a weird idea, fine. I can catch that.

If a guest asks about a local event and the system invents one, that is different. Now the problem is the business saying something it cannot stand behind.

Luckily, this was caught before real guest use.

The fix was changing the path the system is allowed to take.

If there is no real local source, it should not invent one. It should answer at the category level or say it needs current local verification.

That sounds boring.

Boring is exactly what you want when facts matter.

I taught this same idea in a room with a leadership team recently, and the analogy that clicked was a multiple-choice test.

When you get to a question you do not know, what do you do?

You guess.

Why?

Because if you leave it blank, you know you will get it wrong. At least if you guess, there is a chance.

AI does a version of that.

It is designed to be helpful. The loop keeps moving until the system can return something that looks like an answer. If it has enough context, great. If it has the right source, great. If it has the right tool, great.

But if it does not have those things, and you still force it to answer, it will fill the blank.

That is what most people call hallucination.

I think hallucination is the least useful word for it.

It makes the problem sound mysterious. Most of the time, the issue is more practical.

The system had no safe way to stop.

No source path.

No confidence standard.

No rule that says, "If you cannot verify this, say you do not know."

No human checkpoint before the answer leaves the building.

So it guessed.

That does not make AI useless. It makes the architecture incomplete.

And once you see it that way, every wrong answer becomes more useful.

The first question most people ask is, "Why did AI get this wrong?"

The better question is, "What path was missing that would have helped it get this right, or stop before pretending?"

That is where the work starts.

Playbook

If you are putting AI into real workflows, especially anything customer-facing, you need to design for uncertainty.

Most people design for the answer.

That is the mistake.

You also need to design for the moment where the system does not know.

Start with one workflow where bad information would actually matter.

Guest messaging. Sales follow-up. Pricing recommendations. Legal or compliance context. Customer support. Financial reporting. Anything that touches a client, customer, owner, investor, or team member.

Then write down what the AI is allowed to know.

Where should it look first?

Your database?

Your notes?

Your CRM?

A current website?

A source document?

A human?

That is the first gate: source hierarchy.

If the answer should come from a source, name the source. Do not leave the agent to vibe its way into confidence. Vibes are not a retrieval strategy.
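A source hierarchy can be as simple as an ordered list that the agent walks from top to bottom. The sketch below is only an illustration: the `database` and `notes` dictionaries are hypothetical stand-ins for real systems, and `lookup_in_order` is a made-up helper name, not part of any product.

```python
from typing import Callable, Optional

# Each source is a (name, lookup) pair; lookup returns None when it has nothing.
Source = tuple[str, Callable[[str], Optional[str]]]


def lookup_in_order(question: str, sources: list[Source]) -> Optional[tuple[str, str]]:
    """Return (source_name, answer) from the first source that answers, else None."""
    for name, lookup in sources:
        answer = lookup(question)
        if answer is not None:
            return name, answer
    return None  # nothing approved had an answer, so the caller must not guess


# Hypothetical stand-ins for a real database and a notes file.
database = {"checkout time": "11:00 AM"}
notes = {"parking": "Street parking only"}

# Order matters: the first entry is where the agent looks first.
SOURCES: list[Source] = [
    ("database", database.get),
    ("notes", notes.get),
]

print(lookup_in_order("parking", SOURCES))       # ('notes', 'Street parking only')
print(lookup_in_order("local events", SOURCES))  # None
```

The useful part is the `None` at the end. It is the signal that no named source answered, which is different from an answer.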

Second, define what counts as verified.

For some work, one source is enough. For other work, the agent should cross-check. If it is making a recommendation, it may need the original data, the current policy, and the relevant history.
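One possible way to write a verification standard down, assuming "verified" means enough distinct sources that all agree. That definition, the `Finding` type, and the `min_sources` parameter are assumptions for illustration; your own standard may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    source: str  # where the value came from
    value: str   # what that source said


def is_verified(findings: list[Finding], min_sources: int = 1) -> bool:
    """True when at least min_sources distinct sources reported the same value."""
    if not findings:
        return False
    values = {f.value for f in findings}
    sources = {f.source for f in findings}
    return len(values) == 1 and len(sources) >= min_sources


one = [Finding("crm", "$120")]
two_agree = [Finding("crm", "$120"), Finding("pricing sheet", "$120")]
two_disagree = [Finding("crm", "$120"), Finding("pricing sheet", "$135")]

print(is_verified(one))                        # True
print(is_verified(one, min_sources=2))         # False
print(is_verified(two_agree, min_sources=2))   # True
print(is_verified(two_disagree))               # False
```

Low-stakes work can run with `min_sources=1`. Work that needs a cross-check raises the number.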

Third, separate facts from assumptions.

Do not let the AI blend "I found this" with "I think this."

Facts should have receipts.

Assumptions should be labeled.

If something is missing, the system should say what is missing.
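A rough sketch of what that separation could look like as a data structure. The `Answer` class and its field names are hypothetical; the point is only that facts, assumptions, and gaps travel in separate slots and cannot blend.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    facts: list[tuple[str, str]] = field(default_factory=list)  # (claim, source)
    assumptions: list[str] = field(default_factory=list)        # labeled guesses
    missing: list[str] = field(default_factory=list)            # what was not found

    def render(self) -> str:
        """Render each kind of statement with an explicit label."""
        lines = [f"FACT: {claim} [source: {source}]" for claim, source in self.facts]
        lines += [f"ASSUMPTION: {text}" for text in self.assumptions]
        lines += [f"MISSING: {text}" for text in self.missing]
        return "\n".join(lines)


answer = Answer(
    facts=[("Pool closes at 9 PM", "house manual")],
    assumptions=["Hours are the same on holidays"],
    missing=["Holiday schedule"],
)
print(answer.render())
# FACT: Pool closes at 9 PM [source: house manual]
# ASSUMPTION: Hours are the same on holidays
# MISSING: Holiday schedule
```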

Fourth, give it the out.

Literally write the rule:

If you cannot verify the answer from the approved sources, say "I do not know" and explain what would be needed to answer safely.

That one line changes the behavior of the system.

Because now "I do not know" is not failure.

It is a valid output.

And for an operator, that is gold.

When an agent tells you it does not know, it is showing you the missing pathway in your business.

Maybe the information is trapped in someone's head.

Maybe the source exists but the agent cannot access it.

Maybe the rule was never written down.

Maybe the workflow relies on judgment that has never been turned into a checklist, rubric, or approval gate.

That is an operating-system problem.

The agent just exposed it.
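Here is a minimal sketch of treating "I do not know" as a valid output and keeping every one of them as a list of gaps to fix. The `Result` type, the `respond` function, and the `approved` dictionary are assumptions made for the example, not a real agent API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Result:
    answer: Optional[str] = None                     # set only when verified
    needed: list[str] = field(default_factory=list)  # what is missing otherwise

    @property
    def known(self) -> bool:
        return self.answer is not None


def respond(question: str, approved: dict[str, str]) -> Result:
    """Answer only from approved sources; otherwise say what is needed."""
    if question in approved:
        return Result(answer=approved[question])
    return Result(needed=[f"an approved source covering: {question}"])


approved = {"checkout time": "11:00 AM"}  # hypothetical approved facts
gap_log: list[str] = []

for q in ["checkout time", "local events this weekend"]:
    result = respond(q, approved)
    if not result.known:
        gap_log.extend(result.needed)  # each "I do not know" becomes a to-do

print(gap_log)  # ['an approved source covering: local events this weekend']
```

The gap log is the operator's copy of the missing pathways. Review it and you have a list of sources to connect or rules to write down.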

The deeper version is where you place the rule.

To be fair, closed tools can still help here. In Claude, put this in CLAUDE.md or project instructions. In ChatGPT, put it in custom instructions. When the answer is sensitive, put the rule in the prompt too.

If you do not know, say so. If you cannot verify it, stop. Tell me what source is missing.

That helps.

But understand the layer you are touching.

You are editing the part of the product the company exposes to you.

In an open system, you can push the rule deeper.

You can put it in core files. Soul files. Memory rules. Tool wrappers. Source hierarchy. Review gates. The harness itself.

Now "I do not know" is part of how the system behaves.
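As one illustration of pushing the rule into a tool wrapper, the sketch below wraps a tool so that nothing passes through without a source attached. The decorator name `require_source`, the dictionary shape with `text` and `source` keys, and both tools are hypothetical, invented for this example.

```python
from functools import wraps
from typing import Callable, Optional

IDK = "I do not know. Missing source: {what}"


def require_source(tool: Callable[[str], Optional[dict]]) -> Callable[[str], str]:
    """Wrap a tool so an answer only passes through when it carries a source."""
    @wraps(tool)
    def guarded(query: str) -> str:
        result = tool(query)
        if not result or not result.get("source"):
            return IDK.format(what=query)
        return f"{result['text']} (source: {result['source']})"
    return guarded


@require_source
def events_tool(query: str) -> Optional[dict]:
    # Hypothetical tool with no live events feed connected yet.
    return None


@require_source
def policy_tool(query: str) -> Optional[dict]:
    # Hypothetical tool backed by a written house manual.
    return {"text": "Checkout is 11:00 AM", "source": "house manual"}


print(events_tool("events this weekend"))
# I do not know. Missing source: events this weekend
print(policy_tool("checkout"))
# Checkout is 11:00 AM (source: house manual)
```

The model never gets the chance to fill the blank, because the wrapper decides what counts as an answer before the model sees it.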

That is one reason I keep coming back to AI sovereignty.

Claude is useful.

ChatGPT is useful.

The issue is control.

If you only rent the surface, you wait for someone else to decide which controls you get.

If you own the architecture, you can build the control now.

The simple version:

Define the goal.
Define the sources.
Define the tools.
Define the stop signs.
Define what needs a human.
Define what proof must come back with the answer.

That is how you stop treating AI like a magic box and start treating it like a system.

And systems can be improved.
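The six-part checklist above can be written down as a small spec that refuses to hide its gaps. This is a sketch under assumptions: `WorkflowSpec`, its field names, and `undefined_parts` are invented here to mirror the list, and the example values are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class WorkflowSpec:
    goal: str
    sources: list[str]
    tools: list[str]
    stop_signs: list[str]
    needs_human: list[str]
    required_proof: list[str]


def undefined_parts(spec: WorkflowSpec) -> list[str]:
    """Names of any of the six parts that were left empty."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


spec = WorkflowSpec(
    goal="Answer guest questions about the property",
    sources=["house manual", "booking database"],
    tools=["database lookup"],
    stop_signs=[],  # not defined yet
    needs_human=["refunds"],
    required_proof=["source name for every fact"],
)

print(undefined_parts(spec))  # ['stop_signs']
```

An empty list from `undefined_parts` does not mean the workflow is safe. It only means every part has at least been written down.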

Orientation

This connects directly to the last few posts.

If workflows are the new primitive, then the workflow has to include uncertainty.

If you are the bottleneck, part of your job is turning hidden judgment into visible rules.

If your business is in the refactor period, this is one of the first places the rebuild has to happen.

And if "Don't Marry Claude" is about sovereignty, this is one reason sovereignty matters.

AI speeds up good work.

It also speeds up bad assumptions.

It speeds up missing context.

It speeds up overconfidence.

That is why I do not trust AI systems that cannot admit uncertainty.

I want the agent to stop before it guesses.

I want it to show me what it checked.

I want it to tell me what it could not find.

I want it to ask for a source instead of inventing one.

That is how I use Atlas.

That is how I think about guest-facing agents.

That is how I think about client work.

Early iPhones could have had custom home-screen backgrounds.

The hardware was capable.

Apple just did not expose the feature yet.

That is how a lot of closed AI products feel to me.

The intelligence is more powerful than the product window they give you.

The company decides when a capability becomes a feature.

Open systems let you reach more of that capability sooner.

Open source is not magic.

Architecture is the advantage.

The goal is not to remove humans from the loop as fast as possible.

The goal is to know exactly where the loop should stop.

Money should stop.

Legal should stop.

Destructive actions should stop.

Private data should stop.

Low-confidence facts should stop.

External communication should stop until the system has earned the right to move.

That is what makes AI usable.
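A minimal sketch of those stop points as an approval gate. The `Risk` categories come from the list above; the `gate` function and the `trusted` parameter are assumptions, with `trusted` standing in for categories where the system has already earned the right to move.

```python
from enum import Enum


class Risk(Enum):
    MONEY = "money"
    LEGAL = "legal"
    DESTRUCTIVE = "destructive action"
    PRIVATE_DATA = "private data"
    LOW_CONFIDENCE = "low-confidence fact"
    EXTERNAL_COMMS = "external communication"


def gate(action: str, risks: set[Risk], trusted: frozenset[Risk] = frozenset()) -> str:
    """Stop for a human unless every risk on the action is already trusted."""
    blocking = sorted(r.value for r in risks - trusted)
    if blocking:
        return f"STOP for human review: {action} ({', '.join(blocking)})"
    return f"PROCEED: {action}"


print(gate("send refund", {Risk.MONEY}))
# STOP for human review: send refund (money)
print(gate("draft internal summary", set()))
# PROCEED: draft internal summary
print(gate("email guest", {Risk.EXTERNAL_COMMS}, frozenset({Risk.EXTERNAL_COMMS})))
# PROCEED: email guest
```

Nothing is trusted by default. A category only moves into `trusted` when a human decides it should.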

The more I build with agents, the more I think trust is the real product.

Can the system do the work?

Can it show the receipt?

Can it stop when it should?

Can it learn from the correction?

That is the difference between a toy and an operating system.

And this is why "AI lies" is the wrong ending to the conversation.

The better operators will ask what source was missing.

They will ask what rule was vague.

They will ask where the human checkpoint belonged.

They will ask what the agent should have been allowed to say instead.

That is the practical work.

Give the system a way to say "I do not know."

Then use every "I do not know" to improve the system.

Comment below and tell me the last thing AI confidently got wrong for you.

What source should it have checked before answering?

I read every one.

— Brian
