Who is my AI agent really working for?
The subtle misalignment between what you want and what your AI is actually trying to do

Author's note: Wangari is evolving, and so is this newsletter. We've refreshed the name, the look, and the focus — you'll find more on the company and what we're building at wangari.global. If you've been here since the early days, thank you. If you're new, welcome.
A few weeks ago, I needed to set up a frontend deployment for a new project. I opened an AI agent, gave it the parameters, and within minutes it had recommended DigitalOcean, configured the deployment, and handed me the setup. I clicked confirm, entered my payment details, and moved on with my day.
It was a perfectly smooth experience. It was also, in retrospect, slightly unsettling.
If I had done that task myself a year ago, I would have spent an hour reading documentation. I would have compared DigitalOcean against AWS, Vercel, and Heroku. I would have checked pricing tiers and read a few Reddit threads about latency. This time, I did none of that. I outsourced the judgment entirely. The agent made a reasonable call, but the truth is, I didn’t actually know why it chose that specific provider over the others. I just accepted the efficiency gain and paid the bill.
We talk a lot about the fear of “runaway AI”—the sci-fi scenario where autonomous systems hijack our businesses or our infrastructure. But the reality of agentic AI is much subtler, and in some ways, much more insidious. The danger isn’t that the agent goes rogue. The danger is that the agent is always optimizing for something, and it’s not always what you think.
The Illusion of Shared Intent
When you delegate a task to a human—say, an analyst on your team—you share a broad context. If you ask them to research a software vendor, they know that the implicit goal is to find a reliable, cost-effective solution that fits the company’s existing tech stack. They know when to stop researching and make a decision. They know what “done” looks like.
Agents do not share this context. They operate on objective functions. You give them a prompt, and they translate that prompt into a mathematical target to maximize.
In the case of my DigitalOcean deployment, the agent was likely optimizing for “fastest path to a working configuration based on the user’s prompt.” It wasn’t optimizing for long-term cost efficiency, because I didn’t explicitly tell it to. It wasn’t weighing vendor lock-in risk. It just found the shortest distance between my request and a successful execution.
This is the first form of misalignment: outsourced judgment. When the cost of making a decision drops to zero, we stop making decisions. We let the model choose. But the model is choosing based on its training data and its hidden system prompts, not based on your strategic priorities. You get the efficiency, but you lose the steering wheel.
The Agent That Wouldn’t Stop
There is a second, more frustrating form of misalignment, and if you’ve used agentic workflows recently, you’ve probably seen it.
I was watching an agent try to pull a specific dataset from a public API last week. The API endpoint had changed, and the agent’s initial request failed. A human would have looked at the error, realized the documentation was out of date, and stopped to ask for help or search for the new endpoint.
The agent did not stop. It retried the exact same call. When that failed, it slightly rephrased the headers and tried again. It wrote a Python script to try a different authentication method. It looped, and looped, and looped, burning through API tokens with relentless, cheerful persistence.
Why did it do this? Because the underlying model was trained on a specific objective function: continue the conversation.
Most commercial LLMs are fine-tuned using Reinforcement Learning from Human Feedback (RLHF) to be helpful, harmless, and conversational. They are penalized for giving up or saying “I don’t know.” When you wrap that conversational model in an agentic loop and give it a credit card or an API key, that “helpful” persistence becomes a liability. The agent doesn’t know when to quit because quitting was penalized during training. It optimizes for continuation rather than completion.
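To make that concrete, here is roughly the guard my looping agent was missing: a retry with a hard ceiling and an explicit failure path. This is a minimal sketch, not any framework’s actual code; `call_api` and the constants are placeholders I’ve made up. The point is that the loop has an exit that counts as done, even when done means giving up.

```python
import time

MAX_ATTEMPTS = 5       # hard ceiling; the naive loop had none
BACKOFF_SECONDS = 2    # linear backoff between attempts

def fetch_dataset(call_api, endpoint):
    """Bounded retry: fail loudly instead of looping forever.

    `call_api` is a stand-in for whatever client the agent uses;
    assume it raises an exception on a failed request.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return call_api(endpoint)
        except Exception as err:
            if attempt == MAX_ATTEMPTS:
                # Stopping and reporting is the "completion" the
                # conversational objective never rewards.
                raise RuntimeError(
                    f"Gave up after {MAX_ATTEMPTS} attempts: {err}"
                ) from err
            time.sleep(BACKOFF_SECONDS * attempt)
```

Five lines of budget logic is all it takes; the agent that burned through my API tokens had none of them.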
The Hidden Cost of Misaligned Agency
These two scenarios—the agent that decides too quickly and the agent that won’t stop trying—are symptoms of the same structural problem. The agent’s objective function is a proxy for yours, not the real thing.
In financial services, where we spend a lot of time at Wangari, this gap between proxy and reality is a known systemic risk. If you incentivize a trader based purely on quarterly returns (the proxy), they will take on hidden tail risks that blow up the fund in year three (the reality). We are currently doing the exact same thing with our software.
When we deploy agents to negotiate contracts, optimize supply chains, or manage cloud infrastructure, we are handing over agency to systems that do not share our risk tolerance. They do not feel the pain of a blown budget. They do not care if a vendor relationship sours. They only care about the mathematical proxy we gave them.
And because they operate at machine speed, the drift happens faster than we can monitor it. You let the agent choose the deployment provider today. Tomorrow, you let it negotiate the enterprise tier. Next month, it’s automatically renewing subscriptions across your entire tech stack based on a “cost optimization” prompt that actually just locks you into longer contracts.
How to Stay in the Loop
The solution is not to turn the agents off. The productivity gains are too massive to ignore, and frankly, I don’t want to go back to reading AWS documentation if I don’t have to. The solution is to change how we define the boundaries of their autonomy.
First, we have to stop treating agents like human employees. You cannot manage an agent through “vibes” or implicit context. You have to manage it through explicit constraints. If you want an agent to optimize a process, you must mathematically define the cost of failure, the budget limit, and the exact conditions under which it must stop and ask for human intervention.
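Concretely, that contract might look something like this. It is an illustrative sketch, not a real framework’s API; the field names and thresholds are mine. What matters is that every limit is a number the agent can be halted against, not an implicit expectation.

```python
from dataclasses import dataclass

@dataclass
class AgentGuardrails:
    """Explicit, machine-checkable limits for one delegated task.

    All fields and defaults here are illustrative, not from any real tool.
    """
    budget_usd: float = 50.0          # hard spend ceiling for the task
    max_api_calls: int = 100          # caps runaway retry loops
    max_failures: int = 3             # consecutive errors before escalating
    escalate_above_usd: float = 20.0  # any single spend above this needs a human

def must_stop(g: AgentGuardrails, spent: float, calls: int,
              failures: int, next_cost: float) -> bool:
    """Return True when the agent must pause and ask for human sign-off."""
    return (
        spent + next_cost > g.budget_usd
        or calls >= g.max_api_calls
        or failures >= g.max_failures
        or next_cost > g.escalate_above_usd
    )
```

The agent checks `must_stop` before every action; the moment it returns True, the task goes back to a human. That check is the steering wheel.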
Second, we need to demand better observability from the platforms building these tools. I don’t just want to see the final output; I want to see the decision tree. If an agent recommends a vendor, it should be required to show the three alternatives it discarded and the specific weights it applied to make the choice. Explainability is not just a regulatory requirement; it is a prerequisite for trust.
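A minimal decision record along those lines might look like this. The field names and the weights are hypothetical, since vendors don’t expose their real scoring, but the shape is the requirement: the choice, the discarded alternatives, and the weights travel together in the log.

```python
from dataclasses import dataclass

@dataclass
class DecisionRecord:
    """One auditable entry in an agent's decision log (illustrative shape)."""
    chosen: str
    alternatives: list[str]        # what was considered and discarded
    weights: dict[str, float]      # how each criterion was scored
    rationale: str

# A hypothetical reconstruction of my deployment decision:
record = DecisionRecord(
    chosen="DigitalOcean",
    alternatives=["AWS", "Vercel", "Heroku"],
    weights={"setup_speed": 0.6, "monthly_cost": 0.2, "lock_in_risk": 0.2},
    rationale="Fastest path to a working configuration for the given prompt.",
)
```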
The Bottom Line
We are entering a phase of technology where the primary skill is no longer execution, but delegation. The people and companies that thrive will not be the ones who write the best code or do the fastest research. They will be the ones who know how to explicitly define their intent, and how to build the guardrails that keep their agents aligned with that intent.
The next time an agent does something perfectly for you, take a moment to ask yourself: what was it actually optimizing for? And are you sure it’s the same thing you wanted?
Meanwhile, at Wangari
Scaling Sustainable Digital Platforms
Together with the University of Bern, we are conducting academic research on how sustainable digital platforms grow and scale responsibly. If your company embeds environmental or social goals into its core business model, we’d love to speak with you.
The study involves 2–3 short interviews with key employees. Participation is anonymous and confidential, the time commitment is low, and you’ll receive early access to our findings.
If this sounds like your company, or if you know someone it might fit, please reach out directly:
Ari Joury, Cofounder & CEO, Wangari Global — ari.joury@wangari.global
Melanie Gertschen, PhD Candidate, University of Bern — melanie.gertschen@unibe.ch
Reads of the Week
The AI Liability Playbook: Monetizing the Silent AI Carve-Out: In this piece for Insurance Intel, the author explores how the insurance industry is grappling with the new liability class forming around enterprise AI deployment. It perfectly complements our discussion on the liability gap, showing how the market is struggling to price risks it doesn’t fully understand.
The Regulation Is Already Here. Your Program Isn’t Ready. Writing for his eponymous newsletter, Dr. Eric Cole argues that cyber compliance has shifted from a checkbox exercise to personal liability for executives. This directly connects to the governance challenges of deploying autonomous agents in regulated environments.
Who Is Accountable When Your Agent Goes Rogue? In this deep dive for his newsletter, Max Corbridge breaks down the accountability vacuum created when AI providers disclaim liability for security flaws in their models. Read this to understand the immediate risks of third-party agent integration.