
Agent Identity: Why Bearer Tokens Fail AI API Authentication

If an AI agent can fetch your data, can you prove which user, model, or workflow authorized it, and can you revoke that authority instantly? This post compares wallet-based identity, short-lived JWT delegation, and MCP session tokens, and shows why bearer tokens alone can’t solve the attribution problem.

When Microsoft disclosed CVE-2024-21413 in February 2024, the interesting part wasn’t just that Outlook could be abused to leak NTLM hashes. It was that the mailbox, the service, and the user all looked “legitimate” to the systems in the middle. That is the same class of problem AI agents create when they fetch data through bearer tokens: the API sees a token, not the human, model, workflow, or approval path that minted it.

Bearer tokens answer “is it valid?” and stop there

Bearer tokens are fine at one job: they prove possession. If the token is unexpired and signed correctly, the API accepts it. That is exactly why OAuth 2.0 bearer tokens became the default glue for Slack, Google Workspace, GitHub, and every SaaS integration team that wanted to ship before quarter-end. But possession is not attribution. A bearer token does not tell you whether the request came from a ChatGPT connector, a LangChain agent, a cron job, or a compromised runner on an EC2 instance.

That distinction matters the moment an AI agent is allowed to act on behalf of a user. If the agent can read a customer record in Salesforce, pull a Jira ticket, or query a Snowflake warehouse, “valid token” is not a security answer. It is a receipt.

The problem gets worse because bearer tokens are deliberately transferable. Copy one from a browser profile, a log file, a memory dump, or a misconfigured proxy, and the API has no native way to tell whether the original user still intended the call. This is why token theft keeps showing up in real incidents: the attacker does not need to break cryptography; they just need to reuse the thing the system already trusts.
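To make the possession-versus-attribution gap concrete, here is a minimal sketch of what a typical bearer check boils down to. The token value, the `VALID_TOKENS` store, and the `authorize` helper are all hypothetical names invented for illustration, not any particular framework's API.

```python
import hmac

# Hypothetical token store: the API knows only which token strings are valid.
VALID_TOKENS = {"tok_3f9a": {"scope": "crm:read"}}


def authorize(headers: dict) -> bool:
    """Accept any request that presents a known token; nothing else is checked."""
    scheme, _, presented = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer":
        return False
    # Constant-time comparison against each known token string.
    return any(hmac.compare_digest(presented, known) for known in VALID_TOKENS)


# Same token, two very different callers.
legit_agent = {"Authorization": "Bearer tok_3f9a", "User-Agent": "approved-agent/1.0"}
stolen_copy = {"Authorization": "Bearer tok_3f9a", "User-Agent": "curl/8.5"}

print(authorize(legit_agent), authorize(stolen_copy))  # True True
```

Both calls pass because the check never looks at anything except the string in the header. Nothing in the request says who is holding the token or whether the original user still intends the call.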

Why AI agents break the old “service account and hope” pattern

A lot of teams are still wiring agents through a single service account and calling that “least privilege.” That is not least privilege; it is shared fate with a nicer dashboard. If three workflows use the same bearer token, you cannot revoke one workflow without breaking the other two, and you cannot answer who approved the call after the fact.

That attribution gap is not theoretical. In the Okta support-system breach disclosed in October 2023, attackers stole session tokens from HAR files that customers had uploaded to the support system and replayed them against customer environments, and the postmortem became a master class in how identity systems fail when the trust chain is too coarse. AI agents magnify that weakness because they don’t just call one API once. They chain calls, retry failures, and fan out across systems faster than most audit pipelines can stitch together.

The common advice is to “just use short-lived tokens.” Short-lived is better than long-lived, obviously, but it does not solve provenance. A 5-minute bearer token stolen at minute 4 is still useful, and a 5-minute token issued to “agent-17” still tells you nothing about which user prompt, policy decision, or model invocation authorized the access. Expiry reduces blast radius; it does not create accountability.

Wallet-based identity gives you a stronger subject than a string in a header

Wallet-based identity, whether you call it decentralized identity, verifiable credentials, or just “stop pretending a random bearer token is a person,” tries to bind requests to a cryptographic subject with a usable audit trail. The point is not blockchain theater. The point is that the caller can prove control of a key tied to an identity, and that identity can be scoped to a role, workflow, or tenant.

For AI agents, that matters because the agent can carry a credential that is not just “valid” but attributable. If the workflow is approved by a human in Okta, tied to a workload identity in AWS IAM Roles Anywhere, or anchored to a device key in a hardware-backed wallet, you can at least reconstruct who delegated what to whom. That is a lot more useful than staring at a JWT and guessing from the sub claim like it’s a horoscope.

The catch: wallet-based identity is operationally ugly. Key recovery, rotation, and user experience are all harder than handing out OAuth tokens. But that friction is the point. If an AI agent can spend money, move data, or trigger side effects, you should want the authorization ceremony to be annoying.

Short-lived JWT delegation is the least-bad option when you need auditability

Short-lived JWT delegation sits in the middle ground. Instead of giving the agent a standing bearer token, the user or workflow broker mints a narrowly scoped JWT with an explicit issuer, audience, expiry, and delegation chain. Done properly, that lets you encode “Alice approved this agent to read these Jira issues for 10 minutes” instead of “someone once logged in.”
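As a rough sketch of what such a delegated token could look like, the example below mints and verifies an HS256 JWT using only the Python standard library. Several things here are assumptions for illustration: the `https://broker.example` issuer, the Jira audience URL, the scope string, and the shared `SECRET`. The `act` (actor) claim follows the convention from OAuth 2.0 Token Exchange (RFC 8693). A real deployment would more likely use asymmetric keys and a vetted JWT library than hand-rolled HMAC.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid

SECRET = b"demo-only-signing-key"  # assumption: shared HMAC key, for brevity only


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_delegation(user, agent, audience, scope, ttl_s, now=None):
    """Mint a narrowly scoped token that records who delegated to whom."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": "https://broker.example",  # hypothetical workflow broker
        "sub": user,                      # the human who approved the access
        "act": {"sub": agent},            # the agent acting on their behalf
        "aud": audience,
        "scope": scope,
        "iat": now,
        "exp": now + ttl_s,
        "jti": str(uuid.uuid4()),         # doubles as a delegation ID for logs
    }
    signing_input = (
        b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    )
    sig = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)


def verify_delegation(token, audience, now=None):
    """Check signature, audience, and expiry; return the claims if all pass."""
    now = int(time.time()) if now is None else now
    try:
        head_b64, claims_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = hmac.new(
        SECRET, f"{head_b64}.{claims_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(claims_b64))
    if claims["aud"] != audience:
        raise ValueError("wrong audience")
    if now >= claims["exp"]:
        raise ValueError("expired")
    return claims


token = mint_delegation(
    "alice@example.com", "agent-17", "https://jira.example/api", "jira:issues:read", 600
)
claims = verify_delegation(token, "https://jira.example/api")
print(claims["sub"], "delegated to", claims["act"]["sub"], "for", claims["scope"])
```

The useful part is not the signature but the claims: the API can log both the approving user and the acting agent, and the 600-second lifetime matches the "for 10 minutes" example.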

This is the model worth stealing from systems like Google Cloud’s service account impersonation and AWS STS role chaining, where the effective principal can be traced through a sequence of assumptions. It is also the model most teams skip because they confuse “we can make the call work” with “we can defend the call later.”

The contrarian bit: JWTs are not magic just because they are signed. A beautifully signed JWT with a 24-hour lifetime and no delegation context is still a bearer token with better branding. If you cannot answer who minted it, why, and under which policy, you have built a more legible liability, not a safer system.

MCP session tokens help with agent continuity, not root-cause attribution

Model Context Protocol session tokens are useful because they keep an agent’s conversation state and tool access coherent across multiple calls. That solves a real engineering problem: agents need continuity, and every request should not require a full re-authentication ceremony. But an MCP session token is still only as good as the identity assertion behind it.

If the session token is just a long-lived capability attached to a model runtime, you have recreated the same bearer-token problem in a shinier wrapper. The right pattern is to bind the MCP session to a delegated identity with a short lifetime, then log the chain: user, policy decision, model instance, tool call, and downstream API request. Without that chain, your incident response team gets to play archeology with JSON blobs.
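One way to picture that chain is as a structured audit record emitted for every tool call. The sketch below is not the MCP SDK or any part of the protocol itself. `SessionBinding`, `audit_record`, and every field name are hypothetical, chosen only to mirror the five links listed above.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SessionBinding:
    """Hypothetical record tying a session to the delegation that authorized it."""
    session_id: str
    delegation_id: str    # ID of the short-lived delegated credential
    user: str             # who approved the delegation
    policy_decision: str  # which policy evaluation allowed it
    model_instance: str   # which model runtime holds the session
    expires_at: int       # epoch seconds; the session dies with the delegation


def audit_record(binding, tool_call, downstream_request, now=None):
    """Return one JSON log line covering the whole chain for a single tool call."""
    now = int(time.time()) if now is None else now
    if now >= binding.expires_at:
        raise PermissionError("session binding expired; a new delegation is required")
    record = asdict(binding)
    record.update(
        {"tool_call": tool_call, "downstream_request": downstream_request, "ts": now}
    )
    return json.dumps(record, sort_keys=True)


binding = SessionBinding(
    session_id="sess-01",
    delegation_id="dlg-7c2e",
    user="alice@example.com",
    policy_decision="allow:jira-read-v3",
    model_instance="model-runtime-a1",
    expires_at=int(time.time()) + 600,
)
print(audit_record(binding, "jira.search_issues", "GET /rest/api/3/search"))
```

Because every line carries the same `delegation_id`, an investigator can pivot from a suspicious downstream request back to the user and policy decision without guessing.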

This is where most “agent security” advice falls apart. People obsess over prompt injection and ignore the fact that the agent’s tool token is often the real crown jewel. Prompt injection is annoying; unauthorized data extraction through a valid token is how you end up explaining yourself to legal.

Revocation has to kill the delegation chain, not just the token

If your revocation story is “wait for expiry,” you do not have revocation. You have a timer. Real revocation means the authority that minted the delegation can invalidate the session, the downstream APIs can reject the chain immediately, and your logs preserve the revoked subject for forensic review.
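A minimal sketch of chain-aware revocation, assuming each delegation records the ID of the delegation it was derived from. `Delegation` and `DelegationRegistry` are hypothetical names, and the in-memory dictionaries stand in for whatever shared store the downstream APIs would actually consult.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    id: str
    subject: str
    parent_id: Optional[str] = None  # the delegation this one was derived from


class DelegationRegistry:
    def __init__(self):
        self._by_id = {}
        self._revoked = {}  # id -> reason, retained for forensic review

    def register(self, delegation: Delegation) -> None:
        self._by_id[delegation.id] = delegation

    def revoke(self, delegation_id: str, reason: str) -> None:
        self._revoked[delegation_id] = reason

    def revocation_reason(self, delegation_id: str) -> Optional[str]:
        return self._revoked.get(delegation_id)

    def is_active(self, delegation_id: str) -> bool:
        """Active only if no link from this delegation up to the root is revoked."""
        seen = set()
        current = delegation_id
        while current is not None:
            unknown = current not in self._by_id
            if unknown or current in self._revoked or current in seen:
                return False  # fail closed on unknown, revoked, or cyclic links
            seen.add(current)
            current = self._by_id[current].parent_id
        return True


registry = DelegationRegistry()
registry.register(Delegation("dlg-root", "alice@example.com"))
registry.register(Delegation("dlg-a", "agent-17", parent_id="dlg-root"))
registry.register(Delegation("dlg-b", "agent-22", parent_id="dlg-root"))

registry.revoke("dlg-a", "suspicious tenant access")
print(registry.is_active("dlg-a"), registry.is_active("dlg-b"))  # False True

registry.revoke("dlg-root", "user offboarded")
print(registry.is_active("dlg-b"))  # False
```

Revoking one child delegation leaves its sibling untouched, while revoking the root invalidates everything derived from it. The revoked entry and its reason stay in the registry rather than being deleted, so the subject is still there for forensic review.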

That is why bearer tokens alone fail the attribution problem. They are excellent at saying “presented correctly,” and useless at answering “who authorized this, under what policy, and can we kill just that authority now?” For AI agents, that second question is the one that matters when the model starts reading the wrong bucket or the wrong customer tenant.

The Bottom Line

Stop handing AI agents raw bearer tokens and calling it identity. Use short-lived delegated JWTs or session-bound credentials that encode the human, workflow, policy, and expiry, and make sure revocation kills the delegation chain, not just the current access token.

Instrument every tool call with issuer, subject, audience, and delegation ID, then test whether you can trace a single agent request from user approval to downstream API hit in under five minutes. If you cannot, you do not have attribution; you have a pile of logs.
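A quick way to run that test is to pull every event for one delegation ID out of your logs and check that the path is complete. The sketch below assumes JSON-lines logs with `iss`, `sub`, `aud`, `delegation_id`, `stage`, and `ts` fields. That schema is an illustration, not a standard.

```python
import json

REQUIRED_FIELDS = ("iss", "sub", "aud", "delegation_id")


def trace(log_lines, delegation_id):
    """Return the events for one delegation, oldest first, skipping unparseable lines."""
    events = []
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and event.get("delegation_id") == delegation_id:
            events.append(event)
    return sorted(events, key=lambda e: e.get("ts", 0))


def missing_fields(event):
    """List the attribution fields an event failed to record."""
    return [name for name in REQUIRED_FIELDS if name not in event]


# Hypothetical log lines, deliberately out of order and mixed with noise.
logs = [
    '{"ts": 30, "stage": "api_request", "iss": "broker", "sub": "alice", "aud": "jira", "delegation_id": "dlg-7c2e"}',
    "not json at all",
    '{"ts": 10, "stage": "user_approval", "iss": "broker", "sub": "alice", "aud": "jira", "delegation_id": "dlg-7c2e"}',
    '{"ts": 20, "stage": "tool_call", "iss": "broker", "sub": "alice", "delegation_id": "dlg-7c2e"}',
    '{"ts": 15, "stage": "tool_call", "iss": "broker", "sub": "bob", "aud": "jira", "delegation_id": "dlg-other"}',
]

for event in trace(logs, "dlg-7c2e"):
    print(event["ts"], event["stage"], "missing:", missing_fields(event))
```

If a stage is absent from the trace, or an event comes back with missing fields, that is the gap to close before an incident forces the question.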
