Zero Configuration Infrastructure
The issue of agent generality isn’t intelligence. It’s plumbing.
I’ve been spending a lot of time lately researching and building infrastructure for AI agents at a startup, specifically around identity, credentials, and trust. And the more I dig into this space, the more I’m convinced there’s a gap that nobody’s really thinking about.
We keep talking about making agents smarter. Bigger context windows, better reasoning, more capable tool use. And sure, that matters. But I think there’s a much more immediate bottleneck that gets almost no attention: the environment these agents operate in (and I don’t just mean good computer use, like Manus).
Here’s what I mean. I want to be able to tell my agent (Clawdbot, or whatever I end up using six months from now) “hey, go find the cheapest flight to SF next weekend, book it, and get me a window seat.” (Flights are the most used example for general agent abilities, and I don’t know why.) And I want to hand it a few bucks and let it figure it out. Crucially, I don’t want to pre-configure which airline API it uses. I don’t want to set up accounts on five different travel services. I don’t want to paste API keys into env files. I just want it to go do the thing.
Right now? That’s impossible. Not because the model can’t reason about it. It absolutely can. Claude probably thinks things through better than you can. It’s impossible because every single service out there requires a human to sign up, verify an email, add a credit card, generate an API key, and wire it all together before the agent can make its first request. The agent’s capability ceiling isn’t set by its intelligence. It’s set by how much configuration I’ve done ahead of time.
I keep coming back to a pretty simple idea: what if an agent could show up to a service it has never used before, prove who made it and who it’s acting for, and pay on the spot? All without a human pre-configuring an API or an account or anything.
The identity/token replaces the data a signup flow would provide. The payment protocol (e.g., x402) replaces the billing relationship. Together they replace the API key.
Let me try to sketch out what I think this actually looks like.
Identity that is bound to the agent
In my mind an agent’s identity has two sides, and I think this distinction really matters.
There’s the developer identity, the “who built this thing” side. This should be cryptographically bound to the agent when the credential is issued. It’s permanent. It tells you: this agent was built by this organization, they’ve been verified to a certain degree, here’s where they’re incorporated, here are the safety evaluations the agent has passed. This is the trust root. It’s what makes the agent traceable and the developer accountable. A service could, for example, approve only agents made by Anthropic and reject all the ones from labs overseas.
Then there’s the user identity, the “who is this agent acting for right now” side. This is more like a session. Right now the agent is acting for me, with my authorization, within the boundaries I’ve set. In parallel it might be acting for someone else. Same software, different principal. Standard KYC linked to the agent works great here. A verified email at the very least.
The credential should encode the developer identity permanently and carry the user context as something that can rotate. A delegation, an attestation or a disclosure, whatever the right mechanism ends up being. The point is that a service receiving a request can answer both “should I trust this software?” and “who authorized this specific action?” from a single credential chain.
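Here’s a rough Python sketch of that two-sided shape. Every name in it (DeveloperIdentity, UserDelegation, AgentCredential, the fields, the two check functions) is made up for illustration, and it deliberately leaves out signatures; in a real system each layer would be signed by whoever issued it.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DeveloperIdentity:
    """Permanent side: who built the agent (hypothetical fields)."""
    organization: str
    verification_tier: int
    jurisdiction: str


@dataclass(frozen=True)
class UserDelegation:
    """Rotating side: who the agent is acting for right now."""
    principal: str              # e.g. a verified email
    allowed_actions: frozenset  # boundaries the user has set
    expires_at: datetime


@dataclass(frozen=True)
class AgentCredential:
    developer: DeveloperIdentity
    delegation: UserDelegation

    def rotate(self, delegation: UserDelegation) -> "AgentCredential":
        # Same software, different principal: only the user side changes.
        return replace(self, delegation=delegation)


def trust_software(cred: AgentCredential, min_tier: int) -> bool:
    """Answers: should I trust this software?"""
    return cred.developer.verification_tier >= min_tier


def authorized(cred: AgentCredential, action: str, now: datetime) -> bool:
    """Answers: did the current principal authorize this action, and is it still valid?"""
    d = cred.delegation
    return action in d.allowed_actions and now < d.expires_at


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
dev = DeveloperIdentity("Example Labs", verification_tier=2, jurisdiction="US")
alice = UserDelegation("alice@example.com", frozenset({"book_flight"}), now + timedelta(hours=1))
cred = AgentCredential(dev, alice)
```

The service only ever looks at one object, but the developer half never changes while the delegation half can be swapped per session with rotate().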
Payments at the protocol layer
The other half of this is payment. An agent can have a perfect credential proving who it is and what it’s capable of, but if it can’t pay for things, it still can’t do anything autonomously. This, in my opinion, is a major infrastructure problem.
I think payment needs to happen at the protocol layer, accompanying individual requests rather than requiring a pre-existing billing relationship. Something like x402, where HTTP status code 402 (Payment Required) becomes a real part of the conversation between agent and service. The service says “this costs X,” the agent pays X, the service responds. Done. This could replace API billing altogether.
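The quote-pay-retry loop can be sketched in a few lines of Python. This is an in-process toy, not the actual x402 wire format: Service, Wallet, fetch_with_payment, and the price_cents and amount_cents fields are all names I’m assuming for illustration, and no real money or network is involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    body: dict


@dataclass
class Service:
    """Toy service that charges per request (hypothetical)."""
    price_cents: int

    def handle(self, path: str, payment: Optional[dict] = None) -> Response:
        if payment is None or payment["amount_cents"] < self.price_cents:
            # No (or insufficient) payment: quote the price with a 402.
            return Response(402, {"price_cents": self.price_cents})
        return Response(200, {"result": f"served {path}"})


@dataclass
class Wallet:
    """The few bucks the user handed to the agent."""
    balance_cents: int

    def pay(self, amount_cents: int) -> dict:
        if amount_cents > self.balance_cents:
            raise ValueError("insufficient funds")
        self.balance_cents -= amount_cents
        return {"amount_cents": amount_cents}


def fetch_with_payment(service: Service, wallet: Wallet, path: str,
                       max_price_cents: int) -> Response:
    """Try the request; on 402, pay the quoted price and retry once."""
    first = service.handle(path)
    if first.status != 402:
        return first
    price = first.body["price_cents"]
    if price > max_price_cents:
        # The user's spending limit is enforced before any money moves.
        raise ValueError(f"quoted {price} cents exceeds limit {max_price_cents}")
    return service.handle(path, wallet.pay(price))
```

The max_price_cents argument is where the user’s boundaries come in: the agent can pay on the spot, but only up to what it was told it could spend.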
For services that want more traditional arrangements (enterprise contracts, volume pricing), key management services could bridge the gap. The credential identity becomes the underlying account, and the KMS issues conventional API keys on top of it. But the point is that the credential + micropayment path should work by default, with traditional billing as an optional layer, not a requirement.
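One way the bridge could work, sketched in Python with only the standard library. KeyBridge and its methods are hypothetical, and deriving the key with an HMAC is my assumption, not something any particular KMS does; the point is just that the conventional API key maps back to the credential identity underneath it.

```python
import hashlib
import hmac
import secrets
from typing import Optional


class KeyBridge:
    """Hypothetical KMS that issues conventional API keys on top of a credential ID."""

    def __init__(self) -> None:
        # Secret known only to the bridge; used to tag the keys it issues.
        self._secret = secrets.token_bytes(32)

    def _tag(self, credential_id: str) -> str:
        return hmac.new(self._secret, credential_id.encode(), hashlib.sha256).hexdigest()

    def issue_key(self, credential_id: str) -> str:
        # The key embeds the credential ID plus a tag only this bridge can produce.
        return f"{credential_id}.{self._tag(credential_id)}"

    def account_for(self, api_key: str) -> Optional[str]:
        """Return the credential ID behind a key, or None if the key is not ours."""
        credential_id, _, tag = api_key.rpartition(".")
        expected = self._tag(credential_id)
        if hmac.compare_digest(tag.encode(), expected.encode()):
            return credential_id
        return None
```

A service that only understands API keys keeps working as before, while billing and accountability still resolve to the credential.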
What the credential should carry
For this to work at scale with reliability and trust, the credential can’t just be a binary “verified” stamp. A service needs enough information to make its own trust decision. I think it needs to be rich. Something like:
The developer’s verification tier (how thoroughly they’ve been checked), the agent’s safety evaluation scores (prompt injection robustness, PII leakage resistance, tool abuse handling), what data categories the agent processes, its technical profile, operational contacts.
A healthcare API can look at this and say “I need level 3 verification and I need to confirm no PII retention.” A weather API can look at it and say “yeah, just needed the email anyway, whatever, you’re fine.” The credential provides the information. The policy is up to the service.
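In code, that split between information and policy might look something like this. It’s a Python sketch with invented names (CredentialClaims, healthcare_policy, weather_policy) and only the handful of fields those two examples need.

```python
from dataclasses import dataclass


@dataclass
class CredentialClaims:
    """A few of the things a rich credential could carry (hypothetical fields)."""
    verification_tier: int
    retains_pii: bool
    verified_email: bool


def healthcare_policy(claims: CredentialClaims) -> bool:
    # "I need level 3 verification and I need to confirm no PII retention."
    return claims.verification_tier >= 3 and not claims.retains_pii


def weather_policy(claims: CredentialClaims) -> bool:
    # "Just needed the email anyway."
    return claims.verified_email


hobby_agent = CredentialClaims(verification_tier=1, retains_pii=False, verified_email=True)
clinical_agent = CredentialClaims(verification_tier=3, retains_pii=False, verified_email=True)
```

The same credential gets a different answer from each service, and neither service had to coordinate with the issuer to decide.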
Where I’m still a little blurry
There are parts of this I haven’t fully thought through.
Service discovery. How does the agent find services it’s never used? My intuition says this mostly looks like web search, service compilers and hubs. But I think there’s a more interesting version of this: what if services exposed a special kind of endpoint specifically for agents? Not documentation meant for humans, but... maybe a skill file. Something that tells the agent what the service does, how to use it well, what the expected inputs and outputs look like, what the pricing is, what credential tier is required. Think of it like a robots.txt but for agent capabilities. The agent hits this endpoint, reads the skill, and now it knows how to interact with a service it’s never encountered before. Discovery and onboarding in one step. I don’t think this needs a grand registry. The web is already pretty good at discovery, and agents are already pretty good at reading instructions.
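To make the skill file idea less abstract, here’s a guess at what one could contain and how an agent might read it, in Python. The JSON layout, the field names, and the load_skill and plan_call helpers are all assumptions on my part; no such format exists as a standard.

```python
import json

# A hypothetical skill file, as a service might serve it to agents.
SKILL_JSON = """
{
  "service": "example-weather",
  "description": "Current conditions and forecasts by city.",
  "required_tier": 1,
  "actions": [
    {"name": "forecast",
     "inputs": {"city": "string", "days": "integer"},
     "price_cents": 2}
  ]
}
"""

REQUIRED_FIELDS = {"service", "required_tier", "actions"}


def load_skill(text: str) -> dict:
    """Parse a skill file and check it has the fields the agent relies on."""
    skill = json.loads(text)
    missing = REQUIRED_FIELDS - set(skill)
    if missing:
        raise ValueError(f"skill file missing fields: {sorted(missing)}")
    return skill


def plan_call(skill: dict, action_name: str, agent_tier: int, budget_cents: int):
    """Return the action to call, or None if tier or budget rules it out."""
    if agent_tier < skill["required_tier"]:
        return None
    for action in skill["actions"]:
        if action["name"] == action_name and action["price_cents"] <= budget_cents:
            return action
    return None


skill = load_skill(SKILL_JSON)
```

Everything the agent needs to decide whether it can use the service (what it does, what it costs, what tier it wants) comes from one fetch.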
Credential ownership. I keep going back and forth on this, but I think I land on a self-sovereign model. You own your credential. The issuer can revoke it if you violate terms or something goes wrong, and that’s a necessary safety valve. But external validators and verifiers should be able to add signals to it too. Think of it less like a license that someone grants you and more like a passport that accumulates stamps. Different parties can attest to different things, but you hold it.
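The passport analogy maps onto a pretty small data structure. A Python sketch, with Passport, Stamp, and the method names all invented here, and with signatures on the stamps left out:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stamp:
    attester: str  # who is vouching
    claim: str     # what they vouch for


@dataclass
class Passport:
    """A credential the holder owns, the issuer can revoke, and others can stamp."""
    holder: str
    issuer: str
    revoked: bool = False
    stamps: list = field(default_factory=list)

    def add_stamp(self, attester: str, claim: str) -> None:
        if self.revoked:
            raise ValueError("cannot stamp a revoked credential")
        self.stamps.append(Stamp(attester, claim))

    def revoke(self, by: str) -> None:
        # The safety valve: only the original issuer can pull the credential.
        if by != self.issuer:
            raise PermissionError("only the issuer can revoke")
        self.revoked = True

    def claims_from(self, trusted: set) -> set:
        """Claims a verifier accepts, given the attesters it trusts."""
        if self.revoked:
            return set()
        return {s.claim for s in self.stamps if s.attester in trusted}
```

Each verifier chooses which attesters it trusts, so a stamp only counts where its attester is recognized.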
The interaction model. I envision a world where AI becomes the abstraction layer between you and platforms. Not another UI. Not another app. The agent just... does things on your behalf. And the authorization should feel as natural as turning to someone next to you and saying “yeah, go ahead” or “yep, do this for me please.” We’re not there yet on the UX side, but the infrastructure needs to be ready for when we are. Clawdbot is just the first step towards a more... app-less future.
Adoption. The classic chicken-and-egg. Agents won’t carry credentials if no services accept them. Services won’t accept credentials if no agents carry them. But I actually think the self-sovereign model helps here. If the credential is something the developer owns and carries regardless, because it’s useful for traceability and trust even without universal acceptance, then adoption can happen gradually. Services opt in as the agent economy grows. A need will emerge, and this could be one answer.
This isn’t a brand new idea
I want to be clear: none of this is entirely novel. There are people thinking about verifiable credentials for machines, agent payment protocols, and decentralized identity. Some of this overlaps with ideas from the W3C Verifiable Credentials spec, HTTP Message Signatures (RFC 9421), and the broader self-sovereign identity movement.
What I wanted to do here is just pull these threads together into a coherent picture of what I believe the ideal system looks like, and make the case for why it matters right now. Because the models are getting good enough that the infrastructure bottleneck is becoming the binding constraint. Environment and infrastructure matter more than intelligence for unlocking real long-horizon agency.
The agents of tomorrow won’t be more general because they’re smarter. They’ll be more general because we built them a world they can actually move through.
The unlock
At scale, this becomes the difference between an agent that can do 5 things you’ve configured it to do, versus the same agent being able to finish any task within the scope of the boundaries you’ve set.
Think about what changes. Today, adding a new capability to an agent means a code change, a new API key, a new billing account, a deployment. In a credential-based world, the agent just... uses the new service at runtime. It discovers it, presents its credential, pays for the interaction, and moves on. The agent’s capability space goes from “what the developer integrated” to “what exists.”
I think this is actually more important than making models smarter, at least for the near term. The METR insights and long-horizon task research are crucial, and that’s a real intelligence problem I want to write about separately. But even with today’s models, if we just solved the infrastructure problem, agents would be dramatically more useful. Intelligence is already there for a huge range of tasks. The plumbing is what’s missing.
~ pranav




