Agent Name Service and Web Bot Auth: The New Permission Layer for AI Crawlers

AIHelpTools Team, May 5, 2026
Tags: agentic-ai, web-standards, api-auth, content-monetization

Table of Contents

  1. The Problem with Today's AI Crawlers
  2. What Agent Name Service Actually Does
  3. How Web Bot Auth Works
  4. Implementation Guide for Site Owners
  5. Implementation for Agent Developers
  6. Monetization and Access Control Options
  7. What This Means for Content Publishers

The Problem with Today's AI Crawlers

Right now, AI crawlers show up at your site claiming to be GPTBot or Claude-Web, and you have exactly two options: believe the User-Agent string or don't. That's it. You can't verify the identity. You can't charge for access. You can't audit what they took.

Analogy: This is like letting anyone into your building as long as they're wearing a delivery uniform. No ID check, no signature, no tracking.

The old crawler contract worked because Google needed your content and you needed Google's traffic. Everyone played nice because the incentives aligned. AI training crawlers broke that contract. They scrape once, the resulting models are used indefinitely, and nothing comes back to you.

Cloudflare and GoDaddy just announced a technical fix: Agent Name Service (ANS) and Web Bot Auth. These aren't policy statements or gentlemen's agreements. They're actual protocols that give every AI agent a verifiable identity and let you set real access rules.

What Agent Name Service Actually Does

ANS is a DNS-based registry that anchors AI agent identities to domain names. Think of it as a phone book, but for bots.

When an AI company registers an agent with ANS, they publish a DNS TXT record at a subdomain like _ans.agent.company.com. That record contains:

  • Agent identifier
  • Public key for signature verification
  • Links to transparency logs
  • Contact information

The transparency log piece matters. Every ANS registration gets logged in a Certificate Transparency-style append-only log. You can audit who registered what and when. No stealth agents.

Here's what a typical ANS lookup returns:

Field      | Example Value     | Purpose
Agent ID   | openai.com/gptbot | Canonical identifier
Public Key | JWK format        | Verify signatures
Log Entry  | CT log URL        | Audit trail
Contact    | abuse@openai.com  | Report issues

The DNS anchoring is clever. If someone tries to impersonate GPTBot, they'd need to compromise OpenAI's DNS. That's a much higher bar than spoofing a User-Agent header.
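To make the lookup concrete, here is a minimal sketch of parsing an ANS TXT record once it has been fetched. It assumes the `v=ANS1; id=...; key=...; log=...` layout shown later in this article, and it assumes that field values never contain a semicolon. The function name `parseAnsRecord` and the choice of required fields are illustrative, not part of any published ANS specification.

```javascript
// Parse an ANS-style TXT record into a plain object.
// Assumed layout (taken from this article, not a confirmed spec):
//   "v=ANS1; id=<agent id>; key=<public key>; log=<transparency log URL>"
function parseAnsRecord(txt) {
  // DNS libraries often return a single record as several string chunks
  const raw = Array.isArray(txt) ? txt.join('') : txt;
  const fields = {};
  for (const part of raw.split(';')) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const eq = trimmed.indexOf('=');
    if (eq <= 0) throw new Error(`Malformed field: ${trimmed}`);
    // Split on the first "=" only, so values may contain "=" themselves
    fields[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  if (fields.v !== 'ANS1') throw new Error('Not an ANS1 record');
  // Assumption: an identifier and a key are the minimum needed to verify anything
  for (const required of ['id', 'key']) {
    if (!fields[required]) throw new Error(`Missing field: ${required}`);
  }
  return fields;
}

const record = parseAnsRecord([
  'v=ANS1; id=company.com/myagent; ',
  'key=abc123==; log=https://log.example/entry/1',
]);
console.log(record.id); // company.com/myagent
```

In Node.js, `dns.promises.resolveTxt` returns each TXT record as an array of string chunks, which is why the sketch accepts either a string or an array.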

How Web Bot Auth Works

Web Bot Auth is the authentication layer. It's an IETF Internet-Draft, built on HTTP Message Signatures (RFC 9421), that lets agents prove their identity on every request using cryptographic signatures.

The flow looks like this:

Figure: Web Bot Auth request flow. Agent requests content → server checks signature → ANS lookup for public key → verify identity → apply rules → grant/deny access.

The agent includes these HTTP headers with every request:

  • Signature-Agent: The ANS identifier
  • Signature: Cryptographic signature of the request
  • Signature-Input: Details about what was signed

Your server fetches the public key from ANS, verifies the signature, and knows exactly who's asking.
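A valid signature alone does not stop a captured request from being replayed, so a server would normally also check the timestamps and nonce carried in `Signature-Input`. The sketch below shows one way to do that. The helper names `parseSignatureParams` and `checkFreshness`, the 60-second clock-skew allowance, and the in-memory nonce map are all assumptions for illustration, and the sketch treats `created`, `expires`, and `nonce` as mandatory, which is a policy choice rather than something this article specifies.

```javascript
// Extract parameters such as created, expires, and nonce from a Signature-Input
// value like: ("@method" "@authority");created=1700000000;expires=1700000300;nonce="abc"
function parseSignatureParams(signatureInput) {
  const params = {};
  // Only look at the part after the closing parenthesis of the component list
  const tail = signatureInput.slice(signatureInput.indexOf(')') + 1);
  const pattern = /;\s*([a-z*][a-z0-9_.*-]*)=(?:"([^"]*)"|(-?\d+))/g;
  for (const [, name, quoted, number] of tail.matchAll(pattern)) {
    params[name] = quoted !== undefined ? quoted : Number(number);
  }
  return params;
}

// Returns a reason string when the request should be rejected, or null when it looks fresh.
function checkFreshness(
  signatureInput,
  seenNonces, // Map of nonce -> expiry time, shared across requests
  nowSeconds = Math.floor(Date.now() / 1000),
  maxSkewSeconds = 60
) {
  const { created, expires, nonce } = parseSignatureParams(signatureInput);
  if (typeof created !== 'number' || typeof expires !== 'number') return 'missing timestamps';
  if (created > nowSeconds + maxSkewSeconds) return 'created in the future';
  if (expires <= nowSeconds) return 'expired';
  if (typeof nonce !== 'string' || nonce === '') return 'missing nonce';
  if (seenNonces.has(nonce)) return 'nonce replayed';
  // Remember the nonce; a production cache would evict entries once they expire
  seenNonces.set(nonce, expires);
  return null;
}

const seen = new Map();
const input = '("@method" "@authority");created=1700000000;expires=1700000300;nonce="n-1";keyid="k1"';
console.log(checkFreshness(input, seen, 1700000100)); // null
console.log(checkFreshness(input, seen, 1700000100)); // nonce replayed
```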

Implementation Guide for Site Owners

If you run a content site, you have three implementation paths:

Option 1: Use a CDN that supports it

Cloudflare is building native ANS/Web Bot Auth support. You'll get a dashboard where you can:

  • Allow verified agents automatically
  • Block unverified crawlers
  • Set rate limits per agent
  • Track which agents accessed what

DataDome already added Web Bot Auth to their bot protection service. If you use either platform, this becomes a configuration change.

Option 2: Implement verification yourself

The verification logic is straightforward:

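// fetchANSPublicKey(agentId) and buildSignatureInput(request) are helpers assumed
// to be defined elsewhere: the first resolves to an RSA public CryptoKey (imported
// with hash SHA-256), the second returns the signed string.
async function verifyBotRequest(request) {
  const agentId = request.headers.get('Signature-Agent');
  const signatureHeader = request.headers.get('Signature');
  // Requests missing either header are unverified
  if (!agentId || !signatureHeader) return null;

  try {
    // The header carries base64 text, optionally wrapped as label=:base64:,
    // while crypto.subtle.verify needs raw bytes
    const wrapped = signatureHeader.match(/:([A-Za-z0-9+/=]+):/);
    const base64 = wrapped ? wrapped[1] : signatureHeader.trim();
    const signature = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));

    // Fetch public key from ANS
    const publicKey = await fetchANSPublicKey(agentId);

    // Verify signature
    const isValid = await crypto.subtle.verify(
      { name: 'RSASSA-PKCS1-v1_5' },
      publicKey,
      signature,
      new TextEncoder().encode(buildSignatureInput(request))
    );
    return isValid ? agentId : null;
  } catch {
    // Bad encoding, failed lookup, or crypto error: treat as unverified
    return null;
  }
}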
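The `buildSignatureInput` helper is not shown above. Under HTTP Message Signatures (RFC 9421), what gets signed is a "signature base": one line per covered component, followed by a final `"@signature-params"` line. The sketch below rebuilds that string from a request. The name `buildSignatureBase` is hypothetical, only four derived components are handled, and the request is assumed to be any object exposing `method`, `url`, and `headers.get(name)`. The returned string would be UTF-8 encoded before verification.

```javascript
// Rebuild an RFC 9421-style signature base from a request-like object.
// `request` needs: method (string), url (string), headers.get(name).
function buildSignatureBase(request) {
  const signatureInput = request.headers.get('signature-input');
  if (signatureInput == null) throw new Error('Missing Signature-Input header');
  // Drop a leading label such as "sig1=" if one is present
  const params = signatureInput.startsWith('(')
    ? signatureInput
    : signatureInput.slice(signatureInput.indexOf('=') + 1);
  const componentList = params.slice(params.indexOf('(') + 1, params.indexOf(')'));
  const components = componentList.match(/"[^"]+"/g) ?? [];
  const url = new URL(request.url);

  const lines = components.map((quoted) => {
    const name = quoted.slice(1, -1);
    let value;
    if (name === '@method') value = request.method.toUpperCase();
    else if (name === '@authority') value = url.host; // lowercased, default port dropped
    else if (name === '@target-uri') value = url.href;
    else if (name === '@path') value = url.pathname;
    else if (name.startsWith('@')) throw new Error(`Unsupported component: ${name}`);
    else {
      // Anything else is treated as a header field name
      value = request.headers.get(name);
      if (value == null) throw new Error(`Missing covered header: ${name}`);
      value = value.trim();
    }
    return `${quoted}: ${value}`;
  });
  // The final line repeats the signature parameters exactly as received
  lines.push(`"@signature-params": ${params}`);
  return lines.join('\n');
}

const demoRequest = {
  method: 'get',
  url: 'https://Example.com/articles/1',
  headers: new Map([
    ['signature-agent', 'company.com/myagent'],
    ['signature-input', 'sig1=("@method" "@authority" "signature-agent");created=1700000000;keyid="k1"'],
  ]),
};
console.log(buildSignatureBase(demoRequest));
// "@method": GET
// "@authority": example.com
// "signature-agent": company.com/myagent
// "@signature-params": ("@method" "@authority" "signature-agent");created=1700000000;keyid="k1"
```

A fetch-style `Request` works as well, since its `Headers` object matches names case-insensitively. The `Map` here is just a stand-in that keeps the example free of platform dependencies.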

The hard part is deciding what to do with verified vs. unverified traffic.
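One way to frame that decision is a small policy function that takes the verified agent ID (or `null` for an unverified request) and returns an action. The sketch below is an illustration only: the policy shape, the agent IDs, the fixed-window rate limit, and the `createAccessController` name are all assumptions. It is meant for traffic already identified as automated, so blocking `null` here does not imply blocking ordinary visitors.

```javascript
// Hypothetical policy: agent IDs and limits are illustrative only.
const policy = {
  allow: new Set(['openai.com/gptbot']),
  block: new Set(['badbot.example/scraper']),
  defaultVerifiedLimit: 60, // requests per window for other verified agents
  windowSeconds: 60,
};

function createAccessController(rules) {
  const counters = new Map(); // agentId -> { windowStart, count }
  const accessLog = [];       // who asked for what, and what was decided

  function decide(agentId, path, nowSeconds = Math.floor(Date.now() / 1000)) {
    let action;
    if (agentId === null) action = 'block'; // no valid signature
    else if (rules.block.has(agentId)) action = 'block';
    else if (rules.allow.has(agentId)) action = 'allow';
    else {
      // Fixed-window rate limit for verified agents without an explicit rule
      let entry = counters.get(agentId);
      if (!entry || nowSeconds - entry.windowStart >= rules.windowSeconds) {
        entry = { windowStart: nowSeconds, count: 0 };
        counters.set(agentId, entry);
      }
      entry.count += 1;
      action = entry.count <= rules.defaultVerifiedLimit ? 'allow' : 'rate_limit';
    }
    accessLog.push({ agentId, path, action, at: nowSeconds });
    return action;
  }

  return { decide, accessLog };
}

const controller = createAccessController(policy);
console.log(controller.decide('openai.com/gptbot', '/post/1', 1000)); // allow
console.log(controller.decide(null, '/post/1', 1000));                // block
```

Keeping the log next to the decision means every request, allowed or not, leaves a record of which agent asked for which path.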

Option 3: Proxy through an auth service

Several startups are building ANS verification as a service. You route crawler traffic through their API, they handle verification, you get back a clean decision.

Implementation for Agent Developers

If you're building an AI agent or crawler, Web Bot Auth is your credibility signal.

First, register with ANS. Publish your DNS record:

_ans.myagent.company.com TXT "v=ANS1; id=company.com/myagent; key=<JWK>; log=<CT-URL>"

Then sign every request. Here's the critical code from Stytch's implementation guide:

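// signRequest(signatureInput, privateKey) is assumed to be defined elsewhere and to
// return the signature already encoded for the Signature header.
async function createWebBotAuthHeaders(url, signatureAgent, publicJWK, privateKey) {
  // Timestamps are Unix seconds; the signature is valid for 24 hours
  const now = Math.floor(Date.now() / 1000);
  const tomorrow = now + (24 * 60 * 60);
  // Fresh nonce per request so a captured request cannot be replayed
  const nonce = crypto.randomUUID();
  
  // Parsing validates the URL; hostname is not used further in this snippet
  const hostname = new URL(url).hostname;
  
  // Covered components, followed by the signature parameters
  const signatureInput = `("@method" "@target-uri" "@authority" "signature-agent" "signature-nonce" "signature-created" "signature-expires");created=${now};expires=${tomorrow};nonce="${nonce}";alg="rs256";keyid="${publicJWK.kid}"`;
  
  const signature = await signRequest(signatureInput, privateKey);
  
  return {
    'Signature-Agent': signatureAgent,
    'Signature-Nonce': nonce,
    'Signature-Created': now.toString(),
    'Signature-Expires': tomorrow.toString(),
    'Signature': signature,
    'Signature-Input': signatureInput
  };
}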

Key points:

  • Use a real keypair, not a shared secret
  • Include timestamps to prevent replay attacks
  • Sign the full request context (method, URL, authority)
  • Use a unique nonce per request

Monetization and Access Control Options

This is where it gets interesting for publishers. Once you can verify agent identity, you can start charging.

Here are the access models publishers are testing:

Model                 | How It Works                               | Best For
Allowlist             | Only verified agents allowed               | High-value content
Tiered Access         | Free tier + paid API                       | News sites
Pay-per-request       | Micropayments via header                   | Research databases
Attribution Required  | Free but must cite source                  | Academic content
Training-only License | Different price for training vs. inference | All content

The pay-per-request model is particularly clever. An agent includes a payment proof header (think macaroon tokens or Lightning Network invoices), and you verify payment before serving content.
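As a rough sketch of that flow, the handler below answers with HTTP 402 (Payment Required) until the agent presents a proof that checks out. Everything specific here is an assumption made for illustration: the `payment-proof` and `x-price-cents` header names, the one-cent price, and the `verifyPayment` callback, which stands in for whatever payment system is actually used. None of them come from a published standard.

```javascript
// Hypothetical header name; not defined by Web Bot Auth or ANS.
const PAYMENT_HEADER = 'payment-proof';

// agentId is the verified identity (or null); verifyPayment is supplied by the caller
// and resolves to true when the proof covers the price for this agent.
async function gatePaidContent(request, agentId, verifyPayment, priceCents = 1) {
  if (agentId === null) {
    return { status: 403, body: 'Unverified agent' };
  }
  const proof = request.headers.get(PAYMENT_HEADER);
  if (proof == null) {
    // Tell the agent what the request costs so it can retry with a proof
    return {
      status: 402,
      headers: { 'x-price-cents': String(priceCents) },
      body: 'Payment required',
    };
  }
  const paid = await verifyPayment(proof, agentId, priceCents);
  return paid
    ? { status: 200, body: 'content' }
    : { status: 402, body: 'Payment proof rejected' };
}

// Usage with a stub verifier that accepts a single known token
const stubVerify = async (proof) => proof === 'valid-token';
const paidRequest = { headers: new Map([[PAYMENT_HEADER, 'valid-token']]) };
gatePaidContent(paidRequest, 'company.com/myagent', stubVerify)
  .then((response) => console.log(response.status)); // 200
```

Verifying identity before payment keeps the two concerns separate: the signature says who is asking, and the proof says whether they have paid.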

Some publishers are thinking bigger. If you can identify the agent, you can track ROI. Did that Perplexity crawl lead to traffic? Did Claude's training run include proper attribution? You finally have audit data.

What This Means for Content Publishers

The practical takeaway: you're about to get real control over AI access to your content.

In six months, your crawler management will probably look like this:

  • Verified agents with good track records get automatic access
  • Unknown crawlers get blocked or rate-limited
  • Commercial AI companies pay per request or via license
  • You get logs showing exactly what was accessed

This doesn't solve every problem. Agents can still lie about what they'll do with your content. Enforcement still requires legal action. And some crawlers will ignore the new standards entirely.

But it solves the identity problem. You'll know who's asking. That's the foundation for everything else.

The bigger shift is philosophical. The old web ran on implicit permissions and good faith. The agentic web is moving toward explicit permissions and cryptographic proof. That's probably the right direction when billions of dollars of AI training are at stake.

For content owners: start thinking about your access policy now. Cloudflare's tools will be shipping soon. You'll need a plan for which agents to allow, which to charge, and which to block.

For agent developers: implement Web Bot Auth before site owners start blocking unverified traffic by default. Your credibility as a good actor depends on it.

The crawler contract is being rewritten. This time in code, not just convention.