The Agent Commerce Tech Stack — Every Layer, Every Tool, the Real Monthly Cost

Q: What is the cheapest way to serve product images to agents?

Cloudflare R2 with zero egress fees. Storage at $0.015/GB-month with 10 GB free; zero egress regardless of volume. S3 at $0.09/GB egress would cost roughly $90 for 1 TB of agent-driven image fetches; R2 costs nothing.

Q: Do I really need rate limiting on day one?

Yes. Agents call at machine speed. A runaway agent loop can exhaust a metered free tier in minutes; an attacker can do it on purpose. Implement per-IP for anonymous, per-API-key for authenticated, backed by Redis. UCP's spec already assumes rate limiting exists.

Q: What is the minimum I need to test before shipping?

Four things: MCP Inspector for interactive checks if you ship MCP; Prism mock server to validate your OpenAPI spec against simulated agent calls; the UCP conformance suite if you ship UCP; EvalView for regression testing of agent tool-call behavior in CI.

Q: Why is OpenAPI specifically important for agents — isn't it just developer docs?

Because LLMs consume it directly. AI Agents rely on precise, machine-readable signals — when the API lacks predictable schemas, typed errors, and clear behavioral rules, AI agents cannot function. OpenAPI is the contract; MCP tools, the OpenAI Agents SDK, LangChain, Speakeasy, and FastMCP all auto-generate agent integrations from it.

§ 01 · Lede

The complete stack at a glance.

Eight layers, every one of which has to exist before an AI agent can find your store, query your catalog, transact, and report the result back to its user. Most solo founders show up thinking "I just need a Stripe button." The actual stack is closer to this:

┌─────────────────────────────────────────────────────────────────────────┐
│                    AGENT COMMERCE TECH STACK (2026)                     │
├──────────────┬──────────────────────────────────────────────────────────┤
│ Layer        │ Tool (primary → alternative)                             │
├──────────────┼──────────────────────────────────────────────────────────┤
│ L1 Hosting   │ Vercel Fluid Compute → Railway → Cloud Run               │
│ L2 Database  │ Supabase (Postgres + pgvector) → Neon                    │
│ L3 Caching   │ Upstash Redis → Cloudflare KV                            │
│ L4 API GW    │ Zuplo → Cloudflare Workers → Kong                        │
│ L5 Auth/Keys │ Unkey → JWT + Supabase Auth                              │
│ L6 Billing   │ Stripe → Lemon Squeezy (tax-inclusive)                   │
│ L7 Agent I/F │ OpenAPI spec + FastMCP server + UCP checkout             │
│ L8 Observ.   │ Grafana Cloud (metrics/logs) + Sentry (errors)           │
│ CI/CD        │ GitHub Actions + Vercel preview deployments              │
│ Vector Search│ pgvector (free/bundled) → Pinecone → Qdrant Cloud        │
└──────────────┴──────────────────────────────────────────────────────────┘

The second diagram shows where this maps to the broader protocol and payment stack:

┌─────────────────────────────────────────────────────────────┐
│ 1. AI Surface             Google AI Mode · ChatGPT · Claude │
├─────────────────────────────────────────────────────────────┤
│ 2. Protocol               UCP · ACP · MCP · A2A             │
├─────────────────────────────────────────────────────────────┤
│ 3. Discovery              /.well-known/ucp · Schema.org ·   │
│                           Merchant Center feed              │
├─────────────────────────────────────────────────────────────┤
│ 4. API + OpenAPI          FastAPI on Vercel Fluid Compute   │
├─────────────────────────────────────────────────────────────┤
│ 5. Data                   Postgres (Supabase/Neon) +        │
│                           pgvector + Upstash Redis          │
├─────────────────────────────────────────────────────────────┤
│ 6. Payment Authorization  AP2 mandates · Stripe SPTs ·      │
│                           Visa Trusted Agent Protocol       │
├─────────────────────────────────────────────────────────────┤
│ 7. Billing + API Keys     Stripe Billing · Unkey / Zuplo    │
├─────────────────────────────────────────────────────────────┤
│ 8. Observability          Sentry · Logfire · Cloudflare     │
│                           Bot Analytics · structured logs   │
└─────────────────────────────────────────────────────────────┘

Per the UCP specification at ucp.dev, this stack collapses N×N integration complexity into a 1-to-Many model: implement once, every UCP-compliant agent can reach you. The eight layers below are what "implement once" actually means.

What agents actually touch

Every purchase-path request from a shopping agent passes through L4 (API gateway and rate limiting), L5 (auth and API keys), the FastAPI router, L2 (database), and the L7 OpenAPI/MCP response, using the layer labels from the first diagram. Caching, observability, and vector search are non-blocking supporting infrastructure.

Group 1 · Layers 1–2

Surface & Protocol

Where the buyer's agent actually shows up — and the language it speaks before it touches your API.

Layer 01

AI Surface Layer

Where the buyer is · Build for, not build

You don't build this

What it is: The consumer-facing AI app — Google AI Mode, ChatGPT, Claude, Gemini, Perplexity — that the buyer is talking to.

Why it's in the stack: You don't build this layer. You build for it. Every other layer below is calibrated to what these surfaces can actually call.

Solo founder action: Pick at least one surface to optimize for on day one. Per Google's UCP Guide, Google AI Mode requires an active Merchant Center account plus UCP integration. For ChatGPT Instant Checkout, you need the Stripe-backed Agentic Commerce Protocol (ACP). Pick one. Don't try to ship both at MVP.

Decision before Day 1

Choose one surface to optimize for first — Google AI Mode (UCP path) or ChatGPT Instant Checkout (ACP path). The rest of the stack inherits from this pick.

Layer 02

Protocol Layer

UCP · ACP · MCP · A2A · Open standards

UCP or ACP

What it is: The agreed language between agent and merchant — Universal Commerce Protocol (UCP), Agentic Commerce Protocol (ACP), Model Context Protocol (MCP), Agent-to-Agent (A2A), with the Agent Payments Protocol (AP2) underneath for payment authorization.

Why it exists: Without a protocol, every agent integration is a one-off. Per Google's UCP announcement, UCP is "an open-source standard designed to power the next generation of agentic commerce. By establishing a common language and functional primitives, UCP enables seamless commerce journeys between consumer surfaces, businesses, and payment providers."

Tools that implement it

UCP — /.well-known/ucp JSON manifest plus REST or MCP transport. Open-source, backed by Google, Shopify, Etsy, Wayfair, Target, Walmart, Stripe, Visa, Mastercard.
ACP — Stripe + OpenAI's open standard at agenticcommerce.dev, Apache 2.0, powers ChatGPT Instant Checkout.
MCP — Anthropic's JSON-RPC 2.0 protocol, now under the Linux Foundation's Agentic AI Foundation per WorkOS's MCP guide.

Tradeoff

UCP is the most ambitious — full commerce journey, decentralized via /.well-known/. ACP is the most pragmatic if you're already on Stripe; per Stripe's announcement, existing Stripe users can enable ACP with "as little as one line of code." MCP is the lowest-level — it's just the tool-calling pipe.

End of Layer 2 deliverable

One protocol picked. If on Stripe already, ACP. If building catalog-first and want decentralized discovery, UCP. MCP gets added as the agent-facing adapter on top of either.

Group 2 · Layers 3–4

Discovery & API

How an agent finds you in the first place, and the HTTP surface it actually hits.

Layer 03

Discovery Layer

Schema.org · /.well-known/ucp · Merchant Center feed

Static JSON

What it is: How an agent finds you. The /.well-known/ucp manifest, Schema.org Product markup on every product page, and a Merchant Center feed.

Why it exists: Per Schema App's analysis quoting Google: "Structured data is critical for modern search features because it is efficient, precise, and easy for machines to process… Schema Markup is no longer an SEO tactic. It is core infrastructure for AI-driven search."

Tools

Schema.org Product type on every product page.
Google Merchant Center account + product feed.
A static JSON file served at /.well-known/ucp — no runtime needed for the manifest itself.

Minimum Google AI Mode requirements

Per Google's structured data docs and dataiads.io feed analysis: GTIN, MPN, brand, product titles of at least 30 characters, descriptions of at least 500 characters, a minimum of 3 additional images at 1500×1500px, and real-time inventory sync every 15–60 minutes.
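The feed requirements above lend themselves to an automated pre-publish check. The following is a minimal sketch, assuming a product record shaped as a plain dictionary; the field names (`title`, `additional_images`, `main_image`) and the sample values are hypothetical, and image dimensions are not checked because that would require fetching the files. The JSON-LD builder uses only standard Schema.org Product properties.

```python
import json

# Thresholds copied from the requirements listed above.
MIN_TITLE_CHARS = 30
MIN_DESCRIPTION_CHARS = 500
MIN_ADDITIONAL_IMAGES = 3


def feed_problems(product: dict) -> list[str]:
    """Return human-readable problems; an empty list means the record passes."""
    problems = []
    for field in ("gtin", "mpn", "brand"):
        if not product.get(field):
            problems.append(f"missing {field}")
    if len(product.get("title", "")) < MIN_TITLE_CHARS:
        problems.append(f"title shorter than {MIN_TITLE_CHARS} characters")
    if len(product.get("description", "")) < MIN_DESCRIPTION_CHARS:
        problems.append(f"description shorter than {MIN_DESCRIPTION_CHARS} characters")
    if len(product.get("additional_images", [])) < MIN_ADDITIONAL_IMAGES:
        problems.append(f"fewer than {MIN_ADDITIONAL_IMAGES} additional images")
    return problems


def product_json_ld(product: dict) -> str:
    """Render Schema.org Product markup for a <script type="application/ld+json"> tag."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["title"],
        "description": product["description"],
        "gtin": product["gtin"],
        "mpn": product["mpn"],
        "brand": {"@type": "Brand", "name": product["brand"]},
        "image": [product["main_image"], *product["additional_images"]],
    }
    return json.dumps(doc, indent=2)


# Hypothetical catalog record used only to exercise the two functions.
example = {
    "title": "Example Trail Running Shoe, Men's Size 10",
    "description": "Lightweight trail shoe. " * 25,
    "gtin": "00012345678905",
    "mpn": "TR-100-10",
    "brand": "ExampleBrand",
    "main_image": "https://example.com/img/main.jpg",
    "additional_images": [f"https://example.com/img/{n}.jpg" for n in range(1, 4)],
}

if not feed_problems(example):
    print(product_json_ld(example))
```

Running the check in CI before each feed upload is one way to keep incomplete records out of Merchant Center.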

End of Layer 3 deliverable

Schema.org JSON-LD on every product page, a /.well-known/ucp manifest served over HTTPS, and a feed live in Google Merchant Center.

Layer 04

API + OpenAPI Layer

FastAPI · Vercel Fluid Compute · 6 minimum endpoints

FastAPI + Vercel

What it is: The HTTP endpoints an agent calls — products, cart, checkout, orders — plus the OpenAPI spec that describes them.

Why it exists: Per the Postman State of the API Report: "AI Agents rely on precise, machine-readable signals, not tribal knowledge. When your API lacks predictable schemas, typed errors, and clear behavioral rules, AI agents can't function as they're intended to." Only 24% of developers actively design APIs with agents in mind.

Tools

FastAPI on Python 3.12/3.13/3.14 — gives you a working OpenAPI spec at /openapi.json for free.
Vercel Fluid Compute as the runtime (per Vercel's FastAPI docs, auto-detected, native ASGI, Fluid Compute enabled by default).
Alternatives: Cloud Run, Fly.io, Railway, AWS Lambda + Mangum — see the Cost Breakdown table for tradeoffs.

Minimum UCP REST surface

Per the dev.to UCP vs ACP technical comparison, six endpoints are required:

GET  /products
GET  /products/:id
POST /checkout-sessions
PUT  /checkout-sessions/:id
POST /checkout-sessions/:id/complete
GET  /orders/:id

Cold start caveat

Python serverless cold starts run 600ms–1,500ms on Vercel and Lambda depending on bundle size. Agents operating synchronously will time out if cold starts exceed their patience window. Mitigation: Railway ($5/mo always-on container) for any endpoint agents hit on a strict SLA, or use Vercel's keep-alive configuration.

End of Layer 4 deliverable

FastAPI app deployed to Vercel Pro with the six UCP endpoints live, /openapi.json auto-generated, and a custom domain pointed at the deployment.

Group 3 · Layers 5–6

Data & Payments

The store under the API, the vector index next to it, and the cryptographic proof that the buyer actually authorized the charge.

Layer 05

Data Layer

Postgres · pgvector · Redis · Supabase or Neon

Supabase Pro

What it is: A relational store for the product catalog, orders, users, and sessions; a vector store for semantic product search; a Redis cache for rate limiting and idempotency keys.

Why it exists: Per upsun.com's UCP analysis: "Implement Vector Databases (like Qdrant or pgvector) to cache the meaning of a request. If the agent asks a question that is semantically close to a cached result, the system can serve the cached JSON without hitting the primary SQL database." Agents query in natural language; keyword search is the wrong index.

Tools

Postgres on Supabase ($25/mo Pro) or Neon (usage-based, ~$15/mo at low scale) — both ship with pgvector.
pgvector as the vector index, free if you're already on Postgres. Per Firecrawl's vector database guide, pgvector + pgvectorscale delivers 471 QPS at 99% recall on 50M vectors — 11.4× the throughput of standalone Qdrant and roughly 75% cheaper than Pinecone at scale.
Upstash Redis — 500K commands/month free; pay-as-you-go at $0.20/100K thereafter (Upstash pricing).

Embedding cost is negligible

A 100,000-product catalog at roughly 200 tokens each = 20M tokens = $0.40 one-time using OpenAI text-embedding-3-small at $0.02/1M tokens. Voyage AI gives the first 200M tokens free on voyage-4-lite. Embedding cost is rarely the constraint.
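The embedding arithmetic above can be reproduced in a few lines. This is a sketch using only the figures quoted in this section; the function name is illustrative.

```python
def embedding_cost_usd(num_products: int, tokens_per_product: int,
                       usd_per_million_tokens: float) -> float:
    """One-time cost of embedding a catalog, in US dollars."""
    total_tokens = num_products * tokens_per_product
    return total_tokens / 1_000_000 * usd_per_million_tokens


# 100,000 products at roughly 200 tokens each, priced at $0.02 per 1M tokens.
cost = embedding_cost_usd(100_000, 200, 0.02)
print(f"${cost:.2f}")  # prints $0.40
```

Re-embedding after a catalog change costs the same per token, so even frequent full refreshes stay small at this rate.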

Supabase free tier trap

Supabase free projects pause after 1 week of inactivity. Catastrophic for an always-on agent API. Neon free suspends compute when the monthly CU-hour limit is hit. Upgrade to Supabase Pro ($25/mo) or Neon Launch before your first real user — this is non-negotiable for agent workloads.

End of Layer 5 deliverable

Postgres running on Supabase Pro with pgvector enabled, product catalog embedded, Upstash Redis connected, and the FastAPI app reading from both.

Layer 06

Payment Authorization Layer

AP2 · Stripe SPTs · Visa TAP · Cryptographic mandates

Stripe SPTs

What it is: The cryptographic proof that the buyer authorized this specific purchase — separate from the act of charging the card.

Why it exists: Per Stripe's agentic commerce announcement: "Trust can't be inferred — it has to be explicitly granted, scoped, and enforced in code." Shopping agents should never see raw payment credentials.

Tools

AP2 (Agent Payments Protocol) at ap2-protocol.org — Verifiable Digital Credentials chained together; Cart Mandate (human-present, hardware-signed) and Intent Mandate (human-not-present, pre-signed with TTL).
Stripe Shared Payment Tokens (SPTs) — scoped to a specific business, time-limited, amount-capped, webhook-revokable.
Visa Trusted Agent Protocol — per Visa's announcement, built on RFC 9421 HTTP Message Signatures and aligned with Cloudflare Web Bot Auth.

Replay protection

Per the CSA AP2 security analysis: TTL on Intent Mandates (24h recommended), creation_time timestamps, hardware-backed ECDSA signatures, SHA-256 checksums on all A2A messages, and idempotency keys on every checkout endpoint. The CSA reports AP2 reduces fraud from 2.1% (API-based) to 1.15% in simulated transactions.
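Two of the replay protections above, the TTL on Intent Mandates and the SHA-256 checksum, are simple enough to sketch with the standard library. This is not the AP2 wire format: the function names and the way the timestamp and checksum arrive are assumptions made for illustration, and signature verification is left out entirely.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

INTENT_MANDATE_TTL = timedelta(hours=24)  # the recommended TTL quoted above


def mandate_is_fresh(creation_time: datetime, now: datetime,
                     ttl: timedelta = INTENT_MANDATE_TTL) -> bool:
    """Accept a mandate only if it was created in the past and within the TTL."""
    age = now - creation_time
    return timedelta(0) <= age <= ttl


def checksum_matches(message_body: bytes, claimed_sha256_hex: str) -> bool:
    """Compare the body's SHA-256 digest with the claimed one in constant time."""
    actual = hashlib.sha256(message_body).hexdigest()
    return hmac.compare_digest(actual, claimed_sha256_hex.lower())


created = datetime(2026, 5, 19, 12, 0, tzinfo=timezone.utc)
print(mandate_is_fresh(created, created + timedelta(hours=23)))  # within TTL
print(mandate_is_fresh(created, created + timedelta(hours=25)))  # expired
```

Rejecting timestamps from the future as well as stale ones guards against clock-skew tricks; how much skew to tolerate is a policy decision this sketch does not make.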

End of Layer 6 deliverable

Stripe SPT or AP2 Cart Mandate flow live on /checkout-sessions/:id/complete, with idempotency keys enforced at the database layer and TTL on every Intent Mandate.

Group 4 · Layers 7–8

Billing & Ops

The money in, and the dashboard that tells you when something is wrong.

Layer 07

Billing + API Key Management

Stripe Billing · Unkey · Zuplo · Per-key rate limits

Stripe + Unkey

What it is: How you charge for API access — free tier metering, paid tier enforcement, key issuance, rate limits per key.

Why it exists: A free-to-paid funnel is the only way an agent-first API scales. Stripe handles money, Unkey handles keys, Zuplo (optional) handles the gateway in front.

Tools

Stripe Billing at 0.7% of subscription volume on top of 2.9% + $0.30 base. Compounding fee is real — see Shortcuts below.
Unkey for API key issuance, rate limiting, and verification. [PARTIALLY VERIFIED — Unkey's public pricing page has pivoted to a CPU/memory/egress infrastructure model; per-verification quotas referenced come from their engineering RFC.] (unkey.com/pricing)
Zuplo ($25/mo Builder) as the heavier-duty alternative — 100K requests included, but overage is steep at $100 per additional 100K (Zuplo pricing).

Stripe Billing math

At $50/mo subscriptions: 2.9% + $0.30 base + 0.7% Billing = 4.2% effective fee ($2.10 per charge). At $10K MRR, the 0.7% Billing fee alone is $70/month, which matches the cost table. The annual Billing plan ($620/mo) breaks even only above roughly $90K MRR ($620 ÷ 0.7% ≈ $88.6K).
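The fee arithmetic in this section can be checked with a short script. This is a sketch that uses only the rates quoted here; the constant and function names are illustrative, and real invoices may differ with currency conversion, international cards, or negotiated pricing.

```python
CARD_PCT = 0.029             # 2.9% card processing
CARD_FIXED = 0.30            # $0.30 per successful charge
BILLING_PCT = 0.007          # 0.7% Stripe Billing, pay-as-you-go
ANNUAL_PLAN_MONTHLY = 620.0  # flat monthly price of the annual Billing plan


def subscription_fees(price: float) -> dict:
    """Break down the fees taken from one subscription charge."""
    card = price * CARD_PCT + CARD_FIXED
    billing = price * BILLING_PCT
    total = card + billing
    return {
        "card": round(card, 2),
        "billing": round(billing, 2),
        "total": round(total, 2),
        "effective_pct": round(total / price * 100, 2),
    }


def billing_fee_at_mrr(mrr: float) -> float:
    """Pay-as-you-go Billing fee for a month of recurring revenue."""
    return mrr * BILLING_PCT


def annual_plan_break_even_mrr() -> float:
    """MRR at which the flat plan costs the same as the 0.7% fee."""
    return ANNUAL_PLAN_MONTHLY / BILLING_PCT


fees = subscription_fees(50.0)
print(fees)
print(round(annual_plan_break_even_mrr()))
```

Because the $0.30 fixed component does not scale with price, the effective percentage rises on cheaper plans; calling `subscription_fees(10.0)` shows the effect.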

End of Layer 7 deliverable

Stripe checkout for API subscriptions live; Stripe webhook creates/revokes Unkey keys on payment events; per-key rate limits enforced at request time.

Layer 08

Observability Layer

Sentry · Logfire · Grafana · Bot Analytics

Sentry + Logfire

What it is: Error tracking, distributed tracing, structured logs, agent vs. human segmentation, and uptime monitoring.

Why it exists: Per Cloudflare Radar's Year in Review, AI bot share of HTML traffic averaged 4.2% globally, with crawl-to-refer ratios reaching 25,000:1 to 500,000:1 for Anthropic's ClaudeBot. You will get drowned in bot traffic; you need to see it.

Tools

Sentry — 5K errors/mo free; $26/mo Team plan for production (Sentry pricing).
Pydantic Logfire — built by the Pydantic team; one-line FastAPI instrumentation; first-class LLM/token tracking.
Grafana Cloud — the most generous free tier in the category: 10K metric series, 50 GB logs, 50 GB traces, 50 GB profiles (Grafana pricing).
Cloudflare Bot Analytics — the only network-level service that classifies bot vs. human reliably; full bot management is Enterprise-only.

The honest answer on agent traffic detection

No tool automatically segments human vs. AI agent traffic. Practical approaches: parse User-Agent for known SDK strings, scope API keys by client type, or verify RFC 9421 HTTP Message Signatures per Cloudflare's Web Bot Auth and Signed Agents posts.

End of Layer 8 deliverable

Sentry catching errors, Logfire instrumenting FastAPI, Grafana dashboards on request volume + latency percentiles, and spend alerts set on Vercel and Supabase at 80% of expected monthly burn.

§ 02 · Cost Reference

Full stack cost by MRR stage.

All prices fetched from official pricing pages. "$1K MRR" assumes ~1,000 active customers and ~3M API calls/month. "$10K MRR" assumes ~10K customers and ~30M API calls/month.

Layer	Tool	Free Tier	At $1K MRR	At $10K MRR	Source
Hosting	Vercel (Fluid Compute, Pro)	1M invocations + 4 CPU-hrs (Hobby, non-commercial)	$20/mo (Pro, $20 usage credit included)	$60–120/mo	vercel.com/pricing
Database	Supabase Pro	500 MB DB, pauses at 1 wk idle	$25/mo + $10 Micro compute	$35–60/mo	supabase.com/pricing
Database (alt)	Neon Launch	100 CU-hrs + 0.5 GB storage	~$15/mo (~80 CU-hrs)	~$40–60/mo	neon.tech/pricing
Cache / KV	Upstash Redis	500K cmds + 256 MB	~$5/mo (~2.5M cmds)	~$20/mo (~10M cmds)	upstash.com/pricing
Vector DB	pgvector (on Postgres above)	Free (extension)	$0 (bundled)	$0 (bundled)	pgvector
Payments	Stripe (cards)	None	~$29/mo (2.9% + $0.30 × ~50 charges)	~$290/mo	stripe.com/pricing
Subscriptions	Stripe Billing	None	$7/mo (0.7% of $1K)	$70/mo (0.7% of $10K)	stripe.com/pricing
API Key Mgmt	Unkey	~150K req (per RFC)	$5/mo (Starter)	$25/mo (Pro)	unkey.com/pricing [PARTIALLY VERIFIED]
Error Tracking	Sentry	5K errors/mo	$26/mo (Team, 50K)	$26–50/mo	sentry.io/pricing
Tracing/Logs	Grafana Cloud	10K series + 50 GB logs	$0–10/mo (within free)	$30–80/mo	grafana.com/pricing
CDN + WAF	Cloudflare	Unlimited DDoS, CDN, SSL	$0 (Free is fine)	$20/mo (Pro)	cloudflare.com/plans
Domain	Cloudflare Registrar	N/A	$0.87/mo (.com at-cost)	$0.87/mo	cloudflare.com
Email (transactional)	Resend	3,000/mo (100/day)	$20/mo (Pro, 50K)	$35/mo (100K)	resend.com/pricing
Estimated total	—	~$1/mo (domain)	~$130–145/mo	~$580–810/mo	—

Vector search add-on (if needed)

pgvector handles product catalog semantic search up to ~1M vectors with an HNSW index at zero additional cost on Supabase Pro. Switch to a dedicated vector database at three signals: (a) catalog exceeds 1M vectors, (b) p99 semantic search latency exceeds 500ms under concurrent agent load, or (c) you need multi-tenant vector isolation.

Tool	Free Tier	At $1K MRR	At $10K MRR	Source
pgvector (bundled)	$0 (included in Supabase/Neon)	$0	$0	pgvector
Pinecone	$0 (2GB, 2M write, 1M read units)	$20/mo (Builder, 10GB)	$50+/mo (Standard)	pinecone.io/pricing
Qdrant Cloud	$0 forever (0.5vCPU, 1GB RAM, 4GB disk)	Usage-based	Usage-based	qdrant.tech/pricing [PARTIALLY VERIFIED]
Weaviate	$0 (14-day trial only)	$45/mo (Flex, min)	$45–400/mo	weaviate.io/pricing

Reading the table: At zero revenue you can run a production-grade agent commerce stack for about a dollar a month, but only because Vercel Hobby is non-commercial. The moment you take a paying customer, you owe Vercel $20/mo for Pro and Supabase $25/mo for a Postgres that doesn't pause itself. From there, the next $80/mo gets you everything else — error tracking, billing, API keys, transactional email, domain. The first $130/mo carries you to roughly $1K MRR. From $1K to $10K MRR, the largest cost growth is Stripe processing fees, not infrastructure.

§ 03 · Build Order

Build order & dependencies.

One row per stack layer. "Hard Dependency On" lists the layers that must exist first. "Can Be Deferred" means the stack can ship and transact without it.

Layer	What It Enables	Hard Dependency On	Can Be Deferred?	Day-One Required?
1. Structured Product Data (Schema.org + Merchant Center)	Agents find products via Google AI Mode and crawler indexing	Product pages + GTINs + 3+ images at 1500×1500	No — discovery starts here	Yes
2. FastAPI commerce REST API	Programmatic access to products, cart, checkout, orders	Database + hosting	No — UCP REST transport depends on this	Yes
3. OpenAPI spec	LLMs read endpoints, parameters, error shapes; auto-generated MCP	API exists (FastAPI emits this for free)	No — agent legibility starts here	Yes
4. UCP `/.well-known/ucp` manifest	Decentralized agent discovery; full UCP journey	API + OpenAPI spec; HTTPS; no 3xx redirects	Yes — can ship private-beta API without it	No — only for UCP-compliant discovery
5. MCP server	Direct tool-use from Claude/ChatGPT/Cursor	OpenAPI spec or REST API	Yes — UCP MCP transport is optional	No — defer unless your buyer uses Claude/Cursor
6. Billing gate	Free-to-paid funnel; key issuance; rate limit enforcement	API + database + Stripe	Yes — can launch free-only for first week	No — but defer with caution
7. Vector search (pgvector)	Semantic product discovery for natural-language agent queries	Postgres + embedding pipeline	Yes — keyword search works under 1K SKUs	No — defer until catalog or queries justify
8. Observability	Error visibility, agent traffic segmentation, spend alerts	API deployed somewhere	No — flying blind is how $1,141 Vercel bills happen	Yes (at minimum: Sentry free + spend alerts)

The minimum viable day-one stack

Per the dev.to UCP vs ACP comparison: "If you have an existing e-commerce backend with product and checkout APIs, adding UCP can take a few hours. Shopify merchants can deploy in under 48 hours using the native integration."

FastAPI app with the six UCP REST endpoints listed in Layer 4
Postgres database (Supabase Pro or Neon Launch)
HTTPS-served /.well-known/ucp JSON manifest pointing at the API
Stripe account configured for ACP or SPTs
Sentry + Vercel spend alerts

§ 04 · Free-Tier Math

When each free tier breaks.

The free tier is a marketing-acquisition channel for every vendor in this stack. They are calibrated to break at slightly different points. Knowing the order of breakage is how you avoid the $1,141 Vercel bill.

Tool	Breaks at roughly…	Why
Resend	~90–100 signups/day	3,000 emails/mo, 100/day hard cap
Postmark	First real user	100 emails/mo, permanent hard limit
Sentry	One bad deploy	5K errors/mo is one outage
Vercel Hobby	First paying customer	Hobby is personal/non-commercial only
Supabase free	One week of inactivity	Projects pause after 7 days idle
Upstash Redis free	~5–8K daily requests	500K commands/mo at 2–3 cache lookups/req
Neon free	~8 days of continuous compute at 0.5 CU	100 CU-hrs/mo; a 0.5 CU instance running 24/7 needs 360 CU-hrs
Pinecone free	~50K queries/day	1M read units/mo; locks you to AWS us-east-1
Cloudflare Workers free	~3M requests/month	100K req/day — generous
Zuplo free	100K requests/month	Builder overage is $100 per additional 100K — steepest jump in the stack
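Several of the breakpoints in the table are plain arithmetic on the published quotas. A minimal sketch, assuming a 30-day month; the function names are illustrative and the inputs are the figures from the table.

```python
DAYS_PER_MONTH = 30  # simplifying assumption


def requests_per_day(monthly_quota: int, ops_per_request: int) -> float:
    """Daily API requests a monthly operation quota can absorb."""
    return monthly_quota / ops_per_request / DAYS_PER_MONTH


def cu_hours_for_always_on(compute_units: float) -> float:
    """CU-hours consumed by an instance that never scales to zero."""
    return compute_units * 24 * DAYS_PER_MONTH


def monthly_from_daily(daily_quota: int) -> int:
    """Monthly ceiling implied by a per-day quota."""
    return daily_quota * DAYS_PER_MONTH


# Upstash free: 500K commands/month at 2 to 3 cache lookups per request.
upstash_high = requests_per_day(500_000, 2)
upstash_low = requests_per_day(500_000, 3)
print(round(upstash_low), round(upstash_high))

# Neon free: CU-hours a 0.5 CU instance needs to stay up all month.
print(cu_hours_for_always_on(0.5))

# Cloudflare Workers free: 100K requests/day.
print(monthly_from_daily(100_000))
```

Plugging in your own lookups-per-request figure is the quickest way to see which tier breaks first for your traffic shape.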

§ 05 · Agent Specifics

Agent-specific considerations.

What changes when your API is built for agents rather than humans.

Traffic patterns

AI crawler traffic grew 15× in 2025 globally; AI bot share of HTML requests reached 4.2% average (Cloudflare Radar Year in Review).
Anthropic's ClaudeBot crawl-to-refer ratio peaks at 500,000:1; OpenAI's GPTBot reaches 3,700:1.
Visa cited a 4,700% surge in AI-driven traffic to U.S. retail sites as justification for launching Trusted Agent Protocol.
Bursty traffic patterns can look like abuse to a naive rate limiter. Configure horizontal scale, not vertical.

Headers well-behaved agents send

OpenAI GPTBot / OAI-SearchBot — declared User-Agent + dedicated IP ranges.
Anthropic-ai / ClaudeBot — declared User-Agent, dedicated IPs.
ChatGPT Agent, Goose, Browserbase, Anchor Browser — RFC 9421 HTTP Message Signatures with Signature-Agent, Signature-Input, Signature headers.

Per Simon Willison's writeup, actual ChatGPT agent headers look like:

Signature-Agent: "https://chatgpt.com"
Signature-Input: sig1=("@authority" "@method" "@path" "signature-agent");
                 created=1754340838;keyid="...";alg="ed25519";tag="web-bot-auth"
Signature: sig1=...

Verification per Arcjet: check Signature-Agent equals "https://chatgpt.com", fetch public keys from that domain's .well-known directory, verify the signature per RFC 9421.
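The first verification step can be sketched without any cryptography library: parse the headers and reject requests that fail the cheap checks before fetching keys. This sketch does not verify the signature itself. The parser is a simplification that assumes no `;` or `)` characters inside quoted values, and the `keyid` in the sample is a placeholder, not a real key identifier.

```python
import re
from typing import Dict, List, Tuple

EXPECTED_AGENT = "https://chatgpt.com"


def parse_signature_input(value: str) -> Tuple[str, List[str], Dict[str, str]]:
    """Split a Signature-Input value into label, covered components, and parameters."""
    flattened = " ".join(value.split())  # undo header line folding
    match = re.match(r'^([A-Za-z0-9_-]+)=\(([^)]*)\)(.*)$', flattened)
    if match is None:
        raise ValueError("unrecognized Signature-Input format")
    label, component_list, param_text = match.groups()
    components = re.findall(r'"([^"]*)"', component_list)
    params: Dict[str, str] = {}
    for part in param_text.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, raw = part.partition("=")
        params[name.strip()] = raw.strip().strip('"')
    return label, components, params


def passes_prechecks(headers: Dict[str, str]) -> bool:
    """Cheap checks to run before the cryptographic verification step."""
    agent = headers.get("Signature-Agent", "").strip().strip('"')
    if agent != EXPECTED_AGENT:
        return False
    try:
        _, components, params = parse_signature_input(headers.get("Signature-Input", ""))
    except ValueError:
        return False
    return "signature-agent" in components and params.get("tag") == "web-bot-auth"


sample_headers = {
    "Signature-Agent": '"https://chatgpt.com"',
    "Signature-Input": 'sig1=("@authority" "@method" "@path" "signature-agent");'
                       'created=1754340838;keyid="placeholder-key";'
                       'alg="ed25519";tag="web-bot-auth"',
    "Signature": "sig1=...",
}
print(passes_prechecks(sample_headers))
```

A request that passes these checks is still unauthenticated until the Ed25519 signature is verified against the agent's published keys, so treat a `True` result as "worth verifying", not "trusted".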

Testing an agent-facing API

MCP Inspector (npx @modelcontextprotocol/inspector) — interactive dev-time tool for testing MCP servers. Not CI-grade.
UCP Conformance Suite at github.com/Universal-Commerce-Protocol/conformance — tests manifest correctness, capability schemas, checkout state machine.
Prism (Stoplight) — open-source OpenAPI mock server + contract testing proxy; runs in CI.
EvalView — "Playwright, but for tool-calling agents." Snapshots agent behavior as a baseline, then diffs new behavior against it. Catches silent regressions when a model update changes tool-call patterns.

Logging that distinguishes agent vs. human

No tool does it natively. The practical pattern is structured JSON logging where every request includes a parsed client_type:

{
  "timestamp": "2026-05-19T13:54:00Z",
  "request_id": "uuid",
  "api_key_id": "hashed_id",
  "endpoint": "POST /checkout-sessions",
  "http_status": 200,
  "latency_ms": 312,
  "user_agent": "OpenAI/Agents-SDK/0.4.2",
  "client_type": "agent",
  "signature_agent": "https://chatgpt.com"
}

This is what makes a client_type:agent filter on your dashboard possible. None of the analytics vendors give it to you by default.
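One way to produce that log line is a small classifier that runs on every request. The sketch below is an assumption-heavy illustration: the User-Agent markers are drawn from the crawler and SDK names mentioned in this article, the `key_type` argument stands in for whatever your API-key system records, and an unverified Signature-Agent header is treated as a hint for segmentation only, never for authorization.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Lowercase substrings that mark a declared agent User-Agent (illustrative list).
AGENT_UA_MARKERS = ("gptbot", "oai-searchbot", "claudebot", "agents-sdk")


def classify_client(headers: dict, key_type: Optional[str] = None) -> str:
    """Label a request 'agent' or 'human' from headers and API-key type."""
    normalized = {name.lower(): value for name, value in headers.items()}
    if normalized.get("signature-agent"):
        return "agent"  # claimed identity; verify before trusting it
    if key_type == "agent":
        return "agent"
    user_agent = normalized.get("user-agent", "").lower()
    if any(marker in user_agent for marker in AGENT_UA_MARKERS):
        return "agent"
    return "human"  # default for anything that matched no agent signal


def log_record(request_id: str, api_key_id: str, endpoint: str, status: int,
               latency_ms: int, headers: dict, key_type: Optional[str] = None) -> str:
    """Build one structured JSON log line with a parsed client_type."""
    normalized = {name.lower(): value for name, value in headers.items()}
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "request_id": request_id,
        "api_key_id": api_key_id,
        "endpoint": endpoint,
        "http_status": status,
        "latency_ms": latency_ms,
        "user_agent": normalized.get("user-agent", ""),
        "client_type": classify_client(headers, key_type),
        "signature_agent": normalized.get("signature-agent", "").strip('"') or None,
    }
    return json.dumps(record)


print(log_record("uuid", "hashed_id", "POST /checkout-sessions", 200, 312,
                 {"User-Agent": "OpenAI/Agents-SDK/0.4.2"}))
```

Keeping the classifier in one function means the marker list can be updated as new agents appear without touching the logging call sites.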

Security: what's different

Replay attacks matter more. Agents retry aggressively. AP2's TTL on Intent Mandates (24h recommended), ECDSA signatures, and idempotency keys exist for this reason.
Credentials don't belong on the agent. Per Stripe: "Tokens can be scoped to a specific business, limited by time or amount, revoked at any time, and monitored via webhook events." Use Stripe SPTs or AP2 Closed Payment Mandates; never give the agent the raw card.
Rate limiting needs per-key + per-IP + per-tool granularity. Per Panther on securing MCP: "Cap concurrent sessions per client, and throttle expensive tool categories (outbound HTTP, database writes, file operations) more aggressively than read-only tools."
Production MCP reality. Per Lenses.io's analysis of ~7,000 MCP server implementations: 86% run locally on developer machines, only 5% in production, 43% contained command injection flaws, 25% had zero authentication. STDIO transport "failed catastrophically under concurrent load (20 out of 22 requests failed with just 20 simultaneous connections)" — if you ship MCP, you must ship Streamable HTTP, not STDIO.
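The per-key, per-tool rate limiting described in the list above can be sketched as a fixed-window counter keyed by client and tool category. This is an in-memory illustration under stated assumptions: the limits are hypothetical numbers, the class name is invented for this example, and a production version would keep the counters in Redis (for instance with INCR plus an expiry) so that every instance shares them.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple

# Hypothetical per-minute limits: writes are throttled harder than reads.
LIMITS_PER_MINUTE = {"read": 120, "write": 20}


class FixedWindowLimiter:
    """Counts requests per (client, category) inside fixed time windows."""

    def __init__(self, limits: Dict[str, int], window_seconds: int = 60,
                 clock: Callable[[], float] = time.time):
        self._limits = limits
        self._window = window_seconds
        self._clock = clock
        self._counts: Dict[Tuple[str, str, int], int] = defaultdict(int)

    def allow(self, client_id: str, category: str) -> bool:
        """Return True and count the request if the client is under its limit."""
        window_index = int(self._clock() // self._window)
        bucket = (client_id, category, window_index)
        if self._counts[bucket] >= self._limits[category]:
            return False
        self._counts[bucket] += 1
        return True


limiter = FixedWindowLimiter(LIMITS_PER_MINUTE)
print(limiter.allow("key_abc", "write"))
```

Use the API key as `client_id` for authenticated callers and the source IP for anonymous ones. Old buckets are never evicted in this sketch, which is acceptable for a demo but would leak memory in a long-running process.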

§ 06 · Production Wiring

The stack in production.

What a real deployed instance of this looks like end-to-end.

GitHub repo
    │
    ├─ push to main → GitHub Actions CI/CD
    │                      │
    │                      ├─ ruff (lint)
    │                      ├─ mypy (type-check)
    │                      ├─ pytest (unit + integration)
    │                      ├─ OpenAPI schema validation
    │                      ├─ pip-audit (dependency security)
    │                      └─ vercel deploy --prod
    │
    └─ PR branch → preview deployment URL on Vercel

Custom domain (Cloudflare DNS, $0.87/mo .com)
    └─ CNAME → vercel.app (auto Let's Encrypt SSL)

Secrets (Vercel env vars → pydantic-settings on boot)
    ├─ DATABASE_URL (Supabase or Neon)
    ├─ UPSTASH_REDIS_REST_URL
    ├─ STRIPE_SECRET_KEY
    ├─ UNKEY_ROOT_KEY
    └─ SENTRY_DSN

Discovery
    ├─ /.well-known/ucp  (static JSON manifest)
    ├─ /openapi.json     (auto from FastAPI)
    └─ /sitemap.xml + Schema.org Product on every PDP

Runtime
    ├─ FastAPI on Vercel Fluid Compute (Active CPU billing)
    ├─ Postgres + pgvector (Supabase Pro $25/mo)
    ├─ Upstash Redis for rate-limit counters + idempotency keys
    └─ Unkey for API-key verification (middleware)

Observability
    ├─ Sentry (ASGI middleware) — error tracking, 5K free
    ├─ Logfire (logfire.instrument_fastapi(app)) — distributed tracing
    ├─ Cloudflare Bot Analytics (Free plan) — bot vs. human signal
    └─ Vercel spend alerts ON (this is non-negotiable)

Minimum pre-deploy CI checks

ruff check . — lint
mypy app/ — type-check Pydantic models
pytest tests/ -v — unit and route tests
python -c "from app.main import app; import json; json.dumps(app.openapi())" — OpenAPI schema validity
pip-audit — known CVEs in dependencies

Tip

Disable Vercel's native git auto-deploy in favor of GitHub Actions gates. Otherwise a broken OpenAPI schema reaches production. The tiangolo/full-stack-fastapi-template (43K GitHub stars) ships with FastAPI, PostgreSQL, SQLModel, Docker Compose, GitHub Actions CI, and Pytest end-to-end coverage — a sound base.

§ 07 · Shortcuts

Common shortcuts that backfire.

Four places solo founders cut corners on agent infrastructure that cause real problems later.

Backfire 01

Shipping without idempotency keys

Duplicate charges

The shortcut: "I'll add idempotency when I see duplicates."

The fail: An AI agent retries a failed POST /checkout-sessions/:id/complete over a flaky network and creates a duplicate charge. Per Stripe's talk on idempotent endpoints: "By ensuring APIs are idempotent, servers process requests exactly once, avoiding unintended behaviors like duplicate transactions."

The fix on day one: Accept an Idempotency-Key header, store it with a TTL in Redis, return the cached response for duplicates. The UCP spec already requires idempotency-key on every transaction.
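The fix can be sketched in plain Python, with a dictionary standing in for Redis. Everything named here is illustrative: the `IdempotencyStore` class, the 24-hour TTL, and the `charge` callback are assumptions, and the get-then-put sequence is not atomic, so a real deployment would rely on an atomic Redis write such as SET with the NX and EX options.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class IdempotencyStore:
    """In-memory stand-in for Redis: Idempotency-Key -> (expiry, cached response)."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat the key as unseen
            return None
        return response

    def put(self, key: str, response: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, response)


def complete_checkout(store: IdempotencyStore, idempotency_key: str,
                      charge: Callable[[], dict]) -> dict:
    """Run the charge once per key; replay the cached response on retries."""
    cached = store.get(idempotency_key)
    if cached is not None:
        return cached
    response = charge()
    store.put(idempotency_key, response)
    return response


charges = []


def charge() -> dict:
    charges.append("charged")
    return {"order_id": f"ord_{len(charges)}", "status": "completed"}


store = IdempotencyStore()
first = complete_checkout(store, "key-123", charge)
retry = complete_checkout(store, "key-123", charge)
print(first == retry, len(charges))  # prints True 1
```

The retry returns the original order instead of charging again, which is the behavior an agent's automatic retry logic depends on.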

Backfire 02

No spend alerts on Vercel or AWS

$1,141 bills

The shortcut: "I'll watch the dashboard."

The fail: A Reddit user reported a $1,141 Vercel bill on a "small" Next.js site, driven by bot/crawler traffic to ISR pages. A separate HN post documented a $1,000 NAT Gateway charge from a single misconfiguration.

The fix on day one: Set a Vercel spend alert at 2× expected, an AWS Budget at 1.5× expected, and route both to email + push.

Backfire 03

Shipping MCP over STDIO transport

Concurrent load fails

The shortcut: "It works on my machine."

The fail: Per Lenses.io citing Stacklok load tests: "STDIO fails catastrophically under concurrent load (20 out of 22 requests failed with just 20 simultaneous connections)." 86% of MCP servers in the wild are still local; only 5% are production-grade.

The fix on day one: If you ship MCP, ship Streamable HTTP transport with OAuth 2.1 — not STDIO with environment-variable API keys.

Backfire 04

Stripe Billing without knowing fees compound

4.2% effective rate

The shortcut: "Stripe is just 2.9% + 30¢."

The fail: Stripe Billing adds 0.7% on top of the base 2.9% + $0.30. On a $50/month subscription, the total fee is 4.2% ($1.75 base + $0.35 Billing = $2.10). At $10K MRR, that's $70/month in Billing alone. Per Stripe's pricing page, the annual Billing plan starts at $620/month, which only pays off above ~$90K MRR ($620 ÷ 0.7% ≈ $88.6K).

The fix on day one: Calculate Billing fees into your contribution margin from the first SKU. If unit economics break at 4.2% fees, the product was already broken.
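The 4.2% figure and the ~$90K break-even can be checked with a short calculation. This is a sketch that uses only the rates quoted above (2.9% + $0.30 base, 0.7% Billing, $620/month flat plan); the function names are illustrative, and the rates should be re-verified against Stripe's live pricing page.

```python
from decimal import Decimal, ROUND_HALF_UP

BASE_RATE = Decimal("0.029")     # 2.9% card processing, as quoted above
BASE_FIXED = Decimal("0.30")     # $0.30 per successful charge
BILLING_RATE = Decimal("0.007")  # 0.7% Stripe Billing, as quoted above


def to_cents(amount: Decimal) -> Decimal:
    """Round a dollar amount to whole cents."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def subscription_fees(price: str) -> dict:
    """Per-charge fee breakdown for one subscription payment."""
    p = Decimal(price)
    base = to_cents(p * BASE_RATE + BASE_FIXED)
    billing = to_cents(p * BILLING_RATE)
    total = base + billing
    return {"base": base, "billing": billing, "total": total,
            "effective_rate": total / p}


def flat_plan_breakeven_mrr(flat_monthly: str) -> Decimal:
    """MRR at which 0.7% pay-as-you-go Billing equals the flat monthly plan."""
    return Decimal(flat_monthly) / BILLING_RATE


fees = subscription_fees("50")
print(fees["base"], fees["billing"], fees["total"], fees["effective_rate"])
print(flat_plan_breakeven_mrr("620"))
```

Running the same function against your lowest-priced SKU shows how much the fixed $0.30 weighs on small charges, which is where the effective rate climbs fastest.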

§ 08 · FAQ

Frequently asked questions.

Can I ship without UCP and just use ACP?

Yes. ACP (Stripe + OpenAI) is the lower-effort path if you are already on Stripe — enabling agentic payments is as little as one line of code. UCP is more ambitious and more work. The two are not mutually exclusive. Incremental effort to add ACP after UCP is 2–4 hours; adding UCP after ACP is 8–16 hours.

Do I need a vector database from day one?

No. For catalogs under 1,000 SKUs, Postgres full-text search works. Add pgvector when the catalog grows past 1,000 SKUs, when agents start sending paraphrased natural-language queries, or when you want find-similar-product semantics. Embedding 100K products with text-embedding-3-small costs about $0.40 one-time.

Vercel or Fly.io for the API runtime?

Vercel Pro at $20/mo is the documented best practice for solo founders shipping FastAPI — auto-detected entrypoint, native ASGI, Fluid Compute billing on Active CPU rather than wall-time. Fly.io at roughly $3.32/mo for shared-cpu-1x 512MB always-on, with $0.02/GB egress, wins when you need WebSockets, SSE streaming, or absolute lowest bandwidth cost.

How do I distinguish AI agent traffic from human traffic in my API logs?

No tool does it automatically. Four practical paths: parse User-Agent for known SDK strings; verify RFC 9421 HTTP Message Signatures from agents that ship them; issue separate API keys for agent vs. human-facing clients and tag every request by key type; put Cloudflare in front and use Bot Analytics.
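Two of the four paths, key-type tagging and User-Agent parsing, fit in one small function. The sketch below is an assumption-heavy illustration: the key prefixes (`agt_`, `usr_`) and the User-Agent markers are hypothetical placeholders, not real SDK strings, so substitute the values you actually issue and observe.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical placeholders: replace with the SDK strings seen in your own logs.
AGENT_UA_MARKERS = ("example-agent-sdk", "example-llm-client")
# Hypothetical key prefixes, assuming you issue separate keys per client type.
AGENT_KEY_PREFIX = "agt_"
HUMAN_KEY_PREFIX = "usr_"


@dataclass(frozen=True)
class TrafficTag:
    client_type: str  # "agent", "human", or "unknown"
    reason: str       # which signal produced the tag


def classify_request(headers: Mapping[str, str],
                     api_key: Optional[str]) -> TrafficTag:
    """Tag a request for logging. Key type wins over User-Agent."""
    if api_key:
        if api_key.startswith(AGENT_KEY_PREFIX):
            return TrafficTag("agent", "api_key_type")
        if api_key.startswith(HUMAN_KEY_PREFIX):
            return TrafficTag("human", "api_key_type")
    # Header names are case-insensitive, so look the header up by lowered name.
    user_agent = next(
        (v for k, v in headers.items() if k.lower() == "user-agent"), "")
    if any(marker in user_agent.lower() for marker in AGENT_UA_MARKERS):
        return TrafficTag("agent", "user_agent")
    return TrafficTag("unknown", "no_signal")
```

Key type is checked first because you control it; a User-Agent string is self-reported and easy to spoof, so treat it as a weaker hint and log the `reason` alongside the tag.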

What is the cheapest way to serve product images to agents?

Cloudflare R2 with zero egress fees. Storage is $0.015/GB-month with 10 GB free, and egress is free regardless of volume. S3 at $0.09/GB egress would cost roughly $90 for 1 TB of agent-driven image fetches; R2 costs nothing. Given that AI crawl-to-refer ratios reach 500,000:1, fetch volume can climb far past that: at about 11 TB, it is the difference between a $0 image bill and a $1,000 image bill.
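The egress comparison is simple enough to script for your own traffic estimate. This sketch uses only the rates quoted above and treats 1 TB as 1,000 GB; the function names are illustrative, and it ignores per-request charges and S3 storage costs, which you should check on the live pricing pages.

```python
from decimal import Decimal

S3_EGRESS_PER_GB = Decimal("0.09")          # as quoted above
R2_STORAGE_PER_GB_MONTH = Decimal("0.015")  # as quoted above
R2_FREE_STORAGE_GB = Decimal("10")          # as quoted above


def s3_egress_cost(egress_gb: int) -> Decimal:
    """Monthly S3 egress bill for a given transfer volume."""
    return Decimal(egress_gb) * S3_EGRESS_PER_GB


def r2_monthly_cost(stored_gb: int) -> Decimal:
    """Monthly R2 bill: storage above the free tier, zero egress."""
    billable = max(Decimal(stored_gb) - R2_FREE_STORAGE_GB, Decimal("0"))
    return billable * R2_STORAGE_PER_GB_MONTH


def egress_gb_for_budget(budget_usd: int) -> Decimal:
    """How many GB of S3 egress a given dollar amount buys."""
    return Decimal(budget_usd) / S3_EGRESS_PER_GB


print(s3_egress_cost(1000))        # 1 TB of image fetches on S3
print(r2_monthly_cost(50))         # 50 GB of images stored on R2
print(egress_gb_for_budget(1000))  # volume at which S3 egress hits $1,000
```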

Do I really need rate limiting on day one?

Yes. Agents call at machine speed. A runaway agent loop can exhaust a metered free tier in minutes; an attacker can do it on purpose. Implement per-IP for anonymous, per-API-key for authenticated, backed by Redis. UCP's spec already assumes rate limiting exists.
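A fixed-window counter is one simple way to get per-IP and per-key limits. The sketch below keeps counters in memory for illustration; `FixedWindowLimiter` and `bucket_for` are hypothetical names, and in production the counter would live in Redis (for example an incremented key that expires with the window) so all instances share it.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per bucket in each fixed time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # bucket -> (window index, requests counted in that window)
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, bucket: str) -> bool:
        window = int(self._clock() // self._window)
        seen_window, count = self._counters.get(bucket, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started: reset the counter
        if count >= self._limit:
            self._counters[bucket] = (window, count)
            return False  # caller should respond with HTTP 429
        self._counters[bucket] = (window, count + 1)
        return True


def bucket_for(api_key: Optional[str], client_ip: str) -> str:
    """Per-API-key for authenticated callers, per-IP for anonymous ones."""
    return f"key:{api_key}" if api_key else f"ip:{client_ip}"
```

A typical call site would be `limiter.allow(bucket_for(api_key, client_ip))` at the top of each handler. Fixed windows allow a short burst at window boundaries; a sliding window or token bucket smooths that out if it matters for your traffic.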

What is the minimum I need to test before shipping?

Four things: MCP Inspector for interactive checks if you ship MCP; Prism mock server to validate your OpenAPI spec against simulated agent calls; the UCP conformance suite if you ship UCP; EvalView for regression testing of agent tool-call behavior in CI.

Why is OpenAPI specifically important for agents — isn't it just developer docs?

Because LLMs consume it directly. AI agents rely on precise, machine-readable signals: when the API lacks predictable schemas, typed errors, and clear behavioral rules, agents cannot function. OpenAPI is the contract; MCP tools, the OpenAI Agents SDK, LangChain, Speakeasy, and FastMCP all auto-generate agent integrations from it.

§ 09 · Sources

Pricing pages & primary sources.

All prices were fetched from official pricing pages and are subject to change; re-verify before publishing or quoting in production planning.

Hosting & runtime

Vercel — pricing · Fluid Compute docs · FastAPI deployment
AWS Lambda — pricing · Edge Delta cold start analysis
Google Cloud Run — pricing
Fly.io — pricing
Railway — pricing

Data layer

Supabase — pricing
Neon — pricing
PlanetScale — pricing
Upstash — pricing
Pinecone — pricing
Qdrant — pricing
Weaviate — pricing

Payments & protocols

Stripe — pricing · agentic commerce announcement · ACP announcement
UCP — Google UCP Guide · ucp.dev spec · Google Developer Blog · conformance repo
AP2 — ap2-protocol.org · CSA security analysis
MCP — introduction · WorkOS MCP guide · Lenses.io production analysis · Panther MCP security
OpenAI — function calling · Agents SDK MCP
Visa Trusted Agent Protocol — announcement

API gateway & keys

Unkey — pricing
Zuplo — pricing

Observability & edge

Sentry — pricing
Grafana Cloud — pricing
Datadog — pricing
PostHog — pricing
Cloudflare — plans · Web Bot Auth · Signed Agents · Radar Year in Review · R2
Resend — pricing

Schema & structured data

Google — Product structured data docs
Schema App — 2025 schema markup analysis

Community sources

Items flagged unverified

Unkey current pricing model — public pricing page has pivoted to CPU/memory/egress infrastructure billing. Per-verification quotas come from Unkey's engineering RFC, not the live pricing page. [PARTIALLY VERIFIED]
UCP Conformance Suite test cases — repo was inaccessible during research. Test cases listed in this brief are inferred from the UCP spec. [UNVERIFIED]
Exact AP2 idempotency-key/nonce format — documented in the CSA security analysis; spec source-of-truth field name was not confirmed in src/ap2/types/mandate.py. [UNVERIFIED]

Re-verify before launch

Platform pricing and protocol specs shift without notice. Any number on this page that affects your unit economics or compliance should be re-verified against the live vendor docs before you build a margin or launch plan on it.
