The complete stack at a glance.
Eight layers, every one of which has to exist before an AI agent can find your store, query your catalog, transact, and report the result back to its user. Most solo founders show up thinking "I just need a Stripe button." The actual stack is closer to this:
┌─────────────────────────────────────────────────────────────────────────┐
│ AGENT COMMERCE TECH STACK (2026) │
├──────────────┬──────────────────────────────────────────────────────────┤
│ Layer │ Tool (primary → alternative) │
├──────────────┼──────────────────────────────────────────────────────────┤
│ L1 Hosting │ Vercel Fluid Compute → Railway → Cloud Run │
│ L2 Database │ Supabase (Postgres + pgvector) → Neon │
│ L3 Caching │ Upstash Redis → Cloudflare KV │
│ L4 API GW │ Zuplo → Cloudflare Workers → Kong │
│ L5 Auth/Keys │ Unkey → JWT + Supabase Auth │
│ L6 Billing │ Stripe → Lemon Squeezy (tax-inclusive) │
│ L7 Agent I/F │ OpenAPI spec + FastMCP server + UCP checkout │
│ L8 Observ. │ Grafana Cloud (metrics/logs) + Sentry (errors) │
│ CI/CD │ GitHub Actions + Vercel preview deployments │
│ Vector Search│ pgvector (free/bundled) → Pinecone → Qdrant Cloud │
└──────────────┴──────────────────────────────────────────────────────────┘
The second diagram shows where this maps to the broader protocol and payment stack:
┌─────────────────────────────────────────────────────────────┐
│ 1. AI Surface Google AI Mode · ChatGPT · Claude │
├─────────────────────────────────────────────────────────────┤
│ 2. Protocol UCP · ACP · MCP · A2A │
├─────────────────────────────────────────────────────────────┤
│ 3. Discovery /.well-known/ucp · Schema.org · │
│ Merchant Center feed │
├─────────────────────────────────────────────────────────────┤
│ 4. API + OpenAPI FastAPI on Vercel Fluid Compute │
├─────────────────────────────────────────────────────────────┤
│ 5. Data Postgres (Supabase/Neon) + │
│ pgvector + Upstash Redis │
├─────────────────────────────────────────────────────────────┤
│ 6. Payment Authorization AP2 mandates · Stripe SPTs · │
│ Visa Trusted Agent Protocol │
├─────────────────────────────────────────────────────────────┤
│ 7. Billing + API Keys Stripe Billing · Unkey / Zuplo │
├─────────────────────────────────────────────────────────────┤
│ 8. Observability Sentry · Logfire · Cloudflare │
│ Bot Analytics · structured logs │
└─────────────────────────────────────────────────────────────┘
Per the UCP specification at ucp.dev, this stack collapses N×N integration complexity into a 1-to-Many model: implement once, every UCP-compliant agent can reach you. The eight layers below are what "implement once" actually means.
Every purchase-path request from a shopping agent passes through L4 (API gateway / rate limiting), L5 (auth/keys), the FastAPI router, L2 (database), and the OpenAPI/MCP response (L7). Caching, observability, and vector search are non-blocking supporting infrastructure.
Surface & Protocol
Where the buyer's agent actually shows up — and the language it speaks before it touches your API.
AI Surface Layer
What it is: The consumer-facing AI app — Google AI Mode, ChatGPT, Claude, Gemini, Perplexity — that the buyer is talking to.
Why it's in the stack: You don't build this layer. You build for it. Every other layer below is calibrated to what these surfaces can actually call.
Solo founder action: Pick at least one surface to optimize for on day one. Per Google's UCP Guide, Google AI Mode requires an active Merchant Center account plus UCP integration. For ChatGPT Instant Checkout, you need the Stripe-backed Agentic Commerce Protocol (ACP). Pick one. Don't try to ship both at MVP.
Choose one surface to optimize for first — Google AI Mode (UCP path) or ChatGPT Instant Checkout (ACP path). The rest of the stack inherits from this pick.
Protocol Layer
What it is: The agreed language between agent and merchant — Universal Commerce Protocol (UCP), Agentic Commerce Protocol (ACP), Model Context Protocol (MCP), Agent-to-Agent (A2A), with the Agent Payments Protocol (AP2) underneath for payment authorization.
Why it exists: Without a protocol, every agent integration is a one-off. Per Google's UCP announcement, UCP is "an open-source standard designed to power the next generation of agentic commerce. By establishing a common language and functional primitives, UCP enables seamless commerce journeys between consumer surfaces, businesses, and payment providers."
Tools that implement it
- UCP — /.well-known/ucp JSON manifest plus REST or MCP transport. Open-source, backed by Google, Shopify, Etsy, Wayfair, Target, Walmart, Stripe, Visa, Mastercard.
- ACP — Stripe + OpenAI's open standard at agenticcommerce.dev, Apache 2.0, powers ChatGPT Instant Checkout.
- MCP — Anthropic's JSON-RPC 2.0 protocol, now under the Linux Foundation's Agentic AI Foundation per WorkOS's MCP guide.
UCP is the most ambitious — full commerce journey, decentralized via /.well-known/. ACP is the most pragmatic if you're already on Stripe; per Stripe's announcement, existing Stripe users can enable ACP with "as little as one line of code." MCP is the lowest-level — it's just the tool-calling pipe.
One protocol picked. If on Stripe already, ACP. If building catalog-first and want decentralized discovery, UCP. MCP gets added as the agent-facing adapter on top of either.
Discovery & API
How an agent finds you in the first place, and the HTTP surface it actually hits.
Discovery Layer
What it is: How an agent finds you. The /.well-known/ucp manifest, Schema.org Product markup on every product page, and a Merchant Center feed.
Why it exists: Per Schema App's analysis quoting Google: "Structured data is critical for modern search features because it is efficient, precise, and easy for machines to process… Schema Markup is no longer an SEO tactic. It is core infrastructure for AI-driven search."
Tools
- Schema.org Product type on every product page.
- Google Merchant Center account + product feed.
- A static JSON file served at /.well-known/ucp — no runtime needed for the manifest itself.
Minimum Google AI Mode requirements
Per Google's structured data docs and dataiads.io feed analysis: GTIN, MPN, brand, product titles of at least 30 characters, descriptions of at least 500 characters, a minimum of 3 additional images at 1500×1500px, and real-time inventory sync every 15–60 minutes.
Schema.org JSON-LD on every product page, a /.well-known/ucp manifest served over HTTPS, and a feed live in Google Merchant Center.
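As a minimal sketch of the Schema.org side of this checklist, the Python below builds a Product JSON-LD object and flags fields that fall short of the minimums quoted above. The Product dataclass, its field names, and feed_issues are hypothetical helpers for illustration; only the Schema.org property names and the thresholds come from the sources cited in this section. Image dimensions and inventory sync are not checked here.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Product:
    # Hypothetical catalog record; adapt the field names to your own schema.
    title: str
    description: str
    gtin: str
    mpn: str
    brand: str
    price: str
    currency: str
    main_image: str
    additional_images: list = field(default_factory=list)
    in_stock: bool = True


def product_jsonld(p: Product) -> dict:
    """Build a Schema.org Product object suitable for a JSON-LD script tag."""
    availability = "InStock" if p.in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": p.title,
        "description": p.description,
        "gtin": p.gtin,
        "mpn": p.mpn,
        "brand": {"@type": "Brand", "name": p.brand},
        "image": [p.main_image, *p.additional_images],
        "offers": {
            "@type": "Offer",
            "price": p.price,
            "priceCurrency": p.currency,
            "availability": f"https://schema.org/{availability}",
        },
    }


def feed_issues(p: Product) -> list:
    """Return the quoted minimums that this product does not meet."""
    issues = []
    if len(p.title) < 30:
        issues.append("title shorter than 30 characters")
    if len(p.description) < 500:
        issues.append("description shorter than 500 characters")
    if len(p.additional_images) < 3:
        issues.append("fewer than 3 additional images")
    for name in ("gtin", "mpn", "brand"):
        if not getattr(p, name):
            issues.append(f"missing {name}")
    return issues
```

Running feed_issues over the catalog before each feed upload is one way to catch thin listings early; json.dumps(product_jsonld(p)) gives the string to embed in the product page.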
API + OpenAPI Layer
What it is: The HTTP endpoints an agent calls — products, cart, checkout, orders — plus the OpenAPI spec that describes them.
Why it exists: Per the Postman State of the API Report: "AI Agents rely on precise, machine-readable signals, not tribal knowledge. When your API lacks predictable schemas, typed errors, and clear behavioral rules, AI agents can't function as they're intended to." Only 24% of developers actively design APIs with agents in mind.
Tools
- FastAPI on Python 3.12/3.13/3.14 — gives you a working OpenAPI spec at /openapi.json for free.
- Vercel Fluid Compute as the runtime (per Vercel's FastAPI docs, auto-detected, native ASGI, Fluid Compute enabled by default).
- Alternatives: Cloud Run, Fly.io, Railway, AWS Lambda + Mangum — see the Cost Breakdown table for tradeoffs.
Minimum UCP REST surface
Per the dev.to UCP vs ACP technical comparison, six endpoints are required:
GET /products
GET /products/:id
POST /checkout-sessions
PUT /checkout-sessions/:id
POST /checkout-sessions/:id/complete
GET /orders/:id
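One way to keep this surface from regressing is a small check against the generated OpenAPI document. The sketch below assumes the spec's "paths" object is available as a dict (in FastAPI, app.openapi()["paths"]); the helper names are hypothetical, and path parameters are compared by position only, since FastAPI writes them as {name} while the list above uses :id.

```python
import re

# The six endpoints listed above, written with ":id" path parameters.
REQUIRED_ROUTES = [
    ("GET", "/products"),
    ("GET", "/products/:id"),
    ("POST", "/checkout-sessions"),
    ("PUT", "/checkout-sessions/:id"),
    ("POST", "/checkout-sessions/:id/complete"),
    ("GET", "/orders/:id"),
]


def _normalize(path: str) -> str:
    """Collapse ':id' and '{anything}' path parameters to '{}'."""
    path = re.sub(r":[A-Za-z_][A-Za-z0-9_]*", "{}", path)
    return re.sub(r"\{[^/}]*\}", "{}", path)


def missing_routes(openapi_paths: dict) -> list:
    """Return required (method, path) pairs absent from an OpenAPI 'paths' object."""
    present = {
        (method.upper(), _normalize(path))
        for path, operations in openapi_paths.items()
        for method in operations
    }
    return [
        (method, path)
        for method, path in REQUIRED_ROUTES
        if (method, _normalize(path)) not in present
    ]
```

A CI step could fail the build whenever missing_routes returns a non-empty list.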
Python serverless cold starts run 600ms–1,500ms on Vercel and Lambda depending on bundle size. Agents operating synchronously will time out if cold starts exceed their patience window. Mitigation: Railway ($5/mo always-on container) for any endpoint agents hit on a strict SLA, or use Vercel's keep-alive configuration.
FastAPI app deployed to Vercel Pro with the six UCP endpoints live, /openapi.json auto-generated, and a custom domain pointed at the deployment.
Data & Payments
The store under the API, the vector index next to it, and the cryptographic proof that the buyer actually authorized the charge.
Data Layer
What it is: A relational store for the product catalog, orders, users, and sessions; a vector store for semantic product search; a Redis cache for rate limiting and idempotency keys.
Why it exists: Per upsun.com's UCP analysis: "Implement Vector Databases (like Qdrant or pgvector) to cache the meaning of a request. If the agent asks a question that is semantically close to a cached result, the system can serve the cached JSON without hitting the primary SQL database." Agents query in natural language; keyword search is the wrong index.
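The semantic-cache idea in that quote can be sketched in a few lines. This is an in-memory illustration only: the SemanticCache class and the 0.92 similarity threshold are assumptions, the embeddings are supplied by the caller, and a real deployment would run the nearest-neighbor lookup in pgvector or Qdrant instead of a Python loop.

```python
import math
from typing import Optional


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """In-memory stand-in for a vector-indexed response cache."""

    def __init__(self, threshold: float = 0.92):  # threshold is an assumption
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached JSON response)

    def put(self, embedding: list, response: dict) -> None:
        self.entries.append((embedding, response))

    def get(self, embedding: list) -> Optional[dict]:
        """Return the closest cached response if it clears the threshold."""
        best_score, best_response = 0.0, None
        for cached_embedding, response in self.entries:
            score = cosine(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None
```

On a miss (get returns None) the request falls through to the primary SQL query, and the result is stored with put for the next paraphrase of the same question.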
Tools
- Postgres on Supabase ($25/mo Pro) or Neon (usage-based, ~$15/mo at low scale) — both ship with pgvector.
- pgvector as the vector index, free if you're already on Postgres. Per Firecrawl's vector database guide, pgvector + pgvectorscale delivers 471 QPS at 99% recall on 50M vectors — 11.4× the throughput of standalone Qdrant and roughly 75% cheaper than Pinecone at scale.
- Upstash Redis — 500K commands/month free; pay-as-you-go at $0.20/100K thereafter (Upstash pricing).
A 100,000-product catalog at roughly 200 tokens each = 20M tokens = $0.40 one-time using OpenAI text-embedding-3-small at $0.02/1M tokens. Voyage AI gives the first 200M tokens free on voyage-4-lite. Embedding cost is rarely the constraint.
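The same arithmetic as a quick check. The function name is illustrative, and the inputs are the figures quoted above, which may have changed since they were fetched.

```python
def embedding_cost_usd(products: int, tokens_per_product: int, usd_per_million_tokens: float) -> float:
    """One-time cost of embedding a catalog at a flat per-token price."""
    total_tokens = products * tokens_per_product
    return total_tokens / 1_000_000 * usd_per_million_tokens


# 100,000 products x 200 tokens = 20M tokens at $0.02 per 1M tokens.
cost = embedding_cost_usd(100_000, 200, 0.02)
print(f"${cost:.2f}")
```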
Supabase free projects pause after 1 week of inactivity. Catastrophic for an always-on agent API. Neon free suspends compute when the monthly CU-hour limit is hit. Upgrade to Supabase Pro ($25/mo) or Neon Launch before your first real user — this is non-negotiable for agent workloads.
Postgres running on Supabase Pro with pgvector enabled, product catalog embedded, Upstash Redis connected, and the FastAPI app reading from both.
Payment Authorization Layer
What it is: The cryptographic proof that the buyer authorized this specific purchase — separate from the act of charging the card.
Why it exists: Per Stripe's agentic commerce announcement: "Trust can't be inferred — it has to be explicitly granted, scoped, and enforced in code." Shopping agents should never see raw payment credentials.
Tools
- AP2 (Agent Payments Protocol) at ap2-protocol.org — Verifiable Digital Credentials chained together; Cart Mandate (human-present, hardware-signed) and Intent Mandate (human-not-present, pre-signed with TTL).
- Stripe Shared Payment Tokens (SPTs) — scoped to a specific business, time-limited, amount-capped, webhook-revokable.
- Visa Trusted Agent Protocol — per Visa's announcement, built on RFC 9421 HTTP Message Signatures and aligned with Cloudflare Web Bot Auth.
Per the CSA AP2 security analysis: TTL on Intent Mandates (24h recommended), creation_time timestamps, hardware-backed ECDSA signatures, SHA-256 checksums on all A2A messages, and idempotency keys on every checkout endpoint. The CSA reports AP2 reduces fraud from 2.1% (API-based) to 1.15% in simulated transactions.
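Two of these controls, the TTL check and the SHA-256 checksum, are simple enough to sketch with the standard library. The function names and the canonical-JSON encoding are assumptions made for illustration, not the AP2 wire format, and signature verification is deliberately left out.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

INTENT_MANDATE_TTL = timedelta(hours=24)  # the 24h recommendation quoted above


def mandate_is_fresh(creation_time: datetime, now: datetime,
                     ttl: timedelta = INTENT_MANDATE_TTL) -> bool:
    """True if the mandate was created in the past and is still within its TTL."""
    age = now - creation_time
    return timedelta(0) <= age <= ttl


def message_checksum(message: dict) -> str:
    """SHA-256 hex digest over a canonical JSON encoding (the encoding is an assumption)."""
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Rejecting mandates whose creation_time lies in the future, as the first function does, guards against clock tricks as well as plain expiry.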
Stripe SPT or AP2 Cart Mandate flow live on /checkout-sessions/:id/complete, with idempotency keys enforced at the database layer and TTL on every Intent Mandate.
Billing & Ops
The money in, and the dashboard that tells you when something is wrong.
Billing + API Key Management
What it is: How you charge for API access — free tier metering, paid tier enforcement, key issuance, rate limits per key.
Why it exists: A free-to-paid funnel is the only way an agent-first API scales. Stripe handles money, Unkey handles keys, Zuplo (optional) handles the gateway in front.
Tools
- Stripe Billing at 0.7% of subscription volume on top of 2.9% + $0.30 base. Compounding fee is real — see Shortcuts below.
- Unkey for API key issuance, rate limiting, and verification. [PARTIALLY VERIFIED — Unkey's public pricing page has pivoted to a CPU/memory/egress infrastructure model; per-verification quotas referenced come from their engineering RFC.](unkey.com/pricing)
- Zuplo ($25/mo Builder) as the heavier-duty alternative — 100K requests included, but overage is steep at $100 per additional 100K (Zuplo pricing).
At $50/mo subscriptions: 2.9% + $0.30 base + 0.7% Billing = 4.2% effective fee. At $10K MRR, that's $70/month in Billing fees alone (0.7% of $10K). The annual Billing plan ($620/mo) breaks even above roughly $90K MRR, not before.
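The fee math can be reproduced directly. This sketch uses only the rates quoted in this section; the constant and function names are illustrative, and Stripe's actual pricing should be checked before relying on the output.

```python
CARD_PERCENT = 0.029         # 2.9% card processing
CARD_FIXED = 0.30            # $0.30 per charge
BILLING_PERCENT = 0.007      # 0.7% Stripe Billing
ANNUAL_PLAN_MONTHLY = 620.0  # $620/mo annual Billing plan


def fees_per_charge(price: float) -> tuple:
    """Return (card fee, Billing fee) in dollars for one subscription charge."""
    return price * CARD_PERCENT + CARD_FIXED, price * BILLING_PERCENT


card, billing = fees_per_charge(50.0)
effective_rate = (card + billing) / 50.0

# Billing fee as a share of MRR, and the MRR at which the flat plan costs the same.
billing_fee_at_10k_mrr = 10_000 * BILLING_PERCENT
break_even_mrr = ANNUAL_PLAN_MONTHLY / BILLING_PERCENT

print(f"effective rate on $50: {effective_rate:.1%}")
print(f"flat plan break-even MRR: ${break_even_mrr:,.0f}")
```

The break-even lands a little under $90K MRR, in line with the rough figure quoted above.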
Stripe checkout for API subscriptions live; Stripe webhook creates/revokes Unkey keys on payment events; per-key rate limits enforced at request time.
Observability Layer
What it is: Error tracking, distributed tracing, structured logs, agent vs. human segmentation, and uptime monitoring.
Why it exists: Per Cloudflare Radar's Year in Review, AI bot share of HTML traffic averaged 4.2% globally, with crawl-to-refer ratios reaching 25,000:1 to 500,000:1 for Anthropic's ClaudeBot. You will get drowned in bot traffic; you need to see it.
Tools
- Sentry — 5K errors/mo free; $26/mo Team plan for production (Sentry pricing).
- Pydantic Logfire — built by the Pydantic team; one-line FastAPI instrumentation; first-class LLM/token tracking.
- Grafana Cloud — the most generous free tier in the category: 10K metric series, 50 GB logs, 50 GB traces, 50 GB profiles (Grafana pricing).
- Cloudflare Bot Analytics — the only network-level service that classifies bot vs. human reliably; full bot management is Enterprise-only.
No tool automatically segments human vs. AI agent traffic. Practical approaches: parse User-Agent for known SDK strings, scope API keys by client type, or verify RFC 9421 HTTP Message Signatures per Cloudflare's Web Bot Auth and Signed Agents posts.
Sentry catching errors, Logfire instrumenting FastAPI, Grafana dashboards on request volume + latency percentiles, and spend alerts set on Vercel and Supabase at 80% of expected monthly burn.
Full stack cost by MRR stage.
All prices fetched from official pricing pages. "$1K MRR" assumes ~1,000 active customers and ~3M API calls/month. "$10K MRR" assumes ~10K customers and ~30M API calls/month.
| Layer | Tool | Free Tier | At $1K MRR | At $10K MRR | Source |
|---|---|---|---|---|---|
| Hosting | Vercel (Fluid Compute, Pro) | 1M invocations + 4 CPU-hrs (Hobby, non-commercial) | $20/mo (Pro, $20 usage credit included) | $60–120/mo | vercel.com/pricing |
| Database | Supabase Pro | 500 MB DB, pauses at 1 wk idle | $25/mo + $10 Micro compute | $35–60/mo | supabase.com/pricing |
| Database (alt) | Neon Launch | 100 CU-hrs + 0.5 GB storage | ~$15/mo (~80 CU-hrs) | ~$40–60/mo | neon.tech/pricing |
| Cache / KV | Upstash Redis | 500K cmds + 256 MB | ~$5/mo (~2.5M cmds) | ~$20/mo (~10M cmds) | upstash.com/pricing |
| Vector DB | pgvector (on Postgres above) | Free (extension) | $0 (bundled) | $0 (bundled) | pgvector |
| Payments | Stripe (cards) | None | ~$29/mo (2.9% + $0.30 × ~50 charges) | ~$290/mo | stripe.com/pricing |
| Subscriptions | Stripe Billing | None | $7/mo (0.7% of $1K) | $70/mo (0.7% of $10K) | stripe.com/pricing |
| API Key Mgmt | Unkey | ~150K req (per RFC) | $5/mo (Starter) | $25/mo (Pro) | unkey.com/pricing [PARTIALLY VERIFIED] |
| Error Tracking | Sentry | 5K errors/mo | $26/mo (Team, 50K) | $26–50/mo | sentry.io/pricing |
| Tracing/Logs | Grafana Cloud | 10K series + 50 GB logs | $0–10/mo (within free) | $30–80/mo | grafana.com/pricing |
| CDN + WAF | Cloudflare | Unlimited DDoS, CDN, SSL | $0 (Free is fine) | $20/mo (Pro) | cloudflare.com/plans |
| Domain | Cloudflare Registrar | N/A | $0.87/mo (.com at-cost) | $0.87/mo | cloudflare.com |
| Email (transactional) | Resend | 3,000/mo (100/day) | $20/mo (Pro, 50K) | $35/mo (100K) | resend.com/pricing |
| Estimated total | — | ~$1/mo (domain) | ~$130–145/mo | ~$580–810/mo | — |
Vector search add-on (if needed)
pgvector handles product catalog semantic search up to ~1M vectors with an HNSW index at zero additional cost on Supabase Pro. Switch to a dedicated vector database at three signals: (a) catalog exceeds 1M vectors, (b) p99 semantic search latency exceeds 500ms under concurrent agent load, or (c) you need multi-tenant vector isolation.
| Tool | Free Tier | At $1K MRR | At $10K MRR | Source |
|---|---|---|---|---|
| pgvector (bundled) | $0 (included in Supabase/Neon) | $0 | $0 | pgvector |
| Pinecone | $0 (2GB, 2M write, 1M read units) | $20/mo (Builder, 10GB) | $50+/mo (Standard) | pinecone.io/pricing |
| Qdrant Cloud | $0 forever (0.5vCPU, 1GB RAM, 4GB disk) | Usage-based | Usage-based | qdrant.tech/pricing [PARTIALLY VERIFIED] |
| Weaviate | $0 (14-day trial only) | $45/mo (Flex, min) | $45–400/mo | weaviate.io/pricing |
Reading the table: At zero revenue you can run a production-grade agent commerce stack for about a dollar a month, but only on Vercel Hobby, which is limited to non-commercial use. The moment you take a paying customer, you owe Vercel $20/mo for Pro and Supabase $25/mo for a Postgres that doesn't pause itself. From there, the next $80/mo gets you everything else: error tracking, billing, API keys, transactional email, domain. The first $130/mo carries you to roughly $1K MRR. From $1K to $10K MRR, the largest cost growth is Stripe processing fees, not infrastructure.
Build order & dependencies.
One row per stack layer. "Hard Dependency On" lists the layers that must exist first. "Can Be Deferred" means the stack can ship and transact without it.
| Layer | What It Enables | Hard Dependency On | Can Be Deferred? | Day-One Required? |
|---|---|---|---|---|
| 1. Structured Product Data (Schema.org + Merchant Center) | Agents find products via Google AI Mode and crawler indexing | Product pages + GTINs + 3+ images at 1500×1500 | No — discovery starts here | Yes |
| 2. FastAPI commerce REST API | Programmatic access to products, cart, checkout, orders | Database + hosting | No — UCP REST transport depends on this | Yes |
| 3. OpenAPI spec | LLMs read endpoints, parameters, error shapes; auto-generated MCP | API exists (FastAPI emits this for free) | No — agent legibility starts here | Yes |
| 4. UCP /.well-known/ucp manifest | Decentralized agent discovery; full UCP journey | API + OpenAPI spec; HTTPS; no 3xx redirects | Yes — can ship private-beta API without it | No — only for UCP-compliant discovery |
| 5. MCP server | Direct tool-use from Claude/ChatGPT/Cursor | OpenAPI spec or REST API | Yes — UCP MCP transport is optional | No — defer unless your buyer uses Claude/Cursor |
| 6. Billing gate | Free-to-paid funnel; key issuance; rate limit enforcement | API + database + Stripe | Yes — can launch free-only for first week | No — but defer with caution |
| 7. Vector search (pgvector) | Semantic product discovery for natural-language agent queries | Postgres + embedding pipeline | Yes — keyword search works under 1K SKUs | No — defer until catalog or queries justify |
| 8. Observability | Error visibility, agent traffic segmentation, spend alerts | API deployed somewhere | No — flying blind is how $1,141 Vercel bills happen | Yes (at minimum: Sentry free + spend alerts) |
The minimum viable day-one stack
Per the dev.to UCP vs ACP comparison: "If you have an existing e-commerce backend with product and checkout APIs, adding UCP can take a few hours. Shopify merchants can deploy in under 48 hours using the native integration."
- FastAPI app with the six UCP REST endpoints listed in Layer 4
- Postgres database (Supabase Pro or Neon Launch)
- HTTPS-served /.well-known/ucp JSON manifest pointing at the API
- Stripe account configured for ACP or SPTs
- Sentry + Cloudflare spend alerts
When each free tier breaks.
The free tier is a marketing-acquisition channel for every vendor in this stack. They are calibrated to break at slightly different points. Knowing the order of breakage is how you avoid the $1,141 Vercel bill.
| Tool | Breaks at roughly… | Why |
|---|---|---|
| Resend | ~90–100 signups/day | 3,000 emails/mo, 100/day hard cap |
| Postmark | First real user | 100 emails/mo, permanent hard limit |
| Sentry | One bad deploy | 5K errors/mo is one outage |
| Vercel Hobby | First paying customer | Hobby is personal/non-commercial only |
| Supabase free | One week of inactivity | Projects pause after 7 days idle |
| Upstash Redis free | ~5–8K daily requests | 500K commands/mo at 2–3 cache lookups/req |
| Neon free | ~8 days of continuous compute at 0.5 CU | 100 CU-hrs/mo; a 0.5 CU instance running 24/7 needs 360 CU-hrs |
| Pinecone free | ~50K queries/day | 1M read units/mo; locks you to AWS us-east-1 |
| Cloudflare Workers free | ~3M requests/month | 100K req/day — generous |
| Zuplo free | 100K requests/month | Builder overage is $100 per additional 100K — steepest jump in the stack |
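Two rows in this table are plain division, which makes them easy to re-derive for your own traffic. The helper names below are illustrative, and the quotas are the ones listed in the table.

```python
def daily_requests_before_cap(monthly_commands: int, commands_per_request: float, days: int = 30) -> float:
    """Requests per day that fit inside a monthly command quota."""
    return monthly_commands / commands_per_request / days


def days_of_continuous_compute(monthly_cu_hours: float, compute_units: float) -> float:
    """Days an always-on instance of the given size runs before the quota is spent."""
    return monthly_cu_hours / compute_units / 24


# Upstash free: 500K commands/month at 2 to 3 Redis commands per API request.
upstash_low = daily_requests_before_cap(500_000, 3)
upstash_high = daily_requests_before_cap(500_000, 2)

# Neon free: 100 CU-hrs/month against a 0.5 CU instance running around the clock.
neon_days = days_of_continuous_compute(100, 0.5)
neon_full_month_cu_hours = 0.5 * 24 * 30

print(f"Upstash: {upstash_low:,.0f} to {upstash_high:,.0f} requests/day")
print(f"Neon: {neon_days:.1f} days; a full month needs {neon_full_month_cu_hours:.0f} CU-hrs")
```

Swapping in your own commands-per-request figure shows quickly whether the Redis free tier survives a launch week.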
Agent-specific considerations.
What changes when your API is built for agents rather than humans.
Traffic patterns
- AI crawler traffic grew 15× in 2025 globally; AI bot share of HTML requests reached 4.2% average (Cloudflare Radar Year in Review).
- Anthropic's ClaudeBot crawl-to-refer ratio peaks at 500,000:1; OpenAI's GPTBot reaches 3,700:1.
- Visa cited a 4,700% surge in AI-driven traffic to U.S. retail sites as justification for launching Trusted Agent Protocol.
- Bursty traffic patterns can look like abuse to a naive rate limiter. Configure horizontal scale, not vertical.
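One common way to tolerate bursts without giving up an average limit is a token bucket. The source does not prescribe an algorithm, so treat this as one possible sketch: the class name, parameters, and injectable clock are assumptions, and production counters would normally live in Redis so that every instance shares them.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` while enforcing `rate` requests/sec on average."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

An agent that fires five calls at once and then goes quiet passes a bucket with capacity 5, while a fixed one-request-per-second limiter would have rejected four of them.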
Headers well-behaved agents send
- OpenAI GPTBot / OAI-SearchBot — declared User-Agent + dedicated IP ranges.
- Anthropic-ai / ClaudeBot — declared User-Agent, dedicated IPs.
- ChatGPT Agent, Goose, Browserbase, Anchor Browser — RFC 9421 HTTP Message Signatures with Signature-Agent, Signature-Input, Signature headers.
Per Simon Willison's writeup, actual ChatGPT agent headers look like:
Signature-Agent: "https://chatgpt.com"
Signature-Input: sig1=("@authority" "@method" "@path" "signature-agent");
created=1754340838;keyid="...";alg="ed25519";tag="web-bot-auth"
Signature: sig1=...
Verification per Arcjet: check Signature-Agent equals "https://chatgpt.com", fetch public keys from that domain's .well-known directory, verify the signature per RFC 9421.
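The cheap part of that verification, parsing the headers and rejecting obviously bad requests, can be sketched with the standard library. This does not verify the signature: fetching the public key and checking the Ed25519 signature per RFC 9421 are omitted and must still happen before a request is trusted. The function names, the exact-case header lookup, and the 300-second freshness window are assumptions.

```python
import re

SIGNATURE_INPUT_RE = re.compile(
    r'^(?P<label>[A-Za-z0-9_-]+)=\((?P<components>[^)]*)\)(?P<params>.*)$', re.DOTALL
)
PARAM_RE = re.compile(r';\s*(?P<key>[a-z0-9_-]+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>[^;\s]+))')


def parse_signature_input(value: str) -> dict:
    """Split a Signature-Input value into label, covered components, and parameters."""
    match = SIGNATURE_INPUT_RE.match(value.strip())
    if not match:
        raise ValueError("unrecognized Signature-Input value")
    params = {}
    for m in PARAM_RE.finditer(match["params"]):
        params[m["key"]] = m["quoted"] if m["quoted"] is not None else m["bare"]
    return {
        "label": match["label"],
        "components": re.findall(r'"([^"]+)"', match["components"]),
        "params": params,
    }


def passes_prechecks(headers: dict, now: int,
                     expected_agent: str = "https://chatgpt.com",
                     max_age_seconds: int = 300) -> bool:
    """Checks to run BEFORE real signature verification; never a substitute for it."""
    if headers.get("Signature-Agent", "").strip().strip('"') != expected_agent:
        return False
    try:
        parsed = parse_signature_input(headers.get("Signature-Input", ""))
        created = int(parsed["params"].get("created", ""))
    except ValueError:
        return False
    if parsed["params"].get("tag") != "web-bot-auth":
        return False
    if "signature-agent" not in parsed["components"]:
        return False
    return 0 <= now - created <= max_age_seconds
```

Requiring signature-agent among the covered components matters because an uncovered Signature-Agent header could be swapped without invalidating the signature.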
Testing an agent-facing API
- MCP Inspector (npx @modelcontextprotocol/inspector) — interactive dev-time tool for testing MCP servers. Not CI-grade.
- UCP Conformance Suite at github.com/Universal-Commerce-Protocol/conformance — tests manifest correctness, capability schemas, checkout state machine.
- Prism (Stoplight) — open-source OpenAPI mock server + contract testing proxy; runs in CI.
- EvalView — "Playwright, but for tool-calling agents." Snapshots agent behavior as a baseline, then diffs new behavior against it. Catches silent regressions when a model update changes tool-call patterns.
Logging that distinguishes agent vs. human
No tool does it natively. The practical pattern is structured JSON logging where every request includes a parsed client_type:
{
"timestamp": "2026-05-19T13:54:00Z",
"request_id": "uuid",
"api_key_id": "hashed_id",
"endpoint": "POST /checkout-sessions",
"http_status": 200,
"latency_ms": 312,
"user_agent": "OpenAI/Agents-SDK/0.4.2",
"client_type": "agent",
"signature_agent": "https://chatgpt.com"
}
This is what makes a client_type:agent filter on your dashboard possible. None of the analytics vendors give it to you by default.
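A minimal sketch of producing that record follows. The marker list, the function names, and the "human" fallback label are assumptions for illustration; a Signature-Agent value should only be trusted after its signature has been verified, which this snippet does not do.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Substrings treated as agent SDKs or crawlers; an illustrative, incomplete list.
AGENT_UA_MARKERS = ("agents-sdk", "gptbot", "oai-searchbot", "claudebot", "anthropic-ai")


def classify_client(user_agent: str, signature_agent: Optional[str] = None) -> str:
    """Return 'agent' if the request carries an agent signal, else 'human'."""
    if signature_agent:
        return "agent"
    ua = user_agent.lower()
    return "agent" if any(marker in ua for marker in AGENT_UA_MARKERS) else "human"


def log_record(request_id: str, api_key_id: str, endpoint: str, http_status: int,
               latency_ms: int, user_agent: str,
               signature_agent: Optional[str] = None) -> str:
    """Serialize one request as a JSON log line with a parsed client_type."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "request_id": request_id,
        "api_key_id": api_key_id,
        "endpoint": endpoint,
        "http_status": http_status,
        "latency_ms": latency_ms,
        "user_agent": user_agent,
        "client_type": classify_client(user_agent, signature_agent),
        "signature_agent": signature_agent,
    }
    return json.dumps(record)
```

Emitting one such line per request from a middleware is enough for a log backend to group and filter on client_type.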
Security: what's different
- Replay attacks matter more. Agents retry aggressively. AP2's TTL on Intent Mandates (24h recommended), ECDSA signatures, and idempotency keys exist for this reason.
- Credentials don't belong on the agent. Per Stripe: "Tokens can be scoped to a specific business, limited by time or amount, revoked at any time, and monitored via webhook events." Use Stripe SPTs or AP2 Closed Payment Mandates; never give the agent the raw card.
- Rate limiting needs per-key + per-IP + per-tool granularity. Per Panther on securing MCP: "Cap concurrent sessions per client, and throttle expensive tool categories (outbound HTTP, database writes, file operations) more aggressively than read-only tools."
- Production MCP reality. Per Lenses.io's analysis of ~7,000 MCP server implementations: 86% run locally on developer machines, only 5% in production, 43% contained command injection flaws, 25% had zero authentication. STDIO transport "failed catastrophically under concurrent load (20 out of 22 requests failed with just 20 simultaneous connections)" — if you ship MCP, you must ship Streamable HTTP, not STDIO.
The stack in production.
What a real deployed instance of this looks like end-to-end.
GitHub repo
│
├─ push to main → GitHub Actions CI/CD
│ │
│ ├─ ruff (lint)
│ ├─ mypy (type-check)
│ ├─ pytest (unit + integration)
│ ├─ OpenAPI schema validation
│ ├─ pip-audit (dependency security)
│ └─ vercel deploy --prod
│
└─ PR branch → preview deployment URL on Vercel
Custom domain (Cloudflare DNS, $0.87/mo .com)
└─ CNAME → vercel.app (auto Let's Encrypt SSL)
Secrets (Vercel env vars → pydantic-settings on boot)
├─ DATABASE_URL (Supabase or Neon)
├─ UPSTASH_REDIS_REST_URL
├─ STRIPE_SECRET_KEY
├─ UNKEY_ROOT_KEY
└─ SENTRY_DSN
Discovery
├─ /.well-known/ucp (static JSON manifest)
├─ /openapi.json (auto from FastAPI)
└─ /sitemap.xml + Schema.org Product on every PDP
Runtime
├─ FastAPI on Vercel Fluid Compute (Active CPU billing)
├─ Postgres + pgvector (Supabase Pro $25/mo)
├─ Upstash Redis for rate-limit counters + idempotency keys
└─ Unkey for API-key verification (middleware)
Observability
├─ Sentry (ASGI middleware) — error tracking, 5K free
├─ Logfire (logfire.instrument_fastapi(app)) — distributed tracing
├─ Cloudflare Bot Analytics (Free plan) — bot vs. human signal
└─ Vercel spend alerts ON (this is non-negotiable)
Minimum pre-deploy CI checks
- ruff check . — lint
- mypy app/ — type-check Pydantic models
- pytest tests/ -v — unit and route tests
- python -c "from app.main import app; import json; json.dumps(app.openapi())" — OpenAPI schema validity
- pip-audit — known CVEs in dependencies
Disable Vercel's native git auto-deploy in favor of GitHub Actions gates. Otherwise a broken OpenAPI schema reaches production. The tiangolo/full-stack-fastapi-template (43K GitHub stars) ships with FastAPI, PostgreSQL, SQLModel, Docker Compose, GitHub Actions CI, and Pytest end-to-end coverage — a sound base.
Common shortcuts that backfire.
Four places solo founders cut corners on agent infrastructure that cause real problems later.
Shipping without idempotency keys
The shortcut: "I'll add idempotency when I see duplicates."
The fail: An AI agent retries a failed POST /checkout-sessions/:id/complete over a flaky network and creates a duplicate charge. Per Stripe's talk on idempotent endpoints: "By ensuring APIs are idempotent, servers process requests exactly once, avoiding unintended behaviors like duplicate transactions."
The fix on day one: Accept an Idempotency-Key header, store it with a TTL in Redis, return the cached response for duplicates. The UCP spec already requires idempotency-key on every transaction.
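The pattern can be sketched as follows, with an in-memory dictionary standing in for Redis. The class, the handle_complete helper, and the 24-hour default TTL are assumptions for illustration. The lookup and the write are not atomic here, so a real deployment would need an atomic reserve step (for example Redis SET with NX) to stop two concurrent retries from both charging.

```python
import time
from typing import Callable, Optional


class IdempotencyStore:
    """In-memory stand-in for a Redis keyspace with per-key TTL."""

    def __init__(self, ttl_seconds: float = 86_400, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = {}  # key -> (expires_at, cached response)

    def get(self, key: str) -> Optional[dict]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self.clock() >= expires_at:
            del self._data[key]  # expired: behave as if the key was never seen
            return None
        return response

    def put(self, key: str, response: dict) -> None:
        self._data[key] = (self.clock() + self.ttl, response)


def handle_complete(store: IdempotencyStore, idempotency_key: str,
                    charge: Callable[[], dict]) -> dict:
    """Run `charge` once per key; replay the stored response for duplicates."""
    cached = store.get(idempotency_key)
    if cached is not None:
        return cached
    response = charge()
    store.put(idempotency_key, response)
    return response
```

A retry that reuses the same Idempotency-Key gets the original order back instead of a second charge.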
No spend alerts on Vercel or AWS
The shortcut: "I'll watch the dashboard."
The fail: A Reddit user reported a $1,141 Vercel bill on a "small" Next.js site, driven by bot/crawler traffic to ISR pages. A separate HN post documented a $1,000 NAT Gateway charge from a single misconfiguration.
The fix on day one: Set a Vercel spend alert at 2× expected, an AWS Budget at 1.5× expected, and route both to email + push.
Shipping MCP over STDIO transport
The shortcut: "It works on my machine."
The fail: Per Lenses.io citing Stacklok load tests: "STDIO fails catastrophically under concurrent load (20 out of 22 requests failed with just 20 simultaneous connections)." 86% of MCP servers in the wild are still local; only 5% are production-grade.
The fix on day one: If you ship MCP, ship Streamable HTTP transport with OAuth 2.1 — not STDIO with environment-variable API keys.
Stripe Billing without knowing fees compound
The shortcut: "Stripe is just 2.9% + 30¢."
The fail: Stripe Billing adds 0.7% on top of the base 2.9% + $0.30. On a $50/month subscription, the total fee is 4.2% ($1.75 base + $0.35 Billing = $2.10). At $10K MRR, that's $70/month in Billing alone. Per Stripe's pricing page, the annual Billing plan starts at $620/month, which is only worth it above ~$90K MRR.
The fix on day one: Calculate Billing fees into your contribution margin from the first SKU. If unit economics break at 4.2% fees, the product was already broken.
Frequently asked questions.
Can I ship without UCP and just use ACP?
Yes. ACP (Stripe + OpenAI) is the lower-effort path if you are already on Stripe — enabling agentic payments is as little as one line of code. UCP is more ambitious and more work. The two are not mutually exclusive. Incremental effort to add ACP after UCP is 2–4 hours; adding UCP after ACP is 8–16 hours.
Do I need a vector database from day one?
No. For catalogs under 1,000 SKUs, Postgres full-text search works. Add pgvector when the catalog grows past 1,000 SKUs, when agents start sending paraphrased natural-language queries, or when you want find-similar-product semantics. Embedding 100K products with text-embedding-3-small costs about $0.40 one-time.
Vercel or Fly.io for the API runtime?
Vercel Pro at $20/mo is the documented best practice for solo founders shipping FastAPI — auto-detected entrypoint, native ASGI, Fluid Compute billing on Active CPU rather than wall-time. Fly.io at roughly $3.32/mo for shared-cpu-1x 512MB always-on, with $0.02/GB egress, wins when you need WebSockets, SSE streaming, or absolute lowest bandwidth cost.
How do I distinguish AI agent traffic from human traffic in my API logs?
No tool does it automatically. Four practical paths: parse User-Agent for known SDK strings; verify RFC 9421 HTTP Message Signatures from agents that ship them; issue separate API keys for agent vs. human-facing clients and tag every request by key type; put Cloudflare in front and use Bot Analytics.
What is the cheapest way to serve product images to agents?
Cloudflare R2 with zero egress fees. Storage at $0.015/GB-month with 10 GB free; zero egress regardless of volume. S3 at $0.09/GB egress would cost roughly $90 for 1 TB of agent-driven image fetches; R2 costs nothing. Given AI crawl-to-refer ratios reach 500,000:1, this is the difference between a $0 image bill and a $1,000 image bill.
Do I really need rate limiting on day one?
Yes. Agents call at machine speed. A runaway agent loop can exhaust a metered free tier in minutes; an attacker can do it on purpose. Implement per-IP for anonymous, per-API-key for authenticated, backed by Redis. UCP's spec already assumes rate limiting exists.
What is the minimum I need to test before shipping?
Four things: MCP Inspector for interactive checks if you ship MCP; Prism mock server to validate your OpenAPI spec against simulated agent calls; the UCP conformance suite if you ship UCP; EvalView for regression testing of agent tool-call behavior in CI.
Why is OpenAPI specifically important for agents — isn't it just developer docs?
Because LLMs consume it directly. AI Agents rely on precise, machine-readable signals — when the API lacks predictable schemas, typed errors, and clear behavioral rules, AI agents cannot function. OpenAPI is the contract; MCP tools, the OpenAI Agents SDK, LangChain, Speakeasy, and FastMCP all auto-generate agent integrations from it.
Pricing pages & primary sources.
All prices fetched from official pricing pages and subject to change — re-verify before publishing or quoting in production planning.
Hosting & runtime
- Vercel — pricing · Fluid Compute docs · FastAPI deployment
- AWS Lambda — pricing · Edge Delta cold start analysis
- Google Cloud Run — pricing
- Fly.io — pricing
- Railway — pricing
Data layer
- Supabase — pricing
- Neon — pricing
- PlanetScale — pricing
- Upstash — pricing
- Pinecone — pricing
- Qdrant — pricing
- Weaviate — pricing
Payments & protocols
- Stripe — pricing · agentic commerce announcement · ACP announcement
- UCP — Google UCP Guide · ucp.dev spec · Google Developer Blog · conformance repo
- AP2 — ap2-protocol.org · CSA security analysis
- MCP — introduction · WorkOS MCP guide · Lenses.io production analysis · Panther MCP security
- OpenAI — function calling · Agents SDK MCP
- Visa Trusted Agent Protocol — announcement
API gateway & keys
Observability & edge
- Sentry — pricing
- Grafana Cloud — pricing
- Datadog — pricing
- PostHog — pricing
- Cloudflare — plans · Web Bot Auth · Signed Agents · Radar Year in Review · R2
- Resend — pricing
Schema & structured data
- Google — Product structured data docs
- Schema App — 2025 schema markup analysis
Community sources
- Simon Willison on ChatGPT Agent headers
- Arcjet on agent identification
- Postman State of the API
- dev.to UCP vs ACP comparison
- Firecrawl vector database guide
- Render FastAPI production guide
Items flagged unverified
- Unkey current pricing model — public pricing page has pivoted to CPU/memory/egress infrastructure billing. Per-verification quotas come from Unkey's engineering RFC, not the live pricing page. [PARTIALLY VERIFIED]
- UCP Conformance Suite test cases — repo was inaccessible during research. Test cases listed in this brief are inferred from the UCP spec. [UNVERIFIED]
- Exact AP2 idempotency-key/nonce format — documented in the CSA security analysis; spec source-of-truth field name was not confirmed in src/ap2/types/mandate.py. [UNVERIFIED]
Platform pricing and protocol specs shift without notice. Any number on this page that affects your unit economics or compliance should be re-verified against the live vendor docs before you build a margin or launch plan on it.