One decision before you write a single line of billing code.
Before you pick Stripe or Lago, before you configure a rate limiter, you must answer one question: what does "free" mean for your API, and what specifically forces a user to stop being free?
The gate is not your pricing page. It is the moment of friction: a 429 rate limit response, a soft "you've hit your monthly quota" wall, a disabled endpoint. Without a defined gate, you have built a subsidy program, not a product. For a solo founder shipping a FastAPI/Vercel API, the sequence is:
- Pick a gate that is real but not punishing.
- Instrument the gate with a stateful counter — which requires external Redis on serverless.
- Decide whether overages block or alert — only then does billing infrastructure matter.
The three decisions — what to gate, how to count it, and how to charge beyond it — are the entire implementation problem.
The three monetization models, ranked for an API builder.
Usage-based pricing is not an emerging trend. According to OpenView Partners' State of Usage-Based Pricing, three out of five SaaS companies now use some form of UBP. For API and infrastructure companies specifically, the adoption rate is higher still. The tradeoffs that matter for a solo builder are below — not as a pricing textbook, but as a comparison ranked by what you can actually ship this week.
Per-Call (Pay-as-You-Go)
Each API call or unit of output is billed individually. Real examples with current published pricing:
- Twilio SMS: $0.0083 per outbound SMS (US long code); pure pay-as-you-go, no monthly minimum (twilio.com/en-us/sms/pricing/us).
- OpenAI API: GPT-5.4 mini at $0.75 per 1M input tokens; no standing free tier and no credit grant for new accounts as of late 2025 (openai.com/api/pricing).
AgentMall fit: Per-call is ideal when API calls have direct measurable value. Risk: customers cannot predict bills, which creates upgrade hesitation. At low traffic (under 10K req/day), per-call revenue is negligible — the model is primarily a signal mechanism at this stage.
Per-Seat
Price tied to the number of users or API keys, not consumption. Weak fit for an AgentMall API — agent traffic is programmatic, not tied to human seats. Per-seat billing creates friction without a natural correlation to value delivered.
Flat-Tier (Subscription with Included Volume)
Fixed monthly price grants access to a defined call volume. Overages either block (hard gate) or accrue extra charges (soft gate). Real examples:
- Cloudflare Workers: Free plan 100K requests/day; Standard $5/month includes 10M requests/month, then $0.30/million additional (developers.cloudflare.com/workers/platform/pricing).
- Zuplo API Gateway: Free tier 100K requests/month; Builder $25/month up to 1M requests/month (zuplo.com/pricing).
- Unkey API Key Management: 150,000 verifications/month free (unkey.com).
Model Comparison
| Model | Revenue Predictability | Implementation Complexity | Best For |
|---|---|---|---|
| Per-call only | Low | High | High-volume value-per-call APIs |
| Per-seat | High | Low | B2B with named users |
| Flat tier | High | Low–Medium | Early-stage predictable value |
| Hybrid flat + usage | Medium–High | Medium | Scale-stage mixed customers |
Recommended starting point: a flat free tier enforced with a Redis rate limit, plus a flat paid tier on a Stripe subscription. Add usage-based metering only when you have paying customers hitting the flat tier ceiling.
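To make the flat-plus-overage arithmetic concrete, here is a minimal sketch that computes a monthly bill from an included volume and a per-million overage rate. The function name and parameters are illustrative assumptions; the figures are the Cloudflare Workers Standard numbers quoted above.

```python
from decimal import Decimal

def monthly_bill(requests_used: int, base_price: Decimal, included: int,
                 overage_per_million: Decimal) -> Decimal:
    """Flat fee plus a per-million charge for volume beyond the included amount."""
    extra = max(0, requests_used - included)
    overage = Decimal(extra) / Decimal(1_000_000) * overage_per_million
    return (base_price + overage).quantize(Decimal("0.01"))

# Figures quoted above: $5 base, 10M requests included, $0.30 per extra million.
under_cap = monthly_bill(8_000_000, Decimal("5"), 10_000_000, Decimal("0.30"))
over_cap = monthly_bill(25_000_000, Decimal("5"), 10_000_000, Decimal("0.30"))
print(under_cap)  # 5.00
print(over_cap)   # 9.50
```

A hard gate is the same calculation with the overage rate set to zero and requests blocked once the included volume is used up.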
Designing a free tier that converts, not subsidizes.
Free-to-Paid Conversion Benchmarks
From Lenny's Newsletter — the most widely cited source for freemium conversion benchmarks:
- Freemium self-serve: 3%–5% is "good," 6%–8% is "great."
- Developer-focused products specifically: median conversion rate is 5% — half that of non-developer products. "It's a truism that developers are a tough crowd to sell to — and the data supports that intuition."
- a16z rule of thumb: "Below 5% in a year (cohorted) is our usual rule of thumb for a suboptimal free plan in SaaS businesses."
Practical interpretation: If you have 1,000 free-tier API users at 6 months, expect 15–50 conversions under normal conditions. Zero conversions after 200+ active free users means your gate is either too generous or your paid value proposition is unclear.
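The 15 to 50 range above corresponds to conversion rates of 1.5% to 5%. A small sketch of that arithmetic, with a hypothetical helper name, lets you plug in your own user count and rate range:

```python
def expected_conversions(free_users: int, low_rate: float, high_rate: float) -> tuple[int, int]:
    """Return the (low, high) number of paying users for a range of conversion rates."""
    return round(free_users * low_rate), round(free_users * high_rate)

# 1,000 free users at 1.5% to 5%, the range implied by the text above.
print(expected_conversions(1_000, 0.015, 0.05))  # (15, 50)
# The same user base at the 3% to 5% "good" freemium benchmark.
print(expected_conversions(1_000, 0.03, 0.05))   # (30, 50)
```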
What to Gate
The gate must create real friction without preventing the "aha" moment. Gate options ranked by conversion pressure:
- Feature gating — some endpoints only on paid (low friction; easy to circumvent).
- Volume gating — N calls/month, then block (most common; creates natural upgrade pressure).
- Rate gating — N calls/minute throttled, not blocked (less effective; users simply slow down).
- Data gating — truncated results on free tier (effective for data APIs).
- Recency gating — real-time data on paid, delayed on free (highly effective for time-sensitive APIs).
For an AgentMall commerce API, recommend volume + feature gating combined. Free tier gets product lookup endpoints with 1,000 calls/month; paid tier unlocks inventory sync, webhook events, and bulk endpoints.
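One way to express combined volume and feature gating is a plan table plus a single check function. This is a sketch under assumptions: the endpoint paths are hypothetical, and the paid plan is modeled with no volume cap purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Plan:
    name: str
    monthly_quota: Optional[int]   # None means "no volume cap" in this sketch
    endpoints: frozenset

# Hypothetical endpoint paths; substitute your own routes.
FREE = Plan("free", 1_000, frozenset({"/products/lookup"}))
PAID = Plan("paid", None, frozenset(
    {"/products/lookup", "/inventory/sync", "/webhooks", "/bulk"}))

def check_access(plan: Plan, endpoint: str, calls_this_month: int) -> tuple[bool, int]:
    """Return (allowed, http_status) for one incoming request."""
    if endpoint not in plan.endpoints:
        return False, 403          # feature gate: endpoint not on this plan
    if plan.monthly_quota is not None and calls_this_month >= plan.monthly_quota:
        return False, 429          # volume gate: monthly quota used up
    return True, 200

print(check_access(FREE, "/products/lookup", 999))    # (True, 200)
print(check_access(FREE, "/products/lookup", 1_000))  # (False, 429)
print(check_access(FREE, "/inventory/sync", 0))       # (False, 403)
```

In a real deployment the `calls_this_month` value would come from the external counter described later, not from application memory.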
Free Tier Benchmarks
| Company | Product Type | Free Tier Limit | First Paid Tier | Billing Model | Notes |
|---|---|---|---|---|---|
| Cloudflare Workers | Serverless compute/API | 100,000 requests/day | $5/month (10M req/month) | Flat + usage overage | Workers AI: 10K Neurons/day free (source) |
| Twilio SMS | Communications API | No free tier (pay-go from $0) | $0.0083/SMS (no plan minimum) | Pure per-call | Test credentials available for dev (source) |
| OpenAI API | LLM API | No standing free tier as of late 2025 | Pay-as-you-go: GPT-5.4 mini $0.75/1M input tokens | Pure per-token | Must add payment method to use API (source) |
| Upstash Redis | Serverless Redis | 500K commands/month, 10K/day max | $0.20/100K commands (pay-as-you-go) | Free + pay-as-you-go | Fits perfectly as rate limit counter (source) |
| Zuplo | API gateway | 100K requests/month, unlimited API keys | $25/month (Builder: up to 1M req/month) | Flat tier | Includes developer portal (source) |
| Unkey | API key management | 150,000 verifications/month free | Usage-based beyond free tier | Verification-based | No credit card for free tier (source) |
| AWS API Gateway | API gateway | 1M REST API calls/month free (new accounts, 12 months) | $3.50/million calls (REST, first 333M) | Pay-as-you-go | Free tier expires after 12 months (source) |
| Google Maps Platform | Mapping API | Up to $3,250/month in usage credits (~10K calls/SKU/month) | $2–$17/1,000 calls depending on SKU | Usage-based with monthly credit | Generous free tier drives massive developer adoption |
The "Free Forever" Trap
The Heroku post-mortem is canonical. From dev.to's analysis: "For a decade, Heroku offered a generous free tier... this filled Heroku's infrastructure with abandoned crypto miners, dormant Slack bots, and student projects that hadn't been touched in five years... In 2022, Heroku killed the free tier. The backlash was immense, but the business logic was sound."
Real founder story from Reddit r/SaaS (2025): "Our free users make around 8 million API calls monthly, compared to just 6 million from our paying customers. This means free users account for 57% of our overall traffic but contribute nothing to our revenue... just 12 free users were responsible for 4 million of those calls." Resolution: capped free tier at 1,000 calls/day.
Never label a tier "free forever" in your marketing. Use "free starter plan" or "free up to X calls/month." This preserves your ability to lower limits without backlash from users who feel they have a contractual right to unlimited free access.
Rate limiting on FastAPI + Vercel — the only stack that actually works.
The Vercel Serverless Constraint
Vercel serverless functions are stateless. Concurrent requests can land on separate function instances that share no memory, and any instance can be recycled at any time. In-memory rate limiting (a Python dict counting requests) resets on every cold start and cannot coordinate across concurrent invocations.
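The failure mode is easy to reproduce locally. The sketch below simulates two function instances, each holding its own in-memory counter, with requests alternating between them; the class name and the limit of 3 are illustrative assumptions.

```python
class InMemoryLimiter:
    """Per-process counter: the kind of limiter that breaks on serverless."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counts: dict[str, int] = {}

    def allow(self, key: str) -> bool:
        used = self.counts.get(key, 0)
        if used >= self.limit:
            return False
        self.counts[key] = used + 1
        return True

# Two concurrently running instances, each with its own private memory.
instance_a = InMemoryLimiter(limit=3)
instance_b = InMemoryLimiter(limit=3)

# The platform alternates six requests from one API key between the instances.
results = [(instance_a if i % 2 == 0 else instance_b).allow("key_123")
           for i in range(6)]
print(results)  # [True, True, True, True, True, True]
```

The key is allowed six requests against a limit of three, because neither instance sees the other's count.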
From Reddit r/nextjs: "Sticking something in front of Vercel is the only option to truly rate limit. Any rate-limiting done via NextJs code paths still counts as an invocation, end of story." / "Upstash Rate Limit... is a pretty good framework-aware solution, also they are providing rate limit analytics for you so you can easily find abuses/abusers."
Solution: External stateful store. The go-to for Vercel is Upstash Redis, which exposes Redis over an HTTP (REST) interface and therefore needs no persistent TCP connections. That property is exactly what serverless needs.
The Four Rate Limiting Algorithms
Token Bucket
Each user has a bucket with capacity N tokens. Each request spends one token. Tokens refill at a fixed rate. Allows bursts up to bucket capacity, then enforces steady-state rate. Best for: public APIs with legitimate bursty traffic. Per arcjet's algorithm comparison: "Token bucket is the strongest general-purpose default for APIs."
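A minimal single-process sketch of the token bucket follows. It keeps state in memory only to show the refill logic; on Vercel the token count and timestamp would have to live in an external store. The injectable `clock` parameter is an assumption added so the behavior can be demonstrated without sleeping.

```python
import time
from typing import Callable

class TokenBucket:
    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1            # each request spends one token
            return True
        return False

fake_now = [0.0]                        # controllable clock for the demo
bucket = TokenBucket(capacity=5, refill_per_second=1.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(6)]   # a burst of 6 at t=0
fake_now[0] = 2.0                            # two seconds later, 2 tokens are back
after = [bucket.allow() for _ in range(3)]
print(burst)  # [True, True, True, True, True, False]
print(after)  # [True, True, False]
```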
Fixed Window
Counter increments per request, resets at the window boundary (e.g., top of each minute). A user can make 2× their limit by hitting the boundary (N calls at 11:59:59 + N calls at 12:00:00). Best for: simple implementations, low-stakes limits. Weakness: boundary exploitation.
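The boundary problem can be shown with a small in-memory sketch. The class below is illustrative only (it never evicts old windows), and timestamps are passed in explicitly so the example is deterministic.

```python
class FixedWindowCounter:
    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts: dict[int, int] = {}   # window id -> requests seen

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window_seconds)
        used = self.counts.get(window_id, 0)
        if used >= self.limit:
            return False
        self.counts[window_id] = used + 1
        return True

limiter = FixedWindowCounter(limit=10, window_seconds=60)
late = sum(limiter.allow(now=59.0) for _ in range(10))    # last second of window 0
early = sum(limiter.allow(now=60.0) for _ in range(10))   # first second of window 1
print(late + early)  # 20
```

Twenty requests pass within about one second even though the nominal limit is 10 per minute.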
Sliding Window
The window rolls with the current timestamp — counts requests in the past N seconds from now. Smooths the fixed-window boundary problem. Best for: APIs where fair distribution matters. Upstash's SlidingWindow algorithm is available natively.
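For comparison, here is a sketch of a sliding window implemented as a timestamp log, run against the same boundary scenario. This is one textbook variant, not a description of Upstash's implementation, which may use a cheaper approximation.

```python
from collections import deque

class SlidingWindowLog:
    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.timestamps: deque = deque()   # accepted request times, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have fallen out of the trailing window.
        cutoff = now - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False
        self.timestamps.append(now)
        return True

limiter = SlidingWindowLog(limit=10, window_seconds=60)
late = sum(limiter.allow(now=59.0) for _ in range(10))    # 10 accepted
early = sum(limiter.allow(now=60.0) for _ in range(10))   # all rejected
much_later = limiter.allow(now=119.0)                     # old entries have expired
print(late, early, much_later)  # 10 0 True
```

The trade-off is memory: the log stores one timestamp per accepted request, which is why production limiters often approximate it with two counters.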
Leaky Bucket
Requests fill a queue; they drain at a constant rate. No bursts. Best for: shaping outbound traffic to a downstream service. Not ideal for: inbound API rate limiting where you want to allow bursts.
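A sketch of the leaky bucket as a traffic shaper: each request is assigned a departure time spaced at a constant interval, and requests are rejected once the queue is full. The class name, the `schedule` method, and the explicit `now` argument are assumptions made for a deterministic example.

```python
from typing import Optional

class LeakyBucketShaper:
    def __init__(self, capacity: int, interval: float) -> None:
        self.capacity = capacity          # maximum requests waiting in the queue
        self.interval = interval          # seconds between forwarded requests
        self.last_departure = float("-inf")

    def schedule(self, now: float) -> Optional[float]:
        """Return when this request may be forwarded, or None if the queue is full."""
        departure = max(now, self.last_departure + self.interval)
        queued_ahead = (departure - now) / self.interval
        if queued_ahead >= self.capacity:
            return None                   # bucket overflow: reject the request
        self.last_departure = departure
        return departure

shaper = LeakyBucketShaper(capacity=3, interval=1.0)
burst = [shaper.schedule(now=0.0) for _ in range(5)]
print(burst)  # [0.0, 1.0, 2.0, None, None]
```

The burst of five arrives at once, but at most one request per second leaves, which is the behavior you want in front of a fragile downstream service.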
For FastAPI/Vercel: Token bucket or sliding window via Upstash.
slowapi — FastAPI Rate Limiting Library
Source: github.com/laurentS/slowapi — 1,700+ stars, 7,500+ dependent projects.
- Install: `pip install slowapi`
- Backends: Redis, memcached, in-memory fallback
- Key constraint: the `request` argument must be explicitly passed to the endpoint function
- Vercel compatibility: configure Upstash Redis as the backend using the `redis://` URL
```python
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded
from fastapi import FastAPI, Request

# No storage_uri is given here, so counters live in process memory.
# On Vercel, pass storage_uri="redis://..." so they persist across invocations.
limiter = Limiter(key_func=get_remote_address)
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/products")
@limiter.limit("100/day")  # free-tier cap per IP
async def get_products(request: Request):
    # Note: the 'request' argument is mandatory for slowapi to hook in
    return {"products": []}
```
For per-API-key rate limiting (recommended over per-IP for authenticated APIs):
```python
def get_api_key(request: Request) -> str:
    # Fall back to the client IP when no key header is present
    return request.headers.get("X-API-Key", get_remote_address(request))

limiter = Limiter(key_func=get_api_key)
```
Upstash Redis SDK — The Recommended Serverless Solution
| Tier | Price | Commands | Daily Cap | Storage |
|---|---|---|---|---|
| Free | $0 | 500K/month | 10K/day | 256 MB |
| Pay as You Go | $0.20/100K commands | Unlimited monthly | 1M/day | $0.25/GB |
| Fixed | $10/month + $5/read region | — | 100K/day | 250 MB |
At under 10K req/day: fits within Upstash's free tier, assuming roughly one Redis command per rate-limit check. Beyond that, each additional 10K commands costs $0.02 on pay-as-you-go.
```python
# Install: pip install upstash-ratelimit upstash-redis
import time

from fastapi import FastAPI, HTTPException, Request
from upstash_ratelimit import Ratelimit, FixedWindow
from upstash_redis import Redis

app = FastAPI()
redis = Redis.from_env()  # reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN

# Fixed window: 100 requests per day per API key
ratelimit = Ratelimit(
    redis=redis,
    limiter=FixedWindow(max_requests=100, window=86400),  # 86400 seconds = 1 day
    prefix="free_tier",
)

@app.get("/api/products")
async def get_products(request: Request):
    api_key = request.headers.get("X-API-Key", request.client.host)
    response = ratelimit.limit(api_key)
    if not response.allowed:
        # response.reset is the Unix time (seconds) at which the window resets
        retry_after = max(0, int(response.reset - time.time()))
        raise HTTPException(
            status_code=429,
            detail="Free tier limit exceeded. Upgrade at https://yourapi.com/pricing",
            headers={
                "X-RateLimit-Limit": str(response.limit),
                "X-RateLimit-Remaining": "0",
                "X-RateLimit-Reset": str(int(response.reset)),
                "Retry-After": str(retry_after),
            },
        )
    return {"products": [], "remaining": response.remaining}
```
Source: upstash.com/docs/redis/tutorials/python_rate_limiting.
Vercel environment setup: Add UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN to Vercel environment variables. Redis.from_env() reads both automatically.
Always return rate limit headers on 429 responses. Users who hit a limit with no context will assume your API is broken, not that they've been rate-limited. Always include X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After — and include a link to your pricing page in the error body.
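A small helper keeps the four headers consistent across endpoints. This is a sketch: the function name and the optional `now` parameter (added so the result is reproducible) are assumptions, not part of any library.

```python
import math
import time
from typing import Optional

def rate_limit_headers(limit: int, remaining: int, reset_epoch: float,
                       now: Optional[float] = None) -> dict[str, str]:
    """Build the headers for a rate-limited response.

    reset_epoch is the Unix time (seconds) at which the quota window resets.
    """
    now = time.time() if now is None else now
    retry_after = max(0, math.ceil(reset_epoch - now))   # never negative
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_epoch)),
        "Retry-After": str(retry_after),
    }

headers = rate_limit_headers(limit=100, remaining=0,
                             reset_epoch=1735689600, now=1735689000)
print(headers["Retry-After"])  # 600
```

Deriving `Retry-After` from the reset time avoids telling a client to wait a full day when the window resets in ten minutes.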
Stripe Meters — the only usage path for new builds.
Meters vs. Usage Records — The Critical Distinction
In April 2024, Stripe launched the Meters API as a new, recommended path for usage event ingestion. This is distinct from the older "usage records" approach. For any new integration in 2025–2026, use Meters.
| Feature | Usage Records (Legacy) | Meters (Recommended, 2024+) |
|---|---|---|
| API endpoint | /v1/subscription_items/{id}/usage_records | /v1/billing/meter_events |
| Reporting pattern | Report against a specific SubscriptionItem ID | Report against a named event (e.g., api_calls) |
| Subscription coupling | Requires knowing the SubscriptionItem ID | Decoupled — report usage before subscription exists |
| Aggregation | Manual aggregation required | Stripe aggregates natively (sum, count, last) |
| Ingest rate | Limited | Up to 100,000 events/second |
| Recommended for new integrations? | No | Yes |
From Stripe's Sessions 2024 announcement: "We launched support for usage-based billing with our new Meters API, so you can now ingest, aggregate, and view usage events on Stripe in real time."
From Prefab.cloud's analysis: "The great thing about Stripe's new support for Usage Based Billing and Meters is that the 'usage' part has gotten very simple and is hardly something you need to think about anymore."
The Stripe Meters API Flow (Four Steps)
Step 1 — Create a Meter (one-time setup)
```shell
# From Stripe API Reference: https://docs.stripe.com/api/billing/meter/create
curl https://api.stripe.com/v1/billing/meters \
  -u "sk_live_YOUR_KEY:" \
  -d "display_name=API Calls" \
  -d "event_name=api_call" \
  -d "default_aggregation[formula]=sum" \
  -d "value_settings[event_payload_key]=value" \
  -d "customer_mapping[type]=by_id" \
  -d "customer_mapping[event_payload_key]=stripe_customer_id"
```
Step 2 — Create a Product and Meter-linked Price via Stripe Dashboard
Create a recurring product, set pricing model to "Usage-based," set pricing structure to "Per unit," link the price to your meter by name.
Step 3 — Report Usage Events on every qualifying API call
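```python
import uuid
from typing import Optional

import stripe

stripe.api_key = "sk_live_YOUR_KEY"

def report_api_call(stripe_customer_id: str, call_count: int = 1,
                    event_id: Optional[str] = None) -> None:
    """Call this after each successful paid API request."""
    stripe.billing.MeterEvent.create(
        event_name="api_call",
        payload={
            "value": str(call_count),
            "stripe_customer_id": stripe_customer_id,
        },
        # The identifier must be unique per event; Stripe uses it to deduplicate.
        # Pass the same event_id on a retry so the call is not counted twice.
        # (A second-resolution timestamp would collide for calls in the same second.)
        identifier=event_id or uuid.uuid4().hex,
    )
```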
FastAPI middleware pattern for automatic metering:
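```python
from fastapi import FastAPI, Request
import stripe

app = FastAPI()
stripe.api_key = "sk_live_YOUR_KEY"

@app.middleware("http")
async def meter_paid_requests(request: Request, call_next):
    response = await call_next(request)
    # request.state is filled in by your auth layer; default to "not paid" if unset
    is_paid = getattr(request.state, "is_paid_tier", False)
    stripe_customer_id = getattr(request.state, "stripe_customer_id", None)
    # Only meter successful paid-tier requests
    if response.status_code == 200 and is_paid and stripe_customer_id:
        try:
            # This call is synchronous and delays the response; it is kept inline
            # for brevity. In production, enqueue the event instead.
            stripe.billing.MeterEvent.create(
                event_name="api_call",
                payload={"value": "1", "stripe_customer_id": stripe_customer_id},
            )
        except stripe.StripeError:
            pass  # never fail a served request because metering failed; log and retry
    return response
```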
Stripe processes meter events asynchronously. Do not block your API response waiting for confirmation. In production, send usage events to a queue (Redis list, SQS) and process them in a background worker.
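The queue-and-worker pattern can be sketched with an in-memory stand-in for the Redis list or SQS queue. Everything here is hypothetical scaffolding: `UsageQueue`, `enqueue`, and `flush` are illustrative names, and in a real worker the `send` callable would wrap the Stripe meter event call, passing `event_id` as the identifier.

```python
from collections import deque
from typing import Callable

UsageEvent = dict[str, str]

class UsageQueue:
    """In-memory stand-in for a durable queue (Redis list, SQS)."""

    def __init__(self) -> None:
        self._events: deque = deque()

    def __len__(self) -> int:
        return len(self._events)

    def enqueue(self, customer_id: str, event_id: str, value: int = 1) -> None:
        # Request path: cheap and local, so the API response is not delayed.
        self._events.append(
            {"customer_id": customer_id, "event_id": event_id, "value": str(value)})

    def flush(self, send: Callable[[UsageEvent], None], max_events: int = 100) -> int:
        """Worker path: send queued events in order; stop at the first failure."""
        sent = 0
        while self._events and sent < max_events:
            event = self._events[0]
            try:
                send(event)
            except Exception:
                break                   # keep the event queued; retry on the next flush
            self._events.popleft()
            sent += 1
        return sent

queue = UsageQueue()
queue.enqueue("cus_123", "evt_1")
queue.enqueue("cus_123", "evt_2")
delivered: list = []
sent_count = queue.flush(delivered.append)   # list.append plays the role of the sender
print(sent_count, len(queue))  # 2 0
```

Because an event is removed only after `send` succeeds, a failed delivery is retried later, and the stable `event_id` is what keeps a retry from being counted twice.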
Step 4 — Invoice Flow
Stripe bills customers at the end of their subscription period based on aggregated meter data. The upcoming_invoice endpoint reflects metered usage (with up to 30-second lag in test mode).
Stripe Billing Fees
From Stripe's pricing page:
| Plan | Fee | Recurring |
|---|---|---|
| Pay-as-you-go Billing | 0.7% of billing volume | No recurring fees |
| Annual Billing contract | Starting $620/month | Annual commitment |
What this means in practice:
- At $1,000 MRR: Stripe Billing costs $7/month; card processing ~$30–$45/month; total Stripe cut ~$37–$52 (3.7%–5.2%).
- The 0.7% fee becomes meaningful at ~$88,500 MRR, where it equals the $620/month annual contract floor.
- For a solo founder at < $5K MRR: use pay-as-you-go. Do not sign an annual contract.
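The fee arithmetic above can be reproduced in a few lines. The constants come from the pricing table in this section; the function names are illustrative.

```python
BILLING_FEE_RATE = 0.007        # pay-as-you-go Stripe Billing fee, from the table above
ANNUAL_CONTRACT_FLOOR = 620.0   # USD per month, from the table above

def billing_fee(mrr: float) -> float:
    """Monthly Stripe Billing fee on pay-as-you-go, excluding card processing."""
    return round(mrr * BILLING_FEE_RATE, 2)

def break_even_mrr() -> float:
    """MRR at which the 0.7% fee equals the annual contract's monthly floor."""
    return round(ANNUAL_CONTRACT_FLOOR / BILLING_FEE_RATE, 2)

print(billing_fee(1_000))   # 7.0
print(billing_fee(2_000))   # 14.0
print(break_even_mrr())     # 88571.43
```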
Alternatives — Lago and Others
- Lago: Open-source billing platform. Self-hosted: free (Docker/Kubernetes). Cloud: Business and Enterprise tiers, pricing not publicly listed (contact sales). Best for founders who have hit Stripe's 0.7% pain point at scale, or need complex metering (dimensional pricing, prepaid credits). Not for early-stage — the ops burden is disproportionate when Stripe's 0.7% on $2K MRR is $14/month.
- Paddle: Merchant-of-record model; handles tax and global compliance. 5% + $0.50 per transaction [UNVERIFIED — verify at paddle.com/pricing]. Best for international sales with VAT/GST complexity.
- Orb: Usage-based billing infrastructure for complex models. Pricing not publicly listed [UNVERIFIED — verify at withorb.com/pricing]. Best for multiple meters and dimensional pricing.
How much of the stack do you want to own?
The core question is how much of the infrastructure stack you want to own vs. outsource. Five options, one comparison table.
Option A — DIY FastAPI + Upstash Redis
What you build: API key table in your DB, FastAPI middleware (validate key → check Redis counter → increment Redis counter), Stripe Billing subscription for paid customers, Stripe Meters for usage reporting.
- Monthly cost at low volume: $0 (Upstash free tier under 10K req/day) + 0.7% Stripe on billing volume
- Pros: full control, no vendor lock-in, zero additional tooling cost at low volume
- Cons: you build API key management logic, you build Stripe ↔ DB sync (webhook handler), no built-in developer portal, 2–4 weeks minimum implementation time for a correct abuse-resistant system
- Verdict: Best for founders who want to understand the stack deeply and minimize tooling costs.
Option B — Unkey
Open-source API key management platform (unkey.com). Handles: key creation, management, validation, per-key rate limiting, usage analytics, IP allowlisting, audit logs, globally distributed low-latency key verification.
- Free tier: 150,000 verifications/month, no credit card required
- One API call in your FastAPI dependency replaces your entire key management layer
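```python
import unkey
from fastapi import FastAPI, Header, HTTPException

# Illustrative client usage; the exact class and method names depend on the
# Unkey Python SDK version, so check the SDK docs before copying this.
client = unkey.Client(api_id="YOUR_API_ID")
app = FastAPI()

async def validate_api_key(x_api_key: str = Header(...)):
    result = await client.keys.verify(key=x_api_key)
    if result.valid:
        return result
    # A rate-limited or over-quota key also comes back as not valid,
    # so check the code before falling back to 401.
    if result.code == "RATE_LIMITED":
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    if result.code == "USAGE_EXCEEDED":
        raise HTTPException(status_code=402, detail="Upgrade required")
    raise HTTPException(status_code=401, detail="Invalid API key")
```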
Verdict: Best for founders who want to eliminate API key management entirely. Pairs with Stripe Billing for payment. Adds vendor dependency but removes a non-trivial implementation surface.
Option C — Zuplo
Fully managed API gateway (zuplo.com). Handles: rate limiting, API key issuance, usage metering, Stripe billing integration, developer portal — as an integrated platform.
- Free: $0/month — 100K requests/month, unlimited API keys
- Builder: $25/month — up to 1M requests/month, 1,000 consumers, 2 custom domains
- Vercel integration: Zuplo sits in front of your FastAPI as a proxy; your app handles business logic; Zuplo handles auth, rate limiting, metering, billing before the request reaches your code
From Zuplo's comparison: "Stripe Billing does not enforce quotas... Zuplo enforces quotas at the edge."
Verdict: Best for founders who want a complete billing-integrated gateway with a customer-facing portal. Main risk: $25/month is a real carry cost at zero revenue, and the gateway pattern puts Zuplo in your critical path.
Option D — Stripe Billing Native (No Gateway)
Stripe alone, without an API gateway, does NOT provide: API key issuance or validation, rate limiting or quota enforcement, real-time request counting, or a developer portal. Stripe must be paired with at least Upstash Redis + your own key management layer.
Option E — Lago (Self-Hosted)
Best for founders who have hit Stripe's 0.7% pain point at scale or need complex metering. Not for early-stage — disproportionate ops burden.
Stack Comparison
| Tool | What It Handles | Monthly Cost (Low Volume <10K req/day) | Vercel Compatible | Complexity |
|---|---|---|---|---|
| DIY FastAPI + Upstash Redis | Rate limiting, API key validation (you build), usage counting | $0 (Upstash free tier) | ✅ Yes — HTTP REST API, no TCP dependency | Medium — must build key management, Stripe webhooks, usage sync |
| Unkey | API key lifecycle (create, validate, rotate, revoke), per-key rate limiting, usage analytics | $0 (150K verifications/month free) | ✅ Yes — HTTP API, SDKs available | Low — replace key management with one API call |
| Zuplo | API gateway (routing + auth + rate limiting + quota enforcement + developer portal + Stripe billing) | $0–$25/month (100K req/month free; $25 Builder) | ✅ Yes — proxy sits in front of Vercel | Low–Medium — configuration-based, no custom code for standard flows |
| Stripe Billing native | Subscription management, usage meter ingestion, invoice generation, payment collection | 0.7% of billing volume | ✅ Yes — HTTP API, official Python SDK | Medium — does NOT handle rate limiting, key management, or quota enforcement |
| Lago (self-hosted) | Complex metering, usage-based billing, invoice generation, prepaid credits, multi-entity | $0 software + ~$20–$50/month VPS hosting | ⚠️ Partial — needs separate hosting; not Vercel-native | High — requires Docker/Kubernetes, PostgreSQL, Redis, ongoing ops |
| Zuplo + Stripe (combined) | End-to-end: gateway + billing + developer portal | $0–$25/month Zuplo + 0.7% Stripe Billing | ✅ Yes | Low — purpose-built for API monetization; least custom code |
Six mistakes that kill conversion or invite abuse.
The Over-Generous Free Tier
From a16z: "Though a generous free tier can initially grow your user base, many users may not see any reason to upgrade to paid tiers if you're meeting most of their needs in the free tier."
Real founder: "We cut our free tier down to 2 prompts and got our first annual subscriber the same week... The product itself remained unchanged; only the access level was modified." (Reddit r/startups, 2025)
Fix: Design the free tier for discovery, not productivity. Users should be able to demonstrate value but not rely on the free tier for production workloads.
Rate Limits Too Low (The Restrictive Trap)
a16z: "If your free tier is either too heavily rate- or feature-limited, customers are not likely to realize the value of your product before leaving."
Fix: Test your own free tier. Can a new user complete a meaningful integration in one session without hitting limits? If not, the limit is too low.
Rate Limiting by IP Instead of API Key
From Statsig's rate limiting guide: "Many users share an IP at work or on mobile carriers; a pure IP cap will punish the innocent. Scope by user or account when possible."
Fix: Rate limit by API key, not IP, for authenticated endpoints. Use IP-based limiting only for unauthenticated signup/health endpoints.
The "Free Forever" Trap
The Heroku case study. Dev.to: "Heroku killed the free tier. The backlash was immense, but the business logic was sound. They had fallen into the trap of subsidizing 'noise.'"
Fix: Never label a tier "free forever." Use "free starter plan" or "free up to X calls/month."
Not Returning Rate Limit Headers
The correct 429 response format:
```http
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735689600
Retry-After: 86400
Content-Type: application/json

{"error": "rate_limit_exceeded", "message": "Free tier limit: 100 calls/day. Upgrade at https://yourapi.com/pricing"}
```
Fix: Always emit X-RateLimit-* + Retry-After headers on 429 — and include a pricing-page link in the JSON body so the user knows what to do next.
Multi-Account Abuse (The Cheapest Attack)
Users create multiple free-tier accounts to bypass per-account limits. Practical solo-founder mitigation stack (no security team required):
- Require API key for all requests — no anonymous access
- Rate limit by API key (not IP) in application middleware
- Add Cloudflare free plan in front of Vercel (DDoS protection, bot filtering)
- Monitor Redis counters for anomalous usage patterns with a simple cron alert
- Implement a `flagged_keys` set in Redis for manual key suspension
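The monitoring step in the list above can start as a very small script. The sketch below flags keys whose daily usage is far above the median; the function name, the 10x multiplier, and the 500-call floor are assumptions to tune, and the input dict stands in for counters you would read from Redis.

```python
from statistics import median

def find_anomalous_keys(daily_calls: dict[str, int], multiplier: float = 10.0,
                        min_calls: int = 500) -> list[str]:
    """Return keys whose usage exceeds both a floor and a multiple of the median."""
    if not daily_calls:
        return []
    typical = median(daily_calls.values())
    # The floor stops a handful of near-idle keys from making everyone look anomalous.
    threshold = max(min_calls, typical * multiplier)
    return sorted(key for key, calls in daily_calls.items() if calls > threshold)

usage = {"key_a": 40, "key_b": 55, "key_c": 60, "key_d": 9_000}
print(find_anomalous_keys(usage))  # ['key_d']
```

A cron job could run this once a day and add the returned keys to a review list rather than suspending them automatically.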
Eight questions an API operator actually asks.
Q1: Do I need an API gateway (Zuplo, Kong) or can I just use FastAPI middleware?
For a solo founder at early stage, FastAPI middleware + Upstash Redis is sufficient. API gateways add real value when you need: (1) multi-region enforcement, (2) a customer-facing developer portal, (3) complex routing across multiple backend services, or (4) Stripe billing integration without custom code. At <$10K MRR, the $25/month Zuplo Builder tier or Unkey is a reasonable way to outsource API key management. A full Kong/Tyk self-hosted deployment is overkill.
Q2: When should I add Stripe metered billing vs. just using flat subscriptions?
Start with flat subscriptions. Add usage-based metering when: (1) you have at least 3–5 paying customers hitting the ceiling of their tier, (2) you have clear evidence high-usage customers would pay more, and (3) you have Upstash Redis counter logging (so you have the usage data). Metered billing adds ~2–4 hours of implementation time using Stripe Meters, but requires ongoing operational attention (monitoring for missed events, handling async aggregation lag).
Q3: What's the minimum viable rate limiting stack for a FastAPI app on Vercel today?
- `pip install upstash-ratelimit upstash-redis`
- Create a free Upstash Redis database (10K commands/day free)
- Add `UPSTASH_REDIS_REST_URL` and `UPSTASH_REDIS_REST_TOKEN` to Vercel environment variables
- Add a `check_rate_limit(api_key)` dependency to your routes using `FixedWindow`
- Return 429 with `Retry-After` header when limit exceeded
Total cost: $0 at under 10K requests/day.
Q4: How do I handle the gap between "user hits free tier limit" and "user upgrades to paid"?
Design for a 24–72 hour conversion window. When a user hits their limit: (1) return a 429 with a link to your pricing page, (2) send an email (if you have their address) with a direct Stripe checkout link, (3) optionally soft-block for 1 hour rather than the full day to create urgency without frustration. Most conversion decisions happen within 72 hours of hitting a meaningful limit.
Q5: What are Stripe's fees for a small API with $2,000 MRR?
Stripe Billing (usage tracking): 0.7% × $2,000 = $14/month. Card processing: ~2.9% + $0.30/transaction. For 10 customers paying $200/month: (2.9% × $2,000) + (10 × $0.30) = ~$61/month. Total Stripe take: ~$75/month (~3.8% of revenue). At $2K MRR this is negligible. Stripe becomes expensive above ~$50K MRR, where alternatives (Paddle, Lago) are worth evaluating.
Q6: Can I use Stripe's free tier as my product's free tier?
No. Stripe Billing has no concept of a "free tier" for your end customers — Stripe charges you 0.7% on all billing volume it processes. Your product's free tier is implemented in your application layer (Redis rate limiter + feature flags), not in Stripe. Stripe only enters the picture when a customer creates a paid subscription.
Q7: What's the difference between a hard gate and a soft gate?
Hard gate: when a user hits their free tier limit, the API returns 429 and blocks all further requests until the window resets. Creates maximum upgrade pressure but also maximum frustration.
Soft gate: usage continues but the response includes a warning header or payload ("You've used 90% of your free tier"). Overages may be billed at a per-call rate (requires Stripe Meters).
Recommendation: hard gate for free→paid conversion. Soft gate (with overage billing) for paid tier ceiling management.
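The difference between the two gates fits in one decision function. This is a sketch with hypothetical names (`GateDecision`, `evaluate_gate`) and an assumed 90% warning threshold; billing the overage would still go through Stripe Meters as described earlier.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    status_code: int
    warning: Optional[str] = None
    billable_overage: int = 0

def evaluate_gate(used: int, quota: int, hard: bool, warn_at: float = 0.9) -> GateDecision:
    """Decide the outcome of the request that would bring usage to used + 1."""
    after = used + 1
    if after > quota:
        if hard:
            return GateDecision(False, 429)          # hard gate: block until reset
        # Soft gate: serve the request and record one billable overage call.
        return GateDecision(True, 200, "Quota exceeded; overage rates apply",
                            billable_overage=1)
    if after >= quota * warn_at:
        return GateDecision(True, 200, f"You've used {after * 100 // quota}% of your quota")
    return GateDecision(True, 200)

print(evaluate_gate(100, 100, hard=True).status_code)        # 429
print(evaluate_gate(100, 100, hard=False).billable_overage)  # 1
print(evaluate_gate(94, 100, hard=True).warning)             # You've used 95% of your quota
```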
Q8: How do I store and validate API keys in a FastAPI app?
Minimal implementation — store a SHA-256 hash of the key in your DB (never store plaintext):
```python
import hashlib
import secrets

from fastapi import Depends, Header, HTTPException

def generate_api_key() -> tuple[str, str]:
    """Returns (plaintext_key_for_user, hashed_key_for_db)"""
    plaintext = "sk_" + secrets.token_urlsafe(32)
    hashed = hashlib.sha256(plaintext.encode()).hexdigest()
    return plaintext, hashed

# get_db is your own dependency that yields a database handle (not shown here)
async def validate_key(x_api_key: str = Header(...), db=Depends(get_db)):
    key_hash = hashlib.sha256(x_api_key.encode()).hexdigest()
    record = await db.get_api_key(key_hash)
    if not record or record.revoked:
        raise HTTPException(status_code=401, detail="Invalid or revoked API key")
    return record
```
For production, consider Unkey to avoid building this yourself — it handles hash storage, key rotation, expiry, and global distribution out of the box.
The spec is written. The tools exist. The pricing is known.
Upstash Redis is free under 10K requests/day. Stripe Billing costs $0 until you have revenue. Unkey's free tier covers 150,000 key verifications per month. For a solo founder with a working FastAPI/Vercel API, the path from "no billing" to "free tier with paid upgrade" is a 6–12 hour project using the stack above. The earlier you instrument your gate, the more usage data you have when your first paying customer appears — and the more confidently you can price.
The rest of the AgentMall build is one spoke away. Read the AgentMall 30-Day Roadmap.