§1 · The Agent-Fraud Problem
Auth Passed, Payment Cleared — It's Still Fraud.
Traditional fraud prevention assumes a human at the keyboard. A card-present signal, a behavioral biometric, a CAPTCHA challenge, a 3DS step-up — all of these rely on biological latency and physical context. When the buyer is an agent, those assumptions collapse entirely.
Consider the threat model after a principal has successfully completed OAuth + PKCE authorization: valid access token, authorized scopes, spending limit intact, Stripe payment method on file. The transaction will clear. But the token leaks — intercepted from an insecure agent environment, logged to stdout, stored unencrypted in agent memory, or exfiltrated via prompt injection against the agent's tool chain. The attacker now has everything they need, and standard fraud signals offer nothing useful in response.
Agents do not move a mouse. They do not pause before checkout. They do not produce browser fingerprints, device sensor readings, or human typing cadence. Radar's passive behavioral signals — the ones that catch human fraudsters — produce null or anomalous readings for legitimate agents and fraudulent token replays alike. Worse: agents retry harder than humans. A soft decline that causes a human to abandon checkout becomes a programmatic retry loop. Stripe's Adaptive Acceptance ML was trained predominantly on human patterns; the retry-aggressiveness signal that flags a human fraud attempt may appear as normal agentic behavior.
This is the structural problem: fraud does not disappear when authentication is good. It shifts. The attack surface moves from pre-auth to post-auth. The attacker's goal is no longer to steal credentials — it is to steal or misuse the downstream artifact of successful authentication: the access token, the payment session, the Shared Payment Token.
Post-Auth Attack Taxonomy
Speculation Flag
Agent-specific fraud rate data is not yet publicly available as a distinct metric. Industry figures for card-not-present fraud rates (typically cited in the 0.06%–0.10% range by Nilson Report) are the closest proxy, but agent transactions carry a distinct risk profile — programmatic retry, no behavioral friction, off-session flow — that likely places them closer to the higher-risk end of the CNP distribution. Until agent-specific fraud data is published by card networks or processors, treat all percentage figures extrapolated from CNP data as provisional and flag accordingly.
| Attack Type | Mechanism | Auth-Layer Visibility | Payment-Layer Signal |
| --- | --- | --- | --- |
| Token exfiltration | Attacker steals bearer access token from agent runtime via prompt injection, log leak, or memory dump | None (token was legitimately issued) | Token age anomaly; new IP/device; velocity spike |
| Phished principal credentials | Attacker hijacks the principal's OAuth session before grant; agent proceeds with attacker-controlled grant | Detectable via unusual AS login signal, not at payment layer | Geographic anomaly; unusual MCC |
| Spending cap probing | Many small transactions just under RAR max_amount to map enforcement boundary (card testing adapted for agents) | Not visible at auth layer | Hourly velocity cluster below a threshold |
| Account takeover via compromised agent platform | Attacker breaches the agent platform itself; all delegated tokens now under attacker control | None (tokens appear legitimate) | Sudden velocity spike across many principals simultaneously |
| Merchant category drift | Agent authorized for "software subscriptions" makes purchases in "electronics" or "wire transfers" | Detectable via RAR authorization_details mismatch if AS enforces strictly | MCC not in principal's typical MCC list |
- Auth Layer (DPoP + RAR): Binds tokens to client key pairs and constrains spending parameters at issuance time. Catches token exfiltration and cap overrun before payment is attempted.
- Payment Layer (Stripe Radar + Metadata): Custom rules keyed to agent metadata catch velocity probing, cross-border anomalies, and category drift that the auth layer never sees.
- Dispute Layer (Evidence Package): OAuth grant log + RAR authorization_details + token introspection record + DPoP binding proof = your chargeback defense under Visa 10.4 / Mastercard 4837.
§2 · Stripe Radar Deep Dive
Three Radar Tiers — Only One Works for Agent Commerce.
Stripe Radar is trained on data from merchants processing more than $1.9 trillion in payments annually. Stripe reports a 92% probability that any card presented has been seen before on the Stripe network — dense cross-merchant signal that standalone fraud tools cannot replicate. But the base tier gives you ML scoring without the ability to inject the context that makes agent transactions distinct. For agent commerce, Radar for Fraud Teams is the operative tier.
Radar Tier Comparison
| Feature | Radar (Base ML) | Radar for Fraud Teams | Stripe Sigma |
| --- | --- | --- | --- |
| Pricing | Included with Stripe payments, no additional charge | ~$0.02/screened transaction (re-verify before launch at stripe.com/radar/pricing) | Included with Stripe payments; Data Pipeline pricing separate (re-verify) |
| ML fraud scoring | ✓ Every transaction scored 0–100 | ✓ Same plus custom ML | N/A (analytics layer) |
| Risk levels | normal, elevated, highest | Same, plus adjustable thresholds | Queryable via SQL |
| Custom rules engine | ✗ No custom rules | ✓ Hundreds of attributes; metadata support via double-colon syntax | N/A |
| AI-powered Radar Assistant | ✗ | ✓ Natural language → rule syntax | N/A |
| Block/allow lists | Basic | ✓ Full (card, email, IP) | N/A |
| Backtesting on 6-month history | ✗ | ✓ | N/A |
| Dispute probability score | ✗ | ✓ ML-based likelihood of winning dispute | Queryable |
| Smart Refunds | ✗ | ✓ | N/A |
| Agent-specific custom metadata rules | ✗ No mechanism to read your metadata | ✓ Double-colon syntax for all metadata fields you pass | Post-hoc query only |
Critical
Stripe Radar base ML does NOT detect agent-specific fraud out of the box. It cannot know the token was issued 45 minutes ago, or that the principal's typical MCC is 5734, unless you pass that data via metadata. Without Fraud Teams and custom metadata rules, Radar treats agent transactions as unusual-but-unclassified CNP traffic and will miss the most dangerous attack vectors.
Custom Rules Syntax
Radar for Fraud Teams rules follow the structure {action} if {attribute} {operator} {value}. Actions are Block, Review, Allow, and Request 3D Secure.
Standard attributes use single colons: :risk_score:, :amount_in_usd:, :ip_country:, :card_country:, :minutes_since_customer_was_created:. Metadata attributes on the payment use double colons: ::agent_token_age_minutes::, ::principal_typical_mcc::, ::is_agent_flow::. Metadata stored on the Customer object is addressed with a customer prefix inside the double colons: ::customer:field_name::. Compound conditions use AND, OR, NOT.
Radar Rule — Compound Condition Example
Block if ::is_agent_flow:: = 'true' AND :amount_in_usd: > 50 AND :minutes_since_customer_was_created: < 60
Radar Rule — Cross-Border Agent Review
Review if :ip_country: != :card_country: AND ::is_agent_flow:: = 'true'
ML Scoring and Risk Thresholds
Every transaction receives a risk_score from 0 (least risky) to 100 (riskiest). Default thresholds: ≥65 = elevated, ≥75 = highest. These are adjustable in Risk Settings for Fraud Teams accounts. Custom rules can target :risk_score: directly: Block if :risk_score: > 85. The Fraud Teams tier also exposes :bot_score:, :fraudulent_dispute_score:, and :early_fraud_warning_score: as rule attributes.
For agent commerce, consider lowering the elevated threshold from 65 to 60. Agent flows generate no behavioral biometric signal — the absence of those signals artificially deflates the risk score on what may be a fraudulent replay.
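To see which scores a threshold change would actually reclassify, the level mapping can be mirrored locally. This is a minimal sketch, not Stripe code: the function name `risk_level` and its parameters are hypothetical, and the defaults are simply the 65/75 thresholds quoted above.

```python
def risk_level(score: int, elevated_at: int = 65, highest_at: int = 75) -> str:
    """Map a 0-100 risk score to the level names used in this section.

    Hypothetical local mirror of the thresholds described in the text;
    Radar itself assigns the real level.
    """
    if not 0 <= score <= 100:
        raise ValueError("risk score must be between 0 and 100")
    if elevated_at > highest_at:
        raise ValueError("elevated threshold cannot exceed highest threshold")
    if score >= highest_at:
        return "highest"
    if score >= elevated_at:
        return "elevated"
    return "normal"


# Scores whose level changes when the elevated threshold drops from 65 to 60.
reclassified = [s for s in range(101) if risk_level(s) != risk_level(s, elevated_at=60)]
```

Only the 60 to 64 band moves from normal to elevated, so the change is narrow enough to evaluate against review-queue volume before committing to it.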
Velocity Attributes Built into Radar
| Attribute | Counts | Agent Use |
| --- | --- | --- |
| :authorized_charges_per_email_hourly: | Successful charges keyed to email in past hour | Cap probing detection: cluster of micro-transactions from one principal |
| :blocked_charges_per_email_hourly: | Blocked charges keyed to email in past hour | Retry storm detection after an initial block |
| :email_count_for_ip_hourly: | Distinct emails from same IP in past hour | Platform-breach detection: attacker using same infra for many principals |
| :minutes_since_per_payment_instrument_fingerprint_first_seen: | Age of payment instrument in Stripe's network | New card + immediate agent use = card-on-file compromise pattern |
§3 · Adaptive Acceptance
How Agents Break the Retry Model.
Stripe's Adaptive Acceptance uses ML to identify false declines and retry them with optimized parameters — message format, routing, postal code normalization — invisibly, before the customer sees the decline. In 2024, Adaptive Acceptance recovered a record $6 billion in falsely declined transactions, a 60% year-over-year increase in retry success rate. Stripe recently migrated from a gradient-boosted XGBoost model to a TabTransformer+ deep neural network architecture, achieving 70% greater precision in identifying legitimate declined transactions.
The Agent-Specific Retry Problem
Adaptive Acceptance's retry signal is tuned on predominantly human behavioral patterns. Legitimate agents retry programmatically — there is no human abandonment signal after a soft decline. Fraudulent token reuse also retries programmatically. From the ML's perspective, both look similar. This creates two concrete operational risks:
- Over-retry fines. Mastercard allows 35 retry attempts per card per rolling 30-day period; Visa allows 15. Exceeding these can result in fines from acquirers. Agents that implement their own retry logic outside of Stripe's managed flows may inadvertently hit these limits. Use Stripe's Smart Retries rather than raw retry loops.
- False-negative pressure on fraud detection. The retry-aggressiveness signal that flags a human fraud attempt may appear as normal agentic behavior, reducing Radar's effective detection rate on fraudulent token replays.
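For agents that do run their own retry logic, a per-card budget is one way to stay under the limits quoted above. The sketch below is illustrative only: `RetryBudget`, `try_consume`, and the safety margin are hypothetical names, the store is in-memory, and the 15/35 figures are taken from the text and should be confirmed against current network rules.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone

# Limits as stated in the text above; confirm against current network rules.
RETRY_LIMITS = {"visa": 15, "mastercard": 35}
WINDOW = timedelta(days=30)


class RetryBudget:
    """Tracks retry attempts per card fingerprint in a rolling 30-day window."""

    def __init__(self, safety_margin: int = 2):
        # Stop a little short of the network limit to leave headroom.
        self.safety_margin = safety_margin
        self._attempts = defaultdict(deque)

    def try_consume(self, network: str, card_fingerprint: str, now: datetime) -> bool:
        """Record one retry if the budget allows it; return False otherwise."""
        limit = RETRY_LIMITS[network] - self.safety_margin
        attempts = self._attempts[(network, card_fingerprint)]
        # Drop attempts that have aged out of the rolling window.
        while attempts and now - attempts[0] >= WINDOW:
            attempts.popleft()
        if len(attempts) >= limit:
            return False
        attempts.append(now)
        return True
```

An agent would call `try_consume` before each retry and treat `False` as a hard stop. In production the counter would need to live in shared durable storage so that every agent worker sees the same budget.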
| Signal | Human Behavior | Legitimate Agent Behavior | Fraudulent Token Replay | Distinguishing Feature |
| --- | --- | --- | --- | --- |
| Retry after soft decline | Rare: human abandons or uses different card | Programmatic: retries immediately per configured logic | Programmatic: retries aggressively until hard decline or success | Token metadata (age, binding, principal context), NOT retry pattern |
| Browser fingerprint | Rich device + behavioral signal | Headless / absent | Headless / absent | Not distinguishing: both absent |
| Typing cadence / mouse movement | Unique behavioral biometric | Not applicable | Not applicable | Not distinguishing: both absent |
| Transaction timing | Business hours, human time zones | May run 24/7 per principal's scheduled workflow | Often off-hours; attacker rushes token use | Time-of-day in principal's timezone is a weak but useful signal |
| Geographic consistency | IP near cardholder billing address | Agent IP may differ from principal home country | IP in attacker's country, different from principal | ::principal_home_country:: metadata makes this a strong signal |
Operator Note
The distinguishing feature between a legitimate agent retry and a fraudulent token replay is not behavioral latency. It is token metadata: age since issuance, DPoP binding proof, principal context. Stripe does not act on merchant-supplied metadata by itself, so this context reaches the payment layer only through custom Radar rules that reference it. Operators who instrument Stripe charges with this metadata can write rules calibrated to agent behavioral baselines instead of waiting for models trained on human traffic to adapt.
§4 · Agent-Specific Risk Signals
Eight Signals Base Radar Cannot See Without Your Metadata.
Speculation Flag
Empirical agent-specific fraud rate data does not yet exist at publication volume. The signal patterns below are derived from card-testing, account takeover, and token theft research adapted to the agentic context. These are the closest available proxies, not confirmed agent fraud benchmarks.
| Signal | What to Detect | Radar Rule Pattern | Disposition |
| --- | --- | --- | --- |
| Token issued recently + high amount | Leaked token rushed into use before principal notices | Block if ::agent_token_age_minutes:: < 60 AND :amount_in_usd: > 50 | Block |
| Spending cap probing | Many small transactions just under RAR max_amount; adapted card-testing for agents | Review if ::is_agent_flow:: = 'true' AND :authorized_charges_per_email_hourly: > 5 AND :amount_in_usd: < 10 | Flag for review; escalate if pattern continues |
| Token reuse from new IP/device | Attacker replaying token from different infrastructure | Review if ::is_agent_flow:: = 'true' AND :ip_country: != ::principal_home_country:: | Flag; correlate with login anomaly signals |
| Merchant category drift | Agent authorized for software purchases attempting electronics or wire_transfer | Block if ::is_agent_flow:: = 'true' AND NOT (:mcc: IN ::principal_typical_mcc_list::) | Block or Request 3D Secure |
| Time-of-day anomaly | 3 AM in principal's timezone; no legitimate principal-initiated agent tasks at that hour | Review if ::is_agent_flow:: = 'true' AND ::transaction_hour_principal_tz:: < 5 | Flag for review |
| Cross-border anomaly | IP geolocation inconsistent with principal residence and card issuer country | Block if :ip_country: != :card_country: AND ::is_agent_flow:: = 'true' | Block |
| Velocity inconsistent with principal history | Rate far exceeding principal's historical transaction frequency | Review if :authorized_charges_per_email_hourly: > ::principal_baseline_hourly_txn_count:: | Flag |
| New payment instrument + agent flow | Fresh card added to principal account then immediately used by agent | Review if ::is_agent_flow:: = 'true' AND :minutes_since_per_payment_instrument_fingerprint_first_seen: < 1440 | Flag |
Spending Cap Probing in Detail
Classic card testing involves running many small transactions across stolen cards to identify which are active. In the agent context the attack adapts: the attacker holds a valid token with a known max_amount embedded in the RAR authorization_details. Rather than probing card validity, they probe for three things:
- Authorization server enforcement boundaries: does the AS actually reject charges above max_amount, or does it pass them through?
- Merchant-side rate limits: how many transactions per hour does the platform allow before triggering a review?
- Risk score calibration: do small transactions below a threshold go through without Radar review?
The signal is a cluster of agent-attributed transactions from the same principal, same card, same time window, all just below either the RAR spending cap or the apparent Radar block threshold. Radar's velocity attribute :authorized_charges_per_email_hourly: will catch high-volume probing. For sophisticated low-velocity probing, the time-window anomaly is the more reliable signal.
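The time-window cluster described above can be checked on the merchant side before a charge ever reaches Stripe. The following is a rough sketch under stated assumptions: the function name, the 90% "near cap" ratio, the one-hour window, and the minimum cluster size of three are all hypothetical tuning values, not figures from Stripe or the card networks.

```python
from datetime import datetime, timedelta, timezone


def looks_like_cap_probing(charges, cap_usd, window=timedelta(hours=1),
                           near_cap_ratio=0.9, min_cluster=3):
    """Return True if at least `min_cluster` charges inside `window` fall in
    the band [near_cap_ratio * cap_usd, cap_usd].

    charges: iterable of (timestamp, amount_usd) pairs for one principal and card.
    All thresholds are illustrative defaults.
    """
    floor = cap_usd * near_cap_ratio
    near_cap_times = sorted(ts for ts, amount in charges if floor <= amount <= cap_usd)
    start = 0
    for end, ts in enumerate(near_cap_times):
        # Slide the left edge forward until the span fits in the window.
        while ts - near_cap_times[start] > window:
            start += 1
        if end - start + 1 >= min_cluster:
            return True
    return False
```

The same function can be pointed at the apparent Radar block threshold instead of the RAR cap by passing that amount as `cap_usd`.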
Under Visa's VAMP program, even 300,000 combined approved + declined authorization attempts per month can trigger acquirer scrutiny — counts, not just dollar losses, matter. Agent-generated micro-transactions aggregate toward these thresholds faster than human ones.
Required Metadata Fields — Pass on Every Agent Charge
Stripe metadata payload — stripe.paymentIntents.create()
{
  "is_agent_flow": "true",
  "agent_token_age_minutes": "45",
  "principal_home_country": "US",
  "principal_typical_mcc": "5734",
  "spending_cap_usd": "200",
  "rar_reference_id": "txn-ref-abc123",
  "transaction_hour_principal_tz": "14",
  "retry_count": "0"
}
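One way to assemble this payload is to derive the time-dependent fields at charge time rather than hand-filling them. The sketch below is an assumption-laden illustration: `build_agent_metadata` and its parameters are hypothetical names, and it only builds the dictionary, leaving the actual Stripe call to your integration. The size checks reflect Stripe's documented metadata limits (50 keys, 40-character keys, 500-character values).

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Stripe's documented metadata limits: at most 50 keys, key names up to
# 40 characters, values up to 500 characters. Values are stored as strings.
MAX_KEYS, MAX_KEY_LEN, MAX_VALUE_LEN = 50, 40, 500


def build_agent_metadata(*, token_issued_at: datetime, now: datetime,
                         principal_tz: tzinfo, principal_home_country: str,
                         principal_typical_mcc: str, spending_cap_usd: int,
                         rar_reference_id: str, retry_count: int = 0) -> dict:
    """Build the payload shown above (hypothetical helper).

    token_issued_at and now must be timezone-aware datetimes.
    """
    token_age_minutes = int((now - token_issued_at).total_seconds() // 60)
    if token_age_minutes < 0:
        raise ValueError("token_issued_at is later than now")
    metadata = {
        "is_agent_flow": "true",
        "agent_token_age_minutes": str(token_age_minutes),
        "principal_home_country": principal_home_country.upper(),
        "principal_typical_mcc": principal_typical_mcc,
        "spending_cap_usd": str(spending_cap_usd),
        "rar_reference_id": rar_reference_id,
        # Local hour (0-23) in the principal's timezone, computed server-side.
        "transaction_hour_principal_tz": str(now.astimezone(principal_tz).hour),
        "retry_count": str(retry_count),
    }
    if len(metadata) > MAX_KEYS:
        raise ValueError("too many metadata keys")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_LEN or len(value) > MAX_VALUE_LEN:
            raise ValueError(f"metadata entry {key!r} exceeds Stripe size limits")
    return metadata


now = datetime(2026, 6, 3, 18, 23, 18, tzinfo=timezone.utc)
example = build_agent_metadata(
    token_issued_at=now - timedelta(minutes=45, seconds=18),
    now=now,
    principal_tz=timezone(timedelta(hours=-4)),  # fixed UTC-4 offset, for illustration
    principal_home_country="us",
    principal_typical_mcc="5734",
    spending_cap_usd=200,
    rar_reference_id="txn-ref-abc123",
)
```

A named zone such as `zoneinfo.ZoneInfo("America/New_York")` can be passed as `principal_tz` where the system timezone database is available; the fixed offset is used here only to keep the example dependency-free.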
§5 · DPoP Integration
Shrinking the Fraud Blast Radius — RFC 9449 Token Binding.
DPoP (Demonstrating Proof of Possession), standardized as RFC 9449 in September 2023, binds OAuth access and refresh tokens to a public/private key pair held by the client. An attacker who exfiltrates the access token without the matching private key cannot use it: they cannot produce a valid DPoP proof, so the resource server rejects the request. For agent commerce, this is the single highest-leverage auth-layer fraud mitigation against token exfiltration.
| Scenario | Without DPoP | With DPoP (RFC 9449) | Residual Risk |
| --- | --- | --- | --- |
| Bearer token exfiltrated from logs | Full spending authority up to RAR limits; attacker can use immediately | Token useless without private key; attack neutralized | None from this vector |
| Access token intercepted in transit | Replay possible; bearer token usable anywhere the token is accepted | Proof JWT contains htu (target URL) + htm (method) + fresh iat; replay blocked by RS | Millisecond-window replay; mitigated by server nonces (RFC 9449 §8) |
| Full device/runtime compromise | Attacker controls token | Attacker also controls private key; DPoP provides no protection | Requires platform-layer compromise detection |
| Refresh token theft | Attacker can obtain new access tokens indefinitely | Refresh tokens are also DPoP-bound; same private key required | None from this vector if refresh tokens are bound |
| Prompt injection → agent makes fraudulent purchase | Agent's legitimate token used for attacker-directed transaction | DPoP provides no protection: the legitimate client is making the call | Requires Radar rules + RAR MCC enforcement |
DPoP Flow for Agent Commerce
The client generates an asymmetric key pair (P-256/ES256). On every token request and every resource request, the client signs a fresh proof JWT containing the HTTP method (htm), endpoint URL (htu), current timestamp (iat), and a unique jti identifier. The authorization server computes the SHA-256 thumbprint of the public key and embeds it as jkt inside a cnf claim in the access token. The resource server verifies the DPoP proof on each API call, including the merchant API call that triggers the Stripe charge, confirming the presenting client holds the private key matching the bound public key. Stripe does not check the proof itself, so verification must complete at the merchant's resource server before the payment call is made.
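The binding check at the resource server reduces to recomputing the key thumbprint and comparing it with the token's cnf.jkt value. The sketch below assumes an EC P-256 public key in JWK form and follows the RFC 7638 thumbprint construction; the function names are hypothetical, and it deliberately omits signature verification of the proof JWT, which a real resource server must perform first with a JOSE library.

```python
import base64
import hashlib
import hmac
import json


def jwk_thumbprint_p256(jwk: dict) -> str:
    """RFC 7638 SHA-256 thumbprint of an EC public JWK, base64url without padding.

    Only the required members (crv, kty, x, y) are hashed, in lexicographic
    order with no whitespace.
    """
    required = {name: jwk[name] for name in ("crv", "kty", "x", "y")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_is_bound_to_key(access_token_claims: dict, proof_jwk: dict) -> bool:
    """Compare the token's cnf.jkt with the thumbprint of the key in the proof.

    Assumes the proof's signature has ALREADY been verified against proof_jwk;
    this function checks the binding only.
    """
    expected = access_token_claims.get("cnf", {}).get("jkt", "")
    actual = jwk_thumbprint_p256(proof_jwk)
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))
```

The value returned by `jwk_thumbprint_p256` is also what would be stored as `dpop_jkt` in a transaction record for later dispute evidence.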
Non-extractable key generation is essential: in browser environments, call crypto.subtle.generateKey with its extractable argument set to false; in native agent runtimes, use hardware-backed secure enclaves. A DPoP private key that can be copied out of the runtime can be stolen alongside the token, leaving it little stronger than a bearer token.
Server-issued nonces (RFC 9449 §8) narrow the proof validity window, preventing even millisecond-scale replay of intercepted proofs. Configure your authorization server to issue nonces and require clients to persist the latest DPoP-Nonce response header for subsequent proofs.
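The claim-level checks that make replay hard (method, URL, freshness, nonce, and one-time jti) can be sketched as a single function. This is a simplified illustration: the function name, the 60-second age limit, and the in-memory `seen_jti` set are assumptions, signature verification is assumed to have happened already, and htu is compared as an exact string where RFC 9449 calls for ignoring query and fragment.

```python
def check_dpop_proof_claims(claims: dict, *, method: str, url: str, now_epoch: float,
                            expected_nonce: str, seen_jti: set,
                            max_age_seconds: int = 60):
    """Return (ok, reason) for an already signature-verified DPoP proof.

    Hypothetical helper; thresholds and storage are illustrative only.
    """
    if claims.get("htm") != method:
        return False, "htm mismatch"
    if claims.get("htu") != url:
        return False, "htu mismatch"
    iat = claims.get("iat")
    if not isinstance(iat, (int, float)) or abs(now_epoch - iat) > max_age_seconds:
        return False, "stale or missing iat"
    if claims.get("nonce") != expected_nonce:
        # RFC 9449 error code telling the client to retry with the fresh nonce.
        return False, "use_dpop_nonce"
    jti = claims.get("jti")
    if not jti or jti in seen_jti:
        return False, "jti missing or replayed"
    seen_jti.add(jti)
    return True, "ok"
```

On a `use_dpop_nonce` result the server would return the current nonce in the DPoP-Nonce response header so the client can sign a new proof. A shared cache with expiry would replace the in-memory set in any multi-instance deployment.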
OAuth 2.1 and the Model Context Protocol both list sender-constrained tokens — DPoP or mTLS — as recommended hardening for public clients, which now explicitly includes AI agents. FAPI 2.0 (the open-banking security profile) names DPoP as one of two acceptable sender-constraining mechanisms.
Cross-Link
Full DPoP implementation specifics — authorization server configuration, PKCE integration, token introspection response shape including the cnf claim — are covered in the OAuth + PKCE + DPoP spoke. This section covers only the fraud-prevention angle.
§6 · Authorization-Layer Fraud Prevention
RAR + Introspection — Spending Limits the AS Actually Enforces.
Rich Authorization Requests (RFC 9396) allow OAuth clients to embed structured, transaction-level permission details in the authorization request via the authorization_details parameter. For agent commerce, this delivers spending caps, MCC restrictions, single-use authorization references, and location binding — all embedded at token issuance time.
Global Correction
OAuth scopes alone are NOT sufficient for agent spending limits. Scopes like purchase:limited convey a category of permission but carry no structured amount, currency, or MCC constraints. RFC 9396 RAR's authorization_details is the canonical answer. A scope string cannot be enforced as a $150 spending cap; "max_amount": 150, "currency": "USD" in authorization_details can.
Four RAR Controls for Agent Commerce
| Control | authorization_details Field | Fraud Type Caught | Enforcement Gap |
| --- | --- | --- | --- |
| Spending cap | "max_amount": 150, "currency": "USD" | Amount overrun by exfiltrated token or compromised agent | Only as good as AS + RS enforcement; Stripe is unaware of RAR limits unless you propagate them via metadata |
| MCC restriction | "allowed_mcc": ["5045", "5734"] | Merchant category drift: agent authorized for software used in electronics or wire transfers | Same enforcement gap; must propagate to Stripe metadata for a second check |
| Single-use reference | Unique reference number in authorization_details | Token replay: same authorization used for multiple transactions | Requires idempotent reference registry at resource server; must mark reference as consumed on first use |
| Location binding | "locations": ["https://api.merchant-a.com"] | Token accepted at wrong merchant endpoint | Requires AS to embed locations array in issued token and RS to validate against its own URI |
Two-Layer Enforcement Architecture
RAR enforcement happens at the authorization server and resource server, not at Stripe. If the merchant's own API does not validate the introspected authorization_details before calling stripe.paymentIntents.create(), RAR provides no protection at the payment layer. The two layers must be explicitly wired together.
Two-Layer Flow — AS + RS + Stripe Metadata
1. Principal authorizes via OAuth + PKCE + RAR:
   authorization_details = {
     "type": "agent_purchase",
     "max_amount": 200,
     "currency": "USD",
     "allowed_mcc": ["5734"],
     "single_use_ref": "txn-ref-abc123"
   }
2. Agent calls merchant API with access token + DPoP proof.
3. Merchant RS introspects token → extracts authorization_details:
   - Validates: amount <= max_amount (200)
   - Validates: requested MCC in allowed_mcc (["5734"])
   - Validates: single_use_ref not in consumed-ref registry
   - Marks single_use_ref as consumed atomically
4. Merchant passes authorization_details as Stripe charge metadata:
   metadata = {
     "is_agent_flow": "true",
     "rar_max_amount": "200",
     "rar_allowed_mcc": "5734",
     "rar_reference_id": "txn-ref-abc123",
     "agent_token_age_minutes": "38"
   }
5. Stripe Radar rules fire against that metadata (second enforcement layer).
   If merchant API has a logic error and passes a charge above max_amount,
   Radar catches it here before it clears.
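Steps 3 and 4 of the flow above might look like the following at the merchant resource server. This is a sketch under assumptions: `ConsumedRefRegistry` and `authorize_charge` are hypothetical names, the registry is an in-memory stand-in for a durable store with an atomic insert, and the introspection call and the Stripe call themselves are left out.

```python
import threading
from decimal import Decimal


class ConsumedRefRegistry:
    """In-memory stand-in for a durable single-use reference store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._consumed = set()

    def consume(self, ref: str) -> bool:
        """Atomically mark ref as used; return False if it was already consumed."""
        with self._lock:
            if ref in self._consumed:
                return False
            self._consumed.add(ref)
            return True


def authorize_charge(details: dict, *, amount, currency: str, mcc: str,
                     registry: ConsumedRefRegistry) -> dict:
    """Validate a charge against introspected authorization_details.

    Returns metadata to attach to the Stripe charge, or raises PermissionError.
    """
    if currency.upper() != details["currency"].upper():
        raise PermissionError("currency not authorized")
    if Decimal(str(amount)) > Decimal(str(details["max_amount"])):
        raise PermissionError("amount exceeds max_amount")
    if mcc not in details["allowed_mcc"]:
        raise PermissionError("MCC not in allowed_mcc")
    # Consume last, so a rejected charge does not burn the reference.
    if not registry.consume(details["single_use_ref"]):
        raise PermissionError("single_use_ref already consumed")
    return {
        "is_agent_flow": "true",
        "rar_max_amount": str(details["max_amount"]),
        "rar_allowed_mcc": ",".join(details["allowed_mcc"]),
        "rar_reference_id": details["single_use_ref"],
    }
```

Checking and marking the reference in one locked operation is the point of the registry: two concurrent requests carrying the same token cannot both pass. A database unique constraint on the reference column would give the same guarantee durably.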
Cross-Link
Full RAR authorization_details schema design, Pushed Authorization Requests (PAR) combination, and AS introspection implementation are covered in the OAuth + PKCE + DPoP spoke.
§7 · Chargeback Economics
Agent-Initiated Purchases — The Merchant Bears the Liability.
Standard card network chargeback rules apply to agent-initiated purchases. Stripe does not currently carve out a separate liability regime for agent commerce. (Re-verify before launch: Stripe's Agentic Commerce Protocol (ACP) and Shared Payment Token (SPT) documentation does not explicitly address chargeback liability allocation as of this writing.)
Current Liability Framework
Under Visa and Mastercard rules, the primary vectors for fraud disputes are Visa reason code 10.4 (Other Fraud: Card-Absent Environment) and Mastercard reason code 4837 (No Cardholder Authorization). The issuing bank credits the cardholder and debits the merchant. Stripe charges a $15 dispute fee per chargeback on US cards (re-verify; fee structures vary by country and account type).
For agent-initiated purchases, the merchant must demonstrate valid principal authorization at the time of purchase, plus proof the transaction was initiated by a party the cardholder authorized. Without the artifacts below, the chargeback will succeed and the merchant absorbs the loss.
Merchant Defense Evidence Package
| Evidence Artifact | What It Proves | Where It Comes From | Retention |
| --- | --- | --- | --- |
| OAuth grant log | Principal explicitly authorized the agent, including scopes and RAR authorization_details | Authorization server grant record, timestamped | Minimum 18 months |
| RAR authorization_details payload | Transaction was within the principal-authorized spending cap and MCC constraints | Introspection response snapshot at charge time | Minimum 18 months |
| Token introspection record | Access token was valid, unrevoked, and within authorization parameters at charge time | Resource server introspection call result, logged atomically with charge | Minimum 18 months |
| Stripe charge metadata | Charge ID, timestamp, and agent-specific metadata linking to the above artifacts | Stripe payment_intent.succeeded webhook payload | Minimum 18 months |
| DPoP binding proof | DPoP public key thumbprint matched the token's cnf claim; requesting client held the private key | Resource server DPoP verification log; dpop_jkt field in transaction record | Minimum 18 months |
Platform-Breach Liability
Speculation Flag
No established card network rule or Stripe policy currently addresses the specific case where an agent platform is compromised and all delegated tokens are used fraudulently at scale. The current expectation, based on standard merchant services agreements, is that the merchant (the agentmall operator) bears liability unless it can prove individual principal consent for each transaction. In a platform-breach scenario, merchants face chargebacks on every transaction made by the attacker-controlled agent — at scale, with limited ability to recover. Mitigation: per-transaction authorization records, RAR single-use reference enforcement, and post-charge notification to principals are the current best practices.
Atomic Transaction Evidence Record — Schema
Durable transaction record — write on payment_intent.succeeded
{
  "stripe_charge_id": "ch_3Px...",
  "stripe_customer_id": "cus_...",
  "oauth_grant_id": "grant_abc123",
  "rar_reference_id": "txn-ref-abc123",
  "token_issued_at": "2026-06-03T14:22:00Z",
  "principal_id": "user_xyz",
  "amount_cents": 4999,
  "currency": "usd",
  "mcc": "5734",
  "agent_platform_id": "platform_abc",
  "dpop_jkt": "sha256_thumbprint_of_public_key",
  "radar_risk_score": 28,
  "charge_timestamp": "2026-06-03T14:23:18Z",
  "retain_until": "2028-01-01T00:00:00Z"
}
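The retention date in a record like this can be derived from the charge timestamp instead of being set by hand. The sketch below covers a subset of the fields above; `add_months`, `to_rfc3339_utc`, and `build_evidence_record` are hypothetical helpers, and the 18-month default is the minimum retention stated in the evidence table, not a card-network constant.

```python
import calendar
from datetime import datetime, timezone


def add_months(moment: datetime, months: int) -> datetime:
    """Add calendar months, clamping the day to the end of the target month."""
    index = moment.year * 12 + (moment.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def to_rfc3339_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as e.g. 2026-06-03T14:23:18Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_evidence_record(*, stripe_charge_id: str, oauth_grant_id: str,
                          rar_reference_id: str, principal_id: str, dpop_jkt: str,
                          amount_cents: int, currency: str, charged_at: datetime,
                          retention_months: int = 18) -> dict:
    """Assemble one linked record; charged_at must be timezone-aware."""
    return {
        "stripe_charge_id": stripe_charge_id,
        "oauth_grant_id": oauth_grant_id,
        "rar_reference_id": rar_reference_id,
        "principal_id": principal_id,
        "dpop_jkt": dpop_jkt,
        "amount_cents": amount_cents,
        "currency": currency.lower(),
        "charge_timestamp": to_rfc3339_utc(charged_at),
        "retain_until": to_rfc3339_utc(add_months(charged_at, retention_months)),
    }
```

Writing the whole dictionary in a single insert keeps the identifiers linked in one row, which is what makes the chain reconstructable when a dispute arrives.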
§8 · Real Radar Rule Examples
Five Complete Rules for Agent Commerce.
All rules below assume you pass the metadata payload from §4 on every agent-initiated charge. All rules are complete and deployable — no truncation. Start every rule in Review mode and backtest against six months of data before switching any to Block.
Rule 1 — Block New Token + High Amount + MCC Mismatch (Composite Risk)
Radar for Fraud Teams — Block rule
Block if ::is_agent_flow:: = 'true'
AND ::agent_token_age_minutes:: < 60
AND :amount_in_usd: > 50
AND NOT (::principal_typical_mcc:: = :mcc:)
Catches: freshly leaked token being rushed into use in the wrong merchant category before the principal notices. The rule requires all three risk signals on top of the agent-flow flag, which dramatically reduces false positives: a legitimate agent will rarely have a brand-new token, a high transaction amount, and a category mismatch simultaneously.
Rule 2 — Flag Spending Cap Probing
Radar for Fraud Teams — Review rule
Review if ::is_agent_flow:: = 'true'
AND :authorized_charges_per_email_hourly: > 5
AND :amount_in_usd: < 10
Catches: micro-transaction probing pattern against the enforcement boundary. More than five successful sub-$10 charges from the same principal email within one hour is an unusual cluster for legitimate agent workflows.
Rule 3 — Flag Cross-Border Anomaly on Agent Traffic
Radar for Fraud Teams — Review rule
Review if ::is_agent_flow:: = 'true'
AND :ip_country: != ::principal_home_country::
Catches: token reuse from attacker infrastructure in a different country from the principal's registered home country. Pass principal_home_country as ISO alpha-2 from your principal profile at charge time.
Rule 4 — Flag Late-Night Agent Transactions
Radar for Fraud Teams — Review rule
Review if ::is_agent_flow:: = 'true'
AND ::transaction_hour_principal_tz:: < 4
Catches: transactions between midnight and 4 AM in the principal's local timezone — unlikely to be principal-scheduled legitimate workflows. Compute the principal's local hour server-side before the Stripe call and pass it as transaction_hour_principal_tz (integer 0–23).
Rule 5 — Flag New Payment Instrument Used Immediately in Agent Flow
Radar for Fraud Teams — Review rule
Review if ::is_agent_flow:: = 'true'
AND :minutes_since_per_payment_instrument_fingerprint_first_seen: < 1440
Catches: newly added card (less than 24 hours since Stripe first saw the card fingerprint across its network) being immediately used via agent. This is the card-on-file compromise pattern — attacker adds a stolen card to the principal's account and routes the agent to use it before the principal notices.
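Before loading rules into Radar, it can help to mirror them in a unit-testable function and run sample charges through it. The sketch below is a local approximation only: it is not how Radar evaluates rules, and the dictionary keys (including `minutes_since_card_first_seen` as shorthand for the long fingerprint attribute) are hypothetical stand-ins for values your own system would supply.

```python
def evaluate_agent_rules(charge: dict) -> list:
    """Return (rule_number, action) pairs that would fire for one charge.

    Local mirror of Rules 1-5 above for test purposes; Radar remains the
    system of record for real decisions.
    """
    meta = charge["metadata"]
    if meta.get("is_agent_flow") != "true":
        return []
    fired = []
    # Rule 1: new token + high amount + MCC mismatch.
    if (int(meta["agent_token_age_minutes"]) < 60
            and charge["amount_in_usd"] > 50
            and meta["principal_typical_mcc"] != charge["mcc"]):
        fired.append((1, "block"))
    # Rule 2: spending cap probing.
    if charge["authorized_charges_per_email_hourly"] > 5 and charge["amount_in_usd"] < 10:
        fired.append((2, "review"))
    # Rule 3: cross-border anomaly.
    if charge["ip_country"] != meta["principal_home_country"]:
        fired.append((3, "review"))
    # Rule 4: late-night transaction in the principal's timezone.
    if int(meta["transaction_hour_principal_tz"]) < 4:
        fired.append((4, "review"))
    # Rule 5: payment instrument first seen less than 24 hours ago.
    if charge["minutes_since_card_first_seen"] < 1440:
        fired.append((5, "review"))
    return fired
```

Feeding known-good and known-bad historical charges through a mirror like this gives a quick sanity check on thresholds before spending time in Radar's backtesting interface.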
Backtest Before Block
Radar for Fraud Teams provides a backtesting interface against the last six months of transactions. Run all new agent rules in Review mode before switching to Block. Use Stripe Sigma to query radar_early_fraud_warnings and disputes tables to understand the base rate of legitimate agent transactions that would be caught. Rules with >1% false positive rate on known-legitimate traffic need threshold adjustment before graduating to Block.
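The 1% graduation threshold can be expressed as a small gate over a labeled sample of known-legitimate charges. This is an illustrative sketch: the function names are hypothetical, the sample would come from your own exports or Sigma queries, and `rule` is any predicate that returns True when a charge would be flagged.

```python
def false_positive_rate(known_legitimate: list, rule) -> float:
    """Share of known-legitimate charges that the rule would flag."""
    if not known_legitimate:
        raise ValueError("need at least one known-legitimate charge")
    flagged = sum(1 for charge in known_legitimate if rule(charge))
    return flagged / len(known_legitimate)


def ready_for_block(known_legitimate: list, rule, max_fp_rate: float = 0.01) -> bool:
    """True if the rule's false positive rate is at or below the threshold."""
    return false_positive_rate(known_legitimate, rule) <= max_fp_rate


def cross_border(charge: dict) -> bool:
    """Example predicate mirroring the cross-border review rule."""
    return charge["ip_country"] != charge["principal_home_country"]
```

A rule that fails the gate stays in Review mode while its thresholds are adjusted, then is re-measured on the same sample.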
§9 · Trust-Layer Interaction Map
How This Plugs Into OAuth + RAR + DPoP + Agents Page.
Fraud prevention does not stand alone. Each layer in the agent-ready commerce architecture catches a different slice of the threat model. The table below maps what this spoke owns versus what the OAuth spoke and the Agents Page spoke own.
| Layer | Technology | What It Catches | What It Misses | Owned By |
| --- | --- | --- | --- | --- |
| Auth Layer | OAuth 2.1 + PKCE + DPoP | Token exfiltration reuse; unauthorized token issuance; stolen refresh tokens | Post-auth transaction anomalies; merchant category drift; spending pattern fraud | OAuth spoke |
| Authorization Scope Layer | RAR (RFC 9396) + introspection | Spending cap overrun; MCC mismatch; single-use token replay; location-bound token misuse | Behavioral velocity anomalies; device/IP fraud signals; timing attacks | OAuth spoke |
| Payment Layer | Stripe Radar + custom rules | Velocity probing; cross-border anomaly; category drift via metadata; time-of-day anomaly; cap probing | Sub-threshold distributed probing across many tokens; fraud that exactly mimics legitimate principal behavior | This spoke |
| Dispute Layer | Stripe Radar dispute score + Sigma + chargeback evidence package | Post-hoc fraud identification; dispute probability scoring; chargeback defense | Does not prevent fraud proactively; retroactive only | This spoke |
| Agent Discovery Layer | /agents page + UCP compatibility | Agent-readable trust signals; capability declarations; refund + dispute policy visibility | Not a fraud prevention layer directly | Agents Page spoke |
4-Layer Agent-Ready Model Integration
The 4-Layer Agent-Ready Model — Structured Data → API Endpoint → MCP Tool Description → UCP Compatibility — intersects with fraud prevention at each layer:
- Structured Data layer: Product catalog + pricing metadata must include MCC codes and authorization policy fields that agents can read before initiating a purchase, enabling pre-purchase authorization checks and reducing MCC-drift fraud.
- API Endpoint layer: The payment endpoint is where Stripe metadata gets attached and Radar rules fire. Instrument here. RAR introspection validation belongs here too.
- MCP Tool Description layer: MCP tool schemas should declare expected spending ranges and merchant categories so agents can surface these to principals for upfront consent, reducing post-purchase disputes.
- UCP Compatibility layer: UCP compatibility signals include fraud policy declarations — refund policy, dispute process, authorization scope requirements — that honest agents use for routing and that fraud detection systems can validate against post-transaction.
§10 · Common Mistakes
Eight ways agent fraud prevention breaks in production.
1. Relying on Stripe's base Radar ML without custom rules
The base ML tier has no mechanism for agent-specific signals. It cannot know the token was issued 45 minutes ago, or that the principal's typical MCC is 5734, unless you pass that data. Without Fraud Teams and custom metadata rules, Radar treats agent transactions as unusual-but-unclassified CNP traffic and will miss the most dangerous attack vectors. Fix: upgrade to Radar for Fraud Teams. Pass is_agent_flow, agent_token_age_minutes, principal_typical_mcc, and principal_home_country on every agent-initiated charge.
2. Passing RAR authorization_details to the AS but not forwarding enforced parameters to Stripe
Operators often implement RAR at the OAuth layer (good) but then call stripe.charges.create() without propagating the max_amount or MCC constraints as metadata. Radar has no visibility into what the AS authorized. The two layers are disconnected. Fix: at the merchant API layer, extract authorization_details from the introspection response and attach the relevant fields as Stripe charge metadata before the payment call.
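A minimal sketch of that propagation step, in Python for brevity. It assumes the introspection response carries an authorization_details array (as RAR allows) and that your AS uses the field names reference_id, max_amount_cents, and allowed_mccs. Those field names, the payment_initiation type, and the rar_* metadata keys are hypothetical; substitute whatever your AS actually issues.

```python
from typing import Any, Dict


def rar_to_stripe_metadata(
    introspection: Dict[str, Any],
    detail_type: str = "payment_initiation",  # hypothetical RAR type name
) -> Dict[str, str]:
    """Flatten the enforced RAR constraints into string-valued charge metadata."""
    if not introspection.get("active"):
        raise PermissionError("token is not active; refuse the charge")

    details = [
        d for d in introspection.get("authorization_details", [])
        if d.get("type") == detail_type
    ]
    if len(details) != 1:
        raise ValueError(f"expected one {detail_type!r} entry, found {len(details)}")
    detail = details[0]

    # Stripe metadata values are strings, so numbers and lists are serialized.
    # The field names read from `detail` are assumptions about your AS.
    return {
        "rar_reference_id": str(detail["reference_id"]),
        "rar_max_amount_cents": str(int(detail["max_amount_cents"])),
        "rar_allowed_mccs": ",".join(sorted(str(m) for m in detail["allowed_mccs"])),
    }
```

The returned dictionary is meant to be merged into the metadata argument of the payment call, so Radar rules can reference the same limits the AS enforced.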
3. Not logging OAuth grant artifacts at charge time
Fraud happens. When a chargeback arrives, you need the OAuth grant timestamp, the token's sub claim, the RAR authorization_details, and the Stripe charge ID — all linked to the same transaction record. If you log these separately, you may not be able to reconstruct the chain in time to contest the dispute. Fix: at charge creation time, write a single atomic record linking all of these identifiers to durable storage.
4. Using DPoP without server-side nonce enforcement
A DPoP proof without a server nonce stays acceptable for the proof's whole validity window (typically a few minutes). An attacker who intercepts a proof together with its bound access token can replay both against the same HTTP method and URL until that window closes. Fix: configure your authorization server to require DPoP nonces per RFC 9449 §8. Clients should persist the latest nonce from the server's DPoP-Nonce response header and include it in subsequent proofs.
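One way to sketch server-side nonce enforcement in Python. RFC 9449 treats the nonce as an opaque string, so the format here (an issue timestamp plus an HMAC tag) is just one stateless design, not something the RFC prescribes. The 120-second lifetime, the per-process key, and the function names are assumptions, and the sketch expects the proof's signature to have been verified already.

```python
import hashlib
import hmac
import secrets
import time
from typing import Dict, Optional, Tuple

NONCE_TTL_SECONDS = 120                  # assumed lifetime; tune per deployment
_SERVER_KEY = secrets.token_bytes(32)    # per-process key; share and rotate in production


def _tag(issued: str) -> str:
    return hmac.new(_SERVER_KEY, issued.encode(), hashlib.sha256).hexdigest()


def issue_nonce(now: Optional[float] = None) -> str:
    """Mint a nonce that encodes its own issue time, so no server state is needed."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}.{_tag(issued)}"


def nonce_is_valid(nonce: Optional[str], now: Optional[float] = None) -> bool:
    """Accept only nonces this server minted within the last NONCE_TTL_SECONDS."""
    if not nonce or "." not in nonce:
        return False
    issued, tag = nonce.split(".", 1)
    if not hmac.compare_digest(tag.encode(), _tag(issued).encode()):
        return False
    age = (time.time() if now is None else now) - int(issued)
    return 0 <= age <= NONCE_TTL_SECONDS


def check_proof(proof_claims: Dict[str, str]) -> Tuple[bool, Dict[str, str]]:
    """Return (accepted, response_headers) for signature-verified proof claims."""
    headers = {"DPoP-Nonce": issue_nonce()}  # always hand back a fresh nonce
    return nonce_is_valid(proof_claims.get("nonce")), headers
```

When check_proof rejects a proof, the caller would answer with the use_dpop_nonce error and the fresh DPoP-Nonce header, and the client retries with a new proof that embeds it.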
5. Not backtesting Radar rules before enabling Block action
A rule like Block if :ip_country: != ::principal_home_country:: will block every principal who travels or uses a VPN. Running it in Block mode without a backtest will produce false positives at scale. Fix: enable all new rules in Review mode first. Backtest against the last six months of transactions using Radar's backtest interface. Graduate to Block only after confirming the false positive rate is acceptable.
6. Treating agent velocity the same as human velocity
An agent that legitimately processes 50 transactions per hour for a principal running a bulk-purchasing workflow will trigger standard hourly velocity rules calibrated for human shoppers. Fix: tier your velocity rules by agent type. Pass a principal_authorized_tps metadata field reflecting the RAR-authorized transaction rate. Write Radar rules that allow higher velocity for principals who explicitly authorized high-rate workflows.
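To make the tiering concrete, here is a small application-side sketch of a sliding-window velocity gate whose limit depends on what the principal authorized. It is an illustration of the idea, not a replacement for Radar rules. The default of 5 charges per hour and the class and parameter names are assumptions.

```python
from collections import deque
from typing import Deque, Dict, Optional

DEFAULT_HOURLY_LIMIT = 5  # assumed limit calibrated for human shoppers


class VelocityGate:
    """Sliding-window counter with a per-principal authorized rate."""

    def __init__(self, window_seconds: int = 3600) -> None:
        self.window = window_seconds
        self._events: Dict[str, Deque[float]] = {}

    def allow(
        self,
        principal_id: str,
        now: float,
        authorized_per_hour: Optional[int] = None,
    ) -> bool:
        events = self._events.setdefault(principal_id, deque())
        # Drop charges that have aged out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        # Principals who authorized a high-rate workflow get their own ceiling.
        limit = authorized_per_hour or DEFAULT_HOURLY_LIMIT
        if len(events) >= limit:
            return False
        events.append(now)
        return True
```

A bulk-purchasing principal with an authorized rate of 50 per hour passes where the default tier would have stopped at the sixth charge.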
7. Assuming a soft decline means no fraud
Agents retry soft declines programmatically. Stripe's Smart Retries and Adaptive Acceptance will attempt to recover the transaction. A fraudulent token that generates a soft decline on the first attempt may succeed on retry. Fix: write a Radar rule that places an agent-attributed charge in Review after a first soft decline within the same session. Pass a retry_count metadata field incremented by your application and escalate disposition when ::retry_count:: > 1.
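The application-side half of this fix can be sketched as follows. The retry_count and is_agent_flow keys come from the text above; the mapping of "escalate" to a block decision for counts above 1 is an assumption, as are the function names.

```python
from typing import Dict


def next_retry_metadata(metadata: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the charge metadata with retry_count incremented."""
    count = int(metadata.get("retry_count", "0")) + 1
    return {**metadata, "retry_count": str(count)}


def retry_disposition(metadata: Dict[str, str]) -> str:
    """Mirror the intended rule outcome so it can be unit-tested locally."""
    if metadata.get("is_agent_flow") != "true":
        return "allow"
    count = int(metadata.get("retry_count", "0"))
    if count == 0:
        return "allow"      # first attempt, no prior soft decline
    if count == 1:
        return "review"     # first retry after a soft decline
    return "block"          # assumed meaning of "escalate" for retry_count > 1
```

Keeping the increment in one helper makes it harder for a retry path to forget to update the field that the Radar rule depends on.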
8. No principal notification on agent purchase completion
Even with perfect fraud controls, principals may not recognize legitimate purchases the agent made on their behalf, which leads to first-party chargebacks (friendly fraud). Fix: send a push notification or email to the principal for every agent-completed purchase above a threshold (e.g., $25). Include item description, amount, timestamp, and a dispute-prevention link. A principal who has already seen and recognized the purchase is less likely to dispute it when it appears on the card statement.
§11 · FAQ
Frequently asked questions.
Can Stripe Radar detect that a transaction was made by an AI agent rather than a human?
Out of the box, no. Stripe Radar's ML is trained on behavioral and network signals common across human-initiated card-not-present transactions. It does not have a native "is this an agent?" classifier. Agent flows often present as off-session, headless, with no browser fingerprint or behavioral biometric — signals that can look like bot traffic or compromised-account activity. The solution is to explicitly pass is_agent_flow: true in Stripe charge metadata and write custom Radar rules (requiring Fraud Teams) keyed to that flag plus agent-specific signals like token age and MCC context.
What is the Stripe Radar for Fraud Teams pricing, and is it worth it for an agent commerce operator?
Radar for Fraud Teams is currently ~$0.02 per screened transaction (re-verify at stripe.com/radar/pricing before launch). For an operator processing $500,000/month with an average transaction of $75, that is approximately 6,667 transactions × $0.02 = $133/month for the custom rules engine, backtesting, advanced analytics, and Sigma data access. A single successful fraud chargeback on a $200 agent transaction costs the merchant the $200 plus a ~$15 dispute fee, or about $215, so the upgrade pays for itself if it prevents roughly two such chargebacks every three months ($430 against about $400 in fees). For agent commerce, where fraud signals are novel and default rules are inadequate, the upgrade is almost always economically justified.
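The arithmetic is easy to rerun with your own volumes. The sketch below uses the figures from this answer as inputs; the per-transaction fee and dispute fee are the provisional values quoted above and should be re-verified, and the function name is hypothetical.

```python
from typing import Dict


def fraud_teams_break_even(
    monthly_volume_usd: float,
    avg_ticket_usd: float,
    fee_per_txn_usd: float,
    chargeback_amount_usd: float,
    dispute_fee_usd: float,
) -> Dict[str, float]:
    """Estimate the monthly fee and the chargebacks that must be prevented to cover it."""
    transactions = monthly_volume_usd / avg_ticket_usd
    monthly_fee = transactions * fee_per_txn_usd
    loss_per_chargeback = chargeback_amount_usd + dispute_fee_usd
    return {
        "transactions_per_month": transactions,
        "monthly_fee_usd": monthly_fee,
        "loss_per_chargeback_usd": loss_per_chargeback,
        "chargebacks_per_month_to_break_even": monthly_fee / loss_per_chargeback,
    }


result = fraud_teams_break_even(
    monthly_volume_usd=500_000,
    avg_ticket_usd=75,
    fee_per_txn_usd=0.02,      # provisional figure from the text
    chargeback_amount_usd=200,
    dispute_fee_usd=15,        # provisional figure from the text
)
```

With these inputs the monthly fee is about $133 and the break-even is about 0.62 prevented chargebacks per month, which is one roughly every 1.6 months.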
Does DPoP eliminate agent transaction fraud?
No. DPoP eliminates the specific threat of token exfiltration and bearer-token replay — an attacker who steals the access token but does not control the agent's private key cannot use it. DPoP does not protect against: (a) full device/runtime compromise where the attacker also controls the private key; (b) social engineering of the principal to authorize a malicious agent; (c) prompt injection attacks that cause the legitimate agent itself to make fraudulent purchases; (d) friendly fraud where the principal disputes a legitimate agent purchase. Payment-layer controls (Radar) and principal notification remain necessary alongside DPoP.
Who is liable for a chargeback on an agent-initiated purchase — the principal, the agent platform, or the merchant?
Under current card network rules as applied by Stripe, the merchant (the entity that processed the charge and received the funds) is the liable party. The principal's issuer will charge back the merchant, not the agent platform. Whether the merchant can then recover from the agent platform depends on the contractual terms between the merchant and the platform; Stripe has no visibility into that relationship. Stripe's standard terms apply the same chargeback rules to agent-initiated transactions as to any other CNP transaction. (Re-verify: Stripe's ACP documentation and SPT terms should be checked for any liability carve-outs specific to ACP-flow transactions.)
What is Stripe Sigma's role in agent fraud detection?
Stripe Sigma provides SQL access to your Stripe data, including Radar signals, early fraud warnings, dispute records, and charge metadata. It does not block transactions in real time; that is Radar's role. Sigma's value in the agent context is post-hoc pattern analysis: identifying which agent-attributed charges have elevated dispute rates, which principal accounts show spending-cap probing patterns, and which MCCs are anomalous for your agent traffic. These insights feed back into custom Radar rule development. Sigma is a separately priced add-on rather than part of base payments pricing (re-verify current pricing and plan inclusion with Stripe); Radar data in Sigma requires the Fraud Teams tier.
How do Stripe's Agentic Commerce Protocol (ACP) and Shared Payment Token (SPT) affect fraud prevention?
ACP is an open standard co-developed by Stripe and OpenAI for agent-to-merchant transaction flows. SPTs are scoped to a specific merchant and cart total, reducing the blast radius compared to a general-purpose bearer token — a stolen SPT can only be used at the specific merchant for the specific cart total it was issued for. However, SPTs are still bearer tokens in the sense that possession confers use. The fraud prevention stack described in this document (Radar rules, metadata enrichment, DPoP at the OAuth layer) remains necessary.
My agent retries programmatically and I'm worried about Visa/Mastercard retry penalty thresholds. What is the risk?
Visa allows up to 15 retry attempts per card per rolling 30-day period; Mastercard allows 35. Exceeding these can result in fines from acquirers. Stripe's Adaptive Acceptance reduces retry counts by identifying genuine false declines efficiently; the updated TabTransformer+ model reduced retry attempts by 35% while recovering more revenue. However, agents that implement their own retry logic outside of Stripe's managed flows may inadvertently exceed network limits. Use Stripe's Smart Retries (managed) rather than implementing raw retry loops. Pass a retry_count metadata field and reference it as ::retry_count:: in Radar rules to track retry depth.
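If an agent does keep any retry logic of its own, a guard like the following sketch can stop it before a network ceiling is reached. The limits are the figures quoted in this answer and should be re-verified against current network rules; the safety margin, the class name, and the use of a card fingerprint as the key are assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

THIRTY_DAYS = 30 * 24 * 3600
NETWORK_RETRY_LIMITS = {"visa": 15, "mastercard": 35}  # from the text; re-verify
SAFETY_MARGIN = 2  # assumed head-room below the network ceiling


class RetryBudget:
    """Track retries per (network, card) over a rolling 30-day window."""

    def __init__(self) -> None:
        self._attempts: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def can_retry(self, network: str, card_fingerprint: str, now: float) -> bool:
        attempts = self._attempts[(network, card_fingerprint)]
        while attempts and now - attempts[0] >= THIRTY_DAYS:
            attempts.popleft()
        # Unknown networks fall back to the strictest known limit.
        ceiling = NETWORK_RETRY_LIMITS.get(network, min(NETWORK_RETRY_LIMITS.values()))
        return len(attempts) < ceiling - SAFETY_MARGIN

    def record_retry(self, network: str, card_fingerprint: str, now: float) -> None:
        self._attempts[(network, card_fingerprint)].append(now)
```

The agent calls can_retry before each reattempt and record_retry after it, so the budget is shared across sessions that reuse the same card.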
Can I use Stripe Radar to enforce RAR spending limits, or do I need to do that at my own API layer?
Stripe Radar cannot read OAuth authorization_details directly — Stripe has no visibility into your authorization server. You must propagate the enforced parameters (spending cap, MCC restrictions) as charge metadata, then write Radar rules that enforce them as a second layer. This is a belt-and-suspenders architecture: the AS enforces limits at the token level, and Radar enforces them again at the payment level. Neither layer alone is sufficient. The AS enforcement catches misuse at the API layer before payment is attempted; Radar catches any bypasses or merchant-API logic errors before the charge clears.
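The merchant-API half of this two-layer design can be sketched as a check that runs before the payment call. It assumes the enforced limits have already been flattened into string metadata under the hypothetical keys rar_max_amount_cents and rar_allowed_mccs; the same keys are what a Radar rule would reference as the second layer.

```python
from typing import Dict, List


def grant_violations(amount_cents: int, mcc: str, metadata: Dict[str, str]) -> List[str]:
    """Compare a proposed charge with the RAR limits carried in its metadata."""
    violations: List[str] = []
    # Hypothetical key names; use whatever your AS-to-metadata mapping produces.
    if amount_cents > int(metadata["rar_max_amount_cents"]):
        violations.append("amount_exceeds_cap")
    if mcc not in metadata["rar_allowed_mccs"].split(","):
        violations.append("mcc_not_authorized")
    return violations
```

An empty list means the charge may proceed to Stripe; anything else should be refused at the API layer and logged, since Radar will only ever see charges that get past this point.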
§12 · Step-by-Step
The 30-day rollout, in five steps.
Each step mirrors the HowTo JSON-LD at the top of this page word for word.
Step 1 — Instrument every agent-initiated charge with structured metadata
Before any rules can work, Stripe needs the context. Add the following fields to every stripe.paymentIntents.create() or stripe.charges.create() call on your agent-originated payment path: is_agent_flow (string 'true'), agent_token_age_minutes (integer minutes since OAuth grant was issued), principal_home_country (ISO alpha-2 from principal profile), principal_typical_mcc (comma-separated MCC list from authorization history), transaction_hour_principal_tz (0–23 integer, principal's local time), and rar_reference_id (the unique reference from RAR authorization_details). This is the foundation all downstream rules depend on.
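A sketch of a builder for these six fields, in Python for brevity. The key names come from the step above. The function name and parameters are hypothetical, and the principal's time zone is simplified to a fixed UTC offset in minutes, which ignores daylight-saving changes. The limit constants reflect Stripe's documented metadata limits (50 keys, 40-character keys, 500-character values).

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

MAX_KEYS, MAX_KEY_LEN, MAX_VALUE_LEN = 50, 40, 500  # Stripe metadata limits


def build_agent_metadata(
    *,
    grant_issued_at: datetime,
    charge_time: datetime,
    principal_home_country: str,
    principal_typical_mccs: List[str],
    principal_utc_offset_minutes: int,   # simplification: fixed offset, no DST
    rar_reference_id: str,
) -> Dict[str, str]:
    """Build the string-valued metadata for an agent-initiated charge."""
    if grant_issued_at.tzinfo is None or charge_time.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    age_minutes = int((charge_time - grant_issued_at).total_seconds() // 60)
    if age_minutes < 0:
        raise ValueError("charge_time precedes the OAuth grant")
    local = charge_time.astimezone(timezone(timedelta(minutes=principal_utc_offset_minutes)))

    metadata = {
        "is_agent_flow": "true",
        "agent_token_age_minutes": str(age_minutes),
        "principal_home_country": principal_home_country.upper(),
        "principal_typical_mcc": ",".join(principal_typical_mccs),
        "transaction_hour_principal_tz": str(local.hour),
        "rar_reference_id": rar_reference_id,
    }
    if len(metadata) > MAX_KEYS:
        raise ValueError("too many metadata keys")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_LEN or len(value) > MAX_VALUE_LEN:
            raise ValueError(f"metadata entry {key!r} exceeds Stripe limits")
    return metadata
```

The result is passed as the metadata argument of the payment call, so every rule downstream reads the same six keys.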
Step 2 — Upgrade to Radar for Fraud Teams and configure risk thresholds
Navigate to the Radar settings in the Stripe Dashboard. Enable Radar for Fraud Teams (re-verify pricing at stripe.com/radar/pricing). Set your base risk threshold — for agent commerce, consider lowering the elevated threshold from 65 to 60 given the absence of behavioral biometric signal on agent traffic. Select the risk setting ("conservative," "balanced," or "aggressive") that aligns with your acceptable false positive rate. Enable early fraud warnings.
Step 3 — Deploy the compound agent fraud rules in Review mode, then backtest
Add each rule from Section 8 of this document starting with Review (not Block) disposition. Use Radar's backtest feature to see how the rules would have performed against the last six months of your transaction data. Identify the false positive rate for each rule. Rules with >1% false positive rate on known-legitimate traffic need threshold adjustment before graduating to Block. Use Stripe Sigma to cross-reference disputed transactions against the rules that would have caught them to measure true positive rates.
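The graduation test in this step reduces to two ratios. The sketch below assumes you have exported backtest or Sigma results as (rule_matched, was_fraud) pairs per historical charge; that input shape and the function name are assumptions, while the 1% ceiling is the one stated above.

```python
from typing import Dict, Iterable, Tuple

MAX_FALSE_POSITIVE_RATE = 0.01  # the 1% ceiling from this step


def score_rule(rows: Iterable[Tuple[bool, bool]]) -> Dict[str, object]:
    """Score one rule from (rule_matched, was_fraud) pairs."""
    legit = legit_hits = fraud = fraud_hits = 0
    for matched, was_fraud in rows:
        if was_fraud:
            fraud += 1
            fraud_hits += int(matched)
        else:
            legit += 1
            legit_hits += int(matched)
    if legit == 0:
        raise ValueError("need known-legitimate charges to measure false positives")

    fp_rate = legit_hits / legit                      # legitimate charges the rule hit
    tp_rate = fraud_hits / fraud if fraud else None   # disputed charges the rule caught
    return {
        "false_positive_rate": fp_rate,
        "true_positive_rate": tp_rate,
        "ready_for_block": fp_rate <= MAX_FALSE_POSITIVE_RATE,
    }
```

A rule that stays in Review keeps producing labelled rows, so the same function can be rerun after each threshold adjustment.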
Step 4 — Implement DPoP on your authorization server for agent client registrations
Require DPoP for all OAuth clients registered as agent runtimes. Generate ES256 key pairs in hardware-backed secure storage (non-extractable). Enforce server nonces (RFC 9449 Section 8). Propagate the DPoP JWK thumbprint to your transaction logging layer so it can be included in chargeback evidence. Update your token introspection response to include the cnf claim so resource servers can verify binding. Cross-reference /agentmall_spoke_oauth for AS configuration specifics.
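The thumbprint that gets logged and placed in the cnf claim is the RFC 7638 JWK thumbprint of the agent's public key. A minimal sketch for an EC (ES256) key follows; the function names are hypothetical, and the sample coordinates used with it should be real base64url values from the agent's public JWK.

```python
import base64
import hashlib
import json
from typing import Dict


def ec_jwk_thumbprint(jwk: Dict[str, str]) -> str:
    """Compute the RFC 7638 SHA-256 thumbprint of an EC public JWK."""
    # Only the required members are hashed, in lexicographic order, with no whitespace.
    required = {name: jwk[name] for name in ("crv", "kty", "x", "y")}
    canonical = json.dumps(required, separators=(",", ":"), sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def cnf_claim(jwk: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Build the confirmation claim that binds a token to the DPoP key."""
    return {"cnf": {"jkt": ec_jwk_thumbprint(jwk)}}
```

Because optional members such as kid are excluded from the hash, the same key always yields the same dpop_jkt value in the evidence record, whatever extra fields the client sends.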
Step 5 — Build a durable transaction evidence record and principal notification flow
At the moment each agent-initiated charge is confirmed by Stripe (on payment_intent.succeeded webhook), write an immutable record containing: stripe_charge_id, stripe_customer_id, oauth_grant_id, rar_reference_id, token_issued_at (from introspection), principal_id, amount_cents, currency, mcc, agent_platform_id, dpop_jkt (if DPoP implemented), radar_risk_score, and charge_timestamp. Simultaneously dispatch a push/email notification to the principal with charge details and a one-click dispute flag. This evidence package is your chargeback defense and your principal-trust mechanism. Retain for a minimum of 18 months (chargebacks can arrive up to 120 days post-transaction; arbitration extends this further).
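The record itself can be sketched as a frozen dataclass with a content hash, so later tampering is detectable. The field list is the one given in this step. The class and function names, the use of ISO 8601 strings for timestamps, and the SHA-256 digest are assumptions; where and how the record is persisted is left to your storage layer.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ChargeEvidence:
    stripe_charge_id: str
    stripe_customer_id: str
    oauth_grant_id: str
    rar_reference_id: str
    token_issued_at: str          # ISO 8601, from introspection
    principal_id: str
    amount_cents: int
    currency: str
    mcc: str
    agent_platform_id: str
    dpop_jkt: Optional[str]       # None if DPoP is not implemented
    radar_risk_score: Optional[int]
    charge_timestamp: str         # ISO 8601


def seal(record: ChargeEvidence) -> Tuple[str, str]:
    """Serialize the record deterministically and return (payload, sha256_hex)."""
    payload = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return payload, hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Writing the payload and its digest in one storage operation keeps every identifier for a charge in a single row, which is what a dispute response needs.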
§13 · Continue the Guide
The complete trust layer.
The Window
Auth is solved. Payment cleared. Now make it fraud-proof.
Good OAuth + PKCE + DPoP puts the right tokens in the right hands at the right time. But the post-auth attack surface is real — tokens leak, principals get phished, and agents probe enforcement boundaries programmatically. The operators who close that window first will carry lower dispute rates, lower chargeback costs, and the kind of trust signals that agents prefer to route toward. The full AgentMall Roadmap connects this payment-layer spoke to the auth layer, the bot-verification edge, and the agent-facing trust signals that complete the picture.