Fraud Prevention When the Buyer Is an Agent — Stripe Radar, Adaptive Acceptance, and Agent-Specific Risk Signals

§1 · The Agent-Fraud Problem

Auth Passed, Payment Cleared — It's Still Fraud.

Traditional fraud prevention assumes a human at the keyboard. A card-present signal, a behavioral biometric, a CAPTCHA challenge, a 3DS step-up — all of these rely on biological latency and physical context. When the buyer is an agent, those assumptions collapse entirely.

Consider the threat model after a principal has successfully completed OAuth + PKCE authorization: valid access token, authorized scopes, spending limit intact, Stripe payment method on file. The transaction will clear. But the token leaks — intercepted from an insecure agent environment, logged to stdout, stored unencrypted in agent memory, or exfiltrated via prompt injection against the agent's tool chain. The attacker now has everything they need, and standard fraud signals offer nothing useful in response.

Agents do not move a mouse. They do not pause before checkout. They do not produce browser fingerprints, device sensor readings, or human typing cadence. Radar's passive behavioral signals — the ones that catch human fraudsters — produce null or anomalous readings for legitimate agents and fraudulent token replays alike. Worse: agents retry harder than humans. A soft decline that causes a human to abandon checkout becomes a programmatic retry loop. Stripe's Adaptive Acceptance ML was trained predominantly on human patterns; the retry-aggressiveness signal that flags a human fraud attempt may appear as normal agentic behavior.

This is the structural problem: fraud does not disappear when authentication is good. It shifts. The attack surface moves from pre-auth to post-auth. The attacker's goal is no longer to steal credentials — it is to steal or misuse the downstream artifact of successful authentication: the access token, the payment session, the Shared Payment Token.

Post-Auth Attack Taxonomy

Speculation Flag

Agent-specific fraud rate data is not yet publicly available as a distinct metric. Industry figures for card-not-present fraud rates (typically cited in the 0.06%–0.10% range by Nilson Report) are the closest proxy, but agent transactions carry a distinct risk profile — programmatic retry, no behavioral friction, off-session flow — that likely places them closer to the higher-risk end of the CNP distribution. Until agent-specific fraud data is published by card networks or processors, treat all percentage figures extrapolated from CNP data as provisional and flag accordingly.

Attack Type	Mechanism	Auth-Layer Visibility	Payment-Layer Signal
Token exfiltration	Attacker steals bearer access token from agent runtime via prompt injection, log leak, or memory dump	None — token was legitimately issued	Token age anomaly; new IP/device; velocity spike
Phished principal credentials	Attacker hijacks the principal's OAuth session before grant; agent proceeds with attacker-controlled grant	Detectable via unusual AS login signal, not at payment layer	Geographic anomaly; unusual MCC
Spending cap probing	Many small transactions just under RAR `max_amount` to map enforcement boundary — adapted card-testing for agents	Not visible at auth layer	Hourly velocity cluster below a threshold
Account takeover via compromised agent platform	Attacker breaches the agent platform itself; all delegated tokens now under attacker control	None — tokens appear legitimate	Sudden velocity spike across many principals simultaneously
Merchant category drift	Agent authorized for "software subscriptions" makes purchases in "electronics" or "wire transfers"	Detectable via RAR `authorization_details` mismatch if AS enforces strictly	MCC not in principal's typical MCC list

Auth Layer

DPoP + RAR

Binds tokens to client key pairs and constrains spending parameters at issuance time. Catches token exfiltration and cap overrun before payment is attempted.

Payment Layer

Stripe Radar + Metadata

Custom rules keyed to agent metadata catch velocity probing, cross-border anomalies, and category drift that the auth layer never sees.

Dispute Layer

Evidence Package

OAuth grant log + RAR authorization_details + token introspection record + DPoP binding proof = your chargeback defense under Visa 10.4 / Mastercard 4837.

§2 · Stripe Radar Deep Dive

Three Radar Tiers — Only One Works for Agent Commerce.

Stripe Radar is trained on data from merchants processing more than $1.9 trillion in payments annually. Stripe reports a 92% probability that any card presented has been seen before on the Stripe network — dense cross-merchant signal that standalone fraud tools cannot replicate. But the base tier gives you ML scoring without the ability to inject the context that makes agent transactions distinct. For agent commerce, Radar for Fraud Teams is the operative tier.

Radar Tier Comparison

Feature	Radar (Base ML)	Radar for Fraud Teams	Stripe Sigma
Pricing	Included with Stripe payments at no additional charge	~$0.02/screened transaction (re-verify before launch at stripe.com/radar/pricing)	Separately priced add-on, not bundled with payments (re-verify on Stripe's Sigma pricing page)
ML fraud scoring	✓ Every transaction scored 0–100	✓ Same plus custom ML	N/A — analytics layer
Risk levels	`normal`, `elevated`, `highest`	Same, plus adjustable thresholds	Queryable via SQL
Custom rules engine	✗ — no custom rules	✓ Hundreds of attributes; metadata support via double-colon syntax	N/A
AI-powered Radar Assistant	✗	✓ Natural language → rule syntax	N/A
Block/allow lists	Basic	✓ Full (card, email, IP)	N/A
Backtesting on 6-month history	✗	✓	N/A
Dispute probability score	✗	✓ ML-based likelihood of winning dispute	Queryable
Smart Refunds	✗	✓	N/A
Agent-specific custom metadata rules	✗ — no mechanism to read your metadata	✓ Double-colon syntax for all `metadata` fields you pass	Post-hoc query only

Critical

Stripe Radar base ML does NOT detect agent-specific fraud out of the box. It cannot know the token was issued 45 minutes ago, or that the principal's typical MCC is 5734, unless you pass that data via metadata. Without Fraud Teams and custom metadata rules, Radar treats agent transactions as unusual-but-unclassified CNP traffic and will miss the most dangerous attack vectors.

Custom Rules Syntax

Radar for Fraud Teams rules follow the structure {action} if {attribute} {operator} {value}. Actions are Block, Review, Allow, and Request 3D Secure.

Standard attributes use single colons: :risk_score:, :amount_in_usd:, :ip_country:, :card_country:, :minutes_since_customer_was_created:. Metadata attributes use double colons: ::agent_token_age_minutes::, ::principal_typical_mcc::, ::is_agent_flow::. Metadata stored on the Customer object is addressed with a customer prefix: ::customer:field_name::. Compound conditions use AND, OR, NOT.

Radar Rule — Compound Condition Example

Block if ::is_agent_flow:: = 'true' AND :amount_in_usd: > 50 AND :minutes_since_customer_was_created: < 60

Radar Rule — Cross-Border Agent Review

Review if :ip_country: != :card_country: AND ::is_agent_flow:: = 'true'

ML Scoring and Risk Thresholds

Every transaction receives a risk_score from 0 (least risky) to 100 (riskiest). Default thresholds: ≥65 = elevated, ≥75 = highest. These are adjustable in Risk Settings for Fraud Teams accounts. Custom rules can target :risk_score: directly: Block if :risk_score: > 85. The Fraud Teams tier also exposes :bot_score:, :fraudulent_dispute_score:, and :early_fraud_warning_score: as rule attributes.

For agent commerce, consider lowering the elevated threshold from 65 to 60. Agent flows generate no behavioral biometric signal — the absence of those signals artificially deflates the risk score on what may be a fraudulent replay.
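To make the effect of that threshold change concrete, here is a minimal sketch of the score-to-level mapping described above. The function name `risk_level` and its parameters are hypothetical; Radar applies its thresholds internally, so this only illustrates which scores change level when the elevated cutoff moves from 65 to 60.

```python
def risk_level(score: int, elevated_at: int = 65, highest_at: int = 75) -> str:
    """Map a 0-100 risk score to 'normal', 'elevated', or 'highest'.

    Defaults mirror the thresholds quoted in the text; this is an
    illustration, not Stripe's implementation.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if elevated_at > highest_at:
        raise ValueError("elevated_at must not exceed highest_at")
    if score >= highest_at:
        return "highest"
    if score >= elevated_at:
        return "elevated"
    return "normal"
```

With the defaults, a score of 62 is `normal`; with `elevated_at=60` the same score becomes `elevated` and would be routed to whatever handling you attach to that level.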

Velocity Attributes Built into Radar

Attribute	Counts	Agent Use
`:authorized_charges_per_email_hourly:`	Successful charges keyed to email in past hour	Cap probing detection — cluster of micro-transactions from one principal
`:blocked_charges_per_email_hourly:`	Blocked charges keyed to email in past hour	Retry storm detection after an initial block
`:email_count_for_ip_hourly:`	Distinct emails from same IP in past hour	Platform-breach detection — attacker using same infra for many principals
`:minutes_since_per_payment_instrument_fingerprint_first_seen:`	Age of payment instrument in Stripe's network	New card + immediate agent use = card-on-file compromise pattern

§3 · Adaptive Acceptance

How Agents Break the Retry Model.

Stripe's Adaptive Acceptance uses ML to identify false declines and retry them with optimized parameters — message format, routing, postal code normalization — invisibly, before the customer sees the decline. In 2024, Adaptive Acceptance recovered a record $6 billion in falsely declined transactions, a 60% year-over-year increase in retry success rate. Stripe recently migrated from a gradient-boosted XGBoost model to a TabTransformer+ deep neural network architecture, achieving 70% greater precision in identifying legitimate declined transactions.

The Agent-Specific Retry Problem

Adaptive Acceptance's retry signal is tuned on predominantly human behavioral patterns. Legitimate agents retry programmatically — there is no human abandonment signal after a soft decline. Fraudulent token reuse also retries programmatically. From the ML's perspective, both look similar. This creates two concrete operational risks:

Over-retry fines. Mastercard allows 35 retry attempts per card per rolling 30-day period; Visa allows 15. Exceeding these can result in fines from acquirers. Agents that implement their own retry logic outside of Stripe's managed flows may inadvertently hit these limits. Use Stripe's Smart Retries rather than raw retry loops.
False-negative pressure on fraud detection. The retry-aggressiveness signal that flags a human fraud attempt may appear as normal agentic behavior, reducing Radar's effective detection rate on fraudulent token replays.
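For agents that do run their own retry logic, one way to stay inside the network limits quoted above is a per-card retry budget over a rolling 30-day window. The sketch below is an assumption-laden illustration: `RetryBudget` and its methods are hypothetical names, the limits are copied from this section and should be re-verified, and the in-memory store would need to be replaced by shared durable storage in any multi-process deployment.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone

# Limits as quoted in the text above; re-verify against current network rules.
RETRY_LIMITS = {"visa": 15, "mastercard": 35}
WINDOW = timedelta(days=30)


class RetryBudget:
    """Tracks retry attempts per (network, card fingerprint) in a rolling window."""

    def __init__(self, limits=None, window=WINDOW):
        self._limits = dict(limits or RETRY_LIMITS)
        self._window = window
        self._attempts = defaultdict(deque)  # key -> retry timestamps, oldest first

    def _prune(self, key, now):
        attempts = self._attempts[key]
        # Drop attempts that have aged out of the rolling window.
        while attempts and now - attempts[0] >= self._window:
            attempts.popleft()
        return attempts

    def remaining(self, network: str, fingerprint: str, now: datetime) -> int:
        """Retries still available; raises KeyError for an unknown network."""
        attempts = self._prune((network, fingerprint), now)
        return self._limits[network] - len(attempts)

    def try_record_retry(self, network: str, fingerprint: str, now: datetime) -> bool:
        """Record a retry if budget remains; False means the caller must stop."""
        if self.remaining(network, fingerprint, now) <= 0:
            return False
        self._attempts[(network, fingerprint)].append(now)
        return True
```

The caller checks `try_record_retry` before each retry and treats `False` as a hard stop, which turns an open-ended loop into a bounded one regardless of how the agent's own retry policy is configured.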

Signal	Human Behavior	Legitimate Agent Behavior	Fraudulent Token Replay	Distinguishing Feature
Retry after soft decline	Rare — human abandons or uses different card	Programmatic — retries immediately per configured logic	Programmatic — retries aggressively until hard decline or success	Token metadata: age, binding, principal context — NOT retry pattern
Browser fingerprint	Rich device + behavioral signal	Headless / absent	Headless / absent	Not distinguishing — both absent
Typing cadence / mouse movement	Unique behavioral biometric	Not applicable	Not applicable	Not distinguishing — both absent
Transaction timing	Business hours, human time zones	May run 24/7 per principal's scheduled workflow	Often off-hours — attacker rushes token use	Time-of-day in principal's timezone is a weak but useful signal
Geographic consistency	IP near cardholder billing address	Agent IP may differ from principal home country	IP in attacker's country — different from principal	`::principal_home_country::` metadata makes this a strong signal

Operator Note

The distinguishing feature between a legitimate agent retry and a fraudulent token replay is not behavioral latency. It is token metadata: age since issuance, DPoP binding proof, principal context. Operators who attach this metadata to Stripe charges make it available to their own Radar rules and to post-hoc analysis. Do not assume that Adaptive Acceptance's models consume merchant-supplied metadata or calibrate to agent baselines on their own; Stripe's public documentation does not say they do.

§4 · Agent-Specific Risk Signals

Eight Signals Base Radar Cannot See Without Your Metadata.

Speculation Flag

Empirical agent-specific fraud rate data does not yet exist at publication volume. The signal patterns below are derived from card-testing, account takeover, and token theft research adapted to the agentic context. These are the closest available proxies, not confirmed agent fraud benchmarks.

Signal	What to Detect	Radar Rule Pattern	Disposition
Token issued recently + high amount	Leaked token rushed into use before principal notices	`Block if ::agent_token_age_minutes:: < 60 AND :amount_in_usd: > 50`	Block
Spending cap probing	Many small transactions just under RAR `max_amount`; adapted card-testing for agents	`Review if ::is_agent_flow:: = 'true' AND :authorized_charges_per_email_hourly: > 5 AND :amount_in_usd: < 10`	Flag for review; escalate if pattern continues
Token reuse from new IP/device	Attacker replaying token from different infrastructure	`Review if ::is_agent_flow:: = 'true' AND :ip_country: != ::principal_home_country::`	Flag; correlate with login anomaly signals
Merchant category drift	Agent authorized for `software` purchases attempting `electronics` or `wire_transfer`	`Block if ::is_agent_flow:: = 'true' AND NOT (::principal_typical_mcc:: = :mcc:)`	Block or Request 3D Secure
Time-of-day anomaly	3 AM in principal's timezone; no legitimate principal-initiated agent tasks at that hour	`Review if ::is_agent_flow:: = 'true' AND ::transaction_hour_principal_tz:: < 5`	Flag for review
Cross-border anomaly	IP geolocation inconsistent with principal residence and card issuer country	`Block if :ip_country: != :card_country: AND ::is_agent_flow:: = 'true'`	Block
Velocity inconsistent with principal history	Rate far exceeding principal's historical transaction frequency	`Review if :authorized_charges_per_email_hourly: > ::principal_baseline_hourly_txn_count::`	Flag
New payment instrument + agent flow	Fresh card added to principal account then immediately used by agent	`Review if ::is_agent_flow:: = 'true' AND :minutes_since_per_payment_instrument_fingerprint_first_seen: < 1440`	Flag

Spending Cap Probing in Detail

Classic card testing involves running many small transactions across stolen cards to identify which are active. In the agent context the attack adapts: the attacker holds a valid token with a known max_amount embedded in the RAR authorization_details. Rather than probing card validity, they probe for three things:

Authorization server enforcement boundaries — does the AS actually reject charges above max_amount, or does it pass them through?
Merchant-side rate limits — how many transactions per hour does the platform allow before triggering a review?
Risk score calibration — do small transactions below a threshold go through without Radar review?

The signal is a cluster of agent-attributed transactions from the same principal, same card, same time window, all just below either the RAR spending cap or the apparent Radar block threshold. Radar's velocity attribute :authorized_charges_per_email_hourly: will catch high-volume probing. For sophisticated low-velocity probing, the time-window anomaly is the more reliable signal.

Under Visa's VAMP program, even 300,000 combined approved + declined authorization attempts per month can trigger acquirer scrutiny — counts, not just dollar losses, matter. Agent-generated micro-transactions aggregate toward these thresholds faster than human ones.
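The cluster signal described in this section (same principal, same time window, amounts just below the cap) can also be checked in your own service before or alongside Radar. The following is a rough sketch under stated assumptions: the function name, the 90% "near cap" ratio, the one-hour window, and the cluster size of three are all hypothetical values chosen for illustration, not thresholds taken from Stripe or the card networks.

```python
from datetime import datetime, timedelta, timezone
from typing import Sequence, Tuple


def looks_like_cap_probing(
    charges: Sequence[Tuple[datetime, float]],
    cap_usd: float,
    near_ratio: float = 0.9,
    window: timedelta = timedelta(hours=1),
    min_cluster: int = 3,
) -> bool:
    """Return True if at least min_cluster charges near the cap fall within window.

    charges holds (timestamp, amount_usd) pairs for one principal and one card.
    """
    # Keep only charges that sit just under the authorized cap.
    near_cap = sorted(
        ts for ts, amount in charges if near_ratio * cap_usd <= amount <= cap_usd
    )
    start = 0
    for end, ts in enumerate(near_cap):
        # Slide the window start forward until it spans at most `window`.
        while ts - near_cap[start] > window:
            start += 1
        if end - start + 1 >= min_cluster:
            return True
    return False
```

Widening `window` to a day or more is one way to look for the low-velocity variant, at the cost of more false positives on principals who legitimately buy near their cap.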

Required Metadata Fields — Pass on Every Agent Charge

Stripe metadata payload — stripe.paymentIntents.create()

{
  "is_agent_flow": "true",
  "agent_token_age_minutes": "45",
  "principal_home_country": "US",
  "principal_typical_mcc": "5734",
  "spending_cap_usd": "200",
  "rar_reference_id": "txn-ref-abc123",
  "transaction_hour_principal_tz": "14",
  "retry_count": "0"
}
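Stripe metadata values are strings, so numeric fields such as token age have to be computed and stringified before the call. A minimal sketch of a builder for the payload above follows; `build_agent_metadata` and its parameters are hypothetical names, and the limit constants reflect Stripe's documented metadata limits (50 keys, 40-character keys, 500-character values), which are worth re-checking before launch.

```python
from datetime import datetime, timezone

# Stripe's documented metadata limits: 50 keys, 40-char keys, 500-char values.
MAX_KEYS, MAX_KEY_LEN, MAX_VALUE_LEN = 50, 40, 500


def build_agent_metadata(
    *,
    token_issued_at: datetime,
    now: datetime,
    principal_home_country: str,
    principal_typical_mcc: str,
    spending_cap_usd: int,
    rar_reference_id: str,
    principal_local_hour: int,
    retry_count: int,
) -> dict:
    """Build the agent metadata payload with every value as a string."""
    age_minutes = int((now - token_issued_at).total_seconds() // 60)
    if age_minutes < 0:
        raise ValueError("token_issued_at is in the future")
    if not 0 <= principal_local_hour <= 23:
        raise ValueError("principal_local_hour must be 0-23")
    metadata = {
        "is_agent_flow": "true",
        "agent_token_age_minutes": str(age_minutes),
        "principal_home_country": principal_home_country,
        "principal_typical_mcc": principal_typical_mcc,
        "spending_cap_usd": str(spending_cap_usd),
        "rar_reference_id": rar_reference_id,
        "transaction_hour_principal_tz": str(principal_local_hour),
        "retry_count": str(retry_count),
    }
    if len(metadata) > MAX_KEYS:
        raise ValueError("too many metadata keys")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_LEN or len(value) > MAX_VALUE_LEN:
            raise ValueError(f"metadata entry {key!r} exceeds Stripe limits")
    return metadata
```

The returned dictionary can be passed as the `metadata` argument of the payment call. Computing token age from the issuance timestamp at charge time, rather than caching it, keeps the value honest across retries.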

§5 · DPoP Integration

Shrinking the Fraud Blast Radius — RFC 9449 Token Binding.

DPoP (Demonstrating Proof of Possession), standardized as RFC 9449 in September 2023, binds OAuth access and refresh tokens to a public/private key pair held by the client. An attacker who exfiltrates the access token without the matching private key cannot use it — the resource server rejects the DPoP proof verification. For agent commerce, this is the single highest-leverage auth-layer fraud mitigation against token exfiltration.

Scenario	Without DPoP	With DPoP (RFC 9449)	Residual Risk
Bearer token exfiltrated from logs	Full spending authority up to RAR limits — attacker can use immediately	Token useless without private key — attack neutralized	None from this vector
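Access token intercepted in transit	Replay possible; bearer token usable anywhere the token is accepted	Proof JWT contains `htu` (target URL) + `htm` (method) + fresh `iat` + unique `jti`; the RS rejects proofs for a different request or outside its acceptance window	Replay of a captured proof against the same method and URL within the acceptance window; mitigated by `jti` tracking and server nonces (RFC 9449 §8, §9)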
Full device/runtime compromise	Attacker controls token	Attacker also controls private key — DPoP provides no protection	Requires platform-layer compromise detection
Refresh token theft	Attacker can obtain new access tokens indefinitely	Refresh tokens are also DPoP-bound — same private key required	None from this vector if refresh tokens are bound
Prompt injection → agent makes fraudulent purchase	Agent's legitimate token used for attacker-directed transaction	DPoP provides no protection — legitimate client making the call	Requires Radar rules + RAR MCC enforcement

DPoP Flow for Agent Commerce

The client generates an asymmetric key pair (P-256/ES256). On every token request and every resource request, the client signs a fresh proof JWT containing the HTTP method (htm), endpoint URL (htu), current timestamp (iat), and a unique identifier (jti). The authorization server computes the SHA-256 thumbprint of the public key and embeds it as jkt in the access token's cnf claim. The resource server verifies the DPoP proof on each API call, including the call that triggers a payment, confirming the presenting client holds the private key matching the bound public key. Stripe's API does not accept DPoP proofs, so this check must happen at the merchant's resource server before it calls Stripe.

Non-extractable key generation is essential: use crypto.subtle.generateKey with extractable: false in browser environments; use hardware-backed secure enclaves in native agent runtimes. A DPoP key pair that can be extracted provides no more security than a bearer token.

Server-issued nonces (RFC 9449 §8) narrow the proof validity window, limiting replay of an intercepted proof to the lifetime of the current nonce. Configure your authorization server to issue nonces and require clients to persist the latest DPoP-Nonce response header for subsequent proofs.
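The resource-server side of these checks can be sketched with the standard library alone. The example below covers two pieces: computing the RFC 7638 JWK thumbprint that is compared against the token's `cnf.jkt`, and validating the proof's claims (method, URL, freshness, nonce, `jti` reuse). It deliberately omits signature verification of the proof JWT, which requires a JOSE library and must happen before any of this. The function names, the 60-second acceptance window, and the in-memory `seen_jti` set are assumptions for illustration.

```python
import base64
import hashlib
import json


def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638 SHA-256 thumbprint of an EC public JWK (base64url, no padding)."""
    # Only the required members, in lexicographic order, with no whitespace.
    required = {name: jwk[name] for name in ("crv", "kty", "x", "y")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def key_matches_token(proof_jwk: dict, access_token_cnf: dict) -> bool:
    """True if the proof's public key is the one the access token is bound to."""
    return jwk_thumbprint(proof_jwk) == access_token_cnf.get("jkt")


def check_dpop_claims(claims, *, method, url, now, expected_nonce, seen_jti,
                      max_age_seconds=60):
    """Return a list of problems with already signature-verified proof claims."""
    problems = []
    if claims.get("htm") != method:
        problems.append("htm mismatch")
    if claims.get("htu") != url:
        problems.append("htu mismatch")
    iat = claims.get("iat")
    if not isinstance(iat, (int, float)) or abs(now - iat) > max_age_seconds:
        problems.append("iat outside acceptance window")
    if claims.get("nonce") != expected_nonce:
        problems.append("nonce mismatch")
    jti = claims.get("jti")
    if not jti or jti in seen_jti:
        problems.append("jti missing or replayed")
    if not problems:
        seen_jti.add(jti)  # remember accepted proofs so a replay is rejected
    return problems
```

An empty list means the claims are acceptable. In production the `seen_jti` store would need expiry and shared storage across resource-server instances.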

OAuth 2.1 and the Model Context Protocol both list sender-constrained tokens — DPoP or mTLS — as recommended hardening for public clients, which now explicitly includes AI agents. FAPI 2.0 (the open-banking security profile) names DPoP as one of two acceptable sender-constraining mechanisms.

Cross-Link

Full DPoP implementation specifics — authorization server configuration, PKCE integration, token introspection response shape including the cnf claim — are covered in the OAuth + PKCE + DPoP spoke. This section covers only the fraud-prevention angle.

§6 · Authorization-Layer Fraud Prevention

RAR + Introspection — Spending Limits the AS Actually Enforces.

Rich Authorization Requests (RFC 9396) allow OAuth clients to embed structured, transaction-level permission details in the authorization request via the authorization_details parameter. For agent commerce, this delivers spending caps, MCC restrictions, single-use authorization references, and location binding — all embedded at token issuance time.

Global Correction

OAuth scopes alone are NOT sufficient for agent spending limits. Scopes like purchase:limited convey a category of permission but carry no structured amount, currency, or MCC constraints. RFC 9396 RAR's authorization_details is the canonical answer. A scope string cannot be enforced as a $150 spending cap; "max_amount": 150, "currency": "USD" in authorization_details can.

Four RAR Controls for Agent Commerce

Control	authorization_details Field	Fraud Type Caught	Enforcement Gap
Spending cap	`"max_amount": 150, "currency": "USD"`	Amount overrun by exfiltrated token or compromised agent	Only as good as AS + RS enforcement; Stripe is unaware of RAR limits unless you propagate them via metadata
MCC restriction	`"allowed_mcc": ["5045", "5734"]`	Merchant category drift — agent authorized for software used in electronics or wire transfers	Same enforcement gap; must propagate to Stripe metadata for a second check
Single-use reference	Unique reference number in `authorization_details`	Token replay — same authorization used for multiple transactions	Requires idempotent reference registry at resource server; must mark reference as consumed on first use
Location binding	`"locations": ["https://api.merchant-a.com"]`	Token accepted at wrong merchant endpoint	Requires AS to embed locations array in issued token and RS to validate against its own URI

Two-Layer Enforcement Architecture

RAR enforcement happens at the authorization server and resource server, not at Stripe. If the merchant's own API does not validate the introspected authorization_details before calling stripe.paymentIntents.create(), RAR provides no protection at the payment layer. The two layers must be explicitly wired together.

Two-Layer Flow — AS + RS + Stripe Metadata

1. Principal authorizes via OAuth + PKCE + RAR
   (authorization_details is a JSON array of objects per RFC 9396):
   authorization_details = [{
     "type": "agent_purchase",
     "max_amount": 200,
     "currency": "USD",
     "allowed_mcc": ["5734"],
     "single_use_ref": "txn-ref-abc123"
   }]

2. Agent calls merchant API with access token + DPoP proof.

3. Merchant RS introspects token → extracts authorization_details:
   - Validates: amount <= max_amount (200)
   - Validates: requested MCC in allowed_mcc (["5734"])
   - Validates: single_use_ref not in consumed-ref registry
   - Marks single_use_ref as consumed atomically

4. Merchant passes authorization_details as Stripe charge metadata:
   metadata = {
     "is_agent_flow": "true",
     "rar_max_amount": "200",
     "rar_allowed_mcc": "5734",
     "rar_reference_id": "txn-ref-abc123",
     "agent_token_age_minutes": "38"
   }

5. Stripe Radar rules fire against that metadata — second enforcement layer.
   If merchant API has a logic error and passes a charge above max_amount,
   Radar catches it here before it clears.
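Steps 3 and 4 of the flow above can be sketched as a single resource-server function that validates a requested charge against the introspected authorization_details and returns the metadata to forward to Stripe. This is an illustrative sketch, not a reference implementation: `authorize_charge`, `ConsumedRefRegistry`, and `AuthorizationError` are hypothetical names, the field names follow the example payload in step 1, and the in-memory registry stands in for a durable store with an atomic insert-if-absent operation.

```python
import threading


class AuthorizationError(Exception):
    """Raised when a charge falls outside the introspected authorization_details."""


class ConsumedRefRegistry:
    """In-memory stand-in for a durable single-use reference store."""

    def __init__(self) -> None:
        self._seen = set()
        self._lock = threading.Lock()

    def consume(self, ref: str) -> bool:
        """Atomically mark ref as consumed; False if it was already used."""
        with self._lock:
            if ref in self._seen:
                return False
            self._seen.add(ref)
            return True


def authorize_charge(authorization_details, *, amount, currency, mcc, registry):
    """Validate a charge against RAR details and return Stripe metadata for it."""
    detail = next(
        (d for d in authorization_details if d.get("type") == "agent_purchase"),
        None,
    )
    if detail is None:
        raise AuthorizationError("no agent_purchase entry in authorization_details")
    if currency.upper() != detail["currency"].upper():
        raise AuthorizationError("currency not authorized")
    if amount > detail["max_amount"]:
        raise AuthorizationError("amount exceeds max_amount")
    if mcc not in detail["allowed_mcc"]:
        raise AuthorizationError("MCC not in allowed_mcc")
    # Consume last so a rejected charge does not burn the reference.
    if not registry.consume(detail["single_use_ref"]):
        raise AuthorizationError("single_use_ref already consumed")
    return {
        "is_agent_flow": "true",
        "rar_max_amount": str(detail["max_amount"]),
        "rar_allowed_mcc": ",".join(detail["allowed_mcc"]),
        "rar_reference_id": detail["single_use_ref"],
    }
```

Consuming the reference only after the other checks pass means a malformed request cannot be used to invalidate a principal's legitimate authorization. If the Stripe call itself fails, you will need a policy for whether the reference is released or stays consumed.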

Cross-Link

Full RAR authorization_details schema design, Pushed Authorization Requests (PAR) combination, and AS introspection implementation are covered in the OAuth + PKCE + DPoP spoke.

§7 · Chargeback Economics

Agent-Initiated Purchases — The Merchant Bears the Liability.

Standard card network chargeback rules apply to agent-initiated purchases. Stripe does not currently carve out a separate liability regime for agent commerce. (Re-verify before launch: Stripe's Agentic Commerce Protocol (ACP) and Shared Payment Token (SPT) documentation does not explicitly address chargeback liability allocation as of this writing.)

Current Liability Framework

Under card network rules, Visa reason code 10.4 (Other Fraud, Card-Absent Environment) and Mastercard reason code 4837 (No Cardholder Authorization) are the primary vectors for fraud disputes. The issuing bank credits the cardholder and debits the merchant. Stripe charges a $15 dispute fee per chargeback on US cards (re-verify; fee structures vary by country and account type).

For agent-initiated purchases, the merchant must demonstrate valid principal authorization at the time of purchase, plus proof the transaction was initiated by a party the cardholder authorized. Without the artifacts below, the chargeback will succeed and the merchant absorbs the loss.

Merchant Defense Evidence Package

Evidence Artifact	What It Proves	Where It Comes From	Retention
OAuth grant log	Principal explicitly authorized the agent, including scopes and RAR `authorization_details`	Authorization server grant record, timestamped	Minimum 18 months
RAR authorization_details payload	Transaction was within the principal-authorized spending cap and MCC constraints	Introspection response snapshot at charge time	Minimum 18 months
Token introspection record	Access token was valid, unrevoked, and within authorization parameters at charge time	Resource server introspection call result, logged atomically with charge	Minimum 18 months
Stripe charge metadata	Charge ID, timestamp, and agent-specific metadata linking to the above artifacts	Stripe `payment_intent.succeeded` webhook payload	Minimum 18 months
DPoP binding proof	DPoP public key thumbprint matched the token's `cnf` claim — requesting client held the private key	Resource server DPoP verification log; `dpop_jkt` field in transaction record	Minimum 18 months

Platform-Breach Liability

Speculation Flag

No established card network rule or Stripe policy currently addresses the specific case where an agent platform is compromised and all delegated tokens are used fraudulently at scale. The current expectation, based on standard merchant services agreements, is that the merchant (the agentmall operator) bears liability unless it can prove individual principal consent for each transaction. In a platform-breach scenario, merchants face chargebacks on every transaction made by the attacker-controlled agent — at scale, with limited ability to recover. Mitigation: per-transaction authorization records, RAR single-use reference enforcement, and post-charge notification to principals are the current best practices.

Atomic Transaction Evidence Record — Schema

Durable transaction record — write on payment_intent.succeeded

{
  "stripe_charge_id": "ch_3Px...",
  "stripe_customer_id": "cus_...",
  "oauth_grant_id": "grant_abc123",
  "rar_reference_id": "txn-ref-abc123",
  "token_issued_at": "2026-06-03T14:22:00Z",
  "principal_id": "user_xyz",
  "amount_cents": 4999,
  "currency": "usd",
  "mcc": "5734",
  "agent_platform_id": "platform_abc",
  "dpop_jkt": "sha256_thumbprint_of_public_key",
  "radar_risk_score": 28,
  "charge_timestamp": "2026-06-03T14:23:18Z",
  "retain_until": "2028-01-01T00:00:00Z"
}
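A sketch of how such a record might be assembled, with `retain_until` derived from the 18-month minimum in the evidence table rather than hard-coded. The function names are hypothetical and only a subset of the schema's fields is shown. The month arithmetic uses only the standard library and clamps the day when the target month is shorter. The example record above retains slightly longer than the minimum, which this sketch also permits via `retention_months`.

```python
import calendar
from datetime import datetime, timezone


def add_months(ts: datetime, months: int) -> datetime:
    """Add calendar months, clamping the day to the target month's length."""
    month_index = ts.month - 1 + months
    year = ts.year + month_index // 12
    month = month_index % 12 + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


def iso_z(ts: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO 8601 with a Z suffix."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_evidence_record(
    *,
    stripe_charge_id: str,
    oauth_grant_id: str,
    rar_reference_id: str,
    principal_id: str,
    dpop_jkt: str,
    amount_cents: int,
    currency: str,
    token_issued_at: datetime,
    charge_timestamp: datetime,
    retention_months: int = 18,
) -> dict:
    """Link auth-layer and payment-layer identifiers in one record."""
    if retention_months < 18:
        raise ValueError("retention below the 18-month minimum used in this guide")
    return {
        "stripe_charge_id": stripe_charge_id,
        "oauth_grant_id": oauth_grant_id,
        "rar_reference_id": rar_reference_id,
        "principal_id": principal_id,
        "dpop_jkt": dpop_jkt,
        "amount_cents": amount_cents,
        "currency": currency.lower(),
        "token_issued_at": iso_z(token_issued_at),
        "charge_timestamp": iso_z(charge_timestamp),
        "retain_until": iso_z(add_months(charge_timestamp, retention_months)),
    }
```

Writing the whole dictionary in one storage operation is what makes the record atomic; assembling it from separately logged pieces at dispute time is the failure mode this schema is meant to avoid.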

§8 · Real Radar Rule Examples

Five Complete Rules for Agent Commerce.

All rules below assume you pass the metadata payload from §4 on every agent-initiated charge. Confirm each attribute name against Stripe's current Radar rule reference before deploying. Start every rule in Review mode and backtest against six months of data before switching any to Block.

Rule 1 — Block New Token + High Amount + MCC Mismatch (Composite Risk)

Radar for Fraud Teams — Block rule

Block if ::is_agent_flow:: = 'true'
  AND ::agent_token_age_minutes:: < 60
  AND :amount_in_usd: > 50
  AND NOT (::principal_typical_mcc:: = :mcc:)

Catches: freshly leaked token being rushed into use in the wrong merchant category before the principal notices. The composite condition (all three must be true) dramatically reduces false positives — a legitimate agent will rarely have a brand-new token, a high transaction, AND a category mismatch simultaneously.

Rule 2 — Flag Spending Cap Probing

Radar for Fraud Teams — Review rule

Review if ::is_agent_flow:: = 'true'
  AND :authorized_charges_per_email_hourly: > 5
  AND :amount_in_usd: < 10

Catches: micro-transaction probing pattern against the enforcement boundary. More than five successful sub-$10 charges from the same principal email within one hour is an unusual cluster for legitimate agent workflows.

Rule 3 — Flag Cross-Border Anomaly on Agent Traffic

Radar for Fraud Teams — Review rule

Review if ::is_agent_flow:: = 'true'
  AND :ip_country: != ::principal_home_country::

Catches: token reuse from attacker infrastructure in a different country from the principal's registered home country. Pass principal_home_country as ISO alpha-2 from your principal profile at charge time.

Rule 4 — Flag Late-Night Agent Transactions

Radar for Fraud Teams — Review rule

Review if ::is_agent_flow:: = 'true'
  AND ::transaction_hour_principal_tz:: < 4

Catches: transactions between midnight and 4 AM in the principal's local timezone — unlikely to be principal-scheduled legitimate workflows. Compute the principal's local hour server-side before the Stripe call and pass it as transaction_hour_principal_tz (integer 0–23).
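Computing the local hour server-side is a one-line timezone conversion, sketched below. The function names are hypothetical. The sketch accepts any `tzinfo` object; in production you would most likely pass `zoneinfo.ZoneInfo` with the principal's IANA zone name so daylight-saving transitions are handled, whereas a fixed UTC offset (used in the usage example for simplicity) is only correct for part of the year in zones that observe DST.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def transaction_hour_principal_tz(charge_time_utc: datetime, principal_tz: tzinfo) -> str:
    """Return the principal's local hour (0-23) as a string for Stripe metadata."""
    if charge_time_utc.tzinfo is None:
        raise ValueError("charge_time_utc must be timezone-aware")
    return str(charge_time_utc.astimezone(principal_tz).hour)


def is_late_night(hour_value: str, cutoff: int = 4) -> bool:
    """Mirror of the Rule 4 condition: local hour strictly below the cutoff."""
    return int(hour_value) < cutoff


# Usage: a charge at 07:30 UTC for a principal at UTC-4 lands at 03:30 local.
example_hour = transaction_hour_principal_tz(
    datetime(2026, 6, 3, 7, 30, tzinfo=timezone.utc),
    timezone(timedelta(hours=-4)),
)
```

Here `example_hour` is `"3"`, which Rule 4 would send to review. The same charge for a principal in UTC would carry `"7"` and pass.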

Rule 5 — Flag New Payment Instrument Used Immediately in Agent Flow

Radar for Fraud Teams — Review rule

Review if ::is_agent_flow:: = 'true'
  AND :minutes_since_per_payment_instrument_fingerprint_first_seen: < 1440

Catches: newly added card (less than 24 hours since Stripe first saw the card fingerprint across its network) being immediately used via agent. This is the card-on-file compromise pattern — attacker adds a stolen card to the principal's account and routes the agent to use it before the principal notices.

Backtest Before Block

Radar for Fraud Teams provides a backtesting interface against the last six months of transactions. Run all new agent rules in Review mode before switching to Block. Use Stripe Sigma to query radar_early_fraud_warnings and disputes tables to understand the base rate of legitimate agent transactions that would be caught. Rules with >1% false positive rate on known-legitimate traffic need threshold adjustment before graduating to Block.
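The graduation criterion above is simple arithmetic over backtest counts. A minimal sketch, assuming you have already labeled a sample of transactions as known-legitimate and counted how many of them the rule would have matched; the function names and the way the counts are obtained are hypothetical, and the 1% ceiling is the figure from this section.

```python
def false_positive_rate(flagged_legitimate: int, total_legitimate: int) -> float:
    """Share of known-legitimate transactions that the rule would have matched."""
    if total_legitimate <= 0:
        raise ValueError("total_legitimate must be positive")
    if not 0 <= flagged_legitimate <= total_legitimate:
        raise ValueError("flagged_legitimate must be between 0 and total_legitimate")
    return flagged_legitimate / total_legitimate


def ready_for_block(flagged_legitimate: int, total_legitimate: int,
                    max_fpr: float = 0.01) -> bool:
    """True if the rule's false positive rate is at or below the ceiling."""
    return false_positive_rate(flagged_legitimate, total_legitimate) <= max_fpr
```

For example, a rule that matched 40 of 5,000 known-legitimate agent charges has a 0.8% false positive rate and could graduate; one that matched 75 of 5,000 sits at 1.5% and needs threshold adjustment first.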

§9 · Trust-Layer Interaction Map

How This Plugs Into OAuth + RAR + DPoP + Agents Page.

Fraud prevention does not stand alone. Each layer in the agent-ready commerce architecture catches a different slice of the threat model. The table below maps what this spoke owns versus what the OAuth spoke and the Agents Page spoke own.

Layer	Technology	What It Catches	What It Misses	Owned By
Auth Layer	OAuth 2.1 + PKCE + DPoP	Token exfiltration reuse; unauthorized token issuance; stolen refresh tokens	Post-auth transaction anomalies; merchant category drift; spending pattern fraud	OAuth spoke
Authorization Scope Layer	RAR (RFC 9396) + introspection	Spending cap overrun; MCC mismatch; single-use token replay; location-bound token misuse	Behavioral velocity anomalies; device/IP fraud signals; timing attacks	OAuth spoke
Payment Layer	Stripe Radar + custom rules	Velocity probing; cross-border anomaly; category drift via metadata; time-of-day anomaly; cap probing	Sub-threshold distributed probing across many tokens; fraud that exactly mimics legitimate principal behavior	This spoke
Dispute Layer	Stripe Radar dispute score + Sigma + chargeback evidence package	Post-hoc fraud identification; dispute probability scoring; chargeback defense	Does not prevent fraud proactively — retroactive only	This spoke
Agent Discovery Layer	`/agents` page + UCP compatibility	Agent-readable trust signals; capability declarations; refund + dispute policy visibility	Not a fraud prevention layer directly	Agents Page spoke

4-Layer Agent-Ready Model Integration

The 4-Layer Agent-Ready Model — Structured Data → API Endpoint → MCP Tool Description → UCP Compatibility — intersects with fraud prevention at each layer:

Structured Data layer: Product catalog + pricing metadata must include MCC codes and authorization policy fields that agents can read before initiating a purchase, enabling pre-purchase authorization checks and reducing MCC-drift fraud.
API Endpoint layer: The payment endpoint is where Stripe metadata gets attached and Radar rules fire. Instrument here. RAR introspection validation belongs here too.
MCP Tool Description layer: MCP tool schemas should declare expected spending ranges and merchant categories so agents can surface these to principals for upfront consent, reducing post-purchase disputes.
UCP Compatibility layer: UCP compatibility signals include fraud policy declarations — refund policy, dispute process, authorization scope requirements — that honest agents use for routing and that fraud detection systems can validate against post-transaction.

§10 · Common Mistakes

Eight ways agent fraud prevention breaks in production.

1. Relying on Stripe's base Radar ML without custom rules

The base ML tier has no mechanism for agent-specific signals. It cannot know the token was issued 45 minutes ago, or that the principal's typical MCC is 5734, unless you pass that data. Without Fraud Teams and custom metadata rules, Radar treats agent transactions as unusual-but-unclassified CNP traffic and will miss the most dangerous attack vectors. Fix: upgrade to Radar for Fraud Teams. Pass is_agent_flow, agent_token_age_minutes, principal_typical_mcc, and principal_home_country on every agent-initiated charge.

2. Passing RAR authorization_details to the AS but not forwarding enforced parameters to Stripe

Operators often implement RAR at the OAuth layer (good) but then call stripe.charges.create() without propagating the max_amount or MCC constraints as metadata. Radar has no visibility into what the AS authorized. The two layers are disconnected. Fix: at the merchant API layer, extract authorization_details from the introspection response and attach the relevant fields as Stripe charge metadata before the payment call.
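
The fix above can be sketched as a small mapping function. This is a minimal illustration, not Stripe's or any authorization server's API: the `PurchaseDetail` member names (`max_amount_cents`, `allowed_mccs`, `reference_id`) and the metadata keys `rar_max_amount_cents` and `rar_allowed_mccs` are assumptions chosen for the example; only `is_agent_flow` and `rar_reference_id` come from this guide.

```typescript
// Hypothetical shapes: an RFC 7662 introspection response carrying RFC 9396
// authorization_details. The member names inside each detail object are
// assumptions; use whatever your authorization server actually emits.
interface PurchaseDetail {
  type: string;
  max_amount_cents?: number;
  allowed_mccs?: string[];
  reference_id?: string;
}

interface IntrospectionResponse {
  active: boolean;
  authorization_details?: PurchaseDetail[];
}

// Flatten the enforced limits into Stripe metadata. Stripe stores metadata
// values as strings, so numbers and lists are serialized here.
function rarToStripeMetadata(
  introspection: IntrospectionResponse,
  detailType: string
): { [key: string]: string } {
  if (!introspection.active) {
    throw new Error("inactive token: do not create a charge");
  }
  const details = introspection.authorization_details || [];
  const detail = details.filter((d) => d.type === detailType)[0];
  if (detail === undefined) {
    throw new Error("no authorization_details entry of type " + detailType);
  }
  const metadata: { [key: string]: string } = { is_agent_flow: "true" };
  if (detail.max_amount_cents !== undefined) {
    metadata["rar_max_amount_cents"] = String(detail.max_amount_cents);
  }
  if (detail.allowed_mccs !== undefined) {
    metadata["rar_allowed_mccs"] = detail.allowed_mccs.join(",");
  }
  if (detail.reference_id !== undefined) {
    metadata["rar_reference_id"] = detail.reference_id;
  }
  return metadata;
}
```

The returned object is intended to be passed as the `metadata` parameter of the charge or PaymentIntent creation call, so that Radar rules can reference the same limits the AS enforced.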

3. Not logging OAuth grant artifacts at charge time

Fraud happens. When a chargeback arrives, you need the OAuth grant timestamp, the token's sub claim, the RAR authorization_details, and the Stripe charge ID — all linked to the same transaction record. If you log these separately, you may not be able to reconstruct the chain in time to contest the dispute. Fix: at charge creation time, write a single atomic record linking all of these identifiers to durable storage.

4. Using DPoP without server-side nonce enforcement

Without server-issued nonces, a DPoP proof remains replayable for its whole validity window (typically a few minutes): an attacker who intercepts a proof in transit can reuse it against the same HTTP method and URI until it expires. Fix: configure your authorization server to require DPoP nonces per RFC 9449 §8. Clients should persist the latest nonce from the server's DPoP-Nonce response header and include it in subsequent proofs.
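
On the client side, nonce handling might look roughly like the sketch below. It only tracks the latest `DPoP-Nonce` value per server and assembles the proof claims defined in RFC 9449 (`jti`, `htm`, `htu`, `iat`, `nonce`); signing the proof JWT is deliberately left out. `DpopNonceStore` and its method names are hypothetical, and the caller is assumed to supply a unique `jti` from a secure random source.

```typescript
// Claims of a DPoP proof JWT (RFC 9449). Signing is out of scope here.
interface DpopProofClaims {
  jti: string;
  htm: string;
  htu: string;
  iat: number;
  nonce?: string;
}

// htu must not contain the query string or fragment.
function stripQueryAndFragment(url: string): string {
  let end = url.length;
  const q = url.indexOf("?");
  const h = url.indexOf("#");
  if (q !== -1) end = Math.min(end, q);
  if (h !== -1) end = Math.min(end, h);
  return url.substring(0, end);
}

// Remembers the most recent DPoP-Nonce value seen from each server origin.
class DpopNonceStore {
  private nonces: { [origin: string]: string } = {};

  // Call with every response; header names are matched case-insensitively.
  recordResponse(origin: string, headers: { [name: string]: string }): void {
    Object.keys(headers).forEach((name) => {
      const value = headers[name];
      if (value !== undefined && name.toLowerCase() === "dpop-nonce") {
        this.nonces[origin] = value;
      }
    });
  }

  // Assemble the claims for the next proof, including the nonce if known.
  buildClaims(
    origin: string,
    method: string,
    url: string,
    jti: string,
    nowSeconds: number
  ): DpopProofClaims {
    const claims: DpopProofClaims = {
      jti: jti,
      htm: method.toUpperCase(),
      htu: stripQueryAndFragment(url),
      iat: nowSeconds,
    };
    const nonce = this.nonces[origin];
    if (nonce !== undefined) {
      claims.nonce = nonce;
    }
    return claims;
  }
}
```

When the server rejects a request and supplies a fresh nonce, the client records that response and retries once with a new proof built from the updated store.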

5. Not backtesting Radar rules before enabling Block action

A rule like Block if :ip_country: != ::principal_home_country:: will block every principal who travels or uses a VPN. Running it in Block mode without a backtest will produce false positives at scale. Fix: enable all new rules in Review mode first. Backtest against the last six months of transactions using Radar's backtest interface. Graduate to Block only after confirming the false positive rate is acceptable.

6. Treating agent velocity the same as human velocity

An agent that legitimately processes 50 transactions per hour for a principal running a bulk-purchasing workflow will trigger standard hourly velocity rules calibrated for human shoppers. Fix: tier your velocity rules by agent type. Pass a principal_authorized_tps metadata field reflecting the RAR-authorized transaction rate. Write Radar rules that allow higher velocity for principals who explicitly authorized high-rate workflows.
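
As an application-side illustration of tiered velocity, the sketch below counts charges in a rolling one-hour window and compares the count against the principal's authorized rate when one exists. The function name and parameters are hypothetical, and it assumes the authorized rate is expressed per hour; adjust the window if your `principal_authorized_tps` value uses a different unit.

```typescript
type VelocityDecision = "allow" | "review";

const ONE_HOUR_MS = 60 * 60 * 1000;

// priorChargeTimesMs: timestamps of this principal's earlier agent charges.
// authorizedPerHour: rate the principal explicitly authorized, if any.
// defaultHourlyLimit: fallback threshold calibrated for ordinary shoppers.
function checkAgentVelocity(
  priorChargeTimesMs: number[],
  nowMs: number,
  authorizedPerHour: number | undefined,
  defaultHourlyLimit: number
): VelocityDecision {
  const windowStart = nowMs - ONE_HOUR_MS;
  const recent = priorChargeTimesMs.filter(
    (t) => t > windowStart && t <= nowMs
  ).length;
  const limit =
    authorizedPerHour !== undefined ? authorizedPerHour : defaultHourlyLimit;
  // The charge being evaluated would be number (recent + 1) in the window.
  return recent + 1 > limit ? "review" : "allow";
}
```

A principal who authorized 50 charges per hour is only sent to review on the 51st charge in the window, while a principal with no explicit rate stays on the default threshold.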

7. Assuming a soft decline means no fraud

Agents retry soft declines programmatically. Stripe's Smart Retries and Adaptive Acceptance will attempt to recover the transaction. A fraudulent token that generates a soft decline on the first attempt may succeed on retry. Fix: write a Radar rule that places an agent-attributed charge in Review after a first soft decline within the same session. Pass a retry_count metadata field incremented by your application and escalate disposition when ::retry_count:: > 1.
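
One way to keep the retry counter honest is to derive both the metadata and an application-side disposition from the same number. The sketch below is illustrative only: `planAttempt` is a hypothetical helper, and treating more than one prior soft decline as "block" is an assumption about what escalation means for your risk appetite.

```typescript
type Disposition = "allow" | "review" | "block";

interface AttemptPlan {
  metadata: { [key: string]: string };
  disposition: Disposition;
}

// priorSoftDeclines: soft declines already seen for this purchase in the
// current agent session, tracked by your application (not by Stripe).
function planAttempt(priorSoftDeclines: number): AttemptPlan {
  if (
    priorSoftDeclines < 0 ||
    Math.floor(priorSoftDeclines) !== priorSoftDeclines
  ) {
    throw new Error("priorSoftDeclines must be a non-negative integer");
  }
  let disposition: Disposition = "allow";
  if (priorSoftDeclines === 1) {
    disposition = "review"; // first retry after a soft decline
  } else if (priorSoftDeclines > 1) {
    disposition = "block"; // assumption: escalate beyond review
  }
  return {
    // Stripe metadata values are strings, so the counter is serialized.
    metadata: {
      is_agent_flow: "true",
      retry_count: String(priorSoftDeclines),
    },
    disposition: disposition,
  };
}
```

The `metadata` object is attached to the next payment attempt so that a Radar rule keyed on `::retry_count::` sees the same value your application acted on.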

8. No principal notification on agent purchase completion

Even with perfect fraud controls, principals may not recognize legitimate purchases the agent made on their behalf, which leads to first-party chargebacks (friendly fraud). Fix: send a push notification or email to the principal for every agent-completed purchase above a threshold (e.g., $25). Include item description, amount, timestamp, and a dispute-prevention link. A prompt notification gives the principal a chance to recognize or flag the charge before contacting their issuer, which can materially reduce first-party chargeback rates.

§11 · FAQ

Frequently asked questions.

Can Stripe Radar detect that a transaction was made by an AI agent rather than a human?

Out of the box, no. Stripe Radar's ML is trained on behavioral and network signals common across human-initiated card-not-present transactions. It does not have a native "is this an agent?" classifier. Agent flows often present as off-session, headless, with no browser fingerprint or behavioral biometric — signals that can look like bot traffic or compromised-account activity. The solution is to explicitly pass is_agent_flow: true in Stripe charge metadata and write custom Radar rules (requiring Fraud Teams) keyed to that flag plus agent-specific signals like token age and MCC context.

What is the Stripe Radar for Fraud Teams pricing, and is it worth it for an agent commerce operator?

Radar for Fraud Teams is currently ~$0.02 per screened transaction (re-verify at stripe.com/radar/pricing before launch). For an operator processing $500,000/month with an average transaction of $75, that is approximately 6,667 transactions × $0.02 = $133/month for the custom rules engine, backtesting, advanced analytics, and Sigma data access. A single successful fraud chargeback on a $200 agent transaction costs the merchant the $200 plus a ~$15 dispute fee, about $215 in total, so one prevented chargeback of that size covers roughly a month and a half of Fraud Teams fees. For agent commerce where fraud signals are novel and default rules inadequate, the upgrade is almost always economically justified.
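
The arithmetic in this answer can be reproduced with a few lines, which makes it easy to substitute your own volume and ticket size. The function below is a hypothetical helper; the inputs are the figures quoted above and should be re-verified against current Stripe pricing.

```typescript
interface BreakEven {
  transactions: number;
  monthlyFeeUsd: number;
  chargebackCostUsd: number;
  monthsCoveredPerChargeback: number;
}

// All amounts are in US dollars.
function fraudTeamsBreakEven(
  monthlyVolumeUsd: number,
  avgTicketUsd: number,
  feePerTransactionUsd: number,
  chargebackAmountUsd: number,
  disputeFeeUsd: number
): BreakEven {
  const transactions = Math.round(monthlyVolumeUsd / avgTicketUsd);
  const monthlyFeeUsd = transactions * feePerTransactionUsd;
  const chargebackCostUsd = chargebackAmountUsd + disputeFeeUsd;
  return {
    transactions: transactions,
    monthlyFeeUsd: monthlyFeeUsd,
    chargebackCostUsd: chargebackCostUsd,
    // How many months of fees one prevented chargeback pays for.
    monthsCoveredPerChargeback: chargebackCostUsd / monthlyFeeUsd,
  };
}

// Figures from the answer above: $500,000/month, $75 average ticket,
// $0.02 per screened transaction, $200 chargeback, $15 dispute fee.
const exampleBreakEven = fraudTeamsBreakEven(500000, 75, 0.02, 200, 15);
```

With these inputs the helper yields 6,667 transactions, a monthly fee of about $133, and a chargeback cost of $215, so each prevented chargeback of that size pays for roughly 1.6 months of fees.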

Does DPoP eliminate agent transaction fraud?

No. DPoP eliminates the specific threat of token exfiltration and bearer-token replay — an attacker who steals the access token but does not control the agent's private key cannot use it. DPoP does not protect against: (a) full device/runtime compromise where the attacker also controls the private key; (b) social engineering of the principal to authorize a malicious agent; (c) prompt injection attacks that cause the legitimate agent itself to make fraudulent purchases; (d) friendly fraud where the principal disputes a legitimate agent purchase. Payment-layer controls (Radar) and principal notification remain necessary alongside DPoP.

Who is liable for a chargeback on an agent-initiated purchase — the principal, the agent platform, or the merchant?

Under current card network rules as applied by Stripe, the merchant (the entity that processed the charge and received the funds) is the liable party. The principal's issuer will charge back the merchant, not the agent platform. Whether the merchant can then recover from the agent platform depends on contractual terms between the merchant and the platform; Stripe has no visibility into that relationship. Stripe's standard terms apply the same chargeback rules to agent-initiated transactions as to any other CNP transaction. (Re-verify: Stripe's ACP documentation and SPT terms should be checked for any liability carve-outs specific to ACP-flow transactions.)

What is Stripe Sigma's role in agent fraud detection?

Stripe Sigma provides SQL access to all Stripe data including Radar signals, early fraud warnings, dispute records, and charge metadata. It does not block transactions in real time; that is Radar's role. Sigma's value in the agent context is post-hoc pattern analysis: identifying which agent-attributed charges have elevated dispute rates, which principal accounts show spending cap probing patterns, and which MCCs are anomalous for your agent traffic. These insights feed back into custom Radar rule development. Sigma availability and pricing vary by account (re-verify on Stripe's Sigma pricing page before budgeting); Radar data in Sigma requires the Fraud Teams tier.

How do Stripe's Agentic Commerce Protocol (ACP) and Shared Payment Token (SPT) affect fraud prevention?

ACP is an open standard co-developed by Stripe and OpenAI for agent-to-merchant transaction flows. SPTs are scoped to a specific merchant and cart total, reducing the blast radius compared to a general-purpose bearer token — a stolen SPT can only be used at the specific merchant for the specific cart total it was issued for. However, SPTs are still bearer tokens in the sense that possession confers use. The fraud prevention stack described in this document (Radar rules, metadata enrichment, DPoP at the OAuth layer) remains necessary.

My agent retries programmatically and I'm worried about Visa/Mastercard retry penalty thresholds. What is the risk?

Visa allows up to 15 retry attempts per card per rolling 30-day period; Mastercard allows 35. Exceeding these can result in fines from acquirers. Stripe's Adaptive Acceptance reduces retry counts by identifying false declines efficiently; the updated TabTransformer+ model reduced retry attempts by 35% while recovering more revenue. However, agents that implement their own retry logic outside of Stripe's managed flows may inadvertently exceed network limits. Use Stripe's Smart Retries (managed) rather than implementing raw retry loops. Pass a retry_count metadata field on each attempt so that Radar rules can reference ::retry_count:: and track retry depth.
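
If an agent does run any retry logic of its own, a per-card retry budget is one way to stay under the limits quoted above. The sketch below is a hypothetical guard: the limits are the figures stated in this answer (re-verify them against current network rules), and the `safetyMargin` parameter is an assumption that leaves headroom for retries made elsewhere in the stack.

```typescript
type CardNetwork = "visa" | "mastercard";

// Limits as quoted in this guide; re-verify before relying on them.
const RETRY_LIMITS: { [K in CardNetwork]: number } = {
  visa: 15,
  mastercard: 35,
};

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// priorRetryTimesMs: timestamps of earlier retries against the same card.
// safetyMargin: retries held back as headroom below the network limit.
function canRetry(
  network: CardNetwork,
  priorRetryTimesMs: number[],
  nowMs: number,
  safetyMargin: number
): boolean {
  const windowStart = nowMs - THIRTY_DAYS_MS;
  const used = priorRetryTimesMs.filter((t) => t > windowStart).length;
  return used < RETRY_LIMITS[network] - safetyMargin;
}
```

Calling `canRetry` before each programmatic retry, and refusing to retry when it returns false, keeps the agent's own loop from consuming the whole network allowance.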

Can I use Stripe Radar to enforce RAR spending limits, or do I need to do that at my own API layer?

Stripe Radar cannot read OAuth authorization_details directly — Stripe has no visibility into your authorization server. You must propagate the enforced parameters (spending cap, MCC restrictions) as charge metadata, then write Radar rules that enforce them as a second layer. This is a belt-and-suspenders architecture: the AS enforces limits at the token level, and Radar enforces them again at the payment level. Neither layer alone is sufficient. The AS enforcement catches misuse at the API layer before payment is attempted; Radar catches any bypasses or merchant-API logic errors before the charge clears.
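
The first of the two layers, the check at the merchant API before any payment call, might look like the sketch below. The types and reason codes are hypothetical; the limits are assumed to have been read from the token introspection response.

```typescript
// Limits extracted from the RAR authorization_details (assumed shape).
interface RarLimits {
  maxAmountCents: number;
  currency: string;
  allowedMccs: string[];
}

interface PurchaseRequest {
  amountCents: number;
  currency: string;
  mcc: string;
}

type LimitCheck = { ok: true } | { ok: false; reason: string };

// Reject the purchase at the API layer before Stripe is ever called.
function enforceRarLimits(req: PurchaseRequest, limits: RarLimits): LimitCheck {
  if (req.currency.toLowerCase() !== limits.currency.toLowerCase()) {
    return { ok: false, reason: "currency_not_authorized" };
  }
  if (req.amountCents > limits.maxAmountCents) {
    return { ok: false, reason: "amount_exceeds_cap" };
  }
  if (limits.allowedMccs.indexOf(req.mcc) === -1) {
    return { ok: false, reason: "mcc_not_authorized" };
  }
  return { ok: true };
}
```

Only requests that pass this check proceed to charge creation, where the same limits travel as metadata so Radar can act as the second layer.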

§12 · Step-by-Step

The 30-day rollout, in five steps.

Each step mirrors the HowTo JSON-LD at the top of this page word for word.

Step 1 — Instrument every agent-initiated charge with structured metadata

Before any rules can work, Stripe needs the context. Add the following fields to every stripe.paymentIntents.create() or stripe.charges.create() call on your agent-originated payment path: is_agent_flow (string 'true'), agent_token_age_minutes (integer minutes since OAuth grant was issued), principal_home_country (ISO alpha-2 from principal profile), principal_typical_mcc (comma-separated MCC list from authorization history), transaction_hour_principal_tz (0–23 integer, principal's local time), and rar_reference_id (the unique reference from RAR authorization_details). This is the foundation all downstream rules depend on.
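
A minimal sketch of assembling these six fields is shown below. The `AgentChargeContext` type and the function name are hypothetical, and the principal's timezone is assumed to be supplied as a UTC offset in minutes; a production system would more likely resolve an IANA timezone. Because Stripe stores metadata values as strings, the integer fields are serialized.

```typescript
interface AgentChargeContext {
  grantIssuedAtMs: number; // when the OAuth grant was issued
  nowMs: number; // current time
  principalHomeCountry: string; // ISO 3166-1 alpha-2
  principalTypicalMccs: string[]; // from authorization history
  principalUtcOffsetMinutes: number; // assumption: supplied by caller
  rarReferenceId: string;
}

function buildAgentChargeMetadata(
  ctx: AgentChargeContext
): { [key: string]: string } {
  const ageMinutes = Math.floor((ctx.nowMs - ctx.grantIssuedAtMs) / 60000);
  if (ageMinutes < 0) {
    throw new Error("grant timestamp is in the future");
  }
  // Shift the clock by the principal's offset, then read the UTC hour.
  const local = new Date(ctx.nowMs + ctx.principalUtcOffsetMinutes * 60000);
  return {
    is_agent_flow: "true",
    agent_token_age_minutes: String(ageMinutes),
    principal_home_country: ctx.principalHomeCountry.toUpperCase(),
    principal_typical_mcc: ctx.principalTypicalMccs.join(","),
    transaction_hour_principal_tz: String(local.getUTCHours()),
    rar_reference_id: ctx.rarReferenceId,
  };
}
```

The returned object can be passed as the `metadata` parameter of `stripe.paymentIntents.create()`, which keeps the field names in one place for every agent-originated payment path.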

Step 2 — Upgrade to Radar for Fraud Teams and configure risk thresholds

Navigate to the Radar settings in the Stripe Dashboard. Enable Radar for Fraud Teams (re-verify pricing at stripe.com/radar/pricing). Set your base risk threshold — for agent commerce, consider lowering the elevated threshold from 65 to 60 given the absence of behavioral biometric signal on agent traffic. Select the risk setting ("conservative," "balanced," or "aggressive") that aligns with your acceptable false positive rate. Enable early fraud warnings.

Step 3 — Deploy the compound agent fraud rules in Review mode, then backtest

Add each rule from Section 8 of this document starting with Review (not Block) disposition. Use Radar's backtest feature to see how the rules would have performed against the last six months of your transaction data. Identify the false positive rate for each rule. Rules with >1% false positive rate on known-legitimate traffic need threshold adjustment before graduating to Block. Use Stripe Sigma to cross-reference disputed transactions against the rules that would have caught them to measure true positive rates.
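
The graduation check described in this step can be expressed as a small calculation over labeled backtest results. The row shape below is hypothetical (for example, built from your own export or a Sigma query), not a format that Radar's backtest interface produces; the 1% threshold is the one stated above.

```typescript
// One historical transaction, labeled with whether the rule would have
// matched it and whether it is known to be legitimate.
interface BacktestRow {
  matchedRule: boolean;
  knownLegitimate: boolean;
}

interface RuleAssessment {
  falsePositiveRate: number | null; // null when there is no legitimate sample
  readyForBlock: boolean;
}

function assessRule(
  rows: BacktestRow[],
  maxFalsePositiveRate: number
): RuleAssessment {
  const legit = rows.filter((r) => r.knownLegitimate);
  if (legit.length === 0) {
    // Without known-legitimate traffic the rate cannot be measured.
    return { falsePositiveRate: null, readyForBlock: false };
  }
  const flagged = legit.filter((r) => r.matchedRule).length;
  const rate = flagged / legit.length;
  return {
    falsePositiveRate: rate,
    readyForBlock: rate <= maxFalsePositiveRate,
  };
}
```

A rule that matches 3 of 200 known-legitimate transactions has a 1.5% false positive rate and stays in Review; one that matches 1 of 200 (0.5%) is a candidate for Block.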

Step 4 — Implement DPoP on your authorization server for agent client registrations

Require DPoP for all OAuth clients registered as agent runtimes. Generate ES256 key pairs in hardware-backed secure storage (non-extractable). Enforce server nonces (RFC 9449 Section 8). Propagate the DPoP JWK thumbprint to your transaction logging layer so it can be included in chargeback evidence. Update your token introspection response to include the cnf claim so resource servers can verify binding. Cross-reference /agentmall_spoke_oauth for AS configuration specifics.
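
To illustrate the binding check a resource server performs, the sketch below builds the RFC 7638 thumbprint input for an EC public key and compares a computed thumbprint against the `cnf.jkt` value from introspection. The hashing step (SHA-256 followed by base64url encoding) is intentionally left to your crypto library, and the type and function names are hypothetical.

```typescript
interface EcPublicJwk {
  kty: "EC";
  crv: string;
  x: string;
  y: string;
}

// RFC 7638 hash input for an EC key: required members only, in
// lexicographic order, with no whitespace.
function ecThumbprintInput(jwk: EcPublicJwk): string {
  return JSON.stringify({ crv: jwk.crv, kty: jwk.kty, x: jwk.x, y: jwk.y });
}

interface TokenIntrospection {
  active: boolean;
  cnf?: { jkt?: string };
}

// proofJkt is assumed to be base64url(SHA-256(ecThumbprintInput(key))) for
// the public key carried in the DPoP proof header, computed elsewhere.
function isBoundToProofKey(
  introspection: TokenIntrospection,
  proofJkt: string
): boolean {
  if (!introspection.active) {
    return false;
  }
  const bound = introspection.cnf ? introspection.cnf.jkt : undefined;
  return typeof bound === "string" && bound.length > 0 && bound === proofJkt;
}
```

The same thumbprint string is what gets written to the transaction log as `dpop_jkt`, so the evidence record and the binding check refer to one value.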

Step 5 — Build a durable transaction evidence record and principal notification flow

At the moment each agent-initiated charge is confirmed by Stripe (on payment_intent.succeeded webhook), write an immutable record containing: stripe_charge_id, stripe_customer_id, oauth_grant_id, rar_reference_id, token_issued_at (from introspection), principal_id, amount_cents, currency, mcc, agent_platform_id, dpop_jkt (if DPoP implemented), radar_risk_score, and charge_timestamp. Simultaneously dispatch a push/email notification to the principal with charge details and a one-click dispute flag. This evidence package is your chargeback defense and your principal-trust mechanism. Retain for a minimum of 18 months (chargebacks can arrive up to 120 days post-transaction; arbitration extends this further).
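
The evidence record listed above could be assembled as in the sketch below. The field names follow this step; the `retain_until` field and the function name are hypothetical additions for illustration, and freezing the object only prevents in-process mutation, so immutability in storage still depends on your database or log.

```typescript
interface EvidenceInput {
  stripe_charge_id: string;
  stripe_customer_id: string;
  oauth_grant_id: string;
  rar_reference_id: string;
  token_issued_at: string; // ISO 8601, from introspection
  principal_id: string;
  amount_cents: number;
  currency: string;
  mcc: string;
  agent_platform_id: string;
  dpop_jkt?: string; // only if DPoP is implemented
  radar_risk_score: number;
  charge_timestamp: string; // ISO 8601
}

interface EvidenceRecord extends EvidenceInput {
  retain_until: string; // hypothetical field: earliest deletion date
}

const RETENTION_MONTHS = 18;

function buildEvidenceRecord(input: EvidenceInput): Readonly<EvidenceRecord> {
  const charged = new Date(input.charge_timestamp);
  if (isNaN(charged.getTime())) {
    throw new Error("charge_timestamp must be an ISO 8601 date");
  }
  // Month overflow (e.g. the 31st) rolls forward, never backward, so the
  // retention period is at least 18 months.
  const retain = new Date(charged.getTime());
  retain.setUTCMonth(retain.getUTCMonth() + RETENTION_MONTHS);
  const record: EvidenceRecord = {
    ...input,
    retain_until: retain.toISOString(),
  };
  return Object.freeze(record);
}
```

Writing this record and dispatching the principal notification from the same webhook handler keeps the two artifacts tied to one confirmed charge.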

§13 · Continue the Guide

The complete trust layer.

Auth Layer

OAuth + PKCE + DPoP

The upstream trust layer: delegated purchase authority, RFC 9396 RAR spending limits, DPoP token binding implementation, and the RFC 7009 revoke path. Wire this spoke's metadata pipeline directly to the OAuth grant record.

Security

Security Hardening

Broader security posture beyond fraud: infrastructure hardening, secrets management, dependency audit, and the security controls that reduce the blast radius when a platform breach occurs.

Roadmap

AgentMall Roadmap

The 4-Layer Agent-Ready Model and the full trust-batch spoke sequence. Fraud prevention is layer 3 of 4; see where the other spokes plug in.

Bot Verification

Cloudflare Bot Verification

Distinguish legitimate agents from scrapers and fraudulent bots at the edge — before they reach your payment endpoint. Works alongside Radar to stop automated attacks earlier in the funnel.

Privacy

Privacy Compliance

GDPR and CCPA compliance for agent-collected transaction data — including the right-to-erasure tension with chargeback evidence retention requirements.

Trust Signals

Verified Reviews

Cryptographically verified purchase-linked reviews that agents can read as trust signals before initiating transactions — reducing dispute-driven chargebacks from principal uncertainty.

The Window

Auth is solved. Payment cleared. Now make it fraud-proof.

Good OAuth + PKCE + DPoP puts the right tokens in the right hands at the right time. But the post-auth attack surface is real — tokens leak, principals get phished, and agents probe enforcement boundaries programmatically. The operators who close that window first will carry lower dispute rates, lower chargeback costs, and the kind of trust signals that agents prefer to route toward. The full AgentMall Roadmap connects this payment-layer spoke to the auth layer, the bot-verification edge, and the agent-facing trust signals that complete the picture.

Open the AgentMall Roadmap →
