
Idempotent APIs: Stop Duplicate Charges, Double Emails, and Ghost Orders

2026-04-20 · 11 min read · deep-dive

A user taps "Pay" on your checkout page. The request takes 8 seconds because your payment provider is slow. They tap again. Now they have two charges, two order confirmation emails, and one angry support ticket.

Or: your mobile app's retry logic fires three times on a flaky 4G connection. Your backend dutifully creates three identical records.

Or: a webhook consumer crashes after writing to the database but before acking the message. The queue redelivers. You process it twice.

These are all the same problem: the network is unreliable, and "exactly once" does not exist. The only tool you have is idempotency — the property that processing a request twice has the same effect as processing it once.

This post walks through what idempotency actually means, how to implement it correctly (including the race conditions everyone forgets), and how the big players like Stripe, AWS, and Shopify design their idempotency layers.

What idempotency is — and isn't

A method is idempotent if calling it N times produces the same result as calling it once. HTTP already gives you some of this for free:

  • GET, HEAD, OPTIONS — safe and idempotent
  • PUT, DELETE — idempotent (but not safe)
  • POST, PATCH — not idempotent by default

The trap: REST tutorials say "use PUT for updates because it's idempotent." That's true for simple overwrites, but PUT /users/123 { "credits": 100 } is only idempotent if you're setting the value to 100. If your handler does credits += 100, you've just broken the contract.
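To make the contrast concrete, here's a toy sketch (an in-memory dict stands in for the user store, and the handler names are invented for this example):

```python
# Toy in-memory store; the two handlers below model the two possible
# semantics behind the same PUT /users/123 request.
users = {123: {"credits": 0}}

def put_credits_absolute(user_id: int, value: int) -> None:
    # Idempotent: repeating the call leaves the same final state.
    users[user_id]["credits"] = value

def put_credits_increment(user_id: int, value: int) -> None:
    # NOT idempotent: each retry changes the observable state again.
    users[user_id]["credits"] += value

put_credits_absolute(123, 100)
put_credits_absolute(123, 100)   # retry: still 100

put_credits_increment(123, 100)
put_credits_increment(123, 100)  # retry: now 300 — contract broken
```

Same verb, same URL, opposite guarantees. The retry behavior depends entirely on what the handler does, which is the whole point of the next list.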

Idempotency is about observable effects, not HTTP verbs:

  • Creating a row with a natural unique key (email, slug) → idempotent
  • Creating a row with an auto-increment ID → not idempotent
  • UPDATE balance = 50 WHERE id = 1 → idempotent
  • UPDATE balance = balance + 50 WHERE id = 1 → not idempotent
  • Sending an email → not idempotent (unless you dedupe at the sender)
  • Publishing to Kafka with an idempotent producer config → idempotent (Kafka dedupes on producerId + sequence)
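The Kafka case is worth a closer look, because the dedupe rule is simple enough to model in a few lines. This is a toy illustration of the idea — the broker tracks the highest sequence number seen per producer ID and drops replays — not the real protocol:

```python
# Toy model of the dedupe rule behind Kafka's idempotent producer:
# the broker remembers the highest sequence number appended per
# producer ID and silently drops anything at or below it.
class Broker:
    def __init__(self):
        self.log = []
        self.last_seq = {}  # producer_id -> highest sequence appended

    def append(self, producer_id: str, seq: int, record: str) -> bool:
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # duplicate from a producer retry: dropped
        self.last_seq[producer_id] = seq
        self.log.append(record)
        return True

broker = Broker()
broker.append("p1", 0, "order-created")
broker.append("p1", 0, "order-created")  # network retry: deduped
broker.append("p1", 1, "order-paid")
# broker.log now holds each record exactly once
```

The producer ID plays the same role an idempotency key plays in the HTTP pattern below: a stable identity that lets the receiver recognize a retry.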

Two categories of operations require explicit idempotency infrastructure:

  1. Non-deterministic side effects — charging cards, sending emails, calling external APIs.
  2. State mutations with auto-generated identifiers — creating orders, users, uploads.

For these, you need an idempotency key.

The idempotency key pattern

The client generates a unique key (UUID v4 is standard) and sends it with the request:

POST /v1/charges HTTP/1.1
Idempotency-Key: a4f8c2e1-9b3d-4c7a-8e1f-2a6b9c4d8e3f
Content-Type: application/json

{ "amount": 5000, "currency": "usd", "source": "tok_visa" }

The server's job:

  1. Look up the key. If found, return the cached response without re-executing.
  2. If not found, execute the operation, store the result, and return it.
  3. Handle two clients racing with the same key (which does happen).

That's the contract. The implementation is where it gets interesting.

Storage: what you actually need to persist

A minimal idempotency record needs:

| Field | Purpose |
| --- | --- |
| key | Primary key (UUID from client) |
| request_hash | SHA-256 of request body + path |
| status | pending, completed, or failed |
| response_code | Cached HTTP status |
| response_body | Cached response (JSON) |
| created_at | For TTL (expire old keys) |
| locked_at | For detecting stuck requests |
| user_id | Scope keys per user to prevent cross-tenant collisions |

The request_hash matters. If a client reuses the same idempotency key with a different payload, that's almost always a bug — you should reject it (422 Unprocessable Entity is a reasonable choice), not silently return the old response or process the new one. Stripe documents the same rule: reusing a key with different parameters is an error, never a silent replay.
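Computing the hash is straightforward; the one subtlety is canonicalizing the body so formatting and key-order differences don't defeat the comparison. A sketch (binding the path into the hash is one reasonable convention, not a standard):

```python
import hashlib
import json

def request_hash(path: str, body: dict) -> bytes:
    # Canonicalize the body (sorted keys, fixed separators) so the same
    # logical payload always hashes identically, then bind the path so
    # the same key can't silently replay against a different endpoint.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{path}\n{canonical}".encode()).digest()

h1 = request_hash("/v1/charges", {"amount": 5000, "currency": "usd"})
h2 = request_hash("/v1/charges", {"currency": "usd", "amount": 5000})
assert h1 == h2  # key order doesn't matter
assert h1 != request_hash("/v1/charges", {"amount": 9999, "currency": "usd"})
```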

Here's a PostgreSQL schema:

CREATE TABLE idempotency_keys (
  key UUID NOT NULL,
  user_id BIGINT NOT NULL,
  request_hash BYTEA NOT NULL,
  status VARCHAR(16) NOT NULL,
  response_code SMALLINT,
  response_body JSONB,
  resource_id BIGINT,
  locked_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  expires_at TIMESTAMPTZ NOT NULL DEFAULT NOW() + INTERVAL '24 hours',
  PRIMARY KEY (user_id, key)
);

CREATE INDEX idx_idempotency_expires ON idempotency_keys(expires_at);

Note the composite primary key: scoping keys per-user prevents one tenant's UUID collision from affecting another.
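Against this schema, the atomic "claim the key" step can be a single conditional insert. A minimal sketch, using the stdlib SQLite driver (and its older INSERT OR IGNORE spelling) so it runs anywhere — in Postgres the same idea is INSERT ... ON CONFLICT (user_id, key) DO NOTHING:

```python
import sqlite3

# Trimmed-down version of the schema above, enough to show the acquire step.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE idempotency_keys (
        user_id INTEGER NOT NULL,
        key TEXT NOT NULL,
        status TEXT NOT NULL,
        PRIMARY KEY (user_id, key)
    )
""")

def try_acquire(user_id: int, key: str) -> bool:
    # rowcount is 1 if the row was inserted (we own the key),
    # 0 if the (user_id, key) pair already existed.
    cur = db.execute(
        "INSERT OR IGNORE INTO idempotency_keys (user_id, key, status) "
        "VALUES (?, ?, 'pending')",
        (user_id, key),
    )
    return cur.rowcount == 1

first = try_acquire(42, "a4f8c2e1")   # acquired
second = try_acquire(42, "a4f8c2e1")  # duplicate request: denied
other = try_acquire(7, "a4f8c2e1")    # same UUID, different tenant: fine
```

There is no SELECT-then-INSERT window here: the database's unique constraint is the arbiter, which is exactly what the naive implementation below gets wrong.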

The naive implementation (and why it's broken)

Here's what most developers write on their first attempt:

@app.post("/charges")
def create_charge(req, idempotency_key: str):
    existing = db.query(
        "SELECT * FROM idempotency_keys WHERE user_id = %s AND key = %s",
        (req.user_id, idempotency_key)
    )
    if existing:
        return existing.response_body, existing.response_code

    result = stripe.charges.create(amount=req.amount, source=req.source)

    db.execute(
        "INSERT INTO idempotency_keys (user_id, key, response_body, response_code) "
        "VALUES (%s, %s, %s, 200)",
        (req.user_id, idempotency_key, json.dumps(result))
    )
    return result, 200

This code has three race conditions:

  1. Double-submit race: Two requests with the same key arrive simultaneously. Both pass the SELECT check, both call Stripe, both try to INSERT. One fails on the unique constraint — but the card has been charged twice.

  2. Crash race: The Stripe call succeeds, but the server crashes before INSERT. The client retries. Card charged twice.

  3. Long-running race: Stripe takes 15 seconds. Client times out, retries. Server processes both in parallel.

Fixing all three requires a state machine and a lock.

The correct implementation

Here's the pattern Stripe and most serious payment systems use:

import json
from enum import Enum

from psycopg2.errors import UniqueViolation  # or your driver's duplicate-key error

# Assumes a now() helper returning a timezone-aware datetime.

class Status(str, Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    FAILED = "failed"

def idempotent_create_charge(req, idempotency_key, request_hash):
    # Step 1: try to acquire the key atomically
    try:
        with db.transaction():
            db.execute("""
                INSERT INTO idempotency_keys
                    (user_id, key, request_hash, status, locked_at)
                VALUES (%s, %s, %s, 'pending', NOW())
            """, (req.user_id, idempotency_key, request_hash))
            acquired = True
    except UniqueViolation:
        acquired = False

    if not acquired:
        # Another request owns this key. Fetch current state.
        row = db.query_one("""
            SELECT request_hash, status, response_code, response_body, locked_at
            FROM idempotency_keys
            WHERE user_id = %s AND key = %s
        """, (req.user_id, idempotency_key))

        # Reject payload mismatch
        if row.request_hash != request_hash:
            return {"error": "idempotency_key_reused"}, 422

        if row.status == Status.COMPLETED:
            return row.response_body, row.response_code

        if row.status == Status.PENDING:
            # Stuck or concurrent? If lock is old, it's stuck.
            if (now() - row.locked_at).total_seconds() > 60:
                # Recovery path — see next section
                return recover_stuck_key(idempotency_key), 409
            return {"error": "request_in_progress"}, 409

        if row.status == Status.FAILED:
            return row.response_body, row.response_code

    # Step 2: we own the key. Do the work.
    try:
        result = stripe.charges.create(
            amount=req.amount,
            source=req.source,
            idempotency_key=idempotency_key,  # also pass it upstream!
        )
        response = result.to_dict()
        code = 200
        status = Status.COMPLETED
    except stripe.error.CardError as e:
        response = {"error": str(e)}
        code = 402
        status = Status.FAILED

    # Step 3: persist the result
    db.execute("""
        UPDATE idempotency_keys
        SET status = %s, response_code = %s, response_body = %s, locked_at = NULL
        WHERE user_id = %s AND key = %s
    """, (status, code, json.dumps(response), req.user_id, idempotency_key))

    return response, code

Three things to notice:

One, we pass the idempotency key downstream. Stripe, AWS, and most modern APIs accept client-generated idempotency keys. If you're proxying to them, forward yours. The key propagates through the whole call chain and each layer dedupes independently.

Two, we store failures too. If Stripe rejects the card (402), we cache that response. A retry returns the same 402 instead of re-attempting — otherwise a client stuck in a retry loop could trigger fraud alerts.

Three, we have a pending state with a lock timestamp. This is where most implementations cut corners and then have mysterious bugs at 3am.

The stuck-request problem

What happens if the server crashes while processing? The row stays in pending forever, and every retry gets 409 request_in_progress. That's worse than no idempotency at all.

You need a recovery strategy. The options:

Option A — expire pending keys after a timeout. If locked_at is older than, say, 60 seconds, assume the original worker is dead and let the retry proceed. Safe only if your underlying operation is also idempotent (you passed the key to Stripe, right?).

Option B — use a background reconciler. Sweep pending rows older than N seconds. For each, check the real state of the side effect (query Stripe for a charge with that idempotency key), then update the row.

Option C — require the client to explicitly poll. Return 202 Accepted with a status URL. The client checks until completed. More work for clients, but unambiguous.

Stripe uses a variant of B: their idempotency layer records what endpoint was called and replays the cached response on retry, but they run reconciliation jobs to close out stale pending records.

For most applications, Option A with a 60-second timeout plus passing the key upstream is the sweet spot. The upstream API will reject the duplicate even if your recovery is wrong.
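One way Option A's takeover step could look — a sketch of the recovery idea, not a verbatim recover_stuck_key; SQLite and epoch-second timestamps stand in for Postgres and TIMESTAMPTZ:

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE idempotency_keys (
        user_id INTEGER, key TEXT, status TEXT, locked_at REAL,
        PRIMARY KEY (user_id, key)
    )
""")

LOCK_TIMEOUT_S = 60

def try_takeover(user_id: int, key: str) -> bool:
    # The WHERE clause makes the takeover atomic: only one retry can
    # move locked_at forward, even if several race on the same stale key.
    cur = db.execute(
        "UPDATE idempotency_keys SET locked_at = ? "
        "WHERE user_id = ? AND key = ? AND status = 'pending' "
        "AND locked_at < ?",
        (time.time(), user_id, key, time.time() - LOCK_TIMEOUT_S),
    )
    return cur.rowcount == 1

# A worker that died 5 minutes ago left this row behind:
db.execute(
    "INSERT INTO idempotency_keys VALUES (42, 'a4f8c2e1', 'pending', ?)",
    (time.time() - 300,),
)
won = try_takeover(42, "a4f8c2e1")   # stale lock: takeover succeeds
lost = try_takeover(42, "a4f8c2e1")  # lock is now fresh: denied
```

The retry that wins the takeover then re-runs the operation — which is safe precisely because the key was passed upstream on the first attempt.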

Why Redis isn't enough (usually)

A tempting shortcut: use Redis SET NX EX to hold the idempotency lock, and skip the database.

acquired = redis.set(f"idem:{key}", "pending", nx=True, ex=86400)

This works for deduplication, but fails for response replay. When the same key comes back 10 seconds later, you need to return the same response body. If you only stored "pending" in Redis, you have nothing to return.

You have three viable architectures:

  1. Redis lock + DB record — Redis for fast "is this in flight?" checks, DB for durable response storage. Good throughput, more moving parts.
  2. DB only — simpler, uses SELECT ... FOR UPDATE or an INSERT ... ON CONFLICT pattern. Fine up to a few thousand RPS.
  3. Redis only — works if your responses are small and losing them on a Redis failure is acceptable. Risky for payments.

For anything involving money, use durable storage. For "don't send the welcome email twice," Redis-only is fine.
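A minimal sketch of architecture 1, with plain dicts standing in for the Redis client and the durable table — this shows the shape of the control flow, not a production client:

```python
# Architecture 1: a fast in-flight lock (Redis) plus a durable record
# (DB) for response replay. Dicts model both stores so this runs here.
redis_store = {}  # models Redis: key -> "pending", SET ... NX semantics
db_store = {}     # models the durable table: key -> cached response

def acquire(key: str) -> bool:
    # Models SET key "pending" NX EX 86400: succeeds only if absent.
    if key in redis_store:
        return False
    redis_store[key] = "pending"
    return True

def handle(key: str, do_work):
    cached = db_store.get(key)
    if cached is not None:
        return cached  # durable replay: same response as the first call
    if not acquire(key):
        return (409, {"error": "request_in_progress"})
    result = (200, do_work())  # the real side effect, executed once
    db_store[key] = result     # persist the response before unlocking
    redis_store.pop(key, None)
    return result

calls = []
first = handle("k1", lambda: calls.append("charged") or {"id": "ch_1"})
replay = handle("k1", lambda: calls.append("charged") or {"id": "ch_2"})
```

The ordering matters: the response is persisted before the lock is released, so a retry arriving in between sees either a 409 or the cached response — never a window in which it can re-run the work.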

The client side matters too

Server-side idempotency is useless if clients don't generate stable keys. Common mistakes:

  • Key generated per retry instead of per logical operation. If your SDK does uuid.uuid4() inside the retry loop, each retry gets a fresh key and the server can't deduplicate.
  • Key derived from request body. Hashing the payload means the same payload always gets the same key — so a user legitimately trying to charge the same card for the same amount twice gets blocked.
  • Key stored only in memory. If the app crashes mid-request, a retry on app restart generates a new key.

Correct client pattern (simplified):

async function charge(amount: number, source: string) {
  // Generate ONCE per logical operation, persist to local storage
  const key = await getOrCreateIdempotencyKey(`charge:${amount}:${source}:${cartId}`);

  return await withRetry(() =>
    fetch("/v1/charges", {
      method: "POST",
      headers: { "Idempotency-Key": key },
      body: JSON.stringify({ amount, source }),
    })
  );
}

The key lifecycle belongs to the business operation, not the HTTP request.
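For illustration, here's one way the (hypothetical) getOrCreateIdempotencyKey helper above could behave, sketched in Python with a dict standing in for persistent local storage:

```python
import uuid

# A dict models persistent client-side storage. The fingerprint names
# the *logical* operation; the key itself is still a random UUID, so
# two genuinely different operations never share a key.
_local_storage = {}

def get_or_create_idempotency_key(operation_fingerprint: str) -> str:
    if operation_fingerprint not in _local_storage:
        _local_storage[operation_fingerprint] = str(uuid.uuid4())
    return _local_storage[operation_fingerprint]

k1 = get_or_create_idempotency_key("charge:5000:tok_visa:cart_42")
k2 = get_or_create_idempotency_key("charge:5000:tok_visa:cart_42")  # retry
k3 = get_or_create_idempotency_key("charge:5000:tok_visa:cart_43")  # new cart
```

Note that the fingerprint includes the cart ID: that is what distinguishes a retry of the same purchase (same key) from a genuinely new purchase of the same amount (new key), avoiding the "key derived from request body" trap above.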

Scoping and expiration

Two design decisions that bite teams later:

Key scope. Are idempotency keys global, per-user, or per-endpoint? Per-user (composite key) is almost always right. Global means one user's UUID collision affects everyone. Per-endpoint means POST /charges and POST /refunds with the same key don't collide — which is usually what you want, because an SDK might reuse keys across operations by accident.

Retention. 24 hours is the industry default (Stripe, AWS API Gateway). Longer wastes storage. Shorter breaks clients that retry after network partitions or app restarts. Make it configurable but default high.

Cleanup. Do not rely on DELETE WHERE expires_at < NOW() in a cron job — on large tables that locks rows you don't want locked. Use partition-by-day tables and DROP PARTITION, or a TTL-indexed store like DynamoDB.

What to make idempotent

Not every endpoint needs this machinery. The checklist:

  • Does it cause a non-reversible side effect? (charge, email, SMS, external API call) → yes
  • Does it create a resource the client needs to reference by ID? → yes
  • Does it spend a limited resource (inventory, rate-limit budget, quota)? → yes
  • Is it a pure read or a trivially idempotent update (SET status = 'active')? → no

For everything in the "yes" bucket, idempotency is not optional. For everything else, skip the complexity.

Takeaways

  • "Exactly once" is a myth. "At least once + idempotent processing" is how reliable systems work.
  • Idempotency keys are about observable effects, not HTTP verbs. POST can be idempotent; PUT isn't automatically.
  • Use a state machine: pending → completed/failed, with a lock timestamp for recovery.
  • Store the response, not just the lock. Redis-only is fine for dedup, not replay.
  • Propagate idempotency keys to every downstream service that accepts them.
  • Reject payload mismatches with 422, not silent wrong behavior.
  • Scope keys per-user, expire after ~24 hours, and never rely on bulk DELETE for cleanup.

Get this right once, centralize it in middleware, and every new endpoint gets reliability for free. Get it wrong, and you'll debug phantom double-charges for years.

#api #backend #distributed-systems #redis #postgresql #reliability