API Idempotency and Retries: Practical Patterns for Safe Requests
By Chris Moen • Published 2026-02-24
This guide gives the quick answer first, then client and server patterns, code, and tests for making retries safe with idempotency keys, exponential backoff, and clear dedup rules.
Quick answer: To make API retries safe, combine idempotency keys with a disciplined retry policy. The client retries transient failures with exponential backoff and jitter while sending the same key. The server enforces “one result per key” and returns the original response for duplicates.
What is API idempotency and why it matters
Idempotency means that repeating the same request leaves the system in the same final state as sending it once. Retries are safe only if idempotency is enforced; otherwise you risk duplicate side effects (e.g., double charges or duplicate records). Idempotency also makes backfills and replays predictable.
- One logical operation should have one effect.
- The same request must return the same outcome.
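To make the two rules above concrete, here is a small sketch contrasting an operation that is naturally idempotent with one that is not. The function names and data shapes are made up for illustration.

```python
# Idempotent: applying the operation again leaves the state unchanged.
def set_status(order: dict, status: str) -> dict:
    order["status"] = status
    return order


# Not idempotent: every repeat changes the state again.
def add_charge(account: dict, amount: int) -> dict:
    account["balance"] += amount
    return account


order = {"status": "new"}
set_status(order, "paid")
set_status(order, "paid")   # a retry: the state is still "paid"

account = {"balance": 0}
add_charge(account, 100)
add_charge(account, 100)    # a retry: the customer is charged twice
```

The second kind of operation is the one that needs an idempotency key before it can be retried.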
When to retry API calls
Retry only when failure is likely transient. Typical retryable cases:
- Connection errors and timeouts (including 408)
- HTTP 429 with Retry-After
- HTTP 500, 502, 503, 504
Do not retry on clear client errors like 400 or 401.
Idempotency keys: making POST and PATCH safe to retry
Clients generate a unique idempotency key per logical operation and send it with the request. The server stores the key and the final response. If the same key arrives again, the server returns the stored response instead of repeating side effects.
- Scope keys to endpoint and tenant.
- Bind each key to a hash of the request body to prevent misuse.
- Keep keys at least as long as the maximum retry window.
Client example (Python):
import uuid, requests

def post_with_idempotency(url, json, session=None, key=None, timeout=10):
    session = session or requests.Session()
    # Pass the same key on every retry of one logical operation;
    # a fresh key is generated only when the caller supplies none.
    key = key or str(uuid.uuid4())
    headers = {"Idempotency-Key": key, "Content-Type": "application/json"}
    resp = session.post(url, json=json, headers=headers, timeout=timeout)
    return resp
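The client example generates a random key when none is passed, which works as long as the caller keeps that key for every retry. If the process might restart between attempts, one option is to derive the key from stable identifiers of the logical operation. The sketch below assumes such identifiers exist; the namespace value and the `idempotency_key_for` helper are hypothetical, not part of any library.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID works.
KEY_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def idempotency_key_for(tenant_id: str, operation: str, business_id: str) -> str:
    """Derive a stable key, so the same logical operation always maps to the same key."""
    name = f"{tenant_id}:{operation}:{business_id}"
    return str(uuid.uuid5(KEY_NAMESPACE, name))


first = idempotency_key_for("t1", "create_order", "cart-42")
retry = idempotency_key_for("t1", "create_order", "cart-42")
other = idempotency_key_for("t1", "create_order", "cart-43")

print(first == retry)  # same operation, same key
print(first == other)  # different operation, different key
```

The trade-off is that a derived key treats any two requests with the same identifiers as the same operation, so the identifiers have to be specific enough to tell a retry apart from a legitimately new request.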
A safe client-side retry strategy
Retry only when the method/endpoint is idempotent or protected by a key. Use exponential backoff with jitter, respect Retry-After, and cap attempts and total time.
Client retry template (Python):
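import time, random, uuid, requests

RETRYABLE_STATUSES = {408, 429, 500, 502, 503, 504}

def should_retry(resp, exc):
    if exc or resp is None:
        return True
    return resp.status_code in RETRYABLE_STATUSES

def parse_retry_after(resp, default_delay):
    # Compare against None: a requests.Response is falsy for 4xx/5xx statuses.
    # Only the delta-seconds form is handled; an HTTP-date falls back to the default.
    ra = resp.headers.get("Retry-After") if resp is not None else None
    if not ra:
        return default_delay
    try:
        return max(float(ra), 0.0)
    except ValueError:
        return default_delay

def call_with_retries(
    url, method="POST", json=None, headers=None, max_retries=3,
    base_delay=0.5, idempotency_key=None, timeout=10
):
    session = requests.Session()
    headers = dict(headers or {})  # copy so the caller's dict is not mutated
    if method.upper() in ("POST", "PATCH"):
        # One key for the whole call, reused on every attempt.
        headers.setdefault("Idempotency-Key", idempotency_key or str(uuid.uuid4()))

    last_exc = None
    for attempt in range(max_retries + 1):
        # Full jitter: a random delay between 0 and the exponential ceiling.
        backoff = random.uniform(0, base_delay * 2 ** attempt)
        try:
            resp = session.request(method, url, json=json, headers=headers, timeout=timeout)
            if should_retry(resp, None) and attempt < max_retries:
                # Assumed completion of the loop: wait (honoring Retry-After), then try again.
                time.sleep(parse_retry_after(resp, backoff))
                continue
            return resp
        except requests.RequestException as exc:
            last_exc = exc
            if attempt < max_retries:
                time.sleep(backoff)
    raise last_exc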
Server-side enforcement of idempotency
On the server, store the key and the final response. Enforce a unique index on the key within the endpoint/tenant scope. Return the first response for duplicate keys. Bind the key to a request hash to prevent replay with different bodies.
Schema example (PostgreSQL):
CREATE TABLE idempotency_keys (
    id SERIAL PRIMARY KEY,
    endpoint TEXT NOT NULL,
    tenant_id TEXT NOT NULL,
    idem_key TEXT NOT NULL,
    request_hash TEXT NOT NULL,
    status_code INT,
    response_body JSONB,
    state TEXT NOT NULL DEFAULT 'in_progress',
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (endpoint, tenant_id, idem_key)
);
FastAPI example (Python):
import hashlib, json
from fastapi import FastAPI, Request, HTTPException
from fastapi.responses import JSONResponse
from pydantic import BaseModel

app = FastAPI()
store = {}  # replace with a real DB

def body_hash(data):
    return hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()

class CreateOrder(BaseModel):
    customer_id: str
    items: list

@app.post("/v1/orders")
async def create_order(req: Request, payload: CreateOrder):
    tenant_id = "t1"  # derive from auth in real code
    idem_key = req.headers.get("Idempotency-Key")
    if not idem_key:
        raise HTTPException(status_code=400, detail="Missing Idempotency-Key")

    endpoint = "/v1/orders"
    rh = body_hash(payload.model_dump())
    key = (endpoint, tenant_id, idem_key)
    row = store.get(key)

    if row:
        if row["request_hash"] != rh:
            raise HTTPException(status_code=409, detail="Idempotency-Key re-use with different payload")
        if row["state"] == "in_progress":
            # The first request has not finished; tell the client to come back.
            raise HTTPException(status_code=409, detail="Request with this key is in progress",
                                headers={"Retry-After": "1"})
        if row["state"] == "done":
            # FastAPI ignores Flask-style (body, status) tuples, so set the status explicitly.
            return JSONResponse(content=row["response"], status_code=row["status"])
        # state == "failed": fall through and run the operation again

    # Insert placeholder (guard concurrent duplicates)
    store[key] = {"request_hash": rh, "state": "in_progress"}
    try:
        # Do the side effect exactly once (use a DB transaction with upserts)
        order_id = "ord_123"  # created in a transaction
        response = {"order_id": order_id, "status": "created"}
        status = 201
        store[key].update({"state": "done", "response": response, "status": status})
        return JSONResponse(content=response, status_code=status)
    except Exception:
        # Mark as failed so client can retry with same key
        store[key].update({"state": "failed"})
        raise
- Insert a placeholder row first to handle concurrent duplicates safely.
- Use a transaction for side effects and for saving the final response.
- If a duplicate key arrives while processing, either ask the client to poll or return 409 with a retry hint.
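The placeholder step only protects against concurrent duplicates if "check for the key" and "insert the placeholder" happen atomically. In a database the unique index provides that; the sketch below imitates it in memory with a lock. `IdempotencyStore`, `claim`, `finish`, and `handle` are hypothetical names for this illustration only.

```python
import threading


class IdempotencyStore:
    """In-memory stand-in for a table with a unique index on the key."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def claim(self, key, request_hash):
        """Atomically insert a placeholder if the key is new. Returns (claimed, row)."""
        with self._lock:
            row = self._rows.get(key)
            if row is not None:
                return False, dict(row)
            row = {"request_hash": request_hash, "state": "in_progress"}
            self._rows[key] = row
            return True, dict(row)

    def finish(self, key, status, response):
        with self._lock:
            self._rows[key].update(state="done", status=status, response=response)


claims = IdempotencyStore()
side_effects = []  # stands in for rows written by the real operation


def handle(key, request_hash):
    claimed, row = claims.claim(key, request_hash)
    if not claimed:
        if row["request_hash"] != request_hash:
            return 409, {"detail": "key reused with a different payload"}
        if row["state"] == "in_progress":
            return 409, {"detail": "in progress, retry later"}
        return row["status"], row["response"]
    side_effects.append(key)  # the side effect runs only for the request that won the claim
    response = {"order_id": "ord_123"}
    claims.finish(key, 201, response)
    return 201, response


key = ("/v1/orders", "t1", "k1")
threads = [threading.Thread(target=handle, args=(key, "h1")) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(side_effects))  # 1
```

Requests that lose the race see either the placeholder (and get a 409 with a retry hint) or the stored result, never a second execution.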
Async work, queues, and at-least-once delivery
Use the outbox pattern. Write the domain change and an outbox record (including the idempotency key or a business key) in the same transaction. A worker reads the outbox and processes messages at least once. Downstream consumers upsert by business key or check a processed set.
- Outbox table: id, topic, payload, idem_key, created_at, processed_at
- Processed table or cache: message_id or business key with TTL
- Workers use upsert/merge to avoid duplicates on replays
Example upsert (PostgreSQL):
INSERT INTO payments (payment_id, amount, currency, status)
VALUES ($1, $2, $3, 'confirmed')
ON CONFLICT (payment_id) DO UPDATE
SET status = EXCLUDED.status;
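The outbox flow described in this section can be sketched end to end with an in-memory SQLite database. This is a minimal illustration under assumptions: the table layouts are simplified, SQLite stands in for the real database, and `create_order`, `consume`, and `run_worker` are hypothetical names.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    topic TEXT NOT NULL,
    payload TEXT NOT NULL,
    idem_key TEXT NOT NULL,
    processed_at TEXT
);
CREATE TABLE processed (idem_key TEXT PRIMARY KEY);
""")


def create_order(order_id, idem_key):
    # Domain change and outbox record commit (or roll back) together.
    with db:
        db.execute("INSERT INTO orders (order_id, status) VALUES (?, 'created')", (order_id,))
        db.execute(
            "INSERT INTO outbox (topic, payload, idem_key) VALUES (?, ?, ?)",
            ("order.created", json.dumps({"order_id": order_id}), idem_key),
        )


delivered = []  # stands in for the downstream side effect


def consume(idem_key, payload):
    # The consumer checks a processed set, so redelivery has no second effect.
    with db:
        cur = db.execute("INSERT OR IGNORE INTO processed (idem_key) VALUES (?)", (idem_key,))
        if cur.rowcount == 1:
            delivered.append(json.loads(payload))


def run_worker(mark_processed=True):
    rows = db.execute(
        "SELECT id, payload, idem_key FROM outbox WHERE processed_at IS NULL ORDER BY id"
    ).fetchall()
    for row_id, payload, idem_key in rows:
        consume(idem_key, payload)
        if mark_processed:
            with db:
                db.execute(
                    "UPDATE outbox SET processed_at = datetime('now') WHERE id = ?", (row_id,)
                )


create_order("ord_1", "key-1")
run_worker(mark_processed=False)  # simulate a crash before the row is marked processed
run_worker()                      # the message is delivered again on restart

print(delivered)  # [{'order_id': 'ord_1'}]
```

The worker delivers the message twice, but the consumer applies it once, which is the at-least-once plus dedup combination the section describes.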
Which HTTP methods are safe to retry
- GET, HEAD, OPTIONS: safe to retry if the server is implemented correctly.
- PUT, DELETE: idempotent by definition in HTTP, so safe to retry as long as the server implements them that way.
- POST: safe to retry only with idempotency keys or if the operation is implemented as idempotent.
- PATCH: not idempotent by default. Use keys if you must retry.
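The list above can be turned into a small client-side guard. This is a sketch of one way to encode it; `is_retry_safe` is a hypothetical helper, and it assumes the server honors HTTP method semantics.

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}
KEY_PROTECTED_METHODS = {"POST", "PATCH"}


def is_retry_safe(method: str, has_idempotency_key: bool = False) -> bool:
    """Return True if a request with this method may be retried automatically."""
    method = method.upper()
    if method in IDEMPOTENT_METHODS:
        return True
    # POST and PATCH are retried only when an idempotency key protects them.
    return method in KEY_PROTECTED_METHODS and has_idempotency_key


print(is_retry_safe("get"))                             # True
print(is_retry_safe("POST"))                            # False
print(is_retry_safe("POST", has_idempotency_key=True))  # True
```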
Backoff and jitter defaults
Use exponential backoff to reduce load, add jitter to avoid synchronized storms, respect Retry-After for 429/503, and cap the total retry budget.
- Max 3–5 retries
- Base delay 0.5–1.0 seconds
- Random jitter of up to ~250 ms per delay, or full jitter (a uniform random delay between 0 and the current backoff)
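As a rough illustration of these defaults, the sketch below computes a capped exponential schedule with full jitter. The `max_delay` cap and the injectable `rng` parameter are assumptions added here so the schedule can be checked deterministically.

```python
import random


def backoff_delays(max_retries=4, base_delay=0.5, max_delay=8.0, rng=random.random):
    """Return one delay per retry: uniform in [0, min(max_delay, base_delay * 2**attempt)]."""
    delays = []
    for attempt in range(max_retries):
        ceiling = min(max_delay, base_delay * 2 ** attempt)
        delays.append(rng() * ceiling)
    return delays


# With rng fixed at 1.0 the result is the ceiling for each attempt.
print(backoff_delays(rng=lambda: 1.0))  # [0.5, 1.0, 2.0, 4.0]

# With the default rng, each delay falls somewhere below its ceiling.
print(backoff_delays())
```

Summing the ceilings gives the worst-case total wait, which is a convenient number to compare against the retry budget for the calling operation.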
Common anti-patterns to avoid
- Blind INSERT for writes. Prefer upsert/replace.
- Generating new IDs for duplicates. Tie IDs to business keys.
- Using timestamps in primary keys for dedup.
- Planning to deduplicate later. Do it at write time.
These mirror data pipeline practices:
- Overwriting a partition is the pipeline equivalent of replacing a scope.
- MERGE is the pipeline equivalent of an upsert that avoids duplicates.
How to test and monitor idempotency and retries
Test:
- Send the same POST with the same key many times. Every call must return the same response, and the side effect must happen exactly once.
- Induce a timeout after the server commits. Client retry must return the first result.
- Simulate concurrent duplicates. Only one side effect should occur.
- Kill workers mid-process. The outbox should recover without duplicates.
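The "timeout after the server commits" case is the easiest to get wrong, so it is worth a dedicated test. The sketch below uses an in-memory fake instead of a real HTTP server; `FakeServer`, `charge_with_retry`, and the test function are hypothetical and only illustrate the shape of such a test.

```python
class FakeServer:
    """Commits the charge, then optionally loses the response to simulate a timeout."""

    def __init__(self):
        self.responses = {}
        self.charges = []
        self.drop_next_response = False

    def post_charge(self, idem_key, amount):
        if idem_key not in self.responses:
            self.charges.append(amount)  # the side effect, committed once per key
            self.responses[idem_key] = (201, {"charge_id": f"ch_{len(self.charges)}"})
        if self.drop_next_response:
            self.drop_next_response = False
            raise TimeoutError("response lost after commit")
        return self.responses[idem_key]


def charge_with_retry(server, idem_key, amount, max_retries=2):
    for attempt in range(max_retries + 1):
        try:
            return server.post_charge(idem_key, amount)
        except TimeoutError:
            if attempt == max_retries:
                raise


def test_retry_after_commit_returns_first_result():
    server = FakeServer()
    server.drop_next_response = True
    status, body = charge_with_retry(server, "key-1", 500)
    assert (status, body) == (201, {"charge_id": "ch_1"})
    assert server.charges == [500]  # charged once despite two attempts


test_retry_after_commit_returns_first_result()
```

The same structure carries over to an integration test: inject the fault between commit and response, retry with the same key, then assert on both the response and the count of side effects.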
Monitor:
- Count of duplicate keys served from cache
- Retry rates by status code
- Time to first success per operation
- DLQ volume for async pipelines
- Data freshness for async results
FAQ
How long should I keep idempotency keys?
Keep keys for at least the maximum retry window you allow—often a few hours to a few days. Align with business SLAs and storage budget.
Should I allow idempotency keys across endpoints?
No. Scope keys to endpoint and tenant, and store a request hash. Reject reuse with a different payload.
What if the client times out but the server finished?
The client retries with the same key. The server returns the stored result without repeating the side effect.
Do I need idempotency for GET requests?
GET is safe to retry by definition. Add ETags and caching for efficiency. Idempotency keys are not needed for GET.
How do I handle rate limits during retries?
Honor Retry-After, increase backoff, use a retry budget, and consider a circuit breaker to pause when limits are hit.
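A retry budget can be sketched as a token bucket: each retry spends a token, and tokens refill over time. This is one common way to implement the idea, not the only one; `RetryBudget` and its parameters are hypothetical, and the clock is injectable so the behavior can be tested without sleeping.

```python
import time


class RetryBudget:
    """Token bucket that limits how many retries a client may issue over time."""

    def __init__(self, capacity=10, refill_per_second=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow_retry(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


budget = RetryBudget(capacity=2, refill_per_second=1.0)
print(budget.allow_retry())  # True
print(budget.allow_retry())  # True
```

When `allow_retry()` returns False, the caller gives up on the retry and surfaces the error, which keeps a struggling upstream from being hit by retries on top of regular traffic.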
Where this fits with agent orchestration
If you orchestrate multi-step automations or long-running jobs with coding agents, apply the same API idempotency and retries patterns at workflow boundaries and external calls. Breyta is a workflow and agent orchestration platform for coding agents that helps teams build, run, and publish reliable workflows with deterministic execution, clear run history, versioned flow definitions, approvals, waits, and an agent-first CLI. These operational controls complement the idempotency and retry techniques in this guide.