WDK Idempotency, Errors & Retries
Durable execution removes the "did the request actually go through?" anxiety, but only if your steps are idempotent. The single most common WDK / Temporal / Step Functions bug is a non-idempotent step that sends two emails on retry. Treat steps like HTTP PUTs: design them to be safely repeatable, key your side-effecting external calls with a deterministic id, and the rest of the model takes care of itself.
What It Defines
Step functions are retried automatically on error (default: up to 3 attempts) and their results are cached in the event log on success. That makes idempotency the developer's responsibility: a step that charges a card, sends an email, or calls a non-idempotent third-party API needs an idempotency key (often the workflow run id or a derived hash) so a retry doesn't double-charge. WDK distinguishes terminal errors (don't retry) from transient ones (do retry), exposes per-step retry policies, and surfaces the workflow-wide error in the dashboard. Calling a step *outside* a workflow strips both retries and observability: the step degrades to a plain async call.
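To make the retry-plus-idempotency-key idea concrete, here is a minimal, self-contained simulation. It is a sketch, not the WDK API: `TerminalError`, `FakePaymentProvider`, `runStep`, and the `run_123` id are all hypothetical names, and the retry loop is synchronous only to keep the example short.

```typescript
// Hypothetical names throughout; none of this is the WDK API.
class TerminalError extends Error {}

// Stand-in for a payment provider that deduplicates on an idempotency key.
class FakePaymentProvider {
  private readonly seen = new Map<string, string>();
  chargeCount = 0;

  charge(idempotencyKey: string, amountCents: number): string {
    const existing = this.seen.get(idempotencyKey);
    if (existing !== undefined) return existing; // repeated key: no second charge
    this.chargeCount += 1;
    const chargeId = `ch_${this.chargeCount}`;
    this.seen.set(idempotencyKey, chargeId);
    return chargeId;
  }
}

// Retry a step up to maxAttempts times; a TerminalError aborts immediately.
function runStep<T>(step: (attempt: number) => T, maxAttempts = 3): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return step(attempt);
    } catch (err) {
      if (err instanceof TerminalError) throw err; // retrying won't help
      lastError = err; // treated as transient: try again
    }
  }
  throw lastError;
}

const provider = new FakePaymentProvider();
const runId = "run_123"; // hypothetical workflow run id

const chargeId = runStep((attempt) => {
  // The key is derived from the run id and the step name, so every retry
  // of this step sends the same key.
  const id = provider.charge(`${runId}:charge-card`, 4200);
  // Simulate a crash after the provider accepted the charge but before
  // the step could report success.
  if (attempt === 1) throw new Error("connection reset");
  return id;
});

console.log(chargeId, provider.chargeCount);
```

The first attempt charges the card and then fails; the second attempt sends the same key, so the provider returns the original charge instead of creating a new one. Throwing a `TerminalError` from the step (say, for a declined card) would stop after a single attempt.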
Canonical (Normative)
Vercel is a frontend cloud platform and the steward of the open-source Workflow DevKit (Apache-2.0). Vercel publishes the WDK spec and reference implementation ("use workflow" / "use step" directives, durable timers, hooks, streaming) and runs the integrated workflow runtime on its platform. WDK can also be self-hosted on Docker, AWS, or DigitalOcean. Vercel also publishes the AI SDK and the Vercel Functions runtime spec.
Related Specs
WDK is the durable-execution model arriving inside the JavaScript ecosystem proper: instead of writing a state machine, you write `async` code with two extra string directives and you get crash-safe, resumable, retry-aware, observable workflows. If you're building AI agents, multi-step background jobs, human-in-the-loop flows, or anything that previously needed BullMQ + a state machine + careful idempotency, WDK collapses that into a single mental model. Even if you don't pick WDK specifically, the directives + replay model are the durable-execution pattern Temporal pioneered and Cloudflare/Restate/DBOS now ship — knowing one teaches you the rest.
If you remember one thing about durable execution, remember this: workflow code runs many times and the event log decides what's real. That single fact explains every WDK rule — why steps must be idempotent, why you can't read `Date.now()` directly in a workflow, why parameters are passed by value, why mutating an object inside a step doesn't propagate, and why bundlers occasionally bite you. Internalize the replay model and the rest of WDK (and Temporal, and Cloudflare Workflows) becomes obvious.
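The replay model fits in a few lines. The sketch below is a toy, not the WDK runtime: `makeStepRunner`, `readClock`, and the position-indexed log are assumptions made for illustration, and `readClock` stands in for `Date.now()` as a value that changes on every call.

```typescript
// Toy event log: one recorded result per step, indexed by call order.
type EventLog = unknown[];

// Builds a `step` function bound to a log. A step whose result is already
// recorded returns the recorded value without running its body again.
function makeStepRunner(log: EventLog) {
  let cursor = 0;
  return function step<T>(fn: () => T): T {
    const index = cursor++;
    if (index < log.length) return log[index] as T; // replayed from the log
    const result = fn(); // first execution: run, then record
    log.push(result);
    return result;
  };
}

// Stand-in for Date.now(): returns a different value on every call.
let clock = 1000;
const readClock = (): number => clock++;

// The workflow body may run many times; only steps touch the outside world.
function workflow(log: EventLog): string {
  const step = makeStepRunner(log);
  const startedAt = step(readClock);
  const orderId = step(() => `order-${startedAt}`);
  return `${orderId} started at ${startedAt}`;
}

const log: EventLog = [];
const firstRun = workflow(log);
const replayedRun = workflow(log); // e.g. after a crash and restart

console.log(firstRun === replayedRun, clock, log.length);
```

Both runs produce the same string because the second run never calls `readClock`; it reads the recorded value instead. Reading the clock directly in the workflow body would give a different answer on each replay, which is exactly why non-deterministic reads have to go through a step.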
Most "my workflow exploded after a deploy" reports come down to something non-serializable hiding in a step argument or return value. Treat the boundary like an API: pass plain data, return plain data, do all I/O and mutation inside steps, and never assume an object reference survives. Same discipline applies to Temporal activities and Step Functions tasks — it's a property of event-sourced replay, not a WDK quirk.
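A small sketch of the by-value boundary, assuming a JSON round trip as a stand-in for whatever serializer the runtime actually uses (the real one may support more types). `crossBoundary`, `callStep`, and the `Order` shape are hypothetical.

```typescript
// Stand-in for the step boundary: values cross as serialized copies.
// Assumption: JSON is used here only for illustration.
function crossBoundary<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Calls a step the way a runtime would: copy the argument in, copy the result out.
function callStep<A, R>(fn: (arg: A) => R, arg: A): R {
  return crossBoundary(fn(crossBoundary(arg)));
}

interface Order {
  id: string;
  items: string[];
}

// Anti-pattern: mutates its argument and expects the caller to see the change.
function addItemInPlace(order: Order): number {
  order.items.push("gift-wrap");
  return order.items.length;
}

// Preferred: take plain data, return new plain data.
function addItem(order: Order): Order {
  return { ...order, items: [...order.items, "gift-wrap"] };
}

const order: Order = { id: "o1", items: ["book"] };

const countSeenInsideStep = callStep(addItemInPlace, order);
const countSeenByCaller = order.items.length; // the caller's object is untouched

const updated = callStep(addItem, order);

// Class instances lose their prototype (and methods) when copied by value.
class Money {
  constructor(public cents: number) {}
  format(): string {
    return `$${(this.cents / 100).toFixed(2)}`;
  }
}
const copiedMoney = crossBoundary(new Money(500));
const stillAMoney = copiedMoney instanceof Money;

console.log(countSeenInsideStep, countSeenByCaller, updated.items, stillAMoney);
```

The step sees two items while the caller still sees one, so the only reliable way to get a change out of a step is to return it. The `Money` copy keeps its `cents` field but is no longer a `Money`, which is the kind of surprise that tends to show up only after a restart.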
This is the core contract of every web API, browser request, and server response. You can't design or debug HTTP without knowing this.