Case study · 03/2022 — 05/2022

WakeFlow

WakeFlow was a focused experiment in workflow automation: define a sequence of steps, hang triggers off them, and let the system run the rest. The interesting part was not the visual editor but the scheduler underneath, the thing that actually has to fire jobs at the right time, in the right order, and survive a process restart without dropping work on the floor.

Workflow Orchestration · Durable State Machines · Automation

Problem

Workflow engines look simple from the outside ("do A, then B, then C") and become unpleasant the moment you ask the obvious questions. What happens if step B is in flight when the worker dies? What happens if the schedule for step C drifts? What happens if two replicas pick up the same trigger? The project was an excuse to confront those questions head-on instead of papering over them with a queueing service and crossed fingers.

Approach

Workflow runs were modelled as durable state machines, with each step writing its outcome before signalling the next. Triggers were treated as first-class records rather than ephemeral messages, so a restart could replay them deterministically. The scheduler used a leasing model to make sure two replicas could not advance the same run, and idempotency was required at every step boundary. Event-driven triggers and time-based triggers shared the same code path, which kept the surface area small.
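To make the leasing and idempotency ideas concrete, here is a minimal TypeScript sketch. It is not WakeFlow's actual code: `RunStore`, `tryAcquireLease`, `recordOutcome`, and `advance` are hypothetical names, and an in-memory `Map` stands in for what would be a durable store such as PostgreSQL. It only illustrates the shape of the technique: a worker must hold an unexpired lease to advance a run, and recording a step outcome a second time is a no-op.

```typescript
// Illustrative sketch only. All names are hypothetical; the Map below stands in
// for a durable table, so nothing here actually survives a process restart.

interface StepOutcome {
  step: string;
  result: string;
}

interface Run {
  id: string;
  steps: string[];          // ordered step names
  outcomes: StepOutcome[];  // one entry per completed step, in order
  leaseOwner: string | null;
  leaseExpiresAt: number;   // epoch milliseconds
}

type AdvanceResult = "advanced" | "done" | "not-leased";

class RunStore {
  private runs = new Map<string, Run>();

  create(id: string, steps: string[]): Run {
    const run: Run = { id, steps, outcomes: [], leaseOwner: null, leaseExpiresAt: 0 };
    this.runs.set(id, run);
    return run;
  }

  get(id: string): Run {
    const run = this.runs.get(id);
    if (!run) throw new Error(`unknown run: ${id}`);
    return run;
  }

  // Grant the lease only if it is free, expired, or already held by this worker.
  tryAcquireLease(runId: string, worker: string, now: number, ttlMs: number): boolean {
    const run = this.get(runId);
    const available =
      run.leaseOwner === null || run.leaseExpiresAt <= now || run.leaseOwner === worker;
    if (!available) return false;
    run.leaseOwner = worker;
    run.leaseExpiresAt = now + ttlMs;
    return true;
  }

  // Record an outcome under a valid lease. Re-recording a finished step is a no-op.
  recordOutcome(runId: string, worker: string, now: number, index: number, result: string): boolean {
    const run = this.get(runId);
    if (run.leaseOwner !== worker || run.leaseExpiresAt <= now) return false; // lease lost
    if (index < run.outcomes.length) return true;    // already recorded, nothing to do
    if (index !== run.outcomes.length) return false; // would skip a step
    const step = run.steps[index];
    if (step === undefined) return false;            // no such step
    run.outcomes.push({ step, result });
    return true;
  }
}

// Advance a run by at most one step. `execute` may run again for the same step
// if a worker dies before the outcome is recorded, which is why steps must be idempotent.
function advance(
  store: RunStore,
  runId: string,
  worker: string,
  now: number,
  ttlMs: number,
  execute: (step: string) => string,
): AdvanceResult {
  if (!store.tryAcquireLease(runId, worker, now, ttlMs)) return "not-leased";
  const run = store.get(runId);
  const index = run.outcomes.length;
  const step = run.steps[index];
  if (step === undefined) return "done";
  const result = execute(step);
  return store.recordOutcome(runId, worker, now, index, result) ? "advanced" : "not-leased";
}
```

In this sketch a second worker calling `advance` while the first worker's lease is still live gets `"not-leased"` and does nothing; once the lease expires, it can pick the run up from the last recorded outcome. Passing `now` explicitly rather than reading the clock inside the store is a choice made here to keep the example deterministic.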

Outcome

WakeFlow ended up being more useful as a study than as a product. The mental model (durable state machine, leased advancement, idempotent steps) has carried into the way I approach any long-running orchestration since, including ETL pipelines and the scheduler primitives behind the bigger systems I have shipped after it.

Stack

Node.js · TypeScript · PostgreSQL · Durable state machines · Leased schedulers · Event-driven triggers
