How It Works
Durable Workflow uses Laravel's queued jobs and event-sourced persistence to create durable coroutines. Workflows suspend through Fiber-backed helper calls, which gives every workflow a durable replay contract.
Runtime
The runtime uses the same broad Laravel primitives, but its execution model is more explicit:
- `WorkflowStub::make()` reserves a durable workflow instance id
- caller-supplied public instance ids are validated up front as non-empty URL-safe strings up to 191 characters, so blank, overlong, or unsupported-character ids fail before the runtime tries to reserve or reuse anything in storage
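The id rules above can be sketched as a small validator. This is an illustrative stand-in, not the package's actual implementation, and the helper name `isValidInstanceId` is invented here:

```php
<?php

// Illustrative stand-in for the up-front instance-id check described above:
// a caller-supplied public instance id must be a non-empty URL-safe string
// of at most 191 characters. The function name is hypothetical.
function isValidInstanceId(string $id): bool
{
    if ($id === '' || strlen($id) > 191) {
        return false;
    }

    // "URL-safe" is read here as unreserved URL characters: letters, digits,
    // '-', '.', '_', and '~'.
    return preg_match('/\A[A-Za-z0-9\-._~]+\z/', $id) === 1;
}

var_dump(isValidInstanceId('order-2024_06.retry~1')); // bool(true)
var_dump(isValidInstanceId(''));                      // bool(false)
var_dump(isValidInstanceId(str_repeat('a', 192)));    // bool(false)
var_dump(isValidInstanceId('has spaces'));            // bool(false)
```

Rejecting an id before any storage round-trip is what lets blank or overlong ids fail without reserving anything durable.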
- accepting a start creates a distinct run id, a durable start command record, and the first workflow task in one transaction
- `signalWithStart()` and the matching webhook route link one durable start command plus one durable signal command under a shared intake-group marker, and the runtime records the accepted signal before the first workflow task can run user code
- starts can attach a searchable
`business_key`, exact-match string `visibility_labels`, and returned-only `memo` metadata; the runtime stores `business_key` and `visibility_labels` on the workflow instance, run, and run-summary projection, records them on typed start history, carries them into `continueAsNew()` runs, and exposes them through selected-run detail, history export, Waterline list filters, and Waterline saved operational views. `memo` is also recorded on the workflow instance, run, typed start history, selected-run detail, history export, and later `continueAsNew()` runs, but it intentionally stays out of run-summary filters and saved-view matching. That same run-summary visibility contract now also carries `repair_blocked_reason`, the durable boolean `repair_attention`, and the durable `task_problem` flag, so fleet filters and saved views can isolate badge-visible repair blockers such as `unsupported_history` or `waiting_for_compatible_worker`, keep `repair_not_needed` out of those views, and still isolate broader replay, missing-task, or workflow-task transport problems without opening every selected run first
- each accepted or rejected run-scoped command also gets a durable
`command_sequence` inside that run, so command history and signal application do not depend on `created_at` ties
- older runs that predate
`command_sequence` are backfilled into that same per-run order during migration and again on later command intake, so new signals, updates, and operator commands cannot leapfrog legacy rows that were recorded before the sequence column existed
- durable command records now also capture ingress metadata such as command source, caller label, auth outcome, request route, request fingerprint, and the accepted payload itself; compound start-time intake such as
`signalWithStart()` also stamps `context.intake.mode = signal_with_start` plus a shared `context.intake.group_id` onto the linked start and signal commands so history export and Waterline can correlate the pair without scraping raw request bodies, and workflow-originated child or continue-as-new starts now record their parent run and workflow step in command context, with `child_call_id` attached when that run belongs to a parent-issued child invocation
- stable workflow and activity type keys can come from
`#[Type(...)]` attributes or `workflows.v2.types` config registration; the service provider validates at boot that no class is registered under multiple type keys and that config keys agree with any `#[Type]` attribute on the mapped class, so duplicate or conflicting type identities fail fast instead of silently producing ambiguous durable records
- stable external signal names are declared explicitly with repeatable
`#[Signal('...')]` workflow attributes, so signal ingress can reject typos before they become durable accepted commands
- accepted and rejected signals also mint one first-class durable
`workflow_signal_records` lifecycle row linked back to the signal command; selected-run detail exposes those rows through `signals[*]` while `commands[*].signal_id`, `commands[*].signal_status`, and `commands[*].signal_wait_id` remain the command-list compatibility bridge. Final v2 writes those lifecycle rows on the command path.
- those
`#[Signal(...)]` declarations may also include an ordered parameter contract, and the runtime snapshots `workflow_definition_fingerprint`, `declared_queries`, `declared_query_contracts`, `declared_signals`, `declared_signal_contracts`, `declared_updates`, and `declared_update_contracts` onto typed `WorkflowStarted` history so later webhook, PHP, and Waterline intake can validate named or positional arguments, declared scalar or object `type`, and `allows_null` rules from durable run metadata instead of only from a live class; selected-run detail also exposes normalized `declared_query_targets`, `declared_signal_targets`, and `declared_update_targets` arrays so operator clients can keep every declared target visible while still attaching parameter metadata when a durable contract exists, and those normalized arrays stay present even when the contract source is `unavailable`
- final v2 treats the
`WorkflowStarted` command-contract snapshot as the only authoritative source for declared query, signal, update, and entrypoint metadata. Selected-run detail and history export report `declared_contract_source = durable_history` when that complete snapshot is present, and `declared_contract_source = unavailable` with empty normalized target arrays when it is missing or incomplete. The clean-slate engine does not reflect a live class to rebuild missing command contracts and does not expose command-contract normalization pressure in fleet metrics or health checks
- the current workflow task replays the selected run and applies one bounded unit of work
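The replay-then-advance loop in that last bullet can be sketched with a toy in-memory history standing in for the engine's event store; every name below is illustrative, and the real engine replays typed history through Fiber-backed helpers rather than a plain array:

```php
<?php

// Toy model of "replay the selected run, then apply one bounded unit of work".
function runOneWorkflowTask(array $history, array $pendingResults): array
{
    $state = ['sum' => 0];

    // 1. Replay: rebuild workflow state from committed history only.
    foreach ($history as $event) {
        if ($event['type'] === 'ActivityCompleted') {
            $state['sum'] += $event['result'];
        }
    }

    // 2. Advance: apply exactly one bounded unit of new work, then stop.
    if ($pendingResults !== []) {
        $next = array_shift($pendingResults);
        $history[] = ['type' => 'ActivityCompleted', 'result' => $next];
        $state['sum'] += $next;
    }

    return [$history, $state];
}

$history = [['type' => 'ActivityCompleted', 'result' => 10]];
[$history, $state] = runOneWorkflowTask($history, [5, 7]); // applies only 5
// $state['sum'] === 15; the 7 waits for the next workflow task
```

Bounding each task to one unit of work is what keeps progress durable: every advance is committed to history before the next task can build on it.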
- named workflows are straight-line only and suspend through Fiber-backed helpers such as
`activity()`, `await()`, `timer()`, `sideEffect()`, `getVersion()`, and `all([...])` without writing `yield` in the workflow body; `await('signal-name')` is the workflow-code helper for one named signal value, closures such as `fn () => activity(...)` and `fn () => child(...)` feed barrier topology into `all([...])`, and `async(...)` callbacks use that same straight-line-only helper contract
- query methods marked with
`#[QueryMethod]` replay committed history for the current or selected run without applying pending signals implicitly, can declare a stable public target name through `#[QueryMethod('public-name')]`, and now snapshot their ordered parameter contract for selected-run detail plus Waterline's read-only query operator; selected-run detail also exposes `can_query` and `query_blocked_reason` so operator clients can tell when durable query targets exist but the workflow definition is not currently replayable
- `getVersion()` records one typed `VersionMarkerRecorded` history event per workflow step, and workflow or query replay reuses that committed version marker instead of branching from live code alone; when a run reaches a newly introduced branch point with no marker yet, replay now checks the start-time `workflow_definition_fingerprint` from `WorkflowStarted` before it falls back to the compatibility marker, so same-compatibility runs that started before the branch was deployed can stay on `WorkflowStub::DEFAULT_VERSION` without synthesizing a new marker or consuming a new workflow step
- selected-run replay-safety diagnostics are fingerprint-scoped: when the current loadable workflow class no longer matches the run's snapped
`workflow_definition_fingerprint`, Waterline and run detail report `workflow_determinism_source = definition_drift` instead of pretending that today's source scan is authoritative for that older run
- update methods marked with
`#[UpdateMethod]` replay committed history, record typed `UpdateAccepted` when the command is accepted, apply under the run lock on the workflow worker, append typed `UpdateApplied` and `UpdateCompleted` entries when they run, and append typed `UpdateRejected` history when a targeted run rejects the command before application; callers can wait for completion with `attemptUpdate*` or submit accepted-only work with `submitUpdate*` / webhook or Waterline `wait_for = accepted`, and both paths record the accepted lifecycle first and use the durable workflow task path for application instead of executing the update body directly inside `WorkflowStub`, the webhook handler, or Waterline's controller. Write-side webhook and Waterline update requests accept only `wait_for = accepted` or `wait_for = completed` (or omit the field to keep the `completed` default); lookup responses reserve `wait_for = status` for `inspectUpdate()` and the update-status endpoints. Completion-waiting callers wait only up to the configured `workflows.v2.update_wait.completion_timeout_seconds` budget or an explicit per-call override, then fall back to the still-open accepted lifecycle with `wait_for`, `wait_timed_out`, and `wait_timeout_seconds` response metadata instead of blocking indefinitely; the engine also mints one first-class durable `workflow_updates` row per update lifecycle, gives it its own `update_id`, and exposes that row through webhook responses, selected-run detail, Waterline's dedicated Updates table, and the selected-run wait surface instead of making operators infer everything back out of generic command rows alone. While that lifecycle stays accepted, selected-run summaries project `wait_kind = update`, `open_wait_id = update:{update_id}`, and `resume_source_kind = workflow_update`; if the backing workflow task disappears, repair now keeps pointing at the accepted update instead of falling back to the older underlying signal, child, or timer wait.
When you declare `#[UpdateMethod('public-name')]`, that durable alias becomes the canonical update target in command history, webhook routes, and Waterline instead of the PHP method name, and the engine also snapshots each declared parameter contract so named or positional update intake can reject `rejected_invalid_arguments` with durable `validation_errors` for missing arguments, unknown arguments, type mismatches, or nullability violations before the update body runs, even when the current worker can only recover that contract from `WorkflowStarted` history rather than from a loadable live class; when a selected run still durably declares the target but the workflow definition cannot be replayed, the update rejects as `rejected_workflow_definition_unavailable` with `rejection_reason = workflow_definition_unavailable`; Waterline drives selected-run Signal and Update operator forms from the normalized `declared_signal_targets` and `declared_update_targets` detail arrays, with the older `declared_signals`, `declared_signal_contracts`, `declared_updates`, and `declared_update_contracts` fields retained as compatibility metadata
- `await()` projects typed condition waits and named signal waits. For condition waits, replay advances only from committed `ConditionWaitSatisfied` or `ConditionWaitTimedOut` history instead of re-evaluating the predicate speculatively during queries; `await()` accepts an optional stable condition key, persists it in typed condition and timeout-timer history, exposes it to Waterline as `condition_key`, and validates the recorded key during worker and query replay before treating the current condition wait as the same durable step. For named signals, `await('name')` waits for a durable signal payload and `await('name', timeout: ...)` returns `null` when the timeout wins. Adding a key to an already-recorded unkeyed condition wait is also treated as replay drift, because the old history did not durably name that predicate.
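The condition-key validation rule can be sketched as a single comparison; the function name and return convention below are invented for illustration, not taken from the package:

```php
<?php

// Illustrative check for the condition-key rule above: replay treats the
// current await() call as the same durable step only when the recorded
// condition key matches, and adding a key to an unkeyed recorded wait is
// replay drift.
function conditionWaitMatches(?string $recordedKey, ?string $currentKey): bool
{
    if ($recordedKey === null && $currentKey !== null) {
        return false; // old history never durably named this predicate
    }

    if ($recordedKey !== null && $recordedKey !== $currentKey) {
        return false; // a different durable predicate was recorded here
    }

    return true;
}

var_dump(conditionWaitMatches('invoice-paid', 'invoice-paid'));  // bool(true)
var_dump(conditionWaitMatches(null, null));                      // bool(true)
var_dump(conditionWaitMatches(null, 'invoice-paid'));            // bool(false)
var_dump(conditionWaitMatches('invoice-paid', 'order-shipped')); // bool(false)
```

A mismatch here is exactly the kind of drift the next paragraph describes: replay blocks rather than guessing that today's predicate is the one history recorded.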
Replay also validates that the same workflow sequence did not already record a different typed step shape before appending or consuming step history. The shape guard covers activity, child-workflow, pure timer, signal-wait, side-effect, version-marker, continue-as-new, and `all([...])` leaf sequences, so a current build cannot schedule an activity over committed timer or child history at the same sequence. For `all([...])`, worker and query replay compare recorded parallel group topology against the current barrier, including arity and nested group path. Typed activity or child leaf history from an `all([...])` step must carry that group metadata; older leaf events that lack it block replay as `history_shape_mismatch` instead of being guessed into the current barrier shape. Worker-side condition-key, predicate-fingerprint, history-shape, or parallel-topology mismatch blocks replay with `liveness_state = workflow_replay_blocked` and `tasks[*].transport_state = replay_blocked` instead of committing `WorkflowFailed`; after a compatible build is deployed, an operator can repair the run to retry the task.
- timeout-backed condition waits keep the wait itself as
`wait_kind = condition` while using a normal durable timer row and timer task as the timeout transport; when a durable update flips the predicate before the timer fires, the runtime now republishes an existing ready workflow task or creates one if none is open so the worker can re-evaluate the wait, then cancels the stale timeout timer
- selected-run summaries and
`workflow_run_waits` projection rows also rebuild timeout-backed condition-wait `deadline_at`, `resume_source_kind`, and `resume_source_id` from typed `ConditionWait*` plus `TimerScheduled` timeout transport history, so Waterline keeps the original blocked-on deadline and timeout identity even if the live `workflow_timers` row later drifts or disappears; an unrelated open workflow task row no longer replaces that typed condition wait as the selected wait. Once timeout `TimerScheduled` history exists, the worker also requires matching `TimerFired` history before applying `ConditionWaitTimedOut`, so a drifted mutable timer row cannot make the timeout win by itself. After the timeout transport records `TimerFired`, the run waits for a workflow task to apply `ConditionWaitTimedOut`, and repair recreates that workflow task from typed history if the timer row or resume task disappears first
- pure timers stay blocked from typed
`TimerScheduled` history until the run commits a matching `TimerFired`, and selected-run summaries, waits, tasks, timer lists, and history exports rebuild timer identity and deadline metadata from that typed timer history before falling back to mutable side tables for non-terminal older rows. On those timer rows, `status` is always the authoritative selected-run timer state, `source_status` is the status value reported by that authority, and `row_status` is only the current mutable `workflow_timers.status` diagnostic when a timer row still exists. That means typed history never yields to a drifted mutable timer row: if the durable history still says `pending` while the row later says `fired`, the selected-run timer stays `status = pending` and `source_status = pending`, and only `row_status = fired` changes. A fired or otherwise terminal `workflow_timers` row with no typed timer history is treated as replay drift instead of a timer result; Waterline-facing detail marks that fallback as `status = unsupported`, `history_authority = unsupported_terminal_without_history`, `history_unsupported_reason = terminal_timer_row_without_typed_history`, keeps the mutable terminal state only as `row_status`, and omits `resume_source_kind` / `resume_source_id` because that row is diagnostic-only rather than a durable resume path.
- completed activity outcomes replay from typed
`ActivityCompleted` and `ActivityFailed` history, while activity cancellation is recorded as typed `ActivityCancelled` history when cancel or terminate closes an in-flight execution or an activity worker observes that stop through the bridge. `ActivityCancelled` is a terminal activity fact for worker and query replay, so it wins over earlier open activity history for that workflow step instead of leaving replay parked on the stale `ActivityScheduled` or `ActivityStarted` event. When a step already has typed open activity history such as `ActivityScheduled`, `ActivityStarted`, `ActivityHeartbeatRecorded`, or `ActivityRetryScheduled`, workflow replay and query replay stay blocked until a matching terminal activity event is committed instead of accepting a drifted terminal `activity_executions` row. A completed, failed, or cancelled `activity_executions` row with no typed activity history is explicitly unsupported for replay and blocks as `history_shape_mismatch` with recorded events `no typed history`; selected-run activity and wait projections mark that fallback as `status = unsupported`, `history_authority = unsupported_terminal_without_history`, `history_unsupported_reason = terminal_activity_row_without_typed_history`, keep the mutable status only as `row_status`, and omit `resume_source_kind` / `resume_source_id` because that row is diagnostic-only rather than a durable resume path. Non-terminal older rows may still be used only to keep an open wait visible.
- selected-run activity detail, compatibility
`logs`, activity-backed `chartData`, open activity waits, task labels, and run-summary liveness rebuild from typed `ActivityScheduled`, `ActivityStarted`, `ActivityHeartbeatRecorded`, `ActivityRetryScheduled`, `ActivityCompleted`, `ActivityFailed`, and `ActivityCancelled` snapshots first, so Waterline keeps the original activity class, arguments, running-vs-completed-or-cancelled status, retry-policy snapshot, idempotency key, attempt count, latest attempt id, per-attempt task id, worker, heartbeat, lease, close timestamps, bounded heartbeat progress, and heartbeat or cancellation timeline points even if the mutable `activity_executions`, `activity_attempts`, or task rows later drift or disappear. `ActivityHeartbeatRecorded` accepts one normalized operator-facing `progress` object with optional `message`, `current`, `total`, `unit`, and flat `details`, and selected-run detail plus history export expose that same snapshot back as `last_heartbeat_progress` on the activity and attempt views. Grouped activity barrier identity also comes from typed history: final v2 records `parallel_group_path` on the typed activity events and does not infer missing grouped metadata from mutable activity rows. If grouped typed history lacks that metadata, replay, query, export, and Waterline projection report `history_shape_mismatch` rather than manufacturing a barrier identity. For event-backed activities, mutable execution results and activity close timestamps are not used unless the typed history has a terminal activity event; unsupported row-only terminal fallbacks also suppress the mutable result and close timestamp instead of presenting them as durable output, and run-summary liveness reports `workflow_replay_blocked` when that unsupported fallback is the selected run's only apparent progress source.
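A normalizer for the heartbeat `progress` object described above might look like the following sketch; the field names come from this section, but the function itself and its exact coercion rules are assumptions, not package code:

```php
<?php

// Illustrative normalizer for the operator-facing heartbeat progress object:
// one snapshot with optional message, current, total, unit, and flat details.
function normalizeHeartbeatProgress(array $input): array
{
    $progress = [];

    if (isset($input['message']) && is_string($input['message'])) {
        $progress['message'] = $input['message'];
    }
    foreach (['current', 'total'] as $numeric) {
        if (isset($input[$numeric]) && is_numeric($input[$numeric])) {
            $progress[$numeric] = $input[$numeric] + 0; // coerce to int/float
        }
    }
    if (isset($input['unit']) && is_string($input['unit'])) {
        $progress['unit'] = $input['unit'];
    }
    if (isset($input['details']) && is_array($input['details'])) {
        // "flat details" read as: keep only scalar detail values.
        $progress['details'] = array_filter($input['details'], 'is_scalar');
    }

    return $progress;
}

print_r(normalizeHeartbeatProgress([
    'message' => 'copying rows',
    'current' => '250',
    'total'   => 1000,
    'unit'    => 'rows',
    'details' => ['table' => 'orders', 'nested' => ['skip' => true]],
]));
```

Normalizing at intake is what lets the same snapshot round-trip unchanged into `last_heartbeat_progress` on the activity and attempt views.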
The activity rows themselves mark older mutable fallback evidence with `diagnostic_only = true`, and the legacy `logs[*]` / `chartData[*]` compatibility arrays echo that same `history_authority`, `history_unsupported_reason`, and `diagnostic_only` metadata instead of flattening older-row evidence into an apparently authoritative activity result. When a pending activity loses both its mutable execution row and activity task before it starts, repair restores the execution row from `ActivityScheduled` history before recreating the durable activity task
- selected-run timeline entries carry stable event identity fields such as
`entry_kind`, `source_kind`, and `source_id`, and the entry's primary command, task, activity, timer, child, and failure state comes from the recorded event snapshot first instead of inheriting whatever those mutable side rows say later
- recorded version markers appear in that same selected-run timeline as typed
`VersionMarkerRecorded` points with the durable `change_id`, selected `version`, and supported range, so Waterline can explain which branch a long-lived run is following without inventing compatibility-era markers that were never durably committed
- selected-run
`exception_count`, compatibility `exceptions[*]`, update lifecycle failure fields, timeline failure metadata, and history-export `failures[*]` also rebuild from typed `ActivityFailed`, parent-side `ChildRunFailed`, `WorkflowFailed`, failed `UpdateCompleted`, and `FailureHandled` history first, so Waterline and exported bundles keep failure ids plus durable exception type aliases, message, file, line, trace, declared custom-property detail, update or child source metadata, handled disposition, and stable multi-failure ordering even if mutable update, command, or failure rows later drift or disappear. When a selected run can only recover a failure from an older mutable failure row, the detail/export payload now marks that row as `history_authority = failure_row_fallback` with `diagnostic_only = true` instead of presenting it as indistinguishable typed failure history. Failed update detail is keyed by the durable `update_id` in typed history; command ids are preserved when available, but they are not required to show the update failure lifecycle.
- when typed failure payloads are present, replay restores the original activity exception class and custom properties instead of flattening handled failures into a generic
`RuntimeException`; if the payload carries `type` and that alias is registered under `workflows.v2.types.exceptions`, replay resolves the alias before falling back to the recorded PHP class. Imported v1 failures with no durable `type` can be bridged through `workflows.v2.types.exception_class_aliases`; new v2 failures should use stable aliases before throwable classes move. Operator views report the resolved class plus whether resolution came from `exception_type`, `class_alias`, `recorded_class`, `unresolved`, `misconfigured`, or `unrestorable`. Unresolved mappings, invalid configured aliases, and loadable classes that cannot be safely restored now block replay with `UnresolvedWorkflowFailureException`, `exception_replay_blocked = true`, and a `replay_blocked` workflow task instead of being delivered to broad workflow catch blocks as a generic runtime exception
- if an earlier accepted signal is still waiting to be applied on the current run, later updates reject as
`rejected_pending_signal` with `rejection_reason = earlier_signal_pending` instead of running the workflow task inline on the caller path; let the queued workflow task apply the signal first, then retry the update against the advanced state
- when a signal payload violates the declared durable signal contract, the runtime now rejects it as
`rejected_invalid_arguments` with `rejection_reason = invalid_signal_arguments` and durable `validation_errors` before the signal command is accepted, including missing arguments, unknown arguments, type mismatches, and nullability violations
- unknown signal and update targets now reject through typed durable command outcomes instead of being accepted implicitly or failing as bare adapter errors
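The four rejection cases above can be sketched as one contract check. The contract array shape and function name here are assumptions for illustration; the real engine validates against the durable `WorkflowStarted` snapshot:

```php
<?php

// Illustrative declared-signal-contract check: build durable-style
// validation_errors for missing arguments, unknown arguments, type
// mismatches, and nullability violations.
function validateSignalArguments(array $contract, array $args): array
{
    $errors = [];

    foreach ($contract as $name => $rule) {
        if (!array_key_exists($name, $args)) {
            $errors[] = ['argument' => $name, 'error' => 'missing_argument'];
            continue;
        }
        $value = $args[$name];
        if ($value === null) {
            if (!($rule['allows_null'] ?? false)) {
                $errors[] = ['argument' => $name, 'error' => 'null_not_allowed'];
            }
            continue;
        }
        if (($rule['type'] ?? null) !== null && gettype($value) !== $rule['type']) {
            $errors[] = ['argument' => $name, 'error' => 'type_mismatch'];
        }
    }

    foreach (array_keys($args) as $name) {
        if (!array_key_exists($name, $contract)) {
            $errors[] = ['argument' => $name, 'error' => 'unknown_argument'];
        }
    }

    return $errors; // non-empty => rejected_invalid_arguments
}

$contract = [
    'amount' => ['type' => 'integer', 'allows_null' => false],
    'note'   => ['type' => 'string', 'allows_null' => true],
];
$errors = validateSignalArguments($contract, ['amount' => 'ten', 'extra' => 1]);
// => missing_argument (note), type_mismatch (amount), unknown_argument (extra)
```

Because the check runs before the signal command is accepted, a bad payload never becomes a durable accepted command.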
- the runtime exposes straight-line workflow-code helpers such as
`activity()`, `await()`, `child()`, `all()`, `sideEffect()`, `timer()`, and `continueAsNew()`, plus explicit external `signal()`, `update()`, `repair()`, `cancel()`, `terminate()`, and `archive()` commands; signal, update, repair, cancel, and terminate target the current instance run, while archive can target a closed selected run in the instance
- `sideEffect()` records one typed `SideEffectRecorded` history event per workflow step, and replay or query paths reuse that committed value instead of re-running the closure
- activity completion, activity failure, activity cancellation, handled failure continuation, child workflow scheduling and closure, side-effect recording, timer fire, signal receipt/application, update acceptance/application/completion or rejection, continue-as-new lineage, accepted repair commands, cancellation, termination, and archival append typed history before the run summary is updated
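A straight-line workflow body in the style this section describes might read as follows. The helper names (`activity()`, `timer()`, `await()`) come from the section, but the implementations below are toy stand-ins so the sketch runs on its own; there are no Fibers, queues, or durable history here:

```php
<?php

// Toy stand-ins for the workflow-code helpers. The real helpers suspend the
// workflow Fiber and record typed history; these just return canned values.
function activity(string $type, array $args): mixed
{
    $results = ['charge-card' => 'charged', 'ship-order' => 'shipped'];
    return $results[$type] ?? null;
}

function timer(int $seconds): void
{
    // The real helper suspends durably for the delay; this returns at once.
}

function await(string $signalName): mixed
{
    // The real helper suspends until a durable signal payload arrives.
    return ['approved' => true];
}

// The workflow itself stays straight-line: no yield, no callbacks for the
// ordinary sequential path.
function orderWorkflow(string $orderId): array
{
    $payment  = activity('charge-card', ['order' => $orderId]);
    $approval = await('manager-approval'); // blocks on one named signal
    timer(60);                             // durable delay in the real engine
    $shipment = activity('ship-order', ['order' => $orderId]);

    return [$payment, $approval['approved'], $shipment];
}

var_dump(orderWorkflow('order-1')); // ['charged', true, 'shipped']
```

The point of the straight-line contract is that this exact body is what gets replayed: each helper call lines up with one committed history step.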
- when multiple accepted signal commands are pending for the same run, the current slice applies them in durable
`command_sequence` order
- those repeated same-name signals now also keep one durable
`signal_wait_id` end-to-end in typed history, even when a later wait opens only after the signal command was already accepted
- signal waits and typed timeline entries now keep the accepted command's snapped sequence, target, payload preview, source, and transport-adjacent task metadata in the event payload itself, so Waterline can still explain repeated same-name signals and event-era task state even if the mutable command or task rows drift later
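The `command_sequence` ordering rule reduces to a deterministic sort. The row shape below is invented for the sketch; only the rule itself (sequence wins, `created_at` ties never matter) is from this section:

```php
<?php

// Illustrative ordering: pending accepted signals apply in durable
// command_sequence order, never by created_at ties.
$pending = [
    ['command_sequence' => 7, 'created_at' => '2024-06-01 12:00:00', 'name' => 'third'],
    ['command_sequence' => 5, 'created_at' => '2024-06-01 12:00:00', 'name' => 'first'],
    ['command_sequence' => 6, 'created_at' => '2024-06-01 12:00:00', 'name' => 'second'],
];

// created_at is identical on every row, so timestamp ordering would be
// ambiguous; command_sequence still yields one deterministic order.
usort($pending, fn (array $a, array $b) => $a['command_sequence'] <=> $b['command_sequence']);

print_r(array_column($pending, 'name')); // first, second, third
```

This is also why backfilling older rows into the same per-run sequence matters: once every row has a sequence, one comparison orders mixed-era commands.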
- activity claim advances the durable execution attempt count, mints a fresh current-attempt id, opens one durable
`activity_attempts` row for that try, and only lets that currently claimed attempt write completion or failure history, so late results from an expired lease cannot overwrite a newer reclaimed attempt
- adapter-style activity workers can use
`Workflow\V2\ActivityTaskBridge` as the first worker boundary: claim a ready durable activity task by task id, receive the codec-tagged argument payload plus activity type, heartbeat by `activity_attempt_id`, and complete or fail that attempt without loading the PHP activity class. The same bridge is exposed over authenticated HTTP/JSON through the webhook routes for `activity-tasks/{taskId}/claim` and `activity-attempts/{attemptId}` status, heartbeat, completion, and failure. Bridge claims and the default PHP activity job share the same backend or compatibility checks, lease creation, durable attempt row, and typed `ActivityStarted` history path, while the same recorder writes `ActivityCompleted`, `ActivityFailed`, `ActivityRetryScheduled`, or `ActivityCancelled` history and dispatches the next durable workflow or retry task after commit. Adapter workers can call `heartbeatStatus($activityAttemptId)` or the matching heartbeat webhook for a structured stop contract with `can_continue`, `cancel_requested`, `reason`, attempt/task/run status, and lease timestamps; when a cancel or terminate command has closed the run, that heartbeat response closes the attempt lease, records the cancellation observation if it is missing, and late completion or failure is ignored as stale. This is still a first bridge for known durable ids, not yet a long-poll discovery or hosted worker-service contract
- retryable activity failures close the failed attempt, record
`ActivityRetryScheduled`, and create the next durable activity task from the activity execution's snapped `retry_policy` while the workflow stays parked on the same activity execution until a later attempt succeeds or the snapped retry budget is exhausted; the `ActivityScheduled` history snapshot, Waterline activity detail, and history export carry that policy, and `activityId()` remains the execution-level idempotency key for external side effects
- the upgrade path also normalizes older already-started activity executions that predate
`activity_attempts` into one latest-known durable attempt row plus `current_attempt_id`, so heartbeat renewal, repair, and Waterline attempt detail keep working across mixed-era data even though earlier releases never stored every historical attempt durably
- queue jobs carry task ids, while the durable task row remains the source of truth for whether work is ready, leased, or completed
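The current-attempt guard described a few bullets up can be sketched as a small class; the names and method shapes are invented for the sketch, and the real guard lives in durable rows rather than object state:

```php
<?php

// Illustrative attempt guard: only the currently claimed attempt may write
// completion or failure history, so a late result from an expired lease
// cannot overwrite a newer reclaimed attempt.
final class ActivityExecution
{
    public function __construct(
        public int $attemptCount = 0,
        public ?string $currentAttemptId = null,
        public ?string $result = null,
    ) {
    }

    public function claim(): string
    {
        $this->attemptCount++;
        $this->currentAttemptId = 'attempt-' . $this->attemptCount;
        return $this->currentAttemptId;
    }

    public function complete(string $attemptId, string $result): bool
    {
        if ($attemptId !== $this->currentAttemptId) {
            return false; // stale attempt: ignore the late result
        }
        $this->result = $result;
        return true;
    }
}

$execution = new ActivityExecution();
$first  = $execution->claim();  // lease later expires...
$second = $execution->claim();  // ...and the task is reclaimed

var_dump($execution->complete($first, 'late result'));   // bool(false)
var_dump($execution->complete($second, 'fresh result')); // bool(true)
```

Minting a fresh attempt id on every claim is the whole trick: the stale worker still holds the old id, so its write is recognizably late.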
- ordinary queue workers also run a light recovery sweep on
`Looping`, which records a database-backed compatibility heartbeat snapshot for that worker, optional compatibility namespace, and queue scope, re-dispatches overdue ready tasks, reclaims expired workflow, activity, and timer task leases, and recreates missing workflow, child-resolution workflow, accepted-update, accepted-signal, pending-activity, or timer tasks for runs already projected as `repair_needed` with no open task row, without duplicating in-flight running activities; when the pending activity's mutable execution row is also gone, recovery restores it from typed `ActivityScheduled` history before creating the replacement activity task. Selected-run task detail now exposes those lost transport expectations before repair as synthetic `transport_state = missing` rows with `task_missing = true`, carrying the activity, timer, condition-wait, child, update, signal, command, retry, `expected_task_id`, or generic selected-run workflow-task identity that typed history, wait state, or the no-resume-source invariant can still prove. Child-resolution, accepted-update, and accepted-signal workflow tasks carry `workflow_wait_kind`, open wait id, resume source, and the durable child, update, signal, or command id metadata both when they are first scheduled and when repair recreates them, so Waterline can tie command-application or child-result transport back to the durable source before and after transport loss. The redispatch threshold, loop throttle, scan limit, and repeated-failure backoff cap are configured under `workflows.v2.task_repair` and echoed in Waterline operator metrics as `repair_policy`. The first dispatch failure can be repaired immediately, while repeated dispatch or claim failures set durable `repair_available_at` backoff on the task and keep Waterline detail in `transport_state = repair_backoff` until the next repair window.
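One way to picture the `repair_available_at` backoff is an exponential delay with a cap. The exact curve, base, and cap are not specified in this section, so everything numeric below is an assumption; only "first failure repairs immediately, repeated failures back off up to a configured cap" comes from the text:

```php
<?php

// Illustrative backoff for repeated dispatch/claim failures. The curve and
// the default numbers here are assumptions for the sketch, standing in for
// whatever workflows.v2.task_repair actually configures.
function repairDelaySeconds(int $failureCount, int $baseSeconds = 30, int $capSeconds = 3600): int
{
    if ($failureCount <= 1) {
        return 0; // first failure: eligible for immediate repair
    }

    // Double the delay per additional failure, capped.
    return min($capSeconds, $baseSeconds * (2 ** ($failureCount - 2)));
}

foreach ([1, 2, 3, 4, 10] as $failures) {
    echo $failures, ' => ', repairDelaySeconds($failures), PHP_EOL;
}
// 1 => 0, 2 => 30, 3 => 60, 4 => 120, 10 => 3600
```

While the computed delay has not elapsed, the task would sit in `transport_state = repair_backoff` rather than being retried on every sweep.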
Candidate selection is scope-fair across `connection`, `queue`, and `compatibility`, so one hot repair scope cannot consume every existing-task or missing-run slot in a worker-loop pass while other scopes have candidates. Task claim still respects compatibility markers and the backend capability matrix before leasing, but transport-level recovery no longer depends on the scanning worker being able to execute that task itself. During a rolling upgrade, the fleet view also falls back to the older cache heartbeat format until those workers have restarted onto the database-backed snapshot path.
- selected-run detail exposes that fleet view as
`compatibility_namespace`, `compatibility_fleet_reason`, and `compatibility_fleet`, with one in-scope worker snapshot carrying `worker_id`, `namespace`, `host`, `process_id`, `connection`, `queue`, `supported`, `supports_required`, `recorded_at`, `expires_at`, and `source`, so Waterline can show which active workers are actually advertising the selected marker instead of only a boolean summary, isolate fleets when several apps share one workflow database, and label legacy cache snapshots during mixed-fleet upgrades; when a compatibility namespace is configured, database heartbeat rows must match it, while older cache snapshots remain visible as rollout fallback with `namespace = null` until those workers restart onto the new path
- the Waterline dashboard stats endpoint is served through
`OperatorObservabilityRepository::dashboardSummary()`, so totals, recent-run counts, extrema from the run-summary projection, and deeper operator metrics share the same replaceable operator-observability boundary as selected-run detail and history export. The metrics come from durable run summaries, workflow tasks, activity executions, activity attempts, worker compatibility heartbeats, projection rows, history events, and backend capability diagnostics, including archived run counts, runnable task backlog, retrying activity counts, failed activity-attempt counts, delayed and leased task counts, unhealthy transport, claim, or lease counts, repair-needed runs, claim-failed runs, compatibility-blocked runs, selected-run wait projection drift, timeline projection drift, active worker counts, active queue-scope counts, active repair-policy thresholds, scope-fair selected repair candidates, per-queue repair-pressure scopes, the configured database/queue/cache capability snapshot, and how many active workers advertise the current required compatibility marker.
- the Waterline dashboard stats endpoint also exposes history-budget metrics from durable run summaries, including how many selected runs currently recommend
`continueAsNew()`, the maximum projected event count, the maximum projected history byte size, and the active thresholds
- the Waterline dashboard stats endpoint also exposes projection health under
`operator_metrics.projections`. `run_summaries` includes total durable runs, projected summaries, missing summaries, stale summaries whose durable run fields drifted, orphaned summaries, and whether a rebuild is needed. `run_waits` includes wait row count, projected-run count, canonical wait-run count, projected canonical wait-run count, missing wait-run count, stale projected wait-run count, summaries with current open waits, missing current open-wait rows, orphaned wait rows, and whether a rebuild is needed. `run_timeline_entries` includes history-event count, timeline row count, projected-run count, canonical history-run count, projected canonical history-run count, missing history-run count, stale projected history-run count, missing history-event rows, orphaned timeline rows, and whether a rebuild is needed. `run_timer_entries` includes timer row count, projected-run count, canonical timer-run count, projected canonical timer-run count, missing timer-run count, stale projected timer-run count, orphaned timer rows, and whether a rebuild is needed. `run_lineage_entries` includes lineage row count, projected-run count, canonical lineage-run count, projected canonical lineage-run count, missing lineage-run count, stale projected lineage-run count, orphaned lineage rows, and whether a rebuild is needed. Operators can refresh that bridge with `php artisan workflow:v2:rebuild-projections --needs-rebuild --prune-stale`, use `--missing` for only absent summary rows, and use `--prune-stale` to remove summaries whose run row no longer exists. `--needs-rebuild` uses the same canonical wait, timeline, timer, and lineage projector comparisons that selected-run detail and history export use, so stale selected-run payload drift is rebuilt even when rows still exist.
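The `run_summaries` health numbers can be pictured as simple set comparisons between durable runs and projected summaries. The row shapes and the single `status` comparison below are invented for the sketch; the real check compares every projected run field:

```php
<?php

// Illustrative computation of projection health: total durable runs,
// projected summaries, missing summaries, stale summaries, orphans, and a
// needs-rebuild verdict.
function runSummariesHealth(array $runs, array $summaries): array
{
    $runsById = array_column($runs, null, 'run_id');

    $missing  = array_diff(array_keys($runsById), array_column($summaries, 'run_id'));
    $stale    = 0;
    $orphaned = 0;

    foreach ($summaries as $summary) {
        $run = $runsById[$summary['run_id']] ?? null;
        if ($run === null) {
            $orphaned++;                              // summary with no durable run
        } elseif ($run['status'] !== $summary['status']) {
            $stale++;                                 // durable run fields drifted
        }
    }

    return [
        'total_runs'    => count($runs),
        'projected'     => count($summaries),
        'missing'       => count($missing),
        'stale'         => $stale,
        'orphaned'      => $orphaned,
        'needs_rebuild' => $missing !== [] || $stale > 0 || $orphaned > 0,
    ];
}

$runs = [
    ['run_id' => 'r1', 'status' => 'completed'],
    ['run_id' => 'r2', 'status' => 'running'],
    ['run_id' => 'r3', 'status' => 'running'],
];
$summaries = [
    ['run_id' => 'r1', 'status' => 'completed'],
    ['run_id' => 'r2', 'status' => 'completed'], // stale: the run says running
    ['run_id' => 'r9', 'status' => 'running'],   // orphaned: no run row
];

print_r(runSummariesHealth($runs, $summaries));
// missing: 1 (r3), stale: 1, orphaned: 1, needs_rebuild: true
```

In those terms, `--missing` targets the missing bucket, `--prune-stale` removes the orphans, and `--needs-rebuild` covers stale drift.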
The command rebuilds the selected run's summary, its `workflow_run_waits` rows, its `workflow_run_timeline_entries` rows, its `workflow_run_timer_entries` rows, and its `workflow_run_lineage_entries` rows in the same pass, and honors configured v2 run, run-summary, run-wait, run-timeline-entry, run-timer-entry, and run-lineage-entry model classes so it repairs the same projection surface Waterline reports
- selected runs can be exported as a versioned replay/debug bundle through
`Workflow\V2\Support\HistoryExport`, `WorkflowStub::historyExport()`, Waterline's selected-run history export endpoint, or `php artisan workflow:v2:history-export`. The bundle keeps the ordered typed history events, selected-run projection metadata under `selected_run`, selected-run `waits` and `timeline` snapshots, command records, signal lifecycles, update lifecycles, task rows, activities, activity attempts, timers, failures, lineage links, run metadata, archive metadata, compatibility marker, payload codec, and raw stored argument/output payloads in one JSON-friendly artifact. It also carries `codec_schemas` and a `payload_manifest` so offline consumers can enumerate every encoded payload path, codec, redaction state, Avro framing mode, and writer schema instead of inferring decode rules from section names. Exported activity, activity-attempt, timer, and lineage sections are rebuilt from typed history first, with mutable activity, attempt, timer, and `workflow_links` rows kept as fallback or enrichment for older data. Exported activity status, unsupported-history diagnostics, synthetic current-attempt visibility, and the `diagnostic_only` flag come from the same mixed-era activity view as selected-run detail, so row-only terminal or open-row fallback activity evidence stays aligned across both surfaces. Timer exports use the same authority contract as selected-run detail: `status` is the authoritative timer state, `source_status` is the status value from that authority, and `row_status` is only mutable-row diagnostics. Completed, failed, or cancelled activity rows without typed activity history export as unsupported diagnostics instead of durable results. Fired or cancelled timer rows without typed timer history also export as unsupported diagnostics with `diagnostic_only = true`, `history_authority = unsupported_terminal_without_history`, `history_unsupported_reason = terminal_timer_row_without_typed_history`, and the mutable terminal state preserved only as `row_status`.
Lineage export follows that same selected-run snapshot boundary: `links.parents[*]` and `links.children[*]` echo the projected lineage payload, including `history_authority` and `diagnostic_only`, instead of doing an extra live `workflow_links` reread on the export path. A configured `Workflow\V2\Contracts\HistoryExportRedactor` can replace exported payload and diagnostic fields before the artifact leaves the app; the bundle reports `redaction.applied`, `redaction.policy`, and the concrete `redaction.paths` that were passed through that policy. Each exported artifact also carries `integrity.canonicalization`, a SHA-256 `integrity.checksum`, and, when `workflows.v2.history_export.signing_key` is configured, an HMAC-SHA256 `integrity.signature` plus optional `integrity.key_id` so warehouse or incident-review tooling can verify the exact redacted bundle it received. Terminal runs set `history_complete = true`; non-terminal runs can still be exported as point-in-time debugging snapshots but are not archive-complete.
- the run summary projection carries the operator-facing next-resume view, including
`business_key`, `visibility_labels`, `liveness_state`, `open_wait_id`, `resume_source_kind`, `resume_source_id`, `next_task_id`, `next_task_type`, `next_task_status`, `sort_timestamp`, an opaque `sort_key` for stable Waterline list ordering, `history_event_count`, `history_size_bytes`, `continue_as_new_recommended`, and `is_terminal` so list/detail consumers can distinguish closed runs without re-deriving that state from raw status strings; Waterline applies that same `sort_timestamp` plus run-id tie-breaker contract when it queries list pages, and now uses raw terminal status for dedicated `failed`, `cancelled`, and `terminated` list screens while still preserving `status_bucket = failed` as the compatibility bridge for the latter two states
- Waterline list routes echo the active visibility contract under
`visibility_filters`, including the contract `version`, selected `bucket`, exact-match field `definition`, merged `applied` filters after any saved-view resolution, and the resolved `saved_view` payload when `?view=...` is in effect; the current exact-match filter set covers `instance_id`, `run_id`, `workflow_type`, `business_key`, `compatibility`, `declared_entry_mode`, `declared_contract_source`, `connection`, `queue`, `status`, `status_bucket`, `closed_reason`, `wait_kind`, `liveness_state`, `repair_blocked_reason`, `repair_attention`, `task_problem`, `is_current_run`, `continue_as_new_recommended`, `archived`, `is_terminal`, and exact `label[key]=value` / `labels[key]=value` matches. The current contract version is `5`, and current builds accept saved views written against versions `1` through `5`. The shared `definition` payload includes field labels, editor input types, bounded-field option catalogs, ordering, label-textarea metadata, and the repair-triage option catalog (`description`, `tone`, and `badge_visible`) so Waterline and other operator clients can render the current filter and repair contract instead of hard-coding one. Waterline also ships built-in system views such as `system:running`, `system:running-task-problems`, and `system:running-repair-blocked`; the repair-blocked view applies `repair_attention = true` so clients can reuse the durable/searchable badge contract instead of hard-coding reason codes
- selected-run wait rows are persisted in
`workflow_run_waits` by the run-summary projection pass and distinguish an open backing task from a merely historical task row, so Waterline can tell the difference between healthy resume backing and stale task metadata. Selected-run timeline rows are persisted in `workflow_run_timeline_entries`, selected-run timer rows in `workflow_run_timer_entries`, and selected-run lineage rows in `workflow_run_lineage_entries`, from that same rebuildable projection pass. Selected-run detail and history export both read those surfaces through the same selected-run snapshot contract, so export-level `selected_run.waits_projection_source`, `selected_run.timeline_projection_source`, `selected_run.timers_projection_source`, and `selected_run.lineage_projection_source` match the same rebuildable wait, timeline, timer, and lineage payloads that detail uses. When rows were already in sync, those sources report `workflow_run_waits`, `workflow_run_timeline_entries`, `workflow_run_timer_entries`, and `workflow_run_lineage_entries`; when detail or history export had to recreate missing or stale projection rows on read, they report the matching `*_rebuilt` source instead of falling back to ad hoc live reconstruction. Fleet metrics expose the same missing, stale, and orphaned projection drift under `operator_metrics.projections.*`, with wait, timeline, timer, and lineage rebuild selection driven by the same canonical projector comparisons used by selected-run detail and export. The rebuilt payloads continue to surface older compatibility bridges only as typed-history-backed diagnostics or enrichment rather than as the primary selected-run contract
- selected-run detail and history export also report
`current_run_source`, so instance-scoped lookups can say whether the current run came from typed continue-as-new lineage or from the durable run-order fallback when `workflow_instances.current_run_id` drifted
- the webhook surface mirrors that command model with explicit start routes plus both instance-targeted and run-targeted command routes under
`/webhooks/instances/{workflowId}` and `/webhooks/instances/{workflowId}/runs/{runId}`
- `continueAsNew()` keeps the public instance id stable, closes the selected run with `closed_reason = continued`, creates the next run immediately, records an explicit lineage link instead of relying on relationship sentinels, and gives that new run its own accepted `start` command sourced from the prior run
- when that handoff happens after workflow-class drift, the new run now stores the resolved class from the durable type map and snapshots its declared signal or update contract from that resolved definition instead of carrying the stale missing FQCN forward forever
When a worker can no longer load a stored PHP workflow or activity class name directly, the engine can fall back to the configured durable type map. That lets you keep the durable `workflow_type` or `activity_type` stable across class renames, as long as the new code still registers the old durable key. Starts from reserved instances and later `continueAsNew()` generations also normalize newly written runs onto that resolved workflow class, so command-contract snapshots and future replay do not depend on the dead class name lingering in storage.
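As a hedged illustration of that fallback (the config file and `type_map` key shown here are assumptions, not the package's documented names), a durable type map entry keeps the stored durable key resolving to the renamed class:

```php
<?php
// config/workflows.php — hypothetical shape; check the package's real
// type-map configuration before relying on these key names.
return [
    'v2' => [
        'type_map' => [
            // stored durable workflow_type key => current PHP class
            'App\\Workflows\\LegacyOrderWorkflow' => App\Workflows\OrderWorkflow::class,
        ],
    ],
];
```

The important property is that the key on the left never changes once runs exist in storage, even when the class on the right moves or is renamed.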
The current task types are:
- workflow task
- activity task
- timer task
Child waiting is modeled through the durable parent/child link and child run state rather than through a separate child task type. When a child closes, the parent resumes through a normal workflow task whose payload names the child-resolution source.
Queries, updates, and child workflows are implemented. The child-workflow surface is intentionally narrow in the current release.
In the current child-workflow slice, calling `child()` creates a durable child instance and child run plus a durable parent/child lineage link. That child run also gets its own accepted start command with `source = workflow`, so selected-run command history can explain which parent run, workflow step, and stable `child_call_id` created it. The parent run waits with `wait_kind = child` while that child run is active. When the child closes, the runtime first records the parent's own typed child-resolution event, then creates a parent workflow task whose payload carries `workflow_wait_kind = child`, `child_call_id`, `child_workflow_run_id`, `open_wait_id`, and the `child_workflow_run` resume source. A completed child resumes the parent with the child output, while a failed child resumes the parent by throwing an exception derived from parent-side child failure history. Parent replay and query now require the parent's typed `ChildRun*` history as the durable child-outcome authority. Child terminal history and legacy mutable child rows remain available for lineage, diagnostics, and payload enrichment after the parent-side resolution event exists, but a terminal child row without parent typed child history is not enough to resume or query the parent. In particular, once the parent has durably entered a child wait through `ChildWorkflowScheduled` or `ChildRunStarted`, query replay keeps that step blocked until the parent commits its own `ChildRunCompleted`, `ChildRunFailed`, `ChildRunCancelled`, or `ChildRunTerminated` history instead of treating a drifted terminal child row as if the parent had already observed the outcome.
If only the child terminal row or link survives and the parent typed child-step history is missing, worker and query replay block with `history_shape_mismatch` because the recorded events carry no typed history; selected-run child waits surface `status = unsupported`, `history_authority = unsupported_terminal_without_history`, and `history_unsupported_reason = terminal_child_link_without_typed_parent_history`, and run-summary liveness reports `workflow_replay_blocked` when that unsupported child fallback is the selected run's only apparent progress source. If the parent workflow task row is lost after that resolution event, selected-run detail stays anchored to the typed child-resolution history and repair recreates the task with the same child payload.
Explicitly unsupported older activity, timer, and child fallbacks remain visible in selected-run waits as diagnostics, but selected-run task detail does not synthesize missing transport rows for those unsupported waits.
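The parent-side contract above can be sketched as follows; the workflow and payload names are hypothetical, and the exact `child()` argument signature is an assumption:

```php
use Workflow\V2\Workflow;
use function Workflow\V2\child;

class ParentWorkflow extends Workflow
{
    public function handle(): string
    {
        try {
            // Suspends with wait_kind = child until the child run closes;
            // a completed child resumes the parent with the child output.
            return child(ShipmentWorkflow::class, ['order_id' => 42]);
        } catch (\Throwable $e) {
            // A failed child resumes the parent by throwing an exception
            // derived from parent-side child failure history.
            return 'shipment failed: ' . $e->getMessage();
        }
    }
}
```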
The runtime also supports `all([...])` fan-in barriers for activities, for child workflows, and for mixed activity-plus-child groups, including nested `all([...])` groups inside the same workflow step. Build those barriers with closures such as `fn () => activity(...)` and `fn () => child(...)`, or nested `all([...])` groups. The parent schedules every durable leaf member, waits until the whole enclosing barrier tree can make progress, returns results in the original nested array shape once every member completes successfully, wakes immediately on the first failed activity or the first failed/cancelled/terminated child, and otherwise suppresses the parent wake-up task until the last successful member in every enclosing group closes. When several failed, cancelled, or terminated members are already closed before the parent replays, the thrown failure is selected by earliest recorded close time, with the lower barrier leaf index as the exact-timestamp tie break; the same rule is used for query replay. Waterline detail exposes those grouped waits with `open_wait_count`, innermost `parallel_group_*` metadata, and `parallel_group_path` when one open wait belongs to more than one barrier. Homogeneous activity barriers use `parallel_group_kind = activity`, homogeneous child barriers use `parallel_group_kind = child`, and mixed barriers use `parallel_group_kind = mixed` with `parallel-calls:*` group ids so multiple open waits stay visible as one coherent barrier instead of one activity or child pretending to be the only active wait. Replay compatibility comes from committed typed history: if all typed leaf events for a grouped activity or child are missing `parallel_group_path`, replay blocks as `history_shape_mismatch` instead of guessing from mutable side rows.
`async($callback)` is implemented as a package-owned child workflow with workflow type `durable-workflow.async`. That means async callbacks get real child run ids, parent-side `child_call_id` lineage, typed child start and close history, and Waterline child-wait visibility instead of a side channel. Async callbacks use the same straight-line-only helper contract as named workflows, and the public helper rejects generator-style `yield` callbacks. The callback is serialized with Laravel's serializable closure support, so named `child(...)` workflows remain the better contract for cross-service or long-lived public workflow types.
The runtime includes history-backed child handles through `$this->child()` and `$this->children()`, plus parent-side child signaling helpers on those handles. Separate launch handles for `async(...)` and higher-level bounded-concurrency helpers are still not part of the current surface.
Operator-command behavior is intentionally engine-level:
`repair()` targets the current selected run and restores durable progress when `liveness_state = repair_needed`
- accepted repair commands currently return
`repair_dispatched` when the runtime re-dispatches an overdue ready task, reclaims an expired lease, or recreates a missing workflow, child-resolution workflow, accepted-update, accepted-signal, pending-activity, timer, or condition-timeout workflow task; they return `repair_not_needed` when the run already has a healthy durable resume path, when the selected run is already inside an in-flight activity with authoritative typed activity history but no open task row, or when a caller forces repair on older diagnostic-only mutable activity, timer, or child rows that do not name a durable repair candidate
- accepted repair commands also append typed
`RepairRequested` history entries, which point at the repaired task when one was needed
- healthy signal waits before receipt stay read-only from a repair perspective because
`wait_kind = signal` already names the durable satisfier; after `SignalReceived`, a missing application workflow task is repairable and stays identified as a signal application wait with `workflow_wait_kind = signal`
- healthy child waits stay read-only while the child is still open; after parent-side
`ChildRunCompleted`, `ChildRunFailed`, `ChildRunCancelled`, or `ChildRunTerminated` history is committed, a missing parent resume workflow task is repairable and stays identified as a child-resolution wait with `workflow_wait_kind = child`
- a running activity without an open activity task is surfaced as
`liveness_state = activity_running_without_task` only when typed activity history is still authoritative for that in-flight execution; Waterline leaves the run observable but hides Repair so operators do not duplicate in-flight user code
- older open activity, timer, and child waits that only survive as mutable rows or links without typed history remain visible as diagnostics with
`history_authority = mutable_open_fallback` and `diagnostic_only = true`, but they no longer populate the selected-run durable `wait_kind`, `open_wait_id`, or `resume_source_*` contract; instead the run projects `liveness_state = workflow_replay_blocked`, hides Repair with `repair_blocked_reason = unsupported_history`, and treats those rows as observability-only evidence rather than a durable resume or repair source
- selected-run detail exposes per-action operator availability fields for the implemented surface:
`can_query` / `query_blocked_reason`, `can_signal` / `signal_blocked_reason`, `can_update` / `update_blocked_reason`, `can_repair` / `repair_blocked_reason`, the durable/searchable `repair_attention` bridge, `repair_blocked`, `can_archive` / `archive_blocked_reason`, and `can_cancel` / `cancel_blocked_reason` plus `can_terminate` / `terminate_blocked_reason`; `repair_blocked` is the stable metadata companion for the reason code and carries the operator-facing label, description, tone, and whether Waterline should badge it in list views. The older `can_issue_terminal_commands` flag remains the coarse compatibility bridge for terminal controls
- `cancel()` closes the current run as `cancelled`
- `terminate()` closes the current run as `terminated`
- accepted terminal commands record durable command rows and typed history such as
`CancelRequested` / `WorkflowCancelled` or `TerminateRequested` / `WorkflowTerminated`
- `archive()` marks a closed selected run as archived while preserving its durable history, command audit trail, and history export; it records `ArchiveRequested` plus `WorkflowArchived`, accepts already archived runs as `archive_not_needed`, and rejects open runs as `rejected_run_not_closed`
- rejected repair, cancel, terminate, and archive commands still record durable rejected command rows with outcomes such as
`rejected_not_started`, `rejected_not_current`, `rejected_not_active`, or `rejected_run_not_closed`
- open workflow tasks, pending activity executions, and pending timers are marked
`cancelled`
- open timer waits are superseded durably, and late timer jobs no-op instead of reopening the run
- workflow-level cancellation does not magically hard-stop arbitrary user code already executing inside an activity process
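As a hedged sketch of those engine-level commands (the stub lookup call is an assumed name; only `WorkflowStub::make()` is confirmed elsewhere in this document, and the command methods follow the list above):

```php
use Workflow\V2\WorkflowStub;

// Hypothetical lookup of a stub by public instance id; treat load() as
// an assumption and substitute the package's real lookup API.
$workflow = WorkflowStub::load($workflowId);

$workflow->repair();    // repair_dispatched or repair_not_needed
$workflow->cancel();    // closes the current run as cancelled
$workflow->terminate(); // closes the current run as terminated
$workflow->archive();   // archives a closed run; open runs are rejected
```

Each call records a durable accepted or rejected command row, so the operator action itself is auditable alongside the run history.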
Queues
Queued jobs are background processes scheduled to run at a later time. Laravel supports running queues via Amazon SQS, Redis, or even a relational database. Workflows and activities are both queued jobs, but each behaves a little differently. A workflow is dispatched multiple times during normal operation: it runs, dispatches one or more activities, and then exits until the activities complete. An activity executes only once during normal operation; it is retried only in the case of an error.
Event Sourcing
Event sourcing is a way to build up the current state from a sequence of saved events rather than saving the state directly. This has several benefits, such as providing a complete history of the execution events which can be used to resume a workflow if the server it is running on crashes.
Coroutines
Coroutines are functions that allow execution to be suspended and resumed by returning control to the calling function. Durable suspension points are expressed as straight-line Fiber-backed helper calls such as `activity()`, `await()`, `timer()`, and `sideEffect()`.
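The suspension mechanics can be seen in plain PHP, independent of this package: a helper suspends the current `Fiber`, and a scheduler later resumes it with a result, which becomes the helper's return value — the same shape the durable helpers use.

```php
<?php
// Minimal Fiber suspension sketch (not the package's internals).

function awaitResult(): mixed
{
    // Suspend the current Fiber; the value passed to resume() becomes
    // the return value of this helper, like activity()/timer() results.
    return Fiber::suspend();
}

$fiber = new Fiber(function (): string {
    $first = awaitResult();  // suspend until the first "activity" completes
    $second = awaitResult(); // suspend again for the second one
    return "$first + $second";
});

$fiber->start();       // runs until the first suspension point
$fiber->resume('a');   // deliver the first result, run to the next suspension
$fiber->resume('b');   // deliver the second result, run to completion

echo $fiber->getReturn(), PHP_EOL; // a + b
```

Notice that the workflow body reads as straight-line code even though control leaves and re-enters it twice.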
User workflow code lives in `handle()`, an ordinary method that calls those suspension helpers directly. Older workflows and activities that still implement `execute()` continue to load through a compatibility path, but the runtime rejects mixed `handle()`/`execute()` inheritance so an entry method never silently changes across one class hierarchy. The runtime first checks whether the step already completed durably. If so, the cached result is replayed from history instead of running the step a second time. Otherwise, the runtime queues the next activity, timer, or child work and suspends until that durable step completes or fails.
Activities
A workflow can call multiple activities and orchestrate the results from each of them. The execution of the workflow and the durable steps it schedules are interleaved: the workflow reaches an activity call, suspends until that activity completes, and then continues execution from where it left off.
If a workflow fails, the events leading up to the failure are replayed to rebuild the current state. This allows the workflow to pick up where it left off, with the same inputs and outputs as before, ensuring determinism.
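A minimal sketch of that replay contract (not the package's internals): each step consults the recorded history first, so a replay reuses durable results instead of re-executing side effects.

```php
<?php
// Replay-by-memoization sketch: the durable log maps step index => result.

$history = []; // stands in for the persisted event stream
$calls = 0;    // counts real executions, to show replay skips them

function step(array &$history, int $index, callable $work, int &$calls): mixed
{
    if (array_key_exists($index, $history)) {
        return $history[$index];       // replay: reuse the recorded result
    }
    $calls++;
    return $history[$index] = $work(); // first run: execute and record
}

$run = function () use (&$history, &$calls): int {
    $a = step($history, 0, fn () => 2, $calls);
    $b = step($history, 1, fn () => 3, $calls);
    return $a + $b;
};

$first  = $run(); // executes both steps and records them
$replay = $run(); // rebuilds the same state purely from $history

// $first and $replay are both 5, and $calls stays at 2 after the replay.
```

This is why determinism matters: replay only produces the same state if each step yields the same result for the same inputs.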
Promises
Promises are used to represent the result of an asynchronous operation, such as an activity. The helper call itself suspends through the Fiber-backed runtime while keeping deterministic wait behavior.
Example
Straight-line Fiber-backed helpers are the authoring model.
use Workflow\V2\Workflow;
use function Workflow\V2\{activity, all};
class MyWorkflow extends Workflow
{
    public function handle(): array
    {
        return [
            activity(TestActivity::class),
            activity(TestOtherActivity::class),
            all([
                fn () => activity(TestParallelActivity::class),
                fn () => activity(TestParallelOtherActivity::class),
            ]),
        ];
    }
}
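Starting the example could look like the following sketch; `WorkflowStub::make()` is described in the runtime notes above, while `start()` and `output()` are assumed method names for accepting the start and reading the durable result.

```php
use Workflow\V2\WorkflowStub;

// Reserve a durable workflow instance id for MyWorkflow.
$workflow = WorkflowStub::make(MyWorkflow::class);

// Accepting the start creates a run id, a durable start command record,
// and the first workflow task in one transaction.
$workflow->start();

// Later, once the run closes, read the durable result
// (method name assumed).
$result = $workflow->output();
```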
Sequence Diagram
This sequence diagram shows how a workflow progresses through a series of activities, both serial and parallel.
- The workflow starts by getting dispatched as a queued job.
- The first activity, TestActivity, is then dispatched as a queued job. The workflow job then exits. Once TestActivity has completed, it saves the result to the database and returns control to the workflow by dispatching it again.
- At this point, the workflow enters the event sourcing replay loop. This is where it goes back to the database and looks at the event stream to rebuild the current state. This is necessary because the workflow is not a long running process. The workflow exits while any activities are running and then is dispatched again after completion.
- Once the event stream has been replayed, the workflow continues to the next activity, TestOtherActivity, and starts it by dispatching it as a queued job. Again, once TestOtherActivity has completed, it saves the result to the database and returns control to the workflow by dispatching it as a queued job.
- The workflow then enters the event sourcing replay loop again, rebuilding the current state from the event stream.
- Next, the workflow starts two parallel activities, TestParallelActivity and TestParallelOtherActivity. Both activities are dispatched. Once they have completed, they save the results to the database and return control to the workflow.
- Finally, the workflow enters the event sourcing replay loop one last time to rebuild the current state from the event stream. This completes the execution of the workflow.
Summary
The sequence diagram illustrates the workflow starting with TestActivity and then TestOtherActivity being executed in series. After both activities complete, the workflow replays the events to rebuild the current state. This process is necessary to ensure that the workflow can be resumed after a crash or other interruption.
The need for determinism comes into play when the events are replayed. In order for the workflow to rebuild the correct state, the code for each activity must produce the same result when run multiple times with the same inputs. This means that activities should avoid using things like random numbers (unless using a side effect) or dates, as these will produce different results each time they are run.
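For example, a random token has to go through `sideEffect()` so that replay sees the value recorded in history rather than a fresh draw; the activity name and argument passing below are assumptions:

```php
use Workflow\V2\Workflow;
use function Workflow\V2\{activity, sideEffect};

class TokenWorkflow extends Workflow
{
    public function handle(): string
    {
        // Recorded in history on the first execution; every replay of this
        // step returns the identical token instead of a new random value.
        $token = sideEffect(fn () => bin2hex(random_bytes(8)));

        // Hypothetical activity that consumes the stable token.
        return activity(SendTokenActivity::class, $token);
    }
}
```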
The need for idempotency comes into play when an API fails to return a response even though it has actually completed successfully. For example, if an activity charges a customer and is not idempotent, rerunning it after a failed response could result in the customer being charged twice. To avoid this, activities should be designed to be idempotent.
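A sketch of that design, with a hypothetical payment client and idempotency-key parameter (neither is part of this package — most payment providers expose an equivalent):

```php
class ChargeCustomerActivity
{
    public function __construct(private PaymentClient $payments) {}

    public function handle(string $customerId, int $amountCents, string $chargeKey): string
    {
        // Reusing the same key on a retry makes the provider return the
        // original charge instead of creating a second one.
        return $this->payments->charge(
            customer: $customerId,
            amount: $amountCents,
            idempotencyKey: $chargeKey,
        );
    }
}
```

A stable key derived from the workflow run and step (rather than a fresh random value per attempt) is what makes the retry safe.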