Monitoring
Waterline is a separate UI that works alongside Horizon. Think of Waterline as being to workflows what Horizon is to queues.
Waterline ships only with the embedded Laravel host that installs the
durable-workflow/workflow package; it reads that app's durable state in
process. The standalone server distribution does not run Waterline. Operators
who run the standalone server read the equivalent
durable-state facts through GET /api/system/health,
GET /api/system/operator-metrics, and the workflow control-plane routes
documented in the
Server API Reference. The
Operator Operating Envelope maps the
Waterline routes below onto their server-side counterparts.
Durable Workflow has two observability planes:
| Plane | Source of truth | Typical questions |
|---|---|---|
| Durable state | Workflow database, Waterline projections, and history export | Did the workflow start? Which run is current? Which signal, update, timer, activity, retry, or failure was committed? Which operator action is safe now? |
| Worker/runtime telemetry | Queue worker logs, SDK metrics recorders, Prometheus/OpenMetrics endpoints, and application traces | Are workers polling? How long do tasks take? Is an exporter configured? Did custom application metrics leave the worker process? |
Waterline intentionally does not replace worker metrics. If a custom metric was recorded in activity or worker code, scrape the worker's telemetry endpoint. Use Waterline to correlate that runtime signal with the durable workflow history and current run state.
When worker telemetry shows repeated claims, late completion races, or stuck leases, read Execution Guarantees and Idempotency alongside this guide. That contract separates at-least-once transport uncertainty from duplicate durable outcomes, so that duplicate-looking evidence does not lead to the wrong operational conclusion.
Dashboard View

The dashboard shows running totals, recent-run counters, and fleet-wide metrics so you can tell at a glance whether work is flowing, stalling, or failing.
Use the Operator Operating Envelope when you need the rollout and runbook contract for those facts: which diagnostics block traffic, which are advisory, how queue-health facts split between Waterline and worker telemetry, and how to verify rebuild, export, and archive paths.
Workflow View

The workflow detail view shows the durable timeline for a single run: the activities, signals, timers, and child workflows that happened in order, each with its inputs, outputs, and timing.
Installing Waterline
Install Waterline into your Laravel application alongside the workflow package and run its migrations. See durable-workflow/waterline for the full installation and configuration guide.
List and detail API
Waterline's list views (/waterline/api/flows/{bucket}) and selected-run
detail endpoint (/waterline/api/flows/{id}) return typed JSON contracts
that you can consume directly from your own dashboards or scripts. The
Waterline Operator API Reference documents the
endpoint list, selected-run field families, history export, actionability,
schedules, saved views, preferences, and operator-action contract.
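As a rough illustration of consuming these endpoints from a script, the sketch below builds the list and detail URLs and decodes the JSON body. Only the /waterline/api/flows/ path comes from this page. The helper names, the base URL, and the assumption that no extra authentication headers are needed are all hypothetical, so adapt them to your deployment.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

FLOWS_PATH = "/waterline/api/flows"  # path prefix taken from the routes above


def flows_url(base_url: str, bucket_or_id: str) -> str:
    """Build a list URL (bucket name) or a detail URL (run id)."""
    # Percent-encode the segment so an unusual id cannot alter the path.
    segment = quote(bucket_or_id, safe="")
    return f"{base_url.rstrip('/')}{FLOWS_PATH}/{segment}"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a Waterline endpoint and decode its JSON body."""
    # Assumption: the endpoint is reachable without extra auth headers.
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_json(flows_url("https://app.example.test", "running"))` would request a list view, where both the host and the bucket name `running` are placeholders rather than values confirmed by this guide.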
Actionability Contract
Waterline annotates list rows, selected-run detail responses, and history
exports with a versioned actionability contract. Consumers should treat
actionability_contract.schema = waterline.actionability and
actionability_contract.version = 1 as the contract identifier for the fields
below.
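A consumer might check the contract identifier before trusting any actionability field. The following is a minimal sketch, assuming the dotted names above map to a nested actionability_contract object whose version is an integer. The function name is hypothetical.

```python
EXPECTED_SCHEMA = "waterline.actionability"
EXPECTED_VERSION = 1


def has_supported_contract(payload: dict) -> bool:
    """Return True when the payload declares the contract described above."""
    # Assumption: the contract is a nested object under "actionability_contract".
    contract = payload.get("actionability_contract")
    if not isinstance(contract, dict):
        return False
    return (
        contract.get("schema") == EXPECTED_SCHEMA
        and contract.get("version") == EXPECTED_VERSION
    )
```

Rejecting payloads with a missing or different identifier keeps a script from silently misreading fields if the contract changes in a later version.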
Run-level actionability answers whether the selected run can be repaired:
| Field | Meaning |
|---|---|
| repair_state | One of repairable, blocked, not_needed, or unknown. |
| repairable | Boolean shorthand for repair_state = repairable. |
| blocked_reason | Stable reason code when repair_state = blocked. |
| status_bucket | The Waterline bucket that shaped the run-level decision. |
| closed_reason | Durable close reason when the run is closed. |
| task_problem | Whether Waterline saw a task-level problem on the run. |
| diagnostic_only_evidence | True when at least one child evidence row is informative but not a resume source. |
Evidence rows under activities, waits, timers, exceptions, logs, and
timeline/export entries can also include their own actionability block:
| Field | Meaning |
|---|---|
| state | actionable when the row is a valid repair source, otherwise diagnostic_only. |
| repair_source | True only for rows backed by a repairable source authority. |
| diagnostic_only | True when the row must not be used as a resume source. |
| history_authority | Source authority, such as typed_history, mutable_open_fallback, failure_row_fallback, or unsupported_terminal_without_history. |
| history_unsupported_reason | Stable reason code for unsupported fallback history. |
Automation should gate repair, resume, and replay affordances on
actionability.repair_state, actionability.repairable, and row-level
actionability.repair_source. A row with diagnostic_only = true is never a
durable resume source, even when it contains useful failure or fallback
metadata. Rows with history_authority = unsupported_terminal_without_history
are diagnostic evidence only: they explain why a run is blocked, but they do
not provide enough typed history to rebuild progress safely.
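The gating rules above could be expressed roughly as follows. This is a sketch under stated assumptions, not Waterline's own logic: each function receives an already-extracted actionability object as a dict, the function names are hypothetical, and requiring both the run-level and the row-level check in can_resume_from is a conservative choice made for this example.

```python
def run_allows_repair(run_actionability: dict) -> bool:
    """Run-level gate: the run must be explicitly repairable."""
    return (
        run_actionability.get("repair_state") == "repairable"
        and run_actionability.get("repairable") is True
    )


def row_is_repair_source(row_actionability: dict) -> bool:
    """Row-level gate: only rows backed by a repairable source authority."""
    if row_actionability.get("diagnostic_only") is True:
        return False  # never a durable resume source
    if row_actionability.get("history_authority") == "unsupported_terminal_without_history":
        return False  # diagnostic evidence only
    return row_actionability.get("repair_source") is True


def can_resume_from(run_actionability: dict, row_actionability: dict) -> bool:
    """Assumption for this sketch: offer resume only when both gates pass."""
    return run_allows_repair(run_actionability) and row_is_repair_source(row_actionability)
```

Missing or unexpected values fall through to False, so an affordance is hidden by default rather than shown on incomplete evidence.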
Control-plane actions from Waterline
Operators can cancel, terminate, repair, and archive workflows directly
from the detail view. Each action maps to a POST request for the same run id
and returns either 200 with the resulting state or 409 when the action is
not valid for the run's current state.
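A script that triggers these actions needs to tell the two documented outcomes apart. The sketch below interprets an already-received status code and decoded body. The ActionOutcome type and the function name are hypothetical, and because this page does not describe the response body or the action route paths, the body is passed through untouched and no request is made.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ActionOutcome:
    applied: bool   # True for 200: the action took effect
    conflict: bool  # True for 409: action not valid for the current state
    body: Any       # decoded response body, passed through as-is


def interpret_action_response(status_code: int, body: Any) -> ActionOutcome:
    """Map the documented 200 and 409 responses to an outcome."""
    if status_code == 200:
        return ActionOutcome(applied=True, conflict=False, body=body)
    if status_code == 409:
        return ActionOutcome(applied=False, conflict=True, body=body)
    # Anything else is outside the contract described above.
    raise RuntimeError(f"unexpected status {status_code} from action endpoint")
```

Treating 409 as a normal outcome rather than an exception lets a caller re-read the run's detail response and actionability fields before deciding whether to try a different action.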
Related Guides
- Execution Guarantees and Idempotency explains the replay, retry, lease-expiry, and durable-outcome contract that shapes operator evidence.
- Operator Operating Envelope ties health, queue state, rebuild, export, archive, and topology expectations into one operator contract.
- Failures and Recovery explains retry exhaustion, non-retryable failures, timeouts, and repair behavior behind the dashboard facts.
- AI-Assisted Development names the Waterline, CLI, MCP, and LLM-readable contracts that agents should use when diagnosing workflow state.