Version: 2.0 prerelease

Agent Tooling Contract

This page documents unreleased 2.0 guidance. The public default docs and canonical LLM bundles remain on the stable 1.x line until the release status is explicitly changed.

Durable Workflow v2 keeps AI-assisted development boring by exposing product facts through stable contracts. A tool should not infer workflow state from HTML, parse logs as the source of truth, or guess which SDK behavior matches a CLI command. It should read the docs version, discover the local surface, call documented operations, and report named facts.

This page defines the contract shape that future MCP tools, local agents, scripts, and SDKs should preserve.

Contract Layers

| Layer | Stable handle | Contract expectation |
| --- | --- | --- |
| Docs retrieval | Canonical llms.txt and llms-full.txt track the public site's stable 1.x default docs path. 2.0 remains reachable through the pinned llms-2.0.txt / llms-full-2.0.txt prerelease aliases. | Use canonical URLs for default product work. Pin -1.x.txt when a URL must name the stable major line, and pin -2.0.txt only for explicitly prerelease 2.0 tasks. |
| Local discovery | /mcp/workflows list_workflows | The app-owned MCP allow-list names exposed workflow keys, required credentials, expected arguments, and smoke-test suitability. |
| Workflow operations | MCP start_workflow, get_workflow_result, get_workflow_history; dw JSON commands; SDK clients | Every client reports workflow id, run id, namespace, task queue, command status, and named failure fields without scraping a UI. |
| Server diagnostics | /api/cluster/info, dw server:info --output=json, dw doctor --output=json, dw debug workflow --output=json | Compatibility, protocol, task-queue, worker, and stuck-run facts are machine-readable and bounded. |
| Durable evidence | Waterline selected-run detail and history export | Replay, waits, timers, lineage, projection source, integrity checks, durable failures, and operator actionability come from typed state. |
| Cross-language parity | CLI/Python request fixtures and SDK tests | Shared control-plane operations keep their request shape aligned across languages. |

Each layer should be usable on its own. Together, they give an agent enough context to make a small change, prove it, and explain the result.
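
As an illustration of the docs-retrieval rule, the following minimal sketch picks a bundle file name for a task. The helper `select_bundle` and its parameters are hypothetical, and the `llms-1.x.txt` / `llms-full-1.x.txt` names are an assumption inferred from the `-1.x.txt` suffix named in the table; only the canonical and 2.0 names appear on this page.

```python
from typing import Optional

# Canonical and 2.0 names appear in the table above. The "-1.x" names are
# inferred from the "-1.x.txt" suffix and are an assumption.
BUNDLES = {
    None: ("llms.txt", "llms-full.txt"),
    "1.x": ("llms-1.x.txt", "llms-full-1.x.txt"),
    "2.0": ("llms-2.0.txt", "llms-full-2.0.txt"),
}


def select_bundle(pin: Optional[str] = None, full: bool = False,
                  prerelease_task: bool = False) -> str:
    """Return the bundle file name for a task (hypothetical helper)."""
    if pin not in BUNDLES:
        raise ValueError(f"unknown docs pin: {pin!r}")
    if pin == "2.0" and not prerelease_task:
        # 2.0 bundles are reserved for explicitly prerelease 2.0 tasks.
        raise ValueError("2.0 bundle requested for a non-prerelease task")
    short_name, full_name = BUNDLES[pin]
    return full_name if full else short_name
```

Calling `select_bundle()` with no pin yields the canonical bundle, which matches the "default product work" case; a 2.0 pin has to be requested together with `prerelease_task=True`.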

MCP Tool Design

New MCP tools should expose Durable Workflow concepts directly:

  • Use product nouns such as workflow, run, task queue, schedule, history, failure, worker, namespace, and compatibility.
  • Return stable identifiers and named status fields instead of prose-only summaries.
  • Include bounded arrays and previews for history, failures, and payloads so a client can inspect them without downloading an unbounded trace.
  • Separate discovery from mutation. A client should be able to ask what exists and what credentials are required before starting or commanding a workflow.
  • Mark no-credential smoke workflows explicitly so agents can test local wiring without touching external services.
  • Never include secret values, customer-specific hostnames, or account-specific credentials in tool descriptions or result metadata.
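
One way to read the "bounded arrays and previews" guideline is sketched below. The helper names (`bounded`, `preview`), the default limits, and the result keys are all assumptions made for illustration; they are not part of any documented Durable Workflow result schema.

```python
from typing import Any, Dict, Sequence


def bounded(items: Sequence[Any], limit: int = 20) -> Dict[str, Any]:
    """Return at most `limit` items plus counts that show what was cut."""
    shown = list(items[:limit])
    return {
        "items": shown,
        "returned": len(shown),
        "total": len(items),
        "truncated": len(items) > limit,
    }


def preview(payload: str, max_chars: int = 200) -> Dict[str, Any]:
    """Return a short preview of a payload instead of the full body."""
    return {
        "preview": payload[:max_chars],
        "size": len(payload),
        "truncated": len(payload) > max_chars,
    }
```

Reporting `total` and `truncated` as named fields lets a client decide whether to page further without parsing a prose summary.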

The sample app's /mcp/workflows endpoint is the reference local workflow surface. Future project-specific MCP servers should keep the same posture: configuration owns the allow-list, tools operate only the listed workflows, and results cite durable workflow facts that a human can reproduce through dw, Waterline, or an SDK.
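
The allow-list posture described above could look roughly like the sketch below. The tool names `list_workflows` and `start_workflow` come from this page, but the entry fields, the example workflow keys, the `blocked_reason` values, and the injected `start` callable are hypothetical and only stand in for a real MCP server and client.

```python
from dataclasses import asdict, dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class WorkflowEntry:
    key: str
    required_credentials: Tuple[str, ...] = ()  # credential names, never values
    arguments: Tuple[str, ...] = ()
    smoke_test: bool = False  # True when no credentials are needed


# Configuration owns the allow-list; these two entries are made up.
ALLOW_LIST: Dict[str, WorkflowEntry] = {
    entry.key: entry
    for entry in (
        WorkflowEntry("hello-smoke", smoke_test=True),
        WorkflowEntry("invoice-sync",
                      required_credentials=("BILLING_API_TOKEN",),
                      arguments=("invoice_id",)),
    )
}


def list_workflows() -> List[Dict[str, Any]]:
    """Discovery only: describes what exists without starting anything."""
    return [asdict(entry) for entry in ALLOW_LIST.values()]


def start_workflow(key: str, arguments: Dict[str, Any],
                   start: Callable[[str, Dict[str, Any]], Dict[str, Any]]
                   ) -> Dict[str, Any]:
    """Mutation: refuses anything that is not on the allow-list."""
    entry = ALLOW_LIST.get(key)
    if entry is None:
        return {"status": "rejected", "workflow_key": key,
                "blocked_reason": "workflow_not_allow_listed"}
    missing = [name for name in entry.arguments if name not in arguments]
    if missing:
        return {"status": "rejected", "workflow_key": key,
                "blocked_reason": "missing_arguments", "missing": missing}
    return start(key, arguments)
```

Keeping discovery and mutation in separate functions means an agent can inspect required credentials and arguments before any workflow is started.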

Command And SDK Parity

Automation should be able to move between clients without semantic drift:

| Operation family | Required parity signal |
| --- | --- |
| start, signal, update, query, repair, cancel, terminate, archive | CLI and SDK request bodies match the documented control-plane shape. |
| history, describe, list, result | Responses preserve stable identifiers, status names, timestamps, and failure fields. |
| task queues and workers | Capacity, leases, slots, compatibility, and worker ids remain structured facts. |
| external execution | Input and result envelopes stay language-neutral and carry named bridge outcomes. |

When adding a CLI command, SDK method, or MCP tool for a control-plane action, prefer a shared fixture or documented JSON example that another client can assert against. Human-friendly tables can exist, but JSON or JSONL is the automation contract.
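
A shared fixture check might be as small as the sketch below. The fixture fields (`operation`, `workflow_id`, `signal_name`, `arguments`) are invented for the example and are not the documented control-plane shape; the point is only that every client's request body is compared against one JSON document.

```python
import json
from typing import Any, Dict, List

# Hypothetical fixture; a real one would mirror the documented request shape.
FIXTURE: Dict[str, Any] = json.loads(
    '{"operation": "signal", "workflow_id": "wf-123",'
    ' "signal_name": "approve", "arguments": ["ok"]}'
)


def canonical(body: Dict[str, Any]) -> str:
    """Serialize with sorted keys so key order never counts as drift."""
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


def drifted_clients(fixture: Dict[str, Any],
                    **client_bodies: Dict[str, Any]) -> List[str]:
    """Return the names of clients whose request body differs from the fixture."""
    expected = canonical(fixture)
    return [name for name, body in client_bodies.items()
            if canonical(body) != expected]
```

A test suite in each language can then assert that `drifted_clients(FIXTURE, cli=..., sdk=...)` returns an empty list.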

Diagnosis Report Shape

When a tool explains a failed or stuck run, the report should include:

  • docs version and source page used
  • command, SDK method, or MCP tool called
  • workflow id, run id, namespace, and task queue
  • current status and latest durable failure summary
  • recent typed history event names
  • pending waits, timers, tasks, or external activity leases
  • worker and task-queue compatibility facts when available
  • named exit code, HTTP status, validation error, or blocked reason

That shape keeps reports portable across local Laravel apps, standalone server deployments, Python workers, and future SDKs.
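
The report fields listed above can be carried in a small typed record, as in this sketch. The class name `DiagnosisReport` and every field name are assumptions chosen to mirror the bullets; they are not a published schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DiagnosisReport:
    docs_version: str
    docs_source_page: str
    called: str                      # command, SDK method, or MCP tool
    workflow_id: str
    run_id: str
    namespace: str
    task_queue: str
    status: str
    outcome: Dict[str, Any]          # named exit code, HTTP status, validation error, or blocked reason
    latest_failure: Optional[str] = None
    recent_events: List[str] = field(default_factory=list)   # typed history event names
    pending: List[str] = field(default_factory=list)         # waits, timers, tasks, leases
    compatibility: Optional[Dict[str, Any]] = None            # worker and task-queue facts, when available

    def to_json(self) -> str:
        """Emit the report as JSON so other clients can consume it."""
        return json.dumps(asdict(self), sort_keys=True)
```

Optional fields default to empty values, so a tool can still emit a well-formed report when worker or compatibility facts are unavailable.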

Guardrails For Agents

Give agents permission to automate ceremony, not to bypass the durable model:

  • Keep workflow orchestration deterministic.
  • Put I/O, randomness, external API calls, and credentials in activities or external handlers.
  • Use signals, updates, queries, schedules, timers, and message streams instead of ad hoc state tables for workflow control.
  • Treat Waterline history export as evidence, not as a mutation API.
  • Treat route lists, database internals, and framework-specific model rows as implementation details unless the public docs explicitly name them.

The invariant remains human-readable: workflows record durable decisions, activities perform fallible work, and replay must be able to explain what happened. The tooling contract gives agents stable handles for the rest.
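
The split between deterministic orchestration and fallible activities can be illustrated without any SDK. The sketch below is a generic toy model of replay, not Durable Workflow's API: the history list, the `ActivityCompleted` event name, and both functions are hypothetical.

```python
import random
from typing import Callable, Dict, List


def fetch_quote_activity() -> int:
    """Hypothetical activity: randomness and I/O belong here, not in the workflow."""
    return random.randint(1, 100)


def run_workflow(history: List[Dict], execute: Callable[[], int]) -> str:
    """Decide from recorded results when present; otherwise run the activity once."""
    for event in history:
        if event["type"] == "ActivityCompleted":
            quote = event["result"]  # replay path: reuse the durable record
            break
    else:
        quote = execute()            # first run: perform the fallible work
        history.append({"type": "ActivityCompleted", "result": quote})
    # The decision depends only on recorded data, so replay reproduces it.
    return "accept" if quote < 50 else "reject"
```

Running the workflow a second time against the same history returns the same decision without calling the activity again, which is the property replay relies on.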