Final piece of the contract execution increment.
- run_contract_validation: env up → backend cases + UI flows → env down
- contract_validation_gate: bounded self-heal loop (execute_contract_fix_phase
feeds CONTRACT_EXEC_FAILURES back to a focused fix prompt, commits, re-runs)
- Wired as a per-epic gate after the story loop (v1 granularity: the app
reflects the whole epic before contracts run); enabled when a harness is
present and skipped with --skip-contract-validation
- Exit-code-honest: CONTRACT_VALIDATION_FAILED makes the epic exit non-zero if
contracts never pass, mirroring the preflight gate
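
The bounded self-heal loop can be sketched roughly as follows. This is an illustrative Python model, not the shell implementation: `run_validation`, `apply_fix`, and `max_attempts` are hypothetical stand-ins for run_contract_validation, execute_contract_fix_phase, and whatever retry budget the gate actually uses.

```python
from typing import Callable, List


def contract_validation_gate(
    run_validation: Callable[[], List[str]],
    apply_fix: Callable[[List[str]], None],
    max_attempts: int = 3,  # assumed budget; the real bound is not stated here
) -> bool:
    """Return True once validation reports no failures, False if the budget runs out."""
    for attempt in range(1, max_attempts + 1):
        failures = run_validation()
        if not failures:
            return True
        # Only attempt a fix if another validation run is still allowed.
        if attempt < max_attempts:
            apply_fix(failures)
    return False
```

A caller would map a False result to a non-zero exit, which is the exit-code-honest behaviour described above.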
Tested: the orchestrator brings the env up and down and runs backend cases
against a live mock server, reporting correct pass/fail results with failure
detail. Not exercised here: live UI flows and the fix loop's Claude calls,
which need the real app/CLI.
This completes the UI-contract + execution work held on this branch; ready to
bundle into one PR.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Second half of the contract execution engine (held on branch per bundle plan).
- generate_playwright_spec: translate ui.flows into a Playwright spec: goto /
getByTestId click+fill (with getByLabel/getByRole/getByText fallback),
visible/hidden/text/url assertions, role-based storageState for
allowed/forbidden checks; persistence is delegated to backend cases
- run_ui_flows: generate the spec, run it via the project's `npx playwright
test --reporter=json`, and parse results
- parse_playwright_report: read stats.unexpected + failed titles into
CONTRACT_EXEC_FAILURES for the fix loop
- _pw_locator (testid → label → role → text) and _ts_safe helpers
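
The testid → label → role → text fallback might look something like the sketch below. It is written in Python for illustration only; the key names (`testid`, `label`, `role`, `name`, `text`) are assumptions about the flow schema, and `json.dumps` stands in for the string escaping that _ts_safe provides.

```python
import json


def pw_locator(target: dict) -> str:
    """Return a Playwright locator expression, preferring the most stable selector."""
    if "testid" in target:
        return f"page.getByTestId({json.dumps(target['testid'])})"
    if "label" in target:
        return f"page.getByLabel({json.dumps(target['label'])})"
    if "role" in target:
        role = json.dumps(target["role"])
        if "name" in target:
            # Accessible name narrows the role match, e.g. a specific button.
            return f"page.getByRole({role}, {{ name: {json.dumps(target['name'])} }})"
        return f"page.getByRole({role})"
    if "text" in target:
        return f"page.getByText({json.dumps(target['text'])})"
    raise ValueError("target needs one of: testid, label, role, text")
```

Checking the keys in a fixed order means a step that carries both a test id and visible text always resolves to the test id.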
Tested: spec generation for the canonical "create a quote" allowed flow + a
"viewer cannot" forbidden flow produces correct TS; report parsing handles
pass and fail. Live browser execution needs the real app + browser binaries.
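
For reference, report parsing along these lines could be sketched as below. This is a Python illustration under the assumption that the JSON reporter output has a top-level `stats.unexpected` count and nested `suites`, each with `specs` carrying `title` and `ok`; the " > " title separator is an arbitrary choice.

```python
import json
from typing import List, Tuple


def parse_playwright_report(report_text: str) -> Tuple[int, List[str]]:
    """Return (unexpected count, titles of failed specs) from a JSON report."""
    report = json.loads(report_text)
    unexpected = report.get("stats", {}).get("unexpected", 0)
    failed: List[str] = []

    def walk(suite: dict, prefix: List[str]) -> None:
        # Build a readable path of suite titles down to each failing spec.
        path = prefix + [suite["title"]] if suite.get("title") else prefix
        for spec in suite.get("specs", []):
            if not spec.get("ok", True):
                failed.append(" > ".join(path + [spec["title"]]))
        for child in suite.get("suites", []):
            walk(child, path)

    for suite in report.get("suites", []):
        walk(suite, [])
    return unexpected, failed
```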
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

First half of per-story contract execution (held on branch per the bundle plan).
- contract-exec.sh: granularity-agnostic engine the caller can invoke per-story
or per-epic
- contract_env_up / contract_env_down: bring the sample env up (setup → start →
poll readiness) and tear it down
- run_backend_cases: for each harness case, call the API (curl), assert status
and response body_contains (jq subset match), then verify persistence by
invoking datastore.verify_command as `<cmd> --table <t> --where <json>`;
failures are collected in CONTRACT_EXEC_FAILURES for the fix loop
- _json_contains: JSON-subset assertion helper
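
As an illustration of what a JSON-subset assertion means, here is a minimal Python sketch. It is not the jq-based helper itself, and it simplifies scalar handling to plain equality, which may differ from the real matcher for strings.

```python
def json_contains(actual, expected) -> bool:
    """True if every key/value in `expected` also appears in `actual`, recursively."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and json_contains(actual[key], value)
            for key, value in expected.items()
        )
    if isinstance(expected, list):
        # Each expected element must match at least one actual element.
        return isinstance(actual, list) and all(
            any(json_contains(item, wanted) for item in actual)
            for wanted in expected
        )
    # Scalars: plain equality (an assumption of this sketch).
    return actual == expected
```

Extra fields in the response are ignored, so a case only needs to list the fields it cares about.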
Tested against a local mock API: passing case (status + multi-field body +
persistence), status-mismatch failure, and persistence-miss failure all behave
correctly; env up/down orchestration smoke-tested. UI flow execution and gate
wiring are the next pieces; live end-to-end needs a real app.
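
The persistence check's `<cmd> --table <t> --where <json>` call shape can be illustrated with a small Python sketch. The function name and the example command path are hypothetical; only the two flags come from the description above.

```python
import json
import shlex
from typing import List


def build_verify_argv(verify_command: str, table: str, where: dict) -> List[str]:
    """Build the argument vector for a datastore verification command."""
    # Split the configured command like a shell would, then append the flags.
    # The where clause travels as a single JSON argument, so no extra quoting is needed.
    return shlex.split(verify_command) + [
        "--table", table,
        "--where", json.dumps(where, sort_keys=True),
    ]
```

Passing the result as an argument list (for example to `subprocess.run`) rather than as one shell string keeps the JSON intact even when it contains quotes or spaces.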
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>