---
description: "Orchestrate test failure analysis and coordinate parallel specialist test fixers with strategic analysis mode"
argument-hint: "[test_scope] [--run-first] [--coverage] [--fast] [--strategic] [--research] [--force-escalate] [--no-chain] [--api-only] [--database-only] [--vitest-only] [--pytest-only] [--playwright-only] [--only-category=<unit|integration|e2e|acceptance>]"
allowed-tools: ["Task", "TodoWrite", "Bash", "Grep", "Read", "LS", "Glob", "SlashCommand"]
---

# Test Orchestration Command (v2.0)

Execute this test orchestration procedure for: "$ARGUMENTS"

---

## ORCHESTRATOR GUARD RAILS

### PROHIBITED (NEVER do directly):

- Direct edits to test files
- Direct edits to source files
- pytest --fix or similar
- git add / git commit
- pip install / uv add
- Modifying test configuration

### ALLOWED (delegation only):

- Task(subagent_type="unit-test-fixer", ...)
- Task(subagent_type="api-test-fixer", ...)
- Task(subagent_type="database-test-fixer", ...)
- Task(subagent_type="e2e-test-fixer", ...)
- Task(subagent_type="type-error-fixer", ...)
- Task(subagent_type="import-error-fixer", ...)
- Read-only bash commands for analysis
- Grep/Glob/Read for investigation

**WHY:** Ensures expert handling by specialists, prevents conflicts, maintains audit trail.

---

## STEP 0: MODE DETECTION + AUTO-ESCALATION + DEPTH PROTECTION

### 0a. Depth Protection (prevent infinite loops)

```bash
echo "SLASH_DEPTH=${SLASH_DEPTH:-0}"
```

If SLASH_DEPTH >= 3:
- Report: "Maximum orchestration depth (3) reached. Exiting to prevent loop."
- EXIT immediately

Otherwise, set for any chained commands:
```bash
export SLASH_DEPTH=$((${SLASH_DEPTH:-0} + 1))
```

### 0b. Parse Strategic Flags

Check "$ARGUMENTS" for strategic triggers:
- `--strategic` = Force strategic mode
- `--research` = Research best practices only (no fixes)
- `--force-escalate` = Force strategic mode regardless of history

If ANY strategic flag present → Set STRATEGIC_MODE=true

### 0c. Auto-Escalation Detection

Check git history for recurring test fix attempts:
```bash
TEST_FIX_COUNT=$(git log --oneline -20 | grep -iE "fix.*(test|spec|jest|pytest|vitest)" | wc -l | tr -d ' ')
echo "TEST_FIX_COUNT=$TEST_FIX_COUNT"
```

If TEST_FIX_COUNT >= 3:
- Report: "Detected $TEST_FIX_COUNT test fix attempts in recent history. Auto-escalating to strategic mode."
- Set STRATEGIC_MODE=true

### 0d. Mode Decision

| Condition | Mode |
|-----------|------|
| --strategic OR --research OR --force-escalate | STRATEGIC |
| TEST_FIX_COUNT >= 3 | STRATEGIC (auto-escalated) |
| Otherwise | TACTICAL (default) |

Report the mode: "Operating in [TACTICAL/STRATEGIC] mode."
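
The mode decision can be expressed as a small bash sketch (illustrative only; it assumes `TEST_FIX_COUNT` from 0c and reads the flags from `$ARGUMENTS`):

```bash
# Sketch: combine 0b + 0c into a single mode decision
STRATEGIC_MODE=false
if [[ "$ARGUMENTS" == *"--strategic"* || "$ARGUMENTS" == *"--research"* || "$ARGUMENTS" == *"--force-escalate"* ]]; then
  STRATEGIC_MODE=true            # explicit flag forces strategic mode
elif [[ "${TEST_FIX_COUNT:-0}" -ge 3 ]]; then
  STRATEGIC_MODE=true            # auto-escalation from recurring fix attempts
fi
echo "Operating in $([ "$STRATEGIC_MODE" = true ] && echo STRATEGIC || echo TACTICAL) mode."
```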

---

## STEP 1: Parse Arguments

Check "$ARGUMENTS" for these flags:
- `--run-first` = Ignore cached results, run fresh tests
- `--pytest-only` = Focus on pytest (backend) only
- `--vitest-only` = Focus on Vitest (frontend) only
- `--playwright-only` = Focus on Playwright (E2E) only
- `--coverage` = Include coverage analysis
- `--fast` = Skip slow tests
- `--no-chain` = Disable chain invocation after fixes
- `--only-category=<category>` = Target specific test category for faster iteration

**Parse --only-category for targeted test execution:**
```bash
# Parse --only-category for finer control (pattern is unquoted so =~ treats it as a regex)
if [[ "$ARGUMENTS" =~ --only-category=([a-zA-Z]+) ]]; then
  TARGET_CATEGORY="${BASH_REMATCH[1]}"
  echo "🎯 Targeting only '$TARGET_CATEGORY' tests"
  # Used in STEP 4 to filter pytest: -k $TARGET_CATEGORY
fi
```

Valid categories: `unit`, `integration`, `e2e`, `acceptance`, `api`, `database`
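
For the boolean flags, a simple substring check is enough. A minimal sketch (variable names are illustrative; the flag spellings match the list above):

```bash
# Sketch: turn flags in $ARGUMENTS into shell variables for later steps
RUN_FIRST=false;     [[ "$ARGUMENTS" == *"--run-first"* ]] && RUN_FIRST=true
WITH_COVERAGE=false; [[ "$ARGUMENTS" == *"--coverage"* ]]  && WITH_COVERAGE=true
FAST_MODE=false;     [[ "$ARGUMENTS" == *"--fast"* ]]      && FAST_MODE=true
NO_CHAIN=false;      [[ "$ARGUMENTS" == *"--no-chain"* ]]  && NO_CHAIN=true
echo "RUN_FIRST=$RUN_FIRST COVERAGE=$WITH_COVERAGE FAST=$FAST_MODE NO_CHAIN=$NO_CHAIN"
```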

---

## STEP 2: Discover Cached Test Results

Run these commands ONE AT A TIME:

**2a. Project info:**
```bash
echo "Project: $(basename $PWD) | Branch: $(git branch --show-current) | Root: $PWD"
```

**2b. Check if pytest results exist:**
```bash
test -f "test-results/pytest/junit.xml" && echo "PYTEST_EXISTS=yes" || echo "PYTEST_EXISTS=no"
```

**2c. If pytest results exist, get stats:**
```bash
echo "PYTEST_AGE=$(($(date +%s) - $(stat -f %m test-results/pytest/junit.xml 2>/dev/null || stat -c %Y test-results/pytest/junit.xml 2>/dev/null)))s"
```
```bash
echo "PYTEST_TESTS=$(grep -o 'tests="[0-9]*"' test-results/pytest/junit.xml | head -1 | grep -o '[0-9]*')"
```
```bash
echo "PYTEST_FAILURES=$(grep -o 'failures="[0-9]*"' test-results/pytest/junit.xml | head -1 | grep -o '[0-9]*')"
```

**2d. Check Vitest results:**
```bash
test -f "test-results/vitest/results.json" && echo "VITEST_EXISTS=yes" || echo "VITEST_EXISTS=no"
```

**2e. Check Playwright results:**
```bash
test -f "test-results/playwright/results.json" && echo "PLAYWRIGHT_EXISTS=yes" || echo "PLAYWRIGHT_EXISTS=no"
```

---

## STEP 2.5: Test Framework Intelligence

Detect test framework configuration:

**2.5a. Pytest configuration:**
```bash
grep -A 20 "\[tool.pytest" pyproject.toml 2>/dev/null | head -25 || echo "No pytest config in pyproject.toml"
```

**2.5b. Available pytest markers:**
```bash
grep -rh "@pytest\.mark\." tests/ 2>/dev/null | sed 's/.*@pytest.mark.\([a-zA-Z_]*\).*/\1/' | sort -u | head -10
```

**2.5c. Check for slow tests:**
```bash
grep -rl "@pytest.mark.slow" tests/ --include="*.py" 2>/dev/null | wc -l | xargs echo "Slow tests:"
```

Save detected markers and configuration for agent context.

---

## STEP 2.6: Discover Project Context (SHARED CACHE - Token Efficient)

**Token Savings**: Using shared discovery cache saves ~14K tokens (2K per agent x 7 agents).

```bash
# 📊 SHARED DISCOVERY - Use cached context, refresh if stale (>15 min)
echo "=== Loading Shared Project Context ==="

# Source shared discovery helper (creates/uses cache)
if [[ -f "$HOME/.claude/scripts/shared-discovery.sh" ]]; then
  source "$HOME/.claude/scripts/shared-discovery.sh"
  discover_project_context

  # SHARED_CONTEXT now contains pre-built context for agents
  # Variables available: PROJECT_TYPE, VALIDATION_CMD, TEST_FRAMEWORK, RULES_SUMMARY
else
  # Fallback: inline discovery (less efficient)
  echo "⚠️ Shared discovery not found, using inline discovery"

  PROJECT_CONTEXT=""
  [ -f "CLAUDE.md" ] && PROJECT_CONTEXT="Read CLAUDE.md for project conventions. "
  [ -d ".claude/rules" ] && PROJECT_CONTEXT+="Check .claude/rules/ for patterns. "

  PROJECT_TYPE=""
  [ -f "pyproject.toml" ] && PROJECT_TYPE="python"
  [ -f "package.json" ] && PROJECT_TYPE="${PROJECT_TYPE:+$PROJECT_TYPE+}node"

  SHARED_CONTEXT="$PROJECT_CONTEXT"
fi

# Display cached context summary
echo "PROJECT_TYPE=$PROJECT_TYPE"
echo "VALIDATION_CMD=${VALIDATION_CMD:-pnpm prepush}"
echo "TEST_FRAMEWORK=${TEST_FRAMEWORK:-pytest}"
```

**CRITICAL**: Pass `$SHARED_CONTEXT` to ALL agent prompts instead of asking each agent to discover.
This prevents 7 agents from each running discovery independently.

---

## STEP 3: Decision Logic + Early Exit

Based on discovery, decide:

| Condition | Action |
|-----------|--------|
| `--run-first` flag present | Go to STEP 4 (run fresh tests) |
| PYTEST_EXISTS=yes AND AGE < 900s AND FAILURES > 0 | Go to STEP 5 (read results) |
| PYTEST_EXISTS=yes AND AGE < 900s AND FAILURES = 0 | **EARLY EXIT** (see below) |
| PYTEST_EXISTS=no OR AGE >= 900s | Go to STEP 4 (run fresh tests) |
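
The same decision logic as a hedged bash sketch (assumes `PYTEST_EXISTS`, `PYTEST_AGE`, and `PYTEST_FAILURES` from STEP 2 and `RUN_FIRST` from the STEP 1 sketch; the 900s threshold matches the 15-minute cache window):

```bash
# Sketch: route to the next step based on cache freshness and failures
AGE_SECONDS="${PYTEST_AGE%s}"   # strip trailing "s" from the STEP 2c output
if [[ "$RUN_FIRST" == true || "$PYTEST_EXISTS" == no || "${AGE_SECONDS:-9999}" -ge 900 ]]; then
  echo "NEXT=STEP_4_run_fresh_tests"
elif [[ "${PYTEST_FAILURES:-0}" -gt 0 ]]; then
  echo "NEXT=STEP_5_read_results"
else
  echo "NEXT=EARLY_EXIT"
fi
```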

### EARLY EXIT OPTIMIZATION (Token Savings: ~80%)

If ALL tests are passing from cached results:

```
✅ All tests passing (PYTEST_FAILURES=0, VITEST_FAILURES=0)
📊 No failures to fix. Skipping agent dispatch.
💰 Token savings: ~80K tokens (avoided 7 agent dispatches)

Output JSON summary:
{
  "status": "all_passing",
  "tests_run": $PYTEST_TESTS,
  "failures": 0,
  "agents_dispatched": 0,
  "action": "none_required"
}

→ Go to STEP 10 (chain invocation) or EXIT if --no-chain
```

**DO NOT:**
- Run discovery phase (STEP 2.6) if no failures
- Dispatch any agents
- Run strategic analysis
- Generate documentation

This avoids full pipeline when unnecessary.

---

## STEP 4: Run Fresh Tests (if needed)

**4a. Run pytest:**
```bash
mkdir -p test-results/pytest && cd apps/api && uv run pytest -v --tb=short --junitxml=../../test-results/pytest/junit.xml 2>&1 | tail -40
```

**4b. Run Vitest (if config exists):**
```bash
test -f "apps/web/vitest.config.ts" && mkdir -p test-results/vitest && cd apps/web && npx vitest run --reporter=json --outputFile=../../test-results/vitest/results.json 2>&1 | tail -25
```

**4c. Run Playwright (if config exists):**
```bash
# Only stdout (the JSON report) goes into results.json; stderr stays on the console so the file remains valid JSON
test -f "playwright.config.ts" && mkdir -p test-results/playwright && npx playwright test --reporter=json | tee test-results/playwright/results.json | tail -25
```

**4d. If --coverage flag present:**
```bash
mkdir -p test-results/pytest && cd apps/api && uv run pytest --cov=app --cov-report=xml:../../test-results/pytest/coverage.xml --cov-report=term-missing 2>&1 | tail -30
```

---

## STEP 5: Read Test Result Files

Use the Read tool:

**For pytest:** `Read(file_path="test-results/pytest/junit.xml")`
- Look for `<testcase>` with `<failure>` or `<error>` children
- Extract: test name, classname (file path), failure message, **full stack trace**

**For Vitest:** `Read(file_path="test-results/vitest/results.json")`
- Look for `"status": "failed"` entries
- Extract: test name, file path, failure messages

**For Playwright:** `Read(file_path="test-results/playwright/results.json")`
- Look for specs where `"ok": false`
- Extract: test title, browser, error message
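
If the junit.xml is large, a quick shell scan can list just the failing test cases before reading the full file (a rough sketch; exact output depends on how pytest formats the junit report):

```bash
# Sketch: list failing/erroring <testcase> names from the pytest junit report
grep -B 1 -E "<(failure|error)" test-results/pytest/junit.xml \
  | grep -o 'name="[^"]*"' \
  | head -20
```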

---

## STEP 5.5: ANALYSIS PHASE

### 5.5a. Test Isolation Analysis

Check for potential isolation issues:

```bash
echo "=== Shared State Detection ===" && grep -rn "global\|class.*:$" tests/ 2>/dev/null | grep -v "conftest\|__pycache__" | head -10
```

```bash
echo "=== Fixture Scope Analysis ===" && grep -rn "@pytest.fixture.*scope=" tests/ 2>/dev/null | head -10
```

```bash
echo "=== Order Dependency Markers ===" && grep -rn "pytest.mark.order\|pytest.mark.serial" tests/ 2>/dev/null | head -5
```

If isolation issues detected:
- Add to agent context: "WARNING: Potential test isolation issues detected"
- List affected files

### 5.5b. Flakiness Detection

Check for flaky test indicators:

```bash
echo "=== Timing Dependencies ===" && grep -rn "sleep\|time.sleep\|setTimeout" tests/ 2>/dev/null | grep -v "__pycache__" | head -5
```

```bash
echo "=== Async Race Conditions ===" && grep -rn "asyncio.gather\|Promise.all" tests/ 2>/dev/null | head -5
```

If flakiness indicators found:
- Add to agent context: "Known flaky patterns detected"
- Recommend: pytest-rerunfailures or vitest retry

### 5.5c. Coverage Analysis (if --coverage)

```bash
test -f "test-results/pytest/coverage.xml" && grep -o 'line-rate="[0-9.]*"' test-results/pytest/coverage.xml | head -1
```

Coverage gates:
- Below 60%: WARN "Critical: Coverage below 60%"
- 60-80%: INFO "Coverage could be improved"
- Above 80%: OK

---

---

## STEP 6: Enhanced Failure Categorization (Regex-Based)

Use regex pattern matching for precise categorization:

### Unit Test Patterns → unit-test-fixer
- `/AssertionError:.*expected.*got/` → Assertion mismatch
- `/Mock.*call_count.*expected/` → Mock verification failure
- `/fixture.*not found/` → Fixture missing
- Business logic failures

### API Test Patterns → api-test-fixer
- `/status.*(4\d\d|5\d\d)/` → HTTP error response
- `/validation.*failed|ValidationError/` → Schema validation
- `/timeout.*\d+\s*(s|ms)/` → Request timeout
- FastAPI/Flask/Django endpoint failures

### Database Test Patterns → database-test-fixer
- `/connection.*refused|ConnectionError/` → Connection failure
- `/relation.*does not exist|table.*not found/` → Schema mismatch
- `/deadlock.*detected/` → Concurrency issue
- `/IntegrityError|UniqueViolation/` → Constraint violation
- Fixture/mock database issues

### E2E Test Patterns → e2e-test-fixer
- `/locator.*timeout|element.*not found/` → Selector failure
- `/navigation.*failed|page.*crashed/` → Page load issue
- `/screenshot.*captured/` → Visual regression
- Playwright/Cypress failures

### Type Error Patterns → type-error-fixer
- `/TypeError:.*expected.*got/` → Type mismatch
- `/mypy.*error/` → Static type check failure
- `/TypeScript.*error TS/` → TS compilation error

### Import Error Patterns → import-error-fixer
- `/ModuleNotFoundError|ImportError/` → Missing module
- `/circular import/` → Circular dependency
- `/cannot import name/` → Named import failure
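
As a rough illustration, the same routing can be done with `grep -qiE` over a single failure message (a sketch only; the agent names mirror the headings above, and `FAILURE_TEXT` is assumed to hold one failure's message plus stack trace):

```bash
# Sketch: map one failure message to a specialist agent
categorize_failure() {
  local FAILURE_TEXT="$1"
  if   grep -qiE "ModuleNotFoundError|ImportError|cannot import name" <<<"$FAILURE_TEXT"; then echo "import-error-fixer"
  elif grep -qiE "TypeError:.*expected.*got|mypy.*error|error TS[0-9]+" <<<"$FAILURE_TEXT"; then echo "type-error-fixer"
  elif grep -qiE "connection.*refused|IntegrityError|UniqueViolation|deadlock" <<<"$FAILURE_TEXT"; then echo "database-test-fixer"
  elif grep -qiE "status.*(4[0-9][0-9]|5[0-9][0-9])|ValidationError" <<<"$FAILURE_TEXT"; then echo "api-test-fixer"
  elif grep -qiE "locator.*timeout|element.*not found|page.*crashed" <<<"$FAILURE_TEXT"; then echo "e2e-test-fixer"
  else echo "unit-test-fixer"   # assertions, mocks, fixtures, business logic
  fi
}
```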

---

## STEP 6.5: FAILURE PRIORITIZATION

Assign priority based on test type:

| Priority | Criteria | Detection |
|----------|----------|-----------|
| P0 Critical | Security/auth tests | `test_auth_*`, `test_security_*`, `test_permission_*` |
| P1 High | Core business logic | `test_*_service`, `test_*_handler`, most unit tests |
| P2 Medium | Integration tests | `test_*_integration`, API tests |
| P3 Low | Edge cases, performance | `test_*_edge_*`, `test_*_perf_*`, `test_*_slow` |

Pass priority information to agents:
- "Priority: P0 - Fix these FIRST (security critical)"
- "Priority: P1 - High importance (core logic)"

---

## STEP 7: STRATEGIC MODE (if triggered)

If STRATEGIC_MODE=true:

### 7a. Launch Test Strategy Analyst

```
Task(subagent_type="test-strategy-analyst",
     model="opus",
     description="Analyze recurring test failures",
     prompt="Analyze test failures in this project using Five Whys methodology.

Git history shows $TEST_FIX_COUNT recent test fix attempts.
Current failures: [FAILURE SUMMARY]

Research:
1. Best practices for the detected failure patterns
2. Common pitfalls in pytest/vitest testing
3. Root cause analysis for recurring issues

Provide strategic recommendations for systemic fixes.

MANDATORY OUTPUT FORMAT - Return ONLY JSON:
{
  \"root_causes\": [{\"issue\": \"...\", \"five_whys\": [...], \"recommendation\": \"...\"}],
  \"infrastructure_changes\": [\"...\"],
  \"prevention_mechanisms\": [\"...\"],
  \"priority\": \"P0|P1|P2\",
  \"summary\": \"Brief strategic overview\"
}
DO NOT include verbose analysis or full code examples.")
```

### 7b. After Strategy Analyst Completes

If fixes are recommended, proceed to STEP 8.

### 7c. Launch Documentation Generator (optional)

If significant insights were found:
```
Task(subagent_type="test-documentation-generator",
     model="haiku",
     description="Generate test knowledge documentation",
     prompt="Based on the strategic analysis results, generate:
1. Test failure runbook (docs/test-failure-runbook.md)
2. Test strategy summary (docs/test-strategy.md)
3. Pattern-specific knowledge (docs/test-knowledge/)

MANDATORY OUTPUT FORMAT - Return ONLY JSON:
{
  \"files_created\": [\"docs/test-failure-runbook.md\"],
  \"patterns_documented\": 3,
  \"summary\": \"Created runbook with 5 failure patterns\"
}
DO NOT include file contents in response.")
```

---

## STEP 7.5: Conflict Detection for Parallel Agents

Before launching agents, detect overlapping file scopes to prevent conflicts:

**SAFE TO PARALLELIZE (different test domains):**
- unit-test-fixer + e2e-test-fixer → ✅ Different test directories
- api-test-fixer + database-test-fixer → ✅ Different concerns
- vitest tests + pytest tests → ✅ Different frameworks

**MUST SERIALIZE (overlapping files):**
- unit-test-fixer + import-error-fixer → ⚠️ Both may modify conftest.py → SEQUENTIAL
- type-error-fixer + any test fixer → ⚠️ Type fixes affect test expectations → RUN FIRST
- Multiple fixers for same test file → ⚠️ RUN SEQUENTIALLY

**Execution Phases:**
```
PHASE 1 (First): type-error-fixer, import-error-fixer
└── These fix foundational issues that other agents depend on

PHASE 2 (Parallel): unit-test-fixer, api-test-fixer, database-test-fixer
└── These target different test categories, safe to run together

PHASE 3 (Last): e2e-test-fixer
└── E2E depends on backend fixes being complete

PHASE 4 (Validation): Run full test suite to verify all fixes
```

**Conflict Detection Algorithm:**
```bash
# Check if multiple agents target same file patterns
# If conftest.py in scope of multiple agents → serialize them
# If same test file reported → assign to single agent only
```
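
A hedged sketch of that check, assuming each agent's assigned test files have been collected into per-agent list files (the `agent_files_*.txt` names are illustrative, not produced by any earlier step):

```bash
# Sketch: serialize agents whose file scopes overlap
# Each agent_files_<agent>.txt holds one file path per line (assumed prepared while categorizing)
overlap=$(sort agent_files_*.txt 2>/dev/null | uniq -d)
if [[ -n "$overlap" ]]; then
  echo "⚠️ Overlapping files detected - run the affected agents sequentially:"
  echo "$overlap"
else
  echo "✅ No overlapping file scopes - agents can run in parallel"
fi
```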

---

## STEP 8: PARALLEL AGENT DISPATCH

### CRITICAL: Launch ALL agents in ONE response with multiple Task calls.

### ENHANCED AGENT CONTEXT TEMPLATE

For each agent, provide this comprehensive context:

```
Test Specialist Task: [Agent Type] - Test Failure Fix

## Context
- Project: [detected from git remote]
- Branch: [from git branch --show-current]
- Framework: pytest [version] / vitest [version]
- Python/Node version: [detected]

## Project Patterns (DISCOVER DYNAMICALLY - Do This First!)
**CRITICAL - Project Context Discovery:**
Before making any fixes, you MUST:
1. Read CLAUDE.md at project root (if exists) for project conventions
2. Check .claude/rules/ directory for domain-specific rule files:
   - If editing Python test files → read python*.md rules
   - If editing TypeScript tests → read typescript*.md rules
   - If graphiti/temporal patterns exist → read graphiti.md rules
3. Detect test patterns from config files (pytest.ini, vitest.config.ts)
4. Apply discovered patterns to ALL your fixes

This ensures fixes follow project conventions, not generic patterns.

[Include SHARED_CONTEXT from STEP 2.6 here]

## Recent Test Changes
[git diff HEAD~3 --name-only | grep -E "(test|spec)\.(py|ts|tsx)$"]

## Failures to Fix
[FAILURE LIST with full stack traces]

## Test Isolation Status
[From STEP 5.5a - any warnings]

## Flakiness Report
[From STEP 5.5b - any detected patterns]

## Priority
[From STEP 6.5 - P0/P1/P2/P3 with reasoning]

## Framework Configuration
[From STEP 2.5 - markers, config]

## Constraints
- Follow project's test method length limits (check CLAUDE.md or file-size-guidelines.md)
- Pre-flight: Verify baseline tests pass
- Post-flight: Ensure no broken existing tests
- Cannot modify implementation code (test expectations only unless bug found)
- Apply project-specific patterns discovered from CLAUDE.md/.claude/rules/

## Expected Output
- Summary of fixes made
- Files modified with line numbers
- Verification commands run
- Remaining issues (if any)
```

### Dispatch Example (with Model Strategy + JSON Output)

```
Task(subagent_type="unit-test-fixer",
     model="sonnet",
     description="Fix unit test failures (P1)",
     prompt="[FULL ENHANCED CONTEXT TEMPLATE]

MANDATORY OUTPUT FORMAT - Return ONLY JSON:
{
  \"status\": \"fixed|partial|failed\",
  \"tests_fixed\": N,
  \"files_modified\": [\"path/to/file.py\"],
  \"remaining_failures\": N,
  \"summary\": \"Brief description of fixes\"
}
DO NOT include full file content or verbose logs.")

Task(subagent_type="api-test-fixer",
     model="sonnet",
     description="Fix API test failures (P2)",
     prompt="[FULL ENHANCED CONTEXT TEMPLATE]

MANDATORY OUTPUT FORMAT - Return ONLY JSON:
{...same format...}
DO NOT include full file content or verbose logs.")

Task(subagent_type="import-error-fixer",
     model="haiku",
     description="Fix import errors (P1)",
     prompt="[CONTEXT]

MANDATORY OUTPUT FORMAT - Return ONLY JSON:
{...same format...}")
```

### Model Strategy

| Agent Type | Model | Rationale |
|------------|-------|-----------|
| test-strategy-analyst | opus | Complex research + Five Whys |
| unit/api/database/e2e-test-fixer | sonnet | Balanced speed + quality |
| type-error-fixer | sonnet | Type inference complexity |
| import-error-fixer | haiku | Simple pattern matching |
| linting-fixer | haiku | Rule-based fixes |
| test-documentation-generator | haiku | Template-based docs |

---

## STEP 9: Validate Fixes

After agents complete:

```bash
cd apps/api && uv run pytest -v --tb=short --junitxml=../../test-results/pytest/junit.xml 2>&1 | tail -40
```

Check results:
- If ALL tests pass → Go to STEP 10
- If SOME tests still fail → Report remaining failures, suggest --strategic
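
To make the check explicit, the failure count can be pulled from the fresh junit report (same extraction as STEP 2c; a sketch, not a required command):

```bash
# Sketch: compare post-fix failures against zero
POST_FAILURES=$(grep -o 'failures="[0-9]*"' test-results/pytest/junit.xml | head -1 | grep -o '[0-9]*')
if [[ "${POST_FAILURES:-0}" -eq 0 ]]; then
  echo "✅ All pytest failures resolved"
else
  echo "❌ $POST_FAILURES failures remain - consider re-running with --strategic"
fi
```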

---

## STEP 10: INTELLIGENT CHAIN INVOCATION

### 10a. Check Depth

If SLASH_DEPTH >= 3:
- Report: "Maximum depth reached, skipping chain invocation"
- Go to STEP 11

### 10b. Check --no-chain Flag

If --no-chain present:
- Report: "Chain invocation disabled by flag"
- Go to STEP 11

### 10c. Determine Chain Action

**If ALL tests passing AND changes were made:**
```
SlashCommand(skill="/commit_orchestrate",
             args="--message 'fix(tests): resolve test failures'")
```

**If ALL tests passing AND NO changes made:**
- Report: "All tests passing, no changes needed"
- Go to STEP 11

**If SOME tests still failing:**
- Report remaining failure count
- If TACTICAL mode: Suggest "Run with --strategic for root cause analysis"
- Go to STEP 11
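
Whether "changes were made" can be determined from the working tree (a sketch; it simply checks for any modified or untracked files):

```bash
# Sketch: decide whether the specialists actually modified anything
if [[ -n "$(git status --porcelain)" ]]; then
  echo "CHANGES_MADE=yes"   # proceed to the /commit_orchestrate chain
else
  echo "CHANGES_MADE=no"    # report "no changes needed" and go to STEP 11
fi
```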

---

## STEP 11: Report Summary

Report:
- Mode: TACTICAL or STRATEGIC
- Initial failure count by type
- Agents dispatched with priorities
- Strategic insights (if applicable)
- Current pass/fail status
- Coverage status (if --coverage)
- Chain invocation result
- Remaining issues and recommendations

---

## Quick Reference

| Command | Effect |
|---------|--------|
| `/test_orchestrate` | Use cached results if fresh (<15 min) |
| `/test_orchestrate --run-first` | Run tests fresh, ignore cache |
| `/test_orchestrate --pytest-only` | Only pytest failures |
| `/test_orchestrate --strategic` | Force strategic mode (research + analysis) |
| `/test_orchestrate --coverage` | Include coverage analysis |
| `/test_orchestrate --no-chain` | Don't auto-invoke /commit_orchestrate |

## VS Code Integration

pytest.ini must have: `addopts = --junitxml=test-results/pytest/junit.xml`
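
For example, a minimal pytest.ini along those lines (illustrative; merge into an existing `[pytest]` section, or into `[tool.pytest.ini_options]` in pyproject.toml if that is where the project keeps its config):

```ini
[pytest]
addopts = --junitxml=test-results/pytest/junit.xml
```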

Then: Run tests in VS Code -> `/test_orchestrate` reads cached results -> Fixes applied

---

## Agent Quick Reference

| Failure Pattern | Agent | Model | JSON Output |
|-----------------|-------|-------|-------------|
| Assertions, mocks, fixtures | unit-test-fixer | sonnet | Required |
| HTTP, API contracts, endpoints | api-test-fixer | sonnet | Required |
| Database, SQL, connections | database-test-fixer | sonnet | Required |
| Selectors, timeouts, E2E | e2e-test-fixer | sonnet | Required |
| Type annotations, mypy | type-error-fixer | sonnet | Required |
| Imports, modules, paths | import-error-fixer | haiku | Required |
| Strategic analysis | test-strategy-analyst | opus | Required |
| Documentation | test-documentation-generator | haiku | Required |

## Token Efficiency: JSON Output Format

**ALL agents MUST return distilled JSON summaries only.**

```json
{
  "status": "fixed|partial|failed",
  "tests_fixed": 3,
  "files_modified": ["tests/test_auth.py", "tests/conftest.py"],
  "remaining_failures": 0,
  "summary": "Fixed mock configuration and assertion order"
}
```

**DO NOT return:**
- Full file contents
- Verbose explanations
- Step-by-step execution logs

This reduces token usage by 80-90% per agent response.

---

EXECUTE NOW. Start with STEP 0a (depth check).