1037 lines
32 KiB
Markdown
1037 lines
32 KiB
Markdown
# <!-- Powered by BMAD™ Core -->
|
|
|
|
# Core Development Principles
|
|
|
|
Shared philosophy and principles for all JavaScript/TypeScript development agents. These principles guide decision-making across the entire stack.
|
|
|
|
## Universal Principles
|
|
|
|
**Type Safety First**: Catch errors at compile time, not runtime. Use TypeScript strict mode, explicit types, and leverage inference.
|
|
|
|
**Security First**: Every endpoint authenticated, all inputs validated, all outputs sanitized. Never trust client input.
|
|
|
|
**Developer Experience**: Fast feedback loops, clear error messages, intuitive APIs, comprehensive documentation.
|
|
|
|
**Performance Matters**: Optimize for perceived performance first. Measure before optimizing. Cache strategically.
|
|
|
|
**Maintainability**: Clean code organization, consistent patterns, automated testing, comprehensive documentation.
|
|
|
|
**YAGNI (You Aren't Gonna Need It)**: Don't over-engineer for future needs. Build for 2x scale, not 100x. Prefer boring, proven technology.
|
|
|
|
**Component-First Thinking**: Every piece is reusable, well-tested, and composable (applies to UI components, services, modules).
|
|
|
|
**Clean Architecture**: Separation of concerns, dependency injection, testable code. Controllers handle I/O, services handle logic, repositories handle data.
|
|
|
|
**Observability**: Logging, monitoring, error tracking. You can't improve what you don't measure.
|
|
|
|
## Context Efficiency Philosophy
|
|
|
|
All agents optimize token usage through **high-signal communication**:
|
|
|
|
**Reference, Don't Repeat**: Point to file paths instead of duplicating content
|
|
- ✅ "Implementation in `src/services/auth.service.ts`"
|
|
- ❌ [Pasting entire file contents]
|
|
|
|
**Provide Summaries**: After creating artifacts, give 2-3 sentence overview with file reference
|
|
- ✅ "Created authentication service with JWT + refresh tokens. See `src/services/auth.service.ts` for implementation."
|
|
- ❌ [Explaining every line of code written]
|
|
|
|
**Progressive Detail**: Start with high-level structure, add details only when implementing
|
|
- ✅ Start: "Need authentication with JWT" → Later: "bcrypt rounds, token expiry settings"
|
|
- ❌ Upfront: Every security configuration detail before deciding approach
|
|
|
|
**Archive Verbose Content**: Keep implementations/discussions in files, reference them
|
|
- ✅ Long discussions → `docs/decisions/auth-strategy.md` → Reference in checkpoint
|
|
- ❌ Repeating entire discussion in every message
|
|
|
|
**Checkpoint Summaries**: At phase transitions, compress context into max 100-line summaries containing:
|
|
- Final decisions and rationale
|
|
- Artifact file paths
|
|
- Critical constraints and dependencies
|
|
- What to do next
|
|
|
|
## Just-in-Time Context Retrieval
|
|
|
|
**CRITICAL**: Agents must retrieve context **only when making specific decisions**, not upfront. This prevents token waste and maintains focus.
|
|
|
|
### Start Minimal (Every Task)
|
|
|
|
**Load at Task Start**:
|
|
- ✅ Your role definition and core principles (this file)
|
|
- ✅ The specific task description and requirements
|
|
- ✅ Any explicitly referenced artifacts (checkpoint files, requirements docs)
|
|
- ❌ NO technology-specific guides yet
|
|
- ❌ NO implementation patterns yet
|
|
- ❌ NO best practices yet
|
|
|
|
**Example**: React Developer receives task "Build user profile component"
|
|
- Load: Role definition, task description, requirements
|
|
- Skip: state-management-guide.md, react-patterns.md, testing-strategy.md (load later when needed)
|
|
|
|
### Load on Decision Points (Not Before)
|
|
|
|
**Decision-Triggered Loading**:
|
|
|
|
1. **Technology Selection** → Load technology-stack-guide.md
|
|
- ONLY when choosing framework/library
|
|
- NOT if stack already decided
|
|
|
|
2. **Implementation Pattern** → Load pattern-specific guide
|
|
- ONLY when implementing that specific pattern
|
|
- Example: Loading GraphQL patterns ONLY if building GraphQL API
|
|
|
|
3. **Security Concern** → Load security-guidelines.md
|
|
- ONLY when handling auth, sensitive data, or user input
|
|
- NOT for every task
|
|
|
|
4. **Performance Issue** → Load performance optimization guides
|
|
- ONLY when performance requirements specified
|
|
- NOT preemptively
|
|
|
|
5. **Deployment Decision** → Load deployment-strategies.md
|
|
- ONLY when choosing deployment approach
|
|
- NOT during feature development
|
|
|
|
### Question Before Loading
|
|
|
|
**Before loading any data file, ask**:
|
|
- Is this decision being made RIGHT NOW?
|
|
- Can I defer this decision to implementation?
|
|
- Is this information in the checkpoint/requirements already?
|
|
|
|
**Examples**:
|
|
|
|
❌ **Over-Loading**: "Building auth system" → Loading security-guidelines.md, api-implementation-patterns.md, deployment-strategies.md, database-optimization.md
|
|
- WHY BAD: Most won't be needed yet
|
|
|
|
✅ **JIT Loading**: "Building auth system" → Start with requirements → Load security-guidelines.md when designing auth flow → Load api-implementation-patterns.md when implementing endpoints → Skip deployment/optimization (not needed yet)
|
|
|
|
❌ **Over-Loading**: "Design architecture" → Loading ALL data files (technology-stack-guide.md, deployment-strategies.md, security-guidelines.md, best-practices.md, etc.)
|
|
- WHY BAD: Won't use 80% of it in architecture phase
|
|
|
|
✅ **JIT Loading**: "Design architecture" → Load technology-stack-guide.md for stack decision → Load deployment-strategies.md for hosting decision → Load security-guidelines.md IF handling sensitive data → Skip implementation patterns (for later)
|
|
|
|
### Context Loading Decision Tree
|
|
|
|
```
|
|
Task Received
|
|
├─ Load: Role + Task + Requirements (ALWAYS)
|
|
├─ Checkpoint exists? → Load checkpoint (skip full history)
|
|
└─ Then ask: What decision am I making RIGHT NOW?
|
|
│
|
|
├─ Choosing Stack/Framework?
|
|
│ └─ Load: technology-stack-guide.md
|
|
│
|
|
├─ Implementing API endpoints?
|
|
│ ├─ What type? REST/GraphQL/tRPC?
|
|
│ └─ Load: api-implementation-patterns.md (specific section only)
|
|
│
|
|
├─ Handling Authentication/Authorization?
|
|
│ └─ Load: security-guidelines.md
|
|
│
|
|
├─ Database schema design?
|
|
│ └─ Load: database-design-patterns.md
|
|
│
|
|
├─ Deployment/Hosting decision?
|
|
│ └─ Load: deployment-strategies.md
|
|
│
|
|
├─ Performance optimization?
|
|
│ └─ Load: performance-optimization.md
|
|
│
|
|
├─ Code review/quality check?
|
|
│ └─ Load: development-guidelines.md + relevant checklist
|
|
│
|
|
└─ General implementation?
|
|
└─ Load NOTHING extra - use role knowledge
|
|
(Load specific guides only if encountering decision)
|
|
```
|
|
|
|
### Role-Specific Context Limits
|
|
|
|
**React Developer**:
|
|
- Default load: core-principles.md only
|
|
- Load state-management-guide.md ONLY when choosing state solution
|
|
- Load component-design-guidelines.md ONLY when building complex component
|
|
- SKIP: Backend patterns, API specs (unless integrating), database docs
|
|
|
|
**Node Backend Developer**:
|
|
- Default load: core-principles.md only
|
|
- Load framework comparison ONLY when choosing Express/Fastify/NestJS
|
|
- Load database patterns ONLY when writing queries
|
|
- SKIP: React patterns, frontend state management, CSS guides
|
|
|
|
**Solution Architect**:
|
|
- Default load: core-principles.md + requirements
|
|
- Load technology-stack-guide.md for stack decisions
|
|
- Load deployment-strategies.md for hosting decisions
|
|
- Load security-guidelines.md IF requirements mention sensitive data
|
|
- SKIP: Implementation patterns (delegate to implementation agents)
|
|
|
|
**API Developer**:
|
|
- Default load: core-principles.md only
|
|
- Load api-implementation-patterns.md for specific API type (REST/GraphQL/tRPC section only)
|
|
- SKIP: Frontend patterns, deployment strategies, database optimization
|
|
|
|
**TypeScript Expert**:
|
|
- Default load: core-principles.md only
|
|
- Load tsconfig reference ONLY when configuring TypeScript
|
|
- Load advanced patterns ONLY when implementing complex types
|
|
- SKIP: Framework-specific guides, deployment, testing (unless TS-specific)
|
|
|
|
### Progressive Context Expansion
|
|
|
|
**3-Stage Loading Pattern**:
|
|
|
|
**Stage 1 - Task Start (Minimal)**:
|
|
- Role definition
|
|
- Task requirements
|
|
- Referenced artifacts only
|
|
|
|
**Stage 2 - Decision Point (Targeted)**:
|
|
- Load specific guide for current decision
|
|
- Load ONLY relevant section (not entire file)
|
|
- Example: Load "REST API" section from api-implementation-patterns.md, skip GraphQL/tRPC sections
|
|
|
|
**Stage 3 - Implementation (On-Demand)**:
|
|
- Load patterns as you encounter need
|
|
- Load examples when unclear
|
|
- Load checklists when validating
|
|
|
|
### Anti-Patterns (What NOT to Do)
|
|
|
|
❌ **"Load Everything Just in Case"**
|
|
- Loading all data files at task start
|
|
- "Might need it later" is NOT a valid reason
|
|
|
|
❌ **"Load Full File for One Detail"**
|
|
- Load specific section only
|
|
- Reference file path for full details
|
|
|
|
❌ **"Load Before Decision Needed"**
|
|
- Don't load deployment guide during feature design
|
|
- Don't load implementation patterns during architecture phase
|
|
|
|
❌ **"Load Other Agents' Context"**
|
|
- Backend developer doesn't need React state management
|
|
- Frontend developer doesn't need database optimization
|
|
|
|
### Validation Questions
|
|
|
|
**Before loading any file, ask yourself**:
|
|
1. Am I making THIS specific decision RIGHT NOW?
|
|
2. Is this my role's responsibility?
|
|
3. Can this wait until implementation?
|
|
4. Is this already in checkpoint/requirements?
|
|
|
|
**If answer to #1 or #2 is NO → Don't load it**
|
|
|
|
### Token Budget Awareness
|
|
|
|
**Typical Token Costs**:
|
|
- Role definition: ~200-300 tokens
|
|
- Task description: ~100-500 tokens
|
|
- Data file (full): ~1,000-3,000 tokens
|
|
- Data file (section): ~200-500 tokens
|
|
- Checkpoint: ~500-1,000 tokens
|
|
|
|
**Target Budget**:
|
|
- Task start: <1,000 tokens total
|
|
- Per decision: +200-500 tokens (specific guide section)
|
|
- Full workflow: <10,000 tokens cumulative (use checkpoints)
|
|
|
|
**If approaching budget limit**: Create checkpoint, archive verbose context, start fresh phase.
|
|
|
|
## Runtime Token Monitoring
|
|
|
|
**CRITICAL**: Agents must actively monitor token usage during execution and self-regulate to stay within budgets.
|
|
|
|
### Token Self-Assessment Protocol
|
|
|
|
**At Task Start**:
|
|
```
|
|
Task: [task name]
|
|
Estimated Budget: [X tokens]
|
|
|
|
Loaded:
|
|
- Role definition: ~250 tokens
|
|
- core-principles.md: ~300 tokens
|
|
- Task requirements: ~[Y] tokens
|
|
- [other files]: ~[Z] tokens
|
|
TOTAL LOADED: ~[sum] tokens
|
|
|
|
Remaining Budget: [budget - sum] tokens
|
|
```
|
|
|
|
**During Execution** (every 2-3 decisions):
|
|
```
|
|
Token Check:
|
|
- Starting context: [X] tokens
|
|
- Loaded since last check: [Y] tokens
|
|
- Current estimated total: [X+Y] tokens
|
|
- Budget: [Z] tokens
|
|
- Status: [OK | WARNING | EXCEEDED]
|
|
```
|
|
|
|
**Before Loading Any File**:
|
|
```
|
|
Pre-Load Check:
|
|
- Current context: ~[X] tokens
|
|
- File to load: ~[Y] tokens (estimated)
|
|
- After load: ~[X+Y] tokens
|
|
- Budget: [Z] tokens
|
|
- Decision: [PROCEED | SKIP | CHECKPOINT FIRST]
|
|
```
|
|
|
|
### Checkpoint Triggers
|
|
|
|
**MUST create checkpoint when**:
|
|
- ✅ Context exceeds 3,000 tokens
|
|
- ✅ Completing a major phase (architecture → implementation)
|
|
- ✅ Before loading would exceed budget
|
|
- ✅ After 5+ agent interactions in sequence
|
|
- ✅ Detailed discussion exceeds 1,000 tokens
|
|
|
|
**Checkpoint Process**:
|
|
1. Estimate current context size
|
|
2. Create checkpoint summary (max 100 lines, ~500 tokens)
|
|
3. Archive verbose content to `docs/archive/`
|
|
4. Start next phase with checkpoint only
|
|
5. Token reduction: Expect 80%+ compression
|
|
|
|
**Example**:
|
|
```
|
|
Phase complete. Estimating context:
|
|
- Architecture discussion: ~2,500 tokens
|
|
- Technology decisions: ~1,200 tokens
|
|
- Requirements: ~800 tokens
|
|
TOTAL: ~4,500 tokens
|
|
|
|
Action: Creating checkpoint
|
|
- Checkpoint summary: ~500 tokens
|
|
- Token reduction: 89%
|
|
- Archived: Full discussion to docs/archive/architecture-phase/
|
|
```
|
|
|
|
### Budget Warning System
|
|
|
|
**Token Status Levels**:
|
|
|
|
🟢 **GREEN (0-50% of budget)**:
|
|
- Status: Healthy
|
|
- Action: Continue normally
|
|
- Example: 800/2000 tokens used
|
|
|
|
🟡 **YELLOW (50-75% of budget)**:
|
|
- Status: Warning
|
|
- Action: Be selective with additional loads
|
|
- Example: 1,500/2000 tokens used
|
|
- Strategy: Load only critical guides, skip nice-to-haves
|
|
|
|
🟠 **ORANGE (75-90% of budget)**:
|
|
- Status: Critical
|
|
- Action: Stop loading new context
|
|
- Example: 1,700/2000 tokens used
|
|
- Strategy: Use existing context only, prepare checkpoint
|
|
|
|
🔴 **RED (>90% of budget)**:
|
|
- Status: Exceeded
|
|
- Action: Create checkpoint IMMEDIATELY
|
|
- Example: 1,850/2000 tokens used
|
|
- Strategy: Checkpoint NOW, start fresh phase
|
|
|
|
### Self-Monitoring Format
|
|
|
|
**Maintain Mental Token Accounting**:
|
|
|
|
Throughout task execution, mentally track:
|
|
|
|
```
|
|
=== TOKEN ACCOUNTING ===
|
|
|
|
FIXED CONTEXT (loaded at start):
|
|
- Role definition: ~250
|
|
- Core principles: ~300
|
|
- Task requirements: ~500
|
|
Subtotal: ~1,050 tokens
|
|
|
|
DECISION CONTEXT (loaded during execution):
|
|
- technology-stack-guide.md: ~1,500
|
|
- security-guidelines.md: ~1,000
|
|
Subtotal: ~2,500 tokens
|
|
|
|
CONVERSATION (generated during work):
|
|
- Discussion and analysis: ~800
|
|
- Code examples: ~400
|
|
Subtotal: ~1,200 tokens
|
|
|
|
TOTAL ESTIMATED: ~4,750 tokens
|
|
BUDGET: 5,000 tokens
|
|
STATUS: 🟠 ORANGE - Approaching limit
|
|
ACTION: No more loads, complete task with current context
|
|
|
|
=== END ACCOUNTING ===
|
|
```
|
|
|
|
### Estimation Guidelines
|
|
|
|
**Token Estimation Rules**:
|
|
|
|
- **1 word ≈ 1.3 tokens** (on average)
|
|
- **1 line of code ≈ 15-25 tokens**
|
|
- **1 markdown paragraph ≈ 50-100 tokens**
|
|
- **Agent role file ≈ 200-400 tokens**
|
|
- **Data file (full) ≈ 1,000-3,000 tokens**
|
|
- **Data file (section) ≈ 200-500 tokens**
|
|
- **Checkpoint summary ≈ 500-800 tokens**
|
|
- **Architecture doc ≈ 2,000-4,000 tokens**
|
|
|
|
**Quick Estimation**:
|
|
- Count words in content
|
|
- Multiply by 1.3
|
|
- Round up for safety
|
|
|
|
**File Size Indicators**:
|
|
- Small file (<100 lines): ~500 tokens
|
|
- Medium file (100-300 lines): ~1,000-1,500 tokens
|
|
- Large file (300-500 lines): ~2,000-3,000 tokens
|
|
- Very large file (>500 lines): ~3,000+ tokens
|
|
|
|
### Runtime Optimization Tactics
|
|
|
|
**When Approaching Budget**:
|
|
|
|
1. **Use Section Loading**: Load only relevant section of large file
|
|
- Instead of: Full api-implementation-patterns.md (~1,500 tokens)
|
|
- Load: REST section only (~300 tokens)
|
|
- Savings: 80%
|
|
|
|
2. **Reference Don't Load**: Point to file path instead of loading
|
|
- Instead of: Loading best-practices.md
|
|
- Say: "Following patterns in best-practices.md section 3.2"
|
|
- Savings: 100% (no load needed)
|
|
|
|
3. **Checkpoint Intermediate State**: Don't wait for phase end
|
|
- If discussion getting long, checkpoint NOW
|
|
- Compress verbose analysis into key decisions
|
|
- Continue with lightweight checkpoint
|
|
|
|
4. **Defer Non-Critical Loads**: Skip nice-to-have context
|
|
- If 🟡 YELLOW or above, load ONLY must-haves
|
|
- Skip optimization guides if not performance-critical
|
|
- Skip best practices if pattern already clear
|
|
|
|
5. **Use Role Knowledge**: Lean on built-in expertise
|
|
- Instead of: Loading guide for basic patterns
|
|
- Use: Agent's inherent role knowledge
|
|
- When: Standard, well-known patterns
|
|
|
|
### Multi-Agent Workflow Monitoring
|
|
|
|
**Workflow-Level Tracking**:
|
|
|
|
When multiple agents work sequentially:
|
|
|
|
```
|
|
WORKFLOW TOKEN TRACKING
|
|
|
|
Phase 1: Requirements (Analyst)
|
|
- Context: ~800 tokens
|
|
- Output: requirements.md
|
|
- Checkpoint: ~600 tokens
|
|
- Handoff: Pass checkpoint only
|
|
|
|
Phase 2: Architecture (Solution Architect)
|
|
- Receive: checkpoint (~600)
|
|
- Load: tech-stack-guide (~1,500)
|
|
- Context: ~2,800 tokens
|
|
- Output: architecture.md
|
|
- Checkpoint: ~800 tokens
|
|
- Handoff: Pass checkpoint only
|
|
|
|
Phase 3: Implementation (Developers)
|
|
- Receive: checkpoint (~800)
|
|
- Load: role-specific guides (~500-1,000)
|
|
- Context: ~1,500-2,000 tokens
|
|
- No checkpoint needed (final phase)
|
|
|
|
TOTAL WORKFLOW: ~5,100 tokens (with checkpoints)
|
|
WITHOUT CHECKPOINTS: ~15,000+ tokens (accumulated context)
|
|
SAVINGS: 66%
|
|
```
|
|
|
|
### Checkpoint Quality Metrics
|
|
|
|
**Effective Checkpoint Indicators**:
|
|
- ✅ Reduces context by 80%+ (e.g., 4,000 → 800 tokens)
|
|
- ✅ Contains all critical decisions
|
|
- ✅ Includes all artifact paths
|
|
- ✅ Next agent can proceed without loading previous phase
|
|
- ✅ Max 100 lines (~500-800 tokens)
|
|
|
|
**Poor Checkpoint Indicators**:
|
|
- ❌ Only 30-50% reduction (too verbose)
|
|
- ❌ Missing key decisions or rationale
|
|
- ❌ Next agent needs to reload original context
|
|
- ❌ Over 150 lines (defeats purpose)
|
|
|
|
### Self-Regulation Commitment
|
|
|
|
**Every agent commits to**:
|
|
|
|
1. **Estimate before loading**: Know token cost before loading any file
|
|
2. **Track cumulative usage**: Maintain mental accounting of context size
|
|
3. **Respect budget warnings**: React to 🟡🟠🔴 status appropriately
|
|
4. **Create checkpoints proactively**: Don't wait for red status
|
|
5. **Load sections not files**: When possible, load only relevant sections
|
|
6. **Use role knowledge first**: Load guides only when truly needed
|
|
|
|
### Example: Good Runtime Monitoring
|
|
|
|
```
|
|
Task: Design REST API for user service
|
|
Budget: 2,000 tokens
|
|
|
|
[Start]
|
|
Loaded: Role (250) + core-principles (300) + requirements (400) = 950 tokens
|
|
Status: 🟢 GREEN (48% of budget)
|
|
|
|
[Decision: Need REST patterns]
|
|
Pre-check: Current 950 + api-patterns REST section (300) = 1,250
|
|
Status: 🟢 GREEN (63% of budget) → PROCEED
|
|
|
|
Loaded: REST section
|
|
Context: 1,250 tokens
|
|
|
|
[Decision: Need auth patterns?]
|
|
Pre-check: Current 1,250 + security-guidelines (1,000) = 2,250
|
|
Status: 🔴 RED (113% of budget) → DEFER
|
|
Decision: Use basic auth pattern from role knowledge, skip loading guide
|
|
|
|
[Complete]
|
|
Final context: ~1,400 tokens (with output)
|
|
Status: 🟢 GREEN (70% of budget)
|
|
Result: Task complete, stayed within budget
|
|
```
|
|
|
|
### Example: Bad Runtime Monitoring (Don't Do This)
|
|
|
|
```
|
|
Task: Design REST API for user service
|
|
Budget: 2,000 tokens
|
|
|
|
[Start]
|
|
Loaded: Role + core-principles + requirements + api-patterns (full) +
|
|
security-guidelines + best-practices + deployment-strategies
|
|
Context: ~5,500 tokens
|
|
Status: 🔴 RED (275% of budget) - EXCEEDED
|
|
|
|
Problem: Loaded everything upfront without checking budget
|
|
Result: Token waste, exceeded budget, needed checkpoint immediately
|
|
```
|
|
|
|
## Prompt Caching Optimization
|
|
|
|
**POWERFUL**: Prompt caching can reduce token costs by ~90% for static content and improve response times by caching reusable context.
|
|
|
|
### How Prompt Caching Works
|
|
|
|
**Concept**: AI providers cache static portions of prompts across multiple requests, dramatically reducing token costs and latency.
|
|
|
|
**Benefits**:
|
|
- **Token Savings**: ~90% reduction for cached content (5,000 tokens → 500 tokens charged)
|
|
- **Speed**: Faster response times (cached content pre-processed)
|
|
- **Cost**: Significant cost reduction for repeated context
|
|
|
|
**Requirements**:
|
|
- Content must be **static** (doesn't change between requests)
|
|
- Content must be **large enough** (typically >1,000 tokens) to benefit from caching
|
|
- Content must be in the **prefix** (beginning of prompt, before dynamic content)
|
|
|
|
### Cache-Friendly Content Structure
|
|
|
|
**Optimal Loading Order** (Stable → Dynamic):
|
|
|
|
```
|
|
1. CACHEABLE (Static - Load First)
|
|
├─ Agent role definition (changes rarely)
|
|
├─ core-principles.md (stable reference)
|
|
├─ Data files (technology-stack-guide.md, etc. - stable reference)
|
|
├─ Architecture documents (stable after creation)
|
|
└─ Checkpoints (stable after creation)
|
|
|
|
2. SEMI-CACHEABLE (Changes Occasionally)
|
|
├─ Task templates (stable structure, variable content)
|
|
├─ Workflow definitions (stable structure)
|
|
└─ Requirements docs (stable after approval)
|
|
|
|
3. NON-CACHEABLE (Dynamic - Load Last)
|
|
├─ Current task description (unique per task)
|
|
├─ User questions/requests (unique per interaction)
|
|
├─ Conversation history (constantly changing)
|
|
└─ Real-time data (changes frequently)
|
|
```
|
|
|
|
**Golden Rule**: **Static content first, dynamic content last**
|
|
|
|
### What to Cache vs Not Cache
|
|
|
|
**✅ CACHE (Static Reference Material)**:
|
|
- Agent role definitions
|
|
- core-principles.md
|
|
- technology-stack-guide.md
|
|
- security-guidelines.md
|
|
- api-implementation-patterns.md
|
|
- deployment-strategies.md
|
|
- development-guidelines.md
|
|
- architecture-patterns.md
|
|
- All data/* files (stable reference)
|
|
- Completed architecture documents
|
|
- Finalized checkpoints
|
|
|
|
**⚠️ CACHE WITH CARE (Semi-Static)**:
|
|
- Requirements documents (after finalization)
|
|
- Architecture documents (after approval)
|
|
- API specifications (after freeze)
|
|
- Checkpoints (after phase completion)
|
|
|
|
**❌ DON'T CACHE (Dynamic)**:
|
|
- Current task description
|
|
- User questions/messages
|
|
- Conversation history
|
|
- Work-in-progress documents
|
|
- Code being reviewed
|
|
- Intermediate decisions
|
|
- Debug output
|
|
- Error messages
|
|
|
|
### Cache-Optimized Loading Pattern
|
|
|
|
**Pattern 1: Static Foundation First**
|
|
|
|
```
|
|
LOAD ORDER for cache efficiency:
|
|
|
|
[CACHEABLE BLOCK - Load First]
|
|
1. Agent role definition (~300 tokens) ─┐
|
|
2. core-principles.md (~5,000 tokens) │
|
|
3. technology-stack-guide.md (~2,000) ├─ CACHE THIS (~8,500 tokens)
|
|
4. security-guidelines.md (~1,200) │ Saves ~7,650 tokens on subsequent calls
|
|
─┘
|
|
[DYNAMIC BLOCK - Load Last]
|
|
5. Current task description (~500 tokens) ─ Don't cache (changes every task)
|
|
6. User request (~200 tokens) ─ Don't cache (unique)
|
|
```
|
|
|
|
**Result**:
|
|
- First call: ~8,700 tokens charged
|
|
- Subsequent calls: ~850 tokens charged (only dynamic content)
|
|
- Savings: ~90% per call after first
|
|
|
|
**Pattern 2: Task-Specific Caching**
|
|
|
|
For repetitive tasks (e.g., multiple features in same codebase):
|
|
|
|
```
|
|
[STABLE CACHE - Reuse Across Tasks]
|
|
1. Agent role
|
|
2. core-principles.md
|
|
3. Project architecture (finalized)
|
|
4. Tech stack decisions (finalized)
|
|
5. API specifications (frozen)
|
|
|
|
[TASK CACHE - Reuse Within Task]
|
|
6. Feature requirements (current feature)
|
|
7. Architecture checkpoint (current feature)
|
|
|
|
[NEVER CACHE]
|
|
8. Current subtask
|
|
9. User messages
|
|
```
|
|
|
|
### Cache Invalidation Strategy
|
|
|
|
**When to Refresh Cache**:
|
|
|
|
🔄 **REFRESH IMMEDIATELY** when:
|
|
- core-principles.md updated
|
|
- Architecture significantly changed
|
|
- Technology decisions revised
|
|
- Security guidelines updated
|
|
- Major refactoring completed
|
|
|
|
⏱️ **REFRESH PERIODICALLY**:
|
|
- Data files: Every major version update
|
|
- Agent roles: When capabilities change
|
|
- Best practices: When standards evolve
|
|
|
|
✅ **KEEP CACHED**:
|
|
- Stable reference material
|
|
- Finalized architecture
|
|
- Approved requirements
|
|
- Completed checkpoints
|
|
|
|
**Cache Lifetime Guidance**:
|
|
- **Long-lived** (days/weeks): core-principles, data files, agent roles
|
|
- **Medium-lived** (hours/days): architecture docs, requirements
|
|
- **Short-lived** (minutes): task context, checkpoints
|
|
- **No cache** (per request): user input, conversation
|
|
|
|
### Workflow-Level Caching
|
|
|
|
**Greenfield Project Workflow**:
|
|
|
|
```
|
|
Phase 1: Requirements
|
|
[CACHE]
|
|
- Analyst role
|
|
- core-principles.md
|
|
[NO CACHE]
|
|
- Stakeholder input
|
|
- Requirements being drafted
|
|
|
|
Phase 2: Architecture
|
|
[CACHE]
|
|
- Solution Architect role
|
|
- core-principles.md
|
|
- technology-stack-guide.md
|
|
- Requirements (now finalized)
|
|
[NO CACHE]
|
|
- Architecture being designed
|
|
|
|
Phase 3: Implementation
|
|
[CACHE]
|
|
- Developer roles
|
|
- core-principles.md
|
|
- Architecture docs (now finalized)
|
|
- Checkpoints (now stable)
|
|
[NO CACHE]
|
|
- Current feature
|
|
- Code being written
|
|
|
|
CACHE BENEFIT: Each phase reuses stable context from previous phases
|
|
```
|
|
|
|
### Cache Size Optimization
|
|
|
|
**Optimal Cache Sizes**:
|
|
- **Minimum**: 1,000+ tokens (smaller = not worth caching overhead)
|
|
- **Sweet Spot**: 5,000-15,000 tokens (balance reuse vs memory)
|
|
- **Maximum**: Per provider limits (Claude: up to ~3-4 cache blocks)
|
|
|
|
**Strategies**:
|
|
|
|
1. **Bundle Static Files**: Load related static files together in cacheable block
|
|
```
|
|
GOOD: Load all data/* files together (one cache block)
|
|
BAD: Load data files separately interspersed with dynamic content
|
|
```
|
|
|
|
2. **Stable Prefix**: Keep structure consistent across requests
|
|
```
|
|
GOOD: Always load role → principles → guides in same order
|
|
BAD: Random order each time (breaks cache)
|
|
```
|
|
|
|
3. **Cache Boundaries**: Clear separation between static and dynamic
|
|
```
|
|
GOOD: [Static block] then [Dynamic block]
|
|
BAD: Static → Dynamic → Static → Dynamic (cache thrashing)
|
|
```
|
|
|
|
### Cache-Aware JIT Loading
|
|
|
|
**Modify JIT Pattern for Caching**:
|
|
|
|
**Standard JIT** (No Caching Consideration):
|
|
- Load role + task
|
|
- Load guide when needed
|
|
- Load another guide when needed
|
|
- Load dynamic content
|
|
|
|
**Cache-Aware JIT**:
|
|
- Load ALL static guides upfront (cache block)
|
|
- Load dynamic task content
|
|
- Reference cached guides as needed
|
|
- NO additional loads (all guides pre-cached)
|
|
|
|
**When to Use Each**:
|
|
|
|
Use **Standard JIT** when:
|
|
- One-off task (no cache benefit)
|
|
- Highly dynamic workflow
|
|
- Frequently changing context
|
|
|
|
Use **Cache-Aware Loading** when:
|
|
- Repetitive tasks (multiple features)
|
|
- Long-running project (many interactions)
|
|
- Stable reference material
|
|
- Multiple agents using same context
|
|
|
|
### Cache Hit Indicators
|
|
|
|
**High Cache Hit Scenarios** (Good):
|
|
- ✅ Multiple features in same project
|
|
- ✅ Sequential agent interactions
|
|
- ✅ Iterative development (multiple sprints)
|
|
- ✅ Code review across similar code
|
|
- ✅ Consistent use of same data files
|
|
|
|
**Low Cache Hit Scenarios** (Less Benefit):
|
|
- ❌ One-off greenfield projects
|
|
- ❌ Constantly changing architecture
|
|
- ❌ Different tech stacks per task
|
|
- ❌ Unique context each time
|
|
|
|
### Caching Best Practices
|
|
|
|
**DO**:
|
|
- ✅ Load static content first (agent role, principles, data files)
|
|
- ✅ Keep loading order consistent across requests
|
|
- ✅ Bundle related static files together
|
|
- ✅ Cache finalized documents (architecture, requirements)
|
|
- ✅ Use cache-aware loading for repetitive tasks
|
|
|
|
**DON'T**:
|
|
- ❌ Interleave static and dynamic content
|
|
- ❌ Change loading order randomly
|
|
- ❌ Cache work-in-progress documents
|
|
- ❌ Cache user input or conversation
|
|
- ❌ Over-cache (don't cache tiny files)
|
|
|
|
### Example: Cache-Optimized Workflow
|
|
|
|
**Task**: Implement 5 features in existing project
|
|
|
|
**Without Cache Optimization**:
|
|
```
|
|
Feature 1: Load role + principles + architecture + task → 8,000 tokens
|
|
Feature 2: Load role + principles + architecture + task → 8,000 tokens
|
|
Feature 3: Load role + principles + architecture + task → 8,000 tokens
|
|
Feature 4: Load role + principles + architecture + task → 8,000 tokens
|
|
Feature 5: Load role + principles + architecture + task → 8,000 tokens
|
|
TOTAL: 40,000 tokens
|
|
```
|
|
|
|
**With Cache Optimization**:
|
|
```
|
|
Feature 1: Load [role + principles + architecture]CACHE + task → 8,000 tokens (full charge)
|
|
Feature 2: [Cache hit] + task → 800 tokens (10% charge)
|
|
Feature 3: [Cache hit] + task → 800 tokens
|
|
Feature 4: [Cache hit] + task → 800 tokens
|
|
Feature 5: [Cache hit] + task → 800 tokens
|
|
TOTAL: 11,200 tokens
|
|
SAVINGS: 72% (28,800 tokens saved)
|
|
```
|
|
|
|
### Integration with JIT Loading
|
|
|
|
**Hybrid Strategy** (Best of Both):
|
|
|
|
**Phase 1: Load Static Cache Block**
|
|
```
|
|
[CACHE BLOCK - Load Once]
|
|
- Agent role
|
|
- core-principles.md
|
|
- agent-responsibility-matrix.md
|
|
- context-loading-guide.md (if needed)
|
|
Total: ~7,000 tokens
|
|
```
|
|
|
|
**Phase 2: JIT Load Decision Guides** (Also Cache-Friendly)
|
|
```
|
|
[CONDITIONAL CACHE - Load If Needed]
|
|
- technology-stack-guide.md (if making stack decision)
|
|
- security-guidelines.md (if implementing auth)
|
|
- api-implementation-patterns.md (if designing API)
|
|
Add: ~1,000-3,000 tokens
|
|
```
|
|
|
|
**Phase 3: Dynamic Task Context** (Never Cache)
|
|
```
|
|
[DYNAMIC - Changes Per Task]
|
|
- Current task description
|
|
- User messages
|
|
- Work artifacts
|
|
Add: ~500-1,500 tokens
|
|
```
|
|
|
|
**Result**:
|
|
- Stable foundation cached
|
|
- Decision guides cached when needed
|
|
- Dynamic content fresh
|
|
- Optimal token efficiency
|
|
|
|
### Token Accounting with Caching
|
|
|
|
**Revised Token Accounting Format**:
|
|
|
|
```
|
|
=== TOKEN ACCOUNTING (With Cache) ===
|
|
|
|
CACHEABLE CONTEXT (Static):
|
|
- Role definition: ~300 tokens
|
|
- core-principles.md: ~5,000 tokens
|
|
- architecture.md: ~2,000 tokens
|
|
Subtotal: ~7,300 tokens
|
|
Cache Status: CACHED (first call: full charge, subsequent: ~10% charge)
|
|
|
|
DECISION CONTEXT (Semi-Static):
|
|
- security-guidelines.md: ~1,000 tokens
|
|
Cache Status: CACHED (loaded in previous task)
|
|
|
|
DYNAMIC CONTEXT (Never Cache):
|
|
- Current task: ~500 tokens
|
|
- User messages: ~200 tokens
|
|
Subtotal: ~700 tokens
|
|
|
|
TOTAL ESTIMATED:
|
|
- First call: ~9,000 tokens
|
|
- This call (cache hit): ~1,500 tokens (83% savings)
|
|
```
|
|
|
|
### Summary: Caching Optimization
|
|
|
|
**Key Principles**:
|
|
1. **Static first, dynamic last** - Order matters for caching
|
|
2. **Bundle static files** - Load together for efficient caching
|
|
3. **Consistent structure** - Same order maximizes cache hits
|
|
4. **Finalize before caching** - Don't cache work-in-progress
|
|
5. **Cache-aware JIT** - Load static guides upfront for repetitive tasks
|
|
|
|
**Expected Impact**:
|
|
- **Single task**: 0-30% savings (cache setup cost)
|
|
- **Repetitive tasks**: 70-90% savings (massive benefit)
|
|
- **Long-term project**: 80%+ cumulative savings
|
|
|
|
**Remember**: Caching optimizes INPUT token costs. Output quality remains HIGH with comprehensive, detailed responses.
|
|
|
|
## Code Quality Standards
|
|
|
|
**Naming Conventions**:
|
|
- Files: kebab-case for utilities, PascalCase for components, camelCase for hooks
|
|
- Variables: camelCase for functions/vars, PascalCase for classes, UPPER_SNAKE_CASE for constants
|
|
- Descriptive: `isLoading` not `loading`, `handleSubmit` not `submit`
|
|
|
|
**TypeScript Rules**:
|
|
- No `any` - use `unknown` for truly unknown types
|
|
- `interface` for objects, `type` for unions/intersections
|
|
- Explicit function return types
|
|
- Generics for reusable type-safe code
|
|
|
|
**Testing Philosophy**:
|
|
- Test user interactions, not implementation details
|
|
- Mock external dependencies
|
|
- Integration tests for critical paths
|
|
- Unit tests for complex business logic
|
|
- Aim for >80% coverage on critical code
|
|
|
|
**Error Handling**:
|
|
- Custom error classes for different error types
|
|
- Centralized error handling (middleware for backend, error boundaries for frontend)
|
|
- Structured logging with context
|
|
- Never expose stack traces in production
|
|
- Proper HTTP status codes (400 validation, 401 auth, 404 not found, 500 server error)
|
|
|
|
## Security Standards
|
|
|
|
**Authentication**:
|
|
- bcrypt/argon2 for password hashing (10+ rounds)
|
|
- JWT with short-lived access tokens + long-lived refresh tokens
|
|
- Store refresh tokens in httpOnly cookies or secure DB
|
|
- Implement token rotation on refresh
|
|
|
|
**Input Validation**:
|
|
- Validate ALL user inputs with Zod or Joi
|
|
- Sanitize HTML output to prevent XSS
|
|
- Use parameterized queries to prevent SQL injection
|
|
- Whitelist, never blacklist
|
|
|
|
**API Security**:
|
|
- CORS with specific origins (not *)
|
|
- Rate limiting per user/endpoint
|
|
- CSRF protection for state-changing operations
|
|
- Security headers with Helmet.js
|
|
- HTTPS in production
|
|
|
|
**Secrets Management**:
|
|
- Never commit secrets to version control
|
|
- Use environment variables (.env files)
|
|
- Rotate credentials regularly
|
|
- Minimal privilege principle
|
|
|
|
## Performance Standards
|
|
|
|
**Frontend**:
|
|
- Code splitting and lazy loading for routes/heavy components
|
|
- Memoization (React.memo, useMemo, useCallback) only when measured benefit
|
|
- Virtual scrolling for long lists
|
|
- Image optimization (next/image or similar)
|
|
- Bundle analysis and tree shaking
|
|
|
|
**Backend**:
|
|
- Database indexes on frequently queried fields
|
|
- Connection pooling
|
|
- Redis caching for frequently accessed data
|
|
- Pagination for large result sets
|
|
- Async/await throughout (no blocking operations)
|
|
- Background jobs for heavy processing
|
|
|
|
## Documentation Standards
|
|
|
|
**Code Documentation**:
|
|
- JSDoc for public APIs and complex functions
|
|
- README for setup and architecture overview
|
|
- Inline comments for "why", not "what"
|
|
- Keep docs close to code
|
|
|
|
**API Documentation**:
|
|
- OpenAPI/Swagger for REST APIs
|
|
- GraphQL SDL with type descriptions
|
|
- Code examples for common use cases
|
|
- Authentication flows documented
|
|
- Error codes and messages explained
|
|
|
|
**Architecture Documentation**:
|
|
- Architecture Decision Records (ADRs) for key decisions
|
|
- System diagrams for complex architectures
|
|
- Database schema documentation
|
|
- Deployment guides
|
|
|
|
## Git Standards
|
|
|
|
**Commit Format**: `<type>(<scope>): <subject>`
|
|
|
|
**Types**: feat, fix, docs, style, refactor, test, chore
|
|
|
|
**Example**: `feat(auth): add password reset functionality`
|
|
|
|
**PR Requirements**:
|
|
- TypeScript compiles with no errors
|
|
- ESLint passes
|
|
- All tests pass, coverage >80%
|
|
- No console.logs or debugger statements
|
|
- Meaningful commits
|
|
- Clear PR description
|
|
|
|
## Working Philosophy
|
|
|
|
**Start Simple**: MVP first, add complexity as needed. Monolith before microservices.
|
|
|
|
**Fail Fast**: Validate early, catch errors at compile time, comprehensive error handling.
|
|
|
|
**Iterate Quickly**: Ship small increments, get feedback, improve continuously.
|
|
|
|
**Automate Everything**: Testing, linting, deployment, monitoring. Reduce manual toil.
|
|
|
|
**Monitor and Learn**: Track metrics, analyze errors, learn from production, improve continuously.
|
|
|
|
---
|
|
|
|
**All agents reference these principles.**
|
|
|
|
**Additional References**:
|
|
- Implementation details: `development-guidelines.md`, `best-practices.md`, `security-guidelines.md`
|
|
- Agent boundaries: `agent-responsibility-matrix.md` (who owns what decisions)
|
|
- Context loading: `context-loading-guide.md` (when to load what)
|
|
- Technology guides: Various data/* files (loaded just-in-time)
|