
Core Development Principles

Shared philosophy and principles for all JavaScript/TypeScript development agents. These principles guide decision-making across the entire stack.

Universal Principles

Type Safety First: Catch errors at compile time, not runtime. Use TypeScript strict mode, explicit types, and leverage inference.

Security First: Every endpoint authenticated, all inputs validated, all outputs sanitized. Never trust client input.

Developer Experience: Fast feedback loops, clear error messages, intuitive APIs, comprehensive documentation.

Performance Matters: Optimize for perceived performance first. Measure before optimizing. Cache strategically.

Maintainability: Clean code organization, consistent patterns, automated testing, comprehensive documentation.

YAGNI (You Aren't Gonna Need It): Don't over-engineer for future needs. Build for 2x scale, not 100x. Prefer boring, proven technology.

Component-First Thinking: Every piece is reusable, well-tested, and composable (applies to UI components, services, modules).

Clean Architecture: Separation of concerns, dependency injection, testable code. Controllers handle I/O, services handle logic, repositories handle data (a minimal sketch appears at the end of this section).

Observability: Logging, monitoring, error tracking. You can't improve what you don't measure.
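
A minimal TypeScript sketch of the clean-architecture layering above (controllers handle I/O, services handle logic, repositories handle data). The class names and in-memory details are illustrative assumptions, not a prescribed structure:

```typescript
import { randomUUID } from "node:crypto";

interface User {
  id: string;
  email: string;
}

class UserRepository {
  // Data access only: no business rules here.
  async findByEmail(email: string): Promise<User | null> {
    // e.g. a parameterized query via your ORM/driver of choice
    return null;
  }
}

class UserService {
  constructor(private readonly users: UserRepository) {}

  // Business logic only: no HTTP details here.
  async register(email: string): Promise<User> {
    if (await this.users.findByEmail(email)) {
      throw new Error("Email already registered");
    }
    return { id: randomUUID(), email };
  }
}

class UserController {
  constructor(private readonly service: UserService) {}

  // I/O only: parse input, call the service, shape the response.
  async post(body: { email: string }) {
    const user = await this.service.register(body.email);
    return { status: 201, body: user };
  }
}
```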

Context Efficiency Philosophy

All agents optimize token usage through high-signal communication:

Reference, Don't Repeat: Point to file paths instead of duplicating content

  • "Implementation in src/services/auth.service.ts"
  • [Pasting entire file contents]

Provide Summaries: After creating artifacts, give a 2-3 sentence overview with a file reference

  • "Created authentication service with JWT + refresh tokens. See src/services/auth.service.ts for implementation."
  • [Explaining every line of code written]

Progressive Detail: Start with high-level structure, add details only when implementing

  • Start: "Need authentication with JWT" → Later: "bcrypt rounds, token expiry settings"
  • Upfront: Every security configuration detail before deciding approach

Archive Verbose Content: Keep implementations/discussions in files, reference them

  • ✅ Long discussions → docs/decisions/auth-strategy.md → Reference in checkpoint
  • ❌ Repeating the entire discussion in every message

Checkpoint Summaries: At phase transitions, compress context into max 100-line summaries containing:

  • Final decisions and rationale
  • Artifact file paths
  • Critical constraints and dependencies
  • What to do next

Just-in-Time Context Retrieval

CRITICAL: Agents must retrieve context only when making specific decisions, not upfront. This prevents token waste and maintains focus.

Start Minimal (Every Task)

Load at Task Start:

  • Your role definition and core principles (this file)
  • The specific task description and requirements
  • Any explicitly referenced artifacts (checkpoint files, requirements docs)
  • NO technology-specific guides yet
  • NO implementation patterns yet
  • NO best practices yet

Example: React Developer receives task "Build user profile component"

  • Load: Role definition, task description, requirements
  • Skip: state-management-guide.md, react-patterns.md, testing-strategy.md (load later when needed)

Load on Decision Points (Not Before)

Decision-Triggered Loading:

  1. Technology Selection → Load technology-stack-guide.md

    • ONLY when choosing framework/library
    • NOT if stack already decided
  2. Implementation Pattern → Load pattern-specific guide

    • ONLY when implementing that specific pattern
    • Example: Loading GraphQL patterns ONLY if building GraphQL API
  3. Security Concern → Load security-guidelines.md

    • ONLY when handling auth, sensitive data, or user input
    • NOT for every task
  4. Performance Issue → Load performance optimization guides

    • ONLY when performance requirements specified
    • NOT preemptively
  5. Deployment Decision → Load deployment-strategies.md

    • ONLY when choosing deployment approach
    • NOT during feature development

Question Before Loading

Before loading any data file, ask:

  • Is this decision being made RIGHT NOW?
  • Can I defer this decision to implementation?
  • Is this information in the checkpoint/requirements already?

Examples:

Over-Loading: "Building auth system" → Loading security-guidelines.md, api-implementation-patterns.md, deployment-strategies.md, database-optimization.md

  • WHY BAD: Most won't be needed yet

JIT Loading: "Building auth system" → Start with requirements → Load security-guidelines.md when designing auth flow → Load api-implementation-patterns.md when implementing endpoints → Skip deployment/optimization (not needed yet)

Over-Loading: "Design architecture" → Loading ALL data files (technology-stack-guide.md, deployment-strategies.md, security-guidelines.md, best-practices.md, etc.)

  • WHY BAD: Won't use 80% of it in architecture phase

JIT Loading: "Design architecture" → Load technology-stack-guide.md for stack decision → Load deployment-strategies.md for hosting decision → Load security-guidelines.md IF handling sensitive data → Skip implementation patterns (for later)

Context Loading Decision Tree

Task Received
├─ Load: Role + Task + Requirements (ALWAYS)
├─ Checkpoint exists? → Load checkpoint (skip full history)
└─ Then ask: What decision am I making RIGHT NOW?
   │
   ├─ Choosing Stack/Framework?
   │  └─ Load: technology-stack-guide.md
   │
   ├─ Implementing API endpoints?
   │  ├─ What type? REST/GraphQL/tRPC?
   │  └─ Load: api-implementation-patterns.md (specific section only)
   │
   ├─ Handling Authentication/Authorization?
   │  └─ Load: security-guidelines.md
   │
   ├─ Database schema design?
   │  └─ Load: database-design-patterns.md
   │
   ├─ Deployment/Hosting decision?
   │  └─ Load: deployment-strategies.md
   │
   ├─ Performance optimization?
   │  └─ Load: performance-optimization.md
   │
   ├─ Code review/quality check?
   │  └─ Load: development-guidelines.md + relevant checklist
   │
   └─ General implementation?
      └─ Load NOTHING extra - use role knowledge
         (Load specific guides only if encountering decision)

Role-Specific Context Limits

React Developer:

  • Default load: core-principles.md only
  • Load state-management-guide.md ONLY when choosing state solution
  • Load component-design-guidelines.md ONLY when building complex component
  • SKIP: Backend patterns, API specs (unless integrating), database docs

Node Backend Developer:

  • Default load: core-principles.md only
  • Load framework comparison ONLY when choosing Express/Fastify/NestJS
  • Load database patterns ONLY when writing queries
  • SKIP: React patterns, frontend state management, CSS guides

Solution Architect:

  • Default load: core-principles.md + requirements
  • Load technology-stack-guide.md for stack decisions
  • Load deployment-strategies.md for hosting decisions
  • Load security-guidelines.md IF requirements mention sensitive data
  • SKIP: Implementation patterns (delegate to implementation agents)

API Developer:

  • Default load: core-principles.md only
  • Load api-implementation-patterns.md for specific API type (REST/GraphQL/tRPC section only)
  • SKIP: Frontend patterns, deployment strategies, database optimization

TypeScript Expert:

  • Default load: core-principles.md only
  • Load tsconfig reference ONLY when configuring TypeScript
  • Load advanced patterns ONLY when implementing complex types
  • SKIP: Framework-specific guides, deployment, testing (unless TS-specific)

Progressive Context Expansion

3-Stage Loading Pattern:

Stage 1 - Task Start (Minimal):

  • Role definition
  • Task requirements
  • Referenced artifacts only

Stage 2 - Decision Point (Targeted):

  • Load specific guide for current decision
  • Load ONLY relevant section (not entire file)
  • Example: Load "REST API" section from api-implementation-patterns.md, skip GraphQL/tRPC sections

Stage 3 - Implementation (On-Demand):

  • Load patterns as you encounter need
  • Load examples when unclear
  • Load checklists when validating

Anti-Patterns (What NOT to Do)

"Load Everything Just in Case"

  • Loading all data files at task start
  • "Might need it later" is NOT a valid reason

"Load Full File for One Detail"

  • Load specific section only
  • Reference file path for full details

"Load Before Decision Needed"

  • Don't load deployment guide during feature design
  • Don't load implementation patterns during architecture phase

"Load Other Agents' Context"

  • Backend developer doesn't need React state management
  • Frontend developer doesn't need database optimization

Validation Questions

Before loading any file, ask yourself:

  1. Am I making THIS specific decision RIGHT NOW?
  2. Is this my role's responsibility?
  3. Can this wait until implementation?
  4. Is this already in checkpoint/requirements?

If the answer to #1 or #2 is NO, or to #3 or #4 is YES → Don't load it

Token Budget Awareness

Typical Token Costs:

  • Role definition: ~200-300 tokens
  • Task description: ~100-500 tokens
  • Data file (full): ~1,000-3,000 tokens
  • Data file (section): ~200-500 tokens
  • Checkpoint: ~500-1,000 tokens

Target Budget:

  • Task start: <1,000 tokens total
  • Per decision: +200-500 tokens (specific guide section)
  • Full workflow: <10,000 tokens cumulative (use checkpoints)

If approaching budget limit: Create checkpoint, archive verbose context, start fresh phase.

Runtime Token Monitoring

CRITICAL: Agents must actively monitor token usage during execution and self-regulate to stay within budgets.

Token Self-Assessment Protocol

At Task Start:

Task: [task name]
Estimated Budget: [X tokens]

Loaded:
- Role definition: ~250 tokens
- core-principles.md: ~300 tokens
- Task requirements: ~[Y] tokens
- [other files]: ~[Z] tokens
TOTAL LOADED: ~[sum] tokens

Remaining Budget: [budget - sum] tokens

During Execution (every 2-3 decisions):

Token Check:
- Starting context: [X] tokens
- Loaded since last check: [Y] tokens
- Current estimated total: [X+Y] tokens
- Budget: [Z] tokens
- Status: [OK | WARNING | EXCEEDED]

Before Loading Any File:

Pre-Load Check:
- Current context: ~[X] tokens
- File to load: ~[Y] tokens (estimated)
- After load: ~[X+Y] tokens
- Budget: [Z] tokens
- Decision: [PROCEED | SKIP | CHECKPOINT FIRST]

Checkpoint Triggers

MUST create checkpoint when:

  • Context exceeds 3,000 tokens
  • Completing a major phase (architecture → implementation)
  • Before loading would exceed budget
  • After 5+ agent interactions in sequence
  • Detailed discussion exceeds 1,000 tokens

Checkpoint Process:

  1. Estimate current context size
  2. Create checkpoint summary (max 100 lines, ~500 tokens)
  3. Archive verbose content to docs/archive/
  4. Start next phase with checkpoint only
  5. Token reduction: Expect 80%+ compression

Example:

Phase complete. Estimating context:
- Architecture discussion: ~2,500 tokens
- Technology decisions: ~1,200 tokens
- Requirements: ~800 tokens
TOTAL: ~4,500 tokens

Action: Creating checkpoint
- Checkpoint summary: ~500 tokens
- Token reduction: 89%
- Archived: Full discussion to docs/archive/architecture-phase/

Budget Warning System

Token Status Levels:

🟢 GREEN (0-50% of budget):

  • Status: Healthy
  • Action: Continue normally
  • Example: 800/2000 tokens used

🟡 YELLOW (50-75% of budget):

  • Status: Warning
  • Action: Be selective with additional loads
  • Example: 1,500/2000 tokens used
  • Strategy: Load only critical guides, skip nice-to-haves

🟠 ORANGE (75-90% of budget):

  • Status: Critical
  • Action: Stop loading new context
  • Example: 1,700/2000 tokens used
  • Strategy: Use existing context only, prepare checkpoint

🔴 RED (>90% of budget):

  • Status: Exceeded
  • Action: Create checkpoint IMMEDIATELY
  • Example: 1,850/2000 tokens used
  • Strategy: Checkpoint NOW, start fresh phase

Self-Monitoring Format

Maintain Mental Token Accounting:

Throughout task execution, mentally track:

=== TOKEN ACCOUNTING ===

FIXED CONTEXT (loaded at start):
- Role definition: ~250
- Core principles: ~300
- Task requirements: ~500
Subtotal: ~1,050 tokens

DECISION CONTEXT (loaded during execution):
- technology-stack-guide.md: ~1,500
- security-guidelines.md: ~1,000
Subtotal: ~2,500 tokens

CONVERSATION (generated during work):
- Discussion and analysis: ~800
- Code examples: ~400
Subtotal: ~1,200 tokens

TOTAL ESTIMATED: ~4,750 tokens
BUDGET: 5,000 tokens
STATUS: 🟠 ORANGE - Approaching limit
ACTION: No more loads, complete task with current context

=== END ACCOUNTING ===

Estimation Guidelines

Token Estimation Rules:

  • 1 word ≈ 1.3 tokens (on average)
  • 1 line of code ≈ 15-25 tokens
  • 1 markdown paragraph ≈ 50-100 tokens
  • Agent role file ≈ 200-400 tokens
  • Data file (full) ≈ 1,000-3,000 tokens
  • Data file (section) ≈ 200-500 tokens
  • Checkpoint summary ≈ 500-800 tokens
  • Architecture doc ≈ 2,000-4,000 tokens

Quick Estimation:

  • Count words in content
  • Multiply by 1.3
  • Round up for safety
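
As a rough illustration of this rule of thumb, a hypothetical helper (a budgeting heuristic, not an exact tokenizer):

```typescript
// Rough token estimate from the ~1.3 tokens-per-word rule above.
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(words * 1.3); // round up for safety
}

// Example: a 400-word requirements doc ≈ estimateTokens(requirementsText) ≈ 520 tokens
```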

File Size Indicators:

  • Small file (<100 lines): ~500 tokens
  • Medium file (100-300 lines): ~1,000-1,500 tokens
  • Large file (300-500 lines): ~2,000-3,000 tokens
  • Very large file (>500 lines): ~3,000+ tokens

Runtime Optimization Tactics

When Approaching Budget:

  1. Use Section Loading: Load only relevant section of large file

    • Instead of: Full api-implementation-patterns.md (~1,500 tokens)
    • Load: REST section only (~300 tokens)
    • Savings: 80%
  2. Reference, Don't Load: Point to the file path instead of loading

    • Instead of: Loading best-practices.md
    • Say: "Following patterns in best-practices.md section 3.2"
    • Savings: 100% (no load needed)
  3. Checkpoint Intermediate State: Don't wait for phase end

    • If the discussion is getting long, checkpoint NOW
    • Compress verbose analysis into key decisions
    • Continue with lightweight checkpoint
  4. Defer Non-Critical Loads: Skip nice-to-have context

    • If 🟡 YELLOW or above, load ONLY must-haves
    • Skip optimization guides if not performance-critical
    • Skip best practices if pattern already clear
  5. Use Role Knowledge: Lean on built-in expertise

    • Instead of: Loading guide for basic patterns
    • Use: Agent's inherent role knowledge
    • When: Standard, well-known patterns

Multi-Agent Workflow Monitoring

Workflow-Level Tracking:

When multiple agents work sequentially:

WORKFLOW TOKEN TRACKING

Phase 1: Requirements (Analyst)
- Context: ~800 tokens
- Output: requirements.md
- Checkpoint: ~600 tokens
- Handoff: Pass checkpoint only

Phase 2: Architecture (Solution Architect)
- Receive: checkpoint (~600)
- Load: tech-stack-guide (~1,500)
- Context: ~2,800 tokens
- Output: architecture.md
- Checkpoint: ~800 tokens
- Handoff: Pass checkpoint only

Phase 3: Implementation (Developers)
- Receive: checkpoint (~800)
- Load: role-specific guides (~500-1,000)
- Context: ~1,500-2,000 tokens
- No checkpoint needed (final phase)

TOTAL WORKFLOW: ~5,100 tokens (with checkpoints)
WITHOUT CHECKPOINTS: ~15,000+ tokens (accumulated context)
SAVINGS: 66%

Checkpoint Quality Metrics

Effective Checkpoint Indicators:

  • Reduces context by 80%+ (e.g., 4,000 → 800 tokens)
  • Contains all critical decisions
  • Includes all artifact paths
  • Next agent can proceed without loading previous phase
  • Max 100 lines (~500-800 tokens)

Poor Checkpoint Indicators:

  • Only 30-50% reduction (too verbose)
  • Missing key decisions or rationale
  • Next agent needs to reload original context
  • Over 150 lines (defeats purpose)

Self-Regulation Commitment

Every agent commits to:

  1. Estimate before loading: Know token cost before loading any file
  2. Track cumulative usage: Maintain mental accounting of context size
  3. Respect budget warnings: React to 🟡🟠🔴 status appropriately
  4. Create checkpoints proactively: Don't wait for red status
  5. Load sections not files: When possible, load only relevant sections
  6. Use role knowledge first: Load guides only when truly needed

Example: Good Runtime Monitoring

Task: Design REST API for user service
Budget: 2,000 tokens

[Start]
Loaded: Role (250) + core-principles (300) + requirements (400) = 950 tokens
Status: 🟢 GREEN (48% of budget)

[Decision: Need REST patterns]
Pre-check: Current 950 + api-patterns REST section (300) = 1,250
Status: 🟢 GREEN (63% of budget) → PROCEED

Loaded: REST section
Context: 1,250 tokens

[Decision: Need auth patterns?]
Pre-check: Current 1,250 + security-guidelines (1,000) = 2,250
Status: 🔴 RED (113% of budget) → DEFER
Decision: Use basic auth pattern from role knowledge, skip loading guide

[Complete]
Final context: ~1,400 tokens (with output)
Status: 🟢 GREEN (70% of budget)
Result: Task complete, stayed within budget

Example: Bad Runtime Monitoring (Don't Do This)

Task: Design REST API for user service
Budget: 2,000 tokens

[Start]
Loaded: Role + core-principles + requirements + api-patterns (full) +
        security-guidelines + best-practices + deployment-strategies
Context: ~5,500 tokens
Status: 🔴 RED (275% of budget) - EXCEEDED

Problem: Loaded everything upfront without checking budget
Result: Token waste, exceeded budget, needed checkpoint immediately

Prompt Caching Optimization

POWERFUL: Prompt caching can reduce token costs by ~90% for static content and improve response times by caching reusable context.

How Prompt Caching Works

Concept: AI providers cache static portions of prompts across multiple requests, dramatically reducing token costs and latency.

Benefits:

  • Token Savings: ~90% reduction for cached content (5,000 tokens → 500 tokens charged)
  • Speed: Faster response times (cached content pre-processed)
  • Cost: Significant cost reduction for repeated context

Requirements:

  • Content must be static (doesn't change between requests)
  • Content must be large enough (typically >1,000 tokens) to benefit from caching
  • Content must be in the prefix (beginning of prompt, before dynamic content)

Cache-Friendly Content Structure

Optimal Loading Order (Stable → Dynamic):

1. CACHEABLE (Static - Load First)
├─ Agent role definition (changes rarely)
├─ core-principles.md (stable reference)
├─ Data files (technology-stack-guide.md, etc. - stable reference)
├─ Architecture documents (stable after creation)
└─ Checkpoints (stable after creation)

2. SEMI-CACHEABLE (Changes Occasionally)
├─ Task templates (stable structure, variable content)
├─ Workflow definitions (stable structure)
└─ Requirements docs (stable after approval)

3. NON-CACHEABLE (Dynamic - Load Last)
├─ Current task description (unique per task)
├─ User questions/requests (unique per interaction)
├─ Conversation history (constantly changing)
└─ Real-time data (changes frequently)

Golden Rule: Static content first, dynamic content last

What to Cache vs Not Cache

✅ CACHE (Static Reference Material):

  • Agent role definitions
  • core-principles.md
  • technology-stack-guide.md
  • security-guidelines.md
  • api-implementation-patterns.md
  • deployment-strategies.md
  • development-guidelines.md
  • architecture-patterns.md
  • All data/* files (stable reference)
  • Completed architecture documents
  • Finalized checkpoints

⚠️ CACHE WITH CARE (Semi-Static):

  • Requirements documents (after finalization)
  • Architecture documents (after approval)
  • API specifications (after freeze)
  • Checkpoints (after phase completion)

❌ DON'T CACHE (Dynamic):

  • Current task description
  • User questions/messages
  • Conversation history
  • Work-in-progress documents
  • Code being reviewed
  • Intermediate decisions
  • Debug output
  • Error messages

Cache-Optimized Loading Pattern

Pattern 1: Static Foundation First

LOAD ORDER for cache efficiency:

[CACHEABLE BLOCK - Load First]
1. Agent role definition (~300 tokens) ─┐
2. core-principles.md (~5,000 tokens)   │
3. technology-stack-guide.md (~2,000)   ├─ CACHE THIS (~8,500 tokens)
4. security-guidelines.md (~1,200)      │  Saves ~7,650 tokens on subsequent calls
                                        ─┘
[DYNAMIC BLOCK - Load Last]
5. Current task description (~500 tokens) ─ Don't cache (changes every task)
6. User request (~200 tokens)             ─ Don't cache (unique)

Result:

  • First call: ~9,200 tokens charged (static block + dynamic content)
  • Subsequent calls: ~1,550 tokens charged (~700 dynamic + ~850 for cache reads)
  • Savings: ~83% per call after the first
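
For concreteness, here is one way the static-prefix/dynamic-suffix split might look with the Anthropic Messages API's prompt caching. This is a sketch only: the model id, file paths, and loading helpers are illustrative assumptions, and other providers expose caching differently.

```typescript
import Anthropic from "@anthropic-ai/sdk";
import { readFileSync } from "node:fs";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Static, cacheable prefix: role + principles + guides, loaded in a fixed order.
const staticContext = [
  readFileSync("agents/react-developer.md", "utf8"),       // hypothetical paths
  readFileSync("data/core-principles.md", "utf8"),
  readFileSync("data/technology-stack-guide.md", "utf8"),
].join("\n\n");

export async function runTask(taskDescription: string) {
  return client.messages.create({
    model: "claude-sonnet-4-5", // illustrative model id
    max_tokens: 2048,
    system: [
      {
        type: "text",
        text: staticContext,
        // Marks the static prefix as cacheable; later calls with the same
        // prefix are charged at the reduced cache-read rate.
        cache_control: { type: "ephemeral" },
      },
    ],
    // Dynamic content goes last and is never cached.
    messages: [{ role: "user", content: taskDescription }],
  });
}
```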

Pattern 2: Task-Specific Caching

For repetitive tasks (e.g., multiple features in same codebase):

[STABLE CACHE - Reuse Across Tasks]
1. Agent role
2. core-principles.md
3. Project architecture (finalized)
4. Tech stack decisions (finalized)
5. API specifications (frozen)

[TASK CACHE - Reuse Within Task]
6. Feature requirements (current feature)
7. Architecture checkpoint (current feature)

[NEVER CACHE]
8. Current subtask
9. User messages

Cache Invalidation Strategy

When to Refresh Cache:

🔄 REFRESH IMMEDIATELY when:

  • core-principles.md updated
  • Architecture significantly changed
  • Technology decisions revised
  • Security guidelines updated
  • Major refactoring completed

⏱️ REFRESH PERIODICALLY:

  • Data files: Every major version update
  • Agent roles: When capabilities change
  • Best practices: When standards evolve

✅ KEEP CACHED:

  • Stable reference material
  • Finalized architecture
  • Approved requirements
  • Completed checkpoints

Cache Lifetime Guidance:

  • Long-lived (days/weeks): core-principles, data files, agent roles
  • Medium-lived (hours/days): architecture docs, requirements
  • Short-lived (minutes/hours): task context, in-progress checkpoints
  • No cache (per request): user input, conversation

Workflow-Level Caching

Greenfield Project Workflow:

Phase 1: Requirements
[CACHE]
- Analyst role
- core-principles.md
[NO CACHE]
- Stakeholder input
- Requirements being drafted

Phase 2: Architecture
[CACHE]
- Solution Architect role
- core-principles.md
- technology-stack-guide.md
- Requirements (now finalized)
[NO CACHE]
- Architecture being designed

Phase 3: Implementation
[CACHE]
- Developer roles
- core-principles.md
- Architecture docs (now finalized)
- Checkpoints (now stable)
[NO CACHE]
- Current feature
- Code being written

CACHE BENEFIT: Each phase reuses stable context from previous phases

Cache Size Optimization

Optimal Cache Sizes:

  • Minimum: 1,000+ tokens (smaller = not worth caching overhead)
  • Sweet Spot: 5,000-15,000 tokens (balance reuse vs memory)
  • Maximum: Per provider limits (e.g. Claude supports up to 4 cache breakpoints)

Strategies:

  1. Bundle Static Files: Load related static files together in cacheable block

    GOOD: Load all data/* files together (one cache block)
    BAD: Load data files separately interspersed with dynamic content
    
  2. Stable Prefix: Keep structure consistent across requests

    GOOD: Always load role → principles → guides in same order
    BAD: Random order each time (breaks cache)
    
  3. Cache Boundaries: Clear separation between static and dynamic

    GOOD: [Static block] then [Dynamic block]
    BAD: Static → Dynamic → Static → Dynamic (cache thrashing)
    

Cache-Aware JIT Loading

Modify JIT Pattern for Caching:

Standard JIT (No Caching Consideration):

  • Load role + task
  • Load guide when needed
  • Load another guide when needed
  • Load dynamic content

Cache-Aware JIT:

  • Load ALL static guides upfront (cache block)
  • Load dynamic task content
  • Reference cached guides as needed
  • NO additional loads (all guides pre-cached)

When to Use Each:

Use Standard JIT when:

  • One-off task (no cache benefit)
  • Highly dynamic workflow
  • Frequently changing context

Use Cache-Aware Loading when:

  • Repetitive tasks (multiple features)
  • Long-running project (many interactions)
  • Stable reference material
  • Multiple agents using same context

Cache Hit Indicators

High Cache Hit Scenarios (Good):

  • Multiple features in same project
  • Sequential agent interactions
  • Iterative development (multiple sprints)
  • Code review across similar code
  • Consistent use of same data files

Low Cache Hit Scenarios (Less Benefit):

  • One-off greenfield projects
  • Constantly changing architecture
  • Different tech stacks per task
  • Unique context each time

Caching Best Practices

DO:

  • Load static content first (agent role, principles, data files)
  • Keep loading order consistent across requests
  • Bundle related static files together
  • Cache finalized documents (architecture, requirements)
  • Use cache-aware loading for repetitive tasks

DON'T:

  • Interleave static and dynamic content
  • Change loading order randomly
  • Cache work-in-progress documents
  • Cache user input or conversation
  • Over-cache (don't cache tiny files)

Example: Cache-Optimized Workflow

Task: Implement 5 features in existing project

Without Cache Optimization:

Feature 1: Load role + principles + architecture + task → 8,000 tokens
Feature 2: Load role + principles + architecture + task → 8,000 tokens
Feature 3: Load role + principles + architecture + task → 8,000 tokens
Feature 4: Load role + principles + architecture + task → 8,000 tokens
Feature 5: Load role + principles + architecture + task → 8,000 tokens
TOTAL: 40,000 tokens

With Cache Optimization:

Feature 1: Load [role + principles + architecture]CACHE + task → 8,000 tokens (full charge)
Feature 2: [Cache hit] + task                            →   800 tokens (10% charge)
Feature 3: [Cache hit] + task                            →   800 tokens
Feature 4: [Cache hit] + task                            →   800 tokens
Feature 5: [Cache hit] + task                            →   800 tokens
TOTAL: 11,200 tokens
SAVINGS: 72% (28,800 tokens saved)

Integration with JIT Loading

Hybrid Strategy (Best of Both):

Phase 1: Load Static Cache Block

[CACHE BLOCK - Load Once]
- Agent role
- core-principles.md
- agent-responsibility-matrix.md
- context-loading-guide.md (if needed)
Total: ~7,000 tokens

Phase 2: JIT Load Decision Guides (Also Cache-Friendly)

[CONDITIONAL CACHE - Load If Needed]
- technology-stack-guide.md (if making stack decision)
- security-guidelines.md (if implementing auth)
- api-implementation-patterns.md (if designing API)
Add: ~1,000-3,000 tokens

Phase 3: Dynamic Task Context (Never Cache)

[DYNAMIC - Changes Per Task]
- Current task description
- User messages
- Work artifacts
Add: ~500-1,500 tokens

Result:

  • Stable foundation cached
  • Decision guides cached when needed
  • Dynamic content fresh
  • Optimal token efficiency

Token Accounting with Caching

Revised Token Accounting Format:

=== TOKEN ACCOUNTING (With Cache) ===

CACHEABLE CONTEXT (Static):
- Role definition: ~300 tokens
- core-principles.md: ~5,000 tokens
- architecture.md: ~2,000 tokens
Subtotal: ~7,300 tokens
Cache Status: CACHED (first call: full charge, subsequent: ~10% charge)

DECISION CONTEXT (Semi-Static):
- security-guidelines.md: ~1,000 tokens
Cache Status: CACHED (loaded in previous task)

DYNAMIC CONTEXT (Never Cache):
- Current task: ~500 tokens
- User messages: ~200 tokens
Subtotal: ~700 tokens

TOTAL ESTIMATED:
- First call: ~9,000 tokens
- This call (cache hit): ~1,500 tokens (83% savings)

Summary: Caching Optimization

Key Principles:

  1. Static first, dynamic last - Order matters for caching
  2. Bundle static files - Load together for efficient caching
  3. Consistent structure - Same order maximizes cache hits
  4. Finalize before caching - Don't cache work-in-progress
  5. Cache-aware JIT - Load static guides upfront for repetitive tasks

Expected Impact:

  • Single task: 0-30% savings (cache setup cost)
  • Repetitive tasks: 70-90% savings (massive benefit)
  • Long-term project: 80%+ cumulative savings

Remember: Caching optimizes INPUT token costs. Output quality remains HIGH with comprehensive, detailed responses.

Code Quality Standards

Naming Conventions:

  • Files: kebab-case for utilities, PascalCase for components, camelCase for hooks
  • Variables: camelCase for functions/vars, PascalCase for classes, UPPER_SNAKE_CASE for constants
  • Descriptive: isLoading not loading, handleSubmit not submit
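
A small illustration of these conventions (file names shown as comments; all identifiers are hypothetical):

```typescript
// user-formatter.ts   (utility: kebab-case)
// UserProfile.tsx     (component: PascalCase)
// useAuth.ts          (hook: camelCase)

const MAX_RETRY_COUNT = 3;        // constant: UPPER_SNAKE_CASE

class SessionManager {}           // class: PascalCase

let isLoading = false;            // descriptive boolean, not `loading`

function handleSubmit(formData: FormData): void { // descriptive handler, not `submit`
  isLoading = true;
  // ...
}
```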

TypeScript Rules:

  • No any - use unknown for truly unknown types
  • interface for objects, type for unions/intersections
  • Explicit function return types
  • Generics for reusable type-safe code
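
A brief sketch of these rules in practice:

```typescript
// `unknown` instead of `any`: forces a type check before use.
function parsePayload(raw: unknown): string {
  if (typeof raw === "string") return raw;
  throw new Error("Expected a string payload");
}

// `interface` for object shapes...
interface User {
  id: string;
  email: string;
}

// ...`type` for unions/intersections.
type UserRole = "admin" | "member" | "viewer";

// Explicit return type + generic for reusable, type-safe code.
function pluck<T, K extends keyof T>(items: T[], key: K): T[K][] {
  return items.map((item) => item[key]);
}
```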

Testing Philosophy:

  • Test user interactions, not implementation details
  • Mock external dependencies
  • Integration tests for critical paths
  • Unit tests for complex business logic
  • Aim for >80% coverage on critical code
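
For example, a user-interaction test in this style (a sketch assuming Vitest and React Testing Library; `LoginForm` and its props are hypothetical):

```tsx
import { describe, it, expect, vi } from "vitest";
import { render, screen } from "@testing-library/react";
import userEvent from "@testing-library/user-event";
import { LoginForm } from "./LoginForm"; // hypothetical component

describe("LoginForm", () => {
  it("submits the entered credentials", async () => {
    const onSubmit = vi.fn(); // mock the external dependency
    render(<LoginForm onSubmit={onSubmit} />);

    // Interact the way a user would, not via component internals.
    await userEvent.type(screen.getByLabelText(/email/i), "a@example.com");
    await userEvent.type(screen.getByLabelText(/password/i), "secret123");
    await userEvent.click(screen.getByRole("button", { name: /sign in/i }));

    expect(onSubmit).toHaveBeenCalledWith({
      email: "a@example.com",
      password: "secret123",
    });
  });
});
```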

Error Handling:

  • Custom error classes for different error types
  • Centralized error handling (middleware for backend, error boundaries for frontend)
  • Structured logging with context
  • Never expose stack traces in production
  • Proper HTTP status codes (400 validation, 401 auth, 404 not found, 500 server error)
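
A minimal sketch of this approach, assuming an Express backend (class and middleware names are illustrative):

```typescript
import express, { Request, Response, NextFunction } from "express";

// Custom error class carrying an HTTP status and a safe, user-facing message.
class AppError extends Error {
  constructor(public readonly statusCode: number, message: string) {
    super(message);
  }
}

class NotFoundError extends AppError {
  constructor(resource: string) {
    super(404, `${resource} not found`);
  }
}

// Centralized error middleware: structured log with context, no stack traces leaked.
function errorHandler(err: Error, req: Request, res: Response, _next: NextFunction) {
  const status = err instanceof AppError ? err.statusCode : 500;
  console.error(JSON.stringify({ path: req.path, status, message: err.message }));
  res.status(status).json({
    error: status === 500 ? "Internal server error" : err.message,
  });
}

const app = express();
app.get("/users/:id", (_req, _res) => {
  throw new NotFoundError("User");
});
app.use(errorHandler); // error middleware registered last
```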

Security Standards

Authentication:

  • bcrypt (cost factor 10+) or argon2id for password hashing
  • JWT with short-lived access tokens + long-lived refresh tokens
  • Store refresh tokens in httpOnly cookies or secure DB
  • Implement token rotation on refresh
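
A sketch of this token flow, assuming bcrypt and jsonwebtoken (secret names and expiry values are illustrative):

```typescript
import bcrypt from "bcrypt";
import jwt from "jsonwebtoken";

const BCRYPT_ROUNDS = 12; // cost factor >= 10, per the guideline above

export async function hashPassword(plain: string): Promise<string> {
  return bcrypt.hash(plain, BCRYPT_ROUNDS);
}

export async function verifyPassword(plain: string, hash: string): Promise<boolean> {
  return bcrypt.compare(plain, hash);
}

// Short-lived access token + long-lived refresh token.
export function issueTokens(userId: string) {
  const accessToken = jwt.sign({ sub: userId }, process.env.JWT_SECRET!, {
    expiresIn: "15m",
  });
  const refreshToken = jwt.sign(
    { sub: userId, type: "refresh" },
    process.env.JWT_REFRESH_SECRET!,
    { expiresIn: "7d" }
  );
  // Persist/rotate the refresh token server-side and send it in an httpOnly cookie.
  return { accessToken, refreshToken };
}
```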

Input Validation:

  • Validate ALL user inputs with Zod or Joi
  • Sanitize HTML output to prevent XSS
  • Use parameterized queries to prevent SQL injection
  • Whitelist, never blacklist
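
For instance, validating a request body with Zod before it reaches any query (schema fields are illustrative):

```typescript
import { z } from "zod";

const createUserSchema = z.object({
  email: z.string().email(),
  name: z.string().min(1).max(100),
  age: z.number().int().positive().optional(),
});

export function parseCreateUser(input: unknown) {
  const result = createUserSchema.safeParse(input);
  if (!result.success) {
    // Respond with 400 and the validation issues; never pass raw input onward.
    throw new Error(
      `Validation failed: ${result.error.issues.map((i) => i.message).join(", ")}`
    );
  }
  return result.data; // fully typed, whitelisted fields only
}

// Downstream, always use parameterized queries, e.g.:
// db.query("SELECT * FROM users WHERE email = $1", [data.email]);
```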

API Security:

  • CORS with specific origins (not *)
  • Rate limiting per user/endpoint
  • CSRF protection for state-changing operations
  • Security headers with Helmet.js
  • HTTPS in production
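
A hedged Express setup sketch combining these controls (origins and limits are placeholders):

```typescript
import express from "express";
import helmet from "helmet";
import cors from "cors";
import rateLimit from "express-rate-limit";

const app = express();

app.use(helmet()); // sensible security headers by default

app.use(
  cors({
    origin: ["https://app.example.com"], // explicit origins, never "*"
    credentials: true,
  })
);

app.use(
  rateLimit({
    windowMs: 15 * 60 * 1000, // 15-minute window
    max: 100,                 // requests per window per client
  })
);

// CSRF protection for state-changing routes and HTTPS termination are handled
// at the framework/proxy level and are omitted from this sketch.
```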

Secrets Management:

  • Never commit secrets to version control
  • Use environment variables (.env files)
  • Rotate credentials regularly
  • Principle of least privilege for all credentials and services
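
One common pattern is validating required environment variables at startup (a sketch assuming Zod; the variable names are illustrative):

```typescript
import { z } from "zod";

// Fail fast at boot if a secret is missing or malformed; values come from the
// environment (.env locally, the platform's secret store in production).
const envSchema = z.object({
  DATABASE_URL: z.string().url(),
  JWT_SECRET: z.string().min(32),
  NODE_ENV: z.enum(["development", "test", "production"]).default("development"),
});

export const env = envSchema.parse(process.env);
// Never log `env` or commit .env files; rotate these values regularly.
```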

Performance Standards

Frontend:

  • Code splitting and lazy loading for routes/heavy components
  • Memoization (React.memo, useMemo, useCallback) only when measured benefit
  • Virtual scrolling for long lists
  • Image optimization (next/image or similar)
  • Bundle analysis and tree shaking
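
For example, route-level code splitting with React.lazy (component paths are hypothetical):

```tsx
import { lazy, Suspense } from "react";

// Heavy routes are split into their own bundles and fetched on demand.
const ReportsPage = lazy(() => import("./pages/ReportsPage"));
const SettingsPage = lazy(() => import("./pages/SettingsPage"));

export function AppRoutes({ page }: { page: "reports" | "settings" }) {
  return (
    <Suspense fallback={<p>Loading…</p>}>
      {page === "reports" ? <ReportsPage /> : <SettingsPage />}
    </Suspense>
  );
}
```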

Backend:

  • Database indexes on frequently queried fields
  • Connection pooling
  • Redis caching for frequently accessed data
  • Pagination for large result sets
  • Async/await throughout (no blocking operations)
  • Background jobs for heavy processing
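
A cache-aside plus pagination sketch, assuming ioredis and a generic query layer (key names, TTL, and the `db` placeholder are illustrative):

```typescript
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

// Placeholder for your actual database client (Prisma, pg, knex, ...).
declare const db: { query(sql: string, params: unknown[]): Promise<unknown[]> };

export async function getUserPage(page: number, pageSize = 20) {
  const cacheKey = `users:page:${page}:${pageSize}`;

  // Cache-aside: try Redis first...
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // ...fall back to a paginated, parameterized query (indexed ORDER BY column)...
  const users = await db.query(
    "SELECT id, email FROM users ORDER BY created_at DESC LIMIT $1 OFFSET $2",
    [pageSize, (page - 1) * pageSize]
  );

  // ...then populate the cache with a short TTL so results stay reasonably fresh.
  await redis.set(cacheKey, JSON.stringify(users), "EX", 60);
  return users;
}
```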

Documentation Standards

Code Documentation:

  • JSDoc for public APIs and complex functions
  • README for setup and architecture overview
  • Inline comments for "why", not "what"
  • Keep docs close to code
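
For instance, a JSDoc comment that documents the "why" for a public helper (the function is illustrative):

```typescript
/**
 * Calculates the gross price for display in checkout.
 *
 * Rounds half-up to whole cents because the payment provider rejects
 * sub-cent amounts (the "why" — the "what" is clear from the code).
 *
 * @param amountCents - Net amount in cents
 * @param taxRate - Tax rate as a fraction, e.g. 0.19 for 19%
 * @returns Gross amount in cents
 */
export function grossAmount(amountCents: number, taxRate: number): number {
  return Math.round(amountCents * (1 + taxRate));
}
```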

API Documentation:

  • OpenAPI/Swagger for REST APIs
  • GraphQL SDL with type descriptions
  • Code examples for common use cases
  • Authentication flows documented
  • Error codes and messages explained

Architecture Documentation:

  • Architecture Decision Records (ADRs) for key decisions
  • System diagrams for complex architectures
  • Database schema documentation
  • Deployment guides

Git Standards

Commit Format: <type>(<scope>): <subject>

Types: feat, fix, docs, style, refactor, test, chore

Example: feat(auth): add password reset functionality

PR Requirements:

  • TypeScript compiles with no errors
  • ESLint passes
  • All tests pass, coverage >80%
  • No console.logs or debugger statements
  • Meaningful commits
  • Clear PR description

Working Philosophy

Start Simple: MVP first, add complexity as needed. Monolith before microservices.

Fail Fast: Validate early, catch errors at compile time, comprehensive error handling.

Iterate Quickly: Ship small increments, get feedback, improve continuously.

Automate Everything: Testing, linting, deployment, monitoring. Reduce manual toil.

Monitor and Learn: Track metrics, analyze errors, learn from production, improve continuously.


All agents reference these principles.

Additional References:

  • Implementation details: development-guidelines.md, best-practices.md, security-guidelines.md
  • Agent boundaries: agent-responsibility-matrix.md (who owns what decisions)
  • Context loading: context-loading-guide.md (when to load what)
  • Technology guides: Various data/* files (loaded just-in-time)