# # Core Development Principles Shared philosophy and principles for all JavaScript/TypeScript development agents. These principles guide decision-making across the entire stack. ## Universal Principles **Type Safety First**: Catch errors at compile time, not runtime. Use TypeScript strict mode, explicit types, and leverage inference. **Security First**: Every endpoint authenticated, all inputs validated, all outputs sanitized. Never trust client input. **Developer Experience**: Fast feedback loops, clear error messages, intuitive APIs, comprehensive documentation. **Performance Matters**: Optimize for perceived performance first. Measure before optimizing. Cache strategically. **Maintainability**: Clean code organization, consistent patterns, automated testing, comprehensive documentation. **YAGNI (You Aren't Gonna Need It)**: Don't over-engineer for future needs. Build for 2x scale, not 100x. Prefer boring, proven technology. **Component-First Thinking**: Every piece is reusable, well-tested, and composable (applies to UI components, services, modules). **Clean Architecture**: Separation of concerns, dependency injection, testable code. Controllers handle I/O, services handle logic, repositories handle data. **Observability**: Logging, monitoring, error tracking. You can't improve what you don't measure. ## Context Efficiency Philosophy All agents optimize token usage through **high-signal communication**: **Reference, Don't Repeat**: Point to file paths instead of duplicating content - ✅ "Implementation in `src/services/auth.service.ts`" - ❌ [Pasting entire file contents] **Provide Summaries**: After creating artifacts, give 2-3 sentence overview with file reference - ✅ "Created authentication service with JWT + refresh tokens. See `src/services/auth.service.ts` for implementation." - ❌ [Explaining every line of code written] **Progressive Detail**: Start with high-level structure, add details only when implementing - ✅ Start: "Need authentication with JWT" → Later: "bcrypt rounds, token expiry settings" - ❌ Upfront: Every security configuration detail before deciding approach **Archive Verbose Content**: Keep implementations/discussions in files, reference them - ✅ Long discussions → `docs/decisions/auth-strategy.md` → Reference in checkpoint - ❌ Repeating entire discussion in every message **Checkpoint Summaries**: At phase transitions, compress context into max 100-line summaries containing: - Final decisions and rationale - Artifact file paths - Critical constraints and dependencies - What to do next ## Just-in-Time Context Retrieval **CRITICAL**: Agents must retrieve context **only when making specific decisions**, not upfront. This prevents token waste and maintains focus. ### Start Minimal (Every Task) **Load at Task Start**: - ✅ Your role definition and core principles (this file) - ✅ The specific task description and requirements - ✅ Any explicitly referenced artifacts (checkpoint files, requirements docs) - ❌ NO technology-specific guides yet - ❌ NO implementation patterns yet - ❌ NO best practices yet **Example**: React Developer receives task "Build user profile component" - Load: Role definition, task description, requirements - Skip: state-management-guide.md, react-patterns.md, testing-strategy.md (load later when needed) ### Load on Decision Points (Not Before) **Decision-Triggered Loading**: 1. **Technology Selection** → Load technology-stack-guide.md - ONLY when choosing framework/library - NOT if stack already decided 2. **Implementation Pattern** → Load pattern-specific guide - ONLY when implementing that specific pattern - Example: Loading GraphQL patterns ONLY if building GraphQL API 3. **Security Concern** → Load security-guidelines.md - ONLY when handling auth, sensitive data, or user input - NOT for every task 4. **Performance Issue** → Load performance optimization guides - ONLY when performance requirements specified - NOT preemptively 5. **Deployment Decision** → Load deployment-strategies.md - ONLY when choosing deployment approach - NOT during feature development ### Question Before Loading **Before loading any data file, ask**: - Is this decision being made RIGHT NOW? - Can I defer this decision to implementation? - Is this information in the checkpoint/requirements already? **Examples**: ❌ **Over-Loading**: "Building auth system" → Loading security-guidelines.md, api-implementation-patterns.md, deployment-strategies.md, database-optimization.md - WHY BAD: Most won't be needed yet ✅ **JIT Loading**: "Building auth system" → Start with requirements → Load security-guidelines.md when designing auth flow → Load api-implementation-patterns.md when implementing endpoints → Skip deployment/optimization (not needed yet) ❌ **Over-Loading**: "Design architecture" → Loading ALL data files (technology-stack-guide.md, deployment-strategies.md, security-guidelines.md, best-practices.md, etc.) - WHY BAD: Won't use 80% of it in architecture phase ✅ **JIT Loading**: "Design architecture" → Load technology-stack-guide.md for stack decision → Load deployment-strategies.md for hosting decision → Load security-guidelines.md IF handling sensitive data → Skip implementation patterns (for later) ### Context Loading Decision Tree ``` Task Received ├─ Load: Role + Task + Requirements (ALWAYS) ├─ Checkpoint exists? → Load checkpoint (skip full history) └─ Then ask: What decision am I making RIGHT NOW? │ ├─ Choosing Stack/Framework? │ └─ Load: technology-stack-guide.md │ ├─ Implementing API endpoints? │ ├─ What type? REST/GraphQL/tRPC? │ └─ Load: api-implementation-patterns.md (specific section only) │ ├─ Handling Authentication/Authorization? │ └─ Load: security-guidelines.md │ ├─ Database schema design? │ └─ Load: database-design-patterns.md │ ├─ Deployment/Hosting decision? │ └─ Load: deployment-strategies.md │ ├─ Performance optimization? │ └─ Load: performance-optimization.md │ ├─ Code review/quality check? │ └─ Load: development-guidelines.md + relevant checklist │ └─ General implementation? └─ Load NOTHING extra - use role knowledge (Load specific guides only if encountering decision) ``` ### Role-Specific Context Limits **React Developer**: - Default load: core-principles.md only - Load state-management-guide.md ONLY when choosing state solution - Load component-design-guidelines.md ONLY when building complex component - SKIP: Backend patterns, API specs (unless integrating), database docs **Node Backend Developer**: - Default load: core-principles.md only - Load framework comparison ONLY when choosing Express/Fastify/NestJS - Load database patterns ONLY when writing queries - SKIP: React patterns, frontend state management, CSS guides **Solution Architect**: - Default load: core-principles.md + requirements - Load technology-stack-guide.md for stack decisions - Load deployment-strategies.md for hosting decisions - Load security-guidelines.md IF requirements mention sensitive data - SKIP: Implementation patterns (delegate to implementation agents) **API Developer**: - Default load: core-principles.md only - Load api-implementation-patterns.md for specific API type (REST/GraphQL/tRPC section only) - SKIP: Frontend patterns, deployment strategies, database optimization **TypeScript Expert**: - Default load: core-principles.md only - Load tsconfig reference ONLY when configuring TypeScript - Load advanced patterns ONLY when implementing complex types - SKIP: Framework-specific guides, deployment, testing (unless TS-specific) ### Progressive Context Expansion **3-Stage Loading Pattern**: **Stage 1 - Task Start (Minimal)**: - Role definition - Task requirements - Referenced artifacts only **Stage 2 - Decision Point (Targeted)**: - Load specific guide for current decision - Load ONLY relevant section (not entire file) - Example: Load "REST API" section from api-implementation-patterns.md, skip GraphQL/tRPC sections **Stage 3 - Implementation (On-Demand)**: - Load patterns as you encounter need - Load examples when unclear - Load checklists when validating ### Anti-Patterns (What NOT to Do) ❌ **"Load Everything Just in Case"** - Loading all data files at task start - "Might need it later" is NOT a valid reason ❌ **"Load Full File for One Detail"** - Load specific section only - Reference file path for full details ❌ **"Load Before Decision Needed"** - Don't load deployment guide during feature design - Don't load implementation patterns during architecture phase ❌ **"Load Other Agents' Context"** - Backend developer doesn't need React state management - Frontend developer doesn't need database optimization ### Validation Questions **Before loading any file, ask yourself**: 1. Am I making THIS specific decision RIGHT NOW? 2. Is this my role's responsibility? 3. Can this wait until implementation? 4. Is this already in checkpoint/requirements? **If answer to #1 or #2 is NO → Don't load it** ### Token Budget Awareness **Typical Token Costs**: - Role definition: ~200-300 tokens - Task description: ~100-500 tokens - Data file (full): ~1,000-3,000 tokens - Data file (section): ~200-500 tokens - Checkpoint: ~500-1,000 tokens **Target Budget**: - Task start: <1,000 tokens total - Per decision: +200-500 tokens (specific guide section) - Full workflow: <10,000 tokens cumulative (use checkpoints) **If approaching budget limit**: Create checkpoint, archive verbose context, start fresh phase. ## Runtime Token Monitoring **CRITICAL**: Agents must actively monitor token usage during execution and self-regulate to stay within budgets. ### Token Self-Assessment Protocol **At Task Start**: ``` Task: [task name] Estimated Budget: [X tokens] Loaded: - Role definition: ~250 tokens - core-principles.md: ~300 tokens - Task requirements: ~[Y] tokens - [other files]: ~[Z] tokens TOTAL LOADED: ~[sum] tokens Remaining Budget: [budget - sum] tokens ``` **During Execution** (every 2-3 decisions): ``` Token Check: - Starting context: [X] tokens - Loaded since last check: [Y] tokens - Current estimated total: [X+Y] tokens - Budget: [Z] tokens - Status: [OK | WARNING | EXCEEDED] ``` **Before Loading Any File**: ``` Pre-Load Check: - Current context: ~[X] tokens - File to load: ~[Y] tokens (estimated) - After load: ~[X+Y] tokens - Budget: [Z] tokens - Decision: [PROCEED | SKIP | CHECKPOINT FIRST] ``` ### Checkpoint Triggers **MUST create checkpoint when**: - ✅ Context exceeds 3,000 tokens - ✅ Completing a major phase (architecture → implementation) - ✅ Before loading would exceed budget - ✅ After 5+ agent interactions in sequence - ✅ Detailed discussion exceeds 1,000 tokens **Checkpoint Process**: 1. Estimate current context size 2. Create checkpoint summary (max 100 lines, ~500 tokens) 3. Archive verbose content to `docs/archive/` 4. Start next phase with checkpoint only 5. Token reduction: Expect 80%+ compression **Example**: ``` Phase complete. Estimating context: - Architecture discussion: ~2,500 tokens - Technology decisions: ~1,200 tokens - Requirements: ~800 tokens TOTAL: ~4,500 tokens Action: Creating checkpoint - Checkpoint summary: ~500 tokens - Token reduction: 89% - Archived: Full discussion to docs/archive/architecture-phase/ ``` ### Budget Warning System **Token Status Levels**: 🟢 **GREEN (0-50% of budget)**: - Status: Healthy - Action: Continue normally - Example: 800/2000 tokens used 🟡 **YELLOW (50-75% of budget)**: - Status: Warning - Action: Be selective with additional loads - Example: 1,500/2000 tokens used - Strategy: Load only critical guides, skip nice-to-haves 🟠 **ORANGE (75-90% of budget)**: - Status: Critical - Action: Stop loading new context - Example: 1,700/2000 tokens used - Strategy: Use existing context only, prepare checkpoint 🔴 **RED (>90% of budget)**: - Status: Exceeded - Action: Create checkpoint IMMEDIATELY - Example: 1,850/2000 tokens used - Strategy: Checkpoint NOW, start fresh phase ### Self-Monitoring Format **Maintain Mental Token Accounting**: Throughout task execution, mentally track: ``` === TOKEN ACCOUNTING === FIXED CONTEXT (loaded at start): - Role definition: ~250 - Core principles: ~300 - Task requirements: ~500 Subtotal: ~1,050 tokens DECISION CONTEXT (loaded during execution): - technology-stack-guide.md: ~1,500 - security-guidelines.md: ~1,000 Subtotal: ~2,500 tokens CONVERSATION (generated during work): - Discussion and analysis: ~800 - Code examples: ~400 Subtotal: ~1,200 tokens TOTAL ESTIMATED: ~4,750 tokens BUDGET: 5,000 tokens STATUS: 🟠 ORANGE - Approaching limit ACTION: No more loads, complete task with current context === END ACCOUNTING === ``` ### Estimation Guidelines **Token Estimation Rules**: - **1 word ≈ 1.3 tokens** (on average) - **1 line of code ≈ 15-25 tokens** - **1 markdown paragraph ≈ 50-100 tokens** - **Agent role file ≈ 200-400 tokens** - **Data file (full) ≈ 1,000-3,000 tokens** - **Data file (section) ≈ 200-500 tokens** - **Checkpoint summary ≈ 500-800 tokens** - **Architecture doc ≈ 2,000-4,000 tokens** **Quick Estimation**: - Count words in content - Multiply by 1.3 - Round up for safety **File Size Indicators**: - Small file (<100 lines): ~500 tokens - Medium file (100-300 lines): ~1,000-1,500 tokens - Large file (300-500 lines): ~2,000-3,000 tokens - Very large file (>500 lines): ~3,000+ tokens ### Runtime Optimization Tactics **When Approaching Budget**: 1. **Use Section Loading**: Load only relevant section of large file - Instead of: Full api-implementation-patterns.md (~1,500 tokens) - Load: REST section only (~300 tokens) - Savings: 80% 2. **Reference Don't Load**: Point to file path instead of loading - Instead of: Loading best-practices.md - Say: "Following patterns in best-practices.md section 3.2" - Savings: 100% (no load needed) 3. **Checkpoint Intermediate State**: Don't wait for phase end - If discussion getting long, checkpoint NOW - Compress verbose analysis into key decisions - Continue with lightweight checkpoint 4. **Defer Non-Critical Loads**: Skip nice-to-have context - If 🟡 YELLOW or above, load ONLY must-haves - Skip optimization guides if not performance-critical - Skip best practices if pattern already clear 5. **Use Role Knowledge**: Lean on built-in expertise - Instead of: Loading guide for basic patterns - Use: Agent's inherent role knowledge - When: Standard, well-known patterns ### Multi-Agent Workflow Monitoring **Workflow-Level Tracking**: When multiple agents work sequentially: ``` WORKFLOW TOKEN TRACKING Phase 1: Requirements (Analyst) - Context: ~800 tokens - Output: requirements.md - Checkpoint: ~600 tokens - Handoff: Pass checkpoint only Phase 2: Architecture (Solution Architect) - Receive: checkpoint (~600) - Load: tech-stack-guide (~1,500) - Context: ~2,800 tokens - Output: architecture.md - Checkpoint: ~800 tokens - Handoff: Pass checkpoint only Phase 3: Implementation (Developers) - Receive: checkpoint (~800) - Load: role-specific guides (~500-1,000) - Context: ~1,500-2,000 tokens - No checkpoint needed (final phase) TOTAL WORKFLOW: ~5,100 tokens (with checkpoints) WITHOUT CHECKPOINTS: ~15,000+ tokens (accumulated context) SAVINGS: 66% ``` ### Checkpoint Quality Metrics **Effective Checkpoint Indicators**: - ✅ Reduces context by 80%+ (e.g., 4,000 → 800 tokens) - ✅ Contains all critical decisions - ✅ Includes all artifact paths - ✅ Next agent can proceed without loading previous phase - ✅ Max 100 lines (~500-800 tokens) **Poor Checkpoint Indicators**: - ❌ Only 30-50% reduction (too verbose) - ❌ Missing key decisions or rationale - ❌ Next agent needs to reload original context - ❌ Over 150 lines (defeats purpose) ### Self-Regulation Commitment **Every agent commits to**: 1. **Estimate before loading**: Know token cost before loading any file 2. **Track cumulative usage**: Maintain mental accounting of context size 3. **Respect budget warnings**: React to 🟡🟠🔴 status appropriately 4. **Create checkpoints proactively**: Don't wait for red status 5. **Load sections not files**: When possible, load only relevant sections 6. **Use role knowledge first**: Load guides only when truly needed ### Example: Good Runtime Monitoring ``` Task: Design REST API for user service Budget: 2,000 tokens [Start] Loaded: Role (250) + core-principles (300) + requirements (400) = 950 tokens Status: 🟢 GREEN (48% of budget) [Decision: Need REST patterns] Pre-check: Current 950 + api-patterns REST section (300) = 1,250 Status: 🟢 GREEN (63% of budget) → PROCEED Loaded: REST section Context: 1,250 tokens [Decision: Need auth patterns?] Pre-check: Current 1,250 + security-guidelines (1,000) = 2,250 Status: 🔴 RED (113% of budget) → DEFER Decision: Use basic auth pattern from role knowledge, skip loading guide [Complete] Final context: ~1,400 tokens (with output) Status: 🟢 GREEN (70% of budget) Result: Task complete, stayed within budget ``` ### Example: Bad Runtime Monitoring (Don't Do This) ``` Task: Design REST API for user service Budget: 2,000 tokens [Start] Loaded: Role + core-principles + requirements + api-patterns (full) + security-guidelines + best-practices + deployment-strategies Context: ~5,500 tokens Status: 🔴 RED (275% of budget) - EXCEEDED Problem: Loaded everything upfront without checking budget Result: Token waste, exceeded budget, needed checkpoint immediately ``` ## Prompt Caching Optimization **POWERFUL**: Prompt caching can reduce token costs by ~90% for static content and improve response times by caching reusable context. ### How Prompt Caching Works **Concept**: AI providers cache static portions of prompts across multiple requests, dramatically reducing token costs and latency. **Benefits**: - **Token Savings**: ~90% reduction for cached content (5,000 tokens → 500 tokens charged) - **Speed**: Faster response times (cached content pre-processed) - **Cost**: Significant cost reduction for repeated context **Requirements**: - Content must be **static** (doesn't change between requests) - Content must be **large enough** (typically >1,000 tokens) to benefit from caching - Content must be in the **prefix** (beginning of prompt, before dynamic content) ### Cache-Friendly Content Structure **Optimal Loading Order** (Stable → Dynamic): ``` 1. CACHEABLE (Static - Load First) ├─ Agent role definition (changes rarely) ├─ core-principles.md (stable reference) ├─ Data files (technology-stack-guide.md, etc. - stable reference) ├─ Architecture documents (stable after creation) └─ Checkpoints (stable after creation) 2. SEMI-CACHEABLE (Changes Occasionally) ├─ Task templates (stable structure, variable content) ├─ Workflow definitions (stable structure) └─ Requirements docs (stable after approval) 3. NON-CACHEABLE (Dynamic - Load Last) ├─ Current task description (unique per task) ├─ User questions/requests (unique per interaction) ├─ Conversation history (constantly changing) └─ Real-time data (changes frequently) ``` **Golden Rule**: **Static content first, dynamic content last** ### What to Cache vs Not Cache **✅ CACHE (Static Reference Material)**: - Agent role definitions - core-principles.md - technology-stack-guide.md - security-guidelines.md - api-implementation-patterns.md - deployment-strategies.md - development-guidelines.md - architecture-patterns.md - All data/* files (stable reference) - Completed architecture documents - Finalized checkpoints **⚠️ CACHE WITH CARE (Semi-Static)**: - Requirements documents (after finalization) - Architecture documents (after approval) - API specifications (after freeze) - Checkpoints (after phase completion) **❌ DON'T CACHE (Dynamic)**: - Current task description - User questions/messages - Conversation history - Work-in-progress documents - Code being reviewed - Intermediate decisions - Debug output - Error messages ### Cache-Optimized Loading Pattern **Pattern 1: Static Foundation First** ``` LOAD ORDER for cache efficiency: [CACHEABLE BLOCK - Load First] 1. Agent role definition (~300 tokens) ─┐ 2. core-principles.md (~5,000 tokens) │ 3. technology-stack-guide.md (~2,000) ├─ CACHE THIS (~8,500 tokens) 4. security-guidelines.md (~1,200) │ Saves ~7,650 tokens on subsequent calls ─┘ [DYNAMIC BLOCK - Load Last] 5. Current task description (~500 tokens) ─ Don't cache (changes every task) 6. User request (~200 tokens) ─ Don't cache (unique) ``` **Result**: - First call: ~8,700 tokens charged - Subsequent calls: ~850 tokens charged (only dynamic content) - Savings: ~90% per call after first **Pattern 2: Task-Specific Caching** For repetitive tasks (e.g., multiple features in same codebase): ``` [STABLE CACHE - Reuse Across Tasks] 1. Agent role 2. core-principles.md 3. Project architecture (finalized) 4. Tech stack decisions (finalized) 5. API specifications (frozen) [TASK CACHE - Reuse Within Task] 6. Feature requirements (current feature) 7. Architecture checkpoint (current feature) [NEVER CACHE] 8. Current subtask 9. User messages ``` ### Cache Invalidation Strategy **When to Refresh Cache**: 🔄 **REFRESH IMMEDIATELY** when: - core-principles.md updated - Architecture significantly changed - Technology decisions revised - Security guidelines updated - Major refactoring completed ⏱️ **REFRESH PERIODICALLY**: - Data files: Every major version update - Agent roles: When capabilities change - Best practices: When standards evolve ✅ **KEEP CACHED**: - Stable reference material - Finalized architecture - Approved requirements - Completed checkpoints **Cache Lifetime Guidance**: - **Long-lived** (days/weeks): core-principles, data files, agent roles - **Medium-lived** (hours/days): architecture docs, requirements - **Short-lived** (minutes): task context, checkpoints - **No cache** (per request): user input, conversation ### Workflow-Level Caching **Greenfield Project Workflow**: ``` Phase 1: Requirements [CACHE] - Analyst role - core-principles.md [NO CACHE] - Stakeholder input - Requirements being drafted Phase 2: Architecture [CACHE] - Solution Architect role - core-principles.md - technology-stack-guide.md - Requirements (now finalized) [NO CACHE] - Architecture being designed Phase 3: Implementation [CACHE] - Developer roles - core-principles.md - Architecture docs (now finalized) - Checkpoints (now stable) [NO CACHE] - Current feature - Code being written CACHE BENEFIT: Each phase reuses stable context from previous phases ``` ### Cache Size Optimization **Optimal Cache Sizes**: - **Minimum**: 1,000+ tokens (smaller = not worth caching overhead) - **Sweet Spot**: 5,000-15,000 tokens (balance reuse vs memory) - **Maximum**: Per provider limits (Claude: up to ~3-4 cache blocks) **Strategies**: 1. **Bundle Static Files**: Load related static files together in cacheable block ``` GOOD: Load all data/* files together (one cache block) BAD: Load data files separately interspersed with dynamic content ``` 2. **Stable Prefix**: Keep structure consistent across requests ``` GOOD: Always load role → principles → guides in same order BAD: Random order each time (breaks cache) ``` 3. **Cache Boundaries**: Clear separation between static and dynamic ``` GOOD: [Static block] then [Dynamic block] BAD: Static → Dynamic → Static → Dynamic (cache thrashing) ``` ### Cache-Aware JIT Loading **Modify JIT Pattern for Caching**: **Standard JIT** (No Caching Consideration): - Load role + task - Load guide when needed - Load another guide when needed - Load dynamic content **Cache-Aware JIT**: - Load ALL static guides upfront (cache block) - Load dynamic task content - Reference cached guides as needed - NO additional loads (all guides pre-cached) **When to Use Each**: Use **Standard JIT** when: - One-off task (no cache benefit) - Highly dynamic workflow - Frequently changing context Use **Cache-Aware Loading** when: - Repetitive tasks (multiple features) - Long-running project (many interactions) - Stable reference material - Multiple agents using same context ### Cache Hit Indicators **High Cache Hit Scenarios** (Good): - ✅ Multiple features in same project - ✅ Sequential agent interactions - ✅ Iterative development (multiple sprints) - ✅ Code review across similar code - ✅ Consistent use of same data files **Low Cache Hit Scenarios** (Less Benefit): - ❌ One-off greenfield projects - ❌ Constantly changing architecture - ❌ Different tech stacks per task - ❌ Unique context each time ### Caching Best Practices **DO**: - ✅ Load static content first (agent role, principles, data files) - ✅ Keep loading order consistent across requests - ✅ Bundle related static files together - ✅ Cache finalized documents (architecture, requirements) - ✅ Use cache-aware loading for repetitive tasks **DON'T**: - ❌ Interleave static and dynamic content - ❌ Change loading order randomly - ❌ Cache work-in-progress documents - ❌ Cache user input or conversation - ❌ Over-cache (don't cache tiny files) ### Example: Cache-Optimized Workflow **Task**: Implement 5 features in existing project **Without Cache Optimization**: ``` Feature 1: Load role + principles + architecture + task → 8,000 tokens Feature 2: Load role + principles + architecture + task → 8,000 tokens Feature 3: Load role + principles + architecture + task → 8,000 tokens Feature 4: Load role + principles + architecture + task → 8,000 tokens Feature 5: Load role + principles + architecture + task → 8,000 tokens TOTAL: 40,000 tokens ``` **With Cache Optimization**: ``` Feature 1: Load [role + principles + architecture]CACHE + task → 8,000 tokens (full charge) Feature 2: [Cache hit] + task → 800 tokens (10% charge) Feature 3: [Cache hit] + task → 800 tokens Feature 4: [Cache hit] + task → 800 tokens Feature 5: [Cache hit] + task → 800 tokens TOTAL: 11,200 tokens SAVINGS: 72% (28,800 tokens saved) ``` ### Integration with JIT Loading **Hybrid Strategy** (Best of Both): **Phase 1: Load Static Cache Block** ``` [CACHE BLOCK - Load Once] - Agent role - core-principles.md - agent-responsibility-matrix.md - context-loading-guide.md (if needed) Total: ~7,000 tokens ``` **Phase 2: JIT Load Decision Guides** (Also Cache-Friendly) ``` [CONDITIONAL CACHE - Load If Needed] - technology-stack-guide.md (if making stack decision) - security-guidelines.md (if implementing auth) - api-implementation-patterns.md (if designing API) Add: ~1,000-3,000 tokens ``` **Phase 3: Dynamic Task Context** (Never Cache) ``` [DYNAMIC - Changes Per Task] - Current task description - User messages - Work artifacts Add: ~500-1,500 tokens ``` **Result**: - Stable foundation cached - Decision guides cached when needed - Dynamic content fresh - Optimal token efficiency ### Token Accounting with Caching **Revised Token Accounting Format**: ``` === TOKEN ACCOUNTING (With Cache) === CACHEABLE CONTEXT (Static): - Role definition: ~300 tokens - core-principles.md: ~5,000 tokens - architecture.md: ~2,000 tokens Subtotal: ~7,300 tokens Cache Status: CACHED (first call: full charge, subsequent: ~10% charge) DECISION CONTEXT (Semi-Static): - security-guidelines.md: ~1,000 tokens Cache Status: CACHED (loaded in previous task) DYNAMIC CONTEXT (Never Cache): - Current task: ~500 tokens - User messages: ~200 tokens Subtotal: ~700 tokens TOTAL ESTIMATED: - First call: ~9,000 tokens - This call (cache hit): ~1,500 tokens (83% savings) ``` ### Summary: Caching Optimization **Key Principles**: 1. **Static first, dynamic last** - Order matters for caching 2. **Bundle static files** - Load together for efficient caching 3. **Consistent structure** - Same order maximizes cache hits 4. **Finalize before caching** - Don't cache work-in-progress 5. **Cache-aware JIT** - Load static guides upfront for repetitive tasks **Expected Impact**: - **Single task**: 0-30% savings (cache setup cost) - **Repetitive tasks**: 70-90% savings (massive benefit) - **Long-term project**: 80%+ cumulative savings **Remember**: Caching optimizes INPUT token costs. Output quality remains HIGH with comprehensive, detailed responses. ## Code Quality Standards **Naming Conventions**: - Files: kebab-case for utilities, PascalCase for components, camelCase for hooks - Variables: camelCase for functions/vars, PascalCase for classes, UPPER_SNAKE_CASE for constants - Descriptive: `isLoading` not `loading`, `handleSubmit` not `submit` **TypeScript Rules**: - No `any` - use `unknown` for truly unknown types - `interface` for objects, `type` for unions/intersections - Explicit function return types - Generics for reusable type-safe code **Testing Philosophy**: - Test user interactions, not implementation details - Mock external dependencies - Integration tests for critical paths - Unit tests for complex business logic - Aim for >80% coverage on critical code **Error Handling**: - Custom error classes for different error types - Centralized error handling (middleware for backend, error boundaries for frontend) - Structured logging with context - Never expose stack traces in production - Proper HTTP status codes (400 validation, 401 auth, 404 not found, 500 server error) ## Security Standards **Authentication**: - bcrypt/argon2 for password hashing (10+ rounds) - JWT with short-lived access tokens + long-lived refresh tokens - Store refresh tokens in httpOnly cookies or secure DB - Implement token rotation on refresh **Input Validation**: - Validate ALL user inputs with Zod or Joi - Sanitize HTML output to prevent XSS - Use parameterized queries to prevent SQL injection - Whitelist, never blacklist **API Security**: - CORS with specific origins (not *) - Rate limiting per user/endpoint - CSRF protection for state-changing operations - Security headers with Helmet.js - HTTPS in production **Secrets Management**: - Never commit secrets to version control - Use environment variables (.env files) - Rotate credentials regularly - Minimal privilege principle ## Performance Standards **Frontend**: - Code splitting and lazy loading for routes/heavy components - Memoization (React.memo, useMemo, useCallback) only when measured benefit - Virtual scrolling for long lists - Image optimization (next/image or similar) - Bundle analysis and tree shaking **Backend**: - Database indexes on frequently queried fields - Connection pooling - Redis caching for frequently accessed data - Pagination for large result sets - Async/await throughout (no blocking operations) - Background jobs for heavy processing ## Documentation Standards **Code Documentation**: - JSDoc for public APIs and complex functions - README for setup and architecture overview - Inline comments for "why", not "what" - Keep docs close to code **API Documentation**: - OpenAPI/Swagger for REST APIs - GraphQL SDL with type descriptions - Code examples for common use cases - Authentication flows documented - Error codes and messages explained **Architecture Documentation**: - Architecture Decision Records (ADRs) for key decisions - System diagrams for complex architectures - Database schema documentation - Deployment guides ## Git Standards **Commit Format**: `(): ` **Types**: feat, fix, docs, style, refactor, test, chore **Example**: `feat(auth): add password reset functionality` **PR Requirements**: - TypeScript compiles with no errors - ESLint passes - All tests pass, coverage >80% - No console.logs or debugger statements - Meaningful commits - Clear PR description ## Working Philosophy **Start Simple**: MVP first, add complexity as needed. Monolith before microservices. **Fail Fast**: Validate early, catch errors at compile time, comprehensive error handling. **Iterate Quickly**: Ship small increments, get feedback, improve continuously. **Automate Everything**: Testing, linting, deployment, monitoring. Reduce manual toil. **Monitor and Learn**: Track metrics, analyze errors, learn from production, improve continuously. --- **All agents reference these principles.** **Additional References**: - Implementation details: `development-guidelines.md`, `best-practices.md`, `security-guidelines.md` - Agent boundaries: `agent-responsibility-matrix.md` (who owns what decisions) - Context loading: `context-loading-guide.md` (when to load what) - Technology guides: Various data/* files (loaded just-in-time)