32 KiB

Raw Blame History

Core Development Principles

Shared philosophy and principles for all JavaScript/TypeScript development agents. These principles guide decision-making across the entire stack.

Universal Principles

Type Safety First: Catch errors at compile time, not runtime. Use TypeScript strict mode, explicit types, and leverage inference.

Security First: Every endpoint authenticated, all inputs validated, all outputs sanitized. Never trust client input.

Developer Experience: Fast feedback loops, clear error messages, intuitive APIs, comprehensive documentation.

Performance Matters: Optimize for perceived performance first. Measure before optimizing. Cache strategically.

Maintainability: Clean code organization, consistent patterns, automated testing, comprehensive documentation.

YAGNI (You Aren't Gonna Need It): Don't over-engineer for future needs. Build for 2x scale, not 100x. Prefer boring, proven technology.

Component-First Thinking: Every piece is reusable, well-tested, and composable (applies to UI components, services, modules).

Clean Architecture: Separation of concerns, dependency injection, testable code. Controllers handle I/O, services handle logic, repositories handle data.

Observability: Logging, monitoring, error tracking. You can't improve what you don't measure.

Context Efficiency Philosophy

All agents optimize token usage through high-signal communication:

Reference, Don't Repeat: Point to file paths instead of duplicating content

✅ "Implementation in src/services/auth.service.ts"
❌ [Pasting entire file contents]

Provide Summaries: After creating artifacts, give 2-3 sentence overview with file reference

✅ "Created authentication service with JWT + refresh tokens. See src/services/auth.service.ts for implementation."
❌ [Explaining every line of code written]

Progressive Detail: Start with high-level structure, add details only when implementing

✅ Start: "Need authentication with JWT" → Later: "bcrypt rounds, token expiry settings"
❌ Upfront: Every security configuration detail before deciding approach

Archive Verbose Content: Keep implementations/discussions in files, reference them

✅ Long discussions → docs/decisions/auth-strategy.md → Reference in checkpoint
❌ Repeating entire discussion in every message

Checkpoint Summaries: At phase transitions, compress context into max 100-line summaries containing:

Final decisions and rationale
Artifact file paths
Critical constraints and dependencies
What to do next

Just-in-Time Context Retrieval

CRITICAL: Agents must retrieve context only when making specific decisions, not upfront. This prevents token waste and maintains focus.

Start Minimal (Every Task)

Load at Task Start:

✅ Your role definition and core principles (this file)
✅ The specific task description and requirements
✅ Any explicitly referenced artifacts (checkpoint files, requirements docs)
❌ NO technology-specific guides yet
❌ NO implementation patterns yet
❌ NO best practices yet

Example: React Developer receives task "Build user profile component"

Load: Role definition, task description, requirements
Skip: state-management-guide.md, react-patterns.md, testing-strategy.md (load later when needed)

Load on Decision Points (Not Before)

Decision-Triggered Loading:

Technology Selection → Load technology-stack-guide.md
- ONLY when choosing framework/library
- NOT if stack already decided
Implementation Pattern → Load pattern-specific guide
- ONLY when implementing that specific pattern
- Example: Loading GraphQL patterns ONLY if building GraphQL API
Security Concern → Load security-guidelines.md
- ONLY when handling auth, sensitive data, or user input
- NOT for every task
Performance Issue → Load performance optimization guides
- ONLY when performance requirements specified
- NOT preemptively
Deployment Decision → Load deployment-strategies.md
- ONLY when choosing deployment approach
- NOT during feature development

Question Before Loading

Before loading any data file, ask:

Is this decision being made RIGHT NOW?
Can I defer this decision to implementation?
Is this information in the checkpoint/requirements already?

Examples:

❌ Over-Loading: "Building auth system" → Loading security-guidelines.md, api-implementation-patterns.md, deployment-strategies.md, database-optimization.md

WHY BAD: Most won't be needed yet

✅ JIT Loading: "Building auth system" → Start with requirements → Load security-guidelines.md when designing auth flow → Load api-implementation-patterns.md when implementing endpoints → Skip deployment/optimization (not needed yet)

❌ Over-Loading: "Design architecture" → Loading ALL data files (technology-stack-guide.md, deployment-strategies.md, security-guidelines.md, best-practices.md, etc.)

WHY BAD: Won't use 80% of it in architecture phase

✅ JIT Loading: "Design architecture" → Load technology-stack-guide.md for stack decision → Load deployment-strategies.md for hosting decision → Load security-guidelines.md IF handling sensitive data → Skip implementation patterns (for later)

Context Loading Decision Tree

Task Received
├─ Load: Role + Task + Requirements (ALWAYS)
├─ Checkpoint exists? → Load checkpoint (skip full history)
└─ Then ask: What decision am I making RIGHT NOW?
   │
   ├─ Choosing Stack/Framework?
   │  └─ Load: technology-stack-guide.md
   │
   ├─ Implementing API endpoints?
   │  ├─ What type? REST/GraphQL/tRPC?
   │  └─ Load: api-implementation-patterns.md (specific section only)
   │
   ├─ Handling Authentication/Authorization?
   │  └─ Load: security-guidelines.md
   │
   ├─ Database schema design?
   │  └─ Load: database-design-patterns.md
   │
   ├─ Deployment/Hosting decision?
   │  └─ Load: deployment-strategies.md
   │
   ├─ Performance optimization?
   │  └─ Load: performance-optimization.md
   │
   ├─ Code review/quality check?
   │  └─ Load: development-guidelines.md + relevant checklist
   │
   └─ General implementation?
      └─ Load NOTHING extra - use role knowledge
         (Load specific guides only if encountering decision)

Role-Specific Context Limits

React Developer:

Default load: core-principles.md only
Load state-management-guide.md ONLY when choosing state solution
Load component-design-guidelines.md ONLY when building complex component
SKIP: Backend patterns, API specs (unless integrating), database docs

Node Backend Developer:

Default load: core-principles.md only
Load framework comparison ONLY when choosing Express/Fastify/NestJS
Load database patterns ONLY when writing queries
SKIP: React patterns, frontend state management, CSS guides

Solution Architect:

Default load: core-principles.md + requirements
Load technology-stack-guide.md for stack decisions
Load deployment-strategies.md for hosting decisions
Load security-guidelines.md IF requirements mention sensitive data
SKIP: Implementation patterns (delegate to implementation agents)

API Developer:

Default load: core-principles.md only
Load api-implementation-patterns.md for specific API type (REST/GraphQL/tRPC section only)
SKIP: Frontend patterns, deployment strategies, database optimization

TypeScript Expert:

Default load: core-principles.md only
Load tsconfig reference ONLY when configuring TypeScript
Load advanced patterns ONLY when implementing complex types
SKIP: Framework-specific guides, deployment, testing (unless TS-specific)

Progressive Context Expansion

3-Stage Loading Pattern:

Stage 1 - Task Start (Minimal):

Role definition
Task requirements
Referenced artifacts only

Stage 2 - Decision Point (Targeted):

Load specific guide for current decision
Load ONLY relevant section (not entire file)
Example: Load "REST API" section from api-implementation-patterns.md, skip GraphQL/tRPC sections

Stage 3 - Implementation (On-Demand):

Load patterns as you encounter need
Load examples when unclear
Load checklists when validating

Anti-Patterns (What NOT to Do)

❌ "Load Everything Just in Case"

Loading all data files at task start
"Might need it later" is NOT a valid reason

❌ "Load Full File for One Detail"

Load specific section only
Reference file path for full details

❌ "Load Before Decision Needed"

Don't load deployment guide during feature design
Don't load implementation patterns during architecture phase

❌ "Load Other Agents' Context"

Backend developer doesn't need React state management
Frontend developer doesn't need database optimization

Validation Questions

Before loading any file, ask yourself:

Am I making THIS specific decision RIGHT NOW?
Is this my role's responsibility?
Can this wait until implementation?
Is this already in checkpoint/requirements?

If answer to #1 or #2 is NO → Don't load it

Token Budget Awareness

Typical Token Costs:

Role definition: ~200-300 tokens
Task description: ~100-500 tokens
Data file (full): ~1,000-3,000 tokens
Data file (section): ~200-500 tokens
Checkpoint: ~500-1,000 tokens

Target Budget:

Task start: <1,000 tokens total
Per decision: +200-500 tokens (specific guide section)
Full workflow: <10,000 tokens cumulative (use checkpoints)

If approaching budget limit: Create checkpoint, archive verbose context, start fresh phase.

Runtime Token Monitoring

CRITICAL: Agents must actively monitor token usage during execution and self-regulate to stay within budgets.

Token Self-Assessment Protocol

At Task Start:

Task: [task name]
Estimated Budget: [X tokens]

Loaded:
- Role definition: ~250 tokens
- core-principles.md: ~300 tokens
- Task requirements: ~[Y] tokens
- [other files]: ~[Z] tokens
TOTAL LOADED: ~[sum] tokens

Remaining Budget: [budget - sum] tokens

During Execution (every 2-3 decisions):

Token Check:
- Starting context: [X] tokens
- Loaded since last check: [Y] tokens
- Current estimated total: [X+Y] tokens
- Budget: [Z] tokens
- Status: [OK | WARNING | EXCEEDED]

Before Loading Any File:

Pre-Load Check:
- Current context: ~[X] tokens
- File to load: ~[Y] tokens (estimated)
- After load: ~[X+Y] tokens
- Budget: [Z] tokens
- Decision: [PROCEED | SKIP | CHECKPOINT FIRST]

Checkpoint Triggers

MUST create checkpoint when:

✅ Context exceeds 3,000 tokens
✅ Completing a major phase (architecture → implementation)
✅ Before loading would exceed budget
✅ After 5+ agent interactions in sequence
✅ Detailed discussion exceeds 1,000 tokens

Checkpoint Process:

Estimate current context size
Create checkpoint summary (max 100 lines, ~500 tokens)
Archive verbose content to docs/archive/
Start next phase with checkpoint only
Token reduction: Expect 80%+ compression

Example:

Phase complete. Estimating context:
- Architecture discussion: ~2,500 tokens
- Technology decisions: ~1,200 tokens
- Requirements: ~800 tokens
TOTAL: ~4,500 tokens

Action: Creating checkpoint
- Checkpoint summary: ~500 tokens
- Token reduction: 89%
- Archived: Full discussion to docs/archive/architecture-phase/

Budget Warning System

Token Status Levels:

🟢 GREEN (0-50% of budget):

Status: Healthy
Action: Continue normally
Example: 800/2000 tokens used

🟡 YELLOW (50-75% of budget):

Status: Warning
Action: Be selective with additional loads
Example: 1,500/2000 tokens used
Strategy: Load only critical guides, skip nice-to-haves

🟠 ORANGE (75-90% of budget):

Status: Critical
Action: Stop loading new context
Example: 1,700/2000 tokens used
Strategy: Use existing context only, prepare checkpoint

🔴 RED (>90% of budget):

Status: Exceeded
Action: Create checkpoint IMMEDIATELY
Example: 1,850/2000 tokens used
Strategy: Checkpoint NOW, start fresh phase

Self-Monitoring Format

Maintain Mental Token Accounting:

Throughout task execution, mentally track:

=== TOKEN ACCOUNTING ===

FIXED CONTEXT (loaded at start):
- Role definition: ~250
- Core principles: ~300
- Task requirements: ~500
Subtotal: ~1,050 tokens

DECISION CONTEXT (loaded during execution):
- technology-stack-guide.md: ~1,500
- security-guidelines.md: ~1,000
Subtotal: ~2,500 tokens

CONVERSATION (generated during work):
- Discussion and analysis: ~800
- Code examples: ~400
Subtotal: ~1,200 tokens

TOTAL ESTIMATED: ~4,750 tokens
BUDGET: 5,000 tokens
STATUS: 🟠 ORANGE - Approaching limit
ACTION: No more loads, complete task with current context

=== END ACCOUNTING ===

Estimation Guidelines

Token Estimation Rules:

1 word ≈ 1.3 tokens (on average)
1 line of code ≈ 15-25 tokens
1 markdown paragraph ≈ 50-100 tokens
Agent role file ≈ 200-400 tokens
Data file (full) ≈ 1,000-3,000 tokens
Data file (section) ≈ 200-500 tokens
Checkpoint summary ≈ 500-800 tokens
Architecture doc ≈ 2,000-4,000 tokens

Quick Estimation:

Count words in content
Multiply by 1.3
Round up for safety

File Size Indicators:

Small file (<100 lines): ~500 tokens
Medium file (100-300 lines): ~1,000-1,500 tokens
Large file (300-500 lines): ~2,000-3,000 tokens
Very large file (>500 lines): ~3,000+ tokens

Runtime Optimization Tactics

When Approaching Budget:

Use Section Loading: Load only relevant section of large file
- Instead of: Full api-implementation-patterns.md (~1,500 tokens)
- Load: REST section only (~300 tokens)
- Savings: 80%
Reference Don't Load: Point to file path instead of loading
- Instead of: Loading best-practices.md
- Say: "Following patterns in best-practices.md section 3.2"
- Savings: 100% (no load needed)
Checkpoint Intermediate State: Don't wait for phase end
- If discussion getting long, checkpoint NOW
- Compress verbose analysis into key decisions
- Continue with lightweight checkpoint
Defer Non-Critical Loads: Skip nice-to-have context
- If 🟡 YELLOW or above, load ONLY must-haves
- Skip optimization guides if not performance-critical
- Skip best practices if pattern already clear
Use Role Knowledge: Lean on built-in expertise
- Instead of: Loading guide for basic patterns
- Use: Agent's inherent role knowledge
- When: Standard, well-known patterns

Multi-Agent Workflow Monitoring

Workflow-Level Tracking:

When multiple agents work sequentially:

WORKFLOW TOKEN TRACKING

Phase 1: Requirements (Analyst)
- Context: ~800 tokens
- Output: requirements.md
- Checkpoint: ~600 tokens
- Handoff: Pass checkpoint only

Phase 2: Architecture (Solution Architect)
- Receive: checkpoint (~600)
- Load: tech-stack-guide (~1,500)
- Context: ~2,800 tokens
- Output: architecture.md
- Checkpoint: ~800 tokens
- Handoff: Pass checkpoint only

Phase 3: Implementation (Developers)
- Receive: checkpoint (~800)
- Load: role-specific guides (~500-1,000)
- Context: ~1,500-2,000 tokens
- No checkpoint needed (final phase)

TOTAL WORKFLOW: ~5,100 tokens (with checkpoints)
WITHOUT CHECKPOINTS: ~15,000+ tokens (accumulated context)
SAVINGS: 66%

Checkpoint Quality Metrics

Effective Checkpoint Indicators:

✅ Reduces context by 80%+ (e.g., 4,000 → 800 tokens)
✅ Contains all critical decisions
✅ Includes all artifact paths
✅ Next agent can proceed without loading previous phase
✅ Max 100 lines (~500-800 tokens)

Poor Checkpoint Indicators:

❌ Only 30-50% reduction (too verbose)
❌ Missing key decisions or rationale
❌ Next agent needs to reload original context
❌ Over 150 lines (defeats purpose)

Self-Regulation Commitment

Every agent commits to:

Estimate before loading: Know token cost before loading any file
Track cumulative usage: Maintain mental accounting of context size
Respect budget warnings: React to 🟡🟠🔴 status appropriately
Create checkpoints proactively: Don't wait for red status
Load sections not files: When possible, load only relevant sections
Use role knowledge first: Load guides only when truly needed

Example: Good Runtime Monitoring

Task: Design REST API for user service
Budget: 2,000 tokens

[Start]
Loaded: Role (250) + core-principles (300) + requirements (400) = 950 tokens
Status: 🟢 GREEN (48% of budget)

[Decision: Need REST patterns]
Pre-check: Current 950 + api-patterns REST section (300) = 1,250
Status: 🟢 GREEN (63% of budget) → PROCEED

Loaded: REST section
Context: 1,250 tokens

[Decision: Need auth patterns?]
Pre-check: Current 1,250 + security-guidelines (1,000) = 2,250
Status: 🔴 RED (113% of budget) → DEFER
Decision: Use basic auth pattern from role knowledge, skip loading guide

[Complete]
Final context: ~1,400 tokens (with output)
Status: 🟢 GREEN (70% of budget)
Result: Task complete, stayed within budget

Example: Bad Runtime Monitoring (Don't Do This)

Task: Design REST API for user service
Budget: 2,000 tokens

[Start]
Loaded: Role + core-principles + requirements + api-patterns (full) +
        security-guidelines + best-practices + deployment-strategies
Context: ~5,500 tokens
Status: 🔴 RED (275% of budget) - EXCEEDED

Problem: Loaded everything upfront without checking budget
Result: Token waste, exceeded budget, needed checkpoint immediately

Prompt Caching Optimization

POWERFUL: Prompt caching can reduce token costs by ~90% for static content and improve response times by caching reusable context.

How Prompt Caching Works

Concept: AI providers cache static portions of prompts across multiple requests, dramatically reducing token costs and latency.

Benefits:

Token Savings: ~90% reduction for cached content (5,000 tokens → 500 tokens charged)
Speed: Faster response times (cached content pre-processed)
Cost: Significant cost reduction for repeated context

Requirements:

Content must be static (doesn't change between requests)
Content must be large enough (typically >1,000 tokens) to benefit from caching
Content must be in the prefix (beginning of prompt, before dynamic content)

Cache-Friendly Content Structure

Optimal Loading Order (Stable → Dynamic):

1. CACHEABLE (Static - Load First)
├─ Agent role definition (changes rarely)
├─ core-principles.md (stable reference)
├─ Data files (technology-stack-guide.md, etc. - stable reference)
├─ Architecture documents (stable after creation)
└─ Checkpoints (stable after creation)

2. SEMI-CACHEABLE (Changes Occasionally)
├─ Task templates (stable structure, variable content)
├─ Workflow definitions (stable structure)
└─ Requirements docs (stable after approval)

3. NON-CACHEABLE (Dynamic - Load Last)
├─ Current task description (unique per task)
├─ User questions/requests (unique per interaction)
├─ Conversation history (constantly changing)
└─ Real-time data (changes frequently)

Golden Rule: Static content first, dynamic content last

What to Cache vs Not Cache

✅ CACHE (Static Reference Material):

Agent role definitions
core-principles.md
technology-stack-guide.md
security-guidelines.md
api-implementation-patterns.md
deployment-strategies.md
development-guidelines.md
architecture-patterns.md
All data/* files (stable reference)
Completed architecture documents
Finalized checkpoints

⚠️ CACHE WITH CARE (Semi-Static):

Requirements documents (after finalization)
Architecture documents (after approval)
API specifications (after freeze)
Checkpoints (after phase completion)

❌ DON'T CACHE (Dynamic):

Current task description
User questions/messages
Conversation history
Work-in-progress documents
Code being reviewed
Intermediate decisions
Debug output
Error messages

Cache-Optimized Loading Pattern

Pattern 1: Static Foundation First

LOAD ORDER for cache efficiency:

[CACHEABLE BLOCK - Load First]
1. Agent role definition (~300 tokens) ─┐
2. core-principles.md (~5,000 tokens)   │
3. technology-stack-guide.md (~2,000)   ├─ CACHE THIS (~8,500 tokens)
4. security-guidelines.md (~1,200)      │  Saves ~7,650 tokens on subsequent calls
                                        ─┘
[DYNAMIC BLOCK - Load Last]
5. Current task description (~500 tokens) ─ Don't cache (changes every task)
6. User request (~200 tokens)             ─ Don't cache (unique)

Result:

First call: ~8,700 tokens charged
Subsequent calls: ~850 tokens charged (only dynamic content)
Savings: ~90% per call after first

Pattern 2: Task-Specific Caching

For repetitive tasks (e.g., multiple features in same codebase):

[STABLE CACHE - Reuse Across Tasks]
1. Agent role
2. core-principles.md
3. Project architecture (finalized)
4. Tech stack decisions (finalized)
5. API specifications (frozen)

[TASK CACHE - Reuse Within Task]
6. Feature requirements (current feature)
7. Architecture checkpoint (current feature)

[NEVER CACHE]
8. Current subtask
9. User messages

Cache Invalidation Strategy

When to Refresh Cache:

🔄 REFRESH IMMEDIATELY when:

core-principles.md updated
Architecture significantly changed
Technology decisions revised
Security guidelines updated
Major refactoring completed

⏱️ REFRESH PERIODICALLY:

Data files: Every major version update
Agent roles: When capabilities change
Best practices: When standards evolve

✅ KEEP CACHED:

Stable reference material
Finalized architecture
Approved requirements
Completed checkpoints

Cache Lifetime Guidance:

Long-lived (days/weeks): core-principles, data files, agent roles
Medium-lived (hours/days): architecture docs, requirements
Short-lived (minutes): task context, checkpoints
No cache (per request): user input, conversation

Workflow-Level Caching

Greenfield Project Workflow:

Phase 1: Requirements
[CACHE]
- Analyst role
- core-principles.md
[NO CACHE]
- Stakeholder input
- Requirements being drafted

Phase 2: Architecture
[CACHE]
- Solution Architect role
- core-principles.md
- technology-stack-guide.md
- Requirements (now finalized)
[NO CACHE]
- Architecture being designed

Phase 3: Implementation
[CACHE]
- Developer roles
- core-principles.md
- Architecture docs (now finalized)
- Checkpoints (now stable)
[NO CACHE]
- Current feature
- Code being written

CACHE BENEFIT: Each phase reuses stable context from previous phases

Cache Size Optimization

Optimal Cache Sizes:

Minimum: 1,000+ tokens (smaller = not worth caching overhead)
Sweet Spot: 5,000-15,000 tokens (balance reuse vs memory)
Maximum: Per provider limits (Claude: up to ~3-4 cache blocks)

Strategies:

Bundle Static Files: Load related static files together in cacheable block

GOOD: Load all data/* files together (one cache block)
BAD: Load data files separately interspersed with dynamic content

Stable Prefix: Keep structure consistent across requests

GOOD: Always load role → principles → guides in same order
BAD: Random order each time (breaks cache)

Cache Boundaries: Clear separation between static and dynamic

GOOD: [Static block] then [Dynamic block]
BAD: Static → Dynamic → Static → Dynamic (cache thrashing)

Cache-Aware JIT Loading

Modify JIT Pattern for Caching:

Standard JIT (No Caching Consideration):

Load role + task
Load guide when needed
Load another guide when needed
Load dynamic content

Cache-Aware JIT:

Load ALL static guides upfront (cache block)
Load dynamic task content
Reference cached guides as needed
NO additional loads (all guides pre-cached)

When to Use Each:

Use Standard JIT when:

One-off task (no cache benefit)
Highly dynamic workflow
Frequently changing context

Use Cache-Aware Loading when:

Repetitive tasks (multiple features)
Long-running project (many interactions)
Stable reference material
Multiple agents using same context

Cache Hit Indicators

High Cache Hit Scenarios (Good):

✅ Multiple features in same project
✅ Sequential agent interactions
✅ Iterative development (multiple sprints)
✅ Code review across similar code
✅ Consistent use of same data files

Low Cache Hit Scenarios (Less Benefit):

❌ One-off greenfield projects
❌ Constantly changing architecture
❌ Different tech stacks per task
❌ Unique context each time

Caching Best Practices

DO:

✅ Load static content first (agent role, principles, data files)
✅ Keep loading order consistent across requests
✅ Bundle related static files together
✅ Cache finalized documents (architecture, requirements)
✅ Use cache-aware loading for repetitive tasks

DON'T:

❌ Interleave static and dynamic content
❌ Change loading order randomly
❌ Cache work-in-progress documents
❌ Cache user input or conversation
❌ Over-cache (don't cache tiny files)

Example: Cache-Optimized Workflow

Task: Implement 5 features in existing project

Without Cache Optimization:

Feature 1: Load role + principles + architecture + task → 8,000 tokens
Feature 2: Load role + principles + architecture + task → 8,000 tokens
Feature 3: Load role + principles + architecture + task → 8,000 tokens
Feature 4: Load role + principles + architecture + task → 8,000 tokens
Feature 5: Load role + principles + architecture + task → 8,000 tokens
TOTAL: 40,000 tokens

With Cache Optimization:

Feature 1: Load [role + principles + architecture]CACHE + task → 8,000 tokens (full charge)
Feature 2: [Cache hit] + task                            →   800 tokens (10% charge)
Feature 3: [Cache hit] + task                            →   800 tokens
Feature 4: [Cache hit] + task                            →   800 tokens
Feature 5: [Cache hit] + task                            →   800 tokens
TOTAL: 11,200 tokens
SAVINGS: 72% (28,800 tokens saved)

Integration with JIT Loading

Hybrid Strategy (Best of Both):

Phase 1: Load Static Cache Block

[CACHE BLOCK - Load Once]
- Agent role
- core-principles.md
- agent-responsibility-matrix.md
- context-loading-guide.md (if needed)
Total: ~7,000 tokens

Phase 2: JIT Load Decision Guides (Also Cache-Friendly)

[CONDITIONAL CACHE - Load If Needed]
- technology-stack-guide.md (if making stack decision)
- security-guidelines.md (if implementing auth)
- api-implementation-patterns.md (if designing API)
Add: ~1,000-3,000 tokens

Phase 3: Dynamic Task Context (Never Cache)

[DYNAMIC - Changes Per Task]
- Current task description
- User messages
- Work artifacts
Add: ~500-1,500 tokens

Result:

Stable foundation cached
Decision guides cached when needed
Dynamic content fresh
Optimal token efficiency

Token Accounting with Caching

Revised Token Accounting Format:

=== TOKEN ACCOUNTING (With Cache) ===

CACHEABLE CONTEXT (Static):
- Role definition: ~300 tokens
- core-principles.md: ~5,000 tokens
- architecture.md: ~2,000 tokens
Subtotal: ~7,300 tokens
Cache Status: CACHED (first call: full charge, subsequent: ~10% charge)

DECISION CONTEXT (Semi-Static):
- security-guidelines.md: ~1,000 tokens
Cache Status: CACHED (loaded in previous task)

DYNAMIC CONTEXT (Never Cache):
- Current task: ~500 tokens
- User messages: ~200 tokens
Subtotal: ~700 tokens

TOTAL ESTIMATED:
- First call: ~9,000 tokens
- This call (cache hit): ~1,500 tokens (83% savings)

Summary: Caching Optimization

Key Principles:

Static first, dynamic last - Order matters for caching
Bundle static files - Load together for efficient caching
Consistent structure - Same order maximizes cache hits
Finalize before caching - Don't cache work-in-progress
Cache-aware JIT - Load static guides upfront for repetitive tasks

Expected Impact:

Single task: 0-30% savings (cache setup cost)
Repetitive tasks: 70-90% savings (massive benefit)
Long-term project: 80%+ cumulative savings

Remember: Caching optimizes INPUT token costs. Output quality remains HIGH with comprehensive, detailed responses.

Code Quality Standards

Naming Conventions:

Files: kebab-case for utilities, PascalCase for components, camelCase for hooks
Variables: camelCase for functions/vars, PascalCase for classes, UPPER_SNAKE_CASE for constants
Descriptive: isLoading not loading, handleSubmit not submit

TypeScript Rules:

No any - use unknown for truly unknown types
interface for objects, type for unions/intersections
Explicit function return types
Generics for reusable type-safe code

Testing Philosophy:

Test user interactions, not implementation details
Mock external dependencies
Integration tests for critical paths
Unit tests for complex business logic
Aim for >80% coverage on critical code

Error Handling:

Custom error classes for different error types
Centralized error handling (middleware for backend, error boundaries for frontend)
Structured logging with context
Never expose stack traces in production
Proper HTTP status codes (400 validation, 401 auth, 404 not found, 500 server error)

Security Standards

Authentication:

bcrypt/argon2 for password hashing (10+ rounds)
JWT with short-lived access tokens + long-lived refresh tokens
Store refresh tokens in httpOnly cookies or secure DB
Implement token rotation on refresh

Input Validation:

Validate ALL user inputs with Zod or Joi
Sanitize HTML output to prevent XSS
Use parameterized queries to prevent SQL injection
Whitelist, never blacklist

API Security:

CORS with specific origins (not *)
Rate limiting per user/endpoint
CSRF protection for state-changing operations
Security headers with Helmet.js
HTTPS in production

Secrets Management:

Never commit secrets to version control
Use environment variables (.env files)
Rotate credentials regularly
Minimal privilege principle

Performance Standards

Frontend:

Code splitting and lazy loading for routes/heavy components
Memoization (React.memo, useMemo, useCallback) only when measured benefit
Virtual scrolling for long lists
Image optimization (next/image or similar)
Bundle analysis and tree shaking

Backend:

Database indexes on frequently queried fields
Connection pooling
Redis caching for frequently accessed data
Pagination for large result sets
Async/await throughout (no blocking operations)
Background jobs for heavy processing

Documentation Standards

Code Documentation:

JSDoc for public APIs and complex functions
README for setup and architecture overview
Inline comments for "why", not "what"
Keep docs close to code

API Documentation:

OpenAPI/Swagger for REST APIs
GraphQL SDL with type descriptions
Code examples for common use cases
Authentication flows documented
Error codes and messages explained

Architecture Documentation:

Architecture Decision Records (ADRs) for key decisions
System diagrams for complex architectures
Database schema documentation
Deployment guides

Git Standards

Commit Format: <type>(<scope>): <subject>

Types: feat, fix, docs, style, refactor, test, chore

Example: feat(auth): add password reset functionality

PR Requirements:

TypeScript compiles with no errors
ESLint passes
All tests pass, coverage >80%
No console.logs or debugger statements
Meaningful commits
Clear PR description

Working Philosophy

Start Simple: MVP first, add complexity as needed. Monolith before microservices.

Fail Fast: Validate early, catch errors at compile time, comprehensive error handling.

Iterate Quickly: Ship small increments, get feedback, improve continuously.

Automate Everything: Testing, linting, deployment, monitoring. Reduce manual toil.

Monitor and Learn: Track metrics, analyze errors, learn from production, improve continuously.

All agents reference these principles.

Additional References:

Implementation details: development-guidelines.md, best-practices.md, security-guidelines.md
Agent boundaries: agent-responsibility-matrix.md (who owns what decisions)
Context loading: context-loading-guide.md (when to load what)
Technology guides: Various data/* files (loaded just-in-time)

32 KiB Raw Blame History