Merge 2e1949df76 into 44972d62b9
commit e21e33cd02
|
|
@ -0,0 +1,3 @@
|
||||||
|
# Rules
|
||||||
|
* Never create PRs to alter code after review. Always offer a fix and the option to commit it.
|
||||||
|
* Qualify the severity of each requested change as one of: NORMAL | IMPROVEMENT | FIX | CRITICAL
|
||||||
|
|
@ -41,3 +41,7 @@ agent:
|
||||||
- trigger: DP or fuzzy match on document-project
|
||||||
workflow: "{project-root}/_bmad/bmm/workflows/document-project/workflow.yaml"
|
||||||
description: "[DP] Document Project: Analyze an existing project to produce useful documentation for both human and LLM"
|
||||||
|
|
||||||
|
- trigger: KS or fuzzy match on knowledge-sync
|
||||||
|
exec: "{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md"
|
||||||
|
description: "[KS] Knowledge Sync: Build a RAG-ready knowledge index from project artifacts for optimized AI agent retrieval"
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,86 @@
|
||||||
|
---
|
||||||
|
project_name: ''
|
||||||
|
user_name: ''
|
||||||
|
date: ''
|
||||||
|
total_chunks: 0
|
||||||
|
sources_indexed: 0
|
||||||
|
tag_vocabulary_size: 0
|
||||||
|
retrieval_tested: false
|
||||||
|
status: 'draft'
|
||||||
|
---
|
||||||
|
|
||||||
|
# Knowledge Index for {{project_name}}
|
||||||
|
|
||||||
|
_RAG-optimized knowledge base for AI agent retrieval. Each chunk is self-contained and tagged for semantic search._
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Index Summary
|
||||||
|
|
||||||
|
- **Total Chunks:** {{total_count}}
|
||||||
|
- **Critical:** {{critical_count}} | **High:** {{high_count}} | **Standard:** {{standard_count}} | **Reference:** {{ref_count}}
|
||||||
|
- **Sources Indexed:** {{source_count}}
|
||||||
|
- **Last Synced:** {{date}}
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Critical Knowledge
|
||||||
|
|
||||||
|
<!-- Critical-priority chunks go here. These are retrieved for every implementation task. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Architecture Knowledge
|
||||||
|
|
||||||
|
<!-- Architecture decisions, system design patterns, and technology choices. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Requirements Knowledge
|
||||||
|
|
||||||
|
<!-- Business rules, acceptance criteria, and constraints. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation Knowledge
|
||||||
|
|
||||||
|
<!-- Coding patterns, conventions, and implementation rules. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Domain Knowledge
|
||||||
|
|
||||||
|
<!-- Business domain concepts, terminology, and definitions. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Operations Knowledge
|
||||||
|
|
||||||
|
<!-- Deployment, monitoring, and workflow rules. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quality Knowledge
|
||||||
|
|
||||||
|
<!-- Testing patterns, review standards, and anti-patterns. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Retrieval Configuration
|
||||||
|
|
||||||
|
### Query Mapping
|
||||||
|
|
||||||
|
| Query Pattern | Target Categories | Priority Filter | Expected Chunks |
|
||||||
|
|---|---|---|---|
|
||||||
|
| "how to implement \*" | implementation, architecture | critical, high | 3-5 |
|
||||||
|
| "testing requirements for \*" | quality, implementation | critical, high | 2-4 |
|
||||||
|
| "business rules for \*" | requirements, domain | all | 2-3 |
|
||||||
|
| "architecture decision for \*" | architecture | all | 1-3 |
|
||||||
|
| "deployment process for \*" | operations | all | 1-2 |
|
||||||
|
|
||||||
|
### Embedding Recommendations
|
||||||
|
|
||||||
|
- **Model:** Use an embedding model that handles technical content well
|
||||||
|
- **Chunk Overlap:** 50-100 characters overlap between adjacent chunks from the same source
|
||||||
|
- **Metadata Filters:** Always filter by category and priority for focused retrieval
|
||||||
|
- **Top-K:** Retrieve 3-5 chunks per query for optimal context balance
|
||||||
|
|
@ -0,0 +1,179 @@
|
||||||
|
# Step 1: Artifact Discovery & Catalog
|
||||||
|
|
||||||
|
## MANDATORY EXECUTION RULES (READ FIRST):
|
||||||
|
|
||||||
|
- 🛑 NEVER generate content without user input
|
||||||
|
- ✅ ALWAYS treat this as collaborative discovery between technical peers
|
||||||
|
- 📋 YOU ARE A FACILITATOR, not a content generator
|
||||||
|
- 💬 FOCUS on discovering and cataloging all relevant project artifacts
|
||||||
|
- 🎯 IDENTIFY sources that provide high-value knowledge for RAG retrieval
|
||||||
|
- ⚠️ ABSOLUTELY NO TIME ESTIMATES - AI development speed has fundamentally changed
|
||||||
|
- ✅ YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
|
||||||
|
|
||||||
|
## EXECUTION PROTOCOLS:
|
||||||
|
|
||||||
|
- 🎯 Show your analysis before taking any action
|
||||||
|
- 📖 Read existing project files to catalog available artifacts
|
||||||
|
- 💾 Initialize document and update frontmatter
|
||||||
|
- 🚫 FORBIDDEN to load next step until discovery is complete
|
||||||
|
|
||||||
|
## CONTEXT BOUNDARIES:
|
||||||
|
|
||||||
|
- Variables from workflow.md are available in memory
|
||||||
|
- Focus on existing project artifacts and documentation
|
||||||
|
- Identify documents that contain reusable knowledge for AI agents
|
||||||
|
- Prioritize artifacts that prevent implementation mistakes and provide domain context
|
||||||
|
|
||||||
|
## YOUR TASK:
|
||||||
|
|
||||||
|
Discover, catalog, and classify all project artifacts that should be indexed for RAG retrieval by AI agents.
|
||||||
|
|
||||||
|
## DISCOVERY SEQUENCE:
|
||||||
|
|
||||||
|
### 1. Check for Existing Knowledge Index
|
||||||
|
|
||||||
|
First, check if a knowledge index already exists:
|
||||||
|
|
||||||
|
- Look for file at `{project_knowledge}/knowledge-index.md` or `{project-root}/**/knowledge-index.md`
|
||||||
|
- If it exists: Read the complete file to understand the existing index
|
||||||
|
- Present to user: "Found existing knowledge index with {{chunk_count}} chunks across {{source_count}} sources. Would you like to update this or create a new one?"
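The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_knowledge_indexes` is hypothetical, and `project_root` and `project_knowledge` stand in for the resolved values of `{project-root}` and `{project_knowledge}`.

```python
from pathlib import Path


def find_knowledge_indexes(project_root: Path, project_knowledge: Path) -> list[Path]:
    """Return existing knowledge-index.md files, preferred location first.

    Hypothetical helper: the workflow itself only says where to look.
    """
    found: list[Path] = []

    # Preferred location: {project_knowledge}/knowledge-index.md
    preferred = (project_knowledge / "knowledge-index.md").resolve()
    if preferred.is_file():
        found.append(preferred)

    # Fallback: {project-root}/**/knowledge-index.md
    for path in sorted(project_root.rglob("knowledge-index.md")):
        resolved = path.resolve()
        if resolved not in found:
            found.append(resolved)
    return found
```

An empty result means no index exists yet and a new one should be created; otherwise the first entry is the file to read and offer for update.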
|
||||||
|
|
||||||
|
### 2. Scan Planning Artifacts
|
||||||
|
|
||||||
|
Search `{planning_artifacts}` for documents containing project knowledge:
|
||||||
|
|
||||||
|
**Product Requirements:**
|
||||||
|
|
||||||
|
- Look for PRD files (`*prd*`, `*requirements*`)
|
||||||
|
- Extract key decisions, constraints, and acceptance criteria
|
||||||
|
- Note sections with high reuse value for agents
|
||||||
|
|
||||||
|
**Architecture Documents:**
|
||||||
|
|
||||||
|
- Look for architecture files (`*architecture*`, `*design*`)
|
||||||
|
- Extract technology decisions, patterns, and trade-offs
|
||||||
|
- Identify integration points and system boundaries
|
||||||
|
|
||||||
|
**Epic and Story Files:**
|
||||||
|
|
||||||
|
- Look for epic/story definitions (`*epic*`, `*stories*`)
|
||||||
|
- Extract acceptance criteria, implementation notes, and dependencies
|
||||||
|
- Identify cross-cutting concerns that appear across stories
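As a rough sketch of the filename matching in this section, the glob patterns quoted above can be applied with Python's standard `fnmatch`. The group labels (`prd`, `architecture`, `epics_stories`) and the function name are assumptions made for illustration; only the patterns themselves come from the step text.

```python
from fnmatch import fnmatch

# Patterns are quoted from the step text; the labels are illustrative only.
ARTIFACT_PATTERNS = {
    "prd": ["*prd*", "*requirements*"],
    "architecture": ["*architecture*", "*design*"],
    "epics_stories": ["*epic*", "*stories*"],
}


def classify_planning_file(filename: str) -> list[str]:
    """Return every artifact group whose patterns match the file name."""
    name = filename.lower()  # match case-insensitively on every platform
    return [
        label
        for label, patterns in ARTIFACT_PATTERNS.items()
        if any(fnmatch(name, pattern) for pattern in patterns)
    ]
```

A file can match more than one group (for example a name containing both `design` and `requirements`), which is why the function returns a list rather than a single label.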
|
||||||
|
|
||||||
|
### 3. Scan Implementation Artifacts
|
||||||
|
|
||||||
|
Search `{implementation_artifacts}` for implementation knowledge:
|
||||||
|
|
||||||
|
**Sprint and Status Files:**
|
||||||
|
|
||||||
|
- Look for sprint status, retrospectives, and course corrections
|
||||||
|
- Extract lessons learned and pattern changes
|
||||||
|
- Identify recurring issues and their resolutions
|
||||||
|
|
||||||
|
**Code Review Findings:**
|
||||||
|
|
||||||
|
- Look for code review artifacts
|
||||||
|
- Extract quality patterns and anti-patterns discovered
|
||||||
|
- Note corrections that should inform future implementation
|
||||||
|
|
||||||
|
### 4. Scan Project Knowledge
|
||||||
|
|
||||||
|
Search `{project_knowledge}` for existing knowledge assets:
|
||||||
|
|
||||||
|
**Project Context:**
|
||||||
|
|
||||||
|
- Look for `project-context.md` and similar files
|
||||||
|
- Extract implementation rules and coding conventions
|
||||||
|
- These are high-priority sources for RAG retrieval
|
||||||
|
|
||||||
|
**Research Documents:**
|
||||||
|
|
||||||
|
- Look for research outputs (market, domain, technical)
|
||||||
|
- Extract findings that inform implementation decisions
|
||||||
|
- Identify domain terminology and definitions
|
||||||
|
|
||||||
|
### 5. Scan Source Code for Patterns
|
||||||
|
|
||||||
|
Identify key code patterns worth indexing:
|
||||||
|
|
||||||
|
**Configuration Files:**
|
||||||
|
|
||||||
|
- Package manifests, build configs, linting rules
|
||||||
|
- Extract version constraints and tool configurations
|
||||||
|
- These provide critical context for code generation
|
||||||
|
|
||||||
|
**Key Source Files:**
|
||||||
|
|
||||||
|
- Identify entry points, shared utilities, and core modules
|
||||||
|
- Extract patterns that define the project's coding style
|
||||||
|
- Note any non-obvious conventions visible only in code
|
||||||
|
|
||||||
|
### 6. Classify and Prioritize Sources
|
||||||
|
|
||||||
|
For each discovered artifact, assign:
|
||||||
|
|
||||||
|
**Knowledge Category:**
|
||||||
|
|
||||||
|
- `architecture` - System design decisions and patterns
|
||||||
|
- `requirements` - Business rules and acceptance criteria
|
||||||
|
- `implementation` - Coding patterns and conventions
|
||||||
|
- `domain` - Business domain concepts and terminology
|
||||||
|
- `operations` - Deployment, monitoring, and workflow rules
|
||||||
|
- `quality` - Testing patterns, review standards, and anti-patterns
|
||||||
|
|
||||||
|
**Retrieval Priority:**
|
||||||
|
|
||||||
|
- `critical` - Must be retrieved for every implementation task
|
||||||
|
- `high` - Should be retrieved for related implementation tasks
|
||||||
|
- `standard` - Available when specifically relevant
|
||||||
|
- `reference` - Background context when explicitly needed
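One possible way to hold the catalog in memory is sketched below. The category and priority values are the ones listed above; the `CatalogEntry` type and `indexing_order` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

CATEGORIES = ("architecture", "requirements", "implementation",
              "domain", "operations", "quality")
# Ordered from most to least urgent, matching the list above.
PRIORITIES = ("critical", "high", "standard", "reference")


@dataclass(frozen=True)
class CatalogEntry:
    """One discovered artifact with its classification (illustrative type)."""
    path: str
    category: str
    priority: str

    def __post_init__(self) -> None:
        # Reject values outside the documented vocabularies.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if self.priority not in PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority}")


def indexing_order(entries: list[CatalogEntry]) -> list[CatalogEntry]:
    """Sort entries so critical artifacts come first (stable within a level)."""
    return sorted(entries, key=lambda e: PRIORITIES.index(e.priority))
```

Validating at construction time keeps misspelled categories or priorities from reaching the discovery summary.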
|
||||||
|
|
||||||
|
### 7. Present Discovery Summary
|
||||||
|
|
||||||
|
Report findings to user:
|
||||||
|
|
||||||
|
"Welcome {{user_name}}! I've scanned your project {{project_name}} to catalog artifacts for your RAG knowledge base.
|
||||||
|
|
||||||
|
**Artifacts Discovered:**
|
||||||
|
|
||||||
|
| Category | Count | Priority Breakdown |
|
||||||
|
|---|---|---|
|
||||||
|
| Architecture | {{count}} | {{critical}}/{{high}}/{{standard}} |
|
||||||
|
| Requirements | {{count}} | {{critical}}/{{high}}/{{standard}} |
|
||||||
|
| Implementation | {{count}} | {{critical}}/{{high}}/{{standard}} |
|
||||||
|
| Domain | {{count}} | {{critical}}/{{high}}/{{standard}} |
|
||||||
|
| Operations | {{count}} | {{critical}}/{{high}}/{{standard}} |
|
||||||
|
| Quality | {{count}} | {{critical}}/{{high}}/{{standard}} |
|
||||||
|
|
||||||
|
**Source Files Cataloged:** {{total_files}}
|
||||||
|
|
||||||
|
**Recommended Chunking Strategy:**
|
||||||
|
Based on your artifact types, I recommend {{strategy}} chunking:
|
||||||
|
- {{strategy_rationale}}
|
||||||
|
|
||||||
|
Ready to index and chunk your project knowledge for RAG retrieval.
|
||||||
|
|
||||||
|
[C] Continue to knowledge indexing"
|
||||||
|
|
||||||
|
## SUCCESS METRICS:
|
||||||
|
|
||||||
|
✅ All relevant project artifacts discovered and cataloged
|
||||||
|
✅ Each artifact classified by category and retrieval priority
|
||||||
|
✅ Source file paths accurately recorded
|
||||||
|
✅ Chunking strategy recommended based on artifact analysis
|
||||||
|
✅ Discovery findings clearly presented to user
|
||||||
|
✅ User ready to proceed with indexing
|
||||||
|
|
||||||
|
## FAILURE MODES:
|
||||||
|
|
||||||
|
❌ Missing critical artifacts in planning or implementation directories
|
||||||
|
❌ Not checking for existing knowledge index before creating new one
|
||||||
|
❌ Incorrect classification of artifact categories or priorities
|
||||||
|
❌ Not scanning source code for pattern-level knowledge
|
||||||
|
❌ Not presenting clear discovery summary to user
|
||||||
|
|
||||||
|
## NEXT STEP:
|
||||||
|
|
||||||
|
After user selects [C] to continue, load `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md` to index and chunk the discovered artifacts.
|
||||||
|
|
||||||
|
Remember: Do NOT proceed to step-02 until user explicitly selects [C] from the menu and discovery catalog is confirmed!
|
||||||
|
|
@ -0,0 +1,243 @@
|
||||||
|
# Step 2: Knowledge Indexing & Chunking
|
||||||
|
|
||||||
|
## MANDATORY EXECUTION RULES (READ FIRST):
|
||||||
|
|
||||||
|
- 🛑 NEVER generate content without user input
|
||||||
|
- ✅ ALWAYS treat this as collaborative indexing between technical peers
|
||||||
|
- 📋 YOU ARE A FACILITATOR, not a content generator
|
||||||
|
- 💬 FOCUS on creating self-contained, retrievable knowledge chunks
|
||||||
|
- 🎯 EACH CHUNK must be independently useful without requiring full document context
|
||||||
|
- ⚠️ ABSOLUTELY NO TIME ESTIMATES - AI development speed has fundamentally changed
|
||||||
|
- ✅ YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
|
||||||
|
|
||||||
|
## EXECUTION PROTOCOLS:
|
||||||
|
|
||||||
|
- 🎯 Show your analysis before taking any action
|
||||||
|
- 📝 Focus on creating atomic, self-contained knowledge chunks
|
||||||
|
- ⚠️ Present A/P/C menu after each major category
|
||||||
|
- 💾 ONLY save when user chooses C (Continue)
|
||||||
|
- 📖 Update frontmatter with completed categories
|
||||||
|
- 🚫 FORBIDDEN to load next step until all categories are indexed
|
||||||
|
|
||||||
|
## COLLABORATION MENUS (A/P/C):
|
||||||
|
|
||||||
|
This step will generate content and present choices for each knowledge category:
|
||||||
|
|
||||||
|
- **A (Advanced Elicitation)**: Use discovery protocols to explore nuanced knowledge relationships
|
||||||
|
- **P (Party Mode)**: Bring multiple perspectives to identify missing knowledge connections
|
||||||
|
- **C (Continue)**: Save the current chunks and proceed to next category
|
||||||
|
|
||||||
|
## PROTOCOL INTEGRATION:
|
||||||
|
|
||||||
|
- When 'A' selected: Execute {project-root}/_bmad/core/workflows/advanced-elicitation/workflow.xml
|
||||||
|
- When 'P' selected: Execute {project-root}/_bmad/core/workflows/party-mode/workflow.md
|
||||||
|
- PROTOCOLS always return to this step's A/P/C menu once the A or P protocol has completed
|
||||||
|
- User accepts/rejects protocol changes before proceeding
|
||||||
|
|
||||||
|
## CONTEXT BOUNDARIES:
|
||||||
|
|
||||||
|
- Discovery catalog from step-1 is available
|
||||||
|
- All artifact paths and classifications are identified
|
||||||
|
- Focus on creating chunks optimized for embedding and retrieval
|
||||||
|
- Each chunk must carry enough context to be useful in isolation
|
||||||
|
|
||||||
|
## YOUR TASK:
|
||||||
|
|
||||||
|
Index each discovered artifact into self-contained knowledge chunks with metadata tags, source tracing, and retrieval-optimized formatting.
|
||||||
|
|
||||||
|
## CHUNKING PRINCIPLES:
|
||||||
|
|
||||||
|
### Chunk Design Rules
|
||||||
|
|
||||||
|
1. **Self-Contained**: Each chunk must be understandable without reading the source document
|
||||||
|
2. **Tagged**: Every chunk has category, priority, source path, and semantic tags
|
||||||
|
3. **Atomic**: One concept or decision per chunk - no compound knowledge
|
||||||
|
4. **Traceable**: Every chunk links back to its source artifact and section
|
||||||
|
5. **Contextual**: Include enough surrounding context for accurate retrieval
|
||||||
|
6. **Deduplicated**: Avoid redundant chunks across different source artifacts
|
||||||
|
|
||||||
|
### Chunk Format
|
||||||
|
|
||||||
|
Each chunk follows this standard format:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
### [CHUNK-ID] Chunk Title
|
||||||
|
|
||||||
|
- **Source:** `{relative_path_to_source_file}`
|
||||||
|
- **Category:** architecture | requirements | implementation | domain | operations | quality
|
||||||
|
- **Priority:** critical | high | standard | reference
|
||||||
|
- **Tags:** comma-separated semantic tags for retrieval matching
|
||||||
|
|
||||||
|
**Context:** One-line description of when this knowledge is relevant.
|
||||||
|
|
||||||
|
**Content:**
|
||||||
|
The actual knowledge content - specific, actionable, self-contained.
|
||||||
|
```
|
||||||
|
|
||||||
|
## INDEXING SEQUENCE:
|
||||||
|
|
||||||
|
### 1. Index Critical-Priority Artifacts
|
||||||
|
|
||||||
|
Process all artifacts marked as `critical` priority first:
|
||||||
|
|
||||||
|
**For each critical artifact:**
|
||||||
|
|
||||||
|
- Read the complete source file
|
||||||
|
- Identify distinct knowledge units (decisions, rules, constraints)
|
||||||
|
- Create one chunk per knowledge unit
|
||||||
|
- Apply semantic tags for retrieval matching
|
||||||
|
- Present chunks to user for validation
|
||||||
|
|
||||||
|
**Present results:**
|
||||||
|
"I've created {{chunk_count}} critical-priority chunks from {{source_count}} sources:
|
||||||
|
|
||||||
|
{{list_of_chunk_titles_with_tags}}
|
||||||
|
|
||||||
|
These chunks will be prioritized in every retrieval query.
|
||||||
|
|
||||||
|
[A] Advanced Elicitation - Explore deeper knowledge connections
|
||||||
|
[P] Party Mode - Review from multiple implementation perspectives
|
||||||
|
[C] Continue - Save these chunks and proceed"
|
||||||
|
|
||||||
|
### 2. Index High-Priority Artifacts
|
||||||
|
|
||||||
|
Process all `high` priority artifacts:
|
||||||
|
|
||||||
|
**For each high-priority artifact:**
|
||||||
|
|
||||||
|
- Read source file and identify knowledge units
|
||||||
|
- Create chunks with appropriate tags
|
||||||
|
- Cross-reference with critical chunks for consistency
|
||||||
|
- Identify any overlaps and deduplicate
|
||||||
|
|
||||||
|
### 3. Index Standard-Priority Artifacts
|
||||||
|
|
||||||
|
Process `standard` priority artifacts:
|
||||||
|
|
||||||
|
**For each standard artifact:**
|
||||||
|
|
||||||
|
- Read source file for domain-specific knowledge
|
||||||
|
- Create chunks focused on contextual information
|
||||||
|
- Tag for specific retrieval scenarios
|
||||||
|
|
||||||
|
### 4. Index Reference-Priority Artifacts
|
||||||
|
|
||||||
|
Process `reference` priority artifacts:
|
||||||
|
|
||||||
|
**For each reference artifact:**
|
||||||
|
|
||||||
|
- Extract background context and terminology
|
||||||
|
- Create lighter-weight chunks for supplementary retrieval
|
||||||
|
- Tag for broad topic matching
|
||||||
|
|
||||||
|
### 5. Cross-Reference and Deduplicate
|
||||||
|
|
||||||
|
After all categories are indexed:
|
||||||
|
|
||||||
|
**Deduplication Analysis:**
|
||||||
|
|
||||||
|
- Identify chunks with overlapping content across sources
|
||||||
|
- Merge or consolidate redundant chunks
|
||||||
|
- Ensure cross-references between related chunks are tagged
|
||||||
|
- Present deduplication summary to user
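The workflow does not prescribe how overlap is detected. As one hedged possibility, a simple token-set Jaccard similarity can flag candidate duplicates for the user to review; the function names and the `0.6` threshold below are assumptions, not part of the workflow.

```python
import re
from itertools import combinations


def token_set(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a chunk's content."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts: |A ∩ B| / |A ∪ B| over token sets."""
    ta, tb = token_set(a), token_set(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def overlap_candidates(chunks: dict[str, str], threshold: float = 0.6) -> list[tuple[str, str]]:
    """Return chunk-ID pairs whose content similarity meets the threshold.

    The threshold is an illustrative default; tune it per project.
    """
    return [
        (x, y)
        for x, y in combinations(sorted(chunks), 2)
        if jaccard(chunks[x], chunks[y]) >= threshold
    ]
```

The pairs returned are only candidates: merging or consolidating them remains a decision presented to the user.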
|
||||||
|
|
||||||
|
**Relationship Mapping:**
|
||||||
|
|
||||||
|
- Identify chunks that frequently co-occur in implementation contexts
|
||||||
|
- Tag related chunks for retrieval grouping
|
||||||
|
- Create chunk clusters for common query patterns
|
||||||
|
|
||||||
|
### 6. Generate Knowledge Index Document
|
||||||
|
|
||||||
|
Compile all validated chunks into the knowledge index file:
|
||||||
|
|
||||||
|
**Document Structure:**
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
# Knowledge Index for {{project_name}}
|
||||||
|
|
||||||
|
_RAG-optimized knowledge base for AI agent retrieval. Each chunk is self-contained and tagged for semantic search._
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Index Summary
|
||||||
|
|
||||||
|
- **Total Chunks:** {{total_count}}
|
||||||
|
- **Critical:** {{critical_count}} | **High:** {{high_count}} | **Standard:** {{standard_count}} | **Reference:** {{ref_count}}
|
||||||
|
- **Sources Indexed:** {{source_count}}
|
||||||
|
- **Last Synced:** {{date}}
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Critical Knowledge
|
||||||
|
|
||||||
|
{{critical_chunks}}
|
||||||
|
|
||||||
|
## Architecture Knowledge
|
||||||
|
|
||||||
|
{{architecture_chunks}}
|
||||||
|
|
||||||
|
## Requirements Knowledge
|
||||||
|
|
||||||
|
{{requirements_chunks}}
|
||||||
|
|
||||||
|
## Implementation Knowledge
|
||||||
|
|
||||||
|
{{implementation_chunks}}
|
||||||
|
|
||||||
|
## Domain Knowledge
|
||||||
|
|
||||||
|
{{domain_chunks}}
|
||||||
|
|
||||||
|
## Operations Knowledge
|
||||||
|
|
||||||
|
{{operations_chunks}}
|
||||||
|
|
||||||
|
## Quality Knowledge
|
||||||
|
|
||||||
|
{{quality_chunks}}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 7. Present Indexing Summary
|
||||||
|
|
||||||
|
"Knowledge indexing complete for {{project_name}}!
|
||||||
|
|
||||||
|
**Chunks Created:**
|
||||||
|
|
||||||
|
| Category | Critical | High | Standard | Reference | Total |
|
||||||
|
|---|---|---|---|---|---|
|
||||||
|
| Architecture | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
|
||||||
|
| Requirements | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
|
||||||
|
| Implementation | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
|
||||||
|
| Domain | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
|
||||||
|
| Operations | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
|
||||||
|
| Quality | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
|
||||||
|
|
||||||
|
**Deduplication:** Removed {{removed_count}} redundant chunks
|
||||||
|
**Cross-References:** {{xref_count}} chunk relationships mapped
|
||||||
|
|
||||||
|
[C] Continue to optimization"
|
||||||
|
|
||||||
|
## SUCCESS METRICS:
|
||||||
|
|
||||||
|
✅ All discovered artifacts indexed into self-contained chunks
|
||||||
|
✅ Each chunk has proper metadata tags and source tracing
|
||||||
|
✅ No redundant or overlapping chunks remain
|
||||||
|
✅ Cross-references between related chunks are mapped
|
||||||
|
✅ A/P/C menu presented and handled correctly for each category
|
||||||
|
✅ Knowledge index document properly structured
|
||||||
|
|
||||||
|
## FAILURE MODES:
|
||||||
|
|
||||||
|
❌ Creating chunks that require reading the full source document
|
||||||
|
❌ Missing semantic tags that prevent accurate retrieval
|
||||||
|
❌ Not deduplicating overlapping chunks from different sources
|
||||||
|
❌ Not cross-referencing related knowledge units
|
||||||
|
❌ Not getting user validation for each category
|
||||||
|
❌ Creating overly large chunks that reduce retrieval precision
|
||||||
|
|
||||||
|
## NEXT STEP:
|
||||||
|
|
||||||
|
After completing all categories and user selects [C], load `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` to optimize the knowledge base for retrieval quality.
|
||||||
|
|
||||||
|
Remember: Do NOT proceed to step-03 until all categories are indexed and user explicitly selects [C]!
|
||||||
|
|
@ -0,0 +1,289 @@
|
||||||
|
# Step 3: Knowledge Base Optimization & Completion
|
||||||
|
|
||||||
|
## MANDATORY EXECUTION RULES (READ FIRST):
|
||||||
|
|
||||||
|
- 🛑 NEVER generate content without user input
|
||||||
|
- ✅ ALWAYS treat this as collaborative optimization between technical peers
|
||||||
|
- 📋 YOU ARE A FACILITATOR, not a content generator
|
||||||
|
- 💬 FOCUS on optimizing chunks for retrieval quality and accuracy
|
||||||
|
- 🎯 ENSURE every chunk is retrieval-ready and well-tagged
|
||||||
|
- ⚠️ ABSOLUTELY NO TIME ESTIMATES - AI development speed has fundamentally changed
|
||||||
|
- ✅ YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
|
||||||
|
|
||||||
|
## EXECUTION PROTOCOLS:
|
||||||
|
|
||||||
|
- 🎯 Show your analysis before taking any action
|
||||||
|
- 📝 Review and optimize chunks for retrieval precision
|
||||||
|
- 📖 Update frontmatter with completion status
|
||||||
|
- 🚫 NO MORE STEPS - this is the final step
|
||||||
|
|
||||||
|
## CONTEXT BOUNDARIES:
|
||||||
|
|
||||||
|
- All knowledge chunks from step-2 are indexed
|
||||||
|
- Cross-references and deduplication are complete
|
||||||
|
- Focus on retrieval quality optimization and finalization
|
||||||
|
- Ensure the knowledge index is ready for RAG pipeline integration
|
||||||
|
|
||||||
|
## YOUR TASK:
|
||||||
|
|
||||||
|
Optimize the knowledge index for retrieval quality, validate chunk completeness, and finalize the knowledge base for AI agent consumption.
|
||||||
|
|
||||||
|
## OPTIMIZATION SEQUENCE:
|
||||||
|
|
||||||
|
### 1. Retrieval Quality Analysis
|
||||||
|
|
||||||
|
Analyze the indexed chunks for retrieval effectiveness:
|
||||||
|
|
||||||
|
**Tag Coverage Analysis:**
|
||||||
|
|
||||||
|
- Review semantic tags across all chunks
|
||||||
|
- Identify gaps where common queries would miss relevant chunks
|
||||||
|
- Suggest additional tags for better retrieval matching
|
||||||
|
- Present tag coverage report to user
|
||||||
|
|
||||||
|
**Chunk Size Analysis:**
|
||||||
|
|
||||||
|
- Identify chunks that are too large (reduce retrieval precision)
|
||||||
|
- Identify chunks that are too small (lack sufficient context)
|
||||||
|
- Recommend splits or merges for optimal retrieval size
|
||||||
|
- Target: Each chunk should be 100-500 words for optimal embedding
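The size target above can be checked mechanically. The sketch below counts whitespace-separated words, which is an assumption about what "words" means here; the function name and verdict labels are illustrative.

```python
def size_verdict(content: str, min_words: int = 100, max_words: int = 500) -> str:
    """Classify a chunk against the 100-500 word target.

    Words are counted by splitting on whitespace (an assumption).
    """
    word_count = len(content.split())
    if word_count < min_words:
        return "too-small"   # may lack sufficient context; consider merging
    if word_count > max_words:
        return "too-large"   # may reduce retrieval precision; consider splitting
    return "ok"
```

Both bounds are inclusive, so a chunk of exactly 100 or exactly 500 words counts as within target.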
|
||||||
|
|
||||||
|
**Context Sufficiency Check:**
|
||||||
|
|
||||||
|
- Verify each chunk is understandable without its source document
|
||||||
|
- Add missing context where chunks reference undefined terms
|
||||||
|
- Ensure technical terms are defined or tagged for glossary lookup
|
||||||
|
|
||||||
|
### 2. Semantic Tag Optimization
|
||||||
|
|
||||||
|
Optimize tags for retrieval accuracy:
|
||||||
|
|
||||||
|
**Tag Standardization:**
|
||||||
|
|
||||||
|
- Normalize similar tags (e.g., "api-design" and "api-patterns" → single standard)
|
||||||
|
- Create a tag vocabulary for the project
|
||||||
|
- Apply consistent tag format across all chunks
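A minimal sketch of tag standardization follows. It assumes lowercase, hyphen-separated tags as the consistent format and picks `api-design` as the standard form for the example pair above; both choices are assumptions, and `TAG_ALIASES` is a hypothetical project-specific table.

```python
import re

# Hypothetical alias table: maps variant tags to the project's chosen standard.
TAG_ALIASES = {"api-patterns": "api-design"}


def normalize_tag(raw: str) -> str:
    """Lowercase, hyphenate, then map known variants to the standard tag."""
    slug = re.sub(r"[^a-z0-9]+", "-", raw.strip().lower()).strip("-")
    return TAG_ALIASES.get(slug, slug)


def normalize_tags(raw_tags: list[str]) -> list[str]:
    """Normalize a chunk's tags, dropping empties and duplicates in order."""
    seen: list[str] = []
    for tag in raw_tags:
        norm = normalize_tag(tag)
        if norm and norm not in seen:
            seen.append(norm)
    return seen
```

The set of distinct outputs across all chunks is one way to derive the project's tag vocabulary and its size.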
|
||||||
|
|
||||||
|
**Tag Enrichment:**
|
||||||
|
|
||||||
|
- Add technology-specific tags (framework names, library names)
|
||||||
|
- Add pattern-type tags (e.g., "error-handling", "state-management")
|
||||||
|
- Add lifecycle tags (e.g., "setup", "implementation", "testing", "deployment")
|
||||||
|
|
||||||
|
**Present Tag Summary:**
|
||||||
|
"I've optimized the semantic tags across {{chunk_count}} chunks:
|
||||||
|
|
||||||
|
**Tag Vocabulary:** {{unique_tag_count}} standardized tags
|
||||||
|
**Most Connected Tags:** {{top_tags_by_frequency}}
|
||||||
|
**Coverage Gaps Fixed:** {{gaps_fixed}}
|
||||||
|
|
||||||
|
Would you like to review the tag vocabulary? (y/n)"
|
||||||
|
|
||||||
|
### 3. Retrieval Scenario Testing
|
||||||
|
|
||||||
|
Validate retrieval quality with common query scenarios:
|
||||||
|
|
||||||
|
**Test Queries:**
|
||||||
|
|
||||||
|
Simulate these common developer queries against the knowledge index:
|
||||||
|
|
||||||
|
1. "How should I structure a new feature?" → Should retrieve: architecture + implementation chunks
|
||||||
|
2. "What are the testing requirements?" → Should retrieve: quality + implementation chunks
|
||||||
|
3. "What technology versions are we using?" → Should retrieve: critical implementation chunks
|
||||||
|
4. "How do I handle errors in this project?" → Should retrieve: implementation + quality chunks
|
||||||
|
5. "What are the business rules for {{core_feature}}?" → Should retrieve: requirements + domain chunks
|
||||||
|
|
||||||
|
**For each query, report:**
|
||||||
|
|
||||||
|
- Chunks that would be retrieved (by tag matching)
|
||||||
|
- Missing chunks that should be retrieved but aren't
|
||||||
|
- False positive chunks that would be retrieved incorrectly
|
||||||
|
- Recommended tag adjustments
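The per-query report can be approximated with set arithmetic. This sketch assumes retrieval by simple tag intersection, as the first bullet suggests; the function names and the chunk IDs used in any example data are hypothetical.

```python
def retrieve_by_tags(chunk_tags: dict[str, set[str]], query_tags: set[str]) -> set[str]:
    """Return IDs of chunks sharing at least one tag with the query."""
    return {cid for cid, tags in chunk_tags.items() if tags & query_tags}


def score_retrieval(retrieved: set[str], expected: set[str]) -> dict[str, list[str]]:
    """Split results into hits, missing chunks, and false positives."""
    return {
        "hits": sorted(retrieved & expected),
        "missing": sorted(expected - retrieved),
        "false_positives": sorted(retrieved - expected),
    }
```

Missing chunks point to tags that should be added; false positives point to tags that are too broad or misapplied.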
|
||||||
|
|
||||||
|
### 4. Generate Retrieval Configuration
|
||||||
|
|
||||||
|
Create a retrieval configuration section in the knowledge index:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Retrieval Configuration
|
||||||
|
|
||||||
|
### Query Mapping
|
||||||
|
|
||||||
|
| Query Pattern | Target Categories | Priority Filter | Expected Chunks |
|
||||||
|
|---|---|---|---|
|
||||||
|
| "how to implement *" | implementation, architecture | critical, high | 3-5 |
|
||||||
|
| "testing requirements for *" | quality, implementation | critical, high | 2-4 |
|
||||||
|
| "business rules for *" | requirements, domain | all | 2-3 |
|
||||||
|
| "architecture decision for *" | architecture | all | 1-3 |
|
||||||
|
| "deployment process for *" | operations | all | 1-2 |
|
||||||
|
|
||||||
|
### Embedding Recommendations
|
||||||
|
|
||||||
|
- **Model:** Use an embedding model that handles technical content well
|
||||||
|
- **Chunk Overlap:** 50-100 characters overlap between adjacent chunks from the same source
|
||||||
|
- **Metadata Filters:** Always filter by category and priority for focused retrieval
|
||||||
|
- **Top-K:** Retrieve 3-5 chunks per query for optimal context balance
|
||||||
|
```
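To illustrate how a consumer might apply the Query Mapping table, the sketch below treats each query pattern as a shell-style wildcard and returns the matching category and priority filters. The table rows are copied from the configuration above; the matching strategy, `QUERY_MAP`, and `route_query` are assumptions, and `None` stands for the table's "all".

```python
from fnmatch import fnmatch

# (pattern, target categories, priority filter); None means "all" priorities.
QUERY_MAP = [
    ("how to implement *", ("implementation", "architecture"), ("critical", "high")),
    ("testing requirements for *", ("quality", "implementation"), ("critical", "high")),
    ("business rules for *", ("requirements", "domain"), None),
    ("architecture decision for *", ("architecture",), None),
    ("deployment process for *", ("operations",), None),
]


def route_query(query: str):
    """Return (categories, priorities) for the first matching pattern, else None."""
    normalized = query.strip().lower()
    for pattern, categories, priorities in QUERY_MAP:
        if fnmatch(normalized, pattern):
            return categories, priorities
    return None  # no mapping: fall back to unfiltered semantic search
```

Queries that match no pattern return `None`, leaving the caller free to search without metadata filters.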
|
||||||
|
|
||||||
|
### 5. Finalize Knowledge Index
|
||||||
|
|
||||||
|
Complete the knowledge index with optimization results:
|
||||||
|
|
||||||
|
**Update Frontmatter:**
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
---
|
||||||
|
project_name: '{{project_name}}'
|
||||||
|
user_name: '{{user_name}}'
|
||||||
|
date: '{{date}}'
|
||||||
|
total_chunks: {{total_count}}
|
||||||
|
sources_indexed: {{source_count}}
|
||||||
|
tag_vocabulary_size: {{tag_count}}
|
||||||
|
retrieval_tested: true
|
||||||
|
status: 'complete'
|
||||||
|
---
|
||||||
|
```
|
||||||
|
|
||||||
|
**Append Usage Guidelines:**
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
---
|
||||||
|
|
||||||
|
## Usage Guidelines
|
||||||
|
|
||||||
|
**For AI Agents (RAG Retrieval):**
|
||||||
|
|
||||||
|
- Query this index using semantic search against chunk tags and content
|
||||||
|
- Always include critical-priority chunks in implementation context
|
||||||
|
- Filter by category when the task type is known
|
||||||
|
- Cross-reference related chunks using shared tags
|
||||||
|
|
||||||
|
**For Humans (Maintenance):**
|
||||||
|
|
||||||
|
- Re-run this workflow when new artifacts are created or significantly updated
|
||||||
|
- Add new chunks manually using the standard chunk format above
|
||||||
|
- Review and prune quarterly to remove outdated knowledge
|
||||||
|
- Update tags when new patterns or technologies are adopted
|
||||||
|
|
||||||
|
**For RAG Pipeline Integration:**
|
||||||
|
|
||||||
|
- Parse chunks by the `### [CHUNK-ID]` delimiter
|
||||||
|
- Extract metadata from the bullet-point headers (Source, Category, Priority, Tags)
|
||||||
|
- Use Tags field for semantic search indexing
|
||||||
|
- Use Priority field for retrieval ranking
|
||||||
|
```
|
||||||
|
|
||||||
|
### 6. Present Completion Summary
|
||||||
|
|
||||||
|
Based on user skill level, present the completion:
|
||||||
|
|
||||||
|
**Expert Mode:**
|
||||||
|
"Knowledge index complete. {{chunk_count}} chunks across {{source_count}} sources, {{tag_count}} semantic tags. Retrieval-tested and RAG-ready.
|
||||||
|
|
||||||
|
File saved to: `{project_knowledge}/knowledge-index.md`"
|
||||||
|
|
||||||
|
**Intermediate Mode:**
|
||||||
|
"Your project knowledge base is indexed and retrieval-ready!
|
||||||
|
|
||||||
|
**What we created:**
|
||||||
|
|
||||||
|
- {{chunk_count}} self-contained knowledge chunks
|
||||||
|
- {{source_count}} source artifacts indexed
|
||||||
|
- {{tag_count}} semantic tags for retrieval matching
|
||||||
|
- Retrieval configuration for RAG pipeline integration
|
||||||
|
|
||||||
|
**How it works:**
|
||||||
|
AI agents can now search this index to find exactly the project knowledge they need for any implementation task, instead of loading entire documents.
|
||||||
|
|
||||||
|
**Next steps:**
|
||||||
|
|
||||||
|
- Integrate with your RAG pipeline using the retrieval configuration
|
||||||
|
- Re-run this workflow when artifacts change significantly
|
||||||
|
- Review quarterly to keep knowledge current"
|
||||||
|
|
||||||
|
**Beginner Mode:**
|
||||||
|
"Your project knowledge base is ready! 🎉
|
||||||
|
|
||||||
|
**What this does:**
|
||||||
|
Think of this as a smart library catalog for your project. Instead of AI agents reading every document from start to finish, they can now search for exactly the knowledge they need.
|
||||||
|
|
||||||
|
**What's included:**
|
||||||
|
|
||||||
|
- {{chunk_count}} bite-sized knowledge pieces from your project documents
|
||||||
|
- Smart tags so agents can find the right knowledge quickly
|
||||||
|
- Priority labels so the most important knowledge comes first
|
||||||
|
|
||||||
|
**How AI agents use it:**
|
||||||
|
When an agent needs to implement something, it searches this index for relevant knowledge chunks instead of reading entire documents. This makes agents faster and more accurate!"
|
||||||
|
|
||||||
|
### 7. Completion Validation
|
||||||
|
|
||||||
|
Final checks before completion:
|
||||||
|
|
||||||
|
**Content Validation:**
|
||||||
|
✅ All discovered artifacts indexed into chunks
|
||||||
|
✅ Each chunk has proper metadata and source tracing
|
||||||
|
✅ Semantic tags are standardized and comprehensive
|
||||||
|
✅ No redundant chunks remain after deduplication
|
||||||
|
✅ Retrieval scenarios tested successfully
|
||||||
|
✅ Retrieval configuration generated
|
||||||
|
|
||||||
|
**Format Validation:**
|
||||||
|
✅ Consistent chunk format throughout
|
||||||
|
✅ Frontmatter properly updated
|
||||||
|
✅ Tag vocabulary is standardized
|
||||||
|
✅ Document is well-structured and scannable
|
||||||
|
|
||||||
|
### 8. Completion Message
|
||||||
|
|
||||||
|
"✅ **GenAI Knowledge Sync Complete!**
|
||||||
|
|
||||||
|
Your retrieval-optimized knowledge index is ready at:
|
||||||
|
`{project_knowledge}/knowledge-index.md`
|
||||||
|
|
||||||
|
**📊 Knowledge Base Summary:**
|
||||||
|
|
||||||
|
- {{chunk_count}} indexed knowledge chunks
|
||||||
|
- {{source_count}} source artifacts cataloged
|
||||||
|
- {{tag_count}} semantic tags for retrieval
|
||||||
|
- {{category_count}} knowledge categories covered
|
||||||
|
- Retrieval-tested with {{test_count}} query scenarios
|
||||||
|
|
||||||
|
**🎯 RAG Integration Ready:**
|
||||||
|
|
||||||
|
- Self-contained chunks with metadata headers
|
||||||
|
- Standardized tag vocabulary for semantic search
|
||||||
|
- Priority-based retrieval ranking
|
||||||
|
- Query mapping configuration included
|
||||||
|
|
||||||
|
**📋 Maintenance:**
|
||||||
|
|
||||||
|
1. Re-sync when artifacts change significantly: run this workflow again
|
||||||
|
2. Add individual chunks manually using the standard format
|
||||||
|
3. Review quarterly to prune outdated knowledge
|
||||||
|
4. Update tags when new patterns emerge
|
||||||
|
|
||||||
|
Your AI agents can now retrieve precisely the project knowledge they need for any task!"
|
||||||
|
|
||||||
|
## SUCCESS METRICS:
|
||||||
|
|
||||||
|
✅ Knowledge index fully optimized for retrieval quality
|
||||||
|
✅ Semantic tags standardized and comprehensive
|
||||||
|
✅ Retrieval scenarios tested with good coverage
|
||||||
|
✅ Retrieval configuration generated for RAG pipeline
|
||||||
|
✅ Usage guidelines included for agents, humans, and pipelines
|
||||||
|
✅ Frontmatter properly updated with completion status
|
||||||
|
✅ User provided with clear maintenance guidance
|
||||||
|
|
||||||
|
## FAILURE MODES:
|
||||||
|
|
||||||
|
❌ Chunks too large or too small for effective retrieval
|
||||||
|
❌ Semantic tags inconsistent or too sparse
|
||||||
|
❌ Not testing retrieval scenarios before finalizing
|
||||||
|
❌ Missing retrieval configuration for pipeline integration
|
||||||
|
❌ Not providing maintenance and usage guidelines
|
||||||
|
❌ Frontmatter not properly updated
|
||||||
|
|
||||||
|
## WORKFLOW COMPLETE:
|
||||||
|
|
||||||
|
This is the final step of the GenAI Knowledge Sync workflow. The user now has a retrieval-optimized knowledge index that enables AI agents to find and use exactly the project knowledge they need for any implementation task, improving both speed and accuracy of AI-assisted development.
|
||||||
|
|
@ -0,0 +1,50 @@
|
||||||
|
---
|
||||||
|
name: genai-knowledge-sync
|
||||||
|
description: 'Build and maintain a RAG-ready knowledge base from project artifacts. Use when the user says "build knowledge base", "sync knowledge", or "create RAG context"'
|
||||||
|
---
|
||||||
|
|
||||||
|
# GenAI Knowledge Sync Workflow
|
||||||
|
|
||||||
|
**Goal:** Create a structured, chunked knowledge index (`knowledge-index.md`) from project artifacts that is optimized for Retrieval-Augmented Generation (RAG) pipelines and AI agent context loading. This enables AI agents to retrieve the most relevant project knowledge at inference time rather than loading entire documents.
|
||||||
|
|
||||||
|
**Your Role:** You are a technical knowledge architect working with a peer to catalog, chunk, and index project artifacts into a retrieval-optimized format. You ensure every knowledge chunk is self-contained, well-tagged, and traceable to its source.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## WORKFLOW ARCHITECTURE
|
||||||
|
|
||||||
|
This workflow uses a **micro-file architecture** for disciplined execution:
|
||||||
|
|
||||||
|
- Each step is a self-contained file with embedded rules
|
||||||
|
- Sequential progression with user control at each step
|
||||||
|
- Document state tracked in frontmatter
|
||||||
|
- Focus on lean, retrieval-optimized content generation
|
||||||
|
- NEVER proceed to the next step file when the current step file requires the user to approve and confirm continuation first.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## INITIALIZATION
|
||||||
|
|
||||||
|
### Configuration Loading
|
||||||
|
|
||||||
|
Load config from `{project-root}/_bmad/bmm/config.yaml` and resolve:
|
||||||
|
|
||||||
|
- `project_name`, `output_folder`, `user_name`
|
||||||
|
- `communication_language`, `document_output_language`, `user_skill_level`
|
||||||
|
- `planning_artifacts`, `implementation_artifacts`, `project_knowledge`
|
||||||
|
- `date` as system-generated current datetime
|
||||||
|
- ✅ YOU MUST ALWAYS SPEAK OUTPUT in your agent communication style, using the configured `{communication_language}`
|
||||||
|
|
||||||
|
### Paths
|
||||||
|
|
||||||
|
- `installed_path` = `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync`
|
||||||
|
- `template_path` = `{installed_path}/knowledge-index-template.md`
|
||||||
|
- `output_file` = `{project_knowledge}/knowledge-index.md`
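The path variables above refer to one another through `{placeholder}` tokens. The following is a minimal sketch of how such tokens could be expanded. `resolvePlaceholders` is a hypothetical helper (the resolution mechanism BMAD actually uses is not shown in this diff), and the `/repo` and `/repo/docs` values are made up for illustration.

```javascript
// Hypothetical helper: expands {name} tokens using already-resolved values.
function resolvePlaceholders(template, vars) {
  return template.replace(/\{([\w-]+)\}/g, (match, name) => {
    // Unknown tokens are left untouched so they stay visible downstream.
    return Object.hasOwn(vars, name) ? vars[name] : match;
  });
}

// Example values are assumptions for illustration only.
const vars = { 'project-root': '/repo', project_knowledge: '/repo/docs' };
vars.installed_path = resolvePlaceholders(
  '{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync',
  vars
);
vars.template_path = resolvePlaceholders('{installed_path}/knowledge-index-template.md', vars);
vars.output_file = resolvePlaceholders('{project_knowledge}/knowledge-index.md', vars);
```

Resolving in declaration order matters in this sketch: `template_path` can only be expanded once `installed_path` has been stored.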
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## EXECUTION
|
||||||
|
|
||||||
|
Load and execute `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md` to begin the workflow.
|
||||||
|
|
||||||
|
**Note:** Artifact discovery, source cataloging, and chunking strategy selection are handled in step-01-discover.md.
|
||||||
|
|
@ -11,7 +11,7 @@ const ui = new UI();
|
||||||
module.exports = {
|
module.exports = {
|
||||||
command: 'status',
|
command: 'status',
|
||||||
description: 'Display BMAD installation status and module versions',
|
description: 'Display BMAD installation status and module versions',
|
||||||
options: [],
|
options: [['-v, --verbose', 'Show detailed status including agent and workflow counts']],
|
||||||
action: async (options) => {
|
action: async (options) => {
|
||||||
try {
|
try {
|
||||||
// Find the bmad directory
|
// Find the bmad directory
|
||||||
|
|
@ -53,6 +53,23 @@ module.exports = {
|
||||||
bmadDir,
|
bmadDir,
|
||||||
});
|
});
|
||||||
|
|
||||||
|
// Verbose mode: show agent and workflow counts per module
|
||||||
|
if (options.verbose) {
|
||||||
|
const { glob } = require('glob');
|
||||||
|
for (const mod of modules) {
|
||||||
|
const moduleName = typeof mod === 'string' ? mod : (mod.id || mod.name || '');
|
||||||
|
if (!moduleName) continue;
|
||||||
|
|
||||||
|
const modDir = path.join(bmadDir, moduleName);
|
||||||
|
if (!(await fs.pathExists(modDir))) continue;
|
||||||
|
|
||||||
|
const agents = await glob('agents/**/*.agent.yaml', { cwd: modDir });
|
||||||
|
const workflows = await glob('workflows/**/*.{yaml,yml,md}', { cwd: modDir });
|
||||||
|
|
||||||
|
await prompts.log.info(`Module "${moduleName}": ${agents.length} agent(s), ${workflows.length} workflow(s)`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
process.exit(0);
|
process.exit(0);
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
await prompts.log.error(`Status check failed: ${error.message}`);
|
await prompts.log.error(`Status check failed: ${error.message}`);
|
||||||
|
|
|
||||||
|
|
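The verbose branch above counts files per module with `glob`. The following is a dependency-free sketch of the same counting idea, assuming a Node.js version in which `fs.readdir` supports the `recursive` option. `countModuleFiles` is a hypothetical name and is not part of the change.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Hypothetical helper approximating the two glob patterns:
//   agents/**/*.agent.yaml  and  workflows/**/*.{yaml,yml,md}
async function countModuleFiles(modDir) {
  // With recursive: true, entries are paths relative to modDir.
  const entries = await fs.readdir(modDir, { recursive: true });
  // Normalize separators so the prefix checks also work on Windows.
  const relative = entries.map((entry) => entry.split(path.sep).join('/'));
  return {
    agents: relative.filter((p) => p.startsWith('agents/') && p.endsWith('.agent.yaml')).length,
    workflows: relative.filter((p) => p.startsWith('workflows/') && /\.(yaml|yml|md)$/.test(p)).length,
  };
}
```

The workflow pattern matches every Markdown or YAML file under `workflows/`, including step files and templates, so the reported number counts workflow files and may be higher than the number of distinct workflows.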
@ -7,8 +7,14 @@ const packageJson = require('../../../package.json');
|
||||||
* Configuration utility class
|
* Configuration utility class
|
||||||
*/
|
*/
|
||||||
class Config {
|
class Config {
|
||||||
|
/** @type {Map<string, { data: Object, mtime: number }>} */
|
||||||
|
#cache = new Map();
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Load a YAML configuration file
|
* Load a YAML configuration file with in-memory caching.
|
||||||
|
* Cached entries are automatically invalidated when the file's
|
||||||
|
* modification time changes, so callers always receive fresh data
|
||||||
|
* after a file is written.
|
||||||
* @param {string} configPath - Path to config file
|
* @param {string} configPath - Path to config file
|
||||||
* @returns {Object} Parsed configuration
|
* @returns {Object} Parsed configuration
|
||||||
*/
|
*/
|
||||||
|
|
@ -17,8 +23,26 @@ class Config {
|
||||||
throw new Error(`Configuration file not found: ${configPath}`);
|
throw new Error(`Configuration file not found: ${configPath}`);
|
||||||
}
|
}
|
||||||
|
|
||||||
const content = await fs.readFile(configPath, 'utf8');
|
const resolved = path.resolve(configPath);
|
||||||
return yaml.parse(content);
|
const stat = await fs.stat(resolved);
|
||||||
|
const mtime = stat.mtimeMs;
|
||||||
|
|
||||||
|
const cached = this.#cache.get(resolved);
|
||||||
|
if (cached && cached.mtime === mtime) {
|
||||||
|
return cached.data;
|
||||||
|
}
|
||||||
|
|
||||||
|
const content = await fs.readFile(resolved, 'utf8');
|
||||||
|
const data = yaml.parse(content);
|
||||||
|
this.#cache.set(resolved, { data, mtime });
|
||||||
|
return data;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Clear the in-memory YAML cache.
|
||||||
|
*/
|
||||||
|
clearCache() {
|
||||||
|
this.#cache.clear();
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
|
|
||||||
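The caching change in `Config` relies on comparing a file's modification time against the value stored with the cached entry. The following is a self-contained sketch of that technique, not the project's code: `MtimeCache` and `demo` are hypothetical names, and JSON stands in for YAML so that the example runs without a third-party parser.

```javascript
const fs = require('node:fs/promises');
const os = require('node:os');
const path = require('node:path');

// Sketch of an mtime-validated cache (JSON substituted for YAML).
class MtimeCache {
  #cache = new Map();
  parseCount = 0; // exposed only to make cache hits observable

  async load(filePath) {
    const resolved = path.resolve(filePath);
    const { mtimeMs } = await fs.stat(resolved);
    const cached = this.#cache.get(resolved);
    if (cached && cached.mtime === mtimeMs) return cached.data; // cache hit
    const data = JSON.parse(await fs.readFile(resolved, 'utf8'));
    this.parseCount += 1;
    this.#cache.set(resolved, { data, mtime: mtimeMs });
    return data;
  }
}

async function demo() {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'mtime-cache-'));
  const file = path.join(dir, 'config.json');
  const cache = new MtimeCache();

  await fs.writeFile(file, JSON.stringify({ version: 1 }));
  const first = await cache.load(file);
  const second = await cache.load(file); // unchanged mtime: served from cache

  await fs.writeFile(file, JSON.stringify({ version: 2 }));
  // Force a distinct mtime so the demo does not depend on timestamp resolution.
  const future = new Date(Date.now() + 5000);
  await fs.utimes(file, future, future);
  const third = await cache.load(file); // changed mtime: re-read and re-parsed

  await fs.rm(dir, { recursive: true, force: true });
  return { sameObject: first === second, version: third.version, parses: cache.parseCount };
}
```

Two properties are worth noting. A cache hit returns the same object reference, so a caller that mutates the result also mutates the cached copy; callers that need to modify the configuration may want to clone it first. Two writes that land within the filesystem's timestamp resolution can also leave the mtime unchanged, in which case stale data would be served until the cache is cleared.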