This commit is contained in:
Marcus Bergo 2026-02-27 16:16:14 -03:00 committed by GitHub
commit 106e429020
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
9 changed files with 899 additions and 4 deletions

View File

@ -0,0 +1,3 @@
# Rules
* Never creates PRs for altering code after review. Always offer a fix and the option to commit.
* Qualify the severity of the change requested. NORMAL | IMPROVEMENT | FIX | CRITICAL

View File

@ -41,3 +41,7 @@ agent:
- trigger: DP or fuzzy match on document-project - trigger: DP or fuzzy match on document-project
workflow: "{project-root}/_bmad/bmm/workflows/document-project/workflow.yaml" workflow: "{project-root}/_bmad/bmm/workflows/document-project/workflow.yaml"
description: "[DP] Document Project: Analyze an existing project to produce useful documentation for both human and LLM" description: "[DP] Document Project: Analyze an existing project to produce useful documentation for both human and LLM"
- trigger: KS or fuzzy match on knowledge-sync
exec: "{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/workflow.md"
description: "[KS] Knowledge Sync: Build a RAG-ready knowledge index from project artifacts for optimized AI agent retrieval"

View File

@ -0,0 +1,86 @@
---
project_name: ''
user_name: ''
date: ''
total_chunks: 0
sources_indexed: 0
tag_vocabulary_size: 0
retrieval_tested: false
status: 'draft'
---
# Knowledge Index for {{project_name}}
_RAG-optimized knowledge base for AI agent retrieval. Each chunk is self-contained and tagged for semantic search._
---
## Index Summary
- **Total Chunks:** {{total_count}}
- **Critical:** {{critical_count}} | **High:** {{high_count}} | **Standard:** {{standard_count}} | **Reference:** {{ref_count}}
- **Sources Indexed:** {{source_count}}
- **Last Synced:** {{date}}
---
## Critical Knowledge
<!-- Critical-priority chunks go here. These are retrieved for every implementation task. -->
---
## Architecture Knowledge
<!-- Architecture decisions, system design patterns, and technology choices. -->
---
## Requirements Knowledge
<!-- Business rules, acceptance criteria, and constraints. -->
---
## Implementation Knowledge
<!-- Coding patterns, conventions, and implementation rules. -->
---
## Domain Knowledge
<!-- Business domain concepts, terminology, and definitions. -->
---
## Operations Knowledge
<!-- Deployment, monitoring, and workflow rules. -->
---
## Quality Knowledge
<!-- Testing patterns, review standards, and anti-patterns. -->
---
## Retrieval Configuration
### Query Mapping
| Query Pattern | Target Categories | Priority Filter | Expected Chunks |
|---|---|---|---|
| "how to implement \*" | implementation, architecture | critical, high | 3-5 |
| "testing requirements for \*" | quality, implementation | critical, high | 2-4 |
| "business rules for \*" | requirements, domain | all | 2-3 |
| "architecture decision for \*" | architecture | all | 1-3 |
| "deployment process for \*" | operations | all | 1-2 |
### Embedding Recommendations
- **Model:** Use an embedding model that handles technical content well
- **Chunk Overlap:** 50-100 characters overlap between adjacent chunks from the same source
- **Metadata Filters:** Always filter by category and priority for focused retrieval
- **Top-K:** Retrieve 3-5 chunks per query for optimal context balance

View File

@ -0,0 +1,179 @@
# Step 1: Artifact Discovery & Catalog
## MANDATORY EXECUTION RULES (READ FIRST):
- 🛑 NEVER generate content without user input
- ✅ ALWAYS treat this as collaborative discovery between technical peers
- 📋 YOU ARE A FACILITATOR, not a content generator
- 💬 FOCUS on discovering and cataloging all relevant project artifacts
- 🎯 IDENTIFY sources that provide high-value knowledge for RAG retrieval
- ⚠️ ABSOLUTELY NO TIME ESTIMATES - AI development speed has fundamentally changed
- ✅ YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
## EXECUTION PROTOCOLS:
- 🎯 Show your analysis before taking any action
- 📖 Read existing project files to catalog available artifacts
- 💾 Initialize document and update frontmatter
- 🚫 FORBIDDEN to load next step until discovery is complete
## CONTEXT BOUNDARIES:
- Variables from workflow.md are available in memory
- Focus on existing project artifacts and documentation
- Identify documents that contain reusable knowledge for AI agents
- Prioritize artifacts that prevent implementation mistakes and provide domain context
## YOUR TASK:
Discover, catalog, and classify all project artifacts that should be indexed for RAG retrieval by AI agents.
## DISCOVERY SEQUENCE:
### 1. Check for Existing Knowledge Index
First, check if a knowledge index already exists:
- Look for file at `{project_knowledge}/knowledge-index.md` or `{project-root}/**/knowledge-index.md`
- If exists: Read complete file to understand existing index
- Present to user: "Found existing knowledge index with {{chunk_count}} chunks across {{source_count}} sources. Would you like to update this or create a new one?"
### 2. Scan Planning Artifacts
Search `{planning_artifacts}` for documents containing project knowledge:
**Product Requirements:**
- Look for PRD files (`*prd*`, `*requirements*`)
- Extract key decisions, constraints, and acceptance criteria
- Note sections with high reuse value for agents
**Architecture Documents:**
- Look for architecture files (`*architecture*`, `*design*`)
- Extract technology decisions, patterns, and trade-offs
- Identify integration points and system boundaries
**Epic and Story Files:**
- Look for epic/story definitions (`*epic*`, `*stories*`)
- Extract acceptance criteria, implementation notes, and dependencies
- Identify cross-cutting concerns that appear across stories
### 3. Scan Implementation Artifacts
Search `{implementation_artifacts}` for implementation knowledge:
**Sprint and Status Files:**
- Look for sprint status, retrospectives, and course corrections
- Extract lessons learned and pattern changes
- Identify recurring issues and their resolutions
**Code Review Findings:**
- Look for code review artifacts
- Extract quality patterns and anti-patterns discovered
- Note corrections that should inform future implementation
### 4. Scan Project Knowledge
Search `{project_knowledge}` for existing knowledge assets:
**Project Context:**
- Look for `project-context.md` and similar files
- Extract implementation rules and coding conventions
- These are high-priority sources for RAG retrieval
**Research Documents:**
- Look for research outputs (market, domain, technical)
- Extract findings that inform implementation decisions
- Identify domain terminology and definitions
### 5. Scan Source Code for Patterns
Identify key code patterns worth indexing:
**Configuration Files:**
- Package manifests, build configs, linting rules
- Extract version constraints and tool configurations
- These provide critical context for code generation
**Key Source Files:**
- Identify entry points, shared utilities, and core modules
- Extract patterns that define the project's coding style
- Note any non-obvious conventions visible only in code
### 6. Classify and Prioritize Sources
For each discovered artifact, assign:
**Knowledge Category:**
- `architecture` - System design decisions and patterns
- `requirements` - Business rules and acceptance criteria
- `implementation` - Coding patterns and conventions
- `domain` - Business domain concepts and terminology
- `operations` - Deployment, monitoring, and workflow rules
- `quality` - Testing patterns, review standards, and anti-patterns
**Retrieval Priority:**
- `critical` - Must be retrieved for every implementation task
- `high` - Should be retrieved for related implementation tasks
- `standard` - Available when specifically relevant
- `reference` - Background context when explicitly needed
### 7. Present Discovery Summary
Report findings to user:
"Welcome {{user_name}}! I've scanned your project {{project_name}} to catalog artifacts for your RAG knowledge base.
**Artifacts Discovered:**
| Category | Count | Priority Breakdown |
|---|---|---|
| Architecture | {{count}} | {{critical}}/{{high}}/{{standard}} |
| Requirements | {{count}} | {{critical}}/{{high}}/{{standard}} |
| Implementation | {{count}} | {{critical}}/{{high}}/{{standard}} |
| Domain | {{count}} | {{critical}}/{{high}}/{{standard}} |
| Operations | {{count}} | {{critical}}/{{high}}/{{standard}} |
| Quality | {{count}} | {{critical}}/{{high}}/{{standard}} |
**Source Files Cataloged:** {{total_files}}
**Recommended Chunking Strategy:**
Based on your artifact types, I recommend {{strategy}} chunking:
- {{strategy_rationale}}
Ready to index and chunk your project knowledge for RAG retrieval.
[C] Continue to knowledge indexing"
## SUCCESS METRICS:
✅ All relevant project artifacts discovered and cataloged
✅ Each artifact classified by category and retrieval priority
✅ Source file paths accurately recorded
✅ Chunking strategy recommended based on artifact analysis
✅ Discovery findings clearly presented to user
✅ User ready to proceed with indexing
## FAILURE MODES:
❌ Missing critical artifacts in planning or implementation directories
❌ Not checking for existing knowledge index before creating new one
❌ Incorrect classification of artifact categories or priorities
❌ Not scanning source code for pattern-level knowledge
❌ Not presenting clear discovery summary to user
## NEXT STEP:
After user selects [C] to continue, load `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-02-index.md` to index and chunk the discovered artifacts.
Remember: Do NOT proceed to step-02 until user explicitly selects [C] from the menu and discovery catalog is confirmed!

View File

@ -0,0 +1,243 @@
# Step 2: Knowledge Indexing & Chunking
## MANDATORY EXECUTION RULES (READ FIRST):
- 🛑 NEVER generate content without user input
- ✅ ALWAYS treat this as collaborative indexing between technical peers
- 📋 YOU ARE A FACILITATOR, not a content generator
- 💬 FOCUS on creating self-contained, retrievable knowledge chunks
- 🎯 EACH CHUNK must be independently useful without requiring full document context
- ⚠️ ABSOLUTELY NO TIME ESTIMATES - AI development speed has fundamentally changed
- ✅ YOU MUST ALWAYS SPEAK OUTPUT In your Agent communication style with the config `{communication_language}`
## EXECUTION PROTOCOLS:
- 🎯 Show your analysis before taking any action
- 📝 Focus on creating atomic, self-contained knowledge chunks
- ⚠️ Present A/P/C menu after each major category
- 💾 ONLY save when user chooses C (Continue)
- 📖 Update frontmatter with completed categories
- 🚫 FORBIDDEN to load next step until all categories are indexed
## COLLABORATION MENUS (A/P/C):
This step will generate content and present choices for each knowledge category:
- **A (Advanced Elicitation)**: Use discovery protocols to explore nuanced knowledge relationships
- **P (Party Mode)**: Bring multiple perspectives to identify missing knowledge connections
- **C (Continue)**: Save the current chunks and proceed to next category
## PROTOCOL INTEGRATION:
- When 'A' selected: Execute {project-root}/_bmad/core/workflows/advanced-elicitation/workflow.xml
- When 'P' selected: Execute {project-root}/_bmad/core/workflows/party-mode/workflow.md
- PROTOCOLS always return to display this step's A/P/C menu after the A or P have completed
- User accepts/rejects protocol changes before proceeding
## CONTEXT BOUNDARIES:
- Discovery catalog from step-1 is available
- All artifact paths and classifications are identified
- Focus on creating chunks optimized for embedding and retrieval
- Each chunk must carry enough context to be useful in isolation
## YOUR TASK:
Index each discovered artifact into self-contained knowledge chunks with metadata tags, source tracing, and retrieval-optimized formatting.
## CHUNKING PRINCIPLES:
### Chunk Design Rules
1. **Self-Contained**: Each chunk must be understandable without reading the source document
2. **Tagged**: Every chunk has category, priority, source path, and semantic tags
3. **Atomic**: One concept or decision per chunk - no compound knowledge
4. **Traceable**: Every chunk links back to its source artifact and section
5. **Contextual**: Include enough surrounding context for accurate retrieval
6. **Deduplicated**: Avoid redundant chunks across different source artifacts
### Chunk Format
Each chunk follows this standard format:
```markdown
### [CHUNK-ID] Chunk Title
- **Source:** `{relative_path_to_source_file}`
- **Category:** architecture | requirements | implementation | domain | operations | quality
- **Priority:** critical | high | standard | reference
- **Tags:** comma-separated semantic tags for retrieval matching
**Context:** One-line description of when this knowledge is relevant.
**Content:**
The actual knowledge content - specific, actionable, self-contained.
```
## INDEXING SEQUENCE:
### 1. Index Critical-Priority Artifacts
Process all artifacts marked as `critical` priority first:
**For each critical artifact:**
- Read the complete source file
- Identify distinct knowledge units (decisions, rules, constraints)
- Create one chunk per knowledge unit
- Apply semantic tags for retrieval matching
- Present chunks to user for validation
**Present results:**
"I've created {{chunk_count}} critical-priority chunks from {{source_count}} sources:
{{list_of_chunk_titles_with_tags}}
These chunks will be prioritized in every retrieval query.
[A] Advanced Elicitation - Explore deeper knowledge connections
[P] Party Mode - Review from multiple implementation perspectives
[C] Continue - Save these chunks and proceed"
### 2. Index High-Priority Artifacts
Process all `high` priority artifacts:
**For each high-priority artifact:**
- Read source file and identify knowledge units
- Create chunks with appropriate tags
- Cross-reference with critical chunks for consistency
- Identify any overlaps and deduplicate
### 3. Index Standard-Priority Artifacts
Process `standard` priority artifacts:
**For each standard artifact:**
- Read source file for domain-specific knowledge
- Create chunks focused on contextual information
- Tag for specific retrieval scenarios
### 4. Index Reference-Priority Artifacts
Process `reference` priority artifacts:
**For each reference artifact:**
- Extract background context and terminology
- Create lighter-weight chunks for supplementary retrieval
- Tag for broad topic matching
### 5. Cross-Reference and Deduplicate
After all categories are indexed:
**Deduplication Analysis:**
- Identify chunks with overlapping content across sources
- Merge or consolidate redundant chunks
- Ensure cross-references between related chunks are tagged
- Present deduplication summary to user
**Relationship Mapping:**
- Identify chunks that frequently co-occur in implementation contexts
- Tag related chunks for retrieval grouping
- Create chunk clusters for common query patterns
### 6. Generate Knowledge Index Document
Compile all validated chunks into the knowledge index file:
**Document Structure:**
```markdown
# Knowledge Index for {{project_name}}
_RAG-optimized knowledge base for AI agent retrieval. Each chunk is self-contained and tagged for semantic search._
---
## Index Summary
- **Total Chunks:** {{total_count}}
- **Critical:** {{critical_count}} | **High:** {{high_count}} | **Standard:** {{standard_count}} | **Reference:** {{ref_count}}
- **Sources Indexed:** {{source_count}}
- **Last Synced:** {{date}}
---
## Critical Knowledge
{{critical_chunks}}
## Architecture Knowledge
{{architecture_chunks}}
## Requirements Knowledge
{{requirements_chunks}}
## Implementation Knowledge
{{implementation_chunks}}
## Domain Knowledge
{{domain_chunks}}
## Operations Knowledge
{{operations_chunks}}
## Quality Knowledge
{{quality_chunks}}
```
### 7. Present Indexing Summary
"Knowledge indexing complete for {{project_name}}!
**Chunks Created:**
| Category | Critical | High | Standard | Reference | Total |
|---|---|---|---|---|---|
| Architecture | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
| Requirements | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
| Implementation | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
| Domain | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
| Operations | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
| Quality | {{n}} | {{n}} | {{n}} | {{n}} | {{n}} |
**Deduplication:** Removed {{removed_count}} redundant chunks
**Cross-References:** {{xref_count}} chunk relationships mapped
[C] Continue to optimization"
## SUCCESS METRICS:
✅ All discovered artifacts indexed into self-contained chunks
✅ Each chunk has proper metadata tags and source tracing
✅ No redundant or overlapping chunks remain
✅ Cross-references between related chunks are mapped
✅ A/P/C menu presented and handled correctly for each category
✅ Knowledge index document properly structured
## FAILURE MODES:
❌ Creating chunks that require reading the full source document
❌ Missing semantic tags that prevent accurate retrieval
❌ Not deduplicating overlapping chunks from different sources
❌ Not cross-referencing related knowledge units
❌ Not getting user validation for each category
❌ Creating overly large chunks that reduce retrieval precision
## NEXT STEP:
After completing all categories and user selects [C], load `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-03-optimize.md` to optimize the knowledge base for retrieval quality.
Remember: Do NOT proceed to step-03 until all categories are indexed and user explicitly selects [C]!

View File

@ -0,0 +1,289 @@
# Step 3: Knowledge Base Optimization & Completion
## MANDATORY EXECUTION RULES (READ FIRST):
- 🛑 NEVER generate content without user input
- ✅ ALWAYS treat this as collaborative optimization between technical peers
- 📋 YOU ARE A FACILITATOR, not a content generator
- 💬 FOCUS on optimizing chunks for retrieval quality and accuracy
- 🎯 ENSURE every chunk is retrieval-ready and well-tagged
- ⚠️ ABSOLUTELY NO TIME ESTIMATES - AI development speed has fundamentally changed
- ✅ YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
## EXECUTION PROTOCOLS:
- 🎯 Show your analysis before taking any action
- 📝 Review and optimize chunks for retrieval precision
- 📖 Update frontmatter with completion status
- 🚫 NO MORE STEPS - this is the final step
## CONTEXT BOUNDARIES:
- All knowledge chunks from step-2 are indexed
- Cross-references and deduplication are complete
- Focus on retrieval quality optimization and finalization
- Ensure the knowledge index is ready for RAG pipeline integration
## YOUR TASK:
Optimize the knowledge index for retrieval quality, validate chunk completeness, and finalize the knowledge base for AI agent consumption.
## OPTIMIZATION SEQUENCE:
### 1. Retrieval Quality Analysis
Analyze the indexed chunks for retrieval effectiveness:
**Tag Coverage Analysis:**
- Review semantic tags across all chunks
- Identify gaps where common queries would miss relevant chunks
- Suggest additional tags for better retrieval matching
- Present tag coverage report to user
**Chunk Size Analysis:**
- Identify chunks that are too large (reduce retrieval precision)
- Identify chunks that are too small (lack sufficient context)
- Recommend splits or merges for optimal retrieval size
- Target: Each chunk should be 100-500 words for optimal embedding
**Context Sufficiency Check:**
- Verify each chunk is understandable without its source document
- Add missing context where chunks reference undefined terms
- Ensure technical terms are defined or tagged for glossary lookup
### 2. Semantic Tag Optimization
Optimize tags for retrieval accuracy:
**Tag Standardization:**
- Normalize similar tags (e.g., "api-design" and "api-patterns" → single standard)
- Create a tag vocabulary for the project
- Apply consistent tag format across all chunks
**Tag Enrichment:**
- Add technology-specific tags (framework names, library names)
- Add pattern-type tags (e.g., "error-handling", "state-management")
- Add lifecycle tags (e.g., "setup", "implementation", "testing", "deployment")
**Present Tag Summary:**
"I've optimized the semantic tags across {{chunk_count}} chunks:
**Tag Vocabulary:** {{unique_tag_count}} standardized tags
**Most Connected Tags:** {{top_tags_by_frequency}}
**Coverage Gaps Fixed:** {{gaps_fixed}}
Would you like to review the tag vocabulary? (y/n)"
### 3. Retrieval Scenario Testing
Validate retrieval quality with common query scenarios:
**Test Queries:**
Simulate these common developer queries against the knowledge index:
1. "How should I structure a new feature?" → Should retrieve: architecture + implementation chunks
2. "What are the testing requirements?" → Should retrieve: quality + implementation chunks
3. "What technology versions are we using?" → Should retrieve: critical implementation chunks
4. "How do I handle errors in this project?" → Should retrieve: implementation + quality chunks
5. "What are the business rules for {{core_feature}}?" → Should retrieve: requirements + domain chunks
**For each query, report:**
- Chunks that would be retrieved (by tag matching)
- Missing chunks that should be retrieved but aren't
- False positive chunks that would be retrieved incorrectly
- Recommended tag adjustments
### 4. Generate Retrieval Configuration
Create a retrieval configuration section in the knowledge index:
```markdown
## Retrieval Configuration
### Query Mapping
| Query Pattern | Target Categories | Priority Filter | Expected Chunks |
|---|---|---|---|
| "how to implement *" | implementation, architecture | critical, high | 3-5 |
| "testing requirements for *" | quality, implementation | critical, high | 2-4 |
| "business rules for *" | requirements, domain | all | 2-3 |
| "architecture decision for *" | architecture | all | 1-3 |
| "deployment process for *" | operations | all | 1-2 |
### Embedding Recommendations
- **Model:** Use an embedding model that handles technical content well
- **Chunk Overlap:** 50-100 characters overlap between adjacent chunks from the same source
- **Metadata Filters:** Always filter by category and priority for focused retrieval
- **Top-K:** Retrieve 3-5 chunks per query for optimal context balance
```
### 5. Finalize Knowledge Index
Complete the knowledge index with optimization results:
**Update Frontmatter:**
```yaml
---
project_name: '{{project_name}}'
user_name: '{{user_name}}'
date: '{{date}}'
total_chunks: {{total_count}}
sources_indexed: {{source_count}}
tag_vocabulary_size: {{tag_count}}
retrieval_tested: true
status: 'complete'
---
```
**Append Usage Guidelines:**
```markdown
---
## Usage Guidelines
**For AI Agents (RAG Retrieval):**
- Query this index using semantic search against chunk tags and content
- Always include critical-priority chunks in implementation context
- Filter by category when the task type is known
- Cross-reference related chunks using shared tags
**For Humans (Maintenance):**
- Re-run this workflow when new artifacts are created or significantly updated
- Add new chunks manually using the standard chunk format above
- Review and prune quarterly to remove outdated knowledge
- Update tags when new patterns or technologies are adopted
**For RAG Pipeline Integration:**
- Parse chunks by the `### [CHUNK-ID]` delimiter
- Extract metadata from the bullet-point headers (Source, Category, Priority, Tags)
- Use Tags field for semantic search indexing
- Use Priority field for retrieval ranking
```
### 6. Present Completion Summary
Based on user skill level, present the completion:
**Expert Mode:**
"Knowledge index complete. {{chunk_count}} chunks across {{source_count}} sources, {{tag_count}} semantic tags. Retrieval-tested and RAG-ready.
File saved to: `{project_knowledge}/knowledge-index.md`"
**Intermediate Mode:**
"Your project knowledge base is indexed and retrieval-ready!
**What we created:**
- {{chunk_count}} self-contained knowledge chunks
- {{source_count}} source artifacts indexed
- {{tag_count}} semantic tags for retrieval matching
- Retrieval configuration for RAG pipeline integration
**How it works:**
AI agents can now search this index to find exactly the project knowledge they need for any implementation task, instead of loading entire documents.
**Next steps:**
- Integrate with your RAG pipeline using the retrieval configuration
- Re-run this workflow when artifacts change significantly
- Review quarterly to keep knowledge current"
**Beginner Mode:**
"Your project knowledge base is ready! 🎉
**What this does:**
Think of this as a smart library catalog for your project. Instead of AI agents reading every document from start to finish, they can now search for exactly the knowledge they need.
**What's included:**
- {{chunk_count}} bite-sized knowledge pieces from your project documents
- Smart tags so agents can find the right knowledge quickly
- Priority labels so the most important knowledge comes first
**How AI agents use it:**
When an agent needs to implement something, it searches this index for relevant knowledge chunks instead of reading entire documents. This makes them faster and more accurate!"
### 7. Completion Validation
Final checks before completion:
**Content Validation:**
✅ All discovered artifacts indexed into chunks
✅ Each chunk has proper metadata and source tracing
✅ Semantic tags are standardized and comprehensive
✅ No redundant chunks remain after deduplication
✅ Retrieval scenarios tested successfully
✅ Retrieval configuration generated
**Format Validation:**
✅ Consistent chunk format throughout
✅ Frontmatter properly updated
✅ Tag vocabulary is standardized
✅ Document is well-structured and scannable
### 8. Completion Message
"✅ **GenAI Knowledge Sync Complete!**
Your retrieval-optimized knowledge index is ready at:
`{project_knowledge}/knowledge-index.md`
**📊 Knowledge Base Summary:**
- {{chunk_count}} indexed knowledge chunks
- {{source_count}} source artifacts cataloged
- {{tag_count}} semantic tags for retrieval
- {{category_count}} knowledge categories covered
- Retrieval-tested with {{test_count}} query scenarios
**🎯 RAG Integration Ready:**
- Self-contained chunks with metadata headers
- Standardized tag vocabulary for semantic search
- Priority-based retrieval ranking
- Query mapping configuration included
**📋 Maintenance:**
1. Re-sync when artifacts change significantly: run this workflow again
2. Add individual chunks manually using the standard format
3. Review quarterly to prune outdated knowledge
4. Update tags when new patterns emerge
Your AI agents can now retrieve precisely the project knowledge they need for any task!"
## SUCCESS METRICS:
✅ Knowledge index fully optimized for retrieval quality
✅ Semantic tags standardized and comprehensive
✅ Retrieval scenarios tested with good coverage
✅ Retrieval configuration generated for RAG pipeline
✅ Usage guidelines included for agents, humans, and pipelines
✅ Frontmatter properly updated with completion status
✅ User provided with clear maintenance guidance
## FAILURE MODES:
❌ Chunks too large or too small for effective retrieval
❌ Semantic tags inconsistent or too sparse
❌ Not testing retrieval scenarios before finalizing
❌ Missing retrieval configuration for pipeline integration
❌ Not providing maintenance and usage guidelines
❌ Frontmatter not properly updated
## WORKFLOW COMPLETE:
This is the final step of the GenAI Knowledge Sync workflow. The user now has a retrieval-optimized knowledge index that enables AI agents to find and use exactly the project knowledge they need for any implementation task, improving both speed and accuracy of AI-assisted development.

View File

@ -0,0 +1,50 @@
---
name: genai-knowledge-sync
description: 'Build and maintain a RAG-ready knowledge base from project artifacts. Use when the user says "build knowledge base", "sync knowledge", or "create RAG context"'
---
# GenAI Knowledge Sync Workflow
**Goal:** Create a structured, chunked knowledge index (`knowledge-index.md`) from project artifacts that is optimized for Retrieval-Augmented Generation (RAG) pipelines and AI agent context loading. This enables AI agents to retrieve the most relevant project knowledge at inference time rather than loading entire documents.
**Your Role:** You are a technical knowledge architect working with a peer to catalog, chunk, and index project artifacts into a retrieval-optimized format. You ensure every knowledge chunk is self-contained, well-tagged, and traceable to its source.
---
## WORKFLOW ARCHITECTURE
This uses **micro-file architecture** for disciplined execution:
- Each step is a self-contained file with embedded rules
- Sequential progression with user control at each step
- Document state tracked in frontmatter
- Focus on lean, retrieval-optimized content generation
- You NEVER proceed to a step file if the current step file indicates the user must approve and indicate continuation.
---
## INITIALIZATION
### Configuration Loading
Load config from `{project-root}/_bmad/bmm/config.yaml` and resolve:
- `project_name`, `output_folder`, `user_name`
- `communication_language`, `document_output_language`, `user_skill_level`
- `planning_artifacts`, `implementation_artifacts`, `project_knowledge`
- `date` as system-generated current datetime
- ✅ YOU MUST ALWAYS SPEAK OUTPUT In your Agent communication style with the config `{communication_language}`
### Paths
- `installed_path` = `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync`
- `template_path` = `{installed_path}/knowledge-index-template.md`
- `output_file` = `{project_knowledge}/knowledge-index.md`
---
## EXECUTION
Load and execute `{project-root}/_bmad/bmm/workflows/4-implementation/genai-knowledge-sync/steps/step-01-discover.md` to begin the workflow.
**Note:** Artifact discovery, source cataloging, and chunking strategy selection are handled in step-01-discover.md.

View File

@ -11,7 +11,7 @@ const ui = new UI();
module.exports = { module.exports = {
command: 'status', command: 'status',
description: 'Display BMAD installation status and module versions', description: 'Display BMAD installation status and module versions',
options: [], options: [['-v, --verbose', 'Show detailed status including agent and workflow counts']],
action: async (options) => { action: async (options) => {
try { try {
// Find the bmad directory // Find the bmad directory
@ -53,6 +53,23 @@ module.exports = {
bmadDir, bmadDir,
}); });
// Verbose mode: show agent and workflow counts per module
if (options.verbose) {
const { glob } = require('glob');
for (const mod of modules) {
const moduleName = typeof mod === 'string' ? mod : (mod.id || mod.name || '');
if (!moduleName) continue;
const modDir = path.join(bmadDir, moduleName);
if (!(await fs.pathExists(modDir))) continue;
const agents = await glob('agents/**/*.agent.yaml', { cwd: modDir });
const workflows = await glob('workflows/**/*.{yaml,yml,md}', { cwd: modDir });
await prompts.log.info(`Module "${moduleName}": ${agents.length} agent(s), ${workflows.length} workflow(s)`);
}
}
process.exit(0); process.exit(0);
} catch (error) { } catch (error) {
await prompts.log.error(`Status check failed: ${error.message}`); await prompts.log.error(`Status check failed: ${error.message}`);

View File

@ -7,8 +7,14 @@ const packageJson = require('../../../package.json');
* Configuration utility class * Configuration utility class
*/ */
class Config { class Config {
/** @type {Map<string, { data: Object, mtime: number }>} */
#cache = new Map();
/** /**
* Load a YAML configuration file * Load a YAML configuration file with in-memory caching.
* Cached entries are automatically invalidated when the file's
* modification time changes, so callers always receive fresh data
* after a file is written.
* @param {string} configPath - Path to config file * @param {string} configPath - Path to config file
* @returns {Object} Parsed configuration * @returns {Object} Parsed configuration
*/ */
@ -17,8 +23,26 @@ class Config {
throw new Error(`Configuration file not found: ${configPath}`); throw new Error(`Configuration file not found: ${configPath}`);
} }
const content = await fs.readFile(configPath, 'utf8'); const resolved = path.resolve(configPath);
return yaml.parse(content); const stat = await fs.stat(resolved);
const mtime = stat.mtimeMs;
const cached = this.#cache.get(resolved);
if (cached && cached.mtime === mtime) {
return cached.data;
}
const content = await fs.readFile(resolved, 'utf8');
const data = yaml.parse(content);
this.#cache.set(resolved, { data, mtime });
return data;
}
/**
* Clear the in-memory YAML cache.
*/
clearCache() {
this.#cache.clear();
} }
/** /**