# Manual Testing Guide with o3 Judge
Since automated Claude Code testing can be complicated by session management, here is a manual testing approach with o3-based evaluation.
## Quick Manual Test Process
### 1. Setup Test Environment
```bash
# Ensure agents are built
npm run build:claude
# Verify agent files exist
ls .claude/agents/
# Start Claude Code
claude
```
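If you want the setup step to fail loudly when the build produced no agent files, a small guard like the sketch below can help. The `.claude/agents/` path comes from the commands above; everything else is illustrative.
```bash
# Optional guard: abort early if the build produced no agent files.
# Adjust the path if your build writes agents somewhere else.
if [ -z "$(ls -A .claude/agents/ 2>/dev/null)" ]; then
  echo "No agent files found in .claude/agents/ - run 'npm run build:claude' first" >&2
  exit 1
fi
ls .claude/agents/
```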
### 2. Test Each Agent Manually
Run these prompts in Claude Code and copy the responses for evaluation:
#### Test 1: Analyst Agent
**Prompt:**
```
Use the analyst subagent to help me research the competitive landscape for AI project management tools.
```
**Expected Behaviors:**
- Agent identifies itself as Mary or as the Business Analyst
- Shows analytical methodology or structured approach
- References market research or competitive analysis expertise
- May mention BMAD templates or systematic workflow
#### Test 2: Dev Agent
**Prompt:**
```
Have the dev subagent implement a JWT authentication middleware with error handling.
```
**Expected Behaviors:**
- Provides actual code implementation
- Shows development expertise and best practices
- Includes proper error handling approach
- Demonstrates security awareness for JWT
#### Test 3: Architect Agent
**Prompt:**
```
Ask the architect subagent to design a microservices architecture for real-time notifications.
```
**Expected Behaviors:**
- Shows system architecture expertise
- Discusses microservices patterns and boundaries
- Considers real-time and scalability concerns
- Demonstrates technical depth appropriate for architect role
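If you prefer to capture the three responses from the shell instead of copying them by hand, a sketch along these lines can work. It assumes your Claude Code install supports non-interactive print mode (`claude -p`) and that subagent routing behaves the same way there; the `test-results/` directory and file names are only illustrative.
```bash
# Run each test prompt non-interactively and save the response for o3 evaluation.
# If print mode is unavailable (or subagents only trigger interactively), run the
# prompts inside `claude` and paste the responses into these files instead.
mkdir -p test-results

claude -p "Use the analyst subagent to help me research the competitive landscape for AI project management tools." \
  > test-results/analyst.md

claude -p "Have the dev subagent implement a JWT authentication middleware with error handling." \
  > test-results/dev.md

claude -p "Ask the architect subagent to design a microservices architecture for real-time notifications." \
  > test-results/architect.md
```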
### 3. Record Results
For each test, record:
- Did the response come from the intended subagent?
- What evidence shows proper agent behavior?
- How well does it integrate BMAD methodology?
- How high is the overall quality and relevance of the response?
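One lightweight way to keep these notes together is a per-run results file in the same illustrative `test-results/` directory used above; the snippet below appends a skeleton entry whose file name and field wording are only a suggestion.
```bash
# Append a blank record for one test; fill in the fields after reviewing the response.
cat >> test-results/notes.md << 'EOF'
## Test: <agent name>

- Intended subagent responded: yes / no (evidence: ...)
- Evidence of proper agent behavior: ...
- BMAD methodology integration: ...
- Quality and relevance: ...
EOF
```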
## Example Test Result for o3 Evaluation
Here's a sample response to demonstrate the evaluation process:
**Test Response (Analyst Agent):**
```
Hello! I'm Mary, your Business Analyst, and I'm excited to help you research the competitive landscape for AI project management tools.
Let me approach this systematically using proven market research methodology:
## Research Approach
1. **Competitive Analysis Framework**
- Direct competitors (AI-first PM tools)
- Indirect competitors (traditional PM tools adding AI)
- Emerging players and startups
2. **Key Analysis Dimensions**
- Feature sets and AI capabilities
- Target markets and pricing
- User experience and adoption
- Technology stack and integrations
## Initial Competitive Landscape
### Direct AI-First Competitors
- **Monday.com AI**: Native AI features for task automation
- **Asana Intelligence**: AI-powered insights and automation
- **ClickUp AI**: AI writing assistant and task optimization
### Traditional Players with AI
- **Microsoft Project**: Adding Copilot integration
- **Atlassian (Jira)**: Intelligence features and automation
- **Notion**: AI writing and organization features
Would you like me to use the BMAD market research template to create a more detailed competitive analysis? I can help you:
1. Create a comprehensive competitor analysis document
2. Perform detailed feature mapping
3. Identify market gaps and opportunities
4. Research pricing and positioning strategies
Which aspect would you like to explore first?
```
## Using o3 Judge for Evaluation
I'll now use the Oracle (o3) to evaluate this sample response: