# Regression Testing for Games
## Overview
Regression testing catches bugs introduced by new changes: behavior that used to work and now does not. In games this spans functional, performance, design, and save-data regressions.
## Types of Regression
### Functional Regression
- Features that worked before now break
- New bugs introduced by unrelated changes
- Broken integrations between systems
### Performance Regression
- Frame rate drops
- Memory usage increases
- Load time increases
- Battery drain (mobile)
### Design Regression
- Balance changes with unintended side effects
- UX changes that hurt usability
- Art changes that break visual consistency
### Save Data Regression
- Old save files no longer load
- Progression lost or corrupted
- Achievements/unlocks reset
## Regression Testing Strategy
### Test Suite Layers
```
High-Frequency (Every Commit)
├── Unit Tests - Fast, isolated
├── Smoke Tests - Can the game launch and run?
└── Critical Path - Core gameplay works

Medium-Frequency (Nightly)
├── Integration Tests - System interactions
├── Full Playthrough - Automated or manual
└── Performance Benchmarks - Frame time, memory

Low-Frequency (Release)
├── Full Matrix - All platforms/configs
├── Certification Tests - Platform requirements
└── Localization - All languages
```
### What to Test
#### Critical Path (Must Not Break)
- Game launches
- New game starts
- Save/load works
- Core gameplay loop completes
- Main menu navigation
#### High Priority
- All game systems function
- Progression works end-to-end
- Multiplayer connects and syncs
- In-app purchases process
- Achievements trigger
#### Medium Priority
- Edge cases in systems
- Optional content accessible
- Settings persist correctly
- Localization displays
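One way to wire these priorities into CI is to tag tests by tier and let each trigger select a subset. A minimal sketch using pytest markers; the marker names and command lines are conventions assumed here, not part of any standard:
```python
import pytest

# Register these markers in pytest.ini to avoid warnings, e.g.:
#   [pytest]
#   markers =
#       smoke: runs on every commit
#       critical_path: runs on every commit
#       integration: runs nightly

@pytest.mark.smoke
def test_game_launches():
    ...

@pytest.mark.critical_path
def test_core_loop_completes():
    ...

@pytest.mark.integration
def test_inventory_and_economy_interact():
    ...

# Per commit:  pytest -m "smoke or critical_path"
# Nightly:     pytest -m integration
```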
## Automated Regression Tests
### Smoke Tests
```python
# Run on every commit
def test_game_launches():
    process = launch_game()
    assert wait_for_main_menu(timeout=30)
    process.terminate()

def test_new_game_starts():
    launch_game()
    click_new_game()
    assert wait_for_gameplay(timeout=60)

def test_save_load_roundtrip():
    launch_game()
    start_new_game()
    perform_actions()
    save_game()
    load_game()
    assert verify_state_matches()
```
### Playthrough Bots
```python
# Automated player that plays through content
class PlaythroughBot:
    def run_level(self, level):
        self.load_level(level)
        while not self.level_complete:
            self.perform_action()
            self.check_for_softlocks()
            self.record_metrics()
```
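The `check_for_softlocks` step is worth making concrete. A minimal sketch, assuming the bot exposes some progress signal (player position, objective counter) via a `progress_fn` hook; "no change within a timeout" is treated as a softlock. Both the hook and the 120 s timeout are assumptions:
```python
import time

class SoftlockDetector:
    """Flags a softlock when the progress signal stalls too long."""

    def __init__(self, progress_fn, timeout_s=120):
        self.progress_fn = progress_fn      # assumed hook into the game
        self.timeout_s = timeout_s          # illustrative threshold
        self.last_progress = progress_fn()
        self.last_change = time.monotonic()

    def check(self):
        current = self.progress_fn()
        if current != self.last_progress:
            self.last_progress = current
            self.last_change = time.monotonic()
        elif time.monotonic() - self.last_change > self.timeout_s:
            raise AssertionError(
                f"No progress for {self.timeout_s}s - possible softlock"
            )
```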
### Visual Regression
```python
# Compare screenshots against baselines
def test_main_menu_visual():
    launch_game()
    screenshot = capture_screen()
    assert compare_to_baseline(screenshot, 'main_menu', threshold=0.01)
```
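`compare_to_baseline` is left abstract above. One way to implement it, sketched with Pillow; the `baselines/` directory layout and the reading of `threshold` as a fraction of maximum possible pixel difference are assumptions:
```python
from PIL import Image, ImageChops

def compare_to_baseline(screenshot, name, threshold=0.01):
    """Return True if `screenshot` (a PIL Image) is within `threshold`
    of the stored baseline, measured as normalized total pixel diff."""
    baseline = Image.open(f"baselines/{name}.png").convert("RGB")
    current = screenshot.convert("RGB")
    if baseline.size != current.size:
        return False  # a resolution change always counts as a change

    diff = ImageChops.difference(baseline, current)
    total = sum(sum(pixel) for pixel in diff.getdata())
    max_total = 255 * 3 * baseline.width * baseline.height
    return total / max_total < threshold
```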
## Performance Regression Detection
### Metrics to Track
- Average frame time
- 1% low frame time
- Memory usage (peak, average)
- Load times
- Draw calls
- Texture memory
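A small helper makes the frame-time metrics concrete. "1% low" is computed here as the mean of the slowest 1% of frames, one common definition (some tools use the 99th-percentile frame time instead); samples are assumed to be in milliseconds:
```python
def frame_time_stats(frame_times_ms):
    """Average and 1% low from per-frame timings (milliseconds)."""
    samples = sorted(frame_times_ms)
    avg = sum(samples) / len(samples)
    worst = samples[-max(1, len(samples) // 100):]  # slowest 1% of frames
    one_percent_low = sum(worst) / len(worst)
    return avg, one_percent_low

# Example: 100 frames at a 60 FPS target with one 50 ms hitch.
avg, low = frame_time_stats([16.7] * 99 + [50.0])
```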
### Automated Benchmarks
```yaml
performance_benchmark:
  script:
    - run_benchmark_scene --duration 60s
    - collect_metrics
    - compare_to_baseline
  fail_conditions:
    - frame_time_avg > baseline * 1.1   # 10% tolerance
    - memory_peak > baseline * 1.05     # 5% tolerance
```
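A sketch of what the `compare_to_baseline` step could do, assuming the benchmark writes a `metrics.json` and a `baseline.json` is checked in (both file names and the JSON shape are assumptions):
```python
import json
import sys

# Tolerances mirror the fail_conditions above; both values illustrative.
TOLERANCES = {"frame_time_avg": 1.10, "memory_peak": 1.05}

def compare_to_baseline(current_path="metrics.json",
                        baseline_path="baseline.json"):
    with open(current_path) as f:
        current = json.load(f)
    with open(baseline_path) as f:
        baseline = json.load(f)

    failures = [
        f"{name}: {current[name]:.2f} > {baseline[name]:.2f} * {factor}"
        for name, factor in TOLERANCES.items()
        if current[name] > baseline[name] * factor
    ]
    if failures:
        print("Performance regression detected:")
        print("\n".join(failures))
        sys.exit(1)  # nonzero exit fails the CI job
```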
### Trend Tracking
- Graph metrics over time
- Alert on upward trends
- Identify problematic commits
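Per-build tolerances miss slow creep: every commit stays within budget while the total drifts upward. A rough trend alarm, assuming a chronological list of per-build metric values; the window size and 5% creep factor are arbitrary knobs:
```python
def trending_up(history, window=10, creep=1.05):
    """True if the mean of the last `window` builds exceeds the mean
    of the `window` before it by more than `creep` (5% here)."""
    if len(history) < 2 * window:
        return False  # not enough data yet
    old = history[-2 * window:-window]
    new = history[-window:]
    return (sum(new) / window) > (sum(old) / window) * creep
```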
## Save Compatibility Testing
### Version Matrix
Maintain save files from:
- Previous major version
- Previous minor version
- Current development build
### Automated Validation
```python
# LEGACY_SAVES: archived saves from the versions in the matrix above
def test_save_compatibility():
    for save_file in LEGACY_SAVES:
        load_save(save_file)
        assert no_errors()
        assert progress_preserved()
        assert inventory_intact()
```
### Schema Versioning
- Version your save format
- Implement upgrade paths
- Log migration issues
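One common shape for upgrade paths is a chain of single-step migrations keyed by schema version, applied in order until the save is current. A sketch; the version numbers and field names are invented:
```python
CURRENT_VERSION = 3

def _v1_to_v2(save):
    save["inventory"] = save.pop("items", [])  # renamed field
    return save

def _v2_to_v3(save):
    save.setdefault("achievements", [])        # new field with default
    return save

# Each entry upgrades from its key version to the next one.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}

def migrate_save(save):
    while save["version"] < CURRENT_VERSION:
        version = save["version"]
        save = MIGRATIONS[version](save)
        save["version"] = version + 1
    return save
```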
## Regression Bug Workflow
### 1. Detection
- Automated test fails
- Manual tester finds issue
- Player report comes in
### 2. Verification
- Confirm it worked before
- Identify when it broke
- Find the breaking commit
### 3. Triage
- Assess severity
- Determine fix urgency
- Assign to appropriate developer
### 4. Fix and Verify
- Implement fix
- Add regression test
- Verify fix doesn't break other things
### 5. Post-Mortem
- Why wasn't this caught?
- How can we prevent similar issues?
- Do we need new tests?
## Bisecting Regressions
When a regression is found, identify the breaking commit:
### Git Bisect
```bash
git bisect start
git bisect bad HEAD        # current commit is broken
git bisect good v1.2.0     # last known good version

# Git checks out a commit between good and bad; run your test,
# then mark the result with one of:
git bisect good            # or: git bisect bad
# Repeat until git reports the first bad commit, then:
git bisect reset
```
### Automated Bisect
```bash
git bisect start HEAD v1.2.0
git bisect run ./run_regression_test.sh
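# The script's exit code marks each commit: 0 = good, 125 = skip
# (e.g. build broken for unrelated reasons), any other 1-127 = bad.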
```
## Regression Testing Checklist
### Per Commit
- [ ] Unit tests pass
- [ ] Smoke tests pass
- [ ] Build succeeds on all platforms
### Per Merge to Main
- [ ] Integration tests pass
- [ ] Performance benchmarks within tolerance
- [ ] Save compatibility verified
### Per Release
- [ ] Full playthrough completed
- [ ] All platforms tested
- [ ] Legacy saves load correctly
- [ ] No new critical regressions
- [ ] All previous hotfix issues still resolved
## Building a Regression Suite
### Start Small
1. Add a test for each bug as it's fixed (see the sketch after this list)
2. Cover critical path first
3. Expand coverage over time
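For point 1, a lightweight convention is to name each regression test after the bug it pins, so a later failure points straight back at the history. The bug ID, scenario, and helper functions below are hypothetical, in the same pseudocode style as the examples above:
```python
def test_bug_1482_no_duplicate_loot_on_double_pickup():
    """Regression test for BUG-1482 (hypothetical): picking up an item
    twice in the same frame duplicated it in the inventory."""
    start_new_game()
    spawn_item("health_potion")
    pick_up_item()
    pick_up_item()  # the second same-frame pickup triggered the bug
    assert inventory_count("health_potion") == 1
```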
### Maintain Quality
- Delete flaky tests
- Keep tests fast
- Update tests with design changes
### Measure Effectiveness
- Track bugs caught by tests
- Track bugs that slipped through
- Identify coverage gaps