BMAD-METHOD/src/modules/bmgd/gametest/knowledge/regression-testing.md


Regression Testing for Games

Overview

Regression testing catches bugs that new changes introduce into previously working behavior. In games this spans functional regressions, performance regressions, design regressions, and save data regressions.

Types of Regression

Functional Regression

  • Features that worked before now break
  • New bugs introduced by unrelated changes
  • Broken integrations between systems

Performance Regression

  • Frame rate drops
  • Memory usage increases
  • Load time increases
  • Battery drain (mobile)

Design Regression

  • Balance changes with unintended side effects
  • UX changes that hurt usability
  • Art changes that break visual consistency

Save Data Regression

  • Old save files no longer load
  • Progression lost or corrupted
  • Achievements/unlocks reset

Regression Testing Strategy

Test Suite Layers

High-Frequency (Every Commit)
├── Unit Tests - Fast, isolated
├── Smoke Tests - Can game launch and run?
└── Critical Path - Core gameplay works

Medium-Frequency (Nightly)
├── Integration Tests - System interactions
├── Full Playthrough - Automated or manual
└── Performance Benchmarks - Frame time, memory

Low-Frequency (Release)
├── Full Matrix - All platforms/configs
├── Certification Tests - Platform requirements
└── Localization - All languages
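One way to wire these layers into CI is to tag each test with its layer and have the pipeline select the right subset per trigger. A minimal sketch assuming pytest-style markers; the marker names and trigger keys are illustrative, not a fixed convention:

```python
# Map a CI trigger to the pytest marker expression selecting that layer.
# Marker names (unit, smoke, critical_path, ...) are illustrative.
LAYERS = {
    "commit": "unit or smoke or critical_path",
    "nightly": "integration or playthrough or benchmark",
    "release": "",  # no filter: the release run executes everything
}

def pytest_args(trigger: str) -> list[str]:
    """Build the pytest selection arguments for a CI trigger."""
    expr = LAYERS[trigger]
    return ["-m", expr] if expr else []
```

The CI job then invokes `pytest` with `pytest_args("commit")`, `pytest_args("nightly")`, and so on, so adding a test to a layer is just a matter of marking it.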

What to Test

Critical Path (Must Not Break)

  • Game launches
  • New game starts
  • Save/load works
  • Core gameplay loop completes
  • Main menu navigation

High Priority

  • All game systems function
  • Progression works end-to-end
  • Multiplayer connects and syncs
  • In-app purchases process
  • Achievements trigger

Medium Priority

  • Edge cases in systems
  • Optional content accessible
  • Settings persist correctly
  • Localization displays

Automated Regression Tests

Smoke Tests

# Run on every commit
def test_game_launches():
    process = launch_game()
    assert wait_for_main_menu(timeout=30)
    process.terminate()

def test_new_game_starts():
    process = launch_game()
    click_new_game()
    assert wait_for_gameplay(timeout=60)
    process.terminate()

def test_save_load_roundtrip():
    process = launch_game()
    start_new_game()
    perform_actions()
    save_game()
    load_game()
    assert verify_state_matches()
    process.terminate()

Playthrough Bots

# Automated player that plays through content
class PlaythroughBot:
    def run_level(self, level):
        self.load_level(level)
        while not self.level_complete:
            self.perform_action()
            self.check_for_softlocks()
            self.record_metrics()
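check_for_softlocks can be as simple as watching for lack of progress: if the bot's progress value has not changed for too long, flag a suspected softlock. A sketch, assuming the bot exposes some numeric or hashable progress indicator (the threshold is illustrative):

```python
import time

class SoftlockDetector:
    """Flag a softlock when progress stalls longer than a threshold."""

    def __init__(self, stall_limit_s=120.0):
        self.stall_limit_s = stall_limit_s
        self.last_progress = None
        self.last_change = time.monotonic()

    def update(self, progress):
        """Record the current progress value; return True if the value
        has been unchanged for longer than the stall limit."""
        now = time.monotonic()
        if progress != self.last_progress:
            self.last_progress = progress
            self.last_change = now
        return (now - self.last_change) > self.stall_limit_s
```

The bot would call detector.update(...) each tick with, say, objective count or player position, and abort the run with a report when it returns True.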

Visual Regression

# Compare screenshots against baselines
def test_main_menu_visual():
    launch_game()
    screenshot = capture_screen()
    assert compare_to_baseline(screenshot, 'main_menu', threshold=0.01)
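compare_to_baseline can be a mean pixel difference against a stored baseline image. A minimal sketch over raw RGB byte buffers, avoiding any image library; real pipelines often add per-region masks to ignore animated elements:

```python
def mean_pixel_diff(image_a: bytes, image_b: bytes) -> float:
    """Mean absolute per-channel difference, normalized to 0.0-1.0."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have identical dimensions")
    total = sum(abs(a - b) for a, b in zip(image_a, image_b))
    return total / (len(image_a) * 255)

def compare_to_baseline(screenshot: bytes, baseline: bytes,
                        threshold: float = 0.01) -> bool:
    """Pass when the screenshot is within threshold of the baseline."""
    return mean_pixel_diff(screenshot, baseline) <= threshold
```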

Performance Regression Detection

Metrics to Track

  • Average frame time
  • 1% low frame time
  • Memory usage (peak, average)
  • Load times
  • Draw calls
  • Texture memory
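The "1% low" above is conventionally the average frame time of the slowest 1% of frames, which captures stutter that an overall average hides. A sketch of the computation:

```python
def one_percent_low(frame_times_ms):
    """Average frame time of the slowest 1% of frames (at least one)."""
    worst = sorted(frame_times_ms, reverse=True)
    count = max(1, len(worst) // 100)
    return sum(worst[:count]) / count
```

For example, a run of 99 frames at 16 ms and one at 33 ms averages near 16 ms overall, but its 1% low is 33 ms, exposing the hitch.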

Automated Benchmarks

performance_benchmark:
  script:
    - run_benchmark_scene --duration 60s
    - collect_metrics
    - compare_to_baseline
  fail_conditions:
    - frame_time_avg > baseline * 1.1 # 10% tolerance
    - memory_peak > baseline * 1.05 # 5% tolerance

Trend Tracking

  • Graph metrics over time
  • Alert on upward trends
  • Identify problematic commits
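Alerting on an upward trend rather than a single-build threshold catches slow leaks that stay inside the per-build tolerance. One simple approach is a least-squares slope of the metric over recent builds; a sketch (the alert limit is illustrative):

```python
def trend_slope(values):
    """Least-squares slope of a metric over build index (needs >= 2 points)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def trending_up(values, per_build_limit=0.1):
    """Alert when the metric climbs faster than per_build_limit per build."""
    return trend_slope(values) > per_build_limit
```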

Save Compatibility Testing

Version Matrix

Maintain save files from:

  • Previous major version
  • Previous minor version
  • Current development build

Automated Validation

def test_save_compatibility():
    for save_file in LEGACY_SAVES:
        load_save(save_file)
        assert no_errors()
        assert progress_preserved()
        assert inventory_intact()

Schema Versioning

  • Version your save format
  • Implement upgrade paths
  • Log migration issues
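The upgrade path is typically a chain of per-version migration functions applied in order until the save reaches the current schema. A sketch; the field names and version numbers are hypothetical:

```python
# Each migration upgrades a save dict by exactly one schema version.
def migrate_v1_to_v2(save):
    save["gold"] = save.pop("coins", 0)  # hypothetical field rename
    return save

def migrate_v2_to_v3(save):
    save.setdefault("achievements", [])  # hypothetical new field
    return save

MIGRATIONS = {1: migrate_v1_to_v2, 2: migrate_v2_to_v3}
CURRENT_VERSION = 3

def upgrade_save(save):
    """Apply migrations in sequence until the save is current."""
    version = save.get("version", 1)
    while version < CURRENT_VERSION:
        save = MIGRATIONS[version](save)
        version += 1
        save["version"] = version
    return save
```

Because each step upgrades exactly one version, a save from any supported version walks the same chain, and each step can be regression-tested in isolation.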

Regression Bug Workflow

1. Detection

  • Automated test fails
  • Manual tester finds issue
  • Player report comes in

2. Verification

  • Confirm it worked before
  • Identify when it broke
  • Find the breaking commit

3. Triage

  • Assess severity
  • Determine fix urgency
  • Assign to appropriate developer

4. Fix and Verify

  • Implement fix
  • Add regression test
  • Verify fix doesn't break other things

5. Post-Mortem

  • Why wasn't this caught?
  • How can we prevent similar issues?
  • Do we need new tests?

Bisecting Regressions

When a regression is found, identify the breaking commit:

Git Bisect

git bisect start
git bisect bad HEAD          # Current is broken
git bisect good v1.2.0       # Known good version
# Git checks out a commit between good and bad
# Run your test, then mark the result:
git bisect good              # or: git bisect bad
# Repeat until git names the culprit commit, then:
git bisect reset

Automated Bisect

git bisect start HEAD v1.2.0
git bisect run ./run_regression_test.sh
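git bisect run interprets the script's exit code: 0 means good, 125 means "skip this commit" (e.g. it fails to build), and any other code from 1 to 127 means bad. The script can be written in any language; a sketch of the exit-code mapping in Python, where the build/test helpers it would call are hypothetical project commands:

```python
def bisect_exit(build_ok: bool, test_passed: bool) -> int:
    """Map a build/test outcome to git-bisect-run exit codes."""
    if not build_ok:
        return 125  # untestable commit: tell bisect to skip it
    return 0 if test_passed else 1
```

A wrapper script would run the project's build and regression test, then call sys.exit(bisect_exit(...)). Returning 125 on build failures matters: marking an unbuildable commit "bad" would send the bisection down the wrong path.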

Regression Testing Checklist

Per Commit

  • Unit tests pass
  • Smoke tests pass
  • Build succeeds on all platforms

Per Merge to Main

  • Integration tests pass
  • Performance benchmarks within tolerance
  • Save compatibility verified

Per Release

  • Full playthrough completed
  • All platforms tested
  • Legacy saves load correctly
  • No new critical regressions
  • All previous hotfix issues still resolved

Building a Regression Suite

Start Small

  1. Add tests for bugs as they're fixed
  2. Cover critical path first
  3. Expand coverage over time

Maintain Quality

  • Delete flaky tests
  • Keep tests fast
  • Update tests with design changes

Measure Effectiveness

  • Track bugs caught by tests
  • Track bugs that slipped through
  • Identify coverage gaps