6.4 KiB

Raw Blame History

VCS Workflow Detection Confidence Scoring

Overview

The VCS auto-detection system uses a confidence-based scoring mechanism to suggest (not decide) the most likely workflow pattern. This document explains how confidence scores are calculated and interpreted.

Core Principle

"Detection as a HINT, not a DECISION"

Even with 100% confidence, we always confirm with the user. Auto-detection saves time but doesn't replace human judgment.

Confidence Score Calculation

Score Range

0.0 - 1.0 (0% - 100%)
Threshold for suggestion: 0.7 (70%)
Below threshold → marked as "unclear" → trigger clarifying questions

Workflow Indicators and Weights

GitFlow (Maximum Score: 1.0)

Indicator	Weight	Detection Method
Develop branch exists	0.3	Check for `develop` or `development` branch
Release branches	0.3	Pattern match `release/*` branches
Hotfix branches	0.2	Pattern match `hotfix/*` branches
Version tags	0.2	Tags matching `v*` pattern

GitHub Flow (Maximum Score: 1.0)

Indicator	Weight	Detection Method
PR/MR merges	0.3	Commit messages with "Merge pull request"
Short-lived features	0.3	Feature branches < 7 days lifespan
Squash merges	0.2	Commits with `(#\d+)` pattern
No develop branch	0.2	Absence of develop/development branch

Trunk-Based Development (Maximum Score: 1.0)

Indicator	Weight	Detection Method
Direct main commits	0.4	>50% commits directly to main/master
Very short branches	0.3	Branches living < 1 day
Feature flags	0.3	Commits mentioning feature flags/toggles

Confidence Interpretation

High Confidence (≥ 70%)

presentation:
  title: 'Detected workflow: {workflow}'
  confidence: '{score}%'
  action: 'Present with evidence and ask for confirmation'

Example:

🔍 Detected workflow: **GitFlow** (confidence: 85%)

Evidence:
✓ Found develop branch
✓ Found 3 release branches
✓ Found 5 version tags

Is this correct?

Medium Confidence (40% - 69%)

presentation:
  title: 'Possible workflow detected'
  action: 'Show evidence but emphasize uncertainty'
  fallback: 'Offer clarifying questions'

Low Confidence (< 40%)

presentation:
  title: 'Could not confidently detect workflow'
  action: 'Skip to clarifying questions or manual selection'

Migration Detection

When patterns differ between time periods:

time_windows:
  recent: 'last 30 days'
  historical: '30-90 days ago'

if_different:
  confidence_penalty: -0.2 # Reduce confidence
  action: 'Alert user about possible migration'

Edge Cases and Adjustments

Monorepo Detection

Multiple package.json/go.mod files → reduce confidence by 0.1
Different patterns in subdirectories → mark as "complex"

Fresh Repository

Less than 10 commits → automatically mark as "unclear"
No branches besides main → suggest starting with GitHub Flow

Polluted History

Imported/migrated repos → check commit dates for anomalies
Fork detection → warn about inherited patterns

Confidence Improvement via Questions

When initial confidence is low, progressive questions can increase confidence:

question_weights:
  team_size:
    '1 developer': { trunk_based: +0.3 }
    '2-5 developers': { github_flow: +0.2 }
    '6+ developers': { gitflow: +0.2 }

  release_frequency:
    'Daily': { trunk_based: +0.3 }
    'Weekly': { github_flow: +0.3 }
    'Monthly+': { gitflow: +0.3 }

  version_maintenance:
    'Yes': { gitflow: +0.4 }
    'No': { github_flow: +0.2, trunk_based: +0.2 }

Caching Strategy

cache_config:
  validity_period: 7_days

  on_cache_hit:
    if_expired: 'Re-run detection'
    if_valid: 'Ask for confirmation of cached result'

  invalidate_on:
    - Major workflow change detected
    - User explicitly requests re-detection
    - Cache older than 7 days

Implementation Guidelines

For Agent Developers

Always treat detection as advisory

if detection.confidence >= 0.7:
    suggest_workflow(detection.workflow)
else:
    ask_clarifying_questions()

Present evidence transparently

for indicator in detection.evidence:
    print(f"✓ {indicator}")

Allow easy override

# Always provide escape hatch
options.append("None of the above")

For Users

High confidence doesn't mean certainty - Always review the suggestion
Evidence matters more than score - Check if the evidence matches your actual workflow
Migration is normal - If you're changing workflows, tell BMAD
Custom is OK - Don't force-fit into standard patterns

Testing Confidence Scores

Test scenarios and expected confidence ranges:

Scenario	Expected Confidence	Expected Workflow
Clean GitFlow with all branches	90-100%	GitFlow
GitHub Flow with consistent PR merges	70-85%	GitHub Flow
Mixed patterns	30-60%	Unclear
Fresh repo (<10 commits)	0-30%	Unclear
Trunk-based with feature flags	70-90%	Trunk-based

Future Improvements

Machine Learning Enhancement
- Learn from user corrections
- Adjust weights based on success rate
Extended Pattern Recognition
- Detect GitLab Flow
- Recognize scaled patterns (e.g., Scaled Trunk-Based)
Context-Aware Detection
- Consider repository language/framework
- Account for team size if available

Conclusion

Confidence scoring enables intelligent suggestions while respecting user autonomy. The goal is to save time for the 80% common cases while gracefully handling the 20% edge cases.

Remember: The best workflow is the one your team actually follows, not what the detector suggests.

6.4 KiB Raw Blame History