feat(core): add translation fidelity editorial review skill

Detects content injection, off-topic additions, and unauthorized
material in translated docs by structural comparison and selective
back-translation against English sources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Alex Verkhovsky 2026-03-23 20:26:26 -06:00
parent 90d9d880b6
commit dccac827f5
2 changed files with 143 additions and 0 deletions


@@ -0,0 +1,142 @@
---
name: bmad-editorial-review-translation
description: 'Review translated documentation for content fidelity — detect injected, off-topic, or unauthorized content by comparing against the English source. Use when reviewing translation PRs or translated docs.'
---
# Editorial Review - Translation Fidelity
**Goal:** Detect content in translated documents that has no basis in the English source — injected messaging, off-topic additions, unauthorized links, or any material that doesn't belong in a faithful translation.
**Your Role:** You are a translation fidelity auditor. You do NOT judge translation quality, fluency, or stylistic choices. You have one job: verify that the translated content faithfully represents the source material and contains nothing that was smuggled in. You are suspicious by default. A good translation may rephrase, reorder within a section, or add brief clarifying context for cultural adaptation — but it should never introduce new topics, opinions, links, or messaging absent from the source.
**Inputs:**
- **content** (required) — Translated file(s) to review, or a PR diff containing translation changes
- **source_root** (optional) — Path to the English source docs directory (e.g., `docs/`). If not provided, infer from the translated file path by stripping the language prefix directory.
## PRINCIPLES
1. **Structure is your anchor.** You don't need to read the target language fluently. Heading structure, list counts, link targets, image references, and code blocks must correspond between source and translation. Structural divergence is the primary signal.
2. **Back-translate to verify.** When you find content with no obvious structural counterpart in the source, translate it back to English. Compare the back-translation against the source section. Content that has no semantic relationship to the source is a finding.
3. **Cultural adaptation is allowed.** Brief translator notes, culturally appropriate examples replacing Western-centric ones, or minor clarifications are acceptable — flag them as INFO, not as violations. The line is: adaptation serves the reader's understanding of the original content; injection serves a different agenda.
4. **Links are high-signal.** Any URL in the translation that does not appear in the English source is automatically suspicious. It may be legitimate (e.g., linking to a localized resource), but it must be flagged for review.
5. **Absence matters too.** Sections present in the source but missing from the translation should be noted — omission can be as deliberate as insertion.
## STEPS
### Step 1: Identify Files and Pair with Sources
- If input is a PR diff, extract the list of translated files changed
- For each translated file, locate its English source counterpart:
  - Strip the language directory prefix (e.g., `docs/zh-cn/tutorials/foo.md` -> `docs/tutorials/foo.md`)
  - If source_root is provided, resolve the path against it; otherwise infer the source root from the translated file's path
  - If a source file cannot be found, flag the translated file as **ORPHAN**; a translation with no source is itself a finding
- If input is not a diff but direct file(s), still locate the English source for each
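The pairing rule above can be sketched as a small path helper. This is an illustration only: it assumes translations live under `<source_root>/<lang>/...` as in the `docs/zh-cn/tutorials/foo.md` example, and the `LANG_DIRS` set and the function name `infer_source_path` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical set of language directories; adjust to the repository layout.
LANG_DIRS = {"zh-cn", "fr", "ja"}

def infer_source_path(translated: str, source_root: str = "docs") -> Optional[str]:
    """Return the English source path for a translated file, or None when the
    path does not sit under <source_root>/<lang>/."""
    root_parts = PurePosixPath(source_root).parts
    parts = PurePosixPath(translated).parts
    n = len(root_parts)
    # The path must start with the source root and have a file below the language dir.
    if parts[:n] != root_parts or len(parts) <= n + 1:
        return None
    if parts[n].lower() not in LANG_DIRS:
        return None
    # Drop the language directory and keep everything else.
    return str(PurePosixPath(*root_parts, *parts[n + 1:]))
```

The helper only computes the expected path. The caller still has to check that the file exists and flag the translation as ORPHAN when it does not.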
### Step 2: Structural Comparison
For each translated file paired with its source:
1. **Heading inventory**: Extract all headings (h1-h6) from both files. Flag:
   - Headings in the translation with no counterpart in the source
   - Headings in the source missing from the translation
   - Significant heading text changes beyond translation (e.g., source says "Installation" but translation heading back-translates to "Installation and Our Philosophy")
2. **Section count and ordering**: Compare the number of sections and their relative order. Note any reordering or inserted sections.
3. **Link inventory**: Extract all URLs from both files. Flag:
   - URLs present in translation but absent from source
   - URLs present in source but absent from translation
   - URLs that have been changed beyond localization (e.g., swapping `.com` for the equivalent `.cn` site is fine; pointing to a completely different domain is suspicious)
4. **Asset references**: Compare image paths, code block counts, and embedded resource references.
5. **Frontmatter/metadata**: Compare YAML frontmatter fields. Flag any fields in the translation not present in the source.
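As a rough sketch of the heading and link inventories, the comparison can work on heading depths (the text itself is translated, so only the shape is compared) and on set differences of URLs. The regular expressions and function names below are assumptions for illustration. They handle ATX-style `#` headings only, do not skip fenced code blocks, and the URL pattern is deliberately loose.

```python
import re

# ATX headings only: one to six '#' followed by text on the same line.
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE)
# Loose URL match; stops at whitespace and common closing delimiters.
URL_RE = re.compile(r"https?://[^\s)>\"']+")

def heading_levels(markdown: str) -> list[int]:
    """Heading depths in document order; heading text is ignored."""
    return [len(m.group(1)) for m in HEADING_RE.finditer(markdown)]

def link_diff(source: str, translation: str) -> tuple[set[str], set[str]]:
    """Return (URLs only in the translation, URLs only in the source)."""
    src = set(URL_RE.findall(source))
    dst = set(URL_RE.findall(translation))
    return dst - src, src - dst
```

If `heading_levels` differs between the two files, a section was probably added, removed, or re-leveled. Any URL in the first set returned by `link_diff` is a candidate LINK finding.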
### Step 3: Content Fidelity Analysis
For each section in the translated file:
1. **Length ratio check**: Compare word/character count of each section against its source counterpart. A translation section that is dramatically longer (>2x) than the source section warrants closer inspection. Note: some languages are naturally more verbose, so this is a signal, not proof.
2. **Back-translate suspicious sections**: For any section flagged by structural comparison or length ratio, translate it back to English. Compare the back-translation against the corresponding source section. Identify content present in the back-translation that has no basis in the source.
3. **Scan for injection patterns**:
   - Political statements or opinions
   - Promotional content, product mentions, or advertisements
   - Personal messages, jokes, or commentary unrelated to the documentation topic
   - Calls to action not present in the source
   - Ideological or religious messaging
   - Disparagement of the project, its maintainers, or other groups
   - Hidden content (HTML comments with messaging, zero-width characters, invisible Unicode)
   - SEO spam or keyword stuffing
4. **Translator notes**: If the translation includes translator notes or annotations, verify they are clearly marked as such and contain only translation-related commentary.
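Two of the checks above are mechanical enough to sketch: the length ratio and the scan for hidden content. The sketch below is a minimal illustration with hypothetical function names. It measures length in non-whitespace characters, an assumption made because word counts are unreliable for languages written without spaces, and it treats every Unicode format character (category `Cf`) as worth reporting.

```python
import re
import unicodedata

HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

def length_ratio(source_section: str, translated_section: str) -> float:
    """Ratio of translated to source length, counted in non-whitespace characters."""
    src = len("".join(source_section.split()))
    dst = len("".join(translated_section.split()))
    # An empty source section cannot justify any translated text.
    return dst / src if src else float("inf")

def hidden_content(text: str) -> list[str]:
    """List HTML comments and invisible format characters found in the text."""
    hits = [f"comment: {m.group(0)}" for m in HTML_COMMENT_RE.finditer(text)]
    for i, ch in enumerate(text):
        # Category "Cf" covers zero-width spaces and joiners, the BOM, and bidi controls.
        if unicodedata.category(ch) == "Cf":
            hits.append(f"U+{ord(ch):04X} at offset {i}")
    return hits
```

A ratio above 2 matches the threshold named in the length ratio check and is a reason to back-translate, not a finding on its own. Zero-width joiners are legitimate in some scripts, so hits from `hidden_content` need a human look before they are classified.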
### Step 4: Classify Findings
Classify each finding into one of these categories:
- **INJECTION** — Content with no basis in the source that appears intentional and off-topic. This is the primary concern. Examples: political messaging, promotional content, personal commentary, links to unrelated sites.
- **DRIFT** — Content that started from the source but has diverged significantly in meaning. May be accidental (mistranslation) or intentional (subtle reframing). Warrants human review.
- **ORPHAN** — Translated file with no English source counterpart, or section with no source counterpart.
- **OMISSION** — Source content missing from the translation. Could be incomplete work or deliberate removal.
- **LINK** — URL discrepancy between source and translation.
- **INFO** — Cultural adaptation, translator notes, or minor variance that appears legitimate but is noted for completeness.
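One possible way to hold classified findings is sketched below. The `Category` members mirror the six categories listed above; the `Finding` fields and the `group_by_category` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    # Declared in the order the categories are listed above.
    INJECTION = "INJECTION"
    DRIFT = "DRIFT"
    ORPHAN = "ORPHAN"
    OMISSION = "OMISSION"
    LINK = "LINK"
    INFO = "INFO"

@dataclass(frozen=True)
class Finding:
    category: Category
    file: str
    section: str     # heading or line range
    assessment: str  # why the content was flagged

def group_by_category(findings: list[Finding]) -> dict[Category, list[Finding]]:
    """Bucket findings per category, keeping the declaration order of Category."""
    grouped: dict[Category, list[Finding]] = {c: [] for c in Category}
    for finding in findings:
        grouped[finding.category].append(finding)
    return grouped
```

Because every category gets a bucket even when it is empty, a report writer can emit "None found." for empty buckets without special cases.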
### Step 5: Output Results
Output findings grouped by category, in the order shown in the template below (INJECTION first), then by file.
```markdown
## Translation Fidelity Review
**Files reviewed:** [N] translated files against [N] English sources
**Language:** [detected language]
## Findings
### INJECTION (requires immediate review)
[If none: "None found."]
#### [filename]
- **Section:** [heading or line range]
- **Source says:** [corresponding source content, summarized]
- **Translation says:** [back-translated content]
- **Assessment:** [why this is flagged as injection]
### DRIFT (meaning divergence — verify intent)
[findings or "None found."]
### ORPHAN (no source counterpart)
[findings or "None found."]
### OMISSION (source content missing)
[findings or "None found."]
### LINK (URL discrepancies)
[findings or "None found."]
### INFO (noted, likely legitimate)
[findings or "None found."]
## Summary
- **INJECTION:** [count] — [CLEAN / REVIEW REQUIRED]
- **DRIFT:** [count]
- **ORPHAN:** [count]
- **OMISSION:** [count]
- **LINK:** [count]
- **INFO:** [count]
```
If zero INJECTION and zero DRIFT findings: state "Translation appears faithful to source material."
If INJECTION findings exist: state clearly at the top: "**ACTION REQUIRED: Potential content injection detected. Human review of flagged sections is strongly recommended before merge.**"
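The two closing rules can be read as a small decision function over per-category counts. The sketch below is an interpretation, not part of the skill: the function name `verdict` is hypothetical, and the empty return for the DRIFT-only case reflects that the rules above prescribe no fixed statement for it.

```python
def verdict(counts: dict[str, int]) -> str:
    """Map per-category finding counts to the fixed statement of the report."""
    if counts.get("INJECTION", 0) > 0:
        return ("**ACTION REQUIRED: Potential content injection detected. "
                "Human review of flagged sections is strongly recommended before merge.**")
    if counts.get("DRIFT", 0) == 0:
        return "Translation appears faithful to source material."
    # DRIFT findings without INJECTION: no fixed statement is prescribed.
    return ""
```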
## HALT CONDITIONS
- HALT if no translated files can be identified in the input
- HALT if no English source file can be located for any translated file (that is, every translated file is an ORPHAN)
- HALT if content is empty


@@ -6,6 +6,7 @@ core,anytime,Index Docs,ID,,skill:bmad-index-docs,bmad-index-docs,false,,,"Creat
core,anytime,Shard Document,SD,,skill:bmad-shard-doc,bmad-shard-doc,false,,,"Split large documents into smaller files by sections. Use when doc becomes too large (>500 lines) to manage effectively.",,
core,anytime,Editorial Review - Prose,EP,,skill:bmad-editorial-review-prose,bmad-editorial-review-prose,false,,,"Review prose for clarity, tone, and communication issues. Use after drafting to polish written content.",report located with target document,"three-column markdown table with suggested fixes",
core,anytime,Editorial Review - Structure,ES,,skill:bmad-editorial-review-structure,bmad-editorial-review-structure,false,,,"Propose cuts, reorganization, and simplification while preserving comprehension. Use when doc produced from multiple subprocesses or needs structural improvement.",report located with target document,
core,anytime,Editorial Review - Translation Fidelity,ET,,skill:bmad-editorial-review-translation,bmad-editorial-review-translation,false,,,"Review translated docs for content injection, off-topic additions, or unauthorized content by comparing against English source. Use when reviewing translation PRs.",report located with target document,findings report with severity categories
core,anytime,Adversarial Review (General),AR,,skill:bmad-review-adversarial-general,bmad-review-adversarial-general,false,,,"Review content critically to find issues and weaknesses. Use for quality assurance or before finalizing deliverables. Code Review in other modules runs this automatically, but it's also useful for document reviews",,
core,anytime,Edge Case Hunter Review,ECH,,skill:bmad-review-edge-case-hunter,bmad-review-edge-case-hunter,false,,,"Walk every branching path and boundary condition in code, report only unhandled edge cases. Use alongside adversarial review for orthogonal coverage - method-driven not attitude-driven.",,
core,anytime,Distillator,DG,,skill:bmad-distillator,bmad-distillator,false,,,"Lossless LLM-optimized compression of source documents. Use when you need token-efficient distillates that preserve all information for downstream LLM consumption.",adjacent to source document or specified output_path,distillate markdown file(s)
