Compare commits

...

10 Commits

Author SHA1 Message Date
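wdjwxh 1968f7fd3c feat: improve the change-pairing algorithm and update project docs
## Main changes

### wiki_sync.py
- Add `calculate_similarity()`, which computes text similarity with difflib.SequenceMatcher
- Improve `group_changes_by_line()` to pair changes by content similarity (threshold 0.5)
- Fix unrelated lines being wrongly paired as replaced

### Documentation
- README.md: explain usage through Claude Code and the /wiki-sync-translate skill
- CLAUDE.md: Claude Code configuration and usage examples
- SKILL.md: document the pairing algorithm

### Other
- Add requirements.txt: requests, python-dotenv
- Remove the old sync.py and target.txt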

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-22 15:14:55 +08:00
wdjwxh 73cc023b1a fix 2026-03-22 11:18:51 +08:00
wdjwxh c3e28bf7de add auto-translation skill 2026-03-22 11:12:35 +08:00
wdjwxh 40f123d4e4 fix 2025-12-22 10:25:47 +08:00
wdjwxh 8959124a15 fix: fix line misalignment bug 2025-12-19 16:56:05 +08:00
wdjwxh 0e1294d833 v2 2025-12-19 10:07:11 +08:00
wdjwxh 3d5eb0e017 merge 2025-12-12 12:45:13 +08:00
wdjwxh 3e72dde44d t6 2025-12-11 22:17:18 +08:00
wdjwxh 00c45a870f t5 2025-12-11 22:11:08 +08:00
wdjwxh 1e8473eb7b t4 2025-12-11 17:55:34 +08:00
13 changed files with 1792 additions and 1182 deletions

View File

@@ -1,7 +1,14 @@
 {
   "permissions": {
     "allow": [
-      "Bash(python:*)"
+      "Bash(python:*)",
+      "Bash(tree:*)",
+      "Bash(dir:*)",
+      "Bash(python3 -m venv venv)",
+      "Bash(source venv/bin/activate)",
+      "Bash(pip install:*)",
+      "Bash(source /mnt/d/code/sync-pd2-wiki/venv/bin/activate)",
+      "Bash(curl -s \"https://wiki.projectdiablo2.cn/w/api.php?action=query&prop=revisions&titles=Filter%20Info&rvprop=ids&format=json\")"
     ]
   }
 }

View File

@@ -0,0 +1,148 @@
---
name: mediawiki-wikitext
description: MediaWiki Wikitext markup language for Wikipedia and wiki-based sites. Use when creating or editing wiki articles, generating wikitext content, working with wiki tables/templates/references, or converting content to wikitext format. Triggers on requests mentioning Wikipedia, MediaWiki, wikitext, wiki markup, or wiki article creation.
---
# MediaWiki Wikitext
Generate and edit content using MediaWiki's wikitext markup language.
## Quick Reference
### Text Formatting
```wikitext
''italic'' '''bold''' '''''bold italic'''''
<code>inline code</code> <sub>subscript</sub> <sup>superscript</sup>
<s>strikethrough</s> <u>underline</u>
```
### Headings (line start only, avoid level 1)
```wikitext
== Level 2 ==
=== Level 3 ===
==== Level 4 ====
```
### Lists
```wikitext
* Bullet item
** Nested
# Numbered item
## Nested
; Term
: Definition
```
### Links
```wikitext
[[Page Name]] Internal link
[[Page Name|Display Text]] With display text
[[Page Name#Section]] Section link
[https://url Display Text] External link
[[File:image.jpg|thumb|Caption]] Image
[[Category:Name]] Category (place at end)
```
### Table
```wikitext
{| class="wikitable"
|+ Caption
|-
! Header 1 !! Header 2
|-
| Cell 1 || Cell 2
|}
```
### Templates & Variables
```wikitext
{{TemplateName}} Basic call
{{TemplateName|arg1|name=value}} With arguments
{{{parameter|default}}} Parameter (in template)
{{PAGENAME}} {{CURRENTYEAR}} Magic words
```
### References
```wikitext
Text<ref>Citation here</ref>
<ref name="id">Citation</ref> Named reference
<ref name="id" /> Reuse reference
{{Reflist}} Display footnotes
```
### Special Tags
```wikitext
<nowiki>[[escaped]]</nowiki> Disable markup
<pre>preformatted block</pre> Preformatted (no markup)
<syntaxhighlight lang="python"> Code highlighting
code here
</syntaxhighlight>
<math>x^2 + y^2 = z^2</math> LaTeX math
<!-- comment --> Comment (hidden)
---- Horizontal rule
#REDIRECT [[Target Page]] Redirect (first line only)
```
### Magic Words
```wikitext
__NOTOC__ Hide table of contents
__TOC__ Position TOC here
__NOEDITSECTION__ Hide section edit links
```
## Common Patterns
### Article Structure
```wikitext
{{Infobox Type
| name = Example
| image = Example.jpg
}}
'''Article Title''' is a brief introduction.
== Section ==
Content with citation<ref>Source</ref>.
=== Subsection ===
More content.
== See also ==
* [[Related Article]]
== References ==
{{Reflist}}
== External links ==
* [https://example.com Official site]
{{DEFAULTSORT:Sort Key}}
[[Category:Category Name]]
```
### Template Definition
```wikitext
<noinclude>{{Documentation}}</noinclude><includeonly>
{| class="wikitable"
! {{{title|Default Title}}}
|-
| {{{content|No content provided}}}
{{#if:{{{footer|}}}|
{{!}}-
{{!}} {{{footer}}}
}}
|}
</includeonly>
```
## Key Syntax Rules
1. **Headings**: Use `==` to `======`; don't use `=` (reserved for page title)
2. **Line-start markup**: Lists (`*#;:`), headings, tables (`{|`) must start at line beginning
3. **Closing tags**: Close heading equals on same line; no text after closing `==`
4. **Blank lines**: Create paragraph breaks; single newlines are ignored
5. **Pipes in templates**: Use `{{!}}` for literal `|` inside templates
6. **Escaping**: Use `<nowiki>` to escape markup; `&amp;` for `&`, `&lt;` for `<`
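Rule 6 can be sketched in Python for tools that generate wikitext. This is a minimal illustration, not part of the skill: `escape_wikitext` is a hypothetical helper name, and it assumes that only `&` and `<` need entity encoding before the text is wrapped in `<nowiki>`.

```python
def escape_wikitext(text: str) -> str:
    """Wrap text so MediaWiki shows it literally instead of parsing it."""
    # Encode '&' first so the entities added next are not double-encoded.
    safe = text.replace("&", "&amp;").replace("<", "&lt;")
    # <nowiki> disables link, template, and formatting markup.
    return f"<nowiki>{safe}</nowiki>"


print(escape_wikitext("[[Not a link]] & <b>not bold</b>"))
```

Encoding `<` also keeps a literal `</nowiki>` inside the input from closing the wrapper early.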
## Resources
For detailed syntax, see:
- **references/syntax.md**: Complete markup reference with all options
- **references/templates.md**: Template and parser function details
- **assets/snippets.yaml**: Editor snippets for common patterns

View File

@@ -0,0 +1,236 @@
# MediaWiki Wikitext Snippets
# For VS Code and compatible editors
---
# Headings
h2:
  prefix: '@h2'
  body: "== ${1:Heading} ==\n"
  description: Level 2 heading
h3:
  prefix: '@h3'
  body: "=== ${1:Heading} ===\n"
  description: Level 3 heading
h4:
  prefix: '@h4'
  body: "==== ${1:Heading} ====\n"
  description: Level 4 heading
# Text formatting
bold:
  prefix: '@bold'
  body: "'''${1:text}'''"
  description: Bold text
italic:
  prefix: '@italic'
  body: "''${1:text}''"
  description: Italic text
# Links
link:
  prefix: '@link'
  body: '[[${1:Page}]]'
  description: Internal link
linkd:
  prefix: '@linkd'
  body: '[[${1:Page}|${2:Display}]]'
  description: Link with display text
elink:
  prefix: '@elink'
  body: '[${1:https://} ${2:Text}]'
  description: External link
file:
  prefix: '@file'
  body: '[[File:${1:name.jpg}|thumb|${2:Caption}]]'
  description: Image/file
cat:
  prefix: '@cat'
  body: '[[Category:${1:Name}]]'
  description: Category
# Table
table:
  prefix: '@table'
  body: |
    {| class="wikitable"
    |+ ${1:Caption}
    |-
    ! ${2:Header1} !! ${3:Header2}
    |-
    | ${4:Cell1} || ${5:Cell2}
    |}
  description: Basic table
tr:
  prefix: '@tr'
  body: |
    |-
    | ${1} || ${2}
  description: Table row
# References
ref:
  prefix: '@ref'
  body: '<ref>${1:Citation}</ref>'
  description: Reference
refn:
  prefix: '@refn'
  body: '<ref name="${1:id}">${2:Citation}</ref>'
  description: Named reference
refr:
  prefix: '@refr'
  body: '<ref name="${1:id}" />'
  description: Reference reuse
reflist:
  prefix: '@reflist'
  body: |
    == References ==
    {{Reflist}}
  description: References section
# Templates
tpl:
  prefix: '@tpl'
  body: '{{${1:Template}}}'
  description: Template call
tplp:
  prefix: '@tplp'
  body: '{{${1:Template}|${2:param}=${3:value}}}'
  description: Template with params
infobox:
  prefix: '@infobox'
  body: |
    {{Infobox ${1:type}
    | name = ${2}
    | image = ${3}
    }}
  description: Infobox
# Code
code:
  prefix: '@code'
  body: |
    <syntaxhighlight lang="${1:python}">
    ${2:code}
    </syntaxhighlight>
  description: Syntax highlight block
codei:
  prefix: '@codei'
  body: '<code>${1:code}</code>'
  description: Inline code
nowiki:
  prefix: '@nowiki'
  body: '<nowiki>${1:text}</nowiki>'
  description: Escape markup
pre:
  prefix: '@pre'
  body: |
    <pre>
    ${1:text}
    </pre>
  description: Preformatted block
math:
  prefix: '@math'
  body: '<math>${1:formula}</math>'
  description: Math formula
# Comments
comment:
  prefix: '@comment'
  body: '<!-- ${1:comment} -->'
  description: Comment
todo:
  prefix: '@todo'
  body: '<!-- TODO: ${1:task} -->'
  description: TODO comment
# Magic words
notoc:
  prefix: '@notoc'
  body: '__NOTOC__'
  description: Hide TOC
toc:
  prefix: '@toc'
  body: '__TOC__'
  description: TOC position
# Lists
ul:
  prefix: '@ul'
  body: |
    * ${1:Item 1}
    * ${2:Item 2}
    * ${3:Item 3}
  description: Bullet list
ol:
  prefix: '@ol'
  body: |
    # ${1:Item 1}
    # ${2:Item 2}
    # ${3:Item 3}
  description: Numbered list
dl:
  prefix: '@dl'
  body: |
    ; ${1:Term}
    : ${2:Definition}
  description: Definition list
# Structure
redirect:
  prefix: '@redirect'
  body: '#REDIRECT [[${1:Target}]]'
  description: Redirect
article:
  prefix: '@article'
  body: |
    {{Infobox
    | name = ${1:Name}
    }}
    '''${2:Title}''' is ${3:description}.
    == Overview ==
    ${4:Content}
    == References ==
    {{Reflist}}
    [[Category:${5:Category}]]
  description: Article template
hr:
  prefix: '@hr'
  body: '----'
  description: Horizontal rule
br:
  prefix: '@br'
  body: '<br />'
  description: Line break
sort:
  prefix: '@sort'
  body: '{{DEFAULTSORT:${1:Key}}}'
  description: Default sort key

View File

@@ -0,0 +1,345 @@
# MediaWiki Wikitext Syntax Reference
Complete syntax reference for MediaWiki wikitext markup.
## Text Formatting (Inline)
| Syntax | Result | Notes |
|--------|--------|-------|
| `''text''` | *italic* | Two single quotes |
| `'''text'''` | **bold** | Three single quotes |
| `'''''text'''''` | ***bold italic*** | Five single quotes |
| `<code>text</code>` | `monospace` | Inline code |
| `<var>x</var>` | Variable style | |
| `<kbd>Ctrl</kbd>` | Keyboard input | |
| `<samp>output</samp>` | Sample output | |
| `<sub>2</sub>` | Subscript | H₂O |
| `<sup>2</sup>` | Superscript | x² |
| `<s>text</s>` | ~~strikethrough~~ | |
| `<del>text</del>` | Deleted text | Semantic |
| `<ins>text</ins>` | Inserted text | Often underlined |
| `<u>text</u>` | Underline | |
| `<small>text</small>` | Small text | |
| `<big>text</big>` | Large text | Deprecated |
## Headings
```wikitext
= Level 1 = (Don't use - reserved for page title)
== Level 2 ==
=== Level 3 ===
==== Level 4 ====
===== Level 5 =====
====== Level 6 ======
```
**Rules:**
- Must start at line beginning
- No text after closing equals on same line
- 4+ headings auto-generate TOC (unless `__NOTOC__`)
- Spaces around heading text are optional but recommended
## Lists
### Unordered (Bullet) Lists
```wikitext
* Item 1
* Item 2
** Nested item 2.1
** Nested item 2.2
*** Deeper nesting
* Item 3
```
### Ordered (Numbered) Lists
```wikitext
# First item
# Second item
## Sub-item 2.1
## Sub-item 2.2
### Sub-sub-item
# Third item
```
### Definition Lists
```wikitext
; Term 1
: Definition 1
; Term 2
: Definition 2a
: Definition 2b
```
### Mixed Lists
```wikitext
# Numbered item
#* Bullet under numbered
#* Another bullet
# Next numbered
#: Definition-style continuation
```
### Indentation
```wikitext
: Single indent
:: Double indent
::: Triple indent
```
**Note:** List markers must be at line start. Blank lines end the list.
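The nesting rule (one extra marker per level, always at line start) can be sketched as a small generator. This is an illustrative helper only; `render_list` and its nested-list input shape are assumptions, not part of MediaWiki or this repository.

```python
def render_list(items, marker="*", depth=1):
    """Render a nested Python list as wikitext list lines.

    A nested list becomes one level deeper than its siblings.
    """
    lines = []
    for item in items:
        if isinstance(item, list):
            # Recurse with one more marker character for the sub-list.
            lines.extend(render_list(item, marker, depth + 1))
        else:
            lines.append(f"{marker * depth} {item}")
    return lines


print("\n".join(render_list(["Item 1", "Item 2", ["Nested 2.1", "Nested 2.2"], "Item 3"])))
```

Passing `marker="#"` produces a numbered list with the same structure. The lines are joined without blank lines because a blank line would end the list.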
## Links
### Internal Links
```wikitext
[[Page Name]]
[[Page Name|Display Text]]
[[Page Name#Section]]
[[Page Name#Section|Display Text]]
[[Namespace:Page Name]]
[[/Subpage]]
[[../Sibling Page]]
```
### External Links
```wikitext
[https://example.com] Numbered link [1]
[https://example.com Display Text] Named link
https://example.com Auto-linked
```
### Special Links
```wikitext
[[File:Image.jpg]] Embed image
[[File:Image.jpg|thumb|Caption]] Thumbnail with caption
[[File:Image.jpg|thumb|left|200px|Caption]]
[[Media:File.pdf]] Direct file link
[[Category:Category Name]] Add to category
[[:Category:Category Name]] Link to category (no add)
[[Special:RecentChanges]] Special page
```
### Interwiki Links
```wikitext
[[en:English Article]] Language link
[[wikt:word]] Wiktionary
[[commons:File:Image.jpg]] Wikimedia Commons
```
## Images
### Basic Syntax
```wikitext
[[File:Example.jpg|options|caption]]
```
### Image Options
| Option | Description |
|--------|-------------|
| `thumb` | Thumbnail (default right-aligned) |
| `frame` | Framed, no resize |
| `frameless` | Thumbnail without frame |
| `border` | Thin border |
| `right`, `left`, `center`, `none` | Alignment |
| `200px` | Width |
| `x100px` | Height |
| `200x100px` | Max dimensions |
| `upright` | Smart scaling for tall images |
| `upright=0.5` | Custom ratio |
| `link=Page` | Custom link target |
| `link=` | No link |
| `alt=Text` | Alt text for accessibility |
### Gallery
```wikitext
<gallery>
File:Image1.jpg|Caption 1
File:Image2.jpg|Caption 2
</gallery>
<gallery mode="packed" heights="150">
File:Image1.jpg
File:Image2.jpg
</gallery>
```
## Tables
### Basic Structure
```wikitext
{| class="wikitable"
|+ Caption
|-
! Header 1 !! Header 2 !! Header 3
|-
| Cell 1 || Cell 2 || Cell 3
|-
| Cell 4 || Cell 5 || Cell 6
|}
```
### Table Elements
| Markup | Location | Meaning |
|--------|----------|---------|
| `{|` | Start | Table start |
| `|}` | End | Table end |
| `|+` | After `{|` | Caption |
| `|-` | Row | Row separator |
| `!` | Cell | Header cell |
| `!!` | Cell | Header cell separator (same row) |
| `|` | Cell | Data cell |
| `||` | Cell | Data cell separator (same row) |
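The markup elements in the table above can be combined programmatically. A minimal sketch, assuming cells contain no literal `|` characters; `to_wikitable` is a hypothetical helper name, not an existing API.

```python
def to_wikitable(headers, rows, caption=None, css_class="wikitable"):
    """Build a wikitext table from a header list and a list of rows."""
    lines = [f'{{| class="{css_class}"']  # table start: {|
    if caption:
        lines.append(f"|+ {caption}")      # optional caption
    lines.append("|-")
    lines.append("! " + " !! ".join(str(h) for h in headers))
    for row in rows:
        lines.append("|-")                 # row separator before every row
        lines.append("| " + " || ".join(str(cell) for cell in row))
    lines.append("|}")                     # table end
    return "\n".join(lines)


print(to_wikitable(["Name", "Value"], [["Alpha", 1], ["Beta", 2]]))
```

Emitting `|-` before every row keeps each row separate; without it, consecutive `|` lines are read as cells of the same row.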
### Cell Attributes
```wikitext
| style="background:#fcc" | Red background
| colspan="2" | Spans 2 columns
| rowspan="3" | Spans 3 rows
! scope="col" | Column header
! scope="row" | Row header
```
### Sortable Table
```wikitext
{| class="wikitable sortable"
|-
! Name !! Value
|-
| Alpha || 1
|-
| Beta || 2
|}
```
## References
### Basic Citation
```wikitext
Statement<ref>Source information</ref>
== References ==
{{Reflist}}
```
### Named References
```wikitext
First use<ref name="smith2020">Smith, 2020, p. 42</ref>
Second use<ref name="smith2020" />
```
### Grouped References
```wikitext
Note<ref group="note">Explanatory note</ref>
Source<ref>Regular citation</ref>
== Notes ==
{{Reflist|group="note"}}
== References ==
{{Reflist}}
```
## Special Tags
### nowiki (Escape Markup)
```wikitext
<nowiki>[[Not a link]]</nowiki>
<<nowiki/>nowiki> Outputs: <nowiki>
```
### pre (Preformatted)
```wikitext
<pre>
Preformatted text
Whitespace preserved
'''Markup not processed'''
</pre>
```
### syntaxhighlight (Code)
```wikitext
<syntaxhighlight lang="python">
def hello():
    print("Hello")
</syntaxhighlight>
<syntaxhighlight lang="python" line="1" start="10">
# Line numbers starting at 10
</syntaxhighlight>
```
Supported languages: python, javascript, php, java, c, cpp, csharp, ruby, perl, sql, xml, html, css, json, yaml, bash, etc.
### math (LaTeX)
```wikitext
Inline: <math>E = mc^2</math>
Block: <math display="block">\sum_{i=1}^n i = \frac{n(n+1)}{2}</math>
Chemistry: <chem>H2O</chem>
```
### Transclusion Control
```wikitext
<includeonly>Only when transcluded</includeonly>
<noinclude>Only on template page itself</noinclude>
<onlyinclude>Only this part is transcluded</onlyinclude>
```
## HTML Entities
| Entity | Character | Description |
|--------|-----------|-------------|
| `&amp;` | & | Ampersand |
| `&lt;` | < | Less than |
| `&gt;` | > | Greater than |
| `&nbsp;` | (space) | Non-breaking space |
| `&mdash;` | — | Em dash |
| `&ndash;` | – | En dash |
| `&rarr;` | → | Right arrow |
| `&larr;` | ← | Left arrow |
| `&copy;` | © | Copyright |
| `&euro;` | € | Euro |
| `&#58;` | : | Colon (in definition lists) |
## Miscellaneous
### Horizontal Rule
```wikitext
----
```
### Comments
```wikitext
<!-- This is a comment -->
<!--
Multi-line
comment
-->
```
### Line Breaks
```wikitext
Line 1<br />Line 2
```
### Redirect
```wikitext
#REDIRECT [[Target Page]]
#REDIRECT [[Target Page#Section]]
```
Must be first line of page.
### Signatures (Talk pages)
```wikitext
~~~ Username only
~~~~ Username and timestamp
~~~~~ Timestamp only
```
### Categories
```wikitext
[[Category:Category Name]]
[[Category:Category Name|Sort Key]]
{{DEFAULTSORT:Sort Key}}
```
Place at end of article.

View File

@@ -0,0 +1,311 @@
# MediaWiki Templates and Parser Functions
## Template Basics
### Calling Templates
```wikitext
{{TemplateName}}
{{TemplateName|positional arg}}
{{TemplateName|param1=value1|param2=value2}}
{{TemplateName
| param1 = value1
| param2 = value2
}}
```
### Template Parameters (Definition Side)
```wikitext
{{{1}}} First positional parameter
{{{paramName}}} Named parameter
{{{1|default}}} With default value
{{{paramName|}}} Empty default (vs undefined)
```
### Transclusion
```wikitext
{{:Page Name}} Transclude article (with colon)
{{Template Name}} Transclude template
{{subst:Template Name}} Substitute (one-time expansion)
{{safesubst:Template}} Safe substitution
{{msgnw:Template}} Show raw wikitext
```
## Parser Functions
### Conditionals
#### #if (empty test)
```wikitext
{{#if: {{{param|}}} | not empty | empty or undefined }}
{{#if: {{{param|}}} | has value }}
```
#### #ifeq (equality test)
```wikitext
{{#ifeq: {{{type}}} | book | It's a book | Not a book }}
{{#ifeq: {{{1}}} | {{{2}}} | same | different }}
```
#### #iferror
```wikitext
{{#iferror: {{#expr: 1/0}} | Division error | OK }}
```
#### #ifexist (page exists)
```wikitext
{{#ifexist: Page Name | [[Page Name]] | Page doesn't exist }}
```
#### #ifexpr (expression test)
```wikitext
{{#ifexpr: {{{count}}} > 10 | Many | Few }}
{{#ifexpr: {{{year}}} mod 4 = 0 | Leap year candidate }}
```
#### #switch
```wikitext
{{#switch: {{{type}}}
| book = 📚 Book
| article = 📄 Article
| website = 🌐 Website
| #default = 📋 Other
}}
{{#switch: {{{1}}}
| A | B | C = First three letters
| #default = Something else
}}
```
### String Functions
#### #len
```wikitext
{{#len: Hello }} Returns: 5
```
#### #pos (find position)
```wikitext
{{#pos: Hello World | o }} Returns: 4 (first 'o')
{{#pos: Hello World | o | 5 }} Returns: 7 (after position 5)
```
#### #sub (substring)
```wikitext
{{#sub: Hello World | 0 | 5 }} Returns: Hello
{{#sub: Hello World | 6 }} Returns: World
{{#sub: Hello World | -5 }} Returns: World (from end)
```
#### #replace
```wikitext
{{#replace: Hello World | World | Universe }} Returns: Hello Universe
```
#### #explode (split)
```wikitext
{{#explode: a,b,c,d | , | 2 }} Returns: c (third element)
```
#### urlencode / #urldecode
```wikitext
{{urlencode: Hello World }} Returns: Hello+World (default QUERY encoding)
{{urlencode: Hello World | PATH }} Returns: Hello%20World
{{#urldecode: Hello%20World }} Returns: Hello World
```
### Math Functions
#### #expr
```wikitext
{{#expr: 1 + 2 * 3 }} Returns: 7
{{#expr: (1 + 2) * 3 }} Returns: 9
{{#expr: 2 ^ 10 }} Returns: 1024
{{#expr: 17 mod 5 }} Returns: 2
{{#expr: floor(3.7) }} Returns: 3
{{#expr: ceil(3.2) }} Returns: 4
{{#expr: 3.567 round 2 }} Returns: 3.57 (round is a binary operator)
{{#expr: abs(-5) }} Returns: 5
{{#expr: sqrt(16) }} Returns: 4
{{#expr: ln(e) }} Returns: 1
{{#expr: sin(pi/2) }} Returns: 1
```
**Operators:** `+`, `-`, `*`, `/`, `^` (power), `mod`, `round`, `floor`, `ceil`, `abs`, `sqrt`, `ln`, `exp`, `sin`, `cos`, `tan`, `asin`, `acos`, `atan`, `pi`, `e`
**Comparison:** `=`, `<>`, `!=`, `<`, `>`, `<=`, `>=`
**Logical:** `and`, `or`, `not`
### Date/Time Functions
#### #time
```wikitext
{{#time: Y-m-d }} Current: 2024-01-15
{{#time: F j, Y | 2024-01-15 }} January 15, 2024
{{#time: Y年n月j日 | 2024-01-15 }} 2024年1月15日
{{#time: l | 2024-01-15 }} Monday
```
**Format codes:**
| Code | Output | Description |
|------|--------|-------------|
| Y | 2024 | 4-digit year |
| y | 24 | 2-digit year |
| n | 1 | Month (no leading zero) |
| m | 01 | Month (with leading zero) |
| F | January | Full month name |
| M | Jan | Abbreviated month |
| j | 5 | Day (no leading zero) |
| d | 05 | Day (with leading zero) |
| l | Monday | Full weekday |
| D | Mon | Abbreviated weekday |
| H | 14 | Hour (24h, leading zero) |
| i | 05 | Minutes (leading zero) |
| s | 09 | Seconds (leading zero) |
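To make the format codes concrete, the table can be mirrored in Python. This is only a sketch of the codes listed above, not a reimplementation of `#time`: `format_time` is a hypothetical name, English month and weekday names are hard-coded as an assumption, and any character outside the table is copied through unchanged.

```python
from datetime import datetime

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
            "Saturday", "Sunday"]


def format_time(fmt: str, when: datetime) -> str:
    """Expand the #time codes from the table above for one datetime."""
    month = MONTHS[when.month - 1]
    weekday = WEEKDAYS[when.weekday()]  # weekday(): Monday is 0
    codes = {
        "Y": f"{when.year:04d}", "y": f"{when.year % 100:02d}",
        "n": str(when.month), "m": f"{when.month:02d}",
        "F": month, "M": month[:3],
        "j": str(when.day), "d": f"{when.day:02d}",
        "l": weekday, "D": weekday[:3],
        "H": f"{when.hour:02d}", "i": f"{when.minute:02d}",
        "s": f"{when.second:02d}",
    }
    # Characters that are not format codes (spaces, commas, CJK) pass through.
    return "".join(codes.get(ch, ch) for ch in fmt)


moment = datetime(2024, 1, 15, 14, 5, 9)
print(format_time("F j, Y", moment))
print(format_time("Y年n月j日", moment))
```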
#### #timel (local time)
```wikitext
{{#timel: H:i }} Local time
```
### Formatting Functions
#### #formatnum
```wikitext
{{#formatnum: 1234567.89 }} 1,234,567.89
{{#formatnum: 1,234.56 | R }} 1234.56 (raw)
```
#### #padleft / #padright
```wikitext
{{#padleft: 7 | 3 | 0 }} 007
{{#padright: abc | 6 | . }} abc...
```
#### #lc / #uc / #lcfirst / #ucfirst
```wikitext
{{#lc: HELLO }} hello
{{#uc: hello }} HELLO
{{#lcfirst: HELLO }} hELLO
{{#ucfirst: hello }} Hello
{{lc: HELLO }} hello (shortcut)
```
### Other Functions
#### #tag
```wikitext
{{#tag: ref | Citation text | name=smith }}
Equivalent to: <ref name="smith">Citation text</ref>
```
#### #invoke (Lua modules)
```wikitext
{{#invoke: ModuleName | functionName | arg1 | arg2 }}
```
## Magic Words
### Behavior Switches
```wikitext
__NOTOC__ No table of contents
__FORCETOC__ Force TOC even with <4 headings
__TOC__ Place TOC here
__NOEDITSECTION__ No section edit links
__NEWSECTIONLINK__ Add new section link
__NONEWSECTIONLINK__ Remove new section link
__NOGALLERY__ No gallery in category
__HIDDENCAT__ Hidden category
__INDEX__ Index by search engines
__NOINDEX__ Don't index
__STATICREDIRECT__ Don't update redirect
```
### Page Variables
```wikitext
{{PAGENAME}} Page title without namespace
{{FULLPAGENAME}} Full page title
{{BASEPAGENAME}} Parent page name
{{SUBPAGENAME}} Subpage name
{{ROOTPAGENAME}} Root page name
{{TALKPAGENAME}} Associated talk page
{{NAMESPACE}} Current namespace
{{NAMESPACENUMBER}} Namespace number
{{PAGEID}} Page ID
{{REVISIONID}} Revision ID
```
### Site Variables
```wikitext
{{SITENAME}} Wiki name
{{SERVER}} Server URL
{{SERVERNAME}} Server hostname
{{SCRIPTPATH}} Script path
```
### Date/Time Variables
```wikitext
{{CURRENTYEAR}} 4-digit year
{{CURRENTMONTH}} Month (01-12)
{{CURRENTMONTHNAME}} Month name
{{CURRENTDAY}} Day (1-31)
{{CURRENTDAYNAME}} Day name
{{CURRENTTIME}} HH:MM
{{CURRENTTIMESTAMP}} YYYYMMDDHHmmss
```
### Statistics
```wikitext
{{NUMBEROFPAGES}} Total pages
{{NUMBEROFARTICLES}} Content pages
{{NUMBEROFFILES}} Files
{{NUMBEROFUSERS}} Registered users
{{NUMBEROFACTIVEUSERS}} Active users
{{NUMBEROFEDITS}} Total edits
{{PAGESINCATEGORY:Name}} Pages in category
```
## Template Examples
### Simple Infobox
```wikitext
<noinclude>{{Documentation}}</noinclude><includeonly>
{| class="infobox" style="width:22em"
|-
! colspan="2" style="background:#ccc" | {{{title|{{PAGENAME}}}}}
{{#if:{{{image|}}}|
{{!}}-
{{!}} colspan="2" {{!}} [[File:{{{image}}}|200px|center]]
}}
|-
| '''Type''' || {{{type|Unknown}}}
|-
| '''Date''' || {{{date|—}}}
|}
</includeonly>
```
### Navbox Template
```wikitext
<noinclude>{{Documentation}}</noinclude><includeonly>
{| class="navbox" style="width:100%"
|-
! style="background:#ccf" | {{{title|Navigation}}}
|-
| {{{content|}}}
|}
</includeonly>
```
### Citation Template
```wikitext
<includeonly>{{#if:{{{author|}}}|{{{author}}}. }}{{#if:{{{title|}}}|''{{{title}}}''. }}{{#if:{{{publisher|}}}|{{{publisher}}}{{#if:{{{year|}}}|, }}}}{{{year|}}}.{{#if:{{{url|}}}| [{{{url}}} Link]}}</includeonly>
```
## Tips
1. **Pipe trick**: `[[Help:Contents|]]` displays as "Contents"
2. **Escape pipes in templates**: Use `{{!}}` for literal `|`
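3. **Trim whitespace**: Named parameters have surrounding whitespace trimmed automatically; positional parameters keep it
4. **Check emptiness correctly**: `{{{param|}}}` vs `{{{param}}}` - the former falls back to an empty string; the latter renders literally as `{{{param}}}` if the parameter is not passed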
5. **Subst for speed**: Use `{{subst:Template}}` for templates that don't need dynamic updates
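Tip 4 is easy to get wrong, so here is a small Python model of how a parameter reference resolves. It is a simplified illustration under the assumption that only "passed", "default given", and "neither" matter; `expand_param` is a hypothetical name, not a MediaWiki API.

```python
def expand_param(name, args, default=None):
    """Model how {{{name}}} or {{{name|default}}} resolves inside a template.

    `args` holds the parameters the caller actually passed.
    `default=None` models a reference written without a `|default` part.
    """
    if name in args:
        return args[name]              # passed, even if empty
    if default is not None:
        return default                 # {{{name|default}}} fallback
    return "{{{" + name + "}}}"        # undefined: rendered literally


print(repr(expand_param("footer", {}, default="")))  # models {{{footer|}}}
print(repr(expand_param("footer", {})))              # models {{{footer}}}
```

This is why `{{#if:{{{footer|}}}|...}}` works as an emptiness test: with the empty default the condition sees an empty string, whereas without it the condition sees the non-empty literal text.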

View File

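@@ -0,0 +1,164 @@
---
name: wiki-sync-translate
description: Sync changes from English MediaWiki pages into the Chinese translation. Triggers when the user needs to update a Chinese wiki page, sync English changes, or translate wiki content. Intended for bilingual (English and Chinese) maintenance of the Project Diablo 2 Wiki.
---
# Wiki Sync Translation
Sync changes from English wiki pages into the Chinese translation while keeping line numbers aligned.
## When to use
- The user asks to update a Chinese wiki page
- The user asks to sync changes from the English wiki
- The user names a page whose translation needs updating
- The user runs `/wiki-sync-translate <page name>`
## Working directory
The script lives at `.claude/skills/wiki-sync-translate/scripts/wiki_sync.py`
## Steps
### Step 1: Run the sync script
Use the dedicated script in the skill directory to fetch the changes:
```bash
cd /mnt/d/code/sync-pd2-wiki
source venv/bin/activate
python .claude/skills/wiki-sync-translate/scripts/wiki_sync.py --title "<page name>" --since <last-sync-time> --run
```
Parameters:
- `--title`: name of the page to sync
- `--since`: start time, in a format such as `2026-01-02T12:07:05Z`
- `--run`: required; the script does nothing unless this flag is given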
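The `--since` value has the shape of a UTC timestamp with a trailing `Z`. A minimal sketch of producing one from a local time; `to_since_arg` is a hypothetical helper and is not part of `wiki_sync.py`.

```python
from datetime import datetime, timedelta, timezone


def to_since_arg(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2026-01-02T12:07:05Z."""
    # Convert to UTC first so the trailing 'Z' is truthful.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A +08:00 local time, matching the offset used in the commit list.
local = datetime(2026, 1, 2, 20, 7, 5, tzinfo=timezone(timedelta(hours=8)))
print(to_since_arg(local))
```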
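### Step 2: Read the output files
The script writes the following files under `wiki_sync_output/<timestamp>/changed_pages/`:
| File | Description | Purpose |
|------|------|------|
| `*.comparison.json` | Structured change information | **Must be read; contains line numbers and changed content** |
| `*.full.txt` | Latest English version | Consult when needed |
| `*.cn.txt` | Current Chinese text | **Copy to result_pages; do not read it directly** |
| `*.old.txt` | Previous English version | Consult when needed |
**Important:** Read only `comparison.json`. Do not read the whole `*.cn.txt` file, to save tokens.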
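Locating the files to read can be sketched as follows. This assumes the run directories are named with sortable timestamps such as `20260322_151455`, so the newest run sorts last; `latest_comparison_files` and the sample file name are hypothetical.

```python
import tempfile
from pathlib import Path


def latest_comparison_files(output_root: Path) -> list[Path]:
    """Return the *.comparison.json files of the newest run directory."""
    runs = sorted(p for p in output_root.iterdir() if p.is_dir())
    if not runs:
        return []
    # Timestamped names sort chronologically, so the last entry is newest.
    return sorted((runs[-1] / "changed_pages").glob("*.comparison.json"))


# Demo against a throwaway directory tree instead of real script output.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for run in ("20260101_090000", "20260322_151455"):
        (root / run / "changed_pages").mkdir(parents=True)
    sample = root / "20260322_151455" / "changed_pages" / "Filter Info.comparison.json"
    sample.write_text("{}", encoding="utf-8")
    found = [p.name for p in latest_comparison_files(root)]
print(found)
```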
### Step 3: Parse comparison.json
Format of `comparison.json`:
```json
{
  "title": "page title",
  "has_cn_translation": true,
  "summary": {
    "total_changes": 1,
    "replaced": 1,
    "added": 0,
    "removed": 0
  },
  "changes": [
    {
      "type": "replaced",
      "old_line": 66,
      "new_line": 66,
      "old_content": "old content",
      "new_content": "new content"
    }
  ]
}
```
Change types:
- `replaced`: replacement; `old_line` is the line number to modify
- `added`: addition; `new_line` is the insert position
- `removed`: deletion; `old_line` is the line to delete
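As a rough illustration of how the three change types map to actions, the sketch below walks the `changes` array. `describe_changes` is a hypothetical helper; the field names are the ones shown in the JSON above.

```python
import json


def describe_changes(comparison: dict) -> list[str]:
    """Turn each change record into a short instruction string."""
    actions = []
    for change in comparison.get("changes", []):
        kind = change["type"]
        if kind == "replaced":
            actions.append(f"edit line {change['old_line']}")
        elif kind == "added":
            actions.append(f"insert at line {change['new_line']}")
        elif kind == "removed":
            actions.append(f"delete line {change['old_line']}")
        else:
            raise ValueError(f"unknown change type: {kind}")
    return actions


sample = json.loads("""
{"title": "Example", "has_cn_translation": true,
 "changes": [{"type": "replaced", "old_line": 66, "new_line": 66,
              "old_content": "old content", "new_content": "new content"}]}
""")
print(describe_changes(sample))
```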
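**Pairing algorithm:**
Starting with v2, the script uses **content similarity** to decide whether a `removed` line and an `added` line should be paired as `replaced`:
- The similarity threshold is 0.5 (50%)
- A removed line and an added line are paired only when their content similarity is ≥ 0.5
- This avoids pairing lines whose content is completely unrelated
**Note:** If `old_line` and `new_line` are far apart but the content is similar (for example, the line number moves from 171 to 193), other lines were usually inserted or deleted in between, so check carefully whether the change is really related.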
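The threshold test can be reproduced in a few lines with `difflib.SequenceMatcher`, the same standard-library class the commit message names. `should_pair` is a hypothetical wrapper used only for this sketch, and the sample lines are illustrative.

```python
import difflib


def should_pair(removed: str, added: str, threshold: float = 0.5) -> bool:
    """Return True when two lines are similar enough to count as one edit."""
    ratio = difflib.SequenceMatcher(None, removed.strip(), added.strip()).ratio()
    return ratio >= threshold


# Same line with only the date changed: paired as "replaced".
print(should_pair("| 2025-11-25<br>(Season 12)", "| 2026-01-25<br>(Season 12)"))
# Unrelated lines: kept as separate "removed" and "added" entries.
print(should_pair("== Overview ==", "* [[Related Article]]"))
```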
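### Step 4: Update the Chinese document
**Core principle: line numbers must match exactly; use incremental edits to reduce token usage**
1. Create the `wiki_sync_output/<timestamp>/result_pages/` directory if it does not exist
2. **Determine the page type**:
   - **Existing Chinese translation** (`has_cn_translation: true`): copy `*.cn.txt` to result_pages, then modify it incrementally with Edit
   - **New page** (`has_cn_translation: false` or `is_new_page: true`): read the English `*.full.txt`, translate it, and Write the result directly to result_pages
3. **Use the Edit tool** for incremental changes (only for pages that already have a translation):
   - Locate the line to change using `old_line`
   - Take the **full content** of that line from `*.cn.txt` as `old_string`
   - Build the new line content as `new_string`
   - Smart update rules:
     - Sync only what changed (such as dates and numbers)
     - Keep the Chinese translation (for example, do not change "赛季 12" to "Season 12")
     - Translate newly added English content into Chinese; the `Grep` tool can search `references/PatchString.txt` for the Chinese names of skills or items
     - Keep the MediaWiki syntax valid
4. Use one Edit call per change
**Example workflow:**
```bash
# 1. Copy the file
cp wiki_sync_output/<timestamp>/changed_pages/*.cn.txt wiki_sync_output/<timestamp>/result_pages/<page-name>.cn.txt
```
```json
// 2. Get the change from comparison.json (say line 66 needs to change)
// 3. Read only the content around that line to confirm, then modify it with Edit
```
### Step 5: Output
The updated document is at `wiki_sync_output/<timestamp>/result_pages/<page-name>.cn.txt`; the user can copy it straight into the wiki.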
## Example
**Input - comparison.json:**
```json
{
  "changes": [{
    "type": "replaced",
    "old_line": 66,
    "old_content": "| style=\"...\"| 2025-11-25<br>(Season 12)",
    "new_content": "| style=\"...\"| 2026-01-25<br>(Season 12)"
  }]
}
```
**Chinese source, line 66:**
```
| style="color:#3f6e2d; background-color:#161f0c; border-color:#0d1709"| 2025-11-25<br>(赛季 12)
```
**After the update, line 66:**
```
| style="color:#3f6e2d; background-color:#161f0c; border-color:#0d1709"| 2026-01-25<br>(赛季 12)
```
**What changed:** the date goes from `2025-11-25` to `2026-01-25`, while `赛季 12` stays in Chinese.
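The idea of syncing only the changed fragment can be sketched mechanically. This is an illustration under strong assumptions, not how the skill is implemented: it handles a single contiguous change, and it gives up (returns `None`) unless the changed English fragment occurs exactly once in the Chinese line. `changed_span` and `sync_line` are hypothetical names.

```python
def changed_span(old: str, new: str) -> tuple[str, str]:
    """Strip the common prefix and suffix, returning the differing middles."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return old[prefix:len(old) - suffix], new[prefix:len(new) - suffix]


def sync_line(cn_line: str, old_en: str, new_en: str):
    """Apply the English change to the Chinese line, or None if ambiguous."""
    old_mid, new_mid = changed_span(old_en, new_en)
    if old_mid and cn_line.count(old_mid) == 1:
        return cn_line.replace(old_mid, new_mid)
    return None  # leave this change for manual review


cn = '| style="color:#3f6e2d; background-color:#161f0c; border-color:#0d1709"| 2025-11-25<br>(赛季 12)'
print(sync_line(cn,
                '| style="..."| 2025-11-25<br>(Season 12)',
                '| style="..."| 2026-01-25<br>(Season 12)'))
```

Returning `None` for anything ambiguous is deliberate: a wrong automatic edit would silently damage the translation, while a skipped one is caught in review.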
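## Notes
1. **Line number consistency**: make sure the Chinese and English documents correspond line for line; long-term maintenance depends on it
2. **Keep translations**: sync only what changed and do not overwrite existing Chinese translations
3. **MediaWiki syntax**:
   - Keep table row separators `|-` in the same positions
   - Leave the link format `[[page name|display text]]` unchanged
   - Leave style attributes `style="..."` unchanged
4. **Special characters**: watch for HTML tags and entities such as `<br>` and `&nbsp;`
## Error handling
- If `has_cn_translation` is false, tell the user that the page has no Chinese translation
- If `is_new_page` is true, the page is new and needs a full translation
- If the corresponding line cannot be found, the Chinese and English versions may be out of sync and need manual confirmation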

View File

@@ -0,0 +1,447 @@
# -*- coding: utf-8 -*-
"""
MediaWiki wiki sync tool - AI agent version
Writes JSON comparison files that an AI agent can read and process.
"""
import os
import argparse
from pathlib import Path
from datetime import datetime, timedelta
import requests
from dotenv import load_dotenv
import difflib
import json
import re

# ==================== Configuration ====================
load_dotenv()
WIKI_API_URL_EN = os.getenv("WIKI_API_URL_EN", "https://wiki.projectdiablo2.com/w/api.php")
WIKI_API_URL_CN = os.getenv("WIKI_API_URL_CN", "https://wiki.projectdiablo2.cn/w/api.php")
OUTPUT_DIR = Path("wiki_sync_output")
OUTPUT_DIR.mkdir(exist_ok=True)
CURRENT_OUTPUT_DIR = None
LAST_TIMESTAMP_FILE = "last_sync_timestamp.txt"
SESSION_EN = requests.Session()
SESSION_EN.headers.update({
    "User-Agent": "WikiSyncTool/5.0 (AI Agent Version)"
})
SESSION_CN = requests.Session()
SESSION_CN.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
})
SESSION_CN.trust_env = False
# ========================================================


def load_last_timestamp():
    if os.path.exists(LAST_TIMESTAMP_FILE):
        with open(LAST_TIMESTAMP_FILE, encoding="utf-8") as f:
            return f.read().strip()
    return None


def save_last_timestamp(ts):
    with open(LAST_TIMESTAMP_FILE, "w", encoding="utf-8") as f:
        f.write(ts)
def get_recent_changes(since):
    """Return the latest revid of every page changed since `since`."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|ids|timestamp",
        "rctype": "edit|new",
        "rcdir": "newer",
        "rcstart": since,
        "rclimit": 500,
        "format": "json"
    }
    latest = {}
    while True:
        try:
            r = SESSION_EN.get(WIKI_API_URL_EN, params=params)
            r.raise_for_status()
            data = r.json()
            if "error" in data:
                raise Exception(data["error"])
            for rc in data.get("query", {}).get("recentchanges", []):
                # Results arrive oldest first, so later entries overwrite earlier ones.
                latest[rc["title"]] = (rc["revid"], rc["timestamp"])
            if "continue" not in data:
                break
            params.update(data["continue"])
        except Exception as e:
            print(f"Error fetching recent changes: {e}")
            break
    return latest


def get_old_revid(title, end_time):
    """Return the last revid at or before the given time."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp",
        "rvlimit": 1,
        "rvdir": "older",
        "rvstart": end_time,
        "format": "json"
    }
    try:
        r = SESSION_EN.get(WIKI_API_URL_EN, params=params).json()
        pages = r["query"]["pages"]
        page = next(iter(pages.values()))
        if "revisions" in page:
            return page["revisions"][0]["revid"]
    except Exception as e:
        print(f"Error fetching old revision ID: {e}")
    return None


def get_page_content(wiki_url, session, title, revid=None):
    """Return (content, timestamp, revid) for a page, or three Nones on failure."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "content|timestamp|ids",
        "rvslots": "main",
        "format": "json"
    }
    if revid:
        # Pin both ends of the range to fetch exactly this revision.
        params["rvstartid"] = revid
        params["rvendid"] = revid
    try:
        r = session.get(wiki_url, params=params).json()
        pages = r["query"]["pages"]
        page = next(iter(pages.values()))
        if "revisions" in page:
            rev = page["revisions"][0]
            return rev["slots"]["main"]["*"], rev["timestamp"], rev["revid"]
    except Exception as e:
        print(f"Error fetching page content: {e}")
    return None, None, None
def generate_text_diff(old_text, new_text):
    """Generate a unified diff."""
    if not old_text:
        # Sentinel string meaning "newly created page"; other functions compare against it.
        return "新创建页面"
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return ''.join(difflib.unified_diff(old_lines, new_lines, lineterm='\n'))


def parse_diff_to_changes(diff_text):
    """
    Parse diff text into structured change records.
    Returns a list whose items hold the change type, line numbers, old content, and new content.
    """
    if not diff_text or diff_text.startswith("新创建页面"):
        return []
    changes = []
    current_old_line = 0
    current_new_line = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith('@@'):
            match = re.match(r'@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@', line)
            if match:
                current_old_line = int(match.group(1))
                current_new_line = int(match.group(3))
                in_hunk = True
        elif not in_hunk:
            # The '---' / '+++' file headers appear only before the first hunk.
            # Checking in_hunk first keeps a removed '----' rule (diff line '-----')
            # from being mistaken for a header, which would shift later line numbers.
            continue
        elif line.startswith('-'):
            changes.append({
                "type": "removed",
                "old_line": current_old_line,
                "new_line": None,
                "old_content": line[1:],
                "new_content": None
            })
            current_old_line += 1
        elif line.startswith('+'):
            changes.append({
                "type": "added",
                "old_line": None,
                "new_line": current_new_line,
                "old_content": None,
                "new_content": line[1:]
            })
            current_new_line += 1
        elif line.startswith(' '):
            current_old_line += 1
            current_new_line += 1
    return changes


def calculate_similarity(text1, text2):
    """
    Compute the similarity of two texts (0-1)
    using difflib.SequenceMatcher.
    """
    if not text1 or not text2:
        return 0.0
    # Compare after stripping leading and trailing whitespace
    t1 = text1.strip()
    t2 = text2.strip()
    return difflib.SequenceMatcher(None, t1, t2).ratio()
def group_changes_by_line(changes, similarity_threshold=0.5):
    """
    Group changes by line number, merging a deletion and an addition into a replacement.
    Improvement: use content similarity to decide whether a removed and an added line pair up.
    - If their similarity is >= threshold, they are paired as replaced
    - Otherwise they stay marked as removed and added
    """
    # Collect all removals and additions first
    removed_by_line = {}  # old_line -> content
    added_by_line = {}    # new_line -> content
    for c in changes:
        if c["type"] == "removed":
            removed_by_line[c["old_line"]] = c["old_content"]
        elif c["type"] == "added":
            added_by_line[c["new_line"]] = c["new_content"]
    # Greedy pairing based on content similarity
    grouped = []
    used_added = set()
    used_removed = set()
    # Step 1: find every candidate pair above the threshold
    pairings = []
    for old_line, old_content in removed_by_line.items():
        for new_line, new_content in added_by_line.items():
            if new_line not in used_added:
                similarity = calculate_similarity(old_content, new_content)
                if similarity >= similarity_threshold:
                    pairings.append((similarity, old_line, new_line, old_content, new_content))
    # Sort by similarity, descending, so the most similar pairs are handled first
    pairings.sort(key=lambda x: x[0], reverse=True)
    # Step 2: greedy pairing
    for similarity, old_line, new_line, old_content, new_content in pairings:
        if old_line not in used_removed and new_line not in used_added:
            grouped.append({
                "type": "replaced",
                "old_line": old_line,
                "new_line": new_line,
                "old_content": old_content,
                "new_content": new_content,
                "_similarity": round(similarity, 2)  # for debugging, optional
            })
            used_removed.add(old_line)
            used_added.add(new_line)
    # Step 3: handle unpaired removed lines
    for old_line, old_content in sorted(removed_by_line.items()):
        if old_line not in used_removed:
            grouped.append({
                "type": "removed",
                "old_line": old_line,
                "new_line": None,
                "old_content": old_content,
                "new_content": None
            })
    # Step 4: handle unpaired added lines
    for new_line, new_content in sorted(added_by_line.items()):
        if new_line not in used_added:
            grouped.append({
                "type": "added",
                "old_line": None,
                "new_line": new_line,
                "old_content": None,
                "new_content": new_content
            })
    # Sort by line number
    grouped.sort(key=lambda x: x["old_line"] or x["new_line"] or 0)
    return grouped


def create_diff_json(title, en_old_content, en_new_content, cn_content):
    """
    Build the structured JSON comparison data (English changes only; the AI
    matches them to the Chinese text itself).
    """
    # Generate the English diff
    diff_text = generate_text_diff(en_old_content, en_new_content)

    # Parse the changes
    raw_changes = parse_diff_to_changes(diff_text)
    grouped_changes = group_changes_by_line(raw_changes)

    # Build the output structure (compact: no Chinese line content)
    result = {
        "title": title,
        "timestamp": datetime.now().isoformat(),
        # "新创建页面" ("newly created page") is the sentinel string this check
        # expects from generate_text_diff for new pages; keep it unchanged.
        "is_new_page": diff_text == "新创建页面",
        "has_cn_translation": cn_content is not None,
        "summary": {
            "total_changes": len(grouped_changes),
            "replaced": len([c for c in grouped_changes if c["type"] == "replaced"]),
            "added": len([c for c in grouped_changes if c["type"] == "added"]),
            "removed": len([c for c in grouped_changes if c["type"] == "removed"])
        },
        "changes": grouped_changes
    }
    return result


def save_files(title, diff_json, en_full_text, cn_content, timestamp, revid=None, old_full_text=None):
    """Save the output files."""
    global CURRENT_OUTPUT_DIR
    if CURRENT_OUTPUT_DIR is None:
        current_time_str = datetime.now().strftime("%Y%m%d_%H%M%S")
        CURRENT_OUTPUT_DIR = OUTPUT_DIR / current_time_str
        # parents=True also creates OUTPUT_DIR itself if it does not exist yet
        CURRENT_OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
        (CURRENT_OUTPUT_DIR / "new_pages").mkdir(exist_ok=True)
        (CURRENT_OUTPUT_DIR / "changed_pages").mkdir(exist_ok=True)
        print(f"Created output directory: {CURRENT_OUTPUT_DIR}")

    # Replace characters that are unsafe in file names with "_"
    safe_title = "".join(c if c.isalnum() or c in " -_." else "_" for c in title)
    time_str = timestamp[:19].replace("-", "").replace(":", "").replace("T", "_")
    base_filename = f"{safe_title}-{time_str}-{revid}" if revid else f"{safe_title}-{time_str}"

    is_new_page = diff_json["is_new_page"]
    if is_new_page:
        target_dir = CURRENT_OUTPUT_DIR / "new_pages"
        print(" New page detected")
        # Save the full English content
        full_file = target_dir / f"{base_filename}.full.txt"
        with open(full_file, "w", encoding="utf-8") as f:
            f.write(en_full_text)
        print(f" → Saved: {full_file.relative_to(OUTPUT_DIR)}")
    else:
        target_dir = CURRENT_OUTPUT_DIR / "changed_pages"
        # Save the full English content
        full_file = target_dir / f"{base_filename}.full.txt"
        with open(full_file, "w", encoding="utf-8") as f:
            f.write(en_full_text)
        print(f" → Saved: {full_file.relative_to(OUTPUT_DIR)}")

        # Save the Chinese content
        if cn_content:
            cn_file = target_dir / f"{base_filename}.cn.txt"
            with open(cn_file, "w", encoding="utf-8") as f:
                f.write(cn_content)
            print(f" → Saved: {cn_file.relative_to(OUTPUT_DIR)}")

        # Save the JSON comparison file (the core output)
        json_file = target_dir / f"{base_filename}.comparison.json"
        with open(json_file, "w", encoding="utf-8") as f:
            json.dump(diff_json, f, ensure_ascii=False, indent=2)
        print(f" → Saved: {json_file.relative_to(OUTPUT_DIR)} (comparison file for the AI Agent)")

        # Save the old revision
        if old_full_text:
            old_file = target_dir / f"{base_filename}.old.txt"
            with open(old_file, "w", encoding="utf-8") as f:
                f.write(old_full_text)
            print(f" → Saved: {old_file.relative_to(OUTPUT_DIR)}")


def process_single_page(title, since_time, update_timestamp=False):
    """Process a single page."""
    print(f"Processing page: {title}")

    # Fetch the latest content
    latest_content, latest_ts, latest_revid = get_page_content(WIKI_API_URL_EN, SESSION_EN, title)
    if latest_content is None:
        print("Page does not exist or was deleted")
        return None

    # Fetch the old revision
    old_revid = get_old_revid(title, since_time)
    old_content = None
    if old_revid:
        old_content, _, _ = get_page_content(WIKI_API_URL_EN, SESSION_EN, title, old_revid)
        if old_content is None:
            print(" Could not fetch the old revision; treating this as a new page")

    # Fetch the Chinese translation
    print(" Looking for the Chinese translation...")
    cn_content = None
    # Try the page with the same title directly
    cn_result, _, _ = get_page_content(WIKI_API_URL_CN, SESSION_CN, title)
    if cn_result:
        cn_content = cn_result
        print(f" Found the Chinese page ({len(cn_content)} characters)")
    else:
        print(" No Chinese translation found")

    # Build the comparison JSON
    diff_json = create_diff_json(title, old_content, latest_content, cn_content)
    print(f" Change summary: replaced={diff_json['summary']['replaced']}, "
          f"added={diff_json['summary']['added']}, removed={diff_json['summary']['removed']}")

    # Save the files
    save_files(title, diff_json, latest_content, cn_content, latest_ts, latest_revid, old_content)

    if update_timestamp:
        save_last_timestamp(latest_ts)
        print(f"Timestamp updated → {latest_ts}")
    return latest_ts


def process_all_pages_since(since_time):
    """Process every changed page."""
    print("Fetching the list of recent changes...")
    changes = get_recent_changes(since_time)
    if not changes:
        print("No changes found")
        return

    latest_global_ts = since_time
    for title, (revid, ts) in changes.items():
        print(f"\nProcessing: {title}")
        page_ts = process_single_page(title, since_time)
        # ISO 8601 timestamps in the same format compare correctly as strings
        if page_ts and page_ts > latest_global_ts:
            latest_global_ts = page_ts

    save_last_timestamp(latest_global_ts)
    print(f"\nSync complete! Latest timestamp: {latest_global_ts}")
    print(f"Files saved in: {CURRENT_OUTPUT_DIR.resolve() if CURRENT_OUTPUT_DIR else OUTPUT_DIR.resolve()}")


def main():
    parser = argparse.ArgumentParser(description="MediaWiki sync tool - AI Agent version")
    parser.add_argument("--since", type=str, help="Start time, format: 2025-11-28T00:00:00Z")
    parser.add_argument("--title", type=str, help="Process only the given page")
    parser.add_argument("--update-timestamp", action="store_true", help="Update the global timestamp")
    parser.add_argument("--run", action="store_true", help="Run the sync")
    args = parser.parse_args()

    if not args.run:
        parser.print_help()
        return

    since_time = args.since or load_last_timestamp()
    if not since_time:
        # Default to the past 24 hours
        since_time = (datetime.utcnow() - timedelta(days=1)).isoformat(timespec='seconds') + "Z"
    print(f"Start time: {since_time}")

    if args.title:
        process_single_page(args.title.strip(), since_time, args.update_timestamp)
    else:
        process_all_pages_since(since_time)


if __name__ == "__main__":
    main()

55
CLAUDE.md Normal file

@ -0,0 +1,55 @@
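# PD2 Wiki Sync Tool - Claude Code Configuration
## Project overview
A Chinese-English sync tool for the Project Diablo 2 Wiki.
## Requirements
### Python virtual environment
You **must** activate the virtual environment before running any Python script:
```bash
source venv/bin/activate
```
### Dependencies
```bash
pip install -r requirements.txt
```
## Usage
This tool is used as a Claude Code Skill; it is not meant to be invoked directly from the command line.
### Basic usage
After starting Claude Code, use the `/wiki-sync-translate` command:
```
/wiki-sync-translate <describe what you want to sync>
```
### Examples
| Scenario | Example command |
|------|---------|
| Sync all changes | `/wiki-sync-translate sync all changes since 2026-01-02` |
| Sync a single page | `/wiki-sync-translate sync the Maps page` |
| Sync a page from a given start date | `/wiki-sync-translate sync the General Changes page starting from 2026-01-01` |
| Sync recent changes | `/wiki-sync-translate sync all changes from the past week` |
## Skill location
`.claude/skills/wiki-sync-translate/`
## Output directory
`wiki_sync_output/<timestamp>/`
## Notes
- Line numbers in the Chinese Wiki must correspond exactly to those in the English Wiki
- Change pairing uses a content similarity algorithm (threshold 0.5)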
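The line-number requirement above can be sanity-checked before translating. The following is a minimal sketch, not part of the repository: the helper name `check_line_alignment` is hypothetical, and the "blank on one side only" test is an assumed heuristic for spotting where two pages have drifted apart.

```python
def check_line_alignment(en_text: str, cn_text: str) -> dict:
    """Compare line counts and find the first line where only one side is blank.

    Hypothetical helper: both arguments are full wikitext strings, such as the
    contents of a .full.txt and a .cn.txt file.
    """
    en_lines = en_text.splitlines()
    cn_lines = cn_text.splitlines()
    first_mismatch = None
    for number, (en_line, cn_line) in enumerate(zip(en_lines, cn_lines), start=1):
        # Assumed heuristic: a line that is blank on one side only suggests
        # the two pages are no longer aligned from this point on.
        if bool(en_line.strip()) != bool(cn_line.strip()):
            first_mismatch = number
            break
    return {
        "en_lines": len(en_lines),
        "cn_lines": len(cn_lines),
        "same_length": len(en_lines) == len(cn_lines),
        "first_blank_mismatch": first_mismatch,
    }


report = check_line_alignment("== Maps ==\n\nTier 1", "== 地图 ==\n\n一阶")
print(report)
```

A matching line count does not prove that the content is aligned; it only rules out the most obvious drift.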

214
README.md

@ -1,181 +1,119 @@
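# Wiki Sync Tool - Enhanced Version
# PD2 Wiki Sync Tool
A Python tool for syncing and tracking changes on MediaWiki sites, with bilingual comparison and precise line-number positioning
A Chinese-English sync tool for the Project Diablo 2 Wiki, used to carry changes from the English Wiki over to the Chinese translation
## Features
## Prerequisites
- 🔄 Automatically syncs the latest changes from a MediaWiki site
- 📝 Generates syntax-highlighted HTML diff files that clearly show what changed
- 💾 Saves full page content for offline reading
- ⏰ Supports incremental sync, fetching only changes made since the last sync
- 🔍 Supports syncing from a point in time or for a specific page
- 📁 Automatically organizes output files into timestamped directories
- 🌐 **New**: automatically syncs the Chinese translation
- 🎯 **New**: precise line-number mapping; clicking an English line jumps to the corresponding Chinese line
- 📊 **New**: generates a polished bilingual comparison page
- 🎨 **New**: modern UI design with synchronized scrolling and highlighting
### 1. Install Claude Code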
## Installation
1. Make sure Python 3.6+ is installed
2. Clone this repository:
```bash
git clone <repository-url>
cd wiki-sync-tool
```
3. Install the dependencies:
```bash
pip install requests python-dotenv
```
## Configuration
Create a `.env` file and configure your MediaWiki API URLs:
```env
# Project Diablo 2 Wiki API URL (English)
WIKI_API_URL_EN=https://wiki.projectdiablo2.com/w/api.php
# Project Diablo 2 Wiki API URL (Chinese)
WIKI_API_URL_CN=https://wiki.projectdiablo2.cn/w/api.php
```
Or copy the provided example configuration file:
This tool is used through [Claude Code](https://github.com/anthropics/claude-code).
```bash
cp .env.example .env
# Install Claude Code
npm install -g @anthropic-ai/claude-code
```
### 2. Clone the repository
```bash
git clone <repository-url>
cd sync-pd2-wiki
```
### 3. Create a virtual environment and install the dependencies
```bash
python -m venv venv
source venv/bin/activate # Linux/macOS
# or
.\venv\Scripts\activate # Windows
pip install -r requirements.txt
```
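## Usage
### Basic full sync
### Start Claude Code
Sync all changes since the last run:
Run the following in the project directory:
```bash
python sync.py --run
claude
```
On the first run, changes from the past 24 hours are synced.
### Sync the Wiki with the Skill
### Sync from a given start time
In Claude Code, use the `/wiki-sync-translate` command:
Sync starting from a given time:
#### Example 1: sync all page changes since a given time
```bash
python sync.py --since 2025-11-28T00:00:00Z --run
```
/wiki-sync-translate please sync all changes since 2026-01-02
```
### Sync a specific page
#### Example 2: sync a specific page
Sync only the latest changes to a specific page:
```bash
python sync.py --title "Amazon Basin" --run
```
/wiki-sync-translate sync the changes to the Maps page
```
### Sync a specific page and update the timestamp
#### Example 3: sync a single page from a given start date
Sync a specific page and update the global timestamp when done:
```bash
python sync.py --title "Amazon Basin" --update-timestamp --run
```
/wiki-sync-translate sync the General Changes page starting from 2026-01-01
```
### Show help
#### Example 4: sync recent changes
```bash
python sync.py --help
```
/wiki-sync-translate sync all changes from the past week
```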
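## Output files
Each run creates a timestamp-named subdirectory under `wiki_sync_output` containing the generated files:
- `PageTitle-timestamp-revid.diff.html` - MediaWiki's native HTML diff file
- `PageTitle-timestamp-revid.diff.txt` - text-format diff (similar to git diff)
- `PageTitle-timestamp-revid.full.txt` - latest full content of the page
- `PageTitle-timestamp-revid.old.txt` - old revision of the page (if it changed)
- `PageTitle-timestamp-revid.cn.txt` - Chinese translation (if found)
- `PageTitle-timestamp-revid.comparison.html` - **bilingual comparison page** (if a Chinese translation was found)
### Bilingual comparison page features
The generated bilingual comparison page offers the following advanced features:
1. **Precise line-number mapping**
   - Every line in the English diff is labeled with the corresponding Chinese line number
   - Clicking any English line highlights and scrolls to the corresponding Chinese line
2. **Interactive experience**
   - Hovering previews the corresponding Chinese line
   - Clicking highlights the correspondence
   - Smooth scrolling animation
3. **Visual design**
   - Modern UI design
   - Standard diff colors (green for additions, red for deletions, gray for unchanged lines)
   - Responsive layout that also works on mobile
4. **Synchronized scrolling**
   - The scroll positions of the left and right columns stay in sync
   - Makes it easy to compare content at the same position
### Diff file example
Example of the text diff format:
After a sync completes, the files are saved in the `wiki_sync_output/<timestamp>/` directory:
```
--- old_version
+++ new_version
@@ -10,7 +10,7 @@
 This is line 10
-This line will be removed
+This line will be added
 This is line 12
wiki_sync_output/20260322_145110/
├── new_pages/ # newly created pages
│ └── [page name].full.txt # full English content
├── changed_pages/ # pages with changes
│ ├── [page name].full.txt # latest English version
│ ├── [page name].cn.txt # original Chinese text
│ ├── [page name].comparison.json # structured change information
│ └── [page name].old.txt # old English revision
└── result_pages/ # updated Chinese documents
└── [page name].cn.txt # can be copied straight into the Wiki
```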
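A `.comparison.json` file can be inspected with a few lines of Python. This is a minimal sketch: the field names (`changes`, `type`, `old_line`, `new_line`) follow `create_diff_json` in `wiki_sync.py` as shown in this diff, while the helpers `load_comparison` and `describe_changes` and the sample data are hypothetical.

```python
import json
from pathlib import Path


def load_comparison(path):
    """Load one *.comparison.json file into a dict (hypothetical helper)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def describe_changes(comparison):
    """Return one short, human-readable line per change entry."""
    lines = []
    for change in comparison["changes"]:
        kind = change["type"]
        if kind == "replaced":
            lines.append(f"replaced: line {change['old_line']} -> {change['new_line']}")
        elif kind == "added":
            lines.append(f"added: line {change['new_line']}")
        elif kind == "removed":
            lines.append(f"removed: line {change['old_line']}")
    return lines


# Illustrative data in the same shape as the tool's output.
sample = {
    "title": "Maps",
    "summary": {"total_changes": 2, "replaced": 1, "added": 1, "removed": 0},
    "changes": [
        {"type": "replaced", "old_line": 12, "new_line": 12,
         "old_content": "Tier 1", "new_content": "Tier 2"},
        {"type": "added", "old_line": None, "new_line": 30,
         "old_content": None, "new_content": "New row"},
    ],
}
for line in describe_changes(sample):
    print(line)
```

For a real file, the output of `load_comparison(path)` can be passed to `describe_changes` in place of `sample`.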
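HTML diff features:
- A green background marks added content
- A red background marks deleted content
- A colored vertical bar on the left indicates the change type
- +/- markers clearly show where changes occur
- Deleted content is shown with strikethrough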
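## Change pairing algorithm
## Technical details
The tool uses a **content similarity algorithm** to decide whether changes should be paired:
### Line-number parsing
- Similarity threshold: 0.5 (50%)
- Only lines whose content similarity is ≥ 50% are paired as `replaced`
- This avoids wrongly pairing lines with completely different content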
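The pairing rule can be illustrated with a short, self-contained sketch. It mirrors the greedy approach of `group_changes_by_line` in `wiki_sync.py` but is a simplified illustration rather than the tool's actual code; the helper names `similarity` and `pair_lines` and the sample strings are assumptions.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity ratio between two whitespace-stripped strings."""
    if not a or not b:
        return 0.0
    return difflib.SequenceMatcher(None, a.strip(), b.strip()).ratio()


def pair_lines(removed: dict, added: dict, threshold: float = 0.5) -> list:
    """Greedily pair removed and added lines whose similarity meets the threshold.

    `removed` maps old line numbers to text; `added` maps new line numbers to
    text. Returns a sorted list of (old_line, new_line) pairs.
    """
    candidates = []
    for old_no, old_text in removed.items():
        for new_no, new_text in added.items():
            score = similarity(old_text, new_text)
            if score >= threshold:
                candidates.append((score, old_no, new_no))
    # Most similar candidates first, so they claim their lines before weaker ones.
    candidates.sort(key=lambda c: c[0], reverse=True)

    pairs, used_old, used_new = [], set(), set()
    for _, old_no, new_no in candidates:
        if old_no not in used_old and new_no not in used_new:
            pairs.append((old_no, new_no))
            used_old.add(old_no)
            used_new.add(new_no)
    return sorted(pairs)


removed = {10: "Damage increased by 10%", 11: "QQQQQQ"}
added = {10: "Damage increased by 15%", 14: "zzzzzzzz"}
print(pair_lines(removed, added))
```

Lines left unpaired are then reported separately as `removed` or `added`. Greedy selection is simple but not guaranteed to give the globally best assignment when several lines resemble each other.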
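The tool uses a custom diff parser that precisely extracts:
- Line-range information from hunk headers
- The old-version and new-version line number for each changed line
- The exact position of added, removed, modified, and context lines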
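Line-number extraction from a unified diff can be sketched as follows. This is an illustrative approximation, not the tool's own parser: the names `parse_hunk_header` and `number_diff_lines` are hypothetical, and the sketch assumes standard `@@ -a,b +c,d @@` hunk headers.

```python
import re

# Matches hunk headers such as "@@ -10,7 +10,7 @@"; the counts are optional.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line: str):
    """Return (old_start, old_count, new_start, new_count), or None if no match."""
    m = HUNK_RE.match(line)
    if m is None:
        return None
    old_start, old_count, new_start, new_count = m.groups()
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


def number_diff_lines(diff_lines):
    """Yield (kind, old_line, new_line, text) for each body line of a unified diff."""
    old_no = new_no = None
    for line in diff_lines:
        header = parse_hunk_header(line)
        if header is not None:
            old_no, _, new_no, _ = header
        elif old_no is None:
            continue  # still in the ---/+++ file header, before the first hunk
        elif line.startswith("-"):
            yield ("removed", old_no, None, line[1:])
            old_no += 1
        elif line.startswith("+"):
            yield ("added", None, new_no, line[1:])
            new_no += 1
        elif line.startswith(" "):
            yield ("context", old_no, new_no, line[1:])
            old_no += 1
            new_no += 1


diff = [
    "--- old_version",
    "+++ new_version",
    "@@ -10,7 +10,7 @@",
    " This is line 10",
    "-This line will be removed",
    "+This line will be added",
    " This is line 12",
]
for row in number_diff_lines(diff):
    print(row)
```

Context lines advance both counters, while removed and added lines advance only the old or new counter respectively.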
### Chinese page lookup strategy
1. First try an exact match on the page title
2. If that fails, fall back to a fuzzy search
3. Handles spaces and special characters in titles
### Directory organization
## Directory structure
```
wiki_sync_output/
├── 20251211_152702/
│ ├── Amazon_Basin-20251211_152645-12345.diff.html
│ ├── Amazon_Basin-20251211_152645-12345.diff.txt
│ ├── Amazon_Basin-20251211_152645-12345.full.txt
│ ├── Amazon_Basin-20251211_152645-12345.old.txt
│ ├── Amazon_Basin-20251211_152645-12345.cn.txt
│ └── Amazon_Basin-20251211_152645-12345.comparison.html
└── 20251211_153127/
└── ...
sync-pd2-wiki/
├── .claude/
│ └── skills/
│ └── wiki-sync-translate/
│ ├── SKILL.md # Skill definition
│ └── scripts/
│ └── wiki_sync.py # sync script
├── wiki_sync_output/ # sync output
├── references/ # reference files
├── requirements.txt # Python dependencies
├── CLAUDE.md # Claude Code configuration
└── README.md # this file
```
## License
## License
MIT License
## Contributing
Issues and Pull Requests are welcome.
MIT

2
requirements.txt Normal file

@ -0,0 +1,2 @@
requests>=2.28.0
python-dotenv>=1.0.0

1035
sync.py

File diff suppressed because it is too large


@ -1,8 +0,0 @@
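According to the README, sync.py fetches changes from wiki.projectdiablo2.com and downloads the full source text of each page. The following features now need to be added:
1. Fetch the latest full page from the English wiki (already implemented), and fetch the full text of its previous revision (pulled using the old_revid from the previous step).
2. If the page is newly added, keep the existing logic and save only the latest full file.
3. If the wiki page was changed, diff the full file of the old revision against the latest file to produce a diff file. Use Python or a library that mimics git diff for this.
4. For that page title, search the other site, wiki.projectdiablo2.cn, and download its source text; this is the synced, translated Chinese site. Note that page IDs differ between the two sites, but page titles are kept identical, and the vast majority of pages have been translated.
5. Save a web page that presents the diff, with a polished, attractive design using modern CSS/JS. Split the page vertically into two columns: the left shows the diff between the two versions of the English source, and the right shows the Chinese source at the same line numbers. Note that line numbers stay aligned: for the vast majority of pages the Chinese line numbers match exactly, so they can be compared with confidence. The diff display should also use the standard red and green colors.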