miku-docx2md

Real Document Validation: v0.8.2

Validation Summary

Additional generated image validation candidates were attempted but excluded from this pass because Microsoft Word could not open them.

Most .docx files used for this pass are local validation files under workplace/ and are not committed to the repository. The earlier mixed BasicSample01.docx fixture was replaced by focused Word-authored fixtures under tests/fixtures/docx/word-*.docx.

Document Matrix

| ID | Document type | Key features | CLI result | Notes |
| --- | --- | --- | --- | --- |
| doc-001 | validation memo | paragraphs, inline formatting, external link, nested list, table | issue | Conversion completed, but heading style was not preserved after Word save; inline formatting around proofing markers produced noisy Markdown. |
| doc-002 | validation structure sample | section-like labels, numbered list, grid-span table | pass with caveat | Conversion completed; table grid-span placeholder was preserved. Section-like labels were plain paragraphs after Word save. |
| doc-003 | Word-authored focused fixtures | Heading 1-5, paragraph text, embedded JPEG image | pass | Headings are covered by word-headings-basic.docx. Word inline images are covered by focused image fixtures, and the converter reports images: 1 with exported image assets for them. |

CLI Commands

npm run cli -- workplace/validation-docx/01-basic-structure.docx --out workplace/validation-docx/out/01-basic-structure-word.md --summary-out workplace/validation-docx/out/01-basic-structure-word.summary.txt --debug
npm run cli -- workplace/validation-docx/02-lists-and-tables.docx --out workplace/validation-docx/out/02-lists-and-tables-word.md --summary-out workplace/validation-docx/out/02-lists-and-tables-word.summary.txt --debug
npm run cli -- tests/fixtures/docx/word-headings-basic.docx --out workplace/validation-docx/out/word-headings-basic.md --summary-out workplace/validation-docx/out/word-headings-basic.summary.txt --debug
npm run cli -- tests/fixtures/docx/word-inline-image-basic.docx --out workplace/validation-docx/out/word-inline-image-basic.md --summary-out workplace/validation-docx/out/word-inline-image-basic.summary.txt --assets-dir workplace/validation-docx/out/word-inline-image-basic.assets --debug

All four commands completed successfully.
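
The commands above share one shape: input path, `--out`, `--summary-out`, an optional `--assets-dir`, and `--debug`. As a minimal sketch, the argument list could be assembled as follows. `ValidationRun` and `buildCliArgs` are hypothetical names used only for illustration and are not part of miku-docx2md; only the flags come from the commands listed here.

```typescript
// Hypothetical helper (not part of miku-docx2md): assembles the npm
// argument list used by the validation commands in this document.
interface ValidationRun {
  input: string;        // path to the .docx file
  outDir: string;       // directory that receives the generated outputs
  withAssets?: boolean; // add --assets-dir for documents containing images
}

function buildCliArgs(run: ValidationRun): string[] {
  // Derive the base name, e.g. "word-headings-basic", from the input path.
  const fileName = run.input.split("/").pop() ?? run.input;
  const base = fileName.replace(/\.docx$/i, "");
  const cliArgs = [
    "run", "cli", "--", run.input,
    "--out", `${run.outDir}/${base}.md`,
    "--summary-out", `${run.outDir}/${base}.summary.txt`,
  ];
  if (run.withAssets) {
    cliArgs.push("--assets-dir", `${run.outDir}/${base}.assets`);
  }
  cliArgs.push("--debug");
  return cliArgs;
}

// Reproduces the inline image command from the list above.
const exampleCliArgs = buildCliArgs({
  input: "tests/fixtures/docx/word-inline-image-basic.docx",
  outDir: "workplace/validation-docx/out",
  withAssets: true,
});
console.log(["npm", ...exampleCliArgs].join(" "));
```

Keeping the output names derived from the input base name means the Markdown, summary, and assets for one document always sit side by side in the output directory.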

Observations

doc-001

Expected structure from the generated validation source:

Observed after Word save and CLI conversion:

**Bold ****text**and*italic** text*.

This should be treated as a focused follow-up candidate: inline formatting should remain readable when Word inserts w:proofErr elements between runs.
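
One possible approach, sketched under stated assumptions: skip proofing markers, merge adjacent runs that share the same formatting, and only then emit Markdown markers. The `InlineNode` model, `mergeRuns`, `renderRun`, and `renderInline` are hypothetical names and do not describe the converter's actual internals. The sample input is invented for illustration, since the original paragraph content is not recorded in this document.

```typescript
// Hypothetical inline model: a text run with formatting flags, or a
// marker standing in for a <w:proofErr/> element.
type InlineNode =
  | { kind: "run"; text: string; bold: boolean; italic: boolean }
  | { kind: "proofErr" };

type TextRun = Extract<InlineNode, { kind: "run" }>;

// Drop proofing markers and merge neighbours with identical formatting.
function mergeRuns(nodes: InlineNode[]): TextRun[] {
  const merged: TextRun[] = [];
  for (const node of nodes) {
    if (node.kind !== "run") continue; // proofing markers carry no text
    const last = merged[merged.length - 1];
    if (last && last.bold === node.bold && last.italic === node.italic) {
      last.text += node.text;
    } else {
      merged.push({ ...node });
    }
  }
  return merged;
}

// Wrap a run in emphasis markers, keeping edge whitespace outside them.
function renderRun(run: TextRun): string {
  const marker = (run.bold ? "**" : "") + (run.italic ? "*" : "");
  const core = run.text.trim();
  if (marker === "" || core === "") return run.text;
  const start = run.text.indexOf(core);
  const lead = run.text.slice(0, start);
  const trail = run.text.slice(start + core.length);
  return `${lead}${marker}${core}${marker}${trail}`;
}

function renderInline(nodes: InlineNode[]): string {
  return mergeRuns(nodes).map(renderRun).join("");
}

// Invented sample: formatting split into several runs by proofing markers.
const sampleNodes: InlineNode[] = [
  { kind: "run", text: "Bold ", bold: true, italic: false },
  { kind: "proofErr" },
  { kind: "run", text: "text", bold: true, italic: false },
  { kind: "run", text: " and ", bold: false, italic: false },
  { kind: "run", text: "italic", bold: false, italic: true },
  { kind: "proofErr" },
  { kind: "run", text: " text", bold: false, italic: true },
  { kind: "run", text: ".", bold: false, italic: false },
];
console.log(renderInline(sampleNodes)); // **Bold text** and *italic text*.
```

Merging before rendering avoids emitting a closing marker and a new opening marker at every run boundary, which is the pattern visible in the noisy output above.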

doc-002

Expected structure from the generated validation source:

Observed after Word save and CLI conversion:

doc-003

Expected structure:

Observed after CLI conversion:

This covers Word-authored inline drawing image extraction.

The behavior is now split across focused regression fixtures:

The current regression tests cover heading behavior, inline image extraction, and image alt text behavior.
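
For context, a Word inline image sits in a `wp:inline` element, with the alt text in the `descr` attribute of `wp:docPr` and the image relationship ID in the `r:embed` attribute of `a:blip`. The sketch below shows where those two values come from. It is an illustration only: `findInlineImages` and `InlineImageRef` are hypothetical names, and the regex scan is a simplification that stands in for proper XML parsing rather than a description of how miku-docx2md reads the document.

```typescript
// Hypothetical result shape for one inline image reference.
interface InlineImageRef {
  relId: string;   // value of r:embed on <a:blip>
  altText: string; // value of descr on <wp:docPr>, "" when absent
}

// Simplified scan of document.xml text. Attribute values are returned
// as written, so XML entities such as &amp; are not decoded here.
function findInlineImages(documentXml: string): InlineImageRef[] {
  const refs: InlineImageRef[] = [];
  const inlineBlock = /<wp:inline\b[\s\S]*?<\/wp:inline>/g;
  for (const block of documentXml.match(inlineBlock) ?? []) {
    const embed = /<a:blip\b[^>]*\br:embed="([^"]*)"/.exec(block);
    if (!embed) continue; // no embedded image part in this block
    const descr = /<wp:docPr\b[^>]*\bdescr="([^"]*)"/.exec(block);
    refs.push({
      relId: embed[1] ?? "",
      altText: descr ? descr[1] ?? "" : "",
    });
  }
  return refs;
}

// Invented, heavily trimmed sample of an inline drawing.
const sampleDrawingXml =
  '<w:drawing><wp:inline distT="0" distB="0">' +
  '<wp:docPr id="1" name="Picture 1" descr="Sample alt text"/>' +
  '<a:graphic><a:graphicData><pic:pic><pic:blipFill>' +
  '<a:blip r:embed="rId4"/>' +
  '</pic:blipFill></pic:pic></a:graphicData></a:graphic>' +
  '</wp:inline></w:drawing>';

// Yields one entry: relId "rId4", altText "Sample alt text".
console.log(findInlineImages(sampleDrawingXml));
```

The relationship ID is then resolved against the document's relationships part to locate the media file to export.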

Findings

Bugs

Invalid Validation Candidates

These files are local workplace/ candidates only and should not be treated as Word-round-tripped validation documents.

Missing Fixtures

Known Limitations / Data Notes

Follow-Up Actions

v0.8.3 Recheck Notes

Date: 2026-05-08

Commands rerun against local validation documents:

npm run cli -- workplace/validation-docx/01-basic-structure.docx --out workplace/validation-docx/out-current/01-basic-structure.md --summary-out workplace/validation-docx/out-current/01-basic-structure.summary.txt --debug
npm run cli -- workplace/validation-docx/02-lists-and-tables.docx --out workplace/validation-docx/out-current/02-lists-and-tables.md --summary-out workplace/validation-docx/out-current/02-lists-and-tables.summary.txt --debug
npm run cli -- workplace/validation-docx/05-word-openable-image.docx --out workplace/validation-docx/out-current/05-word-openable-image.md --summary-out workplace/validation-docx/out-current/05-word-openable-image.summary.txt --assets-dir workplace/validation-docx/out-current/05-word-openable-image.assets --debug
npm run cli -- tests/fixtures/docx/word-image-alt-text-basic.docx --out workplace/validation-docx/out-current/word-image-alt-text-basic.md --summary-out workplace/validation-docx/out-current/word-image-alt-text-basic.summary.txt --assets-dir workplace/validation-docx/out-current/word-image-alt-text-basic.assets --debug

Recheck observations:

Browser automation note: