workplace/validation-docx/; committed focused fixtures under tests/fixtures/docx/word-*.docx
Additional generated image validation candidates were attempted but excluded from this pass because Microsoft Word could not open them.
Most .docx files used for this pass are local validation files under workplace/ and are not committed to the repository. The earlier mixed BasicSample01.docx fixture was replaced by focused Word-authored fixtures under tests/fixtures/docx/word-*.docx.
| ID | Document type | Key features | CLI result | Notes |
|---|---|---|---|---|
| doc-001 | validation memo | paragraphs, inline formatting, external link, nested list, table | issue | Conversion completed, but heading style was not preserved after Word save; inline formatting around proofing markers produced noisy Markdown. |
| doc-002 | validation structure sample | section-like labels, numbered list, grid-span table | pass with caveat | Conversion completed; table grid-span placeholder was preserved. Section-like labels were plain paragraphs after Word save. |
| doc-003 | Word-authored focused fixtures | Heading 1-5, paragraph text, embedded JPEG image | pass | Headings are covered by word-headings-basic.docx. Word inline images are covered by focused image fixtures, and the converter reports images: 1 with exported image assets for them. |
npm run cli -- workplace/validation-docx/01-basic-structure.docx --out workplace/validation-docx/out/01-basic-structure-word.md --summary-out workplace/validation-docx/out/01-basic-structure-word.summary.txt --debug
npm run cli -- workplace/validation-docx/02-lists-and-tables.docx --out workplace/validation-docx/out/02-lists-and-tables-word.md --summary-out workplace/validation-docx/out/02-lists-and-tables-word.summary.txt --debug
npm run cli -- tests/fixtures/docx/word-headings-basic.docx --out workplace/validation-docx/out/word-headings-basic.md --summary-out workplace/validation-docx/out/word-headings-basic.summary.txt --debug
npm run cli -- tests/fixtures/docx/word-inline-image-basic.docx --out workplace/validation-docx/out/word-inline-image-basic.md --summary-out workplace/validation-docx/out/word-inline-image-basic.summary.txt --assets-dir workplace/validation-docx/out/word-inline-image-basic.assets --debug
All four commands completed successfully.
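The counts quoted in this report (headings, images, imageAssets) come from the summary files written by --summary-out. A minimal sketch of checking them automatically follows. It assumes the summary contains plain "key: number" lines; the actual summary format may differ, and both function names are hypothetical.

```typescript
// Assumption: summary files contain lines like "headings: 5" or "images: 1".
// Lines that do not match "key: <integer>" are ignored.
function parseSummaryCounts(summaryText: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of summaryText.split(/\r?\n/)) {
    const match = /^\s*([A-Za-z][A-Za-z0-9]*)\s*:\s*(\d+)\s*$/.exec(line);
    const key = match?.[1];
    const value = match?.[2];
    if (key !== undefined && value !== undefined) {
      counts[key] = Number(value);
    }
  }
  return counts;
}

// Hypothetical helper: returns the keys whose parsed count differs from the
// expected count (a missing key also counts as a mismatch).
function findCountMismatches(
  counts: Record<string, number>,
  expected: Record<string, number>,
): string[] {
  return Object.keys(expected).filter((key) => counts[key] !== expected[key]);
}
```

An empty result from findCountMismatches means every expected count was found with the expected value.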
Expected structure from the generated validation source:
Observed after Word save and CLI conversion:
- headings: 0.
- `**Bold ****text**and*italic** text*.`

This should be treated as a focused follow-up candidate: inline formatting should remain readable when Word inserts w:proofErr elements between runs.
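One way to keep inline formatting readable is to skip proofing markers and merge adjacent runs that share the same formatting before emitting Markdown. The following is a minimal sketch of that idea, not the converter's actual implementation. The ParagraphChild and StyledSpan types and both function names are hypothetical simplifications.

```typescript
// Hypothetical, simplified model of a parsed paragraph's children.
type ParagraphChild =
  | { kind: "run"; text: string; bold: boolean; italic: boolean }
  | { kind: "proofErr" };

interface StyledSpan {
  text: string;
  bold: boolean;
  italic: boolean;
}

// Drop proofErr markers and empty runs, then merge neighbours with equal formatting.
function coalesceRuns(children: ParagraphChild[]): StyledSpan[] {
  const spans: StyledSpan[] = [];
  for (const child of children) {
    if (child.kind !== "run" || child.text === "") continue;
    const last = spans[spans.length - 1];
    if (last !== undefined && last.bold === child.bold && last.italic === child.italic) {
      last.text += child.text;
    } else {
      spans.push({ text: child.text, bold: child.bold, italic: child.italic });
    }
  }
  return spans;
}

// Render spans as Markdown, keeping leading and trailing whitespace outside
// the emphasis markers so that the markers stay attached to the text.
function renderSpans(spans: StyledSpan[]): string {
  return spans
    .map((span) => {
      const marker = span.bold && span.italic ? "***" : span.bold ? "**" : span.italic ? "*" : "";
      if (marker === "" || span.text.trim() === "") return span.text;
      const leading = /^\s*/.exec(span.text)?.[0] ?? "";
      const trailing = /\s*$/.exec(span.text)?.[0] ?? "";
      const core = span.text.slice(leading.length, span.text.length - trailing.length);
      return leading + marker + core + marker + trailing;
    })
    .join("");
}
```

With this approach, a bold run split into two by a proofing marker is emitted as a single emphasized span instead of two adjacent ones.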
Expected structure from the generated validation source:
Observed after Word save and CLI conversion:
- headings: 0.
- The table grid-span placeholder ←M← was preserved.

Expected structure:
- word-headings-basic.docx
- word-inline-image-basic.docx
- word-inline-image-basic.docx

Observed after CLI conversion:
- word-headings-basic.docx summary reported headings: 5.
- word-inline-image-basic.docx.
- word/media/image1.jpeg.
- rId5, and word/_rels/document.xml.rels mapped rId5 to media/image1.jpeg.
- images: 1 and imageAssets: 1 for the focused image fixtures.
- word/media/image1.jpeg.

This covers Word-authored inline drawing image extraction.
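The relationship lookup described above (rId5 to media/image1.jpeg to word/media/image1.jpeg) can be sketched as follows. This is an illustrative sketch only: it uses a regular expression instead of a real XML parser, it assumes the relationships file has already been read into a string, and the function names are hypothetical rather than the converter's own.

```typescript
// Collect Id -> Target pairs from the text of a .rels part.
// The \b after "Relationship" prevents matching the <Relationships> root element.
function parseRelationships(relsXml: string): Record<string, string> {
  const targets: Record<string, string> = {};
  const elementPattern = /<Relationship\b([^>]*)>/g;
  let element: RegExpExecArray | null;
  while ((element = elementPattern.exec(relsXml)) !== null) {
    const attrs = element[1] ?? "";
    const id = /\bId="([^"]*)"/.exec(attrs)?.[1];
    const target = /\bTarget="([^"]*)"/.exec(attrs)?.[1];
    if (id !== undefined && target !== undefined) {
      targets[id] = target;
    }
  }
  return targets;
}

// Resolve a relationship target against the directory of the source part.
// Targets in word/_rels/document.xml.rels are relative to "word".
function resolvePartPath(sourcePartDir: string, target: string): string {
  if (target.charAt(0) === "/") return target.slice(1); // package-absolute target
  const segments = sourcePartDir.split("/").filter((segment) => segment !== "");
  for (const segment of target.split("/")) {
    if (segment === "..") {
      segments.pop();
    } else if (segment !== "." && segment !== "") {
      segments.push(segment);
    }
  }
  return segments.join("/");
}
```

For the fixture described above, looking up rId5 and resolving its target against the word directory yields the package path of the exported image.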
The behavior is now split across focused regression fixtures:
- tests/fixtures/docx/word-headings-basic.docx
- tests/fixtures/docx/word-inline-image-basic.docx
- tests/fixtures/docx/word-image-alt-text-basic.docx

The current regression tests cover heading behavior, inline image extraction, and image alt text behavior.
- 03-image-and-unsupported.docx could not be opened by Microsoft Word.
- 04-parser-image-asset.docx could not be opened by Microsoft Word.

These files are local workplace/ candidates only and should not be treated as Word-round-tripped validation documents.
- w:proofErr inside or between formatted runs.
- word-inline-image-basic.docx has been added.
- word-inline-image-basic.docx covers a Word-authored embedded JPEG image and is now extracted as an image asset.
- w:proofErr.
- miku-docx2md-web repository.
- word-inline-image-basic.docx.

Date: 2026-05-08
Commands rerun against local validation documents:
npm run cli -- workplace/validation-docx/01-basic-structure.docx --out workplace/validation-docx/out-current/01-basic-structure.md --summary-out workplace/validation-docx/out-current/01-basic-structure.summary.txt --debug
npm run cli -- workplace/validation-docx/02-lists-and-tables.docx --out workplace/validation-docx/out-current/02-lists-and-tables.md --summary-out workplace/validation-docx/out-current/02-lists-and-tables.summary.txt --debug
npm run cli -- workplace/validation-docx/05-word-openable-image.docx --out workplace/validation-docx/out-current/05-word-openable-image.md --summary-out workplace/validation-docx/out-current/05-word-openable-image.summary.txt --assets-dir workplace/validation-docx/out-current/05-word-openable-image.assets --debug
npm run cli -- tests/fixtures/docx/word-image-alt-text-basic.docx --out workplace/validation-docx/out-current/word-image-alt-text-basic.md --summary-out workplace/validation-docx/out-current/word-image-alt-text-basic.summary.txt --assets-dir workplace/validation-docx/out-current/word-image-alt-text-basic.assets --debug
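The rerun commands above share one naming pattern: each output file is named after the input file's base name inside a common output directory. A small sketch of generating the npm arguments from that pattern follows. The RecheckOptions type and the buildRecheckArgs function are hypothetical. Only the flags already shown in the commands above are used.

```typescript
interface RecheckOptions {
  inputPath: string; // path to the .docx file
  outDir: string; // directory for Markdown, summary, and assets output
  withAssets: boolean; // add --assets-dir for documents that contain images
}

// Build the argument list that follows "npm" for one recheck run.
function buildRecheckArgs(options: RecheckOptions): string[] {
  const fileName = options.inputPath.split("/").pop() ?? options.inputPath;
  const baseName = fileName.replace(/\.docx$/i, "");
  const args = [
    "run", "cli", "--", options.inputPath,
    "--out", `${options.outDir}/${baseName}.md`,
    "--summary-out", `${options.outDir}/${baseName}.summary.txt`,
  ];
  if (options.withAssets) {
    args.push("--assets-dir", `${options.outDir}/${baseName}.assets`);
  }
  args.push("--debug");
  return args;
}
```

Joining the result with spaces after "npm" reproduces the command line form used in this report.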
Recheck observations:
- 05-word-openable-image.docx converted with images: 1 and imageAssets: 1.
- word/media/word-openable-image.png, media type image/png, alt text Word openable image alt, and a nonzero size.
- word-image-alt-text-basic.docx converted with images: 1 and imageAssets: 1; manifest media type is image/jpeg and alt text is preserved.
- sectPr as an unsupported trace in these validation documents. This is acceptable diagnostic noise for now.
- 01-basic-structure.docx still produces noisy inline Markdown around adjacent formatted runs. The focused w:proofErr regression is now covered, but broader run coalescing remains a known limitation for generated/minimal validation documents.

Browser automation note:
miku-docx2md-web repository.