Interesting approach. Does keeping the model in HTML also preserve enough structure for tracked changes/comments, or do you handle those as a separate layer when converting back to DOCX?
tanin•Jun 18, 2026
Thank you!
My thesis is that an intermediate layer would eventually end up being equivalent to the docx format, so I've decided not to have any intermediate representation.
We convert the docx to HTML and send it to the AI. When the AI rewrites the HTML and sends it back, we diff the rewritten HTML against the docx's document.xml and apply the modifications. This is a simplified explanation; there are a bunch of validation and processing steps going on as well.
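The diff step described above can be pictured with a small sketch. It is not the actual implementation: it assumes both the original document and the AI's rewrite have already been reduced to lists of paragraph texts, and it uses Python's standard `difflib` as a stand-in for whatever diffing is really used. The function name `diff_paragraphs` and the edit tuple shape are hypothetical.

```python
from difflib import SequenceMatcher


def diff_paragraphs(original, rewritten):
    """Return (op, original_index, new_text) edits that turn `original` into `rewritten`.

    Both arguments are lists of paragraph strings (an assumption of this sketch).
    Paragraphs that are unchanged produce no edit, so untouched parts of
    document.xml would never need to be regenerated.
    """
    edits = []
    matcher = SequenceMatcher(a=original, b=rewritten, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag in ("replace", "delete"):
            # Paragraphs of the original that no longer appear.
            for i in range(i1, i2):
                edits.append(("delete", i, None))
        if tag in ("replace", "insert"):
            # New paragraphs, anchored at the original position i1.
            for j in range(j1, j2):
                edits.append(("insert", i1, rewritten[j]))
    return edits


if __name__ == "__main__":
    before = ["Hello world.", "Second paragraph.", "Third."]
    after = ["Hello world.", "Second paragraph, revised.", "Third."]
    for edit in diff_paragraphs(before, after):
        print(edit)
```

Only the changed paragraph yields edits here, which is the property that matters when the goal is to patch the existing document.xml rather than rebuild it.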
Regarding tracked changes and comments, we simply invent new HTML tags for those things, e.g. <ins>, <del>, <commentRangeStart>, etc.
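As a rough illustration of that tag mapping, the sketch below turns one WordprocessingML paragraph into HTML. The elements `w:ins`, `w:del`, `w:delText`, `w:commentRangeStart` and `w:commentRangeEnd` are standard docx markup; the function name `paragraph_to_html`, the `id` attribute on the HTML side, and the choice to drop author and date are assumptions made for this example, not details confirmed in the thread.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace used by WordprocessingML elements in document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _text(elem):
    """Collect visible text (w:t) and deleted text (w:delText) under elem."""
    wanted = (f"{{{W}}}t", f"{{{W}}}delText")
    return escape("".join(t.text or "" for t in elem.iter() if t.tag in wanted))


def paragraph_to_html(p_xml):
    """Convert one <w:p> element, given as a string, to a <p> with tracked-change tags."""
    parts = []
    for child in ET.fromstring(p_xml):
        name = child.tag.split("}")[1]  # strip the "{namespace}" prefix
        if name == "r":
            parts.append(_text(child))
        elif name in ("ins", "del"):
            # Tracked insertions and deletions map to the HTML tags of the same name.
            parts.append(f"<{name}>{_text(child)}</{name}>")
        elif name in ("commentRangeStart", "commentRangeEnd"):
            # Hypothetical convention: carry the comment id in an "id" attribute.
            cid = escape(child.get(f"{{{W}}}id") or "")
            parts.append(f'<{name} id="{cid}"></{name}>')
    return "<p>" + "".join(parts) + "</p>"


if __name__ == "__main__":
    sample = (
        f'<w:p xmlns:w="{W}">'
        "<w:r><w:t>Hello </w:t></w:r>"
        '<w:ins w:id="1" w:author="a"><w:r><w:t>new</w:t></w:r></w:ins>'
        '<w:del w:id="2" w:author="a"><w:r><w:delText>old</w:delText></w:r></w:del>'
        "</w:p>"
    )
    print(paragraph_to_html(sample))
```

Keeping the custom tag names identical to the docx element names makes the reverse mapping mechanical, which fits the idea of avoiding a separate intermediate representation.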
dev-kdrainc•Jun 18, 2026
Thanks for sharing! I like your approach to working under the hood! Great job
tanin•Jun 19, 2026
Thank you!
StahlGuo•Jun 18, 2026
I'll try it today. Sounds good!
tanin•Jun 18, 2026
Please let me know if you have questions or suggestions.