Git (or Hg) plugin for dealing with Microsoft Word and/or OpenOffice files

JudoWill · Jul 20, 2010 · Viewed 21.5k times

Has anyone come across a Git or Hg plugin for "meaningful" diffing, merging, and branching of OpenOffice or Microsoft Word files?

I know I can check in .doc files, but both Git and Hg treat them as binary blobs. I'd like to be able to do all (or at least many) of the normal revision-based operations on the text of the file.

And yes, I do know that I should be using LaTeX or converting files back and forth to RTF. I'm just looking for a more "native" solution, since I'm trying to manage collaboration between techies and "management people".

This is related to my question on Biostar here: http://biostar.stackexchange.com/questions/1749/writing-collaboration-with-source-control-and-microsoft-word

Thanks.

Answer

aparkerlue · Dec 4, 2010

How about:

  1. Save your Word docs in XML.
  2. Commit your XML Word files.
  3. Diff using an external XML diff tool. For example:

    $ git difftool -t xmldiff c3d293 498571

Reformatting the XML files so that each element is on its own line should make check-ins more efficient and also help the external XML diff tool run quickly.
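To illustrate the one-element-per-line idea, here is a minimal Python sketch. It is not part of any Git or Hg plugin, and the function names `one_element_per_line` and `line_diff` are made up for this example. It works purely on the text: it inserts a newline wherever one tag directly follows another, then uses the standard `difflib` module to show which lines changed between two versions. It assumes the application reading the file ignores whitespace between elements. An element written as an empty open/close pair would gain a newline as content, so check that the result still opens correctly before committing it.

```python
import difflib
import xml.etree.ElementTree as ET


def one_element_per_line(xml_text: str) -> str:
    """Insert a newline wherever one tag directly follows another."""
    # Purely textual: only "><" boundaries are touched, so text content
    # such as <t>Hello</t> stays on a single line and is not modified.
    result = xml_text.replace("><", ">\n<")
    # Sanity check: raises ET.ParseError if the text is not well-formed XML.
    ET.fromstring(result)
    return result


def line_diff(old_xml: str, new_xml: str) -> list:
    """Return only the removed (-) and added (+) lines between two versions."""
    old_lines = one_element_per_line(old_xml).splitlines()
    new_lines = one_element_per_line(new_xml).splitlines()
    diff = list(difflib.unified_diff(old_lines, new_lines, lineterm="", n=0))
    # The first two entries are the "---"/"+++" file headers; "@@" lines mark hunks.
    return [line for line in diff[2:] if not line.startswith("@@")]
```

With the XML on a single line, any edit makes the whole line differ. After the transformation, a changed sentence shows up as one removed and one added line, which is also what a line-based `git diff` would then report.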

References: