I am in charge of several Excel files and SQL schema files. How should I perform better document version control on these files?
I need to see which parts of these files were modified and keep all the versions for reference. Currently I append a timestamp to the file name, but that has proven inefficient.
Is there a better way or a good practice for document version control?
By the way, editors send me the files via email.
The answer I have written here can be applied in this case. A tool called xls2txt can provide human-readable output from .xls files. So in short, put this in your .gitattributes file:
*.xls diff=xls
And in the .git/config:
[diff "xls"]
binary = true
textconv = /path/to/xls2txt
Of course, I'm sure you can find similar tools for other file types as well, making git diff a very useful tool for office documents.
This is what I currently have in my global .gitconfig:
[diff "xls"]
binary = true
textconv = /usr/bin/py_xls2txt
[diff "pdf"]
binary = true
textconv = /usr/bin/pdf2txt
[diff "doc"]
binary = true
textconv = /usr/bin/catdoc
[diff "docx"]
binary = true
textconv = /usr/bin/docx2txt
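If no ready-made converter is installed for a format, a small script can fill the textconv role: Git runs the program with the file path as its only argument and diffs whatever it prints to standard output. Below is a minimal sketch for .xlsx files (not the .xls format that xls2txt handles), using only the Python standard library. The function name xlsx_to_text and the output layout are my own choices for illustration, and the script ignores formulas, dates, and formatting, so treat it as a starting point rather than a replacement for a real converter.

```python
#!/usr/bin/env python3
# Hypothetical textconv helper: dump the cell values of an .xlsx workbook as text.
import sys
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by the worksheet and shared-string parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}
T_TAG = "{%s}t" % MAIN_NS
C_TAG = "{%s}c" % MAIN_NS


def xlsx_to_text(path):
    """Return one 'cell reference <TAB> value' line per non-empty cell."""
    lines = []
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()

        # Strings are stored once in sharedStrings.xml and referenced by index.
        shared = []
        if "xl/sharedStrings.xml" in names:
            root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
            for si in root.findall("m:si", NS):
                shared.append("".join(t.text or "" for t in si.iter(T_TAG)))

        # Worksheets are listed by part name (sheet1.xml, ...), not by tab title.
        sheets = sorted(
            n for n in names
            if n.startswith("xl/worksheets/") and n.endswith(".xml")
        )
        for name in sheets:
            lines.append("[%s]" % name)
            root = ET.fromstring(zf.read(name))
            for cell in root.iter(C_TAG):
                ref = cell.get("r", "?")
                kind = cell.get("t")
                if kind == "inlineStr":
                    value = "".join(t.text or "" for t in cell.iter(T_TAG))
                else:
                    v = cell.find("m:v", NS)
                    if v is None or v.text is None:
                        continue  # empty cell, nothing to show
                    value = shared[int(v.text)] if kind == "s" else v.text
                lines.append("%s\t%s" % (ref, value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(xlsx_to_text(sys.argv[1]))
```

Saved as an executable file, it could be registered the same way as the tools above: a [diff "xlsx"] section whose textconv points at the script, plus a *.xlsx diff=xlsx line in .gitattributes.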
The Pro Git book has a good chapter on the subject: 8.2 Customizing Git - Git Attributes