Reputation: 18657
Faced with an unfortunate need to version MS Word documents, I have implemented the following configuration:
~/.gitconfig
# Help MS Word document versioning
[diff "pandoc"]
textconv=pandoc --to=markdown
prompt = false
./repo/.gitattributes
# Version control MS Word
*.docx diff=pandoc
*.docm diff=pandoc
When I try to run git diff Big-Problematic-Document.docm, I get the following error:
19:17 $ git diff Big-Problematic-Document.docm
UTF-8 decoding error in /var/folders/7x/kwc1y_l96t55_rwlv35mg8xh0000gn/T//uPSuEc_Big-Problematic-Document.docm at byte offset 22 (c1).
The input must be a UTF-8 encoded text.
fatal: unable to read files to diff
pandoc converted that thing and kept some non-UTF-8 encoded text. Is there a way to further develop ~/.gitconfig so the pandoc conversion will remove non-UTF-8 text?
Upvotes: 0
Views: 305
Reputation: 11
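It seems that pandoc does not know how to treat .docm (Word with VB macros).
Give it some help and add --read=docx as an explicit hint that the input is in fact a Word doc(x).
You probably want that only for the docm line in your .gitattributes; since the flag itself belongs in the textconv command in ~/.gitconfig, that means pointing the docm line at a driver whose textconv includes it.
That cures it for pandoc 2.9.1.1.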
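A minimal sketch of how this might look, assuming a second diff driver is acceptable. The driver name pandoc-docm is made up for illustration and is not part of the original setup; only the --read=docx flag comes from the answer above.

```ini
# ~/.gitconfig
# Existing driver for .docx, unchanged
[diff "pandoc"]
textconv=pandoc --to=markdown
prompt = false
# Hypothetical second driver: tells pandoc explicitly that the input is docx
[diff "pandoc-docm"]
textconv=pandoc --read=docx --to=markdown
prompt = false
```

```text
# ./repo/.gitattributes
*.docx diff=pandoc
# Route macro-enabled documents to the driver with the explicit input format
*.docm diff=pandoc-docm
```

Adding --read=docx to the existing pandoc driver instead should work as well, since every file routed to it is a Word document anyway; the separate driver just keeps the hint limited to .docm files.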
Upvotes: 1