Reputation: 1397
I have a binary file format that I include in a git repository. I know the file format of the binary and could conceivably create a diff-like tool that produces text output, so I could see diffs when I look at the git history. I could even create a tool that takes an original binary file and the diff text and creates the new binary file; that way git wouldn't have to save the binary file over and over again with small changes.
If I were to make these types of tools, how could I integrate them with git?
Upvotes: 5
Views: 1609
Reputation: 20909
From git help config:
diff.external
If this config variable is set, diff generation is not performed
using the internal diff machinery, but using the given command. Can
be overridden with the ‘GIT_EXTERNAL_DIFF’ environment variable.
The command is called with parameters as described under "git
Diffs" in git(1). Note: if you want to use an external diff program
only on a subset of your files, you might want to use
gitattributes(5) instead.
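As a rough sketch of what such an external diff command could look like: git invokes it with seven parameters (path old-file old-hex old-mode new-file new-hex new-mode), and the script prints whatever it likes. The script name bindiff.py, the driver name bindiff, the paths in the comments, and the hex-dump summary are all placeholders I made up; a real tool would decode your actual format instead.

```python
#!/usr/bin/env python3
"""Sketch of an external diff command for git (hypothetical name: bindiff.py).

Possible wiring (paths and the driver name "bindiff" are placeholders):
  all files:    git config diff.external /path/to/bindiff.py
  some files:   .gitattributes ->  *.bin diff=bindiff
                git config diff.bindiff.command /path/to/bindiff.py
"""
import difflib
import sys


def summarize(path):
    """Render a file as text, one line per 16-byte chunk.

    This hex dump is only a stand-in for a real, format-aware decoder.
    """
    with open(path, "rb") as f:
        data = f.read()
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off:off + 16]
        lines.append(f"{off:08x}  " + " ".join(f"{b:02x}" for b in chunk))
    return lines


def diff_lines(path, old_file, new_file):
    """Return a unified diff of the two text summaries as a list of lines."""
    return list(difflib.unified_diff(
        summarize(old_file), summarize(new_file),
        fromfile="a/" + path, tofile="b/" + path, lineterm=""))


# git passes: path old-file old-hex old-mode new-file new-hex new-mode.
# Unmerged paths get a single parameter, which this sketch simply ignores.
if __name__ == "__main__" and len(sys.argv) == 8:
    for line in diff_lines(sys.argv[1], sys.argv[2], sys.argv[5]):
        print(line)
```

For added or deleted files git passes /dev/null as one side, which reads as an empty file on Unix-like systems, so the summary for that side is simply empty.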
gitattributes(5) also mentions a mechanism called textconv: instead of supplying a diff program, you supply a program that converts your binary file to a textual summary; the normal git diff mechanisms are then used to present diffs of those textual summaries.
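To make the textconv idea concrete, here is a minimal sketch of such a converter. git runs the textconv program with one argument (the file to convert) and reads the text from its standard output. The file layout below (a 4-byte magic "DEMO" followed by little-endian records of a 16-bit id and a 32-bit value), the script name demo2text.py, and the driver name demo are all invented for illustration; substitute your own format.

```python
#!/usr/bin/env python3
"""Sketch of a textconv filter (hypothetical name: demo2text.py).

Possible wiring (the driver name "demo" and the path are placeholders):
  .gitattributes ->  *.demo diff=demo
  git config diff.demo.textconv "python3 /path/to/demo2text.py"
"""
import struct
import sys

MAGIC = b"DEMO"                # hypothetical 4-byte file signature
RECORD = struct.Struct("<HI")  # hypothetical record: u16 id, u32 value


def to_text(data):
    """Convert the raw bytes of one file into a list of text lines."""
    if data[:4] != MAGIC:
        return [f"unrecognized file ({len(data)} bytes)"]
    lines = ["magic: DEMO"]
    body = data[4:]
    # Only decode whole records; report any leftover bytes separately.
    usable = len(body) - len(body) % RECORD.size
    for off in range(0, usable, RECORD.size):
        rec_id, value = RECORD.unpack_from(body, off)
        lines.append(f"record id={rec_id} value={value}")
    if usable != len(body):
        lines.append(f"trailing bytes: {len(body) - usable}")
    return lines


# git calls the textconv program with a single argument: the file to convert.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as f:
        print("\n".join(to_text(f.read())))
```

Emitting one record per line keeps the resulting diffs small, since a change to one record then shows up as a one-line change in git diff and git log -p.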
Edit: I don't know of any way to make the low-level object-packing routines use a custom diff tool. Reading between the lines of the low-level git-pack-objects(1) man page, it seems likely that the underlying pack format uses a binary delta format and adaptively searches for an existing object to construct a delta from, so as to avoid storing the entire new object. At this level the objects (files) are just binary blobs, and I think in all but the most obscure cases it's probably best to treat the object packing as an implementation detail.
In other words, if your binary objects are similar to each other at the binary level, they will be represented efficiently automatically by git. The common cases I can imagine where this wouldn't be true are compressed and encrypted files.
Upvotes: 4