Git merge on Windows with duplicate files

Question

I'm having a problem synchronizing a local branch with a remote. I know why and have a work-around, but would like to know if there is a better solution.

Relevant factors:

1. It's a shared gh-pages branch, so there is no active development on the branch.
2. I work on Windows (and cannot easily activate POSIX support for case-sensitive file names).
3. The repository contains two files whose names differ only in case, i.e. they will overwrite each other!
4. I want to add some other files.
5. And synchronize this branch with the remote (using merge/pull).

Items 2 and 3 mean that just checking out the files creates a change, as one case-variant over-writes the other.

However, items 1-4 can still be handled by creating a new local branch, staging and committing the new files - and ignoring warnings.

But merge, pull, etc. fail, as the files with case conflicts cause problems.

Work-around:

Delete the local branch and then create a new one based on the remote (assuming there are no local changes).

torek · Accepted Answer

You need not delete any branch but you will have to be careful with your working tree.

Remember that what Git actually uses, both to check out any existing commit, and to make any new commit, is Git's index. Git also calls this entity the staging area, reflecting its role in making new commits, and—though these days you mostly see this as a flag, --cached—the cache, reflecting the index's role in making Git go fast.

Files stored in the index are case-sensitive, so Git's index is capable of holding two separate files named a/readme.txt and A/README.TXT, for instance. Note that files in the index are represented with path names that use forward slashes, and that a/readme.txt is a file name—there are no folders in the index, just files with embedded slashes in their names.

The files that are in the index are stored in Git's internal form. This is not useful to you: these files are in a compressed, read-only, Git-only data format. So Git expands each such file into an ordinary everyday read/write file. This ordinary read/write file goes into a folder and has an ordinary everyday file name. But Windows file systems are, by default, case-insensitive. So when Git goes to write both a/readme.txt, which requires creating a folder named a and a file named readme.txt in it, and A/README.TXT, which requires creating a folder named A and a file named README.TXT in that, you get a name collision. Only one folder, and only one file, actually get created.
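To see which paths in a commit would collide this way, one option is to fold every name to lower case and look for duplicates. The following is a minimal sketch, not something from Git itself: find_case_collisions is a made-up helper name, and it assumes path names plain enough that git ls-tree does not quote them.

```shell
# Hypothetical helper: print (in lower case) every path in a commit that
# shares its case-folded name with at least one other path in that commit.
find_case_collisions() {
    # $1 is any commit or tree name; defaults to HEAD.
    git ls-tree -r --name-only "${1:-HEAD}" |
        tr '[:upper:]' '[:lower:]' |   # fold case, as a case-insensitive file system would
        sort |
        uniq -d                        # keep only names that occur more than once
}
```

Running find_case_collisions origin/gh-pages (assuming the remote is named origin) would then list the problem paths. This only catches whole paths that match after folding. Two folders that differ in case but hold differently named files, such as a/x.txt and A/y.txt, would need a per-component check.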

The index continues to hold both files, a/readme.txt and A/README.TXT. Using one of Git's so-called plumbing commands, git update-index, it's possible (but very difficult and annoying) to update both of these files. Remember that while the index holds files in Git's frozen-and-compressed format, you can replace these index files wholesale, provided that you:

compress the data into Git's internal format, producing what Git calls a blob hash ID;
provide git update-index the file name, such as a/readme.txt, and the blob hash ID.

The data to go into the internal blob object can come from any file anywhere on your computer. It need not even be in your working tree: it can be there or elsewhere. Use git hash-object -w to create the internal blob object, and save the resulting hash ID in a variable. Then, immediately run git update-index to replace the index copy of the file.
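In sh/bash, the two steps might look like the sketch below. It rests on assumptions: it runs in a throwaway repository made with mktemp, the file contents are invented, and 100644 is the mode for an ordinary non-executable file. It first seeds the index with both case variants so that there is something to replace.

```shell
# Throwaway repository so the sketch does not touch real work (assumed setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Seed the index with both case variants, bypassing the working tree entirely.
old=$(printf 'old text\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$old",a/readme.txt
git update-index --add --cacheinfo 100644,"$old",A/README.TXT

# The replacement data can live anywhere; here it is a file outside the repository.
newfile=$(mktemp)
printf 'new lowercase text\n' > "$newfile"

# Step 1: store the data as a blob object and save its hash ID in a variable.
blob=$(git hash-object -w "$newfile")

# Step 2: point the index entry a/readme.txt at that blob.
git update-index --cacheinfo 100644,"$blob",a/readme.txt

# Inspect the index: a/readme.txt now has the new blob, A/README.TXT keeps the old one.
git ls-files --stage
```

Because neither command looks up a/readme.txt in the working tree, the case-insensitive file system never gets a chance to confuse the two names.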

For how to use each of these two low-level plumbing commands, see their documentation: git hash-object and git update-index.

Note that these two commands are not really meant for use by humans: they are meant to be run by programs that are more human-oriented, and are just building blocks that commands like git add and git rm can use. The problem with using git add and git rm here is that those programs want to work with your working tree files, using the names as found on your computer—such as A\README.TXT—rather than with Git's internal file names. So that's why you would need to use the low-level commands, so that you can store both of Git's internal files' data (a/readme.txt and A/README.TXT) in files with different names, and then update Git's internal files from those different names.

Edit: I forgot to mention that you will need to read out the two files. There are multiple ways to do this but probably the easiest is to use git show with shell-style redirection. In sh/bash, you would run:

git show HEAD:a/readme.txt > lowercase-readme
git show HEAD:A/README.TXT > uppercase-readme

to get both files out, with two different names, into your working tree. Git will in fact extract both files during git checkout, but they will wind up occupying a single work-tree file. Its name might be either readme.txt or README.TXT, depending on which name "wins" the get-a-Windows-file-name competition, and it might appear in a folder named a or one named A, depending on which of those "wins" the get-a-Windows-folder-name competition.
