I had a React project folder structure similar to this:
├── folderA
├── folderB
│ └── SubFolderA // <--notice here
└── folderC
I noticed the inconsistency and renamed SubFolderA to subFolderA.
I then pushed to my repository and a Jenkins build was triggered. The job failed, however: after some investigation, I found that my branch still lists the new subFolderA as SubFolderA.
What can I do here?
Upvotes: 1
Views: 1113
Reputation: 487725
Before you read the below (but do read it), note that if you are on a Mac and want to deal with case-sensitivity issues, there is a really easy way to do this: make a case-sensitive file system on a virtual disk, mount that disk (just double-click the .dmg file on the desktop if you like that method, for instance), then clone your repository in the mounted volume. You now have a case-sensitive setup and can deal with everything trivially.
For details on creating a case-sensitive volume, see my answer to "How do I change case of the names of multiple files, already committed?"
This is a symptom of a whole class of problems with Git that I like to describe this way: The files you see and work with, in a Git repository, are not the files in the Git repository.
This probably sounds funny (bizarre, impossible, and/or humorous). It's meant to, so that it's easy to remember, but it's literally true. The thing here is that Git doesn't store files. Git stores commits. Each commit then stores files, as a sort of read-only archive, but the files stored inside a commit aren't ordinary everyday files. They're in a special Git-only format, not subject to the same constraints your computer imposes on files. This means Git can store files (file names and/or contents) that your computer may not like and, in some cases, may not even be able to store.[1]
Aside from line endings (as in footnote 1), the most common manifestation of this has to do with mixed-case file and/or directory ("folder", if you prefer) names. Both macOS and Windows generally provide and present case-preserving but case-insensitive names. Once you create a README.md file, you cannot create a second ReadMe.md file in the same place: any attempt to do so just uses the existing README.md file. And yet, a Git commit can store both files, with different contents for the two, and you can set this up on a Linux system.
On the Linux system, for instance, we create the README.md file and put in it the line this is README.md. Then we create ReadMe.md and put in it the line this is ReadMe.md. We commit both files and make sure the commit is now somewhere such that we can get it on our Mac or Windows box (e.g., we git push it to GitHub, if we can't just use ssh to the Linux box).
Now we clone the repository onto that Mac or Windows box, and ask Git to check out the commit we made on the Linux system. This commit cannot be correctly checked out because to do so requires creating both files, but your system won't let you do that.
So: what happens when you do check out this commit? The answer is that Git actually does check out both files, but your OS only lets you work on / with one of them. If you compare what you have, in your work-area, where Git has extracted its files into ordinary everyday files for you to work on, to what Git has in its area—which is in Git's format and stores both files—you have in effect either deleted one of the two files (if we treat the file whose name isn't there as deleted), or replaced the contents of one of the two files (if we treat the file whose name isn't there as "there" by reading the file whose name is there after doing the case-folding).
That is, there should be both a ReadMe.md and a README.md, and there aren't; so either one is deleted, or the one file has whichever line it has, which means we changed the other file, or maybe even both. Let's say ReadMe.md got checked out second and overwrote the contents, so that README.md has the line this is ReadMe.md.
The interesting thing here is that we can actually work with this repository on the macOS or Windows system. This is because Git does not make commits from what we have in our working tree. When you run git commit, Git does not use the files you work on / with. It uses, instead, the files that Git has stored in what Git calls, variously, the index, or the staging area, or (rarely now) the cache. These files are in Git's own internal format, and therefore can have any names that Git supports, and any content (including LF-only lines instead of CRLF-ended lines, on Windows systems).
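As a minimal sketch of "commits come from the index, not the working tree", the following script stages a file, changes the working-tree copy afterwards, and then commits. The throwaway repository, the file name note.txt, and the demo identity are all assumptions made up for illustration.

```shell
#!/bin/sh
# Sketch: the commit records what was staged, not what is on disk now.
set -e
repo=$(mktemp -d)                 # hypothetical scratch repository
cd "$repo"
git init -q .

echo "staged" > note.txt
git add note.txt                  # copies this content into the index

echo "edited later" > note.txt    # working tree changes, index does not

git -c user.name=demo -c user.email=demo@example.com commit -q -m "snapshot"

# Read the file back out of the commit itself.
committed=$(git show HEAD:note.txt)
echo "$committed"
```

The working-tree file holds the later edit, but the commit holds the content that was in the index when git commit ran.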
The big stumbling block, of course, is that you can't see or edit the copies of the files that are in Git's index. They're not normal files! To see the contents of a Git-ized file, as it appears in Git's index, you must tell Git to copy the file out of Git's index. The usual way to do that is git checkout, but this usual way is failing, because the file's name doesn't comport with your system's file-name requirements, in some way or another. But there are other ways to see the content: for instance, git show :README.md and git show :ReadMe.md will show you both files' contents. This special colon (:) syntax is peculiar to Git itself; :README.md means the file named README.md as found in the index. Since Git's file names are case-sensitive, this is clearly distinct from :ReadMe.md; Git won't get the two mixed up.
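Here is a small sketch of the index holding both names at once. To avoid depending on what the file system allows, it never creates either file on disk: it writes the two contents straight into Git's object store and registers them in the index with plumbing commands. The scratch repository is an assumption for the demo, and I have only reasoned through this on a case-sensitive (Linux) setup.

```shell
#!/bin/sh
# Sketch: put README.md and ReadMe.md into the index without touching the
# working tree, then read each one back with the :path syntax.
set -e
repo=$(mktemp -d)                 # hypothetical scratch repository
cd "$repo"
git init -q .

# Store each content as a blob; hash-object prints the blob's hash ID.
upper=$(printf 'this is README.md\n' | git hash-object -w --stdin)
mixed=$(printf 'this is ReadMe.md\n' | git hash-object -w --stdin)

# Register both names in the index (mode 100644 = non-executable file).
git update-index --add --cacheinfo 100644,"$upper",README.md
git update-index --add --cacheinfo 100644,"$mixed",ReadMe.md

git ls-files                      # lists both names

from_upper=$(git show :README.md)
from_mixed=$(git show :ReadMe.md)
echo "$from_upper"
echo "$from_mixed"
```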
This particular problem manifestation is, of course, different from the one you're seeing. In your case, the problem is two-fold:
How do you know what case Git has used to store these files? When you run git checkout, Git just lazily re-uses file and/or folder names that you have in place, which means that if you have SubFolderA in place, Git will use that, even if the stored-in-Git names use folderB/subFolderA/file.ext as their names.[2]
How do you make sure that your next commit will use folderB/subFolderA/file.ext as the file's name, if that's what you want Git to use as the file's name?
Fortunately, it's easy to view what Git is using, on any system, because there is a low-level command (one that you wouldn't normally use) that lets you dump out what's in Git's index. That command is git ls-files. When run with no options at all, it simply dumps out all the file names that are in Git's index right now. (Used with --stage, it produces more-detailed output that helps show why the index is also called the staging area. With --debug, it dumps out the internal flag bits and other cache information that are the reason that the index is also called the cache. In all cases, the information is not really meant for human consumption: it's what Git calls a plumbing command, which is a command that's used to build higher-level, human-useful commands.)
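For a quick look at what these two forms print, here is a sketch that stages one file in a scratch repository and lists the index both ways. The folder and file names just mirror the ones in the question; the repository itself is a made-up demo.

```shell
#!/bin/sh
# Sketch: list index entries by name only, then with mode, blob hash, and stage.
set -e
repo=$(mktemp -d)                 # hypothetical scratch repository
cd "$repo"
git init -q .

mkdir -p folderB/subFolderA
echo demo > folderB/subFolderA/file.ext
git add folderB

names=$(git ls-files)             # one full path per line
staged=$(git ls-files --stage)    # <mode> <blob hash> <stage number> TAB <path>

echo "$names"
echo "$staged"
```

Note that the name is one long path with slashes in it, spelled exactly as Git will record it in the next commit.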
When you git checkout some commit (or git switch to it, which is the same here), Git fills in its index from that commit. So the file names in the index are the ones Git is really using. If you inspect it, you'll see what's in the commit. As you work on / with the commit and use git add, git mv, and the like, if you inspect Git's index, you'll see what Git has in the proposed next commit.
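Applied to the question's situation, one common way to get the new spelling into the index is a two-step git mv through a temporary name, so that even a case-insensitive file system sees a real rename. This is a sketch, not something spelled out above: the temporary name tmp-rename and the scratch repository are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: change SubFolderA to subFolderA in the index via a temporary name,
# then confirm with git ls-files what the next commit will record.
set -e
repo=$(mktemp -d)                 # hypothetical scratch repository
cd "$repo"
git init -q .

mkdir -p folderB/SubFolderA
echo demo > folderB/SubFolderA/file.ext
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

before=$(git ls-files)

# Two renames: old name -> temporary name -> new name.
git mv folderB/SubFolderA folderB/tmp-rename
git mv folderB/tmp-rename folderB/subFolderA

renamed=$(git ls-files)
echo "before: $before"
echo "after:  $renamed"
```

After this, committing and pushing should carry the lower-case name, which you can double-check with git ls-files before the push.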
Alternatively, you can see what's in some existing commit without checking it out: use git ls-tree -r with anything that locates the commit itself, such as a raw hash ID, or a branch name. See the gitrevisions documentation for the many ways to spell a hash ID.
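As a sketch of putting git ls-tree -r to work, the script below builds a commit that contains both README.md and ReadMe.md (using index plumbing, so the file system is never asked to hold both), then lists the commit's names and looks for any that collide once case is ignored. The lower-casing pipeline is my own illustration, not a Git feature, and the scratch repository is an assumption.

```shell
#!/bin/sh
# Sketch: list the names stored in a commit and report case-only collisions.
set -e
repo=$(mktemp -d)                 # hypothetical scratch repository
cd "$repo"
git init -q .

a=$(printf 'this is README.md\n' | git hash-object -w --stdin)
b=$(printf 'this is ReadMe.md\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$a",README.md
git update-index --add --cacheinfo 100644,"$b",ReadMe.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m "both spellings"

# Names exactly as stored in the commit, one per line.
git ls-tree -r --name-only HEAD

# Fold to lower case and print any name that now appears more than once.
collisions=$(git ls-tree -r --name-only HEAD | tr '[:upper:]' '[:lower:]' | sort | uniq -d)
echo "collisions: $collisions"
```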
Now, if you need to work with such a file and can't check it out (for instance, if there's a file named aux.h or con.jpg in some commit, and you're on a Windows box that refuses to allow you to have a file with that name), here's how you can do it, manually and painfully:
git show :aux.h > fakeaux.h
(Replace fakeaux.h with whatever file name you like.) Use whatever method is required in whatever command line interpreter you're using, so as to have Git extract the contents (with git show) and put them into a file whose name you can deal with. (Alternatively, you might be able to use git mv to rename the file in the index temporarily, although this might fail because it might notice that the file doesn't exist in your working tree.)
Run git hash-object -w fakeaux.h. Note that hash-object chooses CRLF conversions, clean filters, and so on from the name of the file you give it (fakeaux.h here), not from aux.h. Add --path=aux.h to make it do whatever conversions would happen with git add aux.h. If none of this talk about clean filters makes any sense to you, just make sure the file has the right line endings (LF-only if appropriate) first.
The hash-object command prints out a big ugly hash ID. Snag it with the mouse (cut-and-paste), or if you're using bash, consider using command substitution to capture the hash ID, e.g.:
hash=$(git hash-object -w fakeaux.h)
Last, use git update-index to replace the hash ID. This is probably best done with the --cacheinfo option. Note that when using --cacheinfo you must supply a mode, which should be 100644 for a non-executable file, or 100755 for an executable file.
Here's an example of doing this sort of tricky, by-hand update on a box that doesn't require it at all, just so you can see the method. The first step I used here, git ls-files --stage Makefile, is just to find the current mode to use later, though it also shows how files are actually stored in the index (by name and blob-hash-ID):
$ git ls-files --stage Makefile
100644 7b64106930a615c2e867a061f94cd6d3ea834641 0 Makefile
$ git show :Makefile > xx
$ vim xx
[editor session, snipped]
$ git hash-object -w xx
cd099334db6c1136d79653872256c7091db7c1bd
$ git update-index --cacheinfo 100644,cd099334db6c1136d79653872256c7091db7c1bd,Makefile
$ git status -s
MM Makefile
?? xx
The MM above is because the staged Makefile has my change in it, but the working tree Makefile doesn't; the xx untracked file is where I made the change. I can now run git diff --cached to show that I have in fact changed the proposed Makefile in the proposed next commit:
$ git diff --cached
diff --git a/Makefile b/Makefile
index 7b64106930..cd099334db 100644
--- a/Makefile
+++ b/Makefile
@@ -1,3 +1,4 @@
+sneaky
# The default target of this Makefile is...
all::
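The same three steps can also be run non-interactively. The sketch below uses the aux.h name from earlier; the file contents, the appended line standing in for an editor session, and the scratch repository are all made up for the demo. It first plants aux.h in the index with plumbing so that the script does not depend on whether the file system accepts that name.

```shell
#!/bin/sh
# Sketch: edit an index-only file via a stand-in file, then swap the new
# blob into the index under the original name.
set -e
repo=$(mktemp -d)                 # hypothetical scratch repository
cd "$repo"
git init -q .

# Setup: an index entry named aux.h that never exists in the working tree.
orig=$(printf '#define OLD 1\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$orig",aux.h

# Step 1: copy the content out of the index into a file we can work with.
git show :aux.h > fakeaux.h
printf '#define NEW 2\n' >> fakeaux.h     # stands in for the editor session

# Step 2: store the edited content; --path makes Git treat it as aux.h
# when choosing line-ending conversions and filters.
hash=$(git hash-object -w --path=aux.h fakeaux.h)

# Step 3: point the index entry for aux.h at the new blob.
git update-index --cacheinfo 100644,"$hash",aux.h

result=$(git show :aux.h)
echo "$result"
```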
Note that this problem comes up on macOS not just with case-folding, but also with some Unicode file names. For instance, Unicode has two ways to spell schön: with a single precomposed ö, or with a plain o followed by a combining diaeresis. On a Linux system, we can use both ways to make two different files with two different names that both display as schön. We can then git add both of these files, and commit. But on macOS, you can't have two separate files here: we're back in the old ReadMe.md vs README.md situation, this time with schön vs schön. The case-sensitive volume trick won't help here, but the ugly do-it-by-hand, manipulate-the-index-with-plumbing-commands method will.
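To make the two spellings visible, this sketch builds both names from explicit UTF-8 bytes, counts the bytes, and registers both in the index of a scratch repository. It assumes a Linux box, as in the text; on macOS, Git's own Unicode handling may fold the two names together before they reach the index, so the final count could differ there. The blob content and repository are demo assumptions.

```shell
#!/bin/sh
# Sketch: two byte-wise different names that both display as "schön".
set -e

nfc=$(printf 'sch\303\266n')      # precomposed o-with-diaeresis (U+00F6)
nfd=$(printf 'scho\314\210n')     # plain o followed by combining diaeresis (U+0308)

nfc_bytes=$(printf '%s' "$nfc" | wc -c | tr -d ' ')
nfd_bytes=$(printf '%s' "$nfd" | wc -c | tr -d ' ')
echo "precomposed: $nfc_bytes bytes, decomposed: $nfd_bytes bytes"

repo=$(mktemp -d)                 # hypothetical scratch repository
cd "$repo"
git init -q .

blob=$(printf 'same content\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$blob","$nfc"
git update-index --add --cacheinfo 100644,"$blob","$nfd"

# Git compares names byte by byte, so the index holds two entries.
git ls-files
count=$(git ls-files | wc -l | tr -d ' ')
```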
[1] Modern Windows and macOS can store any content, so that's not generally a problem, but (especially on Windows) you might see files whose lines have CRLF endings, while what's stored inside Git doesn't have CRLF endings. When this is the case, the files you're editing are clearly different from the files that are going into commits.
Technically, it's not Windows or macOS itself that is the problem. Rather, it's the file systems these systems provide. Linux now has the ability to mount case-insensitive file systems, and macOS certainly supports case-sensitive file systems. It's just that the default that you'll encounter on Linux is case-sensitive while the default that you'll encounter on macOS is case-insensitive.
[2] Note that, as stored in Git, files have names that have embedded slashes in them. These are not folders-with-files, they're just long names containing slashes. The slashes are always normal (forward) slashes, even on Windows. Technically, this is a feature of Git's index, rather than of the files as they're stored in the commits, but since commits are built from the index, you probably should think about the file names this way.
(Note that you can have a file name with a backslash in it, too: some/path/to/file\with\weird\name is OK, as far as Git is concerned. That's just one long path, but when checked out, Git will try to accommodate a Linux system by breaking it up into three directories / folders, plus a final name: some, path, to, file\with\weird\name. What Git does on Windows here, I don't know, but it might be interesting to test. What I mean by "always forward slashes" is that when your Windows work-tree has some\path\to\file.ext, Git has some/path/to/file.ext in its index.)
Upvotes: 4