Reputation: 1671
I'm seeing git do things with line endings that seem to contradict everything I've seen on this site, in the official docs, and even in its own warning messages. (Or maybe I fail at reading comprehension.) Here's a small reproduction.
# repro.sh
git --version # 2.27.0.windows.1
mkdir empty
cd empty
echo '* text=auto !eol' > .gitattributes
echo hi > t.txt
git init
git config core.autocrlf false # I guess git attributes overrides this anyway?
git add t.txt # wait what? warning: LF will be replaced by CRLF in t.txt??? I thought git likes LF?
git commit -m 'the plot thickens'
git cat-file -p `git rev-parse HEAD:t.txt` > temp.txt # get raw blob as its really stored, I hope
od -c t.txt # original file in working dir ends in LF
od -c temp.txt # file from git also ends in LF, despite git warning??
# end of script
Does this make any sense? I thought that git sometimes likes to convert CRLF to plain LF on "git add" and does the reverse on checkout, but I've never heard of it turning plain LF into CRLF on git add, as the warning seems to threaten. And then it doesn't do it. The file that gets checked in is exactly what I have in my working directory, as verified by cat-file. So why warn at all? What's going on?
Upvotes: 2
Views: 280
Reputation: 487893
The message itself has always seemed a bit ... wrong? weird? ill-phrased?—I'm not sure what to call it. The intent of the message is to warn you that something seems inconsistent: that the way you see the file in the future may not be compatible with the way you see the file right now.
With that in mind, let's get to specifics:
echo '* text=auto !eol' > .gitattributes
First, on text=auto: this sets the text attribute to a string value auto, which tells Git: please guess whether each file is text or binary. I personally think this is a bad idea: you don't want Git to guess. You should just tell it. Git's guesses are usually pretty good, but I don't like my software to guess that much. :-)
In any case, let's move on to !eol: this means to set the eol attribute to the unspecified state. This may not be what you want. It starts unspecified, so if you don't want to specify it, you can just leave it unspecified. The ! prefix exists so that you can correct some previous setting: for instance, if the default should be eol=lf, you might have:
* eol=lf
but since JPG files should not be munged we can then override this just for *.jpg:
*.jpg !eol
(although *.jpg binary is probably better: it means -diff -merge -text and with -text the eol attribute becomes irrelevant).
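If you want to see which state each attribute ends up in, git check-attr will report it. Here is a minimal sketch, assuming a throwaway repository; the file names (notes.txt, photo.jpg, logo.png) and the *.png line are made up for illustration and are not part of your setup:

```shell
#!/bin/sh
# Sketch: inspect attribute states with git check-attr in a scratch repository.
set -eu
scratch=$(mktemp -d)          # throwaway directory; all names here are hypothetical
cd "$scratch"
git init -q .

# A default for everything, then two different ways to exempt images.
printf '%s\n' '* eol=lf' '*.jpg !eol' '*.png binary' > .gitattributes

# check-attr works on path names; the files do not need to exist.
txt_eol=$(git check-attr eol -- notes.txt)
jpg_eol=$(git check-attr eol -- photo.jpg)
png_attrs=$(git check-attr text diff merge -- logo.png)

printf '%s\n' "$txt_eol" "$jpg_eol" "$png_attrs"
```

This should report eol as lf for notes.txt and as unspecified for photo.jpg, while logo.png shows text, diff, and merge all unset, which is what the binary macro expands to.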
So, what we have so far is: a file is text if and only if Git guesses that it's text, and the eol attribute is unspecified.
git config core.autocrlf false # I guess git attributes overrides this anyway?
The text attribute specifically overrides this one. The gitattributes documentation says, in part:
If the text attribute is unspecified, Git uses the core.autocrlf configuration variable to determine if the file should be converted.
This doesn't say what happens if the text attribute is specified (which it is, to auto), but going back just a bit, we find that with text=auto:
If Git decides that the content is text, its line endings are converted to LF on checkin. When the file has been committed with CRLF, no conversion is done.
This talks only about checkin. The documentation doesn't say this, but that's really during git add, which is when Git will, maybe, turn CRLF into LF-only.
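To watch the on-the-way-in conversion happen, you can add a file that really does have CRLF endings and compare sizes. This is a sketch only, run in a scratch repository; crlf.txt is a made-up file name, and core.safecrlf is switched off for the one command just so that a stricter personal configuration does not abort the demo:

```shell
#!/bin/sh
# Sketch: a new CRLF file added under text=auto should be stored with LF only.
set -eu
scratch=$(mktemp -d)          # throwaway directory, not your real repository
cd "$scratch"
git init -q .
git config core.autocrlf false
printf '%s\n' '* text=auto' > .gitattributes

printf 'hi\r\n' > crlf.txt    # 4 bytes in the working tree: h, i, CR, LF
git -c core.safecrlf=false add crlf.txt

worktree_size=$(wc -c < crlf.txt | tr -d ' ')
index_size=$(git cat-file -p :crlf.txt | wc -c | tr -d ' ')
echo "working tree: $worktree_size bytes, index: $index_size bytes"
```

I would expect the index copy to be one byte shorter than the working tree copy, because the CR was dropped during git add. The working tree file itself is left alone.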
git add t.txt # wait what? warning: LF will be replaced by CRLF in t.txt???
Git emits these warnings during git add (unless they're suppressed through other configurations) when it notices anything suspicious. The warnings are, or at least include, what you've seen, which I sometimes call ill-phrased (for lack of a better term). I don't have a better way to phrase them that's not so verbose that it becomes problematic, though.
There are only two built-in LF/CRLF conversions:
An "on the way in" conversion that turns CRLF into LF-only: this happens only during git add, and then only if it is, or seems to be, called for.
An "on the way out" conversion that turns LF-only into CRLF: this happens during git checkout, git reset --hard, git restore (if run with an explicit or implied --worktree), and other similar operations. But, like the on-the-way-in CRLF-to-LF conversion, it only happens if it is, or seems to be, called for.
What is happening here is that Git is suspicious that you'll have an LF-to-CRLF conversion occur on the way out, some time in the future. I think your setup is not configured this way right now, because you have !eol and are on Linux (you are on Linux? Maybe not: you said windows in a version string). So maybe your setup is configured this way right now because you have !eol and are on Windows. I don't use Windows, so I'm not sure what the defaults are on Windows.
Meanwhile, though, t.txt, as seen in both your index and your working tree, has pure LF-only line endings. If Git were to perform an on-the-way-out LF-to-CRLF conversion (from index copy to working tree copy), your t.txt file in your working tree would suddenly have CRLF line endings.
That's what this warning message means. If, in the future, Git does text conversion on the file, the result of extracting what's now in Git's index won't match the actual file in your working tree right now. The one conversion that Git can do here is to turn LF-only into CRLF, and t.txt is currently LF-only.
git commit -m 'the plot thickens'
The plot hasn't really thickened, here. All conversions happened before this point. The commit command merely takes the t.txt file stored in Git's index (that's the only file in Git's index at this point since the repository is all-new) and makes a commit out of that.
git cat-file -p `git rev-parse HEAD:t.txt` > temp.txt # get raw blob as its really stored, I hope
This does, yes. You could equally grab :t.txt from the index, or use git ls-files --stage to get the blob hash ID.
Note that the git commit step has not modified the working tree copy. It's still untouched. To force Git to extract the index copy back to the working tree, first remove the working tree copy, then use any Git command that will create it afresh. This will run the extraction step, which will (or won't) turn LF-only to CRLF as requested by your various configurations:
rm t.txt
git checkout -- t.txt
You can now use od or similar to see what happened. Did \n become \r\n, or not? That tells you how Git interprets your current setup (core.autocrlf, core.eol, and the various attributes in .git/info/attributes and .gitattributes) for this file.
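If you'd rather not delete the working tree file, git cat-file --filters can show what the extraction step would write. The sketch below assumes a Git recent enough to have that option and runs in a scratch repository. It overrides core.eol on the command line to imitate both a Linux-like and a Windows-like setup; that override is my assumption for the demo, not something in your configuration:

```shell
#!/bin/sh
# Sketch: compare the raw index blob with what a checkout would produce.
set -eu
scratch=$(mktemp -d)          # throwaway directory, not your real repository
cd "$scratch"
git init -q .
git config core.autocrlf false
printf '%s\n' '* text=auto !eol' > .gitattributes
printf 'hi\n' > t.txt         # LF-only, 3 bytes
git -c core.safecrlf=false add t.txt

# Raw index copy, with no conversion applied.
raw_size=$(git cat-file -p :t.txt | wc -c | tr -d ' ')
# What the on-the-way-out step would write under each core.eol value.
lf_size=$(git -c core.eol=lf cat-file --filters :t.txt | wc -c | tr -d ' ')
crlf_size=$(git -c core.eol=crlf cat-file --filters :t.txt | wc -c | tr -d ' ')

echo "raw=$raw_size core.eol=lf:$lf_size core.eol=crlf:$crlf_size"
```

With core.eol=crlf the filtered output should be one byte longer than the raw blob, which is the same mismatch the warning is talking about. Piping either command through od -c instead of wc -c shows the actual bytes.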
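Note: git ls-files --eol has, since Git 2.8, been able to tell you more about what's going on here. It will, separately, report the line endings found in the index copy, the line endings found in the working tree copy, and the text and eol attributes that apply to each file that is presently in the index.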
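As a rough sketch of what that looks like for the reproduction in the question (again in a scratch repository, with core.safecrlf switched off for the add only so the demo does not depend on personal configuration):

```shell
#!/bin/sh
# Sketch: ask git ls-files --eol what it sees for t.txt.
set -eu
scratch=$(mktemp -d)          # throwaway directory, not your real repository
cd "$scratch"
git init -q .
git config core.autocrlf false
printf '%s\n' '* text=auto !eol' > .gitattributes
printf 'hi\n' > t.txt
git -c core.safecrlf=false add t.txt

# One line per indexed file: index eol, working tree eol, attributes, path.
eol_report=$(git ls-files --eol -- t.txt)
echo "$eol_report"
```

The line for t.txt should carry an i/lf field for the index copy, a w/lf field for the working tree copy, and an attr/text=auto field for the attributes.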
Upvotes: 3