I am experiencing a curious situation, probably related to this question, but I'd like to better understand what is going on here.
I have a repository where, right after a clone, git status reports that a file has been modified.
I created a minimal reproduction here, with a repo containing just the ignore list, a very trivial .gitattributes, and the file causing me headaches: gradlew.bat.
All my attempts in the following are performed using Linux/ZSH (the issue has been reproduced on multiple Linux installations and shells).
Right after the clone, if I run git status, I get:
❯ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: gradlew.bat
no changes added to commit (use "git add" and/or "git commit -a")
If I try to check out the unmodified version with git checkout HEAD -- gradlew.bat and then run git status again, nothing changes:
❯ git checkout HEAD -- gradlew.bat
❯ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: gradlew.bat
no changes added to commit (use "git add" and/or "git commit -a")
Okay then, I downloaded the file directly from GitHub, and checked the hashes:
❯ md5sum gradlew.bat
6b56324406b764fd6c5d4d7d215a3cd7 gradlew.bat
❯ sha512sum gradlew.bat
d4fef021e30640670fe20243e4fc4f0336b2f118f8c172c138a8c0c3028c93b12da9479812cede4196401bbc87ce9df89573dbec7378373cafafca6698867f55 gradlew.bat
These are exactly the same as the hashes of the file that git marks as changed:
❯ md5sum gradlew.bat && sha512sum gradlew.bat
6b56324406b764fd6c5d4d7d215a3cd7 gradlew.bat
d4fef021e30640670fe20243e4fc4f0336b2f118f8c172c138a8c0c3028c93b12da9479812cede4196401bbc87ce9df89573dbec7378373cafafca6698867f55 gradlew.bat
This means it's not even a matter of LF/CRLF line endings.
git diff is not helpful either, as it just suggests that the file changed entirely:
diff --git a/gradlew.bat b/gradlew.bat
index ac1b06f..107acd3 100755
--- a/gradlew.bat
+++ b/gradlew.bat
@@ -1,89 +1,89 @@
-@rem
-@rem Copyright 2015 the original author or authors.
-@rem
-@rem Licensed under the Apache License, Version 2.0 (the "License");
-@rem you may not use this file except in compliance with the License.
-@rem You may obtain a copy of the License at
-@rem
-@rem https://www.apache.org/licenses/LICENSE-2.0
-@rem
-@rem Unless required by applicable law or agreed to in writing, software
-@rem distributed under the License is distributed on an "AS IS" BASIS,
-@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-@rem See the License for the specific language governing permissions and
-@rem limitations under the License.
-@rem
-
-@if "%DEBUG%" == "" @echo off
-@rem ##########################################################################
-@rem
-@rem Gradle startup script for Windows
-@rem
-@rem ##########################################################################
-
-@rem Set local scope for the variables with windows NT shell
-if "%OS%"=="Windows_NT" setlocal
-
-set DIRNAME=%~dp0
-if "%DIRNAME%" == "" set DIRNAME=.
-set APP_BASE_NAME=%~n0
-set APP_HOME=%DIRNAME%
-
-@rem Resolve any "." and ".." in APP_HOME to make it shorter.
-for %%i in ("%APP_HOME%") do set APP_HOME=%%~fi
-
-@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
-set DEFAULT_JVM_OPTS="-Xmx64m" "-Xms64m"
-
-@rem Find java.exe
-if defined JAVA_HOME goto findJavaFromJavaHome
-
-set JAVA_EXE=java.exe
-%JAVA_EXE% -version >NUL 2>&1
-if "%ERRORLEVEL%" == "0" goto execute
-
-echo.
-echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
-echo.
-echo Please set the JAVA_HOME variable in your environment to match the
-echo location of your Java installation.
-
-goto fail
-
-:findJavaFromJavaHome
-set JAVA_HOME=%JAVA_HOME:"=%
-set JAVA_EXE=%JAVA_HOME%/bin/java.exe
-
-if exist "%JAVA_EXE%" goto execute
-
-echo.
-echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
-echo.
-echo Please set the JAVA_HOME variable in your environment to match the
-echo location of your Java installation.
-
-goto fail
-
-:execute
-@rem Setup the command line
-
-set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar
-
-
-@rem Execute Gradle
-"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %*
-
-:end
-@rem End local scope for the variables with windows NT shell
-if "%ERRORLEVEL%"=="0" goto mainEnd
-
-:fail
-rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of
-rem the _cmd.exe /c_ return code!
-if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1
-exit /b 1
-
-:mainEnd
-if "%OS%"=="Windows_NT" endlocal
-
-:omega
+@rem
+@rem Copyright 2015 the original author or authors.
+@rem
+@rem Licensed under the Apache License, Version 2.0 (the "License");
+@rem you may not use this file except in compliance with the License.
+@rem You may obtain a copy of the License at
+@rem
+@rem https://www.apache.org/licenses/LICENSE-2.0
+@rem
+@rem Unless required by applicable law or agreed to in writing, software
+@rem distributed under the License is distributed on an "AS IS" BASIS,
+@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+@rem See the License for the specific language governing permissions and
+@rem limitations under the License.
+@rem
+
+@if "%DEBUG%" == "" @echo off
+@rem ##########################################################################
+@rem
+@rem Gradle startup script for Windows
+@rem
+@rem ##########################################################################
+
+@rem Set local scope for the variables with windows NT shell
+if "%OS%"=="Windows_NT" setlocal
+
+set DIRNAME=%~dp0
+if "%DIRNAME%" == "" set DIRNAME=.
+set APP_BASE_NAME=%~n0
+set APP_HOME=%DIRNAME%
+
+@rem Resolve any "." and ".." in APP_HOME to make it shorter.
+for %%i in ("%APP_HOME%") do set APP_HOME=%%~fi
+
+@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
+set DEFAULT_JVM_OPTS="-Xmx64m" "-Xms64m"
+
+@rem Find java.exe
+if defined JAVA_HOME goto findJavaFromJavaHome
+
+set JAVA_EXE=java.exe
+%JAVA_EXE% -version >NUL 2>&1
+if "%ERRORLEVEL%" == "0" goto execute
+
+echo.
+echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
+echo.
+echo Please set the JAVA_HOME variable in your environment to match the
+echo location of your Java installation.
+
+goto fail
+
+:findJavaFromJavaHome
+set JAVA_HOME=%JAVA_HOME:"=%
+set JAVA_EXE=%JAVA_HOME%/bin/java.exe
+
+if exist "%JAVA_EXE%" goto execute
+
+echo.
+echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
+echo.
+echo Please set the JAVA_HOME variable in your environment to match the
+echo location of your Java installation.
+
+goto fail
+
+:execute
+@rem Setup the command line
+
+set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar
+
+
+@rem Execute Gradle
+"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %*
+
+:end
+@rem End local scope for the variables with windows NT shell
+if "%ERRORLEVEL%"=="0" goto mainEnd
+
+:fail
+rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of
+rem the _cmd.exe /c_ return code!
+if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1
+exit /b 1
+
+:mainEnd
+if "%OS%"=="Windows_NT" endlocal
+
+:omega
The next thing I could think of was permissions, but the file was -rwxr-xr-x and remains -rwxr-xr-x.
I tried to see if there's anything else via stat, but I found no clue there either:
❯ git reset --hard HEAD && stat gradlew.bat && git status && stat gradlew.bat
HEAD is now at f6d1022 remove irrelevant stuff
File: gradlew.bat
Size: 2763 Blocks: 8 IO Block: 4096 regular file
Device: 259,2 Inode: 7342244 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 1000/ <redacted>) Gid: ( 1000/ <redacted>)
Access: 2022-07-05 14:48:48.314141714 +0200
Modify: 2022-07-05 14:48:48.314141714 +0200
Change: 2022-07-05 14:48:48.314141714 +0200
Birth: 2022-07-05 14:48:48.314141714 +0200
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: gradlew.bat
no changes added to commit (use "git add" and/or "git commit -a")
File: gradlew.bat
Size: 2763 Blocks: 8 IO Block: 4096 regular file
Device: 259,2 Inode: 7342244 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 1000/ <redacted>) Gid: ( 1000/ <redacted>)
Access: 2022-07-05 14:48:48.314141714 +0200
Modify: 2022-07-05 14:48:48.314141714 +0200
Change: 2022-07-05 14:48:48.314141714 +0200
Birth: 2022-07-05 14:48:48.314141714 +0200
I'm now out of ideas. What is causing this behaviour?
Upvotes: 2
Views: 1889
Reputation: 487745
"This means it's not even a matter of LF/CRLF line endings."
Ah, but it is.
Your repository is clone-able, so I cloned it. Here's what's actually in the file:
$ git rev-parse HEAD:gradlew.bat
ac1b06f93825db68fb0c0b5150917f340eaa5d02
$ git cat-file -p ac1b06f93825db68fb0c0b5150917f340eaa5d02 | head -3 | vis
@rem\^M
@rem Copyright 2015 the original author or authors.\^M
@rem\^M
The vis command shows what's in the file, making sure that control characters like carriage return (control-M) are visible as backslash, hat, letter-code. We see that the file actually has CRLF endings as stored in the repository. This copy of the file literally cannot be changed, because it's inside a commit, and no part of any commit can ever be changed.
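If vis is not installed (it does not ship with every Linux distribution), the same check can be approximated with standard tools. This is only a minimal sketch: count_crlf_lines is a hypothetical helper name, not a Git or system command, and the sample input is synthetic rather than the real gradlew.bat.

```shell
# Count the lines on standard input that end in CRLF (POSIX sh, no vis needed).
count_crlf_lines() {
    cr=$(printf '\r')          # a literal carriage return
    grep -c "${cr}\$" || true  # grep -c exits non-zero when the count is 0
}

# Inside the clone from the question, one might run:
#   git cat-file -p HEAD:gradlew.bat | count_crlf_lines
#   git cat-file -p HEAD:gradlew.bat | head -3 | od -c

# Self-contained check on synthetic input: two CRLF lines, one LF-only line.
printf '@rem\r\n@rem Copyright\r\nlast\n' | count_crlf_lines
```

The synthetic input prints 2. Piping the committed blob through the helper would report how many stored lines carry a carriage return.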
Curiously, we find the following .gitattributes file:
$ vis .gitattributes
* text=auto eol=lf
*.[cC][mM][dD] text eol=crlf
*.[bB][aA][tT] text eol=crlf
*.[pP][sS]1 text eol=crlf
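Now, the interesting thing about a .gitattributes like this is that it tells Git to mess with file data. The tricky part is how Git will go about doing this messing-with-file-data: when it copies a file out of the index into the working tree (checkout), Git will, as directed, turn LF-only line endings into CRLF line endings; when it copies a file from the working tree into the index (git add), Git will, as directed, turn CRLF line endings into LF-only line endings.
The "as directed" part is complex and is determined by the rules in the .gitattributes, but yours add up to saying that for *.bat files, Git should do the operation in both cases. So it does: it converts LF-only to CRLF when extracting gradlew.bat, and CRLF to LF-only when adding it.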
Since the file as committed has CRLF endings, nothing happens "on the way out", but should you put the file back in, it will be changed to be stored with LF-only line endings.
We can see this in action here. We start with git ls-files --eol to tell us what's actually in the index and working tree, for each file stored in Git's index:
$ git ls-files --eol
i/lf w/lf attr/text=auto eol=lf .gitattributes
i/lf w/lf attr/text=auto eol=lf .gitignore
i/crlf w/crlf attr/text eol=crlf gradlew.bat
So we see that the attrs applied to gradlew.bat are text eol=crlf. The attrs applied to the other files are text=auto eol=lf.
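Another way to ask Git which attributes it resolves for a path is git check-attr. The sketch below assumes a throwaway repository created with mktemp and copies only two of the rules from the .gitattributes shown earlier. The paths do not need to exist for the lookup to work.

```shell
# Throwaway repository; nothing here touches the real clone.
demo=$(mktemp -d) && cd "$demo" && git init -q .

# Two of the rules from the .gitattributes shown earlier.
printf '%s\n' '* text=auto eol=lf' '*.[bB][aA][tT] text eol=crlf' > .gitattributes

# Ask Git which values it resolves for each path.
attrs_bat=$(git check-attr text eol -- gradlew.bat)
attrs_other=$(git check-attr text eol -- .gitignore)
printf '%s\n' "$attrs_bat" "$attrs_other"
```

The later, more specific *.bat rule overrides the catch-all rule, so gradlew.bat should report text as set and eol as crlf, while .gitignore keeps text=auto and eol=lf.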
The index and working tree copies of the .gitattributes and .gitignore are LF-only. The index and working tree copies of gradlew.bat are CRLF (for both).
If we now git add gradlew.bat (we may need to use --renormalize, depending on Git vintage and certain raw stat data and timings and a lot of other details that vary from one system to another) and then run git ls-files --eol again, we see that the index version of gradlew.bat has changed:
$ git ls-files --eol
i/lf w/lf attr/text=auto eol=lf .gitattributes
i/lf w/lf attr/text=auto eol=lf .gitignore
i/lf w/crlf attr/text eol=crlf gradlew.bat
Committing this version will make a new commit in which the stored-for-all-time copy has LF-only line endings. Every extraction will produce CRLF endings, because gradlew.bat has attr/text eol=crlf applied, and every git add will have those CRLF endings changed back to LF-only.
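This also explains why identical md5sum and sha512sum values prove nothing here: those tools hash the raw bytes on disk, whereas Git compares the index entry against the content as it would be stored after conversion. A minimal sketch with git hash-object, using a throwaway repository and a two-line stand-in for gradlew.bat (both are assumptions for illustration):

```shell
demo=$(mktemp -d) && cd "$demo" && git init -q .
printf '%s\n' '*.[bB][aA][tT] text eol=crlf' > .gitattributes
printf '@rem\r\n@echo off\r\n' > gradlew.bat   # CRLF file in the working tree

# Hash of the bytes exactly as they sit on disk.
raw_id=$(git hash-object --no-filters gradlew.bat)
# Hash after the "on the way in" conversion selected by the attributes.
clean_id=$(git hash-object gradlew.bat)
# Reference: the same content written with LF-only endings.
lf_id=$(printf '@rem\n@echo off\n' | git hash-object --stdin)

echo "raw:   $raw_id"
echo "clean: $clean_id"
echo "lf:    $lf_id"
```

Here clean_id should equal lf_id and differ from raw_id. In the question's diff, the two abbreviated IDs on the index line (ac1b06f and 107acd3) presumably correspond to the committed CRLF blob and the converted LF-only form in the same way.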
This whole area of Git's operation is very messy. If it's possible to not have Git mess with line endings, that's always my preference. However, if some files must have CRLF endings, the .gitattributes style you've written is my preference here: the files in the repository will be LF-only, but the files in your working tree will be CRLF files. You may have to do one git add --renormalize . pass to "clean up" and commit so that from then on, Git is happy with things.
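The clean-up pass can be rehearsed in a scratch repository before touching the real one. The sketch below assumes a Git recent enough to support --renormalize. The user name, e-mail address, and file contents are placeholders, and the CRLF file is deliberately committed before the attributes exist to mimic the state of the question's repository.

```shell
demo=$(mktemp -d) && cd "$demo" && git init -q .
git config user.name 'Demo User' && git config user.email 'demo@example.invalid'
git config core.autocrlf false

# 1. Commit a CRLF file before any attributes exist, so the blob keeps its CRs.
printf '@rem\r\n@echo off\r\n' > gradlew.bat
git add gradlew.bat && git commit -q -m 'add gradlew.bat with CRLF'

# 2. Introduce the attribute afterwards.
printf '%s\n' '*.[bB][aA][tT] text eol=crlf' > .gitattributes
git add .gitattributes && git commit -q -m 'add .gitattributes'
before=$(git ls-files --eol -- gradlew.bat)

# 3. One renormalize pass, then commit the LF-only blob.
git add --renormalize . && git commit -q -m 'renormalize line endings'
after=$(git ls-files --eol -- gradlew.bat)

printf '%s\n' "$before" "$after"
git status --porcelain   # should print nothing once the index holds LF
```

The before line should show i/crlf w/crlf and the after line i/lf w/crlf, matching the two git ls-files --eol listings above.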