PatS

Reputation: 11554

What rule is git using from .gitattributes to determine the file type and line ending?

I'm trying to track down which rule is being applied to files in my repo when I have a .gitattributes file. Is there a git command that tells you a given file's type (text vs. binary) and its line ending?

To understand what I'm asking for, consider the following (truncated) .gitattributes file:

*.bat text eol=crlf
*.dll binary
*.exe binary

WinBin/* text eol=crlf
scripts/* text eol=lf

How should the following files be treated, and which rule should be used? My guesses are below:

test1.bat => Should match *.bat text eol=crlf
WinBin\test1.bat => Should match WinBin/* text eol=crlf
WinBin\test2.exe => Should match WinBin/* text eol=crlf (but I'd rather it were binary)
scripts\test1.bat => Should match scripts/* text eol=lf

There are lots of resources out there describing how git's end-of-line works and the various issues. Below are some I've found

Some useful links

My configuration

I'm on Windows using Git for Windows (GfW), and I've used git from Git Bash (git-bash.exe) as well as from Windows cmd.exe; I also have Cygwin git.

GitForWindows: git --version => git version 2.26.0.windows.1 (latest)
Cygwin Git: git --version => git version 2.8.3 (pretty out of date)

Upvotes: 1

Views: 1698

Answers (2)

torek

Reputation: 489818

Technically, what happens is that the attributes handler is given a file name and a list of rules from all the various .gitattributes files (there can be more than one). For each file name:

  • The name implies some directory, e.g., when the name is scripts/foo [1], the file is in a scripts directory, so the attributes file from that directory (scripts/.gitattributes) takes priority over the ones in its parent directories, and a .gitattributes from an unrelated directory such as src/ (src/.gitattributes) does not apply at all.

  • Starting with the lowest priority file, Git applies each line to see if it matches. If it does match, all the attributes on that line apply. If an earlier line set an attribute, this later line overrides it. Otherwise, the earlier-set attribute still applies.

  • Git repeats this process for all applicable attribute files. Since the higher priority ones apply later, their lines override. The last lines of such files override earlier lines.
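The directory priority described above can be checked in a throwaway repository. This is only a sketch: the three .gitattributes files and the .sh file names are made up for illustration and are not part of the question's repo.

```shell
# Throwaway repository; every file name here is hypothetical.
tmp=$(mktemp -d)
git init -q "$tmp"
mkdir "$tmp/scripts" "$tmp/src"

printf '*.sh text eol=crlf\n' > "$tmp/.gitattributes"          # lowest priority
printf '*.sh eol=lf\n'        > "$tmp/scripts/.gitattributes"  # overrides eol under scripts/
printf '*.sh -text\n'         > "$tmp/src/.gitattributes"      # not consulted for scripts/

# check-attr works on path names, so the .sh files need not exist.
dir_result=$(git -C "$tmp" check-attr text eol -- scripts/run.sh top.sh src/build.sh)
printf '%s\n' "$dir_result"
rm -rf "$tmp"
```

For scripts/run.sh this should report text as set (from the top-level file) and eol as lf (from scripts/.gitattributes), while top.sh keeps eol crlf and the -text line in src/ affects only src/build.sh.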

In this case, then, you had two applicable lines in the same .gitattributes file that would apply to scripts/test1.bat. So for that file:

*.bat text eol=crlf
scripts/* text eol=lf

the first line matches and sets text and eol=crlf, then the later line matches, sets text again, and overrides eol with eol=lf.

The result is that the final attributes grouping for this particular file is: text (set), eol=lf.
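If you want to confirm this rather than take my word for it, something like the following should do, assuming a scratch repository that contains only those two lines:

```shell
# Scratch repository holding just the two rules under discussion.
tmp=$(mktemp -d)
git init -q "$tmp"
cat > "$tmp/.gitattributes" <<'EOF'
*.bat text eol=crlf
scripts/* text eol=lf
EOF

# The later line wins for scripts/test1.bat; only the first applies to test1.bat.
bat_result=$(git -C "$tmp" check-attr text eol -- scripts/test1.bat test1.bat)
printf '%s\n' "$bat_result"
rm -rf "$tmp"
```

The output lists eol as lf for scripts/test1.bat and as crlf for the top-level test1.bat, with text set on both. Note that it shows the final attribute values only, not which line produced them.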

You didn't try this as an example, but suppose you had a file named scripts/a.exe. There's a line:

*.exe binary

along with the line:

scripts/* text eol=lf

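so the combination here appears to be binary (set), text (set), and eol=lf. However, binary is itself a macro. You can define your own macros (though only in the top-level .gitattributes, in $GIT_DIR/info/attributes, or in the global and system-wide attribute files, not in a subdirectory's .gitattributes). In this case, the macro expands to -diff -merge -text.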

Each - means unset. So along with unsetting diff and merge, the first matching line did an unset text operation. The second matching line, scripts/*, did a set text operation, which overrode the previous unset text operation. The end result is that for this file, text is set, diff is unset, merge is unset, and eol is set-to-lf.
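The same kind of scratch-repository check, again only a sketch with a two-line attributes file, shows the macro being partly overridden:

```shell
# Scratch repository: the binary macro first, then the scripts/* rule.
tmp=$(mktemp -d)
git init -q "$tmp"
cat > "$tmp/.gitattributes" <<'EOF'
*.exe binary
scripts/* text eol=lf
EOF

# scripts/a.exe matches both lines; a.exe matches only the macro line.
exe_result=$(git -C "$tmp" check-attr text diff merge eol -- scripts/a.exe a.exe)
printf '%s\n' "$exe_result"
rm -rf "$tmp"
```

For scripts/a.exe, text comes out as set while diff and merge stay unset and eol is lf. For a plain a.exe, text, diff, and merge are all unset and eol is unspecified.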

Note that unset is different from unspecified. An unspecified attribute foo means that no line explicitly had -foo or foo or foo=value on it [2]. Except for the predefined macros and the attribute names called out in the documentation, attribute names aren't really fixed in stone: future versions of Git might add new ones, and you can set or unset or set-to-a-value attribute names that you make up yourself, and Git won't say anything about them. Writing:

* foo=bar

has no effect on Git's behavior because nothing inside Git currently looks for a foo attribute, but if a future Git decided to use foo to mean something, that line might suddenly start changing how your files are handled.
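Git still records the made-up attribute even though nothing acts on it, and you can see it with check-attr. A small sketch in a scratch repository, where any/file.c is a hypothetical path:

```shell
# Scratch repository with a single made-up attribute.
tmp=$(mktemp -d)
git init -q "$tmp"
printf '* foo=bar\n' > "$tmp/.gitattributes"

# Git does nothing with foo, yet it stores and reports the value.
foo_result=$(git -C "$tmp" check-attr foo -- any/file.c)
printf '%s\n' "$foo_result"
rm -rf "$tmp"
```

This prints any/file.c: foo: bar.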

It's pretty hard to deal with all of this, which is why git check-attr exists. See phd's answer.


[1] On Windows, Git converts backslashes in path names to forward slashes internally. You might want to favor the forward slash anyway, because \b, for instance, means backspace to some commands; \t means tab, \r means carriage return, and so on. If you're using sh or bash, the shell treats an unquoted backslash as its own escape character, so reserve backslashes for the shell and keep them out of path names. The forward slash is easier to type anyway!
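To see what sh or bash does to a backslashed path before any command receives it, a quick sketch with printf (no Git involved) is enough:

```shell
# Unquoted: the shell consumes the backslash before the command sees it.
unquoted=$(printf '%s\n' WinBin\test1.bat)
# Single-quoted: the backslash reaches the command untouched.
quoted=$(printf '%s\n' 'WinBin\test1.bat')
# Forward slash: nothing to escape in the first place.
forward=$(printf '%s\n' WinBin/test1.bat)
printf '%s\n' "$unquoted" "$quoted" "$forward"
```

The unquoted form arrives as WinBintest1.bat, which is not a path in your repository at all.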

[2] If some previous line has set, unset, or set-to-some-value some attribute foo, a later line can undo that with !foo. This removes all settings and puts foo back into unspecified state. This is entirely different from -foo, which puts foo into unset state.

(Not every attribute that has a meaning actually uses all these distinctions.)
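The four states can be told apart with check-attr. The following is a sketch only; the attribute foo and the file names are invented for the demonstration:

```shell
# Scratch repository exercising value, set, unset, and unspecified.
tmp=$(mktemp -d)
git init -q "$tmp"
cat > "$tmp/.gitattributes" <<'EOF'
* foo=bar
*.sh foo
*.txt -foo
*.md !foo
EOF

# a.c keeps the value; the later lines override it for the other names.
state_result=$(git -C "$tmp" check-attr foo -- a.c a.sh a.txt a.md)
printf '%s\n' "$state_result"
rm -rf "$tmp"
```

This should report bar for a.c, set for a.sh, unset for a.txt, and unspecified for a.md, even though the first line matches all four names.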

Upvotes: 4

phd

Reputation: 94943

git check-attr — Display gitattributes information.

For every pathname, this command will list if each attribute is unspecified, set, or unset as a gitattribute on that pathname.

You probably need option -a:

git check-attr -a test1.bat WinBin/test1.bat WinBin/test2.exe scripts/test1.bat

Upvotes: 4
