Reputation: 609
I am trying to keep a file that holds branch-specific information (a version).
After adding the file, I updated .gitignore with the file path.
The problem: when I merge the branch, the file gets merged into the destination branch.
I have tried the following, based on How to make Git "forget" about a file that was tracked but is now in .gitignore?:
- Using git rm --cached <file>: this deletes the file.
- Adding the file using --force: this causes the file to be tracked even though it is ignored in .gitignore.
Is there a way to keep a file on a specific branch?
Thanks.
Upvotes: 3
Views: 2888
Reputation: 461
While I adore @torek's detailed answer above (points all around!), I'd like to share my solutions to OP's original question—namely,
How to keep a file which holds branch specific information (e.g. version)
This is probably my cleanest answer.
I'll assume that the file you want to be immune to incoming merges is a plaintext file called VERSION.
1. Start with a clean index:
     git commit
   or
     git restore --staged .
2. Append the following to a .gitattributes file in the repo directory containing VERSION:
     echo "VERSION merge=ours" >> .gitattributes
3. Sanity check with git check-attr:
     git check-attr --all VERSION
     ## Expected output: ##
     # VERSION: merge: ours
4. Stage and commit .gitattributes.
For this technique to actually work, you and any other maintainers will need to run the following:
git config --global merge.ours.driver true
# or --local or wherever ¯\_(ツ)_/¯
Now, whenever merging into a branch containing this .gitattributes definition, the file VERSION will never get clobbered.
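As a rough end-to-end check of the steps above, the following sketch builds a throwaway repository and merges a branch whose VERSION differs. The branch names (main, feature) and the version strings are made up for illustration, and the driver is configured with --local instead of --global. One assumption worth flagging: as far as I know, Git only consults a merge driver when both sides changed the file since the merge base, so the sketch changes VERSION on both branches.

```shell
set -eu
# Hypothetical scratch repository; nothing here touches your real repos.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
# Define the "ours" driver locally: `true` exits 0 and leaves our copy as the result.
git config merge.ours.driver true

echo "1.0" > VERSION
echo "VERSION merge=ours" > .gitattributes
git add VERSION .gitattributes
git commit -q -m "base: VERSION 1.0 plus .gitattributes"

# Change VERSION on the side branch...
git checkout -q -b feature
echo "2.0-feature" > VERSION
git commit -q -am "feature: bump VERSION"

# ...and on the destination branch, so a real three-way merge is needed.
git checkout -q main
echo "1.1-main" > VERSION
git commit -q -am "main: bump VERSION"

git merge -q --no-edit feature
merged_version=$(cat VERSION)
echo "VERSION after merge: $merged_version"
```

If the driver is picked up, the merge completes without a conflict and VERSION keeps the destination branch's content.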
Caveat: a merge performed before this .gitattributes entry is in place will likely include the VERSION file in the merge (potentially causing a conflict) before our special attribute can be applied.
Upvotes: 1
Reputation: 488183
Git has:
but:
so since branches don't really exist as a thing—they're completely changeable at all times—there literally can't be any branch-specific files. There are only commit-specific files.
In a comment, you say:
I thought that after I add the file to .gitignore in its current state, it will be ignored from this time forward
This is not the case.
Git does not ever really ignore files, and .gitignore is the wrong name for what goes into this file. A correct name would be something like .git-do-not-complain-if-these-files-are-untracked-and-if-they-are-untracked-and-I-use-an-en-masse-add-command-do-not-make-them-tracked. While that's what listing files here does, it's a ridiculous name for the file, so the Git guys chose .gitignore: an inaccurate name, but one that's short and people can actually type in.
The trick to making this all make sense is to change your point of view. Those new to Git think that Git stores files, and uses branches to do that. Both of these ideas are wrong! The things Git stores are commits.
Now, commits do contain files. But it's an all-or-nothing deal: you either do have a commit, and thus all of its files, or you don't, and thus you don't have any of its files. The files stored inside each commit (each commit holds a full snapshot of every file; we'll come back to this idea in a moment) are stored in a special, read-only, Git-only, compressed and de-duplicated form. Nothing but Git can even read these files, and nothing, not even Git itself, can write to them. This means that the committed files are entirely useless for getting any work done.
Because committed files are useless (for getting work done that is), Git has to extract the committed files to a work-area. In this work area, the extracted files are expanded out to their normal everyday form. These are the files that you can see and work with. They are ordinary files. The only thing special about them is that Git extracted them from some commit, at some point.
Because they're copies that have literally been copied out of a commit, these files are not, in fact, in Git at all! They're just there for you to use, however you like. They are your files, in other words. Git's files are for Git, and yours are for you: Git just copies some out, sometimes, when you tell it to.
Besides the stored files—stored in this special Git-ified form—each commit stores some metadata, or information about the commit itself. This includes stuff like the author name and email address. We won't go into any detail here, but this metadata is crucial to Git: Git can't work without it.
The point of all of this is to see how the files you see and work with are not Git's files. They aren't in Git, so you can do anything you want with them. This is mostly a good thing—but since they're not in Git, you can also add more files to your work area, that didn't even come out of Git in the first place. These extra, added files are untracked files. There's a bit of mechanism here, and it's important to know about.
Your working tree, where you have your files, is pretty simple, if you've been using computers and files for a while. It's just like any other set of files that you can work with. You can create and destroy files here all you like. Git just set up some files initially for you, by checking out some commit. Git used a branch name to find that commit and we'll come back to this idea later, but for now, we just note that your initial set of files probably came out of a commit just now. (If not, they came out of a commit some time ago.)
This means there are two copies of each of these files: the read-only copy in the commit, and the ordinary copy in your working tree.
But, in between the commits—which store read-only, compressed and de-duplicated files in an internal Git-only format—and your working tree, where you have these ordinary files, Git adds a third copy—well, sort of a copy—of each of these files. This third "copy" is stored in what Git calls, variously, the index, or the staging area, or the cache. The three terms all refer to the same thing. What's really in here is the file's name and some other information, including information about a pre-de-duplicated copy. When the file just came out of some commit, that de-duplicated copy is the frozen-for-all-time copy in the repository.
Because that copy is frozen, all commits that use the same version of that file get to share it. So does Git's index. So it doesn't really take any space to speak of. That's the de-duplication in action.
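One way to see this sharing, sketched in a scratch repository with made-up file names: a file that does not change between two commits resolves to the same blob hash ID in both commits and in the index, while a changed file gets a new one.

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "unchanged" > same.txt
echo "v1" > changing.txt
git add same.txt changing.txt
git commit -q -m "first"

echo "v2" > changing.txt
git commit -q -am "second"

# <commit>:<path> names the stored (frozen) copy of a file in that commit.
same_old=$(git rev-parse HEAD~1:same.txt)
same_new=$(git rev-parse HEAD:same.txt)
changing_old=$(git rev-parse HEAD~1:changing.txt)
changing_new=$(git rev-parse HEAD:changing.txt)
# The index entry refers to the same stored object as well.
index_same=$(git ls-files --stage same.txt | cut -d' ' -f2)

echo "same.txt:     $same_old -> $same_new"
echo "changing.txt: $changing_old -> $changing_new"
```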
If you change the work-tree copy, you'll need to get Git to update its index copy. This is what the git add command is about. When you run git add on a file you've changed, Git compresses the work-tree copy down into the frozen format and updates the index entry to refer to it. This means that after git add, the index is ready to go: you can make a new commit that has the updated file. All the existing files are still there, in Git's index. So what's going on is that the index has, at all times, the next commit in it, ready to go. The next commit is initially the same as the current commit.
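A small sketch of this, in a scratch repository with a made-up file name: editing the work-tree file leaves the index entry alone, and only git add makes the index entry point at new content.

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"
head_blob=$(git rev-parse HEAD:file.txt)

# Edit the work-tree copy only; the index still holds the committed version.
echo "two" > file.txt
index_before_add=$(git ls-files --stage file.txt | cut -d' ' -f2)

# git add copies the new content into the index (the proposed next commit).
git add file.txt
index_after_add=$(git ls-files --stage file.txt | cut -d' ' -f2)

echo "HEAD:             $head_blob"
echo "index before add: $index_before_add"
echo "index after add:  $index_after_add"
```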
If you git add a file that isn't in the index yet, Git compresses the file down into the frozen format and creates a new entry in Git's index, for the new file. If you git rm a file, Git removes the file from both its index and your work-tree, and now the next commit will completely lack the file. In this way, the index remains the proposed next commit.
If you remove a file from your work-tree, and then run git add on that file, Git simply removes the index entry. The frozen file in the commit(s) is untouched by this process. That's the case for git rm as well: it only removes the index entry, not the underlying file data. That underlying file data can never be changed at all, and as long as some commit(s) are using it, it can't be removed either. So, again, the index holds the proposed next commit.
What this means is that git commit merely needs to snapshot the index. Everything in the index is already in the frozen format, ready to go: it just has to be packaged up into a new commit. The set of files that go in this new commit are precisely those files that are in Git's index, at that time.
What you do in between git checkout and git commit, in your work-tree, is not really relevant here. But git add copies from your work-tree to Git's index, and that does matter. So changing work-tree copies of files is useful, as long as you also git add these changes.
If you don't git add these changes, a work-tree file that came out of your current commit, and therefore is now "out of date" in the index (by comparison to what's in your work-tree), remains "out of date" there in a new commit, if you make one. That's fine: sometimes that's what you want! It's a bit annoying to have git status tell you about these changes not staged for commit, but there is nothing wrong with this. However, note that if you run git add ., the file is already in the index and will be updated in the index. An en-masse operation like git add . or git add * or git add -u will see that the work-tree copy is updated as compared to the index copy, and will update the index copy from the work-tree, as usual.
What this means is that it's very important to know what's in the index. There are no user-oriented Git commands to list out the files in the index, but there is one that's not user-oriented: you can run git ls-files (use git ls-files --stage to get more detail; git ls-files by itself just lists the names of files that are in Git's index; see the documentation for details).
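For instance, in a scratch repository (file names are made up), git ls-files reports only what is in the index, not everything that happens to sit in the work-tree:

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "tracked" > a.txt
git add a.txt
git commit -q -m "add a.txt"

# b.txt exists only in the work-tree; it was never added to the index.
echo "untracked" > b.txt

index_files=$(git ls-files)
echo "names in the index: $index_files"
# --stage adds the mode, blob hash ID, and stage number for each entry.
git ls-files --stage
```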
Your work-tree is yours. Because of this, you can create files in it that are not yet in Git's index. The git status command calls these files untracked. So if the commit you checked out has no file named F, and you create a new file named F, you now have an untracked file named F.
You can also use git rm --cached to remove a file copy from Git's index, without removing it from your work-tree. Suppose Git extracted some commit, and that commit came with a file named F. This means there are currently three copies of F: the read-only copy in the current commit, the copy in Git's index ready to go into the next commit, and the copy in your work-tree. If you now run git rm --cached F, Git removes the index copy of F. The read-only copy is still in the current commit, but your proposed next commit (the stuff in the index) lacks file F. A new commit you make now will have no file F.
Either way, file F is now in your work-tree, but not in Git's index. That's what makes the file untracked. Because it's not in Git's index, it won't be in the next commit. If there's an F in the current commit, the difference between the current and next commits is going to include the instruction: "delete file F". If there's no F in the current commit, there isn't going to be a difference concerning file F, because it's not going to be in the current or next commit.
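The git rm --cached case can be sketched like this, in a scratch repository (README is a made-up second file, there only so the later commit is not empty). F stays in the work-tree, leaves the index, and is absent from the next commit:

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "readme" > README
echo "data" > F
git add README F
git commit -q -m "add README and F"

# Remove F from the index only; the work-tree copy stays.
git rm -q --cached F
index_files=$(git ls-files)
# Shows F both as a staged deletion and as an untracked file.
git status --porcelain

# The next commit is a snapshot of the index, so it has no F.
git commit -q -m "stop tracking F"
head_files=$(git ls-tree --name-only HEAD)
echo "files in new commit: $head_files"
```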
So an untracked file is very simple: it's a file that you have in your work-tree, but that is not in Git's index. But the way the file became untracked is less simple: maybe it's untracked because it wasn't committed, or maybe it's untracked because you removed it from Git's index. You get some control over this, after all.
By contrast, a file that is in Git's index is tracked. Maybe it's there because it was in the commit you checked out. Or, maybe it's there because you ran git add to copy it into Git's index. You get some control over this too.
But you do need to remember that when you switch to some other commit, Git is going to empty its index of these files, from this commit, and fill the index instead with those files, from the other commit you're switching to. So if you have some tracked or untracked files now, that situation could change, because the index's contents can change with a new git checkout.
(Some variants of git reset, and the new-since-2.23 git restore command, also affect Git's index. It's a big and important data structure! It's where new commits come from. So it's always a good idea to think about it now and then.)
.gitignore
Listing a file in .gitignore mainly does the two things I listed earlier:
1. It keeps git status from saying that the file is untracked, when it's untracked.
   Note that git status might not have said anything about it when it's tracked either. So you can't really tell if a file is tracked or not just from git status, at least, not in all cases:
   - If git status says some file is staged for commit or not staged for commit, that file is definitely tracked.
   - If git status says some file is untracked, that file is definitely untracked.
   - If git status says nothing about a file, we don't know its tracked/untracked status.
2. And, it keeps git add . or other similar "add many files" commands from adding the file, if it's currently untracked.
But if some file is already tracked, the entry in .gitignore has no effect.
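Both halves of this can be checked in a scratch repository. The sketch below reuses the VERSION file name from the question; everything else is made up. While VERSION is tracked, git add . stages changes to it despite the .gitignore entry; once it is untracked, git add . leaves it alone.

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "1.0" > VERSION
git add VERSION
git commit -q -m "track VERSION"
echo "VERSION" > .gitignore
git add .gitignore
git commit -q -m "list VERSION in .gitignore"

# Tracked and listed in .gitignore: the change is still staged.
echo "2.0" > VERSION
git add .
staged_while_tracked=$(git diff --cached --name-only)
git commit -q -m "bump VERSION"

# Untracked and listed in .gitignore: `git add .` skips it.
git rm -q --cached VERSION
git commit -q -m "stop tracking VERSION"
echo "3.0" > VERSION
git add .
staged_while_untracked=$(git diff --cached --name-only)

echo "staged while tracked:   '$staged_while_tracked'"
echo "staged while untracked: '$staged_while_untracked'"
```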
There is a special trick you can use, but it's not designed for this. Suppose you have some file F that is tracked and is committed. You make some change to your working tree copy of F, but you want to be sure you don't accidentally git add this change and copy it back into Git's index: you want Git's index to keep the old copy.
What you can do here is run:
git update-index --skip-worktree F
to set, on the index entry for file F, the skip worktree flag. This tells Git to pretend that the index copy is always in sync with the work-tree copy. A git add . won't copy F back into Git's index, so that the old copy will remain in Git's index.
This flag is meant for something Git calls sparse checkout. If you start setting it here, you won't be able to use the sparse checkout feature properly later. It also has a bunch of weird side effects, because Git sometimes needs to change out the index copy of the file. I won't go into all the details here: they get complicated.
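A minimal sketch of the flag in action, in a scratch repository (README is a made-up second tracked file, so that git add . has something ordinary to match). After setting the flag, a local edit to F is not picked up by git add ., and git ls-files -v marks the entry with S:

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "readme" > README
echo "committed" > F
git add README F
git commit -q -m "add README and F"

# Set the skip-worktree flag on the index entry for F, then edit F locally.
git update-index --skip-worktree F
echo "local edit" > F
git add .

staged=$(git diff --cached --name-only)
flags=$(git ls-files -v F)
index_content=$(git cat-file -p :F)
echo "staged: '$staged'"
echo "ls-files -v: $flags"
echo "index copy of F: $index_content"

# Clear the flag again when done.
git update-index --no-skip-worktree F
```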
Without going into all the details, the metadata associated with each commit lets Git string commits together into chains. These chains all point backwards (for reasons we won't go into), so if we draw them, we get a picture that looks like this:
... <-F <-G <-H
where H is the last commit in the chain. The commit has some hash ID, which is how Git actually finds the commit; you'll see these hash IDs in git log output, for instance. But hash IDs are useless to humans because they're so big and ugly and impossible to get right. So, while Git uses the hash IDs, we don't. We use branch names.
A branch name simply holds the hash ID of the last (most-recent / newest) commit that we'd like to say is "on the branch". When we have a single, simple chain of commits, we get something like this:
...--F--G--H <-- branch
Once we start making lots of branches and different commits on these branches, though, we get divergence:
          I--J   <-- br1
         /
...--G--H
         \
          K--L   <-- br2
There are a bunch of interesting things about this. Two of them are:
- Commits up through H are on both branches.
- The name br1 selects commit J; the name br2 selects commit L.
This means that git checkout br1 will fill in both Git's index and your work-tree from the files that are in commit J. Using git checkout br2, you'll replace those files with the ones from commit L. As long as the set of file names in J and L remains the same, any untracked files in your work-tree, that aren't in either J or L, are undisturbed as you switch between commits J and L.
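Here is a small sketch of that switching behavior, in a scratch repository. The file names tracked.txt and notes.txt are made up, and the commit messages reuse the letters from the drawing only as labels:

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/br1
git config user.name "Example User"
git config user.email "user@example.com"

echo "H" > tracked.txt
git add tracked.txt
git commit -q -m "H"
echo "J" > tracked.txt
git commit -q -am "J"            # tip of br1

git checkout -q -b br2 HEAD~1    # start br2 at commit H
echo "L" > tracked.txt
git commit -q -am "L"            # tip of br2

# An untracked file that is in neither commit.
echo "scratch" > notes.txt

git checkout -q br1
on_br1=$(cat tracked.txt)
git checkout -q br2
on_br2=$(cat tracked.txt)
notes=$(cat notes.txt)
echo "tracked.txt on br1: $on_br1, on br2: $on_br2; notes.txt: $notes"
```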
But the thing about branch names is that they are not fixed: new names can be created at any time, and existing names move.
Let's take a look at creating a new name for commit J right now. We start with git checkout br1, to select commit J, and update our drawing a bit:
          I--J   <-- br1 (HEAD)
         /
...--G--H
         \
          K--L   <-- br2
The special name HEAD is how Git knows which branch name we're using.
Now let's create a new name, br3, using git branch br3. The picture changes a bit:
          I--J   <-- br1 (HEAD), br3
         /
...--G--H
         \
          K--L   <-- br2
There are now two names for commit J. If we create a new commit now, the new commit we make gets a new, unique hash ID, but we'll just call it N for New (I like to reserve M for Merge here):
               N   <-- br1 (HEAD)
              /
          I--J   <-- br3
         /
...--G--H
         \
          K--L   <-- br2
What Git did is to snapshot the files in the index and add appropriate metadata, and thus create commit N, and then write N's actual hash ID into the current branch name.
Commit J is now the last commit on br3, but is on both br1 and br3. Commits up through H are on all three branches.
The files that are in commit J do not, and cannot, change, as we create new commits. Yet commit J used to be the tip commit of branch br1. It's now the tip commit of branch br3, with branch br1 having moved on to select commit N.
The files in J can be specific to commit J. You just put the right contents into Git's index, then run git commit, when you make commit J. But they can't be specific to some branch name: the names move around, and someday commit J can be on more than one branch. For instance, we can now run git checkout br2 and then run git merge br3. The checkout br2 step does this:
               N   <-- br1
              /
          I--J   <-- br3
         /
...--G--H
         \
          K--L   <-- br2 (HEAD)
The merge command finds the merge base (commit H) and then combines work done on the H-to-L leg with work done on the H-to-J leg, to produce a new merge commit M:
               N   <-- br1
              /
          I--J   <-- br3
         /    \
...--G--H      M   <-- br2 (HEAD)
         \    /
          K--L
The branch name br2 now selects the new merge commit, and this merge commit reaches back to both commits L and J, so that commit J is now on all three branches. (So is commit I.)
The set of branches that "contain", or reach, a commit—by starting at the tip and working backwards—changes dynamically in a Git repository. A commit can be, and many are, on more than one branch.
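To watch that set change, the drawings above can be rebuilt in a scratch repository and queried with git branch --contains. This is only a sketch: the commit helper function, the base branch name, and the one-file-per-commit layout are made up for illustration.

```shell
set -eu
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/base
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git checkout -q -b br1
commit I
commit J
commit_j=$(git rev-parse HEAD)
git branch br3                   # a second name for commit J
commit N                         # br1 moves on to N; br3 stays at J

before=$(git branch --format='%(refname:short)' --contains "$commit_j" | paste -s -d ' ' -)

git checkout -q -b br2 base      # br2 starts at commit H
commit K
commit L
git merge -q --no-edit br3       # creates merge commit M on br2

after=$(git branch --format='%(refname:short)' --contains "$commit_j" | paste -s -d ' ' -)

echo "branches containing J before the merge: $before"
echo "branches containing J after the merge:  $after"
```

Commit J itself never changes; only the set of names that can reach it does.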
Upvotes: 10