Reputation: 2636
Consider the following scenario. I have three branches: Master, Develop and Test. All three branches have a config file (say a JenkinsFile) that contains branch-specific configuration. Please note that the configuration in this file is different for all three branches. Now, I create a feature branch off Master, make some changes and merge this feature branch with Develop and then with Test.
The question is: how do I prevent the JenkinsFile from being overwritten by any merge? I want the JenkinsFile to remain intact and not be affected by any merge. Is there a way to "lock" these files? Does gitignore work in this case?
Cheers!
Upvotes: 1
Views: 2757
Reputation: 490058
The question is - how do I prevent the JenkinsFile from being overwritten by any merge? I want the JenkinsFile to remain intact and not be affected by any merge. Is there a way to "lock" these files?
No.
There is a completely different way to go about this, though, that sidesteps the entire problem. In fact, there are multiple ways, but I'll show just one. There's an unfortunate problem in terms of getting to the state where things all work as desired, but once you do get there, you're good. The end goal here is to not have a committed file named Jenkinsfile (or JenkinsFile, but I've used the lowercase-F spelling below) whose content is branch-dependent. Instead, just have an uncommitted work-tree-only file whose name is Jenkins[Ff]ile and whose content is branch-dependent. Make the committed files have other names.
Fundamentally, git merge works by combining work done, i.e., combining the changes to some file(s) since some common starting point. But Git doesn't store changes; Git stores snapshots. This creates a problem for git merge, and the solution requires that you understand how Git's commit graph works.
Almost every commit in a Git repository has at least one parent commit, which is that commit's immediate predecessor. Most have exactly one parent; commits of type "merge" have at least two, and usually exactly two. In fact, the presence of more than one parent is what defines a commit to be a merge commit. The other common special case is that the very first commit in a repository has no parent, because it can't have one, because it was the first commit. (Commits with three or more parents are called octopus merges but they do nothing you can't do with regular merges, so they're mainly for showing off. :-) )
These links, in which a commit stores the hash ID of its parent(s)—remember that each commit is found by its unique hash ID, that Git assigned to the commit when you made the commit—form backwards chains. These backwards chains are the history in the repository. History is commits; commits are history. A branch name simply identifies the (single) last commit that we wish to claim to be part of that branch:
... <-F <-G <-H <--master
Here, instead of actual hash IDs, I've drawn in single uppercase letters that stand in for each commit. The name master holds the actual hash ID of commit H. We say that master points to H. H holds the hash ID of its parent G, so H points to G, which points to F, and so on, backwards down the line.
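The backward links can be inspected directly in a throwaway repository. This is a minimal sketch: the scratch directory, the file name file.txt, the identity settings, and the commit messages F, G, H are all made up for illustration.

```shell
set -e
# Build a scratch repository with three commits (stand-ins for F, G, H).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
git config user.name "Example"
git config user.email "example@example.com"
for c in F G H; do
    echo "commit $c" > file.txt
    git add file.txt
    git commit -q -m "$c"
done

# The branch name holds the hash ID of the tip commit (H).
tip=$(git rev-parse master)
# H stores the hash ID of its parent (G), and G stores that of F.
parent=$(git rev-parse 'master^')
grandparent=$(git rev-parse 'master^^')

# One line per commit: its hash, its parent's hash, its subject.
# The root commit (F) has an empty parent column.
git log --format='%h %p %s' master
```

Reading the output from top to bottom follows the chain the way Git does: start at the tip and step to each parent in turn.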
Nothing inside any commit can ever change, so we don't need the internal arrows; we just have to remember that they go backwards. It's actually very hard to go forwards, in Git: almost all operations start at the end(s) and work backwards. Once we have more than one branch, this gives us a picture that looks like this:
          G--H   <-- master
         /
...--E--F
         \
          I--J   <-- develop
              \
               K   <-- test
To git checkout a branch means to extract the snapshot from the tip commit of that branch. So git checkout master extracts the snapshot from commit H, while git checkout develop or git checkout test extracts those snapshots in turn. Also, doing a git checkout of some branch name attaches the special name HEAD to that branch. This is how Git knows which branch, and which commit, is the current one.
When you run git merge
, you give Git the name of some other commit. That doesn't have to be a branch name—any name for a commit will serve—but giving it a branch name works fine, since that names the tip commit of that branch. So if you git checkout master
and then run git merge develop
, you start with:
          G--H   <-- master (HEAD)
         /
...--E--F
         \
          I--J   <-- develop
              \
               K   <-- test
and Git finds commit J. Git then works backwards from both the current commit H and the named commit J to find the merge base of these two commits.
The merge base is, loosely, the first commit we get to from both tips. That's a commit that's on both branches, and in this case, that's obviously commit F. The idea of a merge base is crucial to understanding how merge works. The goal of the merge is to combine work, and that work can be found by comparing the snapshot in commit F, one comparison at a time, to each of the two tip commits H and J:
git diff --find-renames <hash-of-F> <hash-of-H> # what we changed
git diff --find-renames <hash-of-F> <hash-of-J> # what they changed
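You don't have to hunt for the hash of F by hand: git merge-base prints it. The sketch below builds a two-branch scratch repository (the file name app.txt and its contents are invented for the example) and then runs the same two comparisons.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Commit F: the shared starting point.
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > app.txt
git add app.txt
git commit -q -m "F"
f_hash=$(git rev-parse HEAD)

# develop branches off F and changes the last line (commit J).
git checkout -q -b develop
printf 'line 1\nline 2\nline 3\nline 4\nline 5 (develop)\n' > app.txt
git commit -q -a -m "J"

# master moves on separately and changes the first line (commit H).
git checkout -q master
printf 'line 1 (master)\nline 2\nline 3\nline 4\nline 5\n' > app.txt
git commit -q -a -m "H"

# Git works backwards from both tips to find the merge base.
base=$(git merge-base master develop)
echo "merge base: $base"

# The two comparisons the merge is built from.
git diff --find-renames "$base" master    # what we changed
git diff --find-renames "$base" develop   # what they changed
```

Here the merge base is the commit labelled F, and each diff shows exactly one changed line.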
To combine the changes, Git starts with all the files from F
, and looks at which files we changed and which ones they changed. If we both changed different files, Git takes ours or theirs as appropriate. If we both changed the same file—this eventually brings up a philosophical problem which we'll get back to in a moment—Git attempts to smash our changes together with their changes, by assuming that if we touched some source line and they didn't, it should take ours, and if they touched some source line and we didn't, it should take theirs too. If we both touched the same lines of the same file, then either we did the exact same thing to those lines—in which case, Git takes one copy of that change—or there's a conflict.
If there are no conflicts, Git applies these combined changes to the snapshot in the merge base—in F
, here—and uses the resulting files to write out a new snapshot. That new snapshot is a commit of type merge commit, having two parents. The first parent is the commit we were on before, H
, and the second is the one we named with our argument, J
, so the merge looks like this:
          G--H
         /    \
...--E--F      L   <-- master (HEAD)
         \    /
          I--J   <-- develop
              \
               K   <-- test
Note that nothing happens to any existing commit, nor to any other branch name. Only our own branch name, master
(to which HEAD
is attached), moves; master
now points to the new merge commit that Git just made.
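To see the two parents of a merge commit for yourself, something like the following works in a scratch repository. It is only a sketch; app.txt, its contents, and the commit messages are placeholders, not anything from the question.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Commit F, the merge base.
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > app.txt
git add app.txt
git commit -q -m "F"

# Commit J on develop touches the last line.
git checkout -q -b develop
printf 'line 1\nline 2\nline 3\nline 4\nline 5 (develop)\n' > app.txt
git commit -q -a -m "J"
j_hash=$(git rev-parse HEAD)

# Commit H on master touches the first line.
git checkout -q master
printf 'line 1 (master)\nline 2\nline 3\nline 4\nline 5\n' > app.txt
git commit -q -a -m "H"
h_hash=$(git rev-parse HEAD)

# The changes do not overlap, so the merge completes on its own.
git merge -q --no-edit develop

# The new commit L lists H first (where we were) and J second (what we named).
parents=$(git log -1 --format=%P)
echo "parents of merge: $parents"

# develop still points to J; only master moved.
git rev-parse develop
cat app.txt
```

The merged app.txt carries both edits, the first line from master and the last line from develop.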
If the merge goes badly, due to merge conflicts, Git will leave a mess behind. The index, which I'm not going to get into here, will contain all the conflicting input files, and the work-tree will contain Git's attempt at merge, along with conflict markers. Your job is to clean up the mess, fix up the index, and finish the merge (with git merge --continue
or git commit
—the --continue
just runs commit
) by hand.
Jenkinsfile
Suppose that in commit F
, the merge base, there is a file named Jenkinsfile
. This same file, with this same name, appears in commits H
and J
. The copies in H
and J
differ—you said they do, so we'll assume that they do. Therefore at least one differs from F
, and perhaps both differ from F
.
Git is going to assume that the file that is named Jenkinsfile
in both branch tips is the same file that is named Jenkinsfile
in F
. Obviously, it's not quite the same file—the contents differ—but Git will assume that it is, and that you're trying to combine work done on it.
So, Git will diff the version of Jenkinsfile
in F
against that in H
, and then diff it again, against the version in J
. There will be some changes. If both branch tips have changes, Git will combine them (or declare a conflict). Result: bad. Otherwise, Git will take the version of the file from whichever "side" changed it. Is that the side you want? If so, result: good. If not, result: bad.
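In summary, for this scenario, there are three possible results: Git combines changes from both sides or declares a conflict (bad), Git takes the file from the side you wanted (good), or Git takes the file from the side you did not want (bad).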
It is of course possible that merge base commit F
has no file named Jenkinsfile
. And, it's possible that one or both tip commits have no such file. In this case, it gets a little trickier. We'll get to that in a moment.
The solution here is to avoid having a single, fixed-name file, such as Jenkinsfile
, in all commits when that file is intended to be branch-dependent. Suppose, instead, that commit F
contains Jenkinsfile.master
and Jenkinsfile.develop
and Jenkinsfile.test
. Then commit H
will have a Jenkinsfile.master
and Jenkinsfile.develop
and Jenkinsfile.test
too, and the changes from F
to H
in Jenkinsfile.master
will be the ones you want to keep. Since commit J
is in branch develop
, it should always either have the same changes—imported from master
at some point—or no changes at all. Git's merge will therefore do the right thing, in both cases.
The same logic applies to each of the other such files. Note that at this point, the commits identified by all branch tips should have no file named Jenkinsfile
(without a suffix) at all. This is, of course, an idealized goal-state: to get there, you must actually make new commits in each branch, renaming the existing Jenkinsfile
. But this will have no effect at all on any existing commits. All of that history in your repository is frozen for all time. This means that at some point, you'll run git merge
and git merge
will locate a merge base commit that has only Jenkinsfile
, not Jenkinsfile.master
, and not Jenkinsfile.develop
or any other suffix.
Let's assume now that in H
and J
, you have already done this renaming, but in merge base F
, you have not—obviously, since it's a historic commit. So F
has a Jenkinsfile
and no renamed files, while H
and J
have no Jenkinsfile
but do have the renamed files.
Now, remember above where we showed the git diff
s that git merge
runs, to figure out what has changed since the merge base. One of the arguments is --find-renames
. This directs Git to guess whether the file Jenkinsfile
in F
is "the same" file as Jenkinsfile.master
in H
, when comparing F
and H
. The same goes for the comparison of F
vs J
: is the old Jenkinsfile
the same file as the new Jenkinsfile.develop
?
If you followed the link to https://en.wikipedia.org/wiki/Ship_of_Theseus you will see that there's no philosophical right answer to the question of identity-over-time. But Git has its right answer, which is: If the file has a similarity index of 50% or better, it's the same file. We don't need to worry here about how Git computes this similarity index (it's a bit complicated); chances are very good that Git will detect the rename in both cases.
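You can ask Git what it thinks about a particular rename before merging. The sketch below uses an invented Jenkinsfile body; it renames the file, changes one line, and then asks git diff for the name-status summary.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Commit F: the old, single-name file (contents are hypothetical).
cat > Jenkinsfile <<'EOF'
pipeline {
    agent any
    environment { TARGET = 'shared' }
    stages {
        stage('Build') { steps { echo 'building' } }
    }
}
EOF
git add Jenkinsfile
git commit -q -m "F"

# Commit H: rename the file and change one line of it.
git mv Jenkinsfile Jenkinsfile.master
cat > Jenkinsfile.master <<'EOF'
pipeline {
    agent any
    environment { TARGET = 'master' }
    stages {
        stage('Build') { steps { echo 'building' } }
    }
}
EOF
git commit -q -a -m "H"

# An "R" status (followed by the similarity score) means Git paired the
# old and new names as one file; separate "D" and "A" lines would mean
# it treated them as a deletion plus an unrelated new file.
status=$(git diff --find-renames --name-status HEAD~1 HEAD)
echo "$status"
```

If the per-branch files drift far enough from the original that the score drops below the threshold, the status lines switch from R to D and A, and the merge behaves differently from what is described here.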
What this means in practice is that the first time you run this git merge
, Git will immediately declare a merge conflict, of the type I like to call a high level conflict. That is, Git will say that Jenkinsfile
was renamed in both branches, but to two different names. Git doesn't know whether to use the master version, or the develop version, or both, or neither, or what. It will just stop with a merge conflict. This is OK because it gives you a chance to resolve the conflict, which you should do by selecting the Jenkinsfile.master
file as it appears in the master
or --ours
branch, and selecting the Jenkinsfile.develop
file as it appears in the develop
or --theirs
branch, as your merged results. Put these two files into the index while removing the original name:
git rm --cached Jenkinsfile
git checkout --ours Jenkinsfile.master
git checkout --theirs Jenkinsfile.develop
git add Jenkinsfile.master Jenkinsfile.develop
You have now resolved the conflict by choosing to keep both files as they appear in both branch tips. You can now commit the result.
Every time you do a merge that uses one of the historic, single-Jenkinsfile
commits, you'll need to check that the merge result is correct, or resolve any conflicts. (If it's not correct, immediately after merging, you can fix it in place and use git commit --amend
to push the original merge aside and choose a new result as the merge commit. If you don't notice a bad merge, it's a bit more painful, but the recovery is similar in the end anyway. Remember how Git does merges, and work through the two git diff
s, to see how putting the right result in any tip commit gets you where you need to go.)
Jenkinsfile
Now that there's no file named Jenkinsfile
, you'll have to redirect any software that wants to use such a file. There are multiple solutions (depending on the software and your OS), including making a symbolic link from Jenkinsfile
to the correct per-branch checkout. (Make sure the symbolic link does not get committed, or you'll be right back to the same merge issue when Git tries to merge two potential symlink target changes.)
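One way to set up the symbolic link, as a sketch for a Unix-like system: it assumes the committed files are named Jenkinsfile.<branch> exactly as above, that branch names contain no slashes, and it uses the clone-local .git/info/exclude file to keep the link from being added by accident. The file contents are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Committed, per-branch files (hypothetical contents).
echo "env = master"  > Jenkinsfile.master
echo "env = develop" > Jenkinsfile.develop
git add Jenkinsfile.master Jenkinsfile.develop
git commit -q -m "per-branch Jenkinsfiles"

# Keep the unsuffixed name out of every commit made from this clone.
mkdir -p .git/info
echo "/Jenkinsfile" >> .git/info/exclude

# Point the work-tree-only name at the file for the current branch.
branch=$(git rev-parse --abbrev-ref HEAD)
ln -sf "Jenkinsfile.$branch" Jenkinsfile

cat Jenkinsfile          # reads through the link
git status --porcelain   # prints nothing: the link is ignored
```

The link has to be refreshed after switching branches. Re-running the last two commands by hand works, and a post-checkout hook could do the same thing automatically, though whether that suits your Jenkins setup is something you'd need to check.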
Upvotes: 3
Reputation: 35135
Keeping a file different between branches is super hard, especially one that changes from time to time.
A better solution is to have settings.environment.json files that contain settings for different environments and make your software use different settings file depending where it runs.
Having said that it's best not to keep your production settings in git. The deployment pipeline should contain passwords etc, not your version control system. In this scenario the settings file in all branches contains DEV settings (which are OK to be public) and the pipeline overwrites the settings with TEST and PROD values when it prepares the package for deployment to the target environment.
Upvotes: 1