Reputation: 14926
We have to work on another team's repo, but for various reasons we want to keep our development work out of view until it is ready to push to their repo.
I cloned their repo, deleted the .git folder, then created a new repo from the files. This got us up and running, but I am now not sure of the best way to merge changes from their repo into ours.
Is there a better setup where we can work in isolation but not have such an awkward time keeping on top of merges?
Upvotes: 0
Views: 267
Reputation: 45819
By deleting the .git folder, you removed the history. Then, by creating a new repo from those files, you started a new history which, in git's view, is unrelated. Even if you committed before making any changes, you still created a new commit (with coincidentally identical content).
What you want is for your team's repos to preserve the upstream repo's history, so that your repos know how your changes relate to that history; then it's (relatively) easy to merge in upstream changes. You control what upstream can see simply by controlling what you push to the upstream repo.
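To make the "unrelated history" point concrete, here is a minimal sandbox sketch. Everything in it is a stand-in: the throwaway directories, the file name file.txt, and the demo identity are assumptions for illustration, not part of the original setup. It recreates the situation from the question and shows that the snapshot has the same content as upstream but no common ancestor with it.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"

# Hypothetical upstream repository with a single commit.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo "hello" > file.txt
git add file.txt
git commit -q -m "upstream commit"
cd ..

# Recreate the question: copy the files, drop .git, start a fresh repo.
cp -R upstream ours
rm -rf ours/.git
git init -q ours
cd ours
git symbolic-ref HEAD refs/heads/master
git add file.txt
git commit -q -m "snapshot of upstream"

git remote add upstream ../upstream
git fetch -q upstream

# Identical tree (same content), yet no shared ancestor.
same_tree=no
[ "$(git rev-parse 'master^{tree}')" = "$(git rev-parse 'upstream/master^{tree}')" ] && same_tree=yes
if git merge-base master upstream/master >/dev/null; then
  related=yes
else
  related=no
fi
echo "same tree: $same_tree, related history: $related"
```

git merge-base exits non-zero when two commits share no ancestor, which is why a plain merge between the two histories has nothing to work from.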
This can still be achieved, but the exact steps depend on a number of questions about your current state. I'm going to have to make some assumptions, so comment if these assumptions are wrong and I can try to adjust the answer accordingly.
So presumably you created your new repo in such a way that you have a remote named origin from which your developers clone their local repos. (Going forward I'll call that origin, and the original repository from which you took the first snapshot I'll call upstream.)
I'll also assume active development has been going on in your origin, so that it would be a problem to just re-clone from upstream; and that you are at the point where you want to integrate changes from upstream into origin.
The first step, working from a clone of origin, is to pull in the upstream history.
git remote add upstream url/of/upstream/repo
git fetch upstream
Next you need to integrate the histories. There are a couple ways to do that. The best long-term results would come from doing a history rewrite, though it requires some coordination among your team.
Ideally you would tell everyone on the team to push their work to origin; it needn't be fully merged, but all branches need to be pushed. After pushing their work, they should discard their clones and wait for you to tell them the rewrite is complete, at which point they would create new clones of origin (and, optionally, add the upstream remote as well).
If that level of coordination isn't possible, a rewrite can still be done; but it will likely be more work for some or all of the devs, as they'll have to perform their own migration of any changes that weren't in origin at the time you cloned (or did your last fetch before the rewrite). In that case, it might be better to use a non-rewriting option (more on that in a bit).
Having fetched all of your team's history (or replicated it when you cloned your working repo), you next need to identify the commit in the upstream history that corresponds to the initial commit in your origin repository.
If this was a tagged version, that may make things simple; but I'm assuming it was just whatever was on master at the time. But even in that case, if you know about when you did it, you should be able to track down the correct commit and verify with git diff that it matches your initial commit.
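One way to track that commit down, sketched here in a sandbox, is to compare tree hashes: if the initial commit was made from an unmodified checkout, its tree should be identical to the tree of the upstream commit it was taken from. The repositories, file names, and commit messages below are hypothetical, and the approach assumes nothing altered the files on the way in (ignored files or line-ending conversion would make the trees differ).

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"

# Hypothetical upstream with two commits at the time of the snapshot.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
for n in 1 2; do
  echo "version $n" > file.txt
  git add file.txt
  git commit -q -m "upstream $n"
done
cd ..

# Snapshot taken here, history discarded.
cp -R upstream ours
rm -rf ours/.git

# Upstream keeps moving afterwards.
(cd upstream && echo "version 3" > file.txt && git add file.txt && git commit -q -m "upstream 3")

git init -q ours
cd ours
git symbolic-ref HEAD refs/heads/master
git add file.txt
git commit -q -m "initial import"
git remote add upstream ../upstream
git fetch -q upstream

# Walk upstream history (newest first) looking for a commit whose tree
# matches the tree of our root commit.
root=$(git rev-list --max-parents=0 master)
root_tree=$(git rev-parse "$root^{tree}")
match=""
for c in $(git rev-list upstream/master); do
  if [ "$(git rev-parse "$c^{tree}")" = "$root_tree" ]; then
    match=$c
    break
  fi
done
echo "upstream commit matching the initial import: ${match:-none found}"
```

The loop stops at the newest match; if upstream ever returned to the same content (for example after a revert), more than one commit can share the tree, so the date of the snapshot is still worth checking.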
So you find that commit and we mark it O in the following graph; and for convenience you might want to tag it.
X -- X -- O -- X -- X -- X <--(upstream/master)

o -- A -- B <--(master)(origin/master)
 \
  C <--(someBranch)
Now, the easiest thing to do is re-parent o, so that its replacement's parent is O. The other option, which produces a "cleaner" result, would be to re-parent A and C, replacing o with O as their parent. The latter can be slightly trickier, but not as much as you might think; so let's look at how to do that. In the above example, you could use something like
git filter-branch --parent-filter "sed \"s/$(git rev-parse master~2)/$(git rev-parse upstream/master~3)/g\"" -- --branches
which should give you
X -- X -- O -- X -- X -- X <--(upstream/master)
          |\
          | A* -- B* <--(master)
           \
            C* <--(someBranch)

o -- A -- B <--(refs/original/refs/heads/master)(origin/master)
 \
  C <--(refs/original/refs/heads/someBranch)
Then you can either force-push (git push -f) all of the rewritten branches to origin, or recreate the origin from the new repo.
Note that the local repo in which you did the rewrite will have new refs under refs/original/refs/heads, representing the pre-rewrite locations of the branches. Also note that the remote tracking refs for origin are not yet updated (until you do the force pushes, or until you remove the origin remote and re-add it using a remote that reflects the rewrite).
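The re-parenting rewrite can be tried out safely in a sandbox before touching the real repositories. The sketch below builds a tiny hypothetical upstream (commits X and O, plus one later commit) and a snapshot repo (o plus one local commit A), then runs the parent filter with the two hashes looked up explicitly. All paths, file names, and the demo identity are assumptions; FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that newer git versions print.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

cd "$(mktemp -d)"

# Hypothetical upstream: commits X and O.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo x > x.txt; git add x.txt; git commit -q -m "X"
echo o > o.txt; git add o.txt; git commit -q -m "O"
cd ..

# Snapshot of O with the history discarded.
cp -R upstream ours
rm -rf ours/.git

# Upstream moves on after the snapshot.
(cd upstream && echo later > later.txt && git add later.txt && git commit -q -m "later upstream work")

# Our repo: root commit o (the snapshot) and one local commit A.
git init -q ours
cd ours
git symbolic-ref HEAD refs/heads/master
git add .; git commit -q -m "o: snapshot"
echo a > a.txt; git add a.txt; git commit -q -m "A"

git remote add upstream ../upstream
git fetch -q upstream

o=$(git rev-list --max-parents=0 master)   # our root commit
O=$(git rev-parse upstream/master~1)       # the upstream commit it was copied from
old_master=$(git rev-parse master)

# Rewrite every local branch so that children of o become children of O.
git filter-branch --parent-filter "sed \"s/$o/$O/g\"" -- --branches >/dev/null

echo "new parent of A: $(git rev-parse master~1)"
echo "merge base now:  $(git merge-base master upstream/master)"
echo "backup ref:      $(git rev-parse refs/original/refs/heads/master)"
```

After the rewrite, master descends from O, so a later merge of upstream/master has a real merge base to work with, and the pre-rewrite branch tip is still reachable through the backup ref.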
So... what if you decide a rewrite can't be done? Well, in that case you probably want to have a single "integration repo", cloned from origin with upstream added. In the integration repo you would set up a git replace mapping, telling git "whenever you encounter object o, use object O instead". This "papers over" the problem. It can have a few quirks (see the git replace docs), but ideally you'd be able to stop relying on the replace mapping (and the integration repo) after some time has passed. The developers' history would not end up quite as clean as with a rewrite, but there'd be no need for a cut-over to a rebuilt repository.
The idea here is that eventually the histories will be "combined enough" that git will understand what to do without the replacement. This would have to do with how merge bases are calculated. Consider a simple case.
X -- X -- X -- O -- A -- B -- C <--(upstream/master)
o -- D -- E <--(master)
Now you want to merge the upstream changes into master. In your integration repo you've said
git replace master~2 upstream/master~3
which we might draw as
X -- X -- X -- O -- A -- B -- C <--(upstream/master)
               :
               o -- D -- E <--(master)
so git commands by default "see"
X -- X -- X -- O -- A -- B -- C <--(upstream/master)
                \
                 D -- E <--(master)
Meaning if you say
git checkout master
git merge upstream/master
the calculated merge base will be O
and git will "think" it's giving you
X -- X -- X -- O -- A -- B -- C <--(upstream/master)
                \              \
                 D ---- E ---- M <--(master)
which is really
X -- X -- X -- O -- A -- B -- C <--(upstream/master)
               :               \
               o --- D --- E --- M <--(master)
In this example, master is the only branch receiving changes from upstream, and at this point master's history tracks back into upstream/master; so next time you merge upstream/master into master the merge base should be C, and the replacement is no longer needed (so this could be performed in any clone, rather than needing to take place in a specially-set-up integration repo).
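The replace-based route can also be rehearsed in a sandbox. The sketch below uses a smaller hypothetical history (upstream X, O, A; local o, D) with made-up file names and a demo identity. It checks three things: that no merge base exists before the replacement, that one exists once the replacement is in place, and that after the first merge a merge base is found even with replacements switched off. Which commit git reports as the base while the replacement is active is printed rather than assumed.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"

# Hypothetical upstream: commits X and O.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo x > x.txt; git add x.txt; git commit -q -m "X"
echo o > o.txt; git add o.txt; git commit -q -m "O"
cd ..

# Snapshot of O without history, then upstream adds commit A.
cp -R upstream ours
rm -rf ours/.git
(cd upstream && echo a > a.txt && git add a.txt && git commit -q -m "A")

# Integration repo: root commit o (the snapshot) plus local commit D.
git init -q ours
cd ours
git symbolic-ref HEAD refs/heads/master
git add .; git commit -q -m "o: snapshot"
echo d > d.txt; git add d.txt; git commit -q -m "D"
git remote add upstream ../upstream
git fetch -q upstream

o=$(git rev-list --max-parents=0 master)
O=$(git rev-parse upstream/master~1)

# Before the replacement the histories share nothing.
base_without=$(git merge-base master upstream/master || true)

# Tell git to use O wherever it would read o, then merge.
git replace "$o" "$O"
base_with=$(git merge-base master upstream/master || true)
git merge -q -m "M: merge upstream" upstream/master

# After the merge, a base exists even when replacements are ignored.
base_after=$(git --no-replace-objects merge-base master upstream/master)

echo "base before replace: ${base_without:-none}"
echo "base with replace:   ${base_with:-none}"
echo "base after merge, replacements ignored: $base_after"
```

The last line is the point of the exercise: once the merge commit exists, later merges from upstream/master can be done in any ordinary clone.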
Now I mentioned that the developers' history would not be as "clean", and the obvious thing is that in the final state after you stop using the replacement, we have
X -- X -- X -- O -- A -- B -- C <--(upstream/master)
                               \
               o --- D --- E --- M <--(master)
so the "lineage" of D, E, and M is somewhat broken. In particular, it's not obvious how M should be the result of merging C into E. This could be seen as an "evil merge", though it's not as bad as some in the sense that the default merge (without using a replacement) would generate merge conflicts anyway.
Upvotes: 1
Reputation: 1486
Deleting .git was disadvantageous.
Clone the remote repo again. You can add additional remotes with git remote add yourrepo http://urlOfYourRepo
Make sure that the default push/fetch remote for the branches you don't want to sync yet is set to yourrepo: git push -u yourrepo yourBranch sets the default upstream.
You can always git fetch origin to get the remote repo's changes into your local repo, then merge them and push the result to yourrepo.
With git push origin yourBranch you can push a branch to the remote repo once it is ready (to push a single commit, give an explicit destination: git push origin yourCommit:refs/heads/someBranch).
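As a rough illustration of this two-remote setup, the sandbox sketch below uses local bare repositories as stand-ins for the two servers. The names theirs.git, private.git, and feature are hypothetical, as is the demo identity. It shows that a branch pushed with -u to yourrepo tracks that remote and never appears in the other team's repository.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"

# Stand-ins for the two servers: the other team's repo and a private one.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo base > file.txt; git add file.txt; git commit -q -m "their commit"
cd ..
git clone -q --bare seed theirs.git
git init -q --bare private.git

# Clone their repo (it becomes origin) and add the private remote.
git clone -q theirs.git work
cd work
git remote add yourrepo ../private.git

# Private work lives on a branch whose default remote is yourrepo.
git checkout -q -b feature
echo secret > ours.txt; git add ours.txt; git commit -q -m "private work"
git push -q -u yourrepo feature

tracking=$(git rev-parse --abbrev-ref 'feature@{upstream}')
if git --git-dir=../theirs.git show-ref --verify --quiet refs/heads/feature; then
  visible=yes
else
  visible=no
fi
echo "feature tracks $tracking; visible in their repo: $visible"
```

Because the clone keeps the full upstream history, a later git fetch origin followed by a merge of origin/master into feature works without any history surgery.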
Upvotes: 1