Reputation: 887
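I have a remote GitLab repo, my-project, that contains 2 branches, master and dev. I also have a directory on my local machine, also named my-project, in which I was developing some code; it has more or less the same structure as the remote repo. What I want to do is:
- initialise the directory as a git repo (which I already did)
- connect the initialised local repo somehow to the remote one.
- Then create a branch based on the structure of the master branch of the remote repo (without losing any files on the local repo).
- finally, push the code in the local repo to the newly created branch in the new repo.
What I have done so far is: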
cd my-project
git init
git fetch
git checkout -b new_dev_branch
git add .
git commit -m "first trial"
git remote add origin path/to/remote/gitlab/my-project.git
git push -u origin new_dev_branch
But the problem is that it didn't create the branch with the structure of the master as I wanted in step 3.
Any ideas?
Upvotes: 1
Views: 1921
Reputation: 487795
You've created a new root commit in your new repository. That's not what you wanted. You need a different (non-root) commit.
Sajib Khan's answer, suggesting merging or cherry-picking, is likely to hit stumbling blocks. Since Git 2.9, git merge (and therefore git pull) refuses to combine unrelated histories unless you pass --allow-unrelated-histories, and since you mention GitLab, you will have a Git version that is newer than 2.9.
You can fix this in several ways. Read the "long" section below to see all of them, and why this summary should be correct (note that I can't actually test it). Here is the summary of the "fix things" command sequence, which is your shortest path forward from where you are right now:
git push --delete origin new_dev_branch
git branch -m error
git fetch
git switch -c new_dev_branch --no-track origin/dev
git restore --source error -SW -- .
git commit
(put in a good commit message here!)
git push -u origin new_dev_branch
I'm assuming you have Git 2.23 or later; if not, see below.
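To try the summary sequence without touching the real repository, here is a minimal sandbox sketch. It assumes Git 2.23 or later. The temporary directories, the file names, and the local bare repository standing in for GitLab are all made up for illustration.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
sandbox=$(mktemp -d)

# Stand-in for the GitLab repo: a bare repository with master and dev.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo "v1" > README && git add README && git commit -q -m "initial"
git checkout -q -b dev
echo "dev" > dev.txt && git add dev.txt && git commit -q -m "dev work"
git checkout -q master
git clone -q --bare "$sandbox/seed" "$sandbox/remote.git"

# Reproduce the situation from the question: a fresh repo with one root commit.
mkdir "$sandbox/my-project"
cd "$sandbox/my-project"
echo "local code" > app.txt
git init -q
git checkout -q -b new_dev_branch
git add .
git commit -q -m "first trial"
git remote add origin "$sandbox/remote.git"
git push -q -u origin new_dev_branch

# The repair sequence from the summary.
git push -q --delete origin new_dev_branch
git branch -m error
git fetch -q
git switch -q -c new_dev_branch --no-track origin/dev
git restore --source error -SW -- .
git commit -q -m "Add local work on top of dev"
git push -q -u origin new_dev_branch

# Show the resulting history of the repaired branch.
git log --oneline --graph new_dev_branch
```

The final log should show the new commit sitting on top of the dev history instead of standing alone as a root commit.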
You're starting from the wrong idea. You are thinking that Git stores files, and that a branch name has a lot to do with that. But these aren't right! A Git repository is, at its heart, a storage system for commits. In other words, the unit of storage, in any repository, is the commit.
Now, it is true that commits contain files. But the trick is that every commit contains every file (or more precisely, "every file it contains", although that sounds redundant). You pick out one particular commit, such as a123456, and tell Git to check out that commit. Git removes other files and then extracts the files that are in that commit. You now have the files from that commit.
Again, the real focus here is, and stays on, the commit. We always have to be concerned about the commit. There's a problem here for us humans, though: commits have these big horrible random-looking hexadecimal numbers like a123456 or f1dd1ee or whatever, except that they're even bigger and longer and more random looking (my made-up ones, like deadcab and feedc0ffee, at least resemble words). We call these things hash IDs, and each commit has a unique hash ID, which is the "true name" of the commit.
The hash IDs are impossible for humans to deal with correctly, though, so Git provides us names. We start with branch names like master or dev. These names let Git remember the latest hash ID for us. But these names really just mean "get me the latest commit, whose hash ID is stored in that name". There are other kinds of names, but there is a special property of branch names, which we'll see soon.
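A small sketch of a name standing in for a hash ID, run in a throwaway repository (the directory and file name are invented for the example). git rev-parse turns a name into the hash ID it currently holds:

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "one" > file.txt && git add file.txt && git commit -q -m "first"
before=$(git rev-parse master)      # hash ID stored in the name right now

echo "two" >> file.txt && git commit -q -a -m "second"
after=$(git rev-parse master)       # same name, different hash ID

echo "master was $before"
echo "master is  $after"
# The older commit still exists; master simply names the newer one now.
git rev-parse 'master^'
```

The name stays the same while the hash ID it resolves to changes with each new commit.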
What this means for you is that you are not allowed to just create a new Git repository.1 You must start by copying the existing repository. The reason for this is that each new commit you make, in a Git repository, connects back to some existing commits. So you have to get their commits first, so that you can connect your new commits to their old (existing) commits. So this sequence doesn't quite work:
- initialise the directory as a git repo (which I already did)
- connect the initialised local repo somehow to the remote one.
- Then create a branch based on the structure of the master branch of the remote repo (without losing any files on the local repo).
- finally, push the code in the local repo to the newly created branch in the new repo.
It can be repaired in place, but it's tricky. If you can, it's far easier to just start over. I'll cover how to do that first.
1 Well, you are allowed; it's just going to become a problem if you do.
Given that you have a directory (or folder) full of files and those are the files you want to have in your new commit on your new branch, we start by putting that aside. We keep it ready, but we don't use it yet.
We now create a new folder (or directory), using git clone. This is actually shorthand for doing five or six separate commands, which we will see in the "how to repair the existing setup" section, but for now, think of it this way: git clone tells your Git software on your computer to call up some other computer somewhere, get some Git software on their computer to navigate to some Git repository, and have them pour out all their commits. Your Git (your software on your computer) copies all the commits into a new repository on your computer. These commits form branches, whose latest commit is remembered, by the Git software on their computer, with branch names. But when your Git (your software on your computer, building your new repository) gets all the commits from their Git (their software reading their repository), your Git doesn't copy any of their branch names as-is. Your Git takes their branch names and changes them.
Their Git repository has some URL, such as ssh://[email protected]/path/to/repo.git.2 Your Git (by which I again mean "your software running on your computer, working on your local repository") will store this URL under a name. The default name that git clone uses here is origin. We call this name a remote, so this URL gets stored as the remote named origin.
This factors into the branch-name changes. When your Git clones all the commits from their Git, your Git takes their branch names and changes them into remote-tracking names. Their master becomes your origin/master; their dev becomes your origin/dev. So, at this point in the cloning process, you have one remote-tracking name for each of their branch names.
If they have three branch names, they have three "most recent commit" hash IDs. You now have three remote-tracking names, made by sticking origin/ in front of each branch name.3
Now that you have all of their commits and none of their branches, git clone does its last special trick: it creates one branch, and checks out the latest commit for that one branch. The branch it creates depends on the -b argument you give to git clone. If you don't give git clone a -b, your Git asks their Git which name they recommend. They'll generally recommend master (or main, which GitHub now uses for new repositories), and your Git will create that name, based on your origin/master or origin/main, which your Git made from their master or main.
The result of this complicated dance is that you now have one branch with the same name as one of their branches. Your Git now uses git checkout or git switch4 to check out the latest commit on that branch, as stored in that branch name (which your Git copied from your remote-tracking name from their branch name).
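The renaming can be observed directly. This sketch clones a throwaway local repository (the "theirs" and "mine" directories are hypothetical stand-ins for the two computers) and lists every branch name and remote-tracking name the clone ends up with:

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
sandbox=$(mktemp -d)

# "Their" repository, with two branch names: master and dev.
git init -q "$sandbox/theirs"
cd "$sandbox/theirs"
git symbolic-ref HEAD refs/heads/master
echo "v1" > README && git add README && git commit -q -m "initial"
git branch dev

# "Your" clone of it.
git clone -q "$sandbox/theirs" "$sandbox/mine"
cd "$sandbox/mine"

# refs/heads holds branch names; refs/remotes holds remote-tracking names.
git for-each-ref --format='%(refname)' refs/heads refs/remotes
```

Only one name should appear under refs/heads (the branch git clone created and checked out), while both of their branches show up under refs/remotes/origin.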
2 Sometimes that's a local file path, in which case "their computer" is really your computer. Sometimes that's a network drive, but if so, you will eventually discover that using network drives like this is error-prone: it's best to use an actual network protocol and call up the other computer on which the drive is local. But that's a separate topic.
3 Technically, the remote-tracking names are in a separate namespace from the branch names. This means even if you create a (local) branch name that is spelled origin/something, it's not a remote-tracking name. Git will keep them straight (and by default, git branch -a will show the remote-tracking names in red and the branch names in other colors, so that you'll know which one is which), but don't do that; it makes things crazy for humans.
4 The git switch command was new in Git 2.23. It's the result of taking the old git checkout, which had two modes (one "safe" and one "unsafe"), and splitting it into two separate commands: git switch, which implements only the safe-mode check-out operations, and git restore, which implements the unsafe-mode check-out operations. If you have 2.23 or later, it's a good idea to learn the new commands, so that you don't accidentally invoke the unsafe mode one when you think you're invoking the safe mode, but they're really almost the same.
It's time for a short (well, short-ish) sidebar here, about committed files as stored in a Git commit, vs the files you work with. Knowing this will explain a lot about Git that is otherwise mysterious.
Everything in a Git commit is read-only. No part of any commit can ever be changed, once you (or whoever) has made the commit. This way, each commit stores every file, in that particular version, forever—or at least, as long as the commit with that particular hash ID continues to exist. (It's a bit hard to remove commits, and we won't cover that here.)
The files inside the commit are not stored like ordinary files on your computer. Instead, they're in a special, read-only, Git-only, compressed and de-duplicated format. This handles the fact that most commits mostly re-use most files from a previous commit. Git actually stores the files' contents only once. When you change the contents, Git will have to store a new copy, but if you're re-using an old file—or even changing a file back—Git will be able to re-use the earlier stored version.
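The de-duplication is easy to see in a scratch repository (directory and file names below are invented for the example). An unchanged file resolves to the same stored object in both commits, while an edited file gets a new one:

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "unchanged" > keep.txt
echo "v1" > edit.txt
git add . && git commit -q -m "first"

echo "v2" > edit.txt
git commit -q -a -m "second"

# commit:path names the stored file object inside that commit.
echo "keep.txt in both commits:"
git rev-parse HEAD~1:keep.txt HEAD:keep.txt
echo "edit.txt in both commits:"
git rev-parse HEAD~1:edit.txt HEAD:edit.txt
```

The two hash IDs printed for keep.txt match because Git stored that content once and both commits refer to it.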
But this means that the files inside a commit are literally only readable by Git itself, and writable by nothing. These committed files, in other words, cannot be used for everyday work. So they aren't. Instead, when you check out a commit, Git extracts the archived files from the commit.
This extraction process is most of what git checkout or git switch is about. Except for that initial git clone, we start with some commit that we have checked out now. Then we pick a different commit to check out, and:
- Git removes the files that came out of the commit we had checked out; and
- Git fills in the files from the commit we are moving to.
This removing-and-filling-in happens in your working tree, which is simply the area in which you do your work. These are the files you can see and edit. But they are not in Git. They were merely extracted from Git, from a commit.
Whenever Git does this extracting step, it keeps track of these files in something that Git calls, variously, the index, or the staging area, or (rarely these days) the cache. These are three names for the same thing. It's really crucially important (for instance, it's where Git gets the list of files to remove when you switch commits), but we won't cover it properly here at all. Just remember that when you run git add, you're creating or updating a copy of a file in this index / staging-area. When you run git commit, Git makes the new commit snapshot from the files that are in this staging area.
Hence, you work in your working tree, and commit from the index / staging-area. Until you commit, though, your files aren't in Git.5 It's the commits that are the storage unit, that store the files.
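A short sketch of "commit from the index", using a scratch repository and a made-up file name. The file is edited again after git add, and the commit still contains only the staged version:

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "hello" > notes.txt
git add notes.txt                 # copy notes.txt into the index
git ls-files --stage              # the index now lists notes.txt

echo "not staged" >> notes.txt    # edit the working tree only
git commit -q -m "snapshot from the index"

git show HEAD:notes.txt           # the committed copy is the staged one
git status --short                # the later edit is still uncommitted
```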
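When you do commit, Git will:
- gather the metadata for the new commit, such as your name and email address, the current date and time, and your log message;
- record the hash ID of the current commit as the new commit's parent;
- freeze the files in the index / staging-area into the new commit's snapshot;
- write all of this out as the new commit, which gets its own new, unique hash ID; and
- store the new commit's hash ID in the current branch name.
This last step adds the new commit to the current branch. This is the special property of branch names. As we add new commits, Git "moves the branch name forward". The new commit links back to what was the final commit before, and now the new commit is the final commit.
What this means is that if we draw out commits, using uppercase letters to stand in for the commit hash IDs, we get a picture that starts out like this: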
... <-F <-G <-H <--master (HEAD)
where H is the last commit on master, that we checked out / extracted. We then work on this commit for a while, in our working tree, and run git add and git commit to make a new commit I, which will point back to existing commit H. Git will then write the new commit's hash ID I into the name master:
... <-F <-G <-H <-I <--master (HEAD)
When we have more than one branch name, we usually start out with two names pointing to the same commit, like this:
...--G--H <-- master (HEAD), new-branch
We then run git checkout new-branch or git switch new-branch to select the right name:
...--G--H <-- master, new-branch (HEAD)
Note that Git cleverly notices that we're moving from commit H to commit H, which means we're not really changing commits. Git therefore doesn't bother removing and replacing any files.6 Then we make our changes and add and commit:
...--G--H   <-- master
         \
          I   <-- new-branch (HEAD)
Note how new commit I still points back to commit H, as before. But this time, we didn't move the name master. We moved the name new-branch.
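The last picture can be reproduced in a scratch repository. This is only a sketch: the commit messages "H" and "I" stand in for the letters in the drawings, and the file name is invented.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "H" > file.txt && git add file.txt && git commit -q -m "H"

# Second name for the same commit; no files need to change.
git checkout -q -b new-branch

# The new commit moves new-branch only; master stays on H.
echo "I" >> file.txt && git commit -q -a -m "I"

git log --oneline --graph --decorate --all
```

The log should show master still on the commit labeled H, with new-branch one commit ahead and linking back to it.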
This linkage from commit to commit is critical in Git. Git doesn't work without it. So we have to start from the correct starting commit, when we make our new commit.
5 Because of Git's internal storage format, git add does put a file's content (but not its name) into Git. The content bytes can be recovered after particular accidents or disasters, for a while. But if you don't commit them, the content bytes eventually get discarded, unless they're duplicates from some existing commit.
6 This optimization lets us create and switch to a new branch name after we've started working, in case we forgot to do that in advance.
1. git clone url: this will copy a Git repository (or rather, its commits) to your computer and get you a new working tree in the new checkout. Then cd (or whatever your change-working-directory command is) into the new clone.
2. git checkout dev, if necessary (or add -b dev to the clone command above). Remember, you're picking the desired starting commit here, via a branch name. If you prefer the new git switch, run git switch dev (or, again, use the -b option during the clone command in step 1).
3. git checkout -b new_dev_branch, to create the new branch name and switch to it. With git switch, the -b changes to -c (create): git switch -c new_dev_branch. There's no rhyme or reason to the -b vs -c here; you just have to memorize it.
4. Run whatever your local command is to copy each file from the place where you have them now, to the working tree here. Or, consider running git rm -r ., to remove all files from your working tree here, then copying all the files from the other location.7
5. git add ., or git add with the name of each new file; git rm for any file you want removed. This updates Git's index AKA staging area, so that you are ready to commit.
6. git commit, to make a new commit and update the name new_dev_branch.
7. git push origin new_dev_branch: this has your Git call up their Git, using the URL stored under the name origin. Your Git hands to their Git any new commits you have, that they don't, that they need: i.e., the commit you made in step 6. Then, having given them this new commit (and any all-new files in it; the de-duplication means you don't have to give them any old files that still match), your Git asks their Git to create the new-to-them branch name new_dev_branch. Since this name is new to them, you'll be allowed to do this, assuming you are allowed to create new names. Their name new_dev_branch will now remember the same "last" commit that your name new_dev_branch remembers, and your Git will create origin/new_dev_branch to remember that their Git remembers this commit as their branch named new_dev_branch.
   This command is likely to fail at the moment, because the remote already has a new_dev_branch from your earlier push and the new commit does not build on it. You will need to pick a different branch name, or else first delete the new_dev_branch name in the remote, or use force-push. See the section on recovering from what you've done so far, below.
8. git branch --set-upstream-to origin/new_dev_branch: this has your Git set the upstream setting of your existing branch, new_dev_branch, to be your origin/new_dev_branch, which is your Git's memory of their branch name new_dev_branch.
You only have to do step 8 once: each of your branch names, in your Git repository, can remember exactly one upstream. Remembering an upstream is optional, but gives you some nice features.
You can make step 7, the git push operation, do step 8 for you by using git push -u origin new_dev_branch. This is just a convenience short-cut. It will run step 8 if and only if the git push part succeeds. Since you only have to set the upstream whenever you want it to change, you only need the -u option once.
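The eight steps can be rehearsed end to end in a sandbox. This is a sketch under assumptions: a local bare repository plays the part of the GitLab remote, old-work is a hypothetical name for the directory that was set aside, and the remote has no new_dev_branch yet, so the push in step 7 is not rejected here.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
sandbox=$(mktemp -d)

# Stand-in for the remote: a bare repository with master and dev.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo "v1" > README && git add README && git commit -q -m "initial"
git checkout -q -b dev
echo "dev" > dev.txt && git add dev.txt && git commit -q -m "dev work"
git checkout -q master
git clone -q --bare "$sandbox/seed" "$sandbox/remote.git"

# The existing, un-versioned work that was put aside.
mkdir "$sandbox/old-work"
echo "local code" > "$sandbox/old-work/app.txt"

git clone -q "$sandbox/remote.git" "$sandbox/my-project"   # step 1
cd "$sandbox/my-project"
git checkout -q dev                                        # step 2
git checkout -q -b new_dev_branch                          # step 3
git rm -q -r .                                             # step 4: clear out,
cp -R "$sandbox/old-work/." .                              #   then copy in
git add .                                                  # step 5
git commit -q -m "Import local work"                       # step 6
git push -q -u origin new_dev_branch                       # steps 7 and 8

git log --oneline --graph new_dev_branch
```

Using git rm -r . before copying means files that exist on dev but not in the old directory are dropped from the new commit; skip it to keep them.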
7 If you use some sort of en-masse file move operation, be careful: the Git repository itself is stored in the .git directory (or folder) at the top of your working tree. This is where Git keeps all its own files, and is where the actual repository is. Your working tree is yours, and you can do whatever you like with it, as long as you keep in mind that some Git commands, like git restore or git reset --hard, will scribble on your working tree files. But Git's repository files are precious to Git, and if you mess with the wrong one, Git will be sad8 and stop working.
8 Don't anthropomorphize computers; they hate that!
You ran git init and it didn't do what you wanted. Here's how to recover.
Here, from your question, is what you ran. I've numbered them so that we can refer to them below.
1. cd my-project
2. git init
3. git fetch
4. git checkout -b new_dev_branch
5. git add .
6. git commit -m "first trial"
7. git remote add origin path/to/remote/gitlab/my-project.git
8. git push -u origin new_dev_branch
Step 1 is pretty simple. In step 2, Git told you:
Initialized empty Git repository in .../my-project/.git/
If it said instead that it re-initialized an existing Git repository, stop here! This is only for the new repository case.
Step 3 will just silently do nothing, because there is no remote to fetch from. So you still have an empty repository:
$ git init
Initialized empty Git repository in ...
$ git fetch
$
I'm not sure why Git doesn't complain here; it really should.
An empty Git repository has no commits and no branches. While it has no branches, you're still on some branch: you are just on a branch that does not exist. It's a peculiar state of affairs, but Git fixes it up when you create your first commit. So, we move on to step 4, which changes the name of the branch that you are on (the one that doesn't exist) from master to new_dev_branch:
$ git checkout -b new_dev_branch
Switched to a new branch 'new_dev_branch'
Step 5 adds all the files in the current directory to Git's index / staging-area, so that they are ready to be committed. This produces no output as long as it works. Step 6 then turns all of these files, plus the added metadata, into the first commit in this new repository. The repository is no longer empty, so now the branch name can exist. The branch name springs into existence, pointing to this one commit. This commit can't point back to any earlier commit, so it just doesn't. Git calls this a root commit:
$ git add .
$ git commit -m "first trial"
[new_dev_branch (root-commit) f342d15] first trial
1 file changed, 1 insertion(+)
create mode 100644 README
In my case I had only the one file, README, that I added. Note the (root-commit); the number here, f342d15, is unique, so yours will be different; the current branch name is new_dev_branch and that branch now exists.
Your step 7 created a remote named origin, storing the URL (path/to/remote/gitlab/my-project.git). This enables the git push command to run. Your step 8 sent your new root commit to the other Git and created a branch named new_dev_branch there. Since your root commit has no parent commit, it does not link into the history of existing commits.
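What "does not link into the history" means can be checked with a sketch like the following. To keep it to one repository, it creates the second, unrelated root commit with git checkout --orphan, which is a stand-in for the separate git init in the question and not what happened there; the branch and file names are invented.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "a" > a.txt && git add a.txt && git commit -q -m "existing history"

# Start a second history with its own root commit.
git checkout -q --orphan unrelated
git rm -q -rf .
echo "b" > b.txt && git add b.txt && git commit -q -m "new root commit"

# A root commit is listed with no parent hash IDs after its own.
git rev-list --parents -n 1 unrelated

# With no shared commits, there is no merge base to find.
if git merge-base master unrelated >/dev/null; then
  echo "histories share a commit"
else
  echo "no common ancestor"
fi
```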
Given that you don't really want the other Git repository to use this commit as new_dev_branch, you should probably first delete new_dev_branch from the repository over at origin:
git push --delete origin new_dev_branch
This has your Git call up their Git and ask them to delete their branch named new_dev_branch.
Let's rename the (local) branch, too, to error, so that we have it out of the way:
$ git branch -m error
$ git branch
* error
Next, we need to obtain, from the other Git repository, all of their commits. That's easy to do: just run git fetch now! I haven't set up another repository so I won't show any output here:
git fetch
Your Git will call up their Git, have them list out all their branch names and corresponding commit hash IDs, and use those to obtain, from their Git, all their commits, and create, in your own local repository, remote-tracking names for each of their branch names. When this completes, run git branch -a or git branch -r to see all these names.
(Note: for no apparent reason, git branch -r lists the remote-tracking names as origin/master, origin/dev, and so on, but git branch -a lists them as remotes/origin/master, remotes/origin/dev, and so on. This has to do with the name-spaces I mentioned earlier. There's still no reason for git branch -a to do this, though.)
We now want to check out some existing commit, as found via some remote-tracking name. This will remove from your working tree all the files you've committed into your root commit, on your branch currently named error. That's OK because they are all safely stored in that commit.
Note that we don't have a branch named dev, nor one named master. And yet, we can run:
git checkout dev # or git switch dev
or the same with master. This invokes what Git calls DWIM mode, or (in recent versions of git checkout and git switch) the --guess mode. This mode is enabled by default; using --no-guess turns it off.
What this mode does is simple-ish: if there is no branch with the name we give, before saying something like error: no such branch: dev, Git will poke around in our remote-tracking names. If there's one that sufficiently resembles the name we asked for, Git will act as though we ran:
git checkout -b dev origin/dev
or:
git switch -c dev origin/dev
That is, Git will create our dev based on our origin/dev that's based on origin's dev. This is the same trick that git clone pulls off during the initial clone: if they have a dev, we probably want our dev to match. That's the guess, or the "do what I mean": DWIM.
You can just be explicit and run:
git checkout -t origin/dev
or:
git switch -t origin/dev
The -t or --track option (either spelling is allowed) means "here's the remote-tracking name; create a branch based on that name, and set the upstream of the new branch". The upstream setting happens when the DWIM / --guess code does its thing too, so -t is just an explicit variant.9
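Both routes can be tried in a sandbox clone. The repository paths and the extra branch name topic are hypothetical; the sketch only checks that each route leaves the new local branch with the matching upstream.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
sandbox=$(mktemp -d)

git init -q "$sandbox/theirs"
cd "$sandbox/theirs"
git symbolic-ref HEAD refs/heads/master
echo "v1" > README && git add README && git commit -q -m "initial"
git branch dev
git branch topic

git clone -q "$sandbox/theirs" "$sandbox/mine"
cd "$sandbox/mine"

# Guess (DWIM) mode: there is no local dev yet, but origin/dev exists.
git checkout -q dev
git rev-parse --abbrev-ref 'dev@{upstream}'

# Explicit variant: name the remote-tracking name with -t.
git checkout -q -t origin/topic
git rev-parse --abbrev-ref 'topic@{upstream}'
```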
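Now that we have a dev, and its files are checked out (I'm assuming that you'd like your new_dev_branch to start from this commit), we're ready to make a new commit from all the files that are in your root commit on branch error. Things get a little tricky here.
What we need to do now, having first created and switched to the new branch name with git checkout -b new_dev_branch (or git switch -c new_dev_branch), is:
- remove any files that are not in the commit on error;
- extract all the files that are in the commit on error; and
- make a new commit.
There is a simple and straightforward way to do this: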
git rm -r .
git checkout error -- .
git commit
This starts by removing all files from both Git's index and our working tree. We only have to do this if there are some files in the current commit that won't be in the new commit.
We then use the "dangerous" mode of git checkout, the one that can wipe out all our files. This isn't really dangerous at this point, since we already did that. We replace these files with those from the commit named by branch error. The -- . tells git checkout to use all files in the specified commit, just as the . in git rm -r . told Git to remove all files that are in your index.
There are two ways to avoid the git rm -r step. One of them is rather deeply magical as it uses one of Git's plumbing commands. We simply run:
git read-tree -m -u error
I don't really like this one because it's overly magic. Instead, we can use:
git restore -s error -SW -- .
which is more explicit: we ask git restore to restore all the files to exactly the way they look in commit error, removing any files that aren't in that commit. This is different from git checkout error -- ., which won't remove files that aren't in that commit.10 The -SW options tell Git to write both the index and the working tree.
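The difference between the two commands shows up as soon as the current commit has a file that error lacks. A sketch, assuming Git 2.23 or later and using invented file names (stale.txt is the file that exists only in the current commit):

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "keep" > keep.txt && git add . && git commit -q -m "base"
git branch error                      # error holds keep.txt only
echo "stale" > stale.txt && git add . && git commit -q -m "adds stale.txt"

# git checkout <commit> -- . copies files in, but removes nothing.
git checkout -q error -- .
overlay_files=$(git ls-files | tr '\n' ' ')

# git restore -SW also removes files that are not in the source commit.
git restore -s error -SW -- .
no_overlay_files=$(git ls-files | tr '\n' ' ')

echo "after git checkout error -- . : $overlay_files"
echo "after git restore -SW         : $no_overlay_files"
```

After the restore, the index matches the snapshot in error exactly, which is what the following git commit needs.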
Having used git restore or git read-tree, or the two-command remove-and-replace git rm and git checkout sequence, to fix up our index and working tree, we now simply run git commit. This makes a new commit, on the current branch, as usual. The files in this new commit exactly match the files in the (lone, root) commit on branch error.
We can now git push -u origin new_dev_branch to (re-)create the new_dev_branch branch over on origin. It contains this commit and all the earlier commits, so it is the desired history.
There's one last trick we can use to make this shorter. We ran:
git checkout dev
to create dev from origin/dev, then:
git checkout -b new_dev_branch
to create new_dev_branch with the same commit hash ID, and switch to new_dev_branch, with no upstream set for new_dev_branch. But we can combine these at the cost of not creating a local dev branch in the first place, a trivial cost since we can always create it later. All we need to do is create new_dev_branch, pointing to the right commit, and to do that, we can use:
git checkout -b new_dev_branch --no-track origin/dev
The --no-track keeps Git from setting origin/dev as the upstream. Actually setting that would be mostly harmless (we'll replace it with git push -u), but it's perhaps nicer and safer to avoid it. Of course, if you have git restore, you also have git switch and we probably should use that, so we want:
git switch -c new_dev_branch --no-track origin/dev
here, as our shortcut.
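A quick sandbox check of what --no-track changes (hypothetical paths again). It uses the git checkout spelling so that it also runs on Git older than 2.23; the git switch -c form takes the same option.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
sandbox=$(mktemp -d)

git init -q "$sandbox/theirs"
cd "$sandbox/theirs"
git symbolic-ref HEAD refs/heads/master
echo "v1" > README && git add README && git commit -q -m "initial"
git branch dev

git clone -q "$sandbox/theirs" "$sandbox/mine"
cd "$sandbox/mine"

# New branch at the same commit as origin/dev, with no upstream recorded.
git checkout -q -b new_dev_branch --no-track origin/dev

if git config --get branch.new_dev_branch.merge >/dev/null; then
  echo "upstream is set"
else
  echo "no upstream configured"
fi
```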
9 The profusion of ways to do this is mostly historical accident. However, there's a bit of a problem if you have two or more remotes, all of which have a dev. Suppose you have r1/dev and r2/dev and you run git checkout dev while you don't have a dev. Which remote-tracking name, r1/dev or r2/dev, should Git use? Using git checkout -t r2/dev tells Git which one to use, so the -t option works when the DWIM mode doesn't.
10 Internally, Git calls this --overlay or --no-overlay. In older versions of Git, git checkout always runs in --overlay mode; git restore runs in --no-overlay mode by default. In modern Git these options are exposed. See the documentation for further details.
git clone
This wouldn't be complete without expanding out git clone to its constituent parts. You can replicate what git clone does using six commands, one of them a plain shell (or CLI) command and the remainder Git commands run in the new directory:
1. mkdir clone && cd clone, or your CLI equivalent: make the new directory.
2. git init: create an empty Git repository here.
3. git remote add origin url: add a remote named origin to store the given URL. Use some other name if you gave the -o option.
4. Run whatever git config commands you need, if you gave -c options to git clone, or options like --single-branch (though you can put the single-branch-ness in the git remote add instead if you prefer).
5. git fetch or git fetch remote (I'm not sure if you have to spell out the remote name if it's not origin, but you might).
6. git checkout branch: create the branch you chose with -b, or the recommended branch (learning the recommendation is tricky but you can probably assume main or master in most cases). If you used -n, skip this step.
Running git clone just does these six steps, with a bit of fanciness: it discovers the recommended branch name; it figures out the name of the directory to make based on the URL, if you don't point it to some existing empty directory; and it will remove the partially-built clone if something goes wrong. But it's really just hiding the fact that it runs git fetch and git checkout for you.
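The six steps can be run by hand against a throwaway repository to confirm they end up in the same place as git clone. In this sketch the paths are hypothetical, the branch is assumed to be master, and step 4 is empty because no extra configuration is wanted.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
sandbox=$(mktemp -d)

# The repository to be "cloned".
git init -q "$sandbox/theirs"
cd "$sandbox/theirs"
git symbolic-ref HEAD refs/heads/master
echo "v1" > README && git add README && git commit -q -m "initial"

mkdir "$sandbox/clone" && cd "$sandbox/clone"   # step 1
git init -q                                     # step 2
git remote add origin "$sandbox/theirs"         # step 3
                                                # step 4: nothing to configure
git fetch -q origin                             # step 5
git checkout -q master                          # step 6

git log --oneline --decorate -n 1
```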
Upvotes: 1
Reputation: 24146
You can pull the remote master branch into your new_dev_branch branch:
$ git checkout new_dev_branch
$ git pull origin master
Now, check if all is ok then push to remote:
$ git push origin new_dev_branch
Another way is the following.
Check out the new_dev_branch branch and copy the latest commit hash:
$ git checkout new_dev_branch
$ git log
# copy the latest commit_hash
Create a new branch (e.g. new_dev_branch_2) from the remote's master branch. Now, take (cherry-pick) your commit:
$ git checkout -b new_dev_branch_2 origin/master
$ git cherry-pick <commit_hash>
Now, check that all is OK, then push the new_dev_branch_2 branch to the remote:
$ git push origin new_dev_branch_2
Upvotes: 1