Reputation: 3447
I accidentally ran git add --all earlier and then made several commits, so Git is now trying to add several large files that should have been ignored.
Now, on any commit, it shows 'this exceeds GitHub's file size limit of 100.00 MB'. I tried git reset, but it shows Your branch is ahead of 'origin/master' by 2 commits. How can I get Git back to normal again? Many thanks.
Upvotes: 1
Views: 210
Reputation: 490078
See the various answers at How to undo last commit(s) in Git? Or, jump down to the last section, the "final example".
The bug in your question is that you did not describe your situation properly. You not only added various files, you also committed them. This means that they are now stored permanently in your repository.
The word permanent is a little odd in Git. It's true that commits are fixed, unchanging, and remain in your repository for a good long time no matter what you do. The default is that they remain there forever, part of the history of every commit ever made by anyone. Git never deletes any commit: each new commit retains the identity of its previous commit—called its parent—and this backwards chain of newer-to-older commits is the history stored in your repository.
The commits themselves are actually stored by hash IDs, those big ugly things Git shows you (sometimes abbreviated) like a9b307c and so on. Those hash IDs are the "true names" of each commit. They appear random, and are basically impossible for people to remember, so what we do is have Git attach a name, like master, to some particular commit. We call this commit the tip of the branch. That commit has, inside itself, the hash ID of the previous branch tip.
Consider this drawing of a repository with just three commits, all on master. I'm going to use one-letter names for each commit instead of hash IDs:
A <-- B <-- C <-- master
The name master stores the hash ID badf00d or whatever, which is the actual hash ID of commit C. We say that master points to C. Meanwhile C stores the hash ID for B; we say that C points to B. Commit B stores the ID for A, so B points to A.
Since A was the very first commit, it can't point to any earlier commit. So it just doesn't. We call A a root commit. Any Git repository must have a first commit (well, any Git repository that has any commits at all), so in general there's always a root commit. That root commit is how Git can stop traversing backwards.
To add a new commit, Git simply writes out the commit, with all its files—every commit stores every file that goes with that commit—and makes it point back to the current branch's tip:
A <-- B <-- C <-- D
Commit D gets a new, unique hash, which is based on everything in D (including the hash ID of C, and your name and email address and the current time). But, and here's the secret of branch names in Git, Git now writes the new commit's hash ID into the branch name, so that the name master now points to commit D:
A <-- B <-- C <-- D <-- master
This is how branches grow. Once the commit is made, the name points to it, and the commit itself is permanently stored in your repository.
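To see the branch name move, here is a minimal sketch in a throwaway repository. The temporary directory, the demo identity, and the empty commits are my own assumptions for illustration; they are not part of your repository.

```shell
# Throwaway sandbox repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Empty commits keep the focus on the pointers rather than on file contents.
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git commit -q --allow-empty -m C
tip_before=$(git rev-parse master)        # hash ID of C, the current tip

git commit -q --allow-empty -m D
tip_after=$(git rev-parse master)         # the name master now holds D's hash ID
parent_of_tip=$(git rev-parse master~1)   # the parent stored inside D

echo "old tip:       $tip_before"
echo "new tip:       $tip_after"
echo "parent of tip: $parent_of_tip"
```

The parent of the new tip should be the old tip: D points back to C, and master points to D.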
This sounds like bad news: you've committed those files, so you are now stuck with them. And, it is bad news, but it's not fatally bad.
Commits can, however, eventually be forgotten, if you work hard at it. Moreover, when you go to transfer your commits to someone else's repository (i.e., to git push your new commits), Git will only push those commits that are reachable from the branch or branches¹ you are pushing.
Hence, what you need to do here is to "forget" some of your commits. You do this by telling Git to re-point the branch name. For instance, let's say commit D itself is the problem, and we just want to get rid of it. Suppose we could tell Git: "hey, make master point to C again", like this:
A--B--C   <-- master
       \
        D
Commit D is still there (it's kind of permanent!) but it's no longer reachable from the name master, because Git starts at the commit hash ID that master names, and works backwards. That means Git doesn't "see" commit D: it's no longer on branch master.
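A small sketch of that A--B--C / D picture, again in a throwaway repository (the sandbox, the demo identity, and the empty commits are assumptions for illustration). It uses git reset --hard, described in the git reset section of this answer, to re-point master, then checks that D still exists as an object but is no longer reachable from master.

```shell
# Throwaway sandbox repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

for name in A B C D; do
    git commit -q --allow-empty -m "$name"
done
d=$(git rev-parse HEAD)                 # remember D's hash ID before moving the name

git reset -q --hard HEAD~1              # make master point to C again

kind=$(git cat-file -t "$d")            # D is still stored in the repository
count=$(git rev-list --count master)    # commits reachable from master: A, B, C

echo "object type of D: $kind"
echo "commits reachable from master: $count"
```

Because we saved D's hash ID in the variable d, we can still look the commit up directly; walking backwards from master simply never arrives at it.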
¹ You can push multiple branch names all at once. This used to be the default action for git push, in fact, although that proved to be error-prone, so now the default is to push only the current branch name.
It's sometimes important, in Git, to distinguish carefully between the names, like master and develop, and what I call "DAGlets": portions of the commit graph, like those we drew above. The commit graph is a Directed Acyclic Graph or DAG. Both git fetch and git push take branch names, or any other kind of name. They call up another Git, at the other end of an Internet-phone-call, and talk with it: they give each other some of these names, then turn those into the appropriate hash IDs. They then decide which commits (and other Git objects) to send or receive based on the hash IDs, and those parent links I mentioned earlier. Because hash IDs are based solely on each commit's contents, if your Git and their Git have the same commit, those two commits have the same hash ID.
git reset
The main way you move branch pointers around, in Git, is with git reset. Unfortunately, git reset is a complicated command.
Git has another important pair of features: while you make commits with git commit, you make them from something. The "something" itself is actually Git's index. The index is mainly where you build the next commit.
When you run git add, Git takes files from your work-tree (the place where you do your work, which has the files in the form you can actually use them) and copies them to the index. When you run git add --all, Git takes all the work-tree files² and adds them to the index.
At this point, they're in the index, but not committed, so they're not permanent. You can git reset them back out of the index. That's one of git reset's jobs: to re-set the index.
But those files are also in your work-tree. The work-tree versions are also not committed, so they are not permanent. You can git reset them too, and that is another of git reset's jobs.
And, of course, git reset can move a branch name, which is what we need for this particular case, because you did commit the files, so now they are stored permanently as part of a new commit. Moving branch names is the third of git reset's three main jobs.³ These three jobs are in a sort of order-of-importance:
1. git reset always moves (re-sets) the branch name;
2. git reset sometimes resets the index; and
3. git reset occasionally resets the work-tree.
You control these with --soft (which does job #1 only), --mixed (which does jobs #1 and #2), and --hard (which does all three jobs).
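Here is a rough sketch of the three modes side by side, in a throwaway repository. The file names, the demo identity, and the make_commit helper are hypothetical, made up just for this demonstration.

```shell
# Throwaway sandbox repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo base > base.txt
git add base.txt
git commit -q -m base
base=$(git rev-parse HEAD)

# Hypothetical helper for this demo: commit extra.txt on top of the current tip.
make_commit() {
    echo data > extra.txt
    git add extra.txt
    git commit -q -m "add extra.txt"
}

# --soft: job #1 only. The branch moves; index and work-tree keep extra.txt.
make_commit
git reset -q --soft "$base"
soft_staged=$(git diff --cached --name-only)

# --mixed: jobs #1 and #2. The index is reset; the work-tree copy survives, untracked.
make_commit
git reset -q --mixed "$base"
mixed_staged=$(git diff --cached --name-only)
mixed_untracked=$(git ls-files --others --exclude-standard)

# --hard: all three jobs. The work-tree copy is removed as well.
make_commit
git reset -q --hard "$base"
if [ -e extra.txt ]; then hard_file=present; else hard_file=gone; fi

echo "soft:  staged='$soft_staged'"
echo "mixed: staged='$mixed_staged' untracked='$mixed_untracked'"
echo "hard:  extra.txt is $hard_file"
```

In all three cases the branch name ends up back at the base commit; the modes differ only in what happens to the index and the work-tree.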
You must choose whether to keep your index and/or work-tree when you do your git reset that moves your branch. If you use git reset --hard, Git will do all three things. This is OK as long as you are ready to lose the temporary stuff in your index and work-tree.
The commits are permanent, although once you have forgotten their hash IDs, they will be hard to find. So it's OK to reset those away: you can get them back again. Moreover, you can save the hash ID with a new name, such as a new branch name, and then you can get them all back quite easily. Let's see one final example.
² All, that is, except files that (a) are not already in the index and (b) are listed in a .gitignore or similar "don't automatically add" file.
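Since the original problem was large files that should have been ignored, here is a minimal sketch of this footnote in action. The *.bin pattern and the file names are assumptions for illustration; substitute whatever matches your large files.

```shell
# Throwaway sandbox repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

printf '*.bin\n' > .gitignore             # hypothetical pattern for the large files
echo notes > notes.txt
echo pretend-this-is-huge > big.bin

git add --all                             # skips big.bin: ignored and not yet in the index

ignored=$(git check-ignore big.bin)       # prints the path if an ignore rule matches it
staged_big=$(git ls-files big.bin)        # empty: big.bin never reached the index
staged_notes=$(git ls-files notes.txt)    # notes.txt was added normally

echo "ignored:          $ignored"
echo "in index (big):   '$staged_big'"
echo "in index (notes): '$staged_notes'"
```

Note condition (a): a .gitignore entry has no effect on a file that is already in the index, which is why the ignore rule has to be in place before the git add --all.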
³ The git reset command can also do several more-specific things, in which case it can stop doing some of its main jobs, but we won't mention them here.
Let's say that you are on master and have made many commits, not just one, that have too many files in them. You would like to remember (save) your work, but also throw out the index and work-tree and get master back in sync with your "upstream" repository, the one you cloned from, which you are having your Git call origin.
Your commit graph has some stuff in it that looks like this:
...--o--o--o   <-- origin/master
            \
             X--o--o--o--Y   <-- master (HEAD)
You are on your master branch, and it has all the bottom-row commits on it, plus all the top-row ones that both you and the other Git at origin share.
You made a mistake in the very first commit, the one marked X. So you wish to reset master all the way back to point to the same commit as origin/master, and throw out your current index and work-tree, because they're "clean" (committed and match the tip of your master, which is commit Y). You can run git reset --hard origin/master to do this, but then you'll forget all those hash IDs from X through Y inclusive.
So, before resetting, you can simply make a new branch that points to commit Y:
git branch save-my-mistake
Now your picture looks like this:
...--o--o--o   <-- origin/master
            \
             X--o--o--o--Y   <-- master (HEAD), save-my-mistake
Now is the time to run:
git reset --hard origin/master
which moves the current branch (still master) to point to the same commit as origin/master, and also re-sets your index and work-tree to match that commit:
...--o--o--o   <-- master (HEAD), origin/master
            \
             X--o--o--o--Y   <-- save-my-mistake
And now you can get anything you want out of the save-my-mistake branch at any time, because that branch name remembers your commits for you.
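The whole sequence can be rehearsed safely before you try it on the real repository. This is only a sketch: it fakes origin with a local bare repository, and the file names (huge.bin, work.txt) and the demo identity are assumptions. The last step, git checkout save-my-mistake -- work.txt, is one way (not the only one) to pull a single wanted file back out of the saved commits.

```shell
# Throwaway sandbox: a local bare repository stands in for origin.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare origin.git
git clone -q origin.git work 2>/dev/null   # the "empty repository" warning is expected
cd work || exit 1
git symbolic-ref HEAD refs/heads/master

# One commit that both repositories share.
echo ok > ok.txt && git add ok.txt && git commit -q -m "shared commit"
git push -q origin master                  # also updates origin/master locally

# The mistake: commit X drags in a large file, and real work follows in Y.
echo big > huge.bin && git add --all && git commit -q -m X
echo more > work.txt && git add --all && git commit -q -m Y
y=$(git rev-parse HEAD)

git branch save-my-mistake                 # remember commit Y under a second name
git reset -q --hard origin/master          # move master; reset index and work-tree

master_tip=$(git rev-parse master)
origin_tip=$(git rev-parse origin/master)
saved=$(git rev-parse save-my-mistake)

# Bring back just the file we want from the saved commits.
git checkout -q save-my-mistake -- work.txt

echo "master:          $master_tip"
echo "origin/master:   $origin_tip"
echo "save-my-mistake: $saved"
```

After this, work.txt is back in the index and work-tree ready to be committed again, while huge.bin stays out of master's history.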
Eventually, when you are done with it, you can simply delete that branch. Those commits will, sooner or later,⁴ be "garbage collected" and truly vanish from your repository. Until that time, you won't see them anymore, as the name by which you (and Git) can find the incomprehensible hash IDs is gone.
⁴ The expiration time is rather tricky. It depends in part on reflog entries, which expire in 90 days for "reachable" commits and 30 days for "unreachable" commits, with reachable and unreachable defined in terms of the reference's current value. Once the reflog entry itself expires, the underlying Git objects become eligible for this garbage collection if they are globally unreachable, i.e., unreachable from any name. They still get a 14-day grace period from the time they were created, although in most cases, if the 30- or 90-day time period has expired already, 14 days was a long time ago. However, deleting a branch name deletes all its reflog entries, which may expose younger commits, which then get whatever is left of their 14-day period.
In any case, all of this is driven from git gc --auto, which other Git commands run when they think there's some good reason to do it. So until one of those Git commands runs git gc --auto, these expired objects may stick around. That fits pretty well with Git's informal name, "the Borg of version control": it sticks everything into its collective, but it may destroy your world. :-)
Upvotes: 1
Reputation: 1984
You could go back 2 commits by running git reset --hard HEAD~2. Note that --hard also discards any uncommitted changes in your index and work-tree.
Upvotes: 0
Reputation: 3031
If you have already committed, one option is to delete your folder and clone the repository again. New changes will be lost! Only do this if you are working alone on this repo and haven't changed too much.
Upvotes: 0