Reputation: 3447

how to un-do git add --all followed by multiple commits

I accidentally used git add --all earlier, then made several commits, and it's attempting to add several large files that should be ignored.

Now upon any commit, it shows 'this exceeds GitHub's file size limit of 100.00 MB'. I tried git --reset but it shows "Your branch is ahead of 'origin/master' by 2 commits". How do I get Git back to normal again? Many thanks.

Upvotes: 1

Answers (4)

torek

Reputation: 490078

TL;DR:

See the various answers at How to undo last commit(s) in Git? Or, jump down to the last section, the "final example".

Explanation

The bug in your question is that you did not describe your situation properly. You not only added various files, you also committed them. This means that they are now stored permanently in your repository.

The word permanent is a little odd in Git. It's true that commits are fixed, unchanging, and remain in your repository for a good long time no matter what you do. The default is that they remain there forever, part of the history of every commit ever made by anyone. Git never deletes any commit: each new commit retains the identity of its previous commit—called its parent—and this backwards chain of newer-to-older commits is the history stored in your repository.

The commits themselves are actually stored by hash IDs, those big ugly things Git shows you (sometimes abbreviated) like a9b307c and so on. Those hash IDs are the "true names" of each commit. They appear random, and are basically impossible for people to remember, so what we do is have Git attach a name, like master, to some particular commit. We call this commit the tip of the branch. That commit has, inside itself, the hash ID of the previous branch tip.

Consider this drawing of a repository with just three commits, all on master. I'm going to use one-letter names for each commit instead of hash IDs:

A <--B <--C   <--master

The name master stores the hash ID badf00d or whatever, which is the actual hash ID of commit C. We say that master points to C. Meanwhile C stores the hash ID for B; we say that C points to B. Commit B stores the ID for A, so B points to A.

Since A was the very first commit, it can't point to any earlier commit. So it just doesn't. We call A a root commit. Any Git repository must have a first commit—well, any Git repository that has any commits at all—so in general there's always a root commit. That root commit is how Git can stop traversing backwards.

To add a new commit, Git simply writes out the commit, with all its files—every commit stores every file that goes with that commit—and makes it point back to the current branch's tip:

A <--B <--C <--D

Commit D gets a new, unique hash, which is based on everything in D (including the hash ID of C, and your name and email address and the current time). But—here's the secret of branch names in Git—Git now writes the new commit's hash ID into the branch name, so that the name master now points to commit D:

A <--B <--C <--D   <--master

This is how branches grow. Once the commit is made, the name points to it, and the commit itself is permanently stored in your repository.
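If you want to watch this happen, here is a minimal sketch in a throwaway repository. The temporary directory, the file name file.txt, the commit messages, and the "Example" identity are all made up for illustration; only the git commands themselves matter.

```shell
# Scratch repository; names and paths here are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the branch master regardless of local defaults
git config user.name "Example" && git config user.email "example@example.com"

# Make commits A, B and C.
for name in A B C; do
    echo "$name" > file.txt
    git add file.txt
    git commit -q -m "commit $name"
done

tip_before=$(git rev-parse master)            # hash ID of C

# Make commit D; Git writes D's hash ID into the name master.
echo D > file.txt
git commit -q -a -m "commit D"

tip_after=$(git rev-parse master)             # master now names D
parent_of_tip=$(git rev-parse master^)        # D points back to the old tip, C
root=$(git rev-list --max-parents=0 master)   # A, the commit with no parent

git log --oneline master
```

Comparing $parent_of_tip with $tip_before shows that the new commit points back at the old tip, and $root is the one commit where the backwards walk stops.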

The bad news

This sounds like bad news: you've committed those files, so you are now stuck with them. And, it is bad news, but it's not fatally bad.

The good news

Commits can, however, eventually be forgotten, if you work hard at it. Moreover, when you go to transfer your commits to someone else's repository—i.e., to git push your new commits—Git will only push those commits that are reachable from the branch or branches¹ you are pushing.

Hence, what you need to do here is to "forget" some of your commits. You do this by telling Git to re-point the branch name. For instance, let's say commit D itself is the problem, and we just want to get rid of it. Suppose we could tell Git: "hey, make master point to C again"—like this:

A--B--C   <-- master
       \
        D

Commit D is still there (it's kind of permanent!) but it's no longer reachable from the name master, because Git starts at the commit hash ID that master names, and works backwards. That means Git doesn't "see" commit D: it's no longer on branch master.
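A small sketch of that situation, again in a scratch repository with made-up names. It uses git reset to re-point master (that command is described further down) and then checks two things: D is no longer found by walking back from master, and the D object itself still exists.

```shell
# Scratch repository; names and paths here are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

for name in A B C D; do
    echo "$name" > file.txt
    git add file.txt
    git commit -q -m "commit $name"
done

d_hash=$(git rev-parse master)     # remember D's hash ID before moving the name
git reset -q --hard master~1       # make master point to C again

# Count how often D's hash appears when walking back from master (expect 0).
on_master=$(git rev-list master | grep -c "$d_hash")
# Ask Git what kind of object D's hash names (the object is still stored).
d_type=$(git cat-file -t "$d_hash")

echo "D reachable from master: $on_master time(s); object type: $d_type"
```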

¹You can push multiple branch names all at once. This used to be the default action for git push, in fact, although that proved to be error-prone, so now the default is to push only the current branch name.

It's sometimes important, in Git, to distinguish carefully between the names, like master and develop, and what I call "DAGlets": portions of the commit graph, like those we drew above. The commit graph is a Directed Acyclic Graph or DAG. Both git fetch and git push take branch names, or any other kind of name. They call up another Git, at the other end of an Internet-phone-call, and talk with it: they give each other some of these names, then turn those into the appropriate hash IDs. They then decide which commits (and other Git objects) to send or receive based on the hash IDs, and those parent links I mentioned earlier. Because hash IDs are based solely on each commit's contents, if your Git and their Git have the same commit, those two commits have the same hash ID.
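To see the "same commit, same hash ID" point for yourself, a rough sketch: make one commit in a scratch repository, clone it, and compare the hash ID each side reports. The directory names theirs and mine and the identity are assumptions for the demo.

```shell
# Two scratch repositories; all names here are hypothetical.
work=$(mktemp -d)
git init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/master
git -C "$work/theirs" config user.name "Example"
git -C "$work/theirs" config user.email "example@example.com"

echo hello > "$work/theirs/file.txt"
git -C "$work/theirs" add file.txt
git -C "$work/theirs" commit -q -m "shared commit"

# Cloning copies the commit; the clone remembers their master as origin/master.
git clone -q "$work/theirs" "$work/mine"

theirs_hash=$(git -C "$work/theirs" rev-parse master)
mine_hash=$(git -C "$work/mine" rev-parse origin/master)

echo "theirs: $theirs_hash"
echo "mine:   $mine_hash"
```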

What to know about `git reset`

The main way you move branch pointers around, in Git, is with git reset. Unfortunately, git reset is a complicated command.

Git has another important pair of features: while you make commits with git commit, you make them from something. The "something" itself is actually Git's index. The index is mainly where you build the next commit.

When you run git add, Git takes files from your work-tree—the place where you do your work, which has the files in the form you can actually use them—and copies them to the index. When you run git add --all, Git takes all the work-tree files² and adds them to the index.

At this point, they're in the index, but not committed, so they're not permanent. You can git reset them back out of the index. That's one of git reset's jobs: to re-set the index.
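For the not-yet-committed case, a minimal sketch looks like this. The file big.bin stands in for one of your large files and is purely hypothetical, as are the other names.

```shell
# Scratch repository; names and paths here are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
echo base > README.txt
git add README.txt
git commit -q -m "initial commit"

echo "pretend this is huge" > big.bin             # stand-in for a large file
git add --all                                     # copies big.bin into the index
staged_before=$(git diff --cached --name-only)    # what the next commit would add

git reset -q                                      # re-set the index to match HEAD
staged_after=$(git diff --cached --name-only)     # nothing staged any more

echo "before: [$staged_before] after: [$staged_after]"
```

The file itself is untouched in the work-tree; only the index copy is gone.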

But those files are also in your work-tree. The work-tree versions are also not committed, so they are not permanent. You can git reset them too, and that is another of git reset's jobs.

And, of course, git reset can move a branch name, which is what we need for this particular case, because you did commit the files, so now they are stored permanently as part of a new commit. Moving branch names is the third of git reset's main jobs that we have met here,³ but it is the one git reset does first. Listed in a sort of order-of-importance, the three jobs are:

1. git reset always moves (re-sets) the branch name;
2. git reset sometimes resets the index; and
3. git reset occasionally resets the work-tree.

You control these with --soft (which does job #1 only), --mixed (which does jobs #1 and #2), and --hard (which does all three jobs).
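Here is a sketch that tries all three options on the same two-commit scratch repository and prints what the branch tip, the index, and the work-tree hold afterwards. The report helper, the file name, and the commit messages are invented for this demo.

```shell
# Scratch repository; names and paths here are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
echo one > file.txt && git add file.txt && git commit -q -m "first"
echo two > file.txt && git commit -q -a -m "second"
second=$(git rev-parse HEAD)

# Print what the branch tip, the index, and the work-tree each contain.
report() {
    printf '%s: branch=%s index=%s work-tree=%s\n' "$1" \
        "$(git log -1 --format=%s)" \
        "$(git show :file.txt)" \
        "$(cat file.txt)"
}

summary=$(
    for mode in soft mixed hard; do
        git reset -q --hard "$second"    # start each round from the same state
        git reset -q "--$mode" HEAD~1    # move the branch back one commit
        report "$mode"
    done
)
echo "$summary"
```

With --soft only the branch moves back to "first"; with --mixed the index follows; with --hard the work-tree file is rewritten too.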

You must choose whether to keep your index and/or work-tree when you do your git reset that moves your branch. If you use git reset --hard, Git will do all three things. This is OK as long as you are ready to lose the temporary stuff in your index and work-tree.

The commits are permanent, although once you have forgotten their hash IDs, they will be hard to find. So it's OK to reset those away: you can get them back again. Moreover, you can save the hash ID with a new name, such as a new branch name, and then you can get them all back quite easily. Let's see one final example.

²All, that is, except files that (a) are not already in the index and (b) are listed in a .gitignore or similar "don't automatically add" file.

³The git reset command can also do several more-specific things, in which case it can stop doing some of its main jobs, but we won't mention them here.
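Footnote 2 is the part that would have prevented the original problem, so here is a small sketch of it. The pattern *.bin and the file names are assumptions; substitute whatever matches your large files.

```shell
# Scratch repository; names and patterns here are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q

echo "*.bin" > .gitignore                # tell Git not to add these automatically
echo "pretend this is huge" > big.bin    # stand-in for a large file
echo "source" > main.txt

git add --all                            # skips big.bin because it is ignored and not yet in the index
staged=$(git ls-files)                   # list the files now in the index

echo "$staged"
```

Note that this only helps for files that are not already in the index; a file that was committed earlier stays tracked even after you list it in .gitignore.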

Final example

Let's say that you are on master and have made many commits, not just one, that have too many files in them. You would like to remember (save) your work, but also throw out the index and work-tree and get master back in sync with your "upstream" repository, the one you cloned from, which you are having your Git call origin.

Your commit graph has some stuff in it that looks like this:

...--o--o--o   <-- origin/master
            \
             X--o--o--o--Y   <-- master (HEAD)

You are on your master branch, and it has all the bottom-row commits on it, plus all the top-row ones that both you and the other Git at origin share.

You made a mistake in the very first commit, the one marked X. So you wish to reset master all the way back to point to the same commit as origin/master, and throw out your current index and work-tree, because they're "clean" (committed and match the tip of your master, which is commit Y). You can run git reset --hard to do this, but then you'll forget all those hash IDs from X through Y inclusive.

So, you can simply make a new branch that points to commit Y:

git branch save-my-mistake

Now your picture looks like this:

...--o--o--o   <-- origin/master
            \
             X--o--o--o--Y   <-- master (HEAD), save-my-mistake

Now is the time to run:

git reset --hard origin/master

which moves the current branch—still master—to point to the same commit as origin/master, and also re-sets your index and work-tree to match that commit:

...--o--o--o   <-- master (HEAD), origin/master
            \
             X--o--o--o--Y   <-- save-my-mistake

And now you can get anything you want out of the save-my-mistake branch at any time, because that branch name remembers your commits for you.
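Putting the whole final example together, as a sketch you can run in a scratch directory. The upstream and clone directories, big.bin, notes.txt, and the commit messages are all hypothetical; the last two commands show one possible way of pulling a wanted file back out of save-my-mistake.

```shell
# Scratch "upstream" repository plus a clone of it; all names are hypothetical.
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
echo base > README.txt && git add README.txt && git commit -q -m "shared history"

git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git config user.name "Example" && git config user.email "example@example.com"

# Commit X adds the unwanted file; commit Y is later work on top of it.
echo "pretend this is huge" > big.bin
git add --all && git commit -q -m "X: accidentally adds big.bin"
echo "real work" > notes.txt
git add notes.txt && git commit -q -m "Y: real work"
y_hash=$(git rev-parse master)

git branch save-my-mistake           # remember commit Y under a new name
git reset -q --hard origin/master    # master, index and work-tree now match origin/master

# Bring back only the wanted file from the saved branch, then commit it.
git checkout save-my-mistake -- notes.txt
git commit -q -m "real work, without big.bin"

git log --oneline --all
```

After this, master is one commit ahead of origin/master and contains no big.bin, while save-my-mistake still names commit Y.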

Eventually, when you are done with it, you can simply delete that branch. Those commits will, sooner or later,⁴ be "garbage collected" and truly vanish from your repository. Until that time, you won't see them anymore, as the name by which you (and Git) can find the incomprehensible hash IDs is gone.

⁴The expiration time is rather tricky. It depends in part on reflog entries, which expire in 90 days for "reachable" commits and 30 days for "unreachable" commits, with reachable and unreachable defined in terms of the reference's current value. Once the reflog entry itself expires, the underlying Git objects become eligible for this garbage collection if they are globally unreachable, i.e., unreachable from any name. They still get a 14 day grace period from the time they were created, although in most cases, if the 30 or 90 day time period has expired already, 14 days was a long time ago. However, deleting a branch name deletes all its reflog entries, which may expose younger commits, which then get whatever is left of their 14 day period.

In any case, all of this is driven from git gc --auto, which other Git commands run when they think there's some good reason to do it. So until one of those Git commands runs git gc --auto, these expired objects may stick around. That fits pretty well with Git's informal name, "the Borg of version control": it sticks everything into its collective, but it may destroy your world. :-)
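As long as the reflog entries mentioned in footnote 4 have not expired, they give you another way to find a commit you reset away without saving a name first. A minimal sketch, with made-up file names and messages:

```shell
# Scratch repository; names and paths here are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
echo one > file.txt && git add file.txt && git commit -q -m "first"
echo two > file.txt && git commit -q -a -m "second"

old_tip=$(git rev-parse master)              # the commit we are about to "forget"
git reset -q --hard master~1                 # master now names "first"

from_reflog=$(git rev-parse 'master@{1}')    # previous value of master, per its reflog
git reflog show master                       # lists the reset and the earlier commits

echo "old tip: $old_tip"
echo "reflog:  $from_reflog"
```

Remember that deleting the branch name deletes its reflog as well, so this only works while the name still exists.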

Upvotes: 1

yorammi

Reputation: 6458

If you haven't committed yet, run:

git stash -u

Upvotes: 0

thesecretmaster

Reputation: 1984

You could go back 2 commits by running git reset --hard HEAD~2. Note that --hard also throws away any uncommitted changes in your index and work-tree.

Upvotes: 0

qräbnö

Reputation: 3031

If you committed already, one option is to delete your folder and clone it again. New changes will be lost! Only do this if you are working alone on this repo and didn't change too much.

Upvotes: 0
