Reputation: 63
My question is in the title; the details below should help a git expert answer it. Two branches of my local repo appear to have become "mixed up": some files have appeared on a branch they were never on, and many files have disappeared completely. My theory is that simultaneous use of two git-connected applications (GitHub Desktop and RStudio) caused the problem, as explained below.
I have a git project with several branches; let's focus on branch1 and branch2. In branch2/Applications I had thousands of files. branch1 had no such Applications folder. This was the normal situation.
To prep for a long-overdue commit and sync to remote, I was cleaning up this project locally, mostly on branch2 (by modifying .gitignore and also moving some files to non-git local folders). At the same time I was using GitHub Desktop to switch back and forth between branch1 and branch2, a few times going as fast as once every 30 seconds. An RStudio project (.Rproj) that lives in my repo's local directory was also open. Note that this .Rproj did not exist on branch1, yet the file remained open while I used GitHub Desktop to switch back and forth between branch1 and branch2 repeatedly.
Mysteriously, the large folder Applications (15000+ files) on branch2 disappeared from the filesystem. I closed GitHub Desktop and RStudio, and the git process ran in the background at high CPU load for several hours. I killed the process.
The current puzzling situation is:
- branch2/Applications is completely missing. Why? I did add it to .gitignore, but my understanding is it should still appear on my local filesystem.
- branch1/Applications now exists. But it only contains a small portion of the 15000+ files which ought to be on branch2/Applications. Why?
- The .Rproj file has also "jumped branches." It now appears in the file system when branch1 is selected in GitHub Desktop, although before today's problems it was a branch2 file.

My theory is that I switched too fast (or something) between branches in GitHub Desktop, which did not allow RStudio's git monitoring to "catch up", and somehow the local repo got corrupted.
Is it possible for GitHub Desktop and RStudio to interact in a way that corrupts a local repo? And if so, do you have guidance for how I should proceed to attempt to recover the large folder? I'm not a git expert but can research commands if anyone has ideas. I don't know where to start.
FYI the large folder does not appear in a recent form on the remote repo, since it had been so long since my last push. So I don't think I can recover it from there.
Upvotes: 2
Views: 613
Reputation: 63
Here's what I ended up doing to restore the files. (Note that I do not actually want these private results files in a public git repo after all, so all I need to do is write them out to disk somewhere.) I ran git log --raw --all and found what looked to be the full list of missing files, where each line had the original full file path as well as a SHA. Fantastic. Then I wrote a script with each line containing something like:
git cat-file -p [SHA] > "/full/path/file.ext"
This restored files of every type to their original condition and subdirectory structure, including text-based files as well as pdf, tar.gz, and R objects which had been saved from the R workspace.
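The approach above can be sketched as a single Python script. This is a minimal sketch under stated assumptions, not the script actually used: the function names `parse_raw_log` and `restore` and the example paths in the comments are hypothetical, and `--no-abbrev` is added to the `git log` call so that each line carries a full SHA. It assumes paths contain no characters that git would quote in its output, and it treats the first entry seen for a path as the most recent one, which is only approximately true with `--all`.

```python
import re
import subprocess
from pathlib import Path

# A --raw line looks like:
#   :<old mode> <new mode> <old sha> <new sha> <status><TAB><path>
# Only added (A), modified (M) and deleted (D) entries are handled here;
# renames, copies and merge entries (lines starting with "::") are skipped.
RAW_LINE = re.compile(
    r"^:(\d{6}) (\d{6}) ([0-9a-f]{4,64}) ([0-9a-f]{4,64}) ([AMD])\t(.+)$"
)


def parse_raw_log(text):
    """Map each path to a blob SHA, keeping the first entry seen per path."""
    found = {}
    for line in text.splitlines():
        match = RAW_LINE.match(line)
        if not match:
            continue
        _, _, old_sha, new_sha, status, path = match.groups()
        # A deletion has an all-zero new SHA, so the content is the old blob.
        sha = old_sha if status == "D" else new_sha
        found.setdefault(path, sha)
    return found


def restore(found, repo_dir, out_dir):
    """Write every blob to out_dir, recreating the subdirectory structure."""
    out_root = Path(out_dir)
    for path, sha in found.items():
        target = out_root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        # Equivalent to: git cat-file -p [SHA] > "/full/path/file.ext"
        blob = subprocess.run(
            ["git", "cat-file", "-p", sha],
            cwd=repo_dir, check=True, capture_output=True,
        ).stdout
        target.write_bytes(blob)  # bytes, so pdf and tar.gz survive intact


# Example use (hypothetical paths):
#   log = subprocess.run(
#       ["git", "log", "--raw", "--all", "--no-abbrev"],
#       cwd="/path/to/repo", check=True, capture_output=True,
#       encoding="utf-8", errors="replace",
#   ).stdout
#   restore(parse_raw_log(log), "/path/to/repo", "/path/to/recovered")
```

Writing the blobs as bytes rather than text matters for the binary files; writing into a separate output directory keeps the recovered files out of the repo's working tree.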
Thanks for everyone's help. In the end, the key step in causing this error was probably that I interrupted the git process while GitHub Desktop was in the middle of a stash operation on the 15000+ files during a branch switch. I assume that caused the git branches to get crossed up.
Upvotes: 1