NiceStats

Reputation: 63

How can I recover local repo files if GitHub Desktop and RStudio have interacted to corrupt the repo?

My question is in the title; here are the details to help a git expert answer it. It appears two branches of my local repo have become "mixed up": some files have appeared on a branch they were never on, and many files have disappeared completely. My theory is that simultaneous use of two git-connected applications (GitHub Desktop and RStudio) caused the problem, as explained below.

I have a git project with several branches; let's focus on branch1 and branch2. On branch2, the Applications folder held thousands of files. branch1 had no Applications folder. This was the normal situation.

To prep for a long-overdue commit and sync to remote, I was cleaning up this project locally, mostly on branch2 (by modifying .gitignore and also moving some files to local folders outside the repo). At the same time I was using GitHub Desktop to switch back and forth between branch1 and branch2, a few times as fast as once every 30 seconds. An RStudio project (.Rproj) that lives in my repo's local directory was also open. Note that this .Rproj did not exist on branch1, yet the file remained open in RStudio while I repeatedly switched between branch1 and branch2 in GitHub Desktop.

Mysteriously, the large Applications folder (15000+ files) on branch2 disappeared from the filesystem. I closed GitHub Desktop and RStudio, but a git process kept running in the background at high CPU load for several hours. I killed the process.

The current puzzling situation is:

My theory is that I switched between branches too quickly in GitHub Desktop, which did not allow RStudio's git monitoring to "catch up", and somehow the local repo got corrupted.

Is it possible for GitHub Desktop and RStudio to interact in a way that corrupts a local repo? If so, do you have guidance on how I should proceed to try to recover the large folder? I'm not a git expert but can research commands if anyone has ideas. I don't know where to start.

FYI, the remote repo does not have a recent version of the large folder, since it had been so long since my last push. So I don't think I can recover it from there.

Upvotes: 2

Views: 613

Answers (1)

NiceStats

Reputation: 63

Here's what I ended up doing to restore the files. (Note that I do not actually want these private results files in a public git repo after all, so all I needed to do was write them out to disk somewhere.) I ran git log --raw --all and found what looked to be the full list of missing files, where each line had the original full file path as well as a blob SHA. Fantastic. Then I wrote a script with each line containing something like

git cat-file -p [SHA] > "/full/path/file.ext"

This restored files of every type to their original condition and subdirectory structure, including text-based files as well as pdf, tar.gz, and R objects which had been saved from the R workspace.
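For anyone who wants to automate that step, here is a rough sketch of how such a script could be generated rather than written by hand. It is not the exact script I used: the function names raw_to_blob_list and restore_blobs and the output directory are made up for illustration, and it assumes bash, awk, and file paths without tabs, newlines, or characters that git would quote. It keeps the first blob it sees for each path, which is usually the newest because git log lists recent commits first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `git log --raw` output on stdin and print
# "SHA<TAB>path" pairs, keeping only the first (usually newest) blob per path.
raw_to_blob_list() {
  awk -F'\t' '
    /^:/ {
      # First tab-separated field looks like:
      #   :oldmode newmode oldsha newsha status
      split($1, meta, " ")
      path = $NF                      # for renames, the last field is the new path
      if (path in seen) next          # an earlier (newer) line already covered it
      seen[path] = 1
      # A deleted file has an all-zero new SHA, so use the pre-deletion blob.
      sha = (meta[5] ~ /^D/) ? meta[3] : meta[4]
      if (sha !~ /^0+$/) print sha "\t" path
    }'
}

# Hypothetical helper: read "SHA<TAB>path" pairs on stdin and write each blob
# under the given output directory, recreating the subdirectory structure.
restore_blobs() {
  local out_dir=$1 sha path
  while IFS=$'\t' read -r sha path; do
    mkdir -p "$out_dir/$(dirname "$path")"
    git cat-file -p "$sha" > "$out_dir/$path"
  done
}

# Usage, run from inside the repository (the output path is a placeholder):
#   git log --raw --all --no-abbrev --format= | raw_to_blob_list | restore_blobs /path/outside/repo
```

It is worth running only the first half of the pipeline (up to raw_to_blob_list) and checking the list before writing anything to disk, and writing the output outside the repo so nothing in the working tree is overwritten.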

Thanks for everyone's help. In the end, the key step in causing this error was probably that I interrupted the git process while GitHub Desktop was in the middle of a stash operation on the 15000+ files during a branch switch. I assume that caused the git branches to get crossed up.

Upvotes: 1
