bulletshot60

Reputation: 165

Remove File from Bare Repository in Git

A bit of backstory: we maintain a submission system that allows students to submit source files to a git repository. There are two options for doing this. The more advanced students simply use git. For the beginner students, we have a web interface that allows them to upload files to their repository.

The web interface itself is pretty basic, and right now it only supports adding files. We would also like to give the students the ability to delete files. However, we need to do the delete on the bare repository without cloning. The clone operation is too expensive and requires too much space, considering that the submission system interacts with hundreds of repositories.

We've been able to figure out how to add files directly to the tree without cloning. We haven't been able to figure out the delete part in a bare repo. I tried the following.

rm objects/70/574e5c0d5f1fb820f66fd3fd3a3c0c4ed398bb # blob id of file to be removed
git write-tree # copying output
echo "removing file" | git commit-tree <copied id from previous command> -p <previous HEAD> # copying output
git update-ref refs/heads/master <copied id from previous command>

Technically this works, but it removes all the files from the repo, which isn't what we want. Given git's internals, I'm not sure how to remove a single blob from the tree and update the bare repo while keeping the other files.

Any ideas?

Upvotes: 1

Views: 515

Answers (1)

bulletshot60

Reputation: 165

I think I have found a solution. I don't particularly like it, but it works.

  1. Using git log, get the sha1 id of the current HEAD.
  2. git read-tree --empty to ensure that we can add files we want to keep without keeping the ones we don't
  3. git ls-tree -r HEAD
  4. For each entry returned above except the one you want to remove, run git update-index --add --cacheinfo <mode from ls-tree> <sha1 from ls-tree> <name from ls-tree>
  5. git write-tree saving value
  6. echo 'removing <file>' | git commit-tree <value from previous command> -p <sha1 of current master HEAD> saving value
  7. git update-ref refs/heads/master <value from previous command>

If anyone happens to know of a better way of accomplishing this, I'm all ears. I'll attach a Python script (using GitPython) that accomplishes the above shortly.

Edit: Python (w/ GitPython) added

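def repo_delete(repo, path: str):
    """Delete the file at <path> by committing a new tree that omits it."""
    g = repo.git
    # Resolve master explicitly; repo.heads[0] is not guaranteed to be master.
    head_sha = g.rev_parse("refs/heads/master")
    tree = g.ls_tree("-r", head_sha)
    # Start from an empty index and re-add every entry except <path>.
    g.read_tree("--empty")
    for line in tree.splitlines():
        # Each line is "<mode> <type> <sha>\t<name>". Split on the tab first
        # so that file names containing spaces stay intact.
        meta, name = line.split("\t", 1)
        mode, _obj_type, sha = meta.split(" ")
        if name != path:
            print(f"adding {name}")
            g.update_index("--add", "--cacheinfo", mode, sha, name)
    tree_sha = g.write_tree()
    # No shell is involved, so the message must not carry its own quotes.
    new_head_sha = g.commit_tree(tree_sha, "-p", head_sha, "-m", f"removing {path}")
    g.update_ref("refs/heads/master", new_head_sha)
    print("done")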
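The fragile part of the script is parsing the ls-tree output. Below is a minimal sketch of a standalone parser, assuming the output comes from git ls-tree -r -z, where each record has the form "<mode> <type> <sha>TAB<path>" and ends with a NUL byte instead of a newline. The names TreeEntry, parse_ls_tree and entries_to_keep are made up for illustration; they are not part of git or GitPython.

```python
from typing import List, NamedTuple


class TreeEntry(NamedTuple):
    """One record of `git ls-tree -r -z` output (hypothetical helper type)."""
    mode: str      # file mode, e.g. "100644"
    obj_type: str  # object type, e.g. "blob"
    sha: str       # object id
    path: str      # path relative to the repository root


def parse_ls_tree(output: str) -> List[TreeEntry]:
    """Parse NUL-separated `git ls-tree -r -z` output into entries."""
    entries = []
    for record in output.split("\0"):
        if not record:
            # The trailing NUL (or empty output) leaves an empty record.
            continue
        # Split on the first tab only, so paths with spaces or tabs survive.
        meta, path = record.split("\t", 1)
        mode, obj_type, sha = meta.split(" ")
        entries.append(TreeEntry(mode, obj_type, sha, path))
    return entries


def entries_to_keep(output: str, path_to_remove: str) -> List[TreeEntry]:
    """Return every entry except the one whose path is to be deleted."""
    return [e for e in parse_ls_tree(output) if e.path != path_to_remove]
```

Each kept entry's mode, sha and path could then be passed to update-index --add --cacheinfo as in the script above. With -z, git does not quote unusual characters in path names, which is why this sketch splits on NUL rather than on newlines.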

Upvotes: 2
