iclman

Reputation: 1436

gitlab-ce 12.X : how do I find the hashed storage path of a repository on the server?

With gitlab-ce 12.x, Geo requires repositories to use hashed storage (https://docs.gitlab.com/ee/administration/repository_storage_types.html).

For a given repository, the data is therefore stored in "@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.git"

From a practical point of view, say I have a repository whose URL is

https://my-gitlab-server/Group1/project1.git

How do I work out the storage path on the server? That is, how do I find the value of

#{hash[0..1]}/#{hash[2..3]}/#{hash}

Thanks

Upvotes: 14

Views: 6851

Answers (3)

redseven

Reputation: 1055

The reverse direction looks easier (finding which repo belongs to a known path). GitLab stores the project's GitLab path in the repository's config file, as fullpath under the [gitlab] section, e.g.:

cat /var/opt/gitlab/git-data/repositories/@hashed/ff/2c/ff2ccb6ba423d326bd549ed4cfb76e96976a0dcde05a01996a1cdb9f83422ec4.git/config

output:

[core]
    repositoryformatversion = 0
    filemode = true
    bare = true
[gitlab]
    fullpath = mygroup/myproject

If you don't have too many repos you can go through all of them and make a map:

# Print "<GitLab path>   <directory>" for every hashed repository (wikis are skipped by the name pattern)
find /var/opt/gitlab/git-data/repositories/@hashed/ -maxdepth 3 -type d -name '*[0-9a-f].git' | while read -r GITDIR; do
   echo "$(awk -F ' = ' '/fullpath/ {print $2}' "$GITDIR/config")   $GITDIR"
done

The output lists all your repos (except wikis) as pairs of GitLab path and directory path.

E.g.:

mygroup/myproject   /var/opt/gitlab/git-data/repositories/@hashed/ff/2c/ff2ccb6ba423d326bd549ed4cfb76e96976a0dcde05a01996a1cdb9f83422ec4.git

PS: I have ~150 repos now and this little script finishes in no time (about half a second).
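
The same mapping can be sketched in Ruby, which may be convenient if the result needs to be looked up by project path afterwards. This is only a sketch under assumptions: the helper names `fullpath_from_config` and `hashed_repo_map` are made up for illustration, the default root is the path used above, and only directories named like `<hex>.git` are considered (the same filter as the `find` pattern).

```ruby
# Read the "fullpath" value from the [gitlab] section of a repository config file.
# Returns nil when the section or the key is missing.
def fullpath_from_config(config_file)
  in_gitlab_section = false
  File.foreach(config_file) do |line|
    stripped = line.strip
    if stripped.start_with?('[')
      in_gitlab_section = (stripped == '[gitlab]')
    elsif in_gitlab_section && (m = stripped.match(/\Afullpath\s*=\s*(.+)\z/))
      return m[1]
    end
  end
  nil
end

# Build a hash of { "group/project" => "/path/to/@hashed/xx/yy/<hash>.git" }.
# The default root is an assumption taken from the paths shown in this answer.
def hashed_repo_map(root = '/var/opt/gitlab/git-data/repositories')
  map = {}
  Dir.glob(File.join(root, '@hashed', '*', '*', '*.git')).sort.each do |git_dir|
    # Keep only "<hex>.git" directories, mirroring the find pattern above.
    next unless File.basename(git_dir).match?(/\A[0-9a-f]+\.git\z/)

    config = File.join(git_dir, 'config')
    next unless File.file?(config)

    fullpath = fullpath_from_config(config)
    map[fullpath] = git_dir if fullpath
  end
  map
end
```

Calling `hashed_repo_map['mygroup/myproject']` would then return the directory for that project, or nil if no matching config was found.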

Upvotes: 4

mike

Reputation: 2106

As @iclman describes, and as has since been documented, you can calculate the hashed storage path from the SHA-256 hash of the project ID. Here is how you can do this with Ruby:

require 'digest'

proj_id = '<PROJECT_ID>'  # the numeric project ID, as a string
hash = Digest::SHA2.new(256).hexdigest(proj_id)
puts "/var/opt/gitlab/git-data/repositories/@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.git"

Or with a shell function (bash/zsh):

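get-gitlab-project-path() {
    # Hash the project ID without a trailing newline; sed strips the "(stdin)= " prefix
    local PROJECT_HASH
    PROJECT_HASH=$(printf '%s' "$1" | openssl dgst -sha256 | sed 's/^.* //')
    echo "/var/opt/gitlab/git-data/repositories/@hashed/${PROJECT_HASH:0:2}/${PROJECT_HASH:2:2}/${PROJECT_HASH}.git"
}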

Fish shell function:

function get-project-path --description 'Print the GitLab hashed storage path of a project ID'
  # openssl prints "(stdin)= <hash>", so strip everything up to the last space
  set -l PROJECT_HASH (echo -n $argv[1] | openssl dgst -sha256 | string replace -r '^.* ' '')
  set -l B1 (string sub --start=1 --length=2 $PROJECT_HASH)
  set -l B2 (string sub --start=3 --length=2 $PROJECT_HASH)
  set -l HASHED_DIR "/var/opt/gitlab/git-data/repositories/@hashed"
  echo $HASHED_DIR/$B1/$B2/$PROJECT_HASH.git
end

Upvotes: 1

iclman

Reputation: 1436

I found the answer to my question.

To get the hashed storage location of a project, you first need the project ID of that project.
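
One possible way to look up the ID from the project path is the GitLab REST API, which accepts the URL-encoded full path in place of the numeric ID. The following Ruby sketch assumes the API is reachable at the server's base URL and that a personal access token is available; the helper names `project_api_url` and `fetch_project_id` are hypothetical. For the URL in the question, the full path would be Group1/project1 (without the .git suffix).

```ruby
require 'cgi'
require 'json'
require 'net/http'
require 'uri'

# Build the API URL for a project addressed by its full path.
# The "/" in the path must be URL-encoded as %2F.
def project_api_url(base_url, full_path)
  "#{base_url.chomp('/')}/api/v4/projects/#{CGI.escape(full_path)}"
end

# Fetch the numeric project ID over HTTP(S).
# `token` is assumed to be a personal access token with API read access.
def fetch_project_id(base_url, full_path, token)
  uri = URI(project_api_url(base_url, full_path))
  request = Net::HTTP::Get.new(uri)
  request['PRIVATE-TOKEN'] = token
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  raise "GitLab API returned #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body).fetch('id')
end
```

A call such as `fetch_project_id('https://my-gitlab-server', 'Group1/project1', token)` should then return the integer ID to feed into the hash computation.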

Once you have the project ID, compute its SHA-256 hash. For example, if the project ID is 1:

echo -n 1 | sha256sum

This gives the hash 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b

The hashed storage location of your repository on the server will therefore be :

server/@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git
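
As a cross-check, the same computation can be sketched in Ruby. The function name `hashed_repo_path` and the `root` parameter are made up for illustration; `root` defaults to the placeholder `server` used above. The sketch also shows why `-n` matters in the `echo` command: hashing the ID with a trailing newline gives a different digest, and therefore a different (wrong) path.

```ruby
require 'digest'

# Compute the hashed storage path for a project ID (integer or string).
# `root` is a placeholder for the repository storage root on the server.
def hashed_repo_path(project_id, root = 'server')
  hash = Digest::SHA256.hexdigest(project_id.to_s)
  File.join(root, '@hashed', hash[0..1], hash[2..3], "#{hash}.git")
end

puts hashed_repo_path(1)

# Pitfall: a trailing newline (as produced by `echo` without -n) changes the hash.
correct = Digest::SHA256.hexdigest('1')
with_newline = Digest::SHA256.hexdigest("1\n")
puts "digests differ: #{correct != with_newline}"
```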

This has been discussed by GitLab developers in https://gitlab.com/gitlab-org/gitlab-ce/issues/63250

Upvotes: 21
