Reputation:
I know it is possible to run git fetch and then git checkout with path/to/file to get just that specific file.
My issue is that I have a 1 MB data cap per day, and git fetch downloads all the data anyway, even though nothing is written to the working tree until I run git checkout. The data still counts against my cap.
Is my understanding of how git fetch/checkout work correct? Is there a way to download only a specific file, so I can see whether there is a new version before proceeding with the full download?
Upvotes: 32
Views: 79874
Reputation: 11
With a recent Git you can clone a repository without downloading the file contents (the --filter option used below relies on partial clone, which arrived well after Git 2.0 and also needs server support). I used the commands below to pull a few files from a submodule in my GitLab CI/CD pipeline without running "git submodule update", which saved time and space when building Docker images.
git -C path/to/submodule clone --depth 1 --no-checkout --filter=blob:none https://gitlab-ci-token:[email protected]/your/repo.git
git -C path/to/submodule checkout COMMIT_HASH/BRANCH_NAME -- FILE_NAME
Upvotes: 1
Reputation: 391
This works for me on a local GitLab instance:
curl http://mylocalgitlab/MYGROUP/-/raw/master/PATH/TO/FILE.EXT -o FILE.EXT
Upvotes: 0
Reputation: 1083
Using python-gitlab:
#!/usr/bin/python3
import sys

import gitlab


def download_file(host, token, project_name, branch_name, file_path, output):
    try:
        gl = gitlab.Gitlab(host, private_token=token)
        # The search may return several matches, so pick the exact name.
        project = None
        for p in gl.projects.list(search=project_name):
            if p.name == project_name:
                project = p
                break
        if project is None:
            print("Error: project not found:", project_name)
            return
        # Stream the raw file contents straight into the output file.
        with open(output, 'wb') as f:
            project.files.raw(file_path=file_path, ref=branch_name, streamed=True, action=f.write)
    except Exception as e:
        print("Error:", e)


# The script name plus six arguments are required, so sys.argv needs 7 entries.
if len(sys.argv) < 7:
    print('Usage: ./download-gitlab-file.py host token project_name branch_name file_path output')
else:
    download_file(*sys.argv[1:7])
Upvotes: 3
Reputation: 1496
To expand on the other answer: it is possible (now? I don't know when this feature was added) to get the raw file directly, instead of a JSON document containing a base64 encoding of the file.
From the documentation:
Endpoint: GET /projects/:id/repository/files/:file_path/raw
Example:
curl --header "PRIVATE-TOKEN: <your_access_token>" "https://gitlab.example.com/api/v4/projects/13083/repository/files/path%2Fto%2Ffile%2Efoo/raw?ref=master"
Note that in the example the full path to the file is URL-encoded: path%2Fto%2Ffile%2Efoo decodes to path/to/file.foo
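If you build that URL from a script, the encoding step is easy to get wrong, so here is a minimal Python sketch of it. The function name raw_file_url and its parameters are my own for illustration and are not part of any GitLab library. One difference from the documentation example: urllib's quote() leaves the dot unencoded (file.foo rather than file%2Efoo), which decodes to the same path.

```python
from urllib.parse import quote


def raw_file_url(base_url, project_id, file_path, ref):
    """Build the URL for GET /projects/:id/repository/files/:file_path/raw.

    Hypothetical helper for illustration; not part of python-gitlab.
    """
    # safe="" makes quote() encode "/" as %2F; by default slashes are left alone.
    encoded_path = quote(file_path, safe="")
    encoded_ref = quote(ref, safe="")
    return (
        f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
        f"/repository/files/{encoded_path}/raw?ref={encoded_ref}"
    )


print(raw_file_url("https://gitlab.example.com", 13083, "path/to/file.foo", "master"))
```

The resulting string can be passed to curl (or any HTTP client) together with the PRIVATE-TOKEN header shown above.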
Upvotes: 23
Reputation: 1029
GitLab has a REST API for that.
You can GET a file from a repository with curl:
curl https://gitlab.com/api/v4/projects/:id/repository/files/:filename\?ref\=:ref
For example:
curl https://gitlab.com/api/v4/projects/12949323/repository/files/.gitignore\?ref\=master
If your repository isn't public, you also need to provide an access token by adding --header 'Private-Token: <your_access_token>'.
You can check how to find the repository's API ID here.
There is also a Python library that uses this API.
Note that this is a GitLab-specific solution and won't work for other hosting services.
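Keep in mind that this endpoint does not return the file itself: the response is a JSON document with the contents base64-encoded. A minimal Python sketch of the decoding step follows, assuming the response carries "encoding" and "content" fields. The helper name decode_file_response and the sample body are made up for illustration; a real response contains more fields.

```python
import base64
import json


def decode_file_response(body):
    """Return the file's bytes from the JSON body of the files endpoint.

    Hypothetical helper; assumes "encoding" and "content" fields are present.
    """
    payload = json.loads(body)
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    return base64.b64decode(payload["content"])


# Made-up, trimmed-down response body, for illustration only.
sample = json.dumps({
    "file_name": ".gitignore",
    "encoding": "base64",
    "content": base64.b64encode(b"*.pyc\n").decode("ascii"),
})
print(decode_file_response(sample))
```

In practice the body would come from the curl call above, or from any HTTP client's response text.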
Upvotes: 34