Aliaksandra

Reputation: 273

How to read a CSV file (or any file) from GitLab in Python

I have a private repo in GitLab and I want to read a CSV file from it in Python. How can I access GitLab from Python? I tried this code, but it is not working.

import requests
from io import StringIO
import pandas as pd

url = "https://gitlab.com/...../CSV_HOMEWORK.csv"
df = pd.read_csv(StringIO(requests.get(url).text))
print(df.head())

How do I pass my credentials?

The error I get: [screenshot not available]

Upvotes: 2

Views: 1643

Answers (2)

Michael Rice

Reputation: 72

To pass your credentials, create an access token in GitLab and send it in the PRIVATE-TOKEN request header. You also need to URL-encode the file path and provide the project ID and the branch name.

The API returns JSON in which the file content is Base64-encoded, so decode it to a string before reading it as CSV.

import base64
import io
import pandas as pd
import requests

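# file_path, project_id, branch and gitlab_token are not defined in this snippet;
# set them for your own repository before running.
# The file path must be fully URL-encoded, slashes included (safe='').
encoded_file_path = requests.utils.quote(file_path, safe='')
url = f"https://gitlab.com/api/v4/projects/{project_id}/repository/files/{encoded_file_path}?ref={branch}"
headers = {
    'PRIVATE-TOKEN': gitlab_token
}
response = requests.get(url, headers=headers)
# fail loudly on 401/404 etc. instead of silently skipping the read
response.raise_for_status()

# decode the Base64-encoded file contents from the JSON response
data = response.json()['content']
data = base64.b64decode(data).decode('utf-8')
df = pd.read_csv(io.StringIO(data))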
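
The two transformations in this answer, URL-encoding the path and Base64-decoding the content, can be checked offline without a token. Below is a minimal sketch using only the standard library; the helper names, the project ID 42, the path data/CSV_HOMEWORK.csv and the sample CSV are all made up for illustration and are not from the question.

```python
import base64
import csv
import io
from urllib.parse import quote


def build_file_url(project_id, file_path, branch, base_url="https://gitlab.com"):
    """Build a Repository Files API URL (hypothetical helper)."""
    # safe="" makes quote() encode "/" as %2F, which the endpoint expects
    encoded_path = quote(file_path, safe="")
    encoded_ref = quote(branch, safe="")
    return (
        f"{base_url}/api/v4/projects/{project_id}"
        f"/repository/files/{encoded_path}?ref={encoded_ref}"
    )


def decode_file_content(payload):
    """Turn the Base64 'content' field of the JSON response into text."""
    return base64.b64decode(payload["content"]).decode("utf-8")


# Simulate the shape of the API response with made-up data (no network needed)
sample_csv = "name,score\nAnna,5\n"
fake_payload = {
    "content": base64.b64encode(sample_csv.encode("utf-8")).decode("ascii")
}

print(build_file_url(42, "data/CSV_HOMEWORK.csv", "main"))
rows = list(csv.reader(io.StringIO(decode_file_content(fake_payload))))
print(rows)
```

With a real response, the decoded text can be passed to pd.read_csv(io.StringIO(...)) exactly as in the snippet above.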

Upvotes: 0

Christopher Rice

Reputation: 106

See GitLab's Repository Files API. That seems to be what you need, since the file you are trying to access is protected.

Create an access token on GitLab for a user who has access to the repository, then pass that token with your API requests.

If you would rather use a library wrapper around the API than send the HTTP requests yourself, python-gitlab is a good option.
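
As a rough sketch of the python-gitlab route, assuming the package is installed (pip install python-gitlab): the function name and its parameters are hypothetical, and you would supply your own project ID, file path, branch and token.

```python
import io


def read_gitlab_file_text(project_id, file_path, branch, gitlab_token,
                          gitlab_url="https://gitlab.com"):
    """Return the text of a repository file via python-gitlab (hypothetical helper)."""
    # Imported inside the function so this sketch loads even where
    # python-gitlab is not installed; calling the function still requires it.
    import gitlab

    gl = gitlab.Gitlab(gitlab_url, private_token=gitlab_token)
    project = gl.projects.get(project_id)
    # files.get() wraps the Repository Files API; the path is passed unencoded
    repo_file = project.files.get(file_path=file_path, ref=branch)
    # decode() undoes the Base64 encoding and returns bytes
    return repo_file.decode().decode("utf-8")


def as_buffer(text):
    """Wrap text in a file-like object that pandas.read_csv accepts."""
    return io.StringIO(text)
```

The result can then be read with pandas, for example df = pd.read_csv(as_buffer(read_gitlab_file_text(project_id, file_path, branch, gitlab_token))).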

Upvotes: 2
