Reputation: 105
I want to extract the header from a CSV file stored in GCP Cloud Storage. I can extract the header, but the CSV file is more than 20 GB, so reading the whole thing is a problem.
I used a library. It extracts the header, but it takes too much memory:
import re
import gcsfs

fs = gcsfs.GCSFileSystem(project=PROJECT)
with fs.open(f'{bucket}/{file}', 'rb') as f:
    # Reads the entire file into memory
    schema = f.read().decode("utf-8")
    # Remove everything after the first newline
    schema = re.sub("(\\n).*", "", schema)
I also tried this command, but it returns nothing:
fs.read_block('gs://my-bucket/my-file.txt', offset=1000, length=10, delimiter=b'\n')
My question is: how can I read only the header, not the whole file?
Upvotes: 0
Views: 870
Reputation: 23218
schema = f.read()
This reads the whole file. Presumably, if gcsfs.GCSFileSystem.open works like the built-in open, the read method of the file it returns accepts an integer argument specifying the number of bytes to read.
For example, if the header is 100 bytes in size, try:
schema = f.read(100)
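Since the exact header size is rarely known in advance, a variation (a minimal sketch; the 4096-byte chunk size is an assumed upper bound on the header length, not anything from the question) is to read one fixed-size chunk and keep only the text before the first newline:

import gcsfs

fs = gcsfs.GCSFileSystem(project=PROJECT)
with fs.open(f'{bucket}/{file}', 'rb') as f:
    # Read only the first 4 KiB instead of the whole 20 GB file
    chunk = f.read(4096)
# Keep everything before the first newline
schema = chunk.decode("utf-8").split("\n", 1)[0]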
Or, if the header is the first line in the file, terminated by a \n character, try:
schema = f.readline()
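Putting it together as a minimal sketch (assuming gcsfs file objects support readline, as fsspec file-like objects generally do; PROJECT, bucket, and file are the placeholders from the question):

import csv
import gcsfs

fs = gcsfs.GCSFileSystem(project=PROJECT)
with fs.open(f'{bucket}/{file}', 'rb') as f:
    # readline stops at the first newline, so the rest of the file is never read
    header_line = f.readline().decode("utf-8").rstrip("\r\n")

# Split the header line into column names
columns = next(csv.reader([header_line]))

As an aside, the read_block call from the question may also work if offset is set to 0: with a delimiter, fsspec extends the block to the next b'\n', so fs.read_block(f'{bucket}/{file}', 0, 1, delimiter=b'\n') should return just the first line (an untested assumption based on fsspec's documented behavior).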
Upvotes: 1