Tarique

Reputation: 711

File truncated on upload to GCS

I am uploading a relatively small (<1 MiB) .jsonl file to Google Cloud Storage using the Python API. The function I used is from the GCP documentation:

from google.cloud import storage


def upload_blob(key_path, bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"
    # The path to your file to upload
    # source_file_name = "local/path/to/file"
    # The ID of your GCS object
    # destination_blob_name = "storage-object-name"

    storage_client = storage.Client.from_service_account_json(key_path)
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)

    print(
        "File {} uploaded to {}.".format(
            source_file_name, destination_blob_name
        )
    )

The problem is that the uploaded .jsonl file is truncated at 9500 lines. In fact, the 9500th line itself is incomplete. I am not sure what is causing this, and I don't think any size limit would apply to such a small file. Any help is appreciated.

Upvotes: 1

Views: 571

Answers (1)

Alex Andriati

Reputation: 33

I had a similar problem some time ago. In my case, the upload to the bucket was called inside a Python with block, right after the line that wrote the contents to source_file_name. At that point the local file had not yet been flushed and closed, so part of the data was still sitting in the write buffer. I just needed to move the upload call outside the with block so that the local file was fully written and closed before being uploaded.
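
To illustrate the idea, here is a minimal sketch of writing the file first and uploading only after the with block has exited. The names write_jsonl and write_then_upload, and the upload parameter, are hypothetical and not part of the original question; in the asker's setup, upload would be something that ends up calling blob.upload_from_filename(path). Here os.path.getsize stands in for the real upload so the example runs without any GCS credentials.

```python
import json
import os
import tempfile


def write_jsonl(path, records):
    """Write records as JSON Lines. The file is closed when this returns."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
        # Do NOT upload here: f is still open and some of the data may
        # still be in the write buffer rather than on disk.
    # Leaving the with block closes f, which flushes the buffered data.


def write_then_upload(path, records, upload):
    """Write the file completely, then pass its path to upload(path).

    upload is a hypothetical callable; in practice it could wrap
    blob.upload_from_filename(path).
    """
    write_jsonl(path, records)
    return upload(path)


if __name__ == "__main__":
    records = [{"id": i} for i in range(10000)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.jsonl")
        # os.path.getsize is a stand-in for the real upload call.
        print(write_then_upload(path, records, os.path.getsize))
```

If the upload really has to happen while the file is still open, calling f.flush() before the upload should also push the buffered data to disk, but closing the file first is the simpler option.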

Upvotes: 1
