Deena

Reputation: 33

Read a gzip-compressed CSV file from AWS S3 into a pandas DataFrame in SageMaker

I am trying to read a large gzip-compressed CSV file from AWS S3 and load it into a pandas DataFrame in SageMaker. Is there a direct, clean way to do this?

Upvotes: 2

Views: 2985

Answers (1)

Neil McGuigan

Reputation: 48256

You can do this easily with the AWS Wrangler library (awswrangler).

It supports gzip compression and reads the CSV directly into a pandas DataFrame.

(pip install awswrangler)

import awswrangler as wr

df = wr.s3.read_csv(path="s3://bucket/path/to/my.csv.gzip")
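
If you would rather not install an extra package, the same result can be had with boto3 and pandas, which are usually already present in SageMaker notebook environments. This is only a minimal sketch: the helper names `read_gzip_csv` and `read_gzip_csv_from_s3` are made up for illustration, and the bucket and key are placeholders you would replace with your own.

```python
import gzip
import io

import pandas as pd


def read_gzip_csv(fileobj, **read_csv_kwargs):
    """Parse a gzip-compressed CSV from a binary file-like object into a DataFrame."""
    # compression is given explicitly because a file-like object has no
    # file name that pandas could use to infer it.
    return pd.read_csv(fileobj, compression="gzip", **read_csv_kwargs)


def read_gzip_csv_from_s3(bucket, key, **read_csv_kwargs):
    """Download s3://bucket/key with boto3 and parse it (hypothetical helper)."""
    import boto3  # imported lazily so read_gzip_csv works without boto3

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    # Note: this buffers the whole compressed object in memory before parsing.
    return read_gzip_csv(io.BytesIO(body.read()), **read_csv_kwargs)


# Local check without S3: round-trip a tiny gzip CSV held in memory.
sample = gzip.compress(b"a,b\n1,2\n3,4\n")
demo = read_gzip_csv(io.BytesIO(sample))
```

On SageMaker you would then call something like `df = read_gzip_csv_from_s3("bucket", "path/to/my.csv.gzip")`. Any extra keyword arguments (for example `usecols` or `dtype`) are passed straight through to `pandas.read_csv`, which can help keep memory use down on a large file.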

Upvotes: 2
