Reputation: 554
Hello, I know there are lots of similar questions here, but I have code that executes properly and returns five records. My question is: how can I avoid reading the entire file just to get those few rows back? Suppose the CSV file is several GB in size; I don't want to pull the whole file's data just to get 5 records. Please tell me how to do this, and if my code is not good, please explain why. Code:
import boto3
from botocore.client import Config
import pandas as pd
ACCESS_KEY_ID = 'something'
ACCESS_SECRET_KEY = 'something'
BUCKET_NAME = 'something'
Filename='dataRepository/source/MergedSeedData(Parts_skills_Durations).csv'
client = boto3.client("s3",
                      aws_access_key_id=ACCESS_KEY_ID,
                      aws_secret_access_key=ACCESS_SECRET_KEY)
obj = client.get_object(Bucket=BUCKET_NAME, Key=Filename)
Data = pd.read_csv(obj['Body'])
# data1 = Data.columns
# return data1
Data=Data.head(5)
print(Data)
This is my code, which runs fine and gets the 5 records from the S3 bucket, but as explained above I want to avoid reading the whole file. If anything is unclear, feel free to ask. Thanks in advance!
Upvotes: 1
Views: 4047
Reputation: 6246
You can use pandas' ability to read a file in chunks, so you only load as much data as you need.
# chunksize turns read_csv into an iterator; get_chunk() parses only the first 5 rows
data_iter = pd.read_csv(obj['Body'], chunksize=5)
data = data_iter.get_chunk()
print(data)
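Put together with the boto3 setup from the question, a minimal end-to-end sketch might look like the following (credentials, bucket and key are placeholders):
import boto3
import pandas as pd

client = boto3.client("s3",
                      aws_access_key_id="...",        # placeholder credentials
                      aws_secret_access_key="...")
obj = client.get_object(Bucket="my-bucket", Key="path/to/file.csv")  # placeholder names

# Only the first chunk of 5 rows is parsed; pandas pulls from the streaming
# body as needed rather than loading the multi-GB file into a DataFrame.
reader = pd.read_csv(obj['Body'], chunksize=5)
first_rows = reader.get_chunk()
print(first_rows)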
Upvotes: 5
Reputation: 860
You can use an HTTP Range header (see RFC 2616), which takes a byte-range argument. The S3 API has a provision for this, which lets you avoid reading/downloading the whole S3 file.
Sample code:
import boto3

# Fetch only the first ~1 KB of the object instead of the whole file
obj = boto3.resource('s3').Object('bucket101', 'my.csv')
record_stream = obj.get(Range='bytes=0-1000')['Body']
print(record_stream.read())
This will return only the data for the byte range given in the header. But you will need to convert those bytes into a DataFrame yourself, e.g. by decoding the string and handling the \t and \n characters present in the string coming from the .csv file.
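A minimal sketch of that conversion, assuming the same bucket and key as above and a UTF-8 encoded, comma-separated file:
import io
import boto3
import pandas as pd

obj = boto3.resource('s3').Object('bucket101', 'my.csv')
raw = obj.get(Range='bytes=0-1000')['Body'].read()

# The byte range will almost certainly end mid-row, so decode the bytes,
# drop the last (possibly partial) line, and let pandas parse the rest.
text = raw.decode('utf-8')
complete_rows = text[:text.rfind('\n')] if '\n' in text else text
df = pd.read_csv(io.StringIO(complete_rows))
print(df.head(5))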
Upvotes: 2