Sumit Khanna

Reputation: 41

Split a large CSV into multiple CSV files, each containing one row

I am using pandas to split a large CSV into multiple CSV files, each containing a single row. The CSV has 1 million records, and the code below takes too much time. For example, in this case 1 million CSV files will be created. Can anyone help me reduce the time it takes to split the CSV?

for index, row in lead_data.iterrows():
    row.to_csv(row['lead_id']+".csv")

Here lead_data is the DataFrame object.

Thanks

Upvotes: 0

Views: 279

Answers (1)

MUNGAI NJOROGE

Reputation: 1216

You don't need to loop through the data. Filter the records by lead_id and then export the result to a CSV file. That way you can split the files based on the lead ID (assuming that is what you want). For example, to select all EPL games where Arsenal was at home:

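import pandas as pd

data = pd.read_csv('footbal/epl-2017-GMTStandardTime.csv')
print("Selecting Arsenal")
# Boolean mask: keep only the rows where Arsenal is the home team
ft = data.loc[data['HomeTeam'] == 'Arsenal']
print(ft.head())
# Export the filtered rows to CSV
ft.to_csv('arsenal.csv')
print("Done!")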

Writing all matching rows in one call is much faster than writing one record at a time.
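
To apply the same idea to every lead ID at once, a groupby can replace the per-value filter. This is only a minimal sketch: split_by_column, the score column, the sample rows, and the temporary output directory are made up for illustration, and it assumes the lead_id values are safe to use as file names. Note that if every lead_id is unique you still end up with one file per row, so the time spent creating files will likely dominate either way.

```python
import os
import tempfile

import pandas as pd


def split_by_column(df, column, out_dir):
    """Write one CSV per distinct value of `column`; return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    # groupby partitions the frame in one pass instead of filtering once per value
    for key, group in df.groupby(column, sort=False):
        path = os.path.join(out_dir, f"{key}.csv")
        group.to_csv(path, index=False)
        paths.append(path)
    return paths


# Hypothetical sample data standing in for lead_data
lead_data = pd.DataFrame({"lead_id": ["a1", "a2", "a1"], "score": [10, 20, 30]})
out_dir = tempfile.mkdtemp()
written = split_by_column(lead_data, "lead_id", out_dir)
print(written)
```

Each output file holds all rows that share one lead_id, so with repeated IDs far fewer files are written than with the row-by-row loop.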

Upvotes: 1
