Reputation: 5
I have a large file (>500 rows) with multiple data points for each unique item in the list, something like:
cheese | weight | location |
---|---|---|
gouda | 1.4 | AL |
gouda | 2 | TX |
gouda | 1.2 | CA |
cheddar | 5.3 | AL |
cheddar | 6 | MN |
chaddar | 2 | WA |
Havarti | 4 | CA |
Havarti | 4.2 | AL |
I want to make a data frame for each cheese to store the relevant data.
I have this:
import pandas as pd

main_cheese_file = pd.read_csv('CheeseMaster.csv')
cut_the_cheese = main_cheese_file.cheese.unique()
melted = {elem: pd.DataFrame() for elem in cut_the_cheese}
for key in melted.keys():
    melted[key] = main_cheese_file[main_cheese_file.cheese == key]
to split it up on the unique cheese names.
What I want to do with it is export a DataFrame for each cheese, with the cheese name as the file name.
So far I can force it with
melted['cheddar'].to_csv('cheddar.csv')
and get the cheddar rows...
but I don't want to have to know and type out every cheese in the 500-row list...
Is there a way to add this to my loop?
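For what it's worth, the export step can be folded into the same dict-building approach by iterating over the dict's items; a minimal sketch, with inline sample rows standing in for CheeseMaster.csv (the sample data here is an assumption):

```python
import pandas as pd
from io import StringIO

# Inline stand-in for CheeseMaster.csv (sample rows are an assumption)
main_cheese_file = pd.read_csv(StringIO(
    "cheese,weight,location\n"
    "gouda,1.4,AL\n"
    "gouda,2,TX\n"
    "cheddar,5.3,AL\n"
))

# One sub-DataFrame per unique cheese, keyed by cheese name
melted = {
    cheese: main_cheese_file[main_cheese_file.cheese == cheese]
    for cheese in main_cheese_file.cheese.unique()
}

# Export each sub-DataFrame, using the dict key as the file name
for cheese, frame in melted.items():
    frame.to_csv(f'{cheese}.csv', index=False)
```

This writes one CSV per cheese without ever typing a cheese name by hand.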
Upvotes: 0
Views: 30
Reputation: 14063
You can just iterate over a groupby object:
import pandas as pd
df = pd.read_csv('CheeseMaster.csv')
for k, v in df.groupby('cheese'):
    v.to_csv(f'{k}.csv', index=False)
Upvotes: 1