Reputation: 1215
I've only just started using pandas, so please excuse my ignorance.
Say I have a CSV file with a number of rows and columns:
ID, Name, Number, SomethingElse
1, John, 234234, "word"
2, Dave, 2342423, "word2"
3, John, 54365345, "word3"
I want to create new CSV files, one for each unique value of Name. I am using:
unique = df.Name.unique()
to get a new DataFrame(?) with all the unique names, but I can't work out how to use it to find each Name and create a new file with all the rows for that name:
file1.csv
ID, Name, Number, SomethingElse
1, John, 234234, "word"
3, John, 54365345, "word3"
file2.csv
ID, Name, Number, SomethingElse
2, Dave, 2342423, "word2"
Usually I would use a set and then nested loops in Python 3, but I think I lack a fundamental understanding of what DataFrames actually are.
Upvotes: 1
Views: 64
Reputation: 862511
If it is acceptable to name the files after the Name values, e.g. John.csv or Dave.csv, loop over the DataFrame.groupby object and write each group with DataFrame.to_csv:
for i, g in df.groupby('Name'):
    g.to_csv(f'{i}.csv', index=False)
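A self-contained sketch of the loop above, using the sample rows from the question. The in-memory CSV text, the temporary `out_dir` directory, and `skipinitialspace=True` (to drop the blank after each comma) are assumptions made for illustration, not part of the original code:

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Sample data from the question; skipinitialspace strips the blank after each comma.
csv_text = '''ID, Name, Number, SomethingElse
1, John, 234234, "word"
2, Dave, 2342423, "word2"
3, John, 54365345, "word3"
'''
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Hypothetical output location, so the example does not clutter the working directory.
out_dir = Path(tempfile.mkdtemp())

# groupby yields (key, sub-DataFrame) pairs, one pair per distinct Name.
for name, group in df.groupby('Name'):
    group.to_csv(out_dir / f'{name}.csv', index=False)

# Names of the files that were written, for inspection.
written = sorted(p.name for p in out_dir.iterdir())
```

Each `group` is an ordinary DataFrame holding only the rows for one name, which is why `to_csv` can be called on it directly.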
For lowercase filenames, call lower() on the group name:
for i, g in df.groupby('Name'):
    g.to_csv(f'{i.lower()}.csv', index=False)
Your approach with unique() also works if you filter the rows with boolean indexing:
for v in df.Name.unique():
    df[df['Name'] == v].to_csv(f'{v.lower()}.csv', index=False)
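On the "new DataFrame?" doubt in the question: Series.unique() returns a plain array of the distinct values in order of first appearance, not a DataFrame, so it can be iterated directly. A minimal sketch of what the filter does for one name, with a hand-built frame standing in for the CSV (the variable names are illustrative):

```python
import pandas as pd

# Hand-built stand-in for the CSV from the question.
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['John', 'Dave', 'John'],
    'Number': [234234, 2342423, 54365345],
    'SomethingElse': ['word', 'word2', 'word3'],
})

names = df['Name'].unique()    # array of distinct names, not a DataFrame
mask = df['Name'] == 'John'    # boolean Series, one True/False per row
john_rows = df[mask]           # keeps only the rows where the mask is True
```

Indexing with the mask returns a new DataFrame with the same columns as `df`, which is what gets written to each file.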
For the file1.csv, file2.csv naming, use enumerate:
for j, (i, g) in enumerate(df.groupby('Name'), 1):
    g.to_csv(f'file{j}.csv', index=False)
Or:
for j, v in enumerate(df.Name.unique(), 1):
    df[df['Name'] == v].to_csv(f'file{j}.csv', index=False)
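One difference between the two enumerate loops is worth noting: groupby sorts the group keys by default, while unique() keeps the order of first appearance, so the two can assign different file numbers to the same name. If the numbering from the question is wanted (John in file1.csv), passing sort=False to groupby should give the same order as unique(). A small sketch on a reduced frame, with illustrative variable names:

```python
import pandas as pd

# Reduced stand-in for the question's data; only the columns needed here.
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['John', 'Dave', 'John'],
})

# Default: group keys are sorted, so 'Dave' comes before 'John'.
sorted_order = [name for name, _ in df.groupby('Name')]

# sort=False keeps the groups in order of first appearance in the data.
appearance_order = [name for name, _ in df.groupby('Name', sort=False)]

# Which file each name would be written to under each ordering.
sorted_files = {n: f'file{j}.csv' for j, n in enumerate(sorted_order, 1)}
appearance_files = {n: f'file{j}.csv' for j, n in enumerate(appearance_order, 1)}
```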
Upvotes: 3