Geordie Wicks

Reputation: 1215

Pandas do something for all unique column values

I have only just started using pandas, so please excuse my ignorance.

Say I have a CSV file with a number of rows and columns:

ID, Name, Number, SomethingElse
1, John, 234234, "word"
2, Dave, 2342423, "word2"
3, John, 54365345, "word3"

I want to create a new CSV file for each unique value of Name. I am using:

unique = df.Name.unique()

to get what I assume is a new DataFrame with all the unique names, but I can't work out how to use it to find each Name and create a new file with all the rows for that name:

file1.csv
ID, Name, Number, SomethingElse
1, John, 234234, "word"
3, John, 54365345, "word3"

file2.csv
ID, Name, Number, SomethingElse
2, Dave, 2342423, "word2"

Usually I would use a set and then nested loops in Python 3, but I think I lack a fundamental understanding of what DataFrames actually are.

Upvotes: 1

Views: 64

Answers (1)

jezrael

Reputation: 862511

If the files can be named after the Name values, for example John.csv or Dave.csv, loop over the DataFrame.groupby object and write each group with DataFrame.to_csv:

for i, g in df.groupby('Name'):
    g.to_csv(f'{i}.csv', index=False)
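
For reference, here is a minimal end-to-end sketch of the loop above, run on the sample data from the question. It assumes the file really has a space after each comma, so it reads with skipinitialspace=True (otherwise the column would be named ' Name'). The in-memory csv_text and the temporary out_dir are stand-ins for illustration, not part of the original setup.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Sample data copied from the question; the spaces after the commas are kept.
csv_text = '''ID, Name, Number, SomethingElse
1, John, 234234, "word"
2, Dave, 2342423, "word2"
3, John, 54365345, "word3"
'''

# skipinitialspace=True drops the blank after each delimiter, so the
# column is called 'Name' rather than ' Name'.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Hypothetical output location, used here only to keep the example tidy.
out_dir = Path(tempfile.mkdtemp())

# Each iteration yields one Name value and the sub-DataFrame of its rows.
for name, group in df.groupby('Name'):
    group.to_csv(out_dir / f'{name}.csv', index=False)

written = sorted(p.name for p in out_dir.glob('*.csv'))
```

Each group is itself a DataFrame holding only the rows for one name, which is why to_csv can be called on it directly.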

For lowercase filenames add lower():

for i, g in df.groupby('Name'):
    g.to_csv(f'{i.lower()}.csv', index=False)

Your approach with unique() also works if you filter the rows with boolean indexing:

for v in df.Name.unique():
    df[df['Name'] == v].to_csv(f'{v.lower()}.csv', index=False)
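
On the question of what unique() gives back: it is not a DataFrame but a flat array of the distinct values (a NumPy array for plain object columns, as far as I know), which is why it has to be combined with a filter on the original DataFrame. A small sketch, with the sample data rebuilt by hand as an assumption:

```python
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['John', 'Dave', 'John'],
    'Number': [234234, 2342423, 54365345],
    'SomethingElse': ['word', 'word2', 'word3'],
})

# A flat array of distinct values, in order of first appearance.
names = df['Name'].unique()

# Comparing the column to one value gives a boolean Series, one entry per row.
mask = df['Name'] == 'John'

# Indexing with that mask keeps only the rows where it is True.
john_rows = df[mask]
```

So the array only tells you which names exist; the boolean mask is what selects the matching rows.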

For the filenames file1.csv, file2.csv, number the groups with enumerate:

for j, (i, g) in enumerate(df.groupby('Name'), 1):
    g.to_csv(f'file{j}.csv', index=False)

Or:

for j, v in enumerate(df.Name.unique(), 1):
    df[df['Name'] == v].to_csv(f'file{j}.csv', index=False)
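
One thing to watch with the numbered filenames: the two variants may number the files differently. groupby sorts the group keys by default, while unique() keeps the order of first appearance, so on the sample data the groupby loop would put Dave in file1.csv and the unique() loop would put John there. Passing sort=False to groupby should make both agree. A quick check, with the sample data rebuilt by hand as an assumption:

```python
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['John', 'Dave', 'John'],
    'Number': [234234, 2342423, 54365345],
    'SomethingElse': ['word', 'word2', 'word3'],
})

# Default groupby: keys come out sorted.
sorted_order = [name for name, _ in df.groupby('Name')]

# sort=False: keys come out in order of first appearance.
appearance_order = [name for name, _ in df.groupby('Name', sort=False)]

# unique() also keeps order of first appearance.
unique_order = list(df['Name'].unique())
```

If file1.csv must hold John as in the question, use df.groupby('Name', sort=False) in the enumerate loop, or use the unique() variant.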

Upvotes: 3
