Reputation: 55
I've been struggling to add sorting to this code:
all_files = glob.glob(path + "/*.CSV") # Get all CSV files (in no particular order)
all_csv = [pd.read_csv(f, sep=',') for f in all_files] # List of dataframes
# I want to sort it by the values of the first column of each dataframe in the all_csv list.
for f in all_csv:
    goal = pd.DataFrame.sort_values(by=(f.iloc[:,0])) # Maybe something like this??
Does anyone have an idea how I can do this? I've looked at other posts, but they don't apply to an undefined column name (i.e. f.iloc[:,0]) or to a list of dataframes (I also thought of using dictionaries, but I'd like to see if it is possible with lists).
Thank you :)
These ideas may be useful: link, link
Upvotes: 0
Views: 50
Reputation:
You can index df.columns for individual dataframes:
goal = df.sort_values(by=df.columns[0])
For the entire list of dataframes, you can use a list comprehension:
all_csv = [df.sort_values(by=df.columns[0]) for df in all_csv]
Suppose you had a dataframe that looked like:
   a  b
0  2  1
1  3  2
2  1  3
Then when you run:
df = df.sort_values(by=df.columns[0])
df becomes:
   a  b
2  1  3
0  2  1
1  3  2
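To try this yourself, here is a minimal sketch that rebuilds the small example dataframe and sorts it by its first column. The variable name sorted_df is just for illustration; the column names a and b come from the example above.

```python
import pandas as pd

# Rebuild the small example dataframe (columns "a" and "b" as in the example).
df = pd.DataFrame({"a": [2, 3, 1], "b": [1, 2, 3]})

# df.columns[0] is the label of the first column, whatever it happens to be called,
# so the column name never has to be hard-coded.
sorted_df = df.sort_values(by=df.columns[0])

print(sorted_df)
```

Note that the original row labels travel with the rows. If you don't need them, chaining .reset_index(drop=True) after the sort should renumber the rows from 0.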
Upvotes: 0
Reputation: 644
This uses enke's code to sort each dataframe by the first column, but returns all dataframes in a list as you requested:
all_csv = [df.sort_values(by=df.columns[0]) for df in all_csv]
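As a rough end-to-end sketch, the snippet below shows the same comprehension working when each dataframe has a different first column name. The CSV contents and the name csv_texts are made up for illustration, and the data is read from in-memory strings rather than from files found with glob.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the CSV files; each has a different first column name.
csv_texts = [
    "x,y\n3,a\n1,b\n2,c\n",
    "id,value\n20,0.5\n10,1.5\n",
]

# Same idea as [pd.read_csv(f, sep=',') for f in all_files], but reading from memory.
all_csv = [pd.read_csv(io.StringIO(text), sep=",") for text in csv_texts]

# Sort every dataframe by its own first column and keep the results in a list.
all_csv = [df.sort_values(by=df.columns[0]) for df in all_csv]

for df in all_csv:
    print(df)
```

Because df.columns[0] is looked up separately for each dataframe, the first dataframe is sorted by x and the second by id without either name appearing in the code.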
Upvotes: 1