Reputation: 117
I have several excel data files, each one referring to a different timing (i.e. 0h, 24h, 48h, ...), and the columns with the data of interest are named: 'Product' and 'Value'. I have concatenated those files by using the following:
result = pd.concat([pd.read_excel(file) for file in filenames], keys=t_list, names=['t'])
where filenames is a list containing the excel files, t_list is a list containing the timings, 't' is the name of the newly created column with the timings. So far so good, and I get a new dataframe with the following structure:
Concatenated Dataframe:
But then, if I sort with the following:
result['Product'].astype(str)
result.sort_values('Product', ascending=True)
I found the correct order for some items, but not for others. Indeed, I get something like the following:
Dataframe not properly sorted:
Can someone shed a light on this?
Upvotes: 0
Views: 548
Reputation: 36370
Can someone shed a light on this?
Default sorting algorithm of sort_values is not stable, therefore order of elements with equal Product
might be different than before sorting. To avoid that you might elect to use kind="mergesort"
in sort_values
.
Upvotes: 1
Reputation: 184
Instead of passing 'Products'
, pass a list of the columns you want to sort by.
If you want to sort by 'Product' and then by 'Value', it looks like this:
result.sort_values(['Product', 'Value'], ascending=True)
Upvotes: 0