Reputation: 1869
I don't understand why the code below is not working. I have the following DataFrame:
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_tuples([(2, 9), (2, 0), (3, 15), (3, 8), (2, 28), (2, 15), (2, 10), (3, 9)], names=['A', 'B'])
values = [0.2719, 0.2938, 0.3281, 0.3310, 0.3323, 0.3640, 0.3647, 0.5218]
df = pd.DataFrame(data=values, index=ind, columns=['values'])
Applying sort_values within a groupby doesn't do anything:
df.groupby('A').apply(lambda x: x.sort_values(by='values'))
Note that the values are already globally sorted.
Now when I just swap two rows, thereby destroying the prior global sorting, it suddenly works:
df1 = df.iloc[np.r_[1,0,2:len(df)]]
df1.groupby('A').apply(lambda x: x.sort_values(by='values'))
This is the result I would expect from the first snippet as well.
Upvotes: 2
Views: 1008
Reputation: 1446
The docs don't say a great deal about the combine part of split-apply-combine:
GroupBy will examine the results of the apply step and try to return a sensibly combined result.
Since you're not changing the number of rows or their order in the first example, apply functions more like transform, which returns a "like-indexed object".
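To make that concrete, here is a sketch using the frames from the question. The exact output depends on your pandas version: newer releases have changed how apply combines results and whether the group keys get added, so on a recent pandas both calls may come back with A prepended as an extra index level.
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_tuples(
    [(2, 9), (2, 0), (3, 15), (3, 8), (2, 28), (2, 15), (2, 10), (3, 9)],
    names=['A', 'B'])
values = [0.2719, 0.2938, 0.3281, 0.3310, 0.3323, 0.3640, 0.3647, 0.5218]
df = pd.DataFrame(data=values, index=ind, columns=['values'])

# 'values' is already ascending within each value of A, so the lambda
# hands every group back unchanged. The combine step (in the pandas
# versions this question is about) sees a like-indexed result and pieces
# it back together in the original row order, so nothing appears to happen.
print(df.groupby('A').apply(lambda x: x.sort_values(by='values')))

# After swapping the first two rows, group A=2 comes back from the lambda
# in a different row order. The result is no longer like-indexed, so the
# group chunks are concatenated group by group and the sort "works".
df1 = df.iloc[np.r_[1, 0, 2:len(df)]]
print(df1.groupby('A').apply(lambda x: x.sort_values(by='values')))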
I think if what you want is a nested sort, you can just pass a list to sort_values directly, like so:
df.sort_values(["A", "values"])
       values
A B
2 9    0.2719
  0    0.2938
  28   0.3323
  15   0.3640
  10   0.3647
3 15   0.3281
  8    0.3310
  9    0.5218
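Note that "A" here is an index level rather than a column, and sort_values accepting index level names in by needs a reasonably recent pandas (0.23+, if I remember correctly). On an older version you can get the same nested sort by lifting the index into columns first:
# Same nested sort on older pandas: move the index into columns, sort,
# then restore the MultiIndex.
df.reset_index().sort_values(['A', 'values']).set_index(['A', 'B'])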
Upvotes: 1