Karl Birkkar

Reputation: 125

Sorting a concatenated pandas dataframe

I want to concatenate two pandas dataframes A and B and afterwards sort the result by two columns, 'geohash' and 'timestamp'.

A
    geohash  timestamp
0   a2a      15
1   b3a      14

B
    geohash  timestamp
0   a2b      15
1   b3b      14

After running

AB = pd.concat([A,B],ignore_index=True)
AB.sort_values(['geohash','timestamp'])

I expect

AB
    geohash  timestamp
0   a2a      15
1   a2b      15
2   b3a      14
3   b3b      14

But I get

AB
    geohash  timestamp
0   a2a      15
1   b3a      14
2   a2b      15
3   b3b      14

Why doesn't pandas sort the whole dataframe AB?

Upvotes: 4

Views: 8778

Answers (1)

johnchase

Reputation: 13705

sort_values does not happen in place. So when you run:

AB.sort_values(['geohash','timestamp'])

AB is not updated; a sorted copy is returned instead, and since it is not assigned to anything it is discarded. Passing inplace=True:

AB.sort_values(['geohash','timestamp'], inplace=True)

will update AB.
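
A minimal sketch of the difference, rebuilding the question's A and B by hand (how the frames were actually created is an assumption, as are the helper names order_before, result and order_after):

```python
import pandas as pd

# Assumed reconstruction of the two frames from the question.
A = pd.DataFrame({"geohash": ["a2a", "b3a"], "timestamp": [15, 14]})
B = pd.DataFrame({"geohash": ["a2b", "b3b"], "timestamp": [15, 14]})
AB = pd.concat([A, B], ignore_index=True)

# Without assignment the sorted copy is thrown away and AB keeps its order.
AB.sort_values(["geohash", "timestamp"])
order_before = AB["geohash"].tolist()

# With inplace=True the call returns None and AB itself is reordered.
result = AB.sort_values(["geohash", "timestamp"], inplace=True)
order_after = AB["geohash"].tolist()

print(order_before)
print(order_after)
```

Because the in-place call returns None, writing AB = AB.sort_values(..., inplace=True) would replace AB with None, so use either assignment or inplace=True, not both.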

Alternatively, you can assign the sorted dataframe to a new variable:

AB_sorted = AB.sort_values(['geohash','timestamp'])
AB_sorted

    geohash  timestamp
0   a2a      15
2   a2b      15
1   b3a      14
3   b3b      14
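
Note that the sorted frame keeps the original row labels (0, 2, 1, 3). If you want the 0 to 3 numbering from the question's expected output, one option is to reset the index after sorting. A sketch, assuming the frames are built as in the question (the construction below is a reconstruction, not the asker's code):

```python
import pandas as pd

# Assumed reconstruction of the two frames from the question.
A = pd.DataFrame({"geohash": ["a2a", "b3a"], "timestamp": [15, 14]})
B = pd.DataFrame({"geohash": ["a2b", "b3b"], "timestamp": [15, 14]})
AB = pd.concat([A, B], ignore_index=True)

# Sort, then drop the old labels so the rows are renumbered from 0.
AB_sorted = AB.sort_values(["geohash", "timestamp"]).reset_index(drop=True)
print(AB_sorted)
```

On pandas 1.0 or later, sort_values also accepts ignore_index=True, which should give the same renumbering in a single call.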

Upvotes: 6
