Alex
Alex

Reputation: 198

Sort Dataframe by Descending Rows AND Columns at the Same Time

Currently have a dataframe that is countries by series, with values ranging from 0-25

I want to sort the df so that the highest values appear in the top left (first), while the lowest appear in the bottom right (last).

FROM

        A   B   C   D  ...

USA     4   0   10  16
CHN     2   3   13  22
UK      2   1   8   14
...

TO

      D   C   A   B   ...

CHN   22  13  2   3
USA   16  10  4   0
UK    14  8   2   1

...

In this, the column with the highest values is now first, and the same is true with the index.

I have considered reindexing, but this loses the 'Countries' Index.

    D   C   A   B   ...

0   22  13  2   3
1   16  10  4   0
2   14  8   2   1
...

I have thought about creating a new column and row that has the Mean or Sum of values for that respective column/row, but is this the most efficient way?

How would I then sort the DF after I have the new rows/columns??

Is there a way to reindex using...

df_mv.reindex(df_mv.mean(or sum)().sort_values(ascending = False).index, axis=1)

... that would allow me to keep the country index, and simply sort it accordingly?

Thanks for any and all advice or assistance.

EDIT

Intended result organizes columns AND rows from largest to smallest.

Regarding the first row of the A and B columns in the intended output, these are supposed to be 2, 3 respectively. This is because the intended result interprets the A column as greater than the B column in both sum and mean (even though either sum or mean can be considered for the 'value' of a row/column).

By saying the higher numbers would be in the top left, while the lower ones would be in the bottom right, I simply meant this as a general trend for the resulting df. It is the columns and rows as whole however, that are the intended focus. I apologize for the confusion.

Upvotes: 4

Views: 1389

Answers (4)

Mykola Zotko
Mykola Zotko

Reputation: 17804

Using numpy:

arr = df.to_numpy()
arr = arr[np.max(arr, axis=1).argsort()[::-1], :]
arr = np.sort(arr, axis=1)[:, ::-1]
df1 = pd.DataFrame(arr, index=df.index, columns=df.columns)

print(df1)

Output:

      A   B  C  D
USA  22  13  3  2
CHN  16  10  4  0
UK   14   8  2  1

Upvotes: 1

gosuto
gosuto

Reputation: 5741

Here's another way, this time without transposing but using axis=1 as an argument:

df = df.sort_values(df.max().idxmax(), ascending=False)
df = df.sort_values(df.index[0], axis=1, ascending=False)

Upvotes: 1

ansev
ansev

Reputation: 30920

You could use:

rows_index=df.max(axis=1).sort_values(ascending=False).index
col_index=df.max().sort_values(ascending=False).index
new_df=df.loc[rows_index,col_index]
print(new_df)

      D   C  A  B
CHN  22  13  2  3
USA  16  10  4  0
UK   14   8  2  1

Upvotes: 5

gosuto
gosuto

Reputation: 5741

Use .T to transpose rows to columns and vice versa:

df = df.sort_values(df.max().idxmax(), ascending=False)
df = df.T
df = df.sort_values(df.columns[0], ascending=False).T

Result:

>>> df
      D   C  B  A
CHN  22  13  3  2
USA  16  10  0  4
UK   14   8  1  2

Upvotes: 3

Related Questions