Jakeeln
Reputation: 353

R sorting: put each row in order independently

I am fairly new to R. I have data that looks like this:

dataset:

    a   b   c   d
r1  1   3   4   6
r2  12  13  11  4
r3  12  94  12  0
r4  0   2   5   0
r5  3   1   4   1

I would like to know which column has the highest value in each row:

r1: d
r2: b
r3: b
r4: c
r5: c

Also, how would I extend this to a larger dataset, for example to find the 5 columns with the largest values (in order) and the 5 columns with the lowest values (in order)?

Upvotes: 0

Views: 171

Answers (2)

alistaire
Reputation: 43334

Subsetting names with max.col is handy:

# a matrix makes sense for this data
x <- structure(c(1L, 12L, 12L, 0L, 3L, 3L, 13L, 94L, 2L, 1L, 4L, 11L, 
    12L, 5L, 4L, 6L, 4L, 0L, 0L, 1L), .Dim = c(5L, 4L), .Dimnames = list(
    c("r1", "r2", "r3", "r4", "r5"), c("a", "b", "c", "d")))

# column name of row maximum
colnames(x)[max.col(x)]
#> [1] "d" "b" "b" "c" "c"

# column name of row minimum; max.col breaks ties at random by default,
# so ties.method = "first" is set to return the first occurrence
colnames(x)[max.col(-x, ties.method = "first")]
#> [1] "a" "d" "d" "a" "b"

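# row name of column maximum; column a ties between r2 and r3,
# so ties.method = "first" keeps the result reproducible
rownames(x)[max.col(t(x), ties.method = "first")]
#> [1] "r2" "r3" "r3" "r1"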
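
To see how the tie-breaking options of max.col differ, here is a minimal sketch on a small made-up matrix (m is hypothetical and not part of the question's data). It assumes only the documented ties.method argument of max.col, whose default is "random":

```r
# Hypothetical 2 x 3 matrix with a tie for the maximum in each row.
# matrix() fills column by column, so the rows are:
#   r1: 5 5 2   (tie between a and b)
#   r2: 1 7 7   (tie between b and c)
m <- matrix(c(5L, 1L, 5L, 7L, 2L, 7L),
            nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

# "first" and "last" are deterministic; the default "random" is not
first_max <- colnames(m)[max.col(m, ties.method = "first")]
last_max  <- colnames(m)[max.col(m, ties.method = "last")]

print(first_max)  # "a" "b"
print(last_max)   # "b" "c"
```

If the results need to be reproducible across runs, setting ties.method explicitly is probably the safer choice.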

Upvotes: 1

d.b
Reputation: 32548

Use apply to find which element is the maximum in each row, then look up the corresponding column name:

apply(df, 1, function(x) colnames(df)[which.max(x)])
# r1  r2  r3  r4  r5 
#"d" "b" "b" "c" "c"

For the columns corresponding to the top two values in each row:

# use decreasing = FALSE instead to get the lowest two values
apply(X = df, MARGIN = 1, FUN = function(x)
    colnames(df)[order(x, decreasing = TRUE)[1:2]])
#     r1  r2  r3  r4  r5 
#[1,] "d" "b" "b" "c" "c"
#[2,] "c" "a" "a" "b" "a"
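
To extend this to the largest or lowest n columns, as the question asks, the same order() idea can be wrapped in a small function. This is only a sketch: top_cols and its arguments are hypothetical names introduced here, and the data is rebuilt inside the block so it runs on its own.

```r
# Same data as in the question, rebuilt so the block is self-contained
df <- data.frame(
  a = c(1L, 12L, 12L, 0L, 3L),
  b = c(3L, 13L, 94L, 2L, 1L),
  c = c(4L, 11L, 12L, 5L, 4L),
  d = c(6L, 4L, 0L, 0L, 1L),
  row.names = c("r1", "r2", "r3", "r4", "r5")
)

# Hypothetical helper: names of the n largest (or smallest) columns per row
top_cols <- function(data, n, decreasing = TRUE) {
  m <- as.matrix(data)
  n <- min(n, ncol(m))  # cannot return more columns than exist
  res <- apply(m, 1, function(x) {
    colnames(m)[order(x, decreasing = decreasing)[seq_len(n)]]
  })
  # apply() returns a plain vector when n == 1, so force a matrix either way
  res <- matrix(res, nrow = n, dimnames = list(NULL, rownames(m)))
  t(res)  # one row per original row, columns ordered by rank
}

top2 <- top_cols(df, 2)                      # two largest, largest first
low2 <- top_cols(df, 2, decreasing = FALSE)  # two smallest, smallest first
print(top2)
print(low2)
```

With a wider dataset, top_cols(df, 5) and top_cols(df, 5, decreasing = FALSE) would give the 5 largest and 5 lowest columns per row. With only four columns, as here, n is capped at 4. Tied values are returned in their original column order.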

DATA

df = structure(list(a = c(1L, 12L, 12L, 0L, 3L), b = c(3L, 13L, 94L, 
2L, 1L), c = c(4L, 11L, 12L, 5L, 4L), d = c(6L, 4L, 0L, 0L, 1L
)), .Names = c("a", "b", "c", "d"), class = "data.frame", row.names = c("r1", 
"r2", "r3", "r4", "r5"))

Upvotes: 4
