Reputation: 449

Find the sequence of columns that has maximum values

I have got a data frame where each column has values. Is there a way to create a new column and see the sequence of Max values columns. For example. Please refer my expected output

df
COlA    COlB    COLC    COLD
 34      40      4.5      50
 35      70      2.0      30
 90      40      4.0      10

Expected Output

df
COlA    COlB    COLC    COLD            NewCOl
 34      40      4.5      50      COLD  > COLB > COLA > COLC
 35      70      2.0      30      COLB > COLA > COLD > COLC
 90      40      4.0      10      COLA > COLB  > COLD > COLC

Upvotes: 0

Answers (2)

akrun

Reputation: 887501

We can reshape into 'long' format after creating a row number column, then create a new column by pasteing the 'name' column ordered based on descending values in 'value' column, reshape back into 'wide' format and select the columns

library(dplyr)
library(tidyr) #v 1.0.0    
library(stringr)
df %>%
   mutate(rn = row_number()) %>%
   pivot_longer(cols = -rn) %>%
   # or use gather for older versions
   # gather(name, value, -rn) %>%       
   group_by(rn) %>% 
   mutate(NewCol = str_c(name[order(-value)], collapse=' > ')) %>% 
   pivot_wider(names_from = name, values_from = value) %>%
   # or use spread for older versions
   # spread(name, value) %>%
   ungroup %>%
   select(names(df), NewCol)
# A tibble: 3 x 5
#   COlA  COlB  COLC  COLD NewCol                   
#  <dbl> <dbl> <dbl> <dbl> <chr>                    
#1    34    40   4.5    50 COLD > COlB > COlA > COLC
#2    35    70   2      30 COlB > COlA > COLD > COLC
#3    90    40   4      10 COlA > COlB > COLD > COLC

Or another option is pmap

library(purrr)
df %>% 
   mutate(NewCol =pmap_chr(., ~ c(...) %>%
            {names(.)[order(-.)]} %>%
            str_c(collapse=" > ")))

If there are character columns, use select_if

df %>%
     mutate(NewCol = pmap_chr(select_if(., is.numeric), ~ c(...) %>%
              {names(.)[order(-.)]} %>%
            str_c(collapse=" > ")))

data

df <-  structure(list(COlA = c(34L, 35L, 90L), COlB = c(40L, 70L, 40L
), COLC = c(4.5, 2, 4), COLD = c(50L, 30L, 10L)),
class = "data.frame", row.names = c(NA, 
-3L))

Upvotes: 1

utubun

Reputation: 4505

In base R you could do it this way (correct solution only if there are no ties):

dat$NewCol <- by(
  unlist(dat),
  row(dat),
  function(x) paste(names(dat)[order(x, decreasing = T)], collapse = ' > ')
  )

dat

#   COlA COlB COLC COLD                    NewCol
# 1   34   40  4.5   50 COLD > COlB > COlA > COLC
# 2   35   70  2.0   30 COlB > COlA > COLD > COLC
# 3   90   40  4.0   10 COlA > COlB > COLD > COLC

Data:

dat <- structure(
  list(
    COlA = c(34L, 35L, 90L),
    COlB = c(40L, 70L, 40L),
    COLC = c(4.5, 2, 4),
    COLD = c(50L, 30L, 10L)
  ),
  class = "data.frame",
  row.names = c(NA,-3L)
)

Upvotes: 1

Find the sequence of columns that has maximum values

Answers (2)

data

Related Questions