SWETA SWAIN
SWETA SWAIN

Reputation: 45

extract the row having column values length equal to 1 in R

my input data is df

     anger sad joy happy trust disgust
1     1   0   1     2     3       0
2     2   0   0     2     0       3
3     2   2   1     1     1       1
4     0   1   1     1     0       1

I want output like this

mydata

    anger sad joy happy trust disgust   col
1     1   0   1     2     3       0     trust
2     2   0   0     2     0       3     disgust

I want to extract max value colname from each row but output only those rows having only one max value colname and discard all other row with more than one colname.

i tried this

d1 <- df[!apply(df[-1], 1, function(x) anyDuplicated(x[x == max(x)])),]

but i am getting this

    anger sad joy happy trust disgust
1     1   0   1     2     3       0
2     2   0   0     2     0       3
3     2   2   1     1     1       1

I don't want third row in the output.

Thanks for help in advance.

Upvotes: 1

Views: 210

Answers (2)

akrun
akrun

Reputation: 887108

We can use max.col to get the index of columns for each row after subsetting the rows

d1 <- mydata[!apply(mydata[-1], 1, anyDuplicated),]
d1$out <- names(d1)[-1][max.col(d1[-1], 'first')]
d1
#   zone_id v1 v2 v3 v4 out
#1       1 12 15 18 20  v4
#3       3 31 28 14  2  v1
#4       4 12 16  9  5  v2
#5       5  5 18 10 12  v2

Update

If the OP wanted to remove only the duplicate values of max values, then replace the first line with

d1 <- mydata[!apply(mydata[-1], 1, function(x) anyDuplicated(x[x == max(x)])),]

Update2

Based on the newdataset by the OP, we don't need to remove the first column as it is not an id column

d2 <- mydata1[!apply(mydata1, 1, function(x) anyDuplicated(x[x == max(x)])),]

d2$out <- names(d2)[max.col(d2, 'first')]
d2
#  anger sad joy happy trust disgust     out
#1     1   0   1     2     3       0   trust
#2     2   0   0     2     0       3 disgust

data

mydata1 <- structure(list(anger = c(1L, 2L, 2L, 0L), sad = c(0L, 0L, 2L,
  1L), joy = c(1L, 0L, 1L, 1L), happy = c(2L, 2L, 1L, 1L), trust = c(3L,
  0L, 1L, 0L), disgust = c(0L, 3L, 1L, 1L)), .Names = c("anger", "sad", 
 "joy", "happy", "trust", "disgust"), row.names = c(NA, 4L), 
 class = "data.frame")

Upvotes: 1

Aramis7d
Aramis7d

Reputation: 2496

you can try:

  mydata %>%
  select(-zone_id) %>%
  mutate(mx = do.call(pmax, (.))) %>%
  select(mx) %>%
  cbind(mydata) %>%
  mutate( flg = rowSums(. == mx)) %>%
  filter(flg ==2) %>%
  select(-flg) %>%
  gather(key = out, value= v, -mx, -zone_id) %>%
  filter(mx == v) %>%
  select(zone_id, mx, out) %>%
  left_join(mydata)

which gives:

  zone_id mx out v1 v2 v3 v4
1       3 31  v1 31 28  2  2
2       4 16  v2  1 16  9  1
3       5 18  v2  5 18 10 12
4       1 20  v4 12 15 18 20

Upvotes: 0

Related Questions