Prabhu

Reputation: 5466

R: how to merge several columns of a data frame into one

I have a data frame in R that looks like this:

data
x1  x2  x3a x3b x3c x3d x4
A   43  0   0   0   1   P
B   23  0   1   0   0   Q
C   11  0   0   0   0   R 
D   66  0   0   1   0   S

Now I want to merge columns x3a, x3b, x3c and x3d into a single column. The new column should hold the position of whichever of x3a, x3b, x3c, x3d has the value 1, numbered as x3a=1, x3b=2, x3c=3, x3d=4, or 0 when none of them is 1. The expected result looks like this:

x3
[1] 4    2    0    3

I tried the reshape() function but could not get what I wanted:

q<-data[,3:6]
r<-reshape(q,varying=c("x3a","x3b","x3c","x3d"),v.names="x3",direction="long",times=c("x3a","x3b","x3c","x3d"))
final<-r[r$x3!=0,][,3]

But this did not give the expected result. It dropped the value 0 that should appear between 2 and 3:

final
[1] 4    2    3

Upvotes: 4

Views: 751

Answers (6)

akrun

Reputation: 886938

Another way would be to use unite from tidyr:

library(dplyr)
library(tidyr) 

 dat1 <- data

 data$x3 <- unite(data[,3:6], x3, sep="") %>% #I could use starts_with("x3"), but it adds more characters
                mutate(x3 = sub("-\\d", "0", paste(gregexpr("[^0]", x3)))) #x3 is character column
 data
 #  x1 x2 x3a x3b x3c x3d x4 x3
 #1  A 43   0   0   0   1  P  4
 #2  B 23   0   1   0   0  Q  2
 #3  C 11   0   0   0   0  R  0
 #4  D 66   0   0   1   0  S  3

Suppose you have more than one match per row, and the matching values are something other than 1:

 dat1$x3c[2] <- 3
 dat1$x3 <- unite(dat1[,3:6], x3, sep="") %>% 
                      mutate(x3 = sub("-\\d", "0", paste(gregexpr("[^0]", x3))))


 dat1
 #  x1 x2 x3a x3b x3c x3d x4  x3
 #1  A 43   0   0   0   1  P   4
 #2  B 23   0   1   3   0  Q 2:3
 #3  C 11   0   0   0   0  R   0
 #4  D 66   0   0   1   0  S   3

Upvotes: 1

nicola

Reputation: 24480

Even though the question has already received a good solution, I'd add an answer based on the perhaps little-known max.col function, which gives the column index of the maximum element of each row. In this case:

  data$x3 <-  max.col(data[,3:6])*as.logical(rowSums(data[,3:6]))

This call gives 0 when all the elements of a row are 0 (as required) and the index of the maximum otherwise. It has the advantage that the flagged value does not have to be exactly 1. If there is a tie, max.col picks one of the tied columns at random by default; you can also set ties.method to return the first or the last maximum found.
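As a minimal sketch of the tie handling described above (the data frame is rebuilt from the question, and the names flags and x3 are only illustrative), ties.method = "first" makes the result reproducible:

```r
# Example data reconstructed from the question (x3a..x3d are 0/1 flags)
data <- data.frame(
  x1  = c("A", "B", "C", "D"),
  x2  = c(43, 23, 11, 66),
  x3a = c(0, 0, 0, 0),
  x3b = c(0, 1, 0, 0),
  x3c = c(0, 0, 0, 1),
  x3d = c(1, 0, 0, 0),
  x4  = c("P", "Q", "R", "S")
)

flags <- as.matrix(data[, 3:6])

# ties.method = "first" returns the first maximum instead of a random one;
# multiplying by (rowSums(flags) != 0) turns all-zero rows into 0
x3 <- max.col(flags, ties.method = "first") * (rowSums(flags) != 0)
x3
# [1] 4 2 0 3
```

Without the multiplication, the all-zero row C would report column 1, because every column ties for the maximum there.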

Upvotes: 4

Rich Scriven

Reputation: 99321

One more: you can use match across the rows, with the nomatch argument set to zero:

apply(data[-c(1, 2, length(data))] == 1, 1, match, x = TRUE, nomatch = 0L)
# [1] 4 2 0 3
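To see what match does for a single row, here is a small sketch on two hand-written logical vectors (row_b and row_c are illustrative names mimicking rows B and C of the question after the == 1 comparison):

```r
# Row B has its flag in the second position; row C has no flag at all
row_b <- c(x3a = FALSE, x3b = TRUE,  x3c = FALSE, x3d = FALSE)
row_c <- c(x3a = FALSE, x3b = FALSE, x3c = FALSE, x3d = FALSE)

# match() returns the position of the first TRUE, or nomatch if there is none
pos_b <- match(TRUE, row_b, nomatch = 0L)  # 2
pos_c <- match(TRUE, row_c, nomatch = 0L)  # 0
```

Because match stops at the first hit, a row with several flags would report only the leftmost one.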

Upvotes: 1

thelatemail

Reputation: 93813

Using row and col indexing. Should be quick as you only assign once.

data$new <- 0
tmp <- data[3:6]==1
data$new[ row(tmp)[tmp] ] <- col(tmp)[tmp]

data

#  x1 x2 x3a x3b x3c x3d x4 new
#1  A 43   0   0   0   1  P   4
#2  B 23   0   1   0   0  Q   2
#3  C 11   0   0   0   0  R   0
#4  D 66   0   0   1   0  S   3

tmp can be changed to accommodate whatever logical comparison is required.
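For example, a sketch with a different comparison, assuming (hypothetically) that row B carries the value 5 instead of 1 and that any nonzero value should count as a match:

```r
# Example data reconstructed from the question
data <- data.frame(
  x1  = c("A", "B", "C", "D"),
  x2  = c(43, 23, 11, 66),
  x3a = c(0, 0, 0, 0),
  x3b = c(0, 1, 0, 0),
  x3c = c(0, 0, 0, 1),
  x3d = c(1, 0, 0, 0),
  x4  = c("P", "Q", "R", "S")
)
data$x3b[2] <- 5          # hypothetical non-1 flag value, for illustration only

data$new <- 0
tmp <- data[3:6] != 0     # any nonzero entry counts as a match
data$new[row(tmp)[tmp]] <- col(tmp)[tmp]
data$new
# [1] 4 2 0 3
```

This still assumes at most one match per row; with several, the later column overwrites the earlier one in the assignment.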

Upvotes: 5

Barranka

Reputation: 21047

This worked:

data <- data.frame(
  x1 = c('A','B','C','D'),
  x2 = c(43,23,11,66),
  x3a = c(0,0,0,0),
  x3b = c(0,1,0,0),
  x3c = c(0,0,0,1),
  x3d = c(1,0,0,0),
  x4 = c('P','Q','R','S')
)
data$x3 <- as.matrix(data[,c('x3a','x3b','x3c','x3d')]) %*% c(1,2,3,4)

The result:

  x1 x2 x3a x3b x3c x3d x4 x3
1  A 43   0   0   0   1  P  4
2  B 23   0   1   0   0  Q  2
3  C 11   0   0   0   0  R  0
4  D 66   0   0   1   0  S  3

Chase made a comment that is relevant: What if x3a ... x3d are different than zero or one? You can use ifelse() to consider that scenario:

data$x3 <- as.matrix(ifelse(data[,c('x3a','x3b','x3c','x3d')] > 0, 1, 0)) %*% c(1,2,3,4)
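One caveat worth illustrating: the weighted sum relies on each row having at most one flag set. A small sketch with a made-up two-row matrix (flags and codes are illustrative names, not part of the original data) shows how two different rows can collapse to the same code:

```r
# Hypothetical rows: one with only x3d set, one with both x3a and x3c set
flags <- rbind(
  only_d  = c(x3a = 0, x3b = 0, x3c = 0, x3d = 1),
  a_and_c = c(x3a = 1, x3b = 0, x3c = 1, x3d = 0)
)

# Same weights as in the answer: 1*x3a + 2*x3b + 3*x3c + 4*x3d
codes <- as.vector(flags %*% c(1, 2, 3, 4))
codes
# [1] 4 4
```

If several flags per row are possible, one of the index-based approaches in the other answers may be a safer choice.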

Upvotes: 10

David Arenburg

Reputation: 92282

@Barranka's answer is very clever and also vectorized; here's a less clever, less vectorized option:

as.numeric(apply(data[, 3:6], 1, function(x) which(x == 1)))
## [1]  4  2 NA  3
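Note that this returns NA, not 0, for the row without a 1. If 0 is wanted there, as in the question, one possible variant (a sketch, not the answer's original code; the data frame is rebuilt from the question) handles the empty case explicitly:

```r
# Example data reconstructed from the question
data <- data.frame(
  x1  = c("A", "B", "C", "D"),
  x2  = c(43, 23, 11, 66),
  x3a = c(0, 0, 0, 0),
  x3b = c(0, 1, 0, 0),
  x3c = c(0, 0, 0, 1),
  x3d = c(1, 0, 0, 0),
  x4  = c("P", "Q", "R", "S")
)

x3 <- unname(apply(data[, 3:6], 1, function(x) {
  idx <- which(x == 1)                      # positions of the 1s in this row
  if (length(idx) == 0L) 0L else idx[[1L]]  # 0 if none, else the first position
}))
x3
# [1] 4 2 0 3
```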

Upvotes: 5
