I have a data frame in R which looks like this
data
x1 x2 x3a x3b x3c x3d x4
A 43 0 0 0 1 P
B 23 0 1 0 0 Q
C 11 0 0 0 0 R
D 66 0 0 1 0 S
Now I want to merge columns x3a, x3b, x3c, x3d into a single column. The new column should contain the number of the column that holds the value 1 among x3a, x3b, x3c, x3d, numbered as x3a=1, x3b=2, x3c=3, x3d=4, and 0 if none of them is 1. The expected result looks like this:
x3
[1] 4 2 0 3
I tried the reshape() function but could not get what I actually wanted:
q<-data[,3:6]
r<-reshape(q,varying=c("x3a","x3b","x3c","x3d"),v.names="x3",direction="long",times=c("x3a","x3b","x3c","x3d"))
final<-r[r$x3!=0,][,3]
But this did not give the expected result. It dropped the value 0 between 2 and 3:
final
[1] 4 2 3
Upvotes: 4
Another way would be to use unite from tidyr:
library(dplyr)
library(tidyr)
dat1 <- data
data$x3 <- unite(data[,3:6], x3, sep="") %>% #I could use starts_with("x3"), but it adds more characters
mutate(x3 = sub("-\\d", "0", paste(gregexpr("[^0]", x3)))) #x3 is character column
data
# x1 x2 x3a x3b x3c x3d x4 x3
#1 A 43 0 0 0 1 P 4
#2 B 23 0 1 0 0 Q 2
#3 C 11 0 0 0 0 R 0
#4 D 66 0 0 1 0 S 3
Suppose a row has more than one nonzero entry, and the entries are not necessarily 1:
dat1$x3c[2] <- 3
dat1$x3 <- unite(dat1[,3:6], x3, sep="") %>%
mutate(x3 = sub("-\\d", "0", paste(gregexpr("[^0]", x3))))
dat1
# x1 x2 x3a x3b x3c x3d x4 x3
#1 A 43 0 0 0 1 P 4
#2 B 23 0 1 3 0 Q 2:3
#3 C 11 0 0 0 0 R 0
#4 D 66 0 0 1 0 S 3
Upvotes: 1
Even though the question has already received a good solution, I'd add an answer based on the lesser-known max.col function, which gives the column index of the maximum element of each row. In this case:
data$x3 <- max.col(data[,3:6])*as.logical(rowSums(data[,3:6]))
This call gives 0 when all the elements of a row are 0 (as required) and the index of the row maximum otherwise. It has the advantage that the values need not be 0 or 1; any non-negative values work. If there is a tie, max.col picks one of the tied columns at random by default; set ties.method to "first" or "last" to take the first or the last maximum found instead.
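To make the tie handling concrete, here is a minimal sketch on a made-up matrix (the values in m are hypothetical, not the question's data). It uses ties.method = "first" so that the result is deterministic:

```r
# Toy matrix (hypothetical values): row 1 has a tie, row 2 is all zeros
m <- matrix(c(0, 5, 5, 0,
              0, 0, 0, 0,
              2, 0, 0, 7),
            nrow = 3, byrow = TRUE)

# "first" picks the leftmost maximum; the rowSums() factor zeroes all-zero rows
idx <- max.col(m, ties.method = "first") * as.logical(rowSums(m))
idx
# [1] 2 0 4
```

With the default ties.method = "random", the first element could come out as either 2 or 3.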
Upvotes: 4
One more: you can use match across the rows, with the nomatch argument set to zero:
apply(data[-c(1,2,length(data))] == 1, 1, match, x = TRUE, nomatch = 0L)
# [1] 4 2 0 3
Upvotes: 1
Using row and col indexing. This should be quick, as you only assign once.
data$new <- 0
tmp <- data[3:6]==1
data$new[ row(tmp)[tmp] ] <- col(tmp)[tmp]
data
# x1 x2 x3a x3b x3c x3d x4 new
#1 A 43 0 0 0 1 P 4
#2 B 23 0 1 0 0 Q 2
#3 C 11 0 0 0 0 R 0
#4 D 66 0 0 1 0 S 3
tmp can be changed to accommodate whatever logical comparison is required.
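As an illustration of swapping the comparison, the sketch below uses > 0 instead of == 1. The data frame repeats the question's data, and the change of one entry to 3 is a hypothetical tweak added only to show the effect:

```r
data <- data.frame(
  x1 = c("A", "B", "C", "D"), x2 = c(43, 23, 11, 66),
  x3a = c(0, 0, 0, 0), x3b = c(0, 1, 0, 0),
  x3c = c(0, 0, 0, 1), x3d = c(1, 0, 0, 0),
  x4 = c("P", "Q", "R", "S")
)
data$x3c[4] <- 3          # hypothetical: a nonzero value other than 1

data$new <- 0
tmp <- data[3:6] > 0      # any positive entry now counts as a match
data$new[row(tmp)[tmp]] <- col(tmp)[tmp]
data$new
# [1] 4 2 0 3
```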
Upvotes: 5
This worked:
data <- data.frame(
x1 = c('A','B','C','D'),
x2 = c(43,23,11,66),
x3a = c(0,0,0,0),
x3b = c(0,1,0,0),
x3c = c(0,0,0,1),
x3d = c(1,0,0,0),
x4 = c('P','Q','R','S')
)
data$x3 <- as.matrix(data[,c('x3a','x3b','x3c','x3d')]) %*% c(1,2,3,4)
The result:
x1 x2 x3a x3b x3c x3d x4 x3
1 A 43 0 0 0 1 P 4
2 B 23 0 1 0 0 Q 2
3 C 11 0 0 0 0 R 0
4 D 66 0 0 1 0 S 3
Chase made a relevant comment: what if x3a ... x3d take values other than zero or one? You can use ifelse() to handle that scenario:
data$x3 <- as.matrix(ifelse(data[,c('x3a','x3b','x3c','x3d')] > 0, 1, 0)) %*% c(1,2,3,4)
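One caveat worth spelling out: the matrix product is a weighted sum, so it returns a column index only when each row has at most one nonzero indicator. A small sketch on made-up rows (m is hypothetical, not the question's data):

```r
# Hypothetical indicator rows
m <- rbind(c(0, 0, 0, 1),   # single match in column 4
           c(0, 1, 1, 0))   # matches in columns 2 and 3

res <- drop(m %*% c(1, 2, 3, 4))
res
# [1] 4 5
```

The second row yields 2 + 3 = 5, which is not a valid column index, so this approach assumes at most one match per row.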
Upvotes: 10
@Barranka's answer is very clever and also vectorized; here's a less clever, less vectorized option:
as.numeric(apply(data[, 3:6], 1, function(x) which(x == 1)))
## [1] 4 2 NA 3
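If 0 is wanted instead of NA for rows without a 1 (as in the question's expected output), one possible variant is to check the length of the which() result. This is only a sketch: the data frame repeats the question's data, and first_one is a hypothetical helper name.

```r
data <- data.frame(
  x1 = c("A", "B", "C", "D"), x2 = c(43, 23, 11, 66),
  x3a = c(0, 0, 0, 0), x3b = c(0, 1, 0, 0),
  x3c = c(0, 0, 0, 1), x3d = c(1, 0, 0, 0),
  x4 = c("P", "Q", "R", "S")
)

# first_one (hypothetical helper): index of the first 1 in x, or 0 if none
first_one <- function(x) {
  hit <- which(x == 1)
  if (length(hit) > 0) hit[[1]] else 0L
}

x3 <- unname(apply(data[, 3:6], 1, first_one))
x3
## [1] 4 2 0 3
```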
Upvotes: 5