Reputation: 33
I have a dataframe called imp2
(with about 6,000 rows) for which there are 9 columns labeled 'savres1'...'savres9'
. Values in each of these columns are either 1 or 0. For each row, the value of only 1 of these columns is 1 (rest are 0).
I'd like to create a new column called 'savres'
into which I would put a value between 1 and 9 that matches the index of the column (within this set of 9 columns) which holds the 1 value. E.g. if 'savres7'
is 1 and the rest of these columns are 0, then savres
should equal 7.
I've used several variations of which
, match
and %in%
to find the value, but I'm not getting the output I expect.
Below is a recent trial function and a sample set of rows that are created in the dataframe.
Recent attempted solution:
imp2 <- within(imp2, savres <- which(c(savres1, savres2, savres3, savres4, savres5, savres6, savres7, savres8, savres9) == 1) %% 9 + 1)
Results:
(Note, I didn't include all columns, but from the 4 here, you can see that the function isn't working - the 'savres'
value in row 4 should be 3 and those for rows 1 and 2 should not be 4 or 2)
sample <- head(imp2[c('savres1','savres2','savres3','savres4')],4)
savres | savres1 | savres2 | savres3 | savres4
4 | 0 | 0 | 0 | 0
2 | 0 | 0 | 0 | 0
1 | 1 | 0 | 0 | 0
1 | 0 | 0 | 1 | 0
Can anyone recommend a solution?
I thought this should be a fairly simple operation, but I've been frustrated with a number of approaches I've tried.
Upvotes: 1
Views: 828
Reputation: 887193
We can use max.col
i1 <- grep("^savre", names(df1))
transform(df1, savre = (max.col(df1[i1], "first"))* !!rowSums(df1[i1]))
# col1 col2 savres1 savres2 savres3 savres4 savre
#1 1 2 0 0 0 0 0
#2 2 3 0 0 0 0 0
#3 3 4 1 0 0 0 1
#4 4 5 0 0 1 0 3
df1 <- data.frame(col1 = 1:4, col2 = 2:5, savres1 = c(0, 0, 1,0),
savres2 = 0, savres3 = c(0, 0, 0, 1), savres4 = 0)
Upvotes: 1