Reputation: 25
I have a vector and a data set that are similar to:
id_vector <- as.character(c("n01", "n02", "n03"))
df_1 <- data.frame("id" = c("n01", "n02", "n02", "n03"), "n01" = NA, "n02" = NA, "n03" = NA)
df_1$id <- as.character(df_1$id)
And I want the data set to be:
df_2 <- data.frame("id" = c("n01", "n02", "n02", "n03"), "n01" = c(1, NA, NA, NA), "n02" = c(NA, 1, 1, NA), "n03" = c(NA, NA, NA, 1))
The solution should be simple, something like:
for (i in id_vector){
df_1[i][df_1$id == i] <- 1
}
However, I can't use two []s. The error is:
Error in `[<-.data.frame`(`*tmp*`, df_1$id == i, value = 1) :
duplicate subscripts for columns
Any help?
Thanks!
Upvotes: 1
Reputation: 388907
You can build a two-column matrix of row/column indices and use it to set the matching cells to 1.
df_1[id_vector][cbind(seq_len(nrow(df_1)), match(df_1$id, id_vector))] <- 1
df_1
#   id n01 n02 n03
#1 n01   1  NA  NA
#2 n02  NA   1  NA
#3 n02  NA   1  NA
#4 n03  NA  NA   1
To explain the above: match gives the column number to replace for each row, and seq_len(nrow(df_1)) gives the row sequence 1:nrow(df_1). cbind combines the two into a two-column matrix of (row, column) pairs.
cbind(seq_len(nrow(df_1)), match(df_1$id, id_vector))
#     [,1] [,2]
#[1,]    1    1
#[2,]    2    2
#[3,]    3    2
#[4,]    4    3
Now we select only the id_vector columns, index that data frame with the above matrix, and assign 1 to the selected cells.
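The same steps can also be written out one at a time, as a minimal sketch. The intermediate names rows, cols, idx and sub are only for illustration and are not part of the one-liner.

```r
# Rebuild the example data from the question
id_vector <- c("n01", "n02", "n03")
df_1 <- data.frame(id = c("n01", "n02", "n02", "n03"),
                   n01 = NA, n02 = NA, n03 = NA,
                   stringsAsFactors = FALSE)

rows <- seq_len(nrow(df_1))        # 1, 2, 3, 4
cols <- match(df_1$id, id_vector)  # 1, 2, 2, 3: position of each id in id_vector
idx  <- cbind(rows, cols)          # one (row, column) pair per row of df_1

sub <- df_1[id_vector]             # keep only the n01, n02, n03 columns
sub[idx] <- 1                      # matrix indexing sets one cell per pair
df_1[id_vector] <- sub             # write the modified columns back

df_1
```

Splitting the assignment this way makes it easier to inspect idx before anything is overwritten.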
Upvotes: 0
Reputation: 887048
Here, we can extract the column as a vector with [[. df_1[1] is still a data.frame with a single column, so a second [ applied to it indexes columns rather than rows:
for (i in id_vector){
df_1[[i]][df_1$id == i] <- 1
}
identical(df_1, df_2)
#[1] TRUE
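As a quick check of the difference between the two operators, here is a small sketch on a fresh copy of the question's data. The name df_check is only for illustration.

```r
# Fresh copy of the example data (df_check is a hypothetical name)
df_check <- data.frame(id = c("n01", "n02", "n02", "n03"),
                       n01 = NA, n02 = NA, n03 = NA,
                       stringsAsFactors = FALSE)

class(df_check["n01"])    # "data.frame": [ keeps the data frame wrapper
class(df_check[["n01"]])  # "logical": [[ returns the column vector itself

# Because [[ returns a plain vector, a logical row index works on it
df_check[["n01"]][df_check$id == "n01"] <- 1
df_check$n01
```

The same applies to df_check$n01, which also returns the column as a vector.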
Upvotes: 1