pricasari

Reputation: 25

Loop to assign a value if an observation in one column is equal to other column's name in R

I have a vector and a data set that are similar to:

id_vector <- as.character(c("n01", "n02", "n03"))
df_1 <- data.frame("id" = c("n01", "n02", "n02", "n03"), "n01" = NA, "n02" = NA,  "n03" = NA)
df_1$id <- as.character(df_1$id)

And I want the data set to be:

df_2 <- data.frame("id" = c("n01", "n02", "n02", "n03"), "n01" = c(1, NA, NA, NA), "n02" = c(NA, 1, 1, NA),  "n03" = c(NA, NA, NA, 1))

The solution should be simple, something like:

for (i in id_vector){
  df_1[i][df_1$id == i] <- 1
}

However, chaining two [] subscripts like this does not work. The error is:

Error in `[<-.data.frame`(`*tmp*`, df_1$id == i, value = 1) : 
duplicate subscripts for columns 

Any help?

Thanks!

Upvotes: 1

Views: 35

Answers (2)

Ronak Shah

Reputation: 388907

You can build a two-column matrix of row/column indices and use it to set the matching cells to 1.

df_1[id_vector][cbind(seq_len(nrow(df_1)), match(df_1$id, id_vector))] <- 1
df_1

#   id n01 n02 n03
#1 n01   1  NA  NA
#2 n02  NA   1  NA
#3 n02  NA   1  NA
#4 n03  NA  NA   1

To explain the above: match gives the column number to replace for each row, whereas seq_len(nrow(df_1)) gives the row numbers, i.e. the sequence 1:nrow(df_1). Using cbind, we combine them into a two-column matrix of (row, column) pairs.

cbind(seq_len(nrow(df_1)), match(df_1$id, id_vector))
#     [,1] [,2]
#[1,]    1    1
#[2,]    2    2
#[3,]    3    2
#[4,]    4    3

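Now we subset only the id_vector columns, index that subset with the above matrix, and assign 1 to the selected cells. Note that the column numbers from match are positions within id_vector, which is why the matrix is applied to df_1[id_vector] rather than to df_1 directly.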
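As a minimal sketch of the same idea on a smaller case, the snippet below uses made-up data (toy, key and cols are hypothetical names, not from the question). It shows that the column positions produced by match refer to the subset toy[cols], so the key column itself is never touched.

```r
# Toy data (hypothetical, not from the question)
toy <- data.frame(key = c("b", "a", "b"), a = NA, b = NA)
cols <- c("a", "b")

# One (row, column) pair per row of `toy`; the column position is
# looked up within `cols`, so it indexes toy[cols], not toy itself
idx <- cbind(seq_len(nrow(toy)), match(toy$key, cols))

# Assign 1 to exactly the cells named by the index matrix
toy[cols][idx] <- 1
toy
```

After this runs, column a should hold NA, 1, NA and column b should hold 1, NA, 1, with key unchanged.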

Upvotes: 0

akrun

Reputation: 887048

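Here, we can extract the column as a vector with [[. df_1[i] is still a data.frame with a single column, and a single subscript on a data.frame selects columns, so the logical vector df_1$id == i is treated as a column index rather than a row filter. df_1[[i]] returns the column itself, which the logical vector can then index element by element.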

for (i in id_vector) {
  df_1[[i]][df_1$id == i] <- 1
}

identical(df_1, df_2)
#[1] TRUE
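To see the difference between the two subscript operators in isolation, here is a small sketch on a made-up two-row data frame (df is a hypothetical name, not the data from the question):

```r
# Small example data (hypothetical, not from the question)
df <- data.frame(id = c("n01", "n02"), n01 = NA)

class(df["n01"])    # "data.frame": single brackets keep the data frame wrapper
class(df[["n01"]])  # "logical": double brackets return the column vector

# Because df[["n01"]] is a plain vector, a logical mask indexes its elements
df[["n01"]][df$id == "n01"] <- 1
df$n01
```

The last line should show 1 for the first row and NA for the second, since only the first id equals "n01".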

Upvotes: 1
