Reputation: 957
I have a dataset containing n observations and a column containing observation indices, e.g.
col1 col2 col3 ID
12 0 4 1
6 5 3 1
5 21 42 2
and want to create a new column based on my index like
col1 col2 col3 ID col_new
12 0 4 1 12
6 5 3 1 6
5 21 42 2 21
without for loops. Currently I'm doing
col_new <- rep(NA, nrow(df))
for (i in seq_len(nrow(df)))
{
col_new[i] <- df[i, df$ID[i]]
}
Is there a better (or tidyverse) way?
Upvotes: 7
Views: 3467
Reputation: 4824
Another tidyverse approach, this time using only tidyr and dplyr:
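library(dplyr)
library(tidyr)
# gather() stacks the data column by column, so track the original row to keep cbind() aligned
df %>%
mutate(row = row_number()) %>%
gather(column, col_new, -ID, -row) %>%
filter(paste0('col', ID) == column) %>%
arrange(row) %>%
select(col_new) %>%
cbind(df, .)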
It's longer than @markdly's elegant one-liner, but if you're like me and get confused by purrr most of the time, this might be easier to read.
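The same reshape-and-filter idea can be sketched with pivot_longer, which superseded gather in tidyr 1.0.0. This is only an illustration on the question's example data; the helper column name row is an assumption added here to make the row alignment explicit, and every ID is assumed to point at an existing colN column.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(
  col1 = c(12L, 6L, 5L),
  col2 = c(0L, 5L, 21L),
  col3 = c(4L, 3L, 42L),
  ID   = c(1L, 1L, 2L)
)

res <- df %>%
  mutate(row = row_number()) %>%               # hypothetical helper column
  pivot_longer(starts_with("col"),
               names_to = "column", values_to = "col_new") %>%
  filter(column == paste0("col", ID)) %>%      # keep the cell that ID points at
  arrange(row) %>%                             # restore the original row order
  select(col_new) %>%
  cbind(df, .)

res$col_new
#> [1] 12  6 21
```

Sorting by the helper column before binding means the result does not depend on the order in which the long format happens to list the rows.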
Upvotes: 1
Reputation: 4534
For a possible tidyverse approach, how about using dplyr::mutate combined with purrr::map2_int?
library(dplyr)
library(purrr)
mutate(df, new_col = map2_int(row_number(), ID, ~ df[.x, .y]))
#> col1 col2 col3 ID new_col
#> 1 12 0 4 1 12
#> 2 6 5 3 1 6
#> 3 5 21 42 2 21
Data
df <- read.table(text = "col1 col2 col3 ID
12 0 4 1
6 5 3 1
5 21 42 2", header = TRUE)
Upvotes: 5
Reputation: 886938
We can use row/column indexing from base R, which should be very fast:
df1$col_new <- df1[1:3][cbind(seq_len(nrow(df1)), df1$ID)]
df1$col_new
#[1] 12 6 21
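To see why this works, here is a small sketch of the indexing step on its own, using the question's example data. A two-column matrix used as an index is read as (row, column) pairs, so one value is picked per row. The intermediate names idx and m are introduced here only for illustration.

```r
# Example data from the question
df1 <- data.frame(
  col1 = c(12, 6, 5),
  col2 = c(0, 5, 21),
  col3 = c(4, 3, 42),
  ID   = c(1, 1, 2)
)

# One (row, column) pair per observation: (1,1), (2,1), (3,2)
idx <- cbind(seq_len(nrow(df1)), df1$ID)

# Index the value columns with the pair matrix
m <- as.matrix(df1[c("col1", "col2", "col3")])
col_new <- m[idx]

col_new
#> [1] 12  6 21
```

Because the lookup is a single vectorised subsetting call, there is no per-row loop at the R level.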
Upvotes: 5
Reputation: 28309
Solution using data.table:
library(data.table)
# Using OP's data
setDT(df)
df[, col_new := get(paste0("col", ID)), 1:nrow(df)]
# df
col1 col2 col3 ID col_new
1: 12 0 4 1 12
2: 6 5 3 1 6
3: 5 21 42 2 21
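Explanation:
Group by row, passing 1:nrow(df) as the by argument
For each row, get the column that ID points to: get(paste0("col", ID))
Assign that value to the new column: col_new :=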
Upvotes: 2