P. Z

Reputation: 99

Take a variable from a data frame row based on index found on a column

I want to add a new column to a data frame that holds, for each row, the value from the column whose position is given in the last column of that row.

My data frame is something like this:

  v1 v2 v3 v4 v5
1  A  K  F  W  2
2  B  O  J  Q  4
3  C  M  T  A  3
4  D  Z  R  B  2

and I want to get this

  v1 v2 v3 v4 v5 v6
1  A  K  F  W  2  K
2  B  O  J  Q  4  Q
3  C  M  T  A  3  T
4  D  Z  R  B  2  Z

at the end.

Does anyone have any ideas on how to do this?

Upvotes: 3

Views: 118

Answers (3)

www

Reputation: 39154

Another option in base R. We can use sapply to loop through every row of the data frame and pick the string at the column position given in the 5th column.

# For each row x, drop column 5 and take the value at column position dat[[5]][x]
dat$v6 <- sapply(seq_len(nrow(dat)), function(x) dat[-5][x, dat[[5]][x]])
dat
#   v1 v2 v3 v4 v5 v6
# 1  A  K  F  W  2  K
# 2  B  O  J  Q  4  Q
# 3  C  M  T  A  3  T
# 4  D  Z  R  B  2  Z

DATA

dat <- read.table(text = "  v1 v2 v3 v4 v5
1  A  K  F  W  2
2  B  O  J  Q  4
3  C  M  T  A  3
4  D  Z  R  B  2",
                  header = TRUE, stringsAsFactors = FALSE)
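
As a hedged variant of the loop above, vapply can stand in for sapply so that the result is checked to be exactly one character value per row. This is only a sketch: it assumes every value in v5 is a valid column position between 1 and 4, it rebuilds the example data with data.frame() rather than read.table(), and the helper name pick_by_index is hypothetical.

```r
# Recreate the example data (same values as in the question).
dat <- data.frame(
  v1 = c("A", "B", "C", "D"),
  v2 = c("K", "O", "M", "Z"),
  v3 = c("F", "J", "T", "R"),
  v4 = c("W", "Q", "A", "B"),
  v5 = c(2, 4, 3, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each row, return the value from the column
# whose position (among the non-index columns) is stored in `idx_col`.
pick_by_index <- function(d, idx_col) {
  values <- d[setdiff(names(d), idx_col)]  # candidate columns only
  vapply(
    seq_len(nrow(d)),
    function(i) values[[d[[idx_col]][i]]][i],
    character(1)                           # enforce one string per row
  )
}

dat$v6 <- pick_by_index(dat, "v5")
dat$v6
```

If a candidate column were not character (for example a factor), vapply would stop with an error instead of silently returning a different type, which is the main reason to prefer it here.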

Upvotes: 1

Param

Reputation: 57

I tried the following with dplyr, adding a group variable (grp) to identify each row; it is required as the row index. The mutate isn't perfect because I refer to df again inside it. If anyone can correct it, please share :).

library(dplyr)
dfNew <- df %>% mutate(grp = seq_len(nrow(df))) %>% group_by(grp) %>% mutate(v6 = df[grp, v5]) %>% ungroup() %>% select(-grp)

I got the result, along with a few warnings (I think they were caused by the character encoding). I agree with the comment above: please add a line of code that creates the data.

dfNew
# A tibble: 4 x 6
      v1     v2     v3     v4    v5    v6
    <fctr> <fctr> <fctr> <fctr> <dbl> <chr>
1      a      k      f      w     2     k
2      b      o      j      q     4     q
3      c      m      t      a     3     t
4      d      z      r      b     2     z
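
One possible way to avoid referring to df inside mutate is to work row by row with rowwise() and c_across(). This is a sketch rather than a definitive fix: it assumes dplyr 1.0.0 or later (where c_across() is available), character rather than factor columns, and the upper-case example data from the question.

```r
library(dplyr)

# Example data from the question, with character columns.
df <- data.frame(
  v1 = c("A", "B", "C", "D"),
  v2 = c("K", "O", "M", "Z"),
  v3 = c("F", "J", "T", "R"),
  v4 = c("W", "Q", "A", "B"),
  v5 = c(2, 4, 3, 2),
  stringsAsFactors = FALSE
)

dfNew <- df %>%
  rowwise() %>%                              # evaluate mutate once per row
  mutate(v6 = c_across(v1:v4)[v5]) %>%       # row's v1..v4 as a vector, indexed by v5
  ungroup()

dfNew$v6
```

Because rowwise() already treats each row as its own group, the extra grp column is not needed in this version.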

Upvotes: 0

akrun

Reputation: 886938

We can use row/column matrix indexing: cbind the row index (1:nrow(df1) or seq_len(nrow(df1))) with the column index stored in 'v5', use the resulting two-column matrix to extract the matching elements from the first 4 columns of the dataset, and assign them to 'v6'.

df1$v6 <- df1[-5][cbind(1:nrow(df1), df1$v5)]
df1
#  v1 v2 v3 v4 v5 v6
#1  A  K  F  W  2  K
#2  B  O  J  Q  4  Q
#3  C  M  T  A  3  T
#4  D  Z  R  B  2  Z
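
To make the indexing step more concrete, here is a small self-contained sketch that builds the index matrix explicitly before using it. The data.frame() call that creates df1 and the intermediate names idx and m are assumptions added for illustration; the answer itself does not show how df1 was created.

```r
# Example data from the question (assumed construction of df1).
df1 <- data.frame(
  v1 = c("A", "B", "C", "D"),
  v2 = c("K", "O", "M", "Z"),
  v3 = c("F", "J", "T", "R"),
  v4 = c("W", "Q", "A", "B"),
  v5 = c(2, 4, 3, 2),
  stringsAsFactors = FALSE
)

# Two-column index matrix: column 1 is the row number, column 2 the column position.
idx <- cbind(seq_len(nrow(df1)), df1$v5)
idx

# Character matrix of the candidate columns (everything except v5).
m <- as.matrix(df1[-5])

# Indexing a matrix with a two-column matrix picks one element per (row, column) pair.
df1$v6 <- m[idx]
df1$v6
```

Each row of idx is a (row, column) pair, so the lookup is vectorised and no explicit loop over the rows is needed.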

Upvotes: 4
