Andreas Hoff
Andreas Hoff

Reputation: 3

Using R, how to select values from different columns depending on one column with the name of the column to select?

I need to create a new column in my data frame, with value value chosen from another column, with a different name per row.

E.g., I have something like this data frame, df, describing what happened (events like "a", "h" and so on) to some persons (id 10,12,13..23) on seven days in a week:

id  day mon tue wed thu fri sat sun
10  wed a   y   b   j   j   b   a
12  wed b   e   h   y   b   h   b
13  tue h   y   j   b   h   j   h
14  thu j   u   b   h   j   b   j
16  thu y   i   h   j   y   h   y
19  fri e   y   j   y   a   j   e
20  sun y   e   y   a   b   y   y
21  mon u   y   a   b   h   a   u
23  mon i   u   b   h   j   b   i

I need a new column "val", showing the value on the day mentioned in the "day" variable.

Hence, like this:

id  day val mon tue wed thu fri sat sun
10  wed b   a   y   b   j   j   b   a
12  wed h   b   e   h   y   b   h   b
13  tue y   h   y   j   b   h   j   h
14  thu h   j   u   b   h   j   b   j
16  thu j   y   i   h   j   y   h   y
19  fri a   e   y   j   y   a   j   e
20  sun y   y   e   y   a   b   y   y
21  mon u   u   y   a   b   h   a   u
23  mon i   i   u   b   h   j   b   i

I tried making a function I could apply on a column to produce a new column

lookupfunction <- function(x) {
  rownumberofx <- which(x=x)
  dayvalue <- df[rownumberofx,"day"]
  dayvalue
    rownumberofx <- NULL

}
df$val <- lookupfunction(df$day)

I hope to learn a code to produce the column "val"

Upvotes: 0

Views: 617

Answers (2)

davsjob
davsjob

Reputation: 1950

Usually having weekdays, or dates etc as columns make the anlysis harder. Oftern converting the data frame to "long" helps. Try:

Code

library(dplyr)
library(tidyr)

df %>% 
  gather(weekday, letter, -id, -day) %>% 
  group_by(id) %>% 
  mutate(val = letter[day == weekday]) %>% 
  spread(weekday, letter)

Result

     id day   val   fri   mon   sat   sun   thu   tue   wed  
  <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1    10 wed   b     j     a     b     a     j     y     b    
2    12 wed   h     b     b     h     b     y     e     h    
3    13 tue   y     h     h     j     h     b     y     j    
4    14 thu   h     j     j     b     j     h     u     b    
5    16 thu   j     y     y     h     y     j     i     h    
6    19 fri   a     a     e     j     e     y     y     j    
7    20 sun   y     b     y     y     y     a     e     y    
8    21 mon   u     h     u     a     u     b     y     a    
9    23 mon   i     j     i     b     i     h     u     b   

Data

df <- structure(list(id = c(10L, 12L, 13L, 14L, 16L, 19L, 20L, 21L, 
                            23L), day = c("wed", "wed", "tue", "thu", "thu", "fri", "sun", 
                                          "mon", "mon"), mon = c("a", "b", "h", "j", "y", "e", "y", "u", 
                                                                 "i"), tue = c("y", "e", "y", "u", "i", "y", "e", "y", "u"), wed = c("b", 
                                                                                                                                     "h", "j", "b", "h", "j", "y", "a", "b"), thu = c("j", "y", "b", 
                                                                                                                                                                                      "h", "j", "y", "a", "b", "h"), fri = c("j", "b", "h", "j", "y", 
                                                                                                                                                                                                                             "a", "b", "h", "j"), sat = c("b", "h", "j", "b", "h", "j", "y", 
                                                                                                                                                                                                                                                          "a", "b"), sun = c("a", "b", "h", "j", "y", "e", "y", "u", "i"
                                                                                                                                                                                                                                                          )), .Names = c("id", "day", "mon", "tue", "wed", "thu", "fri", 
                                                                                                                                                                                                                                                                         "sat", "sun"), row.names = c(NA, -9L), class = c("data.table", 
                                                                                                                                                                                                                                                                                                                          "data.frame"))

Upvotes: 0

Roland
Roland

Reputation: 132576

You can use subsetting with an index matrix (see help("[")):

#make sure that factor levels are in the same order as the columns
DF$day <- factor(DF$day, levels = names(DF)[-(1:2)])

#index matrix (does as.integer(DF$day) automatically)
ind <- cbind(seq_len(nrow(DF)), DF$day)
#     [,1] [,2]
#[1,]    1    3
#[2,]    2    3
#[3,]    3    2
#[4,]    4    4
#[5,]    5    4
#[6,]    6    5
#[7,]    7    7
#[8,]    8    1
#[9,]    9    1

#subset
DF[,-(1:2)][ind]
#[1] "b" "h" "y" "h" "j" "a" "y" "u" "i"

Upvotes: 1

Related Questions