Dylan Anderson
Dylan Anderson

Reputation: 31

Pulling value using column position within dataframe in R

I'm looking to find the value of a number in a certain column, designated by the values in another column. So very similar to index matching in Excel in a way.

Here is my reproducible data

set.seed(1)
my.df <- data.frame(x2 = sample(c(TRUE, FALSE), 100, replace = TRUE),
                    party1 = sample(100), 
                    party2 = sample(100),
                    party3 = sample(100),
                    fav.party = sample(c(1,2,3), 100, replace = TRUE))
head(my.df)

     x2 party1 party2 party3 fav.party
1  TRUE     39     61     55         2
2  TRUE     51     92     46         2
3 FALSE     89     30     20         1
4 FALSE     18     34     80         1
5 FALSE      3     85     36         1
6  TRUE     86     29     15         3

Basically I want to create a new column that matches the fav.party value (1, 2 or 3) with the party. Since my party values are stuck inside a huge dataset I have put the columns I want to extract from in 2:4 in this case, with column 1 being a placeholder.

For example the output would be

fav.party.score
61
92
89
18
3
15

Have tried apply, which, match, and just indexing all within the my.df[, 2:4] subset, but can't figure out how to use the fav.party value to denote the position. Any help with this would be super appreciated, this has been ruining my progress for ages now and I feel like its a simple answer.

Upvotes: 3

Views: 79

Answers (3)

akrun
akrun

Reputation: 887901

We can use row/column indexing in base R

my.df[startsWith(names(my.df), 'party')][cbind(seq_len(nrow(my.df)), my.df$fav.party)]

Upvotes: 1

GKi
GKi

Reputation: 39717

You can use a matrix for subsetting:

my.df[matrix(c(seq_len(nrow(my.df)), my.df[,5]+1), ncol = 2)]
#  [1]  91  48  33  45   1  41  29  73  64  23  68  99  57  60  51  50  10   1
# [19]  43  59  78  97  62  37  24  42  71  73  76  53  85  92  82  25   6   4
# [37]  46  30  52  75  37  69  49   3  51  46  77  28  55  93  96  58  42  89
# [55] 100  54  91  29  16  36 100   2  34  13  33  41  55  59  24   5  11  70
# [73]  66  14  17  76  18  63   8  79  52  26  24  14  82   9  97  10  43   2
# [91]   4  45   3  73  56  23  70  71  83  28

or use mapply with [ to get the values of my.df[,2:4] in the column given in my.df[,5].

mapply("[", asplit(my.df[,2:4], 1), my.df[,5])
#party2 party1 party1 party1 party3 party2 party2 party1 party2 party3 party3 
#    91     48     33     45      1     41     29     73     64     23     68 
#party1 party2 party1 party1 party2 party1 party1 party1 party1 party2 party2 
#    99     57     60     51     50     10      1     43     59     78     97 
#party3 party2 party1 party1 party2 party2 party1 party1 party3 party3 party1 
#    62     37     24     42     71     73     76     53     85     92     82 
#party3 party2 party2 party3 party3 party1 party3 party1 party2 party1 party1 
#    25      6      4     46     30     52     75     37     69     49      3 
#party3 party2 party2 party2 party1 party3 party3 party3 party2 party2 party2 
#    51     46     77     28     55     93     96     58     42     89    100 
#party1 party3 party3 party1 party3 party3 party1 party2 party1 party2 party3 
#    54     91     29     16     36    100      2     34     13     33     41 
#party3 party2 party2 party2 party1 party2 party1 party1 party2 party2 party2 
#    55     59     24      5     11     70     66     14     17     76     18 
#party3 party1 party3 party2 party3 party3 party3 party2 party2 party1 party3 
#    63      8     79     52     26     24     14     82      9     97     10 
#party2 party3 party1 party2 party2 party3 party1 party2 party1 party1 party2 
#    43      2      4     45      3     73     56     23     70     71     83 
#party3 
#    28 

Upvotes: 3

Martin Gal
Martin Gal

Reputation: 16998

Using dplyr:

my.df %>%
  rowwise() %>%
  mutate(score=get(paste0("party", fav.party)))

gives

# A tibble: 100 x 6
# Rowwise: 
   x2    party1 party2 party3 fav.party score
   <lgl>  <int>  <int>  <int>     <dbl> <int>
 1 TRUE      34     85     29         1    34
 2 TRUE      32     29     92         1    32
 3 FALSE     46      7     84         1    46
 4 FALSE     12     35     96         2    35
 5 TRUE      59     41     99         3    99
 6 TRUE      41     79     28         3    28
 7 FALSE     74     14     63         3    63
 8 TRUE      69     49     40         2    49
 9 TRUE      55      6     20         3    20
10 FALSE     68     44     45         2    44
# ... with 90 more rows

Upvotes: 1

Related Questions