Reputation: 105
I have a data.frame of 14 columns made up of test scores at 13 time periods, all numeric. The last column, say X, denotes the specific time point at which each student (one per row) received a failing grade. I would like to create a separate column that holds each student's failing test score from their specific failing time point.
dataframe <- data.frame(TestA = c(58, 92, 65, 44, 88),
                        TestB = c(17, 22, 58, 46, 98),
                        TestC = c(88, 98, 2, 45, 80),
                        TestD = c(33, 25, 65, 66, 5),
                        TestE = c(98, 100, 100, 100, 100),
                        X = c(2, 2, 3, NA, 4))
Above is a condensed version with mock data. The first student failed at time point two, etc., but the fourth student never failed. The resulting column should be 17, 22, 2, NA, 5. How can I accomplish this?
Upvotes: 0
Views: 1016
Reputation: 16121
Two alternative solutions.
One uses the map function from the purrr package:
library(tidyverse)
dataframe %>%
group_by(student_id = row_number()) %>%
nest() %>%
mutate(fail_score = map(data, ~c(.$TestA, .$TestB, .$TestC, .$TestD, .$TestE)[.$X])) %>%
unnest()
# # A tibble: 5 x 8
# student_id fail_score TestA TestB TestC TestD TestE X
# <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 17 58 17 88 33 98 2
# 2 2 22 92 22 98 25 100 2
# 3 3 2 65 58 2 65 100 3
# 4 4 NA 44 46 45 66 100 NA
# 5 5 5 88 98 80 5 100 4
The other one uses rowwise:
dataframe %>%
rowwise() %>%
mutate(fail_score = c(TestA, TestB, TestC, TestD, TestE)[X]) %>%
ungroup()
# # A tibble: 5 x 7
# TestA TestB TestC TestD TestE X fail_score
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 58 17 88 33 98 2 17
# 2 92 22 98 25 100 2 22
# 3 65 58 2 65 100 3 2
# 4 44 46 45 66 100 NA NA
# 5 88 98 80 5 100 4 5
I'm posting both because I have a feeling that the map approach would be faster if you have many students (i.e. rows) and tests (i.e. columns).
Upvotes: 0
Reputation: 26343
You can try
dataframe[cbind(1:nrow(dataframe), dataframe$X)]
#[1] 17 22 2 NA 5
From ?`[`
A third form of indexing is via a numeric matrix with the one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector. Negative indices are not allowed in the index matrix. NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.
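To make the matrix-indexing idea concrete, here is a minimal sketch that stores the result as a new column. It assumes the mock data from the question; the names idx, scores and fail_score are only illustrative. Restricting the lookup to the score columns is my own choice, so that a value in X can never select X itself.

```r
# Mock data from the question
dataframe <- data.frame(TestA = c(58, 92, 65, 44, 88),
                        TestB = c(17, 22, 58, 46, 98),
                        TestC = c(88, 98, 2, 45, 80),
                        TestD = c(33, 25, 65, 66, 5),
                        TestE = c(98, 100, 100, 100, 100),
                        X = c(2, 2, 3, NA, 4))

# Two-column index matrix: column 1 is the row number,
# column 2 is the column number (the failing time point)
idx <- cbind(seq_len(nrow(dataframe)), dataframe$X)

# Numeric matrix holding only the test scores (everything except X)
scores <- as.matrix(dataframe[, setdiff(names(dataframe), "X")])

# Each row of idx picks one element; a row containing NA yields NA
dataframe$fail_score <- scores[idx]
dataframe$fail_score
```

This should work the same way with 13 score columns, as long as X counts positions among the score columns in the order they appear.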
Upvotes: 3