Reputation: 21
I have a dataset with 100 questions (below I have a subset with 3 questions). I want to replace all the answer IDs with the actual answers provided in the "answer" dataset. The final result is shown in the "result" data frame.
data
name q1 q2 q3
1 a 1 3 7
2 a 8 3 1
3 a 3 9 2
4 b 4 4 3
answer
id str
1 TRUE
2 FALSE
3 YES
4 NO
5 LESS
6 MORE
7 GREATER
8 LESS
9 NONE
10 DAILY
result
name q1 q2 q3
1 a TRUE YES GREATER
2 a LESS YES TRUE
3 a YES NONE FALSE
4 b NO NO YES
Upvotes: 2
Views: 455
Reputation: 28441
Or use indexing:
data[-1] <- sapply(data[-1], function(x) answer$str[x])
# name q1 q2 q3
# 1 a TRUE YES GREATER
# 2 a LESS YES TRUE
# 3 a YES NONE FALSE
# 4 b NO NO YES
Larger tasks can be broken down into simplified examples to test methods. Create a vector with the q1 values only:
v <- c(1,8,3,4)
If we can replace these four, it is quite possible to scale the operation:
answer$str[v]
[1] TRUE LESS YES NO
This creates the first question column. The remainder of the code is repeating that process for each column.
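As a minimal sketch of that step-by-step approach, the block below rebuilds the two tables from the question and applies the same indexing first to one column, then to all question columns. The names `q1_text` and `result` are only illustrative, and `lapply` is used here in place of `sapply` so each column comes back as a plain vector.

```r
# Rebuild the example tables from the question
data <- data.frame(name = c("a", "a", "a", "b"),
                   q1 = c(1, 8, 3, 4),
                   q2 = c(3, 3, 9, 4),
                   q3 = c(7, 1, 2, 3))
answer <- data.frame(id = 1:10,
                     str = c("TRUE", "FALSE", "YES", "NO", "LESS",
                             "MORE", "GREATER", "LESS", "NONE", "DAILY"),
                     stringsAsFactors = FALSE)

# One column: each id in q1 picks the element of answer$str at that position
q1_text <- answer$str[data$q1]

# All question columns: repeat the same indexing column by column
result <- data
result[-1] <- lapply(data[-1], function(x) answer$str[x])
print(result)
```

This relies on each id being equal to its row position in `answer`, which holds for the data shown.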
Edit
A quicker way without sapply. It works only if the ids in the lookup table are in order and non-repeating, so that each id equals its row position:
data[-1] <- answer$str[as.matrix(data[-1])]
# name q1 q2 q3
# 1 a TRUE YES GREATER
# 2 a LESS YES TRUE
# 3 a YES NONE FALSE
# 4 b NO NO YES
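Since the shortcut depends on the ids matching their row positions, it may be worth checking that before relying on it. A possible guard, assuming a hypothetical helper named `ids_are_positions` (not part of base R or any package):

```r
answer <- data.frame(id = 1:10,
                     str = c("TRUE", "FALSE", "YES", "NO", "LESS",
                             "MORE", "GREATER", "LESS", "NONE", "DAILY"),
                     stringsAsFactors = FALSE)

# Hypothetical helper: TRUE only when the ids are exactly 1, 2, ..., n in order
ids_are_positions <- function(id) {
  isTRUE(all(id == seq_along(id)))
}

# Stop early if positional indexing would give wrong answers
stopifnot(ids_are_positions(answer$id))
```

If the check fails, a lookup by value (for example with match) is the safer route.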
Upvotes: 1
Reputation: 887058
We can match the elements of the dataset ('df1', without the 'name' column) with the 'id' from 'answer' to get the numeric index, and then use that index to get the corresponding 'str'. In this case we don't need match, because the ids equal the row positions, but in general it is safer to use match.
df1[-1] <- answer$str[match(as.matrix(df1[-1]), answer$id)]
df1
# name q1 q2 q3
#1 a TRUE YES GREATER
#2 a LESS YES TRUE
#3 a YES NONE FALSE
#4 b NO NO YES
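To illustrate why match can be the safer choice, here is a small sketch in which the lookup table is stored in a different row order. The reversed table `shuffled` and the vectors `by_position` and `by_id` are made up for this illustration only.

```r
answer <- data.frame(id = 1:10,
                     str = c("TRUE", "FALSE", "YES", "NO", "LESS",
                             "MORE", "GREATER", "LESS", "NONE", "DAILY"),
                     stringsAsFactors = FALSE)

# Same lookup table, but with the rows in reverse order (ids 10, 9, ..., 1)
shuffled <- answer[rev(seq_len(nrow(answer))), ]

ids <- c(1, 8, 3, 4)  # the q1 column from the question

# Plain indexing picks by row position, so it returns the wrong strings here
by_position <- shuffled$str[ids]

# match() finds the row whose id equals each value, regardless of row order
by_id <- shuffled$str[match(ids, shuffled$id)]

print(by_position)
print(by_id)
```

Only `by_id` reproduces the expected answers for q1; ids that are missing from the table come back as NA rather than as a wrong string.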
Or use lookup from qdapTools, which can take key/value columns as a 'data.frame' (i.e. 'answer') and return the matching values:
library(qdapTools)
df1[-1] <- lookup(unlist(df1[-1]), answer)
Or index a named vector:
df1[-1] <- with(answer, setNames(str, id))[as.character(unlist(df1[-1]))]
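The named-vector variant can be unpacked into separate steps. The sketch below rebuilds the question's tables as `df1` and `answer`; the intermediate names `lookup_vec` and `keys` are only illustrative.

```r
df1 <- data.frame(name = c("a", "a", "a", "b"),
                  q1 = c(1, 8, 3, 4),
                  q2 = c(3, 3, 9, 4),
                  q3 = c(7, 1, 2, 3))
answer <- data.frame(id = 1:10,
                     str = c("TRUE", "FALSE", "YES", "NO", "LESS",
                             "MORE", "GREATER", "LESS", "NONE", "DAILY"),
                     stringsAsFactors = FALSE)

# Named vector: values are the answer strings, names are the ids ("1", "2", ...)
lookup_vec <- setNames(answer$str, answer$id)

# Flatten the question columns and convert the ids to character keys,
# so that indexing goes by name rather than by position
keys <- as.character(unlist(df1[-1]))

# Look up every key and write the strings back, column by column
df1[-1] <- lookup_vec[keys]
print(df1)
```

Because the lookup goes by name, it does not depend on the row order of `answer`; a key with no matching name yields NA.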
Upvotes: 5