Reputation: 3207
I have 2 data frames like this:
mydf <- data.frame(a=sample(1:10, 20, replace=T), b=sample(1:10, 20, replace=T), c=sample(1:10, 20, replace=T))
mydf
a b c
1 10 2 5
2 9 3 10
3 5 3 6
4 5 7 8
5 9 4 3
6 10 10 2
7 6 10 7
8 9 9 5
9 7 5 8
10 10 3 2
11 9 10 4
12 3 4 7
13 7 6 5
14 5 9 7
15 9 9 9
16 5 5 2
17 9 8 4
18 1 9 9
19 7 8 7
20 2 10 7
codetable <- data.frame(code=1:10, translation=LETTERS[1:10])
codetable
code translation
1 1 A
2 2 B
3 3 C
4 4 D
5 5 E
6 6 F
7 7 G
8 8 H
9 9 I
10 10 J
I do not know beforehand how many columns the first data frame might have...
What I want to do here is replace the numbers in mydf
with the corresponding translation
letters, guided by codetable
.
I thought of merging the 2 data frames for each column in mydf
, but there might be a better solution with apply
.
This is my attempt:
for (n in dim(mydf)[2]){
mydf <- merge(mydf, codetable, by.x=names(mydf)[n], by.y='code')
}
mydf
c a b translation
1 1 8 1 A
2 1 6 9 A
3 2 3 7 B
4 2 3 10 B
5 2 1 10 B
6 3 10 7 C
7 3 7 9 C
8 4 7 10 D
9 5 10 3 E
10 5 9 2 E
11 6 9 4 F
12 7 8 6 G
13 7 6 4 G
14 7 4 6 G
15 8 5 6 H
16 8 9 5 H
17 8 4 9 H
18 8 3 4 H
19 8 3 6 H
20 9 5 6 I
But the problem I have is that instead of 3 translation
columns for a
, b
, and c
, I end up with one...
How can I do this in an efficient way that actually works? Thanks!
Upvotes: 1
Views: 28
Reputation: 50668
Similar to Ronak's answer but using dplyr::mutate_all
library(dplyr)
mydf %>% mutate_all(funs(codetable$translation[match(., codetable$code)]))
Upvotes: 1
Reputation: 323226
Using stack
+ unstack
with match
mydf = stack(mydf)
mydf$values = codetable$translation[match(mydf$values,codetable$code)]
mydf = unstack(mydf)
mydf
a b c
1 A D G
2 E C H
3 G G D
4 G H I
5 A D J
6 C F E
7 I A J
8 E E H
9 B F F
10 B I G
11 F B G
12 A H H
13 B D C
14 C H G
15 I H D
16 F B G
17 G I D
18 G A I
19 F G A
20 J A D
Upvotes: 1
Reputation: 388817
We could use lapply
over all the columns of mydf
and match
the value in mydf
with that code
column in codetable
and get corresponding translation
.
mydf[] <- lapply(mydf, function(x) codetable$translation[match(x, codetable$code)])
mydf
# a b c
#1 H F F
#2 H F J
#3 A I I
#4 H G C
#5 A E G
#6 G D H
#7 J G C
#8 H B G
#9 J I F
#10 C I B
#.....
This could be extended to any number of columns without any change of code.
data
set.seed(234)
mydf <- data.frame(a=sample(1:10, 20, replace=T),
b=sample(1:10, 20, replace=T),
c=sample(1:10, 20, replace=T))
Upvotes: 3