DaniCee
DaniCee

Reputation: 3207

merge 2 data frames in a loop for each column in one of them

I have 2 data frames like this:

mydf <- data.frame(a=sample(1:10, 20, replace=T), b=sample(1:10, 20, replace=T), c=sample(1:10, 20, replace=T))
mydf
    a  b  c
1  10  2  5
2   9  3 10
3   5  3  6
4   5  7  8
5   9  4  3
6  10 10  2
7   6 10  7
8   9  9  5
9   7  5  8
10 10  3  2
11  9 10  4
12  3  4  7
13  7  6  5
14  5  9  7
15  9  9  9
16  5  5  2
17  9  8  4
18  1  9  9
19  7  8  7
20  2 10  7
codetable <- data.frame(code=1:10, translation=LETTERS[1:10])
codetable
   code translation
1     1           A
2     2           B
3     3           C
4     4           D
5     5           E
6     6           F
7     7           G
8     8           H
9     9           I
10   10           J

I do not know beforehand how many columns the first data frame might have...

What I want to do here is replace the numbers in mydf with the corresponding translation letters, guided by codetable.

I thought of merging the 2 data frames for each column in mydf, but there might be a better solution with apply.

This is my attempt:

for (n in dim(mydf)[2]){
  mydf <- merge(mydf, codetable, by.x=names(mydf)[n], by.y='code')
 }
 mydf
   c  a  b translation
1  1  8  1           A
2  1  6  9           A
3  2  3  7           B
4  2  3 10           B
5  2  1 10           B
6  3 10  7           C
7  3  7  9           C
8  4  7 10           D
9  5 10  3           E
10 5  9  2           E
11 6  9  4           F
12 7  8  6           G
13 7  6  4           G
14 7  4  6           G
15 8  5  6           H
16 8  9  5           H
17 8  4  9           H
18 8  3  4           H
19 8  3  6           H
20 9  5  6           I

But the problem I have is that instead of 3 translation columns for a, b, and c, I end up with one...

How can I do this in an efficient way that actually works? Thanks!

Upvotes: 1

Views: 28

Answers (3)

Maurits Evers
Maurits Evers

Reputation: 50668

Similar to Ronak's answer but using dplyr::mutate_all

library(dplyr)
mydf %>% mutate_all(funs(codetable$translation[match(., codetable$code)]))

Upvotes: 1

BENY
BENY

Reputation: 323226

Using stack + unstack with match

mydf = stack(mydf)
mydf$values = codetable$translation[match(mydf$values,codetable$code)]
mydf = unstack(mydf)
mydf
   a b c
1  A D G
2  E C H
3  G G D
4  G H I
5  A D J
6  C F E
7  I A J
8  E E H
9  B F F
10 B I G
11 F B G
12 A H H
13 B D C
14 C H G
15 I H D
16 F B G
17 G I D
18 G A I
19 F G A
20 J A D

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 388817

We could use lapply over all the columns of mydf and match the value in mydf with that code column in codetable and get corresponding translation.

mydf[] <- lapply(mydf, function(x) codetable$translation[match(x, codetable$code)])

mydf
#   a b c
#1  H F F
#2  H F J
#3  A I I
#4  H G C
#5  A E G
#6  G D H
#7  J G C
#8  H B G
#9  J I F
#10 C I B
#.....

This could be extended to any number of columns without any change of code.

data

set.seed(234)
mydf <- data.frame(a=sample(1:10, 20, replace=T), 
                   b=sample(1:10, 20, replace=T), 
                   c=sample(1:10, 20, replace=T))

Upvotes: 3

Related Questions