group413
group413

Reputation: 159

How to merge data from different tables in R

Imagine that one table has a column in which there are different numbers from one to ten. At another table there is a decoding what these numbers mean. How to add a column in the first table with interpretation of figures? The figures are not in order and repeat.

Upvotes: 1

Views: 98

Answers (3)

Saatetyi
Saatetyi

Reputation: 41

I think merge is what you need! Check this sample:

> d1<-data.frame(cbind(c(1,2,3,2,4), c("aa", "bb", "cc", "dd", "ee")))
> d2<-data.frame(cbind(c(1,2,3,4,5), c("one", "two", "three", "four", "five")))
> d3<-merge(d1, d2, by.x="X1", by.y="X1")
> d3

Upvotes: 1

Matthew Lundberg
Matthew Lundberg

Reputation: 42629

merge does this easily. The problem with using merge is that it changes the order of the data. If that's a problem, you can easily create a new factor with the proper values.

Here's an example. Say A is your first table, B is the lookups.

A <- data.frame(X=sample(10, 10, replace=T))
B <- data.frame(X=1:10, Y=LETTERS[11:20])

A$X
##  [1] 3 6 3 2 7 6 3 9 6 3

The lookups, in order of 1 to 10:

B$Y
##  [1] K L M N O P Q R S T
## Levels: K L M N O P Q R S T

Now create a new factor in A with the matching labels from B$Y:

A$Y <- factor(A$X, levels=B$X, labels=B$Y)

A
##    X Y
## 1  3 M
## 2  6 P
## 3  3 M
## 4  2 L
## 5  7 Q
## 6  6 P
## 7  3 M
## 8  9 S
## 9  6 P
## 10 3 M

Comparison to Merge

Here's what merge will do. Note that the numbers line up, there are the correct number of each, but the order is different. For most uses, this won't matter, but the change in order is the point of this answer.

A$Y <- NULL # Remove factor added above
merge(A, B)
##    X Y
## 1  2 L
## 2  3 M
## 3  3 M
## 4  3 M
## 5  3 M
## 6  6 P
## 7  6 P
## 8  6 P
## 9  7 Q
## 10 9 S

merge sorts by default. So what if we just don't sort? Nope, that doesn't help either. The unique values of X are all grouped together.

merge(A, B, sort=FALSE)
##    X Y
## 1  3 M
## 2  3 M
## 3  3 M
## 4  3 M
## 5  6 P
## 6  6 P
## 7  6 P
## 8  2 L
## 9  7 Q
## 10 9 S

Upvotes: 2

Troy
Troy

Reputation: 8691

x<-data.frame(n=sample(1:10,20,T))
y<-data.frame(n=sample(1:10,10),L=sample(LETTERS,10))
merge(x,y)
n L
1   1 G
2   1 G
3   2 A
4   3 B
5   3 B
6   4 S
7   5 J
8   5 J
9   6 I
10  6 I
11  6 I
12  8 L
13  9 O
14  9 O
15  9 O
16  9 O
17  9 O
18  9 O
19 10 V
20 10 V

Upvotes: 1

Related Questions