user2822260
user2822260

Reputation:

Merging columns of dataset when they have diff number of rows

I need to 'merge' two different data.frames with one another of unequal size but with the same unique identifier (ID) and I want to retain the # of rows of the larger data.frame.

More importantly, I want the value of variable x in data.frame.1 (the larger one) to be summed for each unique ID such that in data.frame.3 (the merged dataset) each observation for variable x is the sum of the observations with the same unique identifier originally found in data.frame.1.

Essentially, I want my merged dataset to have the row dimensions of my smaller dataset (data.frame.2) -i.e. same # of observations -but I want the column from the larger df (data.frame.1) merged to the column of the smaller df (data.frame.2) and I want its values aggregated like stated above (sum).

I hope this is clear so the charts below make it more clear: there are three total Unique ID's (a,b,c) but in data.frame.1 these repeated -i want these repeated values summed when the merger takes place.

ID x data.frame.1
a 1 
a 8 
a 10 
b 2 
b 1 
c 4 

ID y data.frame.2
a 3 
b 7 
c 9 

ID y x data.frame.3
a 3 19
b 7 3
c 9 4

Upvotes: 1

Views: 1038

Answers (1)

David Arenburg
David Arenburg

Reputation: 92302

data.frame1 <- data.frame(ID = c(rep("a",3), rep("b",2), "c"),
                         x = c(1,8,10,2,1,4))
data.frame2 <- data.frame(ID = c("a", "b", "c"),
                         y = c(3, 7, 9))

data.frame1 <- aggregate(x ~ ID, data.frame1, sum)
data.frame3 <- merge(data.frame2, data.frame1, by = "ID")

Upvotes: 1

Related Questions